The goal of this vignette is to describe the basic functionality of
the wdman
package.
wdman
(Webdriver Manager) is an R package that allows
the user to manage the downloading/running of third party binaries
relating to the webdriver/selenium projects. The package was inspired by
a similar node package webdriver-manager.
The checking/downloading of binaries is handled by the binman package and the running of the binaries as processes is handled by the processx package.
The wdman
package currently manages the following
binaries:
Associated with the above are five functions to download/manage the binaries:
selenium(...)
chrome(...)
phantomjs(...)
gecko(...)
iedriver(...)
The driver functions take a number of common arguments (verbose, check, retcommand) which we describe:
Each of the driver functions has a verbose
argument
which controls message output to the user. If
verbose = TRUE
then messages are relayed to the user to
inform them when drivers are checked/downloaded/ran. The default value
for the driver functions is TRUE
.
<- selenium(verbose = TRUE) selServ
## checking Selenium Server versions:
## BEGIN: PREDOWNLOAD
## BEGIN: DOWNLOAD
## BEGIN: POSTDOWNLOAD
## checking chromedriver versions:
## BEGIN: PREDOWNLOAD
## BEGIN: DOWNLOAD
## BEGIN: POSTDOWNLOAD
## checking geckodriver versions:
## BEGIN: PREDOWNLOAD
## BEGIN: DOWNLOAD
## BEGIN: POSTDOWNLOAD
## checking phantomjs versions:
## BEGIN: PREDOWNLOAD
## BEGIN: DOWNLOAD
## BEGIN: POSTDOWNLOAD
selServ$stop()
## TRUE
versus’s
<- selenium(verbose = FALSE)
selServ $stop() selServ
## TRUE
Each driver function has a check
argument. If
check= TRUE
the function will liaise with the driver
repository for any updates. If new driver versions are available these
will be downloaded. The binman package is used for
this purpose.
For diagnostic purposes each driver function has a
retcommand
argument. If retcommand = TRUE
the
command that would have been launched as a process is instead returned
as a string. As an example:
<- selenium(retcommand = TRUE, verbose = FALSE, check = FALSE)
selCommand selCommand
## [1] "/usr/bin/java -Dwebdriver.chrome.driver='/Users/jkim/Library/Application Support/binman_chromedriver/mac64/80.0.3987.16/chromedriver' -Dwebdriver.gecko.driver='/Users/jkim/Library/Application Support/binman_geckodriver/macos/0.26.0/geckodriver' -Dphantomjs.binary.path='/Users/jkim/Library/Application Support/binman_phantomjs/macosx/2.1.1/phantomjs-2.1.1-macosx/bin/phantomjs' -jar '/Users/jkim/Library/Application Support/binman_seleniumserver/generic/4.0.0-alpha-2/selenium-server-standalone-4.0.0-alpha-2.jar' -port 4567"
<- chrome(retcommand = TRUE, verbose = FALSE, check = FALSE)
chromeCommand chromeCommand
## [1] "/Users/jkim/Library/Application Support/binman_chromedriver/mac64/80.0.3987.16/chromedriver --port=4567 --url-base=wd/hub --verbose"
The selenium
function manages the Selenium Standalone
binary. It can check for updates at http://selenium-release.storage.googleapis.com/index.html
and run the resulting binaries as processes.
The binary takes a port argument which defaults to
port = 4567L
. There are a number of optional arguments to
use a particular version of the binaries related to browsers selenium
may control. By default the selenium
function will look to
use the latest version of each.
<- selenium(verbose = FALSE, check = FALSE)
selServ $process selServ
## PROCESS 'file50e6163b37b8.sh', running, pid 21289.
The selenium function returns a list of functions and a handle representing the running process.
The returned output
, error
and
log
functions give access to the stdout/stderr pipes and
the cumulative stdout/stderr messages respectively.
$log() selServ
## $stderr
## [1] "13:25:51.744 INFO [GridLauncherV3.parse] - Selenium server version: 4.0.0-alpha-2, revision: f148142cf8"
## [2] "13:25:52.174 INFO [GridLauncherV3.lambda$buildLaunchers$3] - Launching a standalone Selenium Server on port 4567"
## [3] "13:25:54.018 INFO [WebDriverServlet.<init>] - Initialising WebDriverServlet"
## [4] "13:25:54.539 INFO [SeleniumServer.boot] - Selenium Server is up and running on port 4567"
## $stdout
## character(0)
The stop
function sends a signal that terminates the
process:
$stop() selServ
## TRUE
By default the selenium
function includes paths to
chromedriver/geckodriver/ phantomjs so that the Chrome/Firefox and
PhantomJS browsers are available respectively. All versions (chromever,
geckover etc) are given as “latest”. If the user passes a value of NULL
for any driver, it will be excluded.
On Windows operating systems the option to included the Internet
Explorer driver is also given. This is set to
iedrver = NULL
so not ran by default. Set it to
iedrver = "latest"
or a specific version string to include
it on your Windows.
The chrome
function manages the Chrome Driver binary. It
can check for updates at https://chromedriver.storage.googleapis.com/index.html
and run the resulting binaries as processes.
The chrome
function runs the Chrome Driver
binary as a standalone process. It takes a default port
argument port = 4567L
. Users can then connect directly to
the chrome driver to drive a chrome browser.
Similarly to the selenium
function, the
chrome
function returns a list of four functions and a
handle to the underlying running process.
<- chrome(verbose = FALSE, check = FALSE)
cDrv $process cDrv
## PROCESS 'file534c4e940dd8.sh', running, pid 21386.
$log() cDrv
## $stderr
## character(0)
## $stdout
## [1] "Starting ChromeDriver 80.0.3987.16 (320f6526c1632ad4f205ebce69b99a062ed78647-refs/branch-heads/3987@{#185}) on port 4567"
## [2] "Only local connections are allowed."
## [3] "Please protect ports used by ChromeDriver and related test frameworks to prevent access by malicious code."
$stop() cDrv
## TRUE
The phantomjs
function manages the PhantomJS binary. It
can check for updates at https://bitbucket.org/ariya/phantomjs/downloads and run
the resulting binaries as processes.
The phantomjs
function runs the PhantomJS binary
as a standalone process in webdriver mode. It takes a default
port
argument port = 4567L
. Users can then
connect directly to the “ghostdriver” to drive a PhantomJS browser.
Currently the default version
is set to
version = "2.1.1"
. At the time of writing
2.5.0-beta
has been released. It currently does not have an
up-to-date version of ghostdriver associated with it. For this reason it
will be unstable/unpredictable to use it in webdriver mode.
Similarly to the selenium
function, the
phantomjs
function returns a list of four functions and a
handle to the underlying running process.
<- phantomjs(verbose = FALSE, check = FALSE)
pjsDrv $process pjsDrv
## PROCESS 'file5394b74d790.sh', running, pid 21443.
$log() pjsDrv
## $stderr
## character(0)
## $stdout
## [1] "[INFO - 2020-01-31T21:32:04.538Z] GhostDriver - Main - running on port 4567"
$stop() pjsDrv
## TRUE
The gecko
function manages the Gecko Driver binary. It
can check for updates at https://github.com/mozilla/geckodriver/releases and run
the resulting binaries as processes.
The gecko
function runs the Gecko Driver binary
as a standalone process. It takes a default port
argument
port = 4567L
. Users can then connect directly to the gecko
driver to drive a firefox browser. Currently the default
version
is set to
version = "2.1.1"
.
A very IMPORTANT point to note is that geckodriver implements the W3C webdriver protocol which as at the time of writing is not finalised. Currently packages such as RSelenium implement the JSONwireprotocol which whilst similar expects different return from the underlying driver.
The geckodriver implementation like the W3C webdriver specification is incomplete at this point in time.
Similarly to the selenium
function, the
gecko
function returns a list of four functions and a
handle to the underlying running process.
<- gecko(verbose = FALSE, check = FALSE)
gDrv $process gDrv
## PROCESS 'file53946017eccb.sh', running, pid 21458.
$log() gDrv
## $stderr
## character(0)
## $stdout
## character(0)
$stop()
gDrv
## TRUE
The iedriver
function manages the Internet Explorer
Driver binary. It can check for updates at http://selenium-release.storage.googleapis.com/index.html
and run the resulting binaries as processes (the iedriver is distributed
currently with the Selenium standalone binary amongst other files).
The chrome
function runs the Chrome Driver
binary as a standalone process. It takes a default port
argument port = 4567L
. Users can then connect directly to
the chrome driver to drive a chrome browser.
Please note that additional settings are required to drive an Internet Explorer browser. Security settings and zoom level need to be set correctly in the browser. The author of this document needed to set a registry entry (for ie 11). This is outlined at https://github.com/SeleniumHQ/selenium/wiki/InternetExplorerDriver in the required configuration section.
Similarly to the selenium
function the
gecko
function returns a list of four functions and a
handle to the underlying running process.
<- iedriver(verbose = FALSE, check = FALSE)
ieDrv $process ieDrv
## Process Handle
## command : C:\Users\john\AppData\Local\binman\binman_iedriverserver\win64\3.0.0\IEDriverServer.exe /port=4567 /log-level=FATAL /log-file=C:\Users\john\AppData\Local\Temp\RtmpqSdw94\file5247395f2a.txt
## system id : 7484
## state : running
$log() ieDrv
## $stderr
## character(0)
##
## $stdout
## [1] "Started InternetExplorerDriver server (64-bit)"
## [2] "3.0.0.0"
## [3] "Listening on port 4567"
## [4] "Log level is set to FATAL"
## [5] "Log file is set to C:\\Users\\john\\AppData\\Local\\Temp\\RtmpqSdw94\\file5247395f2a.txt"
## [6] "Only local connections are allowed"
$stop() ieDrv
## [1] TRUE
If you experience issues or problems running one of the
drivers/functions, please try running the command in a terminal on your
OS initially. You can access the command to run by using the
retcommand
argument in each of the main package functions.
If you continue to have problems, consider posting an issue at https://github.com/ropensci/wdman/issues