sqliter takes out the misery of your life helping you with SQLite queries without turing your code into a mess. Once you have many SQLite files to handle it simplifies your query calls by structuring the way you connect to your files, the databases.
Let’s suppose you have some SQLite files spread on your computer.
Some files you are working on now, and some others you have worked in
other projects before. The standard approach would be copy the files
into your new project or point to the files wherever they are. For those
files you would simply create their connections, execute your queries
creating a brand new data.frame
and transform your data to
get your job done.
When you have many files, you find yourself creating a bunch of annoying repeated code. I don’t like repeated code, it smells bad and I take seriously the DRY principle. That is the reason why I created sqliter, to take away this misery of my life.
Use devtools
.
In order to use sqliter you must declare the path where your SQLite files are hidden.
DBM <- sqliter(path=c('data', '../project2/data', '/path/to/project3/data'))
and query the databases.
ds <- DBM$query_database_dummy('select count(*) from dummytable')
where database_dummy
is the name of SQLite file, without
extension, lying inside some directory declared in the
path
. So, it should stand for
data/database_dummy.db
,
../project2/data/database_dummy.db
or
/path/to/project3/data/database_dummy.db
.
If you have multiple files with the same name the priority is given
accordingly the path order, exactly the same way shells like bash and
csh do. Then, in our example, data/database_dummy.db
would
be the selected database.
The returned object, ds
, is a data.frame
with a column named count(*)
. The column names can be
manipulated like any other sql call, appending a label after the
variable.
For parameterized queries we have prepared queries.
You simply create queries with placeholders for the parameters and
fulfill its values passing additional arguments to query_*
function.
ds <- DBM$query_database_dummy('select name, country from dummytable where name = :name',
name='Macunaima')
Note the placeholder :name
, it is related to the
argument name='Macunaima'
. These arguments accept multiple
values like name=c('Macunaima', 'Borba Gato')
. The above
example would return a data.frame
with two columns named
name
and country
.
You can get your data transformed the way you want by using the
argument post_proc
. This argument must have a function
which expects to receive a data.frame
and returns whatever
you want. I usually use post_proc
for renaming columns and
converting strings into datetime objects.
ds <- DBM$query_database_dummy('select birthday, name, country from dummytable where name = :name',
name='Macunaima', post_proc=function(ds) {
ds <- transform(ds, Birthday=as.Date(birthday, format='%d/%m/%Y'))
ds
})
sqliter is deeply intended to research purposes, mainly data munging. I understand that by no means it should be used in any kind of production code.