Introduction#
JupySQL allows you to run SQL in Jupyter/IPython via a %sql
and %%sql
magics.
%load_ext sql
%%sql sqlite://
CREATE TABLE languages (name, rating, change);
INSERT INTO languages VALUES ('Python', 14.44, 2.48);
INSERT INTO languages VALUES ('C', 13.13, 1.50);
INSERT INTO languages VALUES ('Java', 11.59, 0.40);
INSERT INTO languages VALUES ('C++', 10.00, 1.98);
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
[]
Note: data from the TIOBE index
%sql SELECT * FROM languages
* sqlite://
Done.
name | rating | change |
---|---|---|
Python | 14.44 | 2.48 |
C | 13.13 | 1.5 |
Java | 11.59 | 0.4 |
C++ | 10.0 | 1.98 |
result = _
print(result)
+--------+--------+--------+
| name | rating | change |
+--------+--------+--------+
| Python | 14.44 | 2.48 |
| C | 13.13 | 1.5 |
| Java | 11.59 | 0.4 |
| C++ | 10.0 | 1.98 |
+--------+--------+--------+
result.keys
RMKeyView(['name', 'rating', 'change'])
result[0][0]
'Python'
result[0].rating
14.44
After the first connection, connect info can be omitted::
%sql select count(*) from languages
* sqlite://
Done.
count(*) |
---|
4 |
Connections to multiple databases can be maintained. You can refer to an existing connection by username@database
%%sql will@shakes
select charname, speechcount from character
where speechcount = (select max(speechcount)
from character);
print(_)
If no connect string is supplied, %sql
will provide a list of existing connections;
however, if no connections have yet been made and the environment variable DATABASE_URL
is available, that will be used.
For secure access, you may dynamically access your credentials (e.g. from your system environment or getpass.getpass
) to avoid storing your password in the notebook itself. Use the $
before any variable to access it in your %sql
command.
user = os.getenv('SOME_USER')
password = os.getenv('SOME_PASSWORD')
connection_string = "postgresql://{user}:{password}@localhost/some_database".format(user=user, password=password)
%sql $connection_string
You may use multiple SQL statements inside a single cell, but you will only see any query results from the last of them, so this really only makes sense for statements with no output
%%sql sqlite://
CREATE TABLE writer (first_name, last_name, year_of_death);
INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
As a convenience, dict-style access for result sets is supported, with the leftmost column serving as key, for unique values.
result = %sql select * from work
result['richard2']
Results can also be retrieved as an iterator of dictionaries (result.dicts()
)
or a single dictionary with a tuple of scalar values per key (result.dict()
)
Assignment#
Ordinary IPython assignment works for single-line %sql
queries:
lang = %sql SELECT * FROM languages
* sqlite://
Done.
The <<
operator captures query results in a local variable, and
can be used in multi-line %%sql
:
%%sql lang << SELECT *
FROM languages
* sqlite://
Done.
Considerations#
Because jupysql accepts --
-delimited options like --persist
, but --
is also the syntax to denote a SQL comment, the parser needs to make some assumptions.
If you try to pass an unsupported argument, like
--lutefisk
, it will be interpreted as a SQL comment and will not throw an unsupported argument exception.If the SQL statement begins with a first-line comment that looks like one of the accepted arguments - like
%sql --persist is great!
- it will be parsed like an argument, not a comment. Moving the comment to the second line or later will avoid this.