Introduction#

JupySQL allows you to run SQL in Jupyter/IPython via a %sql and %%sql magics.

%load_ext sql
%%sql sqlite://
CREATE TABLE languages (name, rating, change);
INSERT INTO languages VALUES ('Python', 14.44, 2.48);
INSERT INTO languages VALUES ('C', 13.13, 1.50);
INSERT INTO languages VALUES ('Java', 11.59, 0.40);
INSERT INTO languages VALUES ('C++', 10.00, 1.98);
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
[]

Note: data from the TIOBE index

%sql SELECT * FROM languages
*  sqlite://
Done.
name rating change
Python 14.44 2.48
C 13.13 1.5
Java 11.59 0.4
C++ 10.0 1.98
result = _
print(result)
+--------+--------+--------+
|  name  | rating | change |
+--------+--------+--------+
| Python | 14.44  |  2.48  |
|   C    | 13.13  |  1.5   |
|  Java  | 11.59  |  0.4   |
|  C++   |  10.0  |  1.98  |
+--------+--------+--------+
result.keys
RMKeyView(['name', 'rating', 'change'])
result[0][0]
'Python'
result[0].rating
14.44

After the first connection, connect info can be omitted::

%sql select count(*) from languages
*  sqlite://
Done.
count(*)
4

Connections to multiple databases can be maintained. You can refer to an existing connection by username@database

%%sql will@shakes
select charname, speechcount from character
where  speechcount = (select max(speechcount)
from character);
print(_)

If no connect string is supplied, %sql will provide a list of existing connections; however, if no connections have yet been made and the environment variable DATABASE_URL is available, that will be used.

For secure access, you may dynamically access your credentials (e.g. from your system environment or getpass.getpass) to avoid storing your password in the notebook itself. Use the $ before any variable to access it in your %sql command.

user = os.getenv('SOME_USER')
password = os.getenv('SOME_PASSWORD')
connection_string = "postgresql://{user}:{password}@localhost/some_database".format(user=user, password=password)
%sql $connection_string

You may use multiple SQL statements inside a single cell, but you will only see any query results from the last of them, so this really only makes sense for statements with no output

%%sql sqlite://
CREATE TABLE writer (first_name, last_name, year_of_death);
INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);

As a convenience, dict-style access for result sets is supported, with the leftmost column serving as key, for unique values.

result = %sql select * from work
result['richard2']

Results can also be retrieved as an iterator of dictionaries (result.dicts()) or a single dictionary with a tuple of scalar values per key (result.dict())

Assignment#

Ordinary IPython assignment works for single-line %sql queries:

lang = %sql SELECT * FROM languages
*  sqlite://
Done.

The << operator captures query results in a local variable, and can be used in multi-line %%sql:

%%sql lang << SELECT *
FROM languages
*  sqlite://
Done.

Considerations#

Because jupysql accepts ---delimited options like --persist, but -- is also the syntax to denote a SQL comment, the parser needs to make some assumptions.

  • If you try to pass an unsupported argument, like --lutefisk, it will be interpreted as a SQL comment and will not throw an unsupported argument exception.

  • If the SQL statement begins with a first-line comment that looks like one of the accepted arguments - like %sql --persist is great! - it will be parsed like an argument, not a comment. Moving the comment to the second line or later will avoid this.