%sql/%%sql#

-l / --connections List all active connections (example)

-x / --close <session-name/alias> Close named connection (example)

-c / --creator <creator-function> Specify creator function for new connection

-s / --section <section-name> Section of dsn_file to be used for generating a connection string

-p / --persist Create a table name in the database from the named DataFrame (example)

--append Like --persist, but appends to the table if it already exists (example)

-a / --connection_arguments <"{connection arguments}"> Specify dictionary of connection arguments to pass to SQL driver

-f / --file <path> Run SQL from file at this path (example)

New in version 0.4.2.

-n / --no-index Do not persist data frame’s index (used with -p/--persist) (example)

New in version 0.4.3.

-S / --save <name> Save this query for later use (example)

-w / --with <name> Use a previously saved query (used after -S/--save) (example)

New in version 0.5.2.

-A / --alias <alias> Assign an alias when establishing a connection (example)

Initialization#

%load_ext sql

Connect to database#

%sql sqlite:///db_one.db

Assign an alias to the connection (added 0.5.2):

%sql sqlite:///db_two.db --alias db-two

List connections#

%sql --list
*  (db-two) sqlite:///db_two.db
   sqlite:///db_one.db

Close connection#

%sql --close sqlite:///db_one.db

Or pass an alias (added in 0.5.2):

%sql --close db-two

Create table#

%sql sqlite://
import pandas as pd

my_data = pd.DataFrame({"x": range(3), "y": range(3)})
%sql --persist my_data
*  sqlite://
'Persisted my_data'
%sql SELECT * FROM my_data
*  sqlite://
Done.
index x y
0 0 0
1 1 1
2 2 2

Create table without DataFrame index#

my_chars = pd.DataFrame({"char": ["a", "b", "c"]})
my_chars
char
0 a
1 b
2 c
%sql --persist my_chars --no-index
*  sqlite://
'Persisted my_chars'
%sql SELECT * FROM my_chars
*  sqlite://
Done.
char
a
b
c

Append to table#

my_data = pd.DataFrame({"x": range(3, 6), "y": range(3, 6)})
%sql --append my_data
*  sqlite://
'Persisted my_data'
%sql SELECT * FROM my_data
*  sqlite://
Done.
index x y
0 0 0
1 1 1
2 2 2
0 3 3
1 4 4
2 5 5

Query#

%sql SELECT * FROM my_data LIMIT 2
*  sqlite://
Done.
index x y
0 0 0
1 1 1
%%sql
SELECT * FROM my_data LIMIT 2
*  sqlite://
Done.
index x y
0 0 0
1 1 1

Programmatic SQL queries#

QUERY = """
SELECT *
FROM my_data
LIMIT 3
"""

%sql $QUERY
*  sqlite://
(sqlite3.OperationalError) near "$QUERY": syntax error
[SQL: $QUERY]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

Templated SQL queries#

from string import Template

template = Template(
    """
SELECT *
FROM my_data
LIMIT $limit
"""
)

limit_one = template.substitute(limit=1)
limit_two = template.substitute(limit=2)

Important: Ensure you sanitize the input parameters; as malicious parameters will be able to run arbitrary SQL queries.

%sql $limit_one
*  sqlite://
(sqlite3.OperationalError) near "$limit_one": syntax error
[SQL: $limit_one]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
%sql $limit_two
*  sqlite://
(sqlite3.OperationalError) near "$limit_two": syntax error
[SQL: $limit_two]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

Compose large queries#

%%sql --save larger_than_one --no-execute
SELECT x, y
FROM my_data
WHERE x > 1
*  sqlite://
Skipping execution...
%%sql --with larger_than_one
SELECT x, y
FROM larger_than_one
WHERE y < 5
*  sqlite://
Done.
x y
2 2
3 3
4 4

Convert result to pandas.DataFrame#

result = %sql SELECT * FROM my_data
df = result.DataFrame()
print(type(df))
df.head()
*  sqlite://
Done.
<class 'pandas.core.frame.DataFrame'>
index x y
0 0 0 0
1 1 1 1
2 2 2 2
3 0 3 3
4 1 4 4

Store as CSV#

result = %sql SELECT * FROM my_data
result.csv(filename="my_data.csv")
*  sqlite://
Done.

Run query from file#

from pathlib import Path

# generate sql file
Path("my-query.sql").write_text(
    """
SELECT *
FROM my_data
LIMIT 3
"""
)
31
%sql --file my-query.sql
*  sqlite://
Done.
index x y
0 0 0
1 1 1
2 2 2