Connecting to a database#

JupySQL offers several ways to configure a database connection. In this guide, we discuss the pros and cons of each.

Using the connector widget#

The easiest way to connect to a database, is via the connector widget. To learn more, see: %sqlcmd connect

create

Connecting with a .ini file#

Changed in version 0.10.0.

Using a .ini file is the recommended way to connect to databases. By default, JupySQL reads the ~/.jupysql/connections.ini file, but you can change this setting. A .ini file looks like this:

[mydb]
drivername = postgresql
username = person
password = mypass
host = localhost
port = 5432
database = db

To learn more, see: Using a connection file.

Connect with a URL string#

Important

If you connect using a URL string, do not hardcode your password in your notebook, see: Building URL strings securely

Connection strings follow the SQLAlchemy URL format.

Database URLs have the following format:

dialect+driver://username:password@host:port/database

In-memory databases have the following format:

sqlite://
duckdb://

Building URL strings securely#

To connect more securely, you can dynamically build your URL string so your password isn’t hardcoded; you can use the getpass function so you’re prompted for your password whenever you want to connect:

from getpass import getpass

password = getpass()

When you execute the cell above in a notebook, a text box will appear and whatever you type will be stored in the password variable.

Then, you can build your connection string:

db_url = f"postgresql://user:{password}@localhost/database"

Create an engine and connect:

from sqlalchemy import create_engine

engine = create_engine(db_url)
%load_ext sql
%sql engine

Important

Unlike ipython-sql, JupySQL doesn’t allow expanding your database URL with the $ character:

# this doesn't work in JupySQL!
db_url = "dialect+driver://username:password@host:port/database"
%sql $db_url

Securely storing your password#

Using a .ini file has the advantage of not having to hardcode your password. However, it’s still stored in a file in plain text. On the other hand, using getpass will always prompt you for your password, which isn’t ideal when running scheduled notebooks.

The most secure way to store your password is to use keyring, a library that uses the operating system credentials manager to securely store your password. The caveat is that the configuration settings depend on your operating system.

%pip install keyring --quiet

Once keyring is configured. Execute the following in your notebook:

import keyring

keyring.set_password("my_database", "my_username", "my_password")

Then, delete the cell above (so your password isn’t hardcoded!). Now, you can retrieve your password with:

from sqlalchemy import create_engine
import keyring

password = keyring.get_password("my_database", "my_username")
db_url = f"postgresql://user:{password}@localhost/database"

Create an engine and connect:

engine = create_engine(db_url)
%load_ext sql
%sql engine

Tip

If you have issues using keyring, send us a message on Slack.

Passing custom arguments to a URL#

Connection arguments not whitelisted by SQLALchemy can be provided with --connection_arguments. See SQLAlchemy Args.

Here’s an example using SQLite:

%load_ext sql
%sql --connection_arguments '{"timeout":10}' sqlite://

Connecting via an environment variable#

Set the DATABASE_URL environment variable, and %sql will automatically load it. You can do this either by setting the environment variable from your terminal or in your notebook:

from getpass import getpass
from os import environ

password = getpass()
environ["DATABASE_URL"] = f"postgresql://user:{password}@localhost/database"
%load_ext sql
%sql
   duckdb://
*  sqlite://

Using an existing sqlalchemy.engine.Engine#

You can use an existing Engine by passing the variable name to %sql.

import pandas as pd
from sqlalchemy.engine import create_engine
engine = create_engine("sqlite://")
df = pd.DataFrame({"x": range(5)})
df.to_sql("numbers", engine)
5
%load_ext sql
%sql engine
%%sql
SELECT * FROM numbers
   duckdb://
*  sqlite://
Done.
index x
0 0
1 1
2 2
3 3
4 4

DBAPI connections#

New in version 0.7.2.

If you are using a database that is not supported by SQLAlchemy but follows the DB API 2.0 specification, you can still use JupySQL.

Note

We currently support %sql, %sqlplot, and the ggplot API when using custom connection. However, please be advised that there may be some features/functionalities that won’t be fully compatible with JupySQL.

For this example we’ll generate a DuckDB connection, using its native connect method.

First, let’s import the library and initiazlie a new connection

import duckdb

conn = duckdb.connect()

Now, load %sql and initialize it with our DuckDB connection.

%sql conn

Download some data

import urllib

urllib.request.urlretrieve(
    "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv",
    "penguins.csv",
)
('penguins.csv', <http.client.HTTPMessage at 0x7f3b19a220e0>)

You’re all set

%sql select * from penguins.csv limit 3
*  '<duckdb.DuckDBPyConnection object at 0x7f3b19a6b7f0>'
   duckdb://
   sqlite://
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex
Adelie Torgersen 39.1 18.7 181 3750 MALE
Adelie Torgersen 39.5 17.4 186 3800 FEMALE
Adelie Torgersen 40.3 18.0 195 3250 FEMALE

For a more detailed example, see QuestDB tutorial