Table Explorer#

New in version 0.7.6:

pip install jupysql --upgrade

In this guide, we demonstrate how to use JupySQL’s table explorer to visualize SQL tables in HTML format and interact with them efficiently. By running SQL queries in the background instead of loading the data into memory, we minimize the resource consumption and processing time required for handling large datasets, making the interaction with the SQL tables faster and more streamlined.

Let’s start by preparing our dataset. We’ll be using the NYC taxi dataset.

Download the data#

from pathlib import Path
from urllib.request import urlretrieve

url = "https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2021-01.parquet"

if not Path("yellow_tripdata_2021-01.parquet").is_file():
    urlretrieve(url, "yellow_tripdata_2021.parquet")

Installation#

%pip install jupysql --upgrade --quiet
Note: you may need to restart the kernel to use updated packages.

Set connection#

After our dataset is ready, we should set our connection.

For this demonstration, we’ll be using the DuckDB connection.

%load_ext sql
%sql duckdb://
Connecting to 'duckdb://'

Create the table#

To create the table, use the explore attribute and specify the name of the table that was just downloaded.

%sqlcmd explore --table "yellow_tripdata_2021.parquet"

Parametrizing arguments#

JupySQL supports variable expansion of arguments in the form of {{variable}}. This allows the user to specify arguments with placeholders that can be replaced by variables dynamically.

table_name = "yellow_tripdata_2021.parquet"
%sqlcmd explore --table {{table_name}}