MySQL#
In this tutorial, we’ll see how to query MySQL from Jupyter. Optionally, you can spin up a testing server.
Tip
If you encounter issues, feel free to join our community and we’ll be happy to help!
Installing the MySQL driver#
To run this tutorial, you need to install the mysqlclient
package.
Note
We highly recommend you that you install it using conda
, since it’ll also install mysql-connector-c
; if you want to use pip
, then you need to install mysql-connector-c
and then mysqlclient
.
%conda install mysqlclient -c conda-forge --quiet
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /Users/eduardo/miniconda3/envs/jupysql
added / updated specs:
- mysqlclient
The following NEW packages will be INSTALLED:
mysql-connector-c pkgs/main/osx-arm64::mysql-connector-c-6.1.11-h4a942e0_1
mysqlclient pkgs/main/osx-arm64::mysqlclient-2.0.3-py39hc377ac9_1
The following packages will be SUPERSEDED by a higher-priority channel:
ca-certificates pkgs/main::ca-certificates-2023.01.10~ --> conda-forge::ca-certificates-2022.12.7-h4653dfc_0
certifi pkgs/main/osx-arm64::certifi-2022.12.~ --> conda-forge/noarch::certifi-2022.12.7-pyhd8ed1ab_0
openssl pkgs/main::openssl-1.1.1t-h1a28f6b_0 --> conda-forge::openssl-1.1.1t-h03a7124_0
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Note: you may need to restart the kernel to use updated packages.
Starting a MySQL server with Docker#
If you don’t have a MySQL Server running or you want to spin up one for testing, you can do it with the official Docker image.
To start the server:
%%bash
docker run --name mysql -e MYSQL_DATABASE=db \
-e MYSQL_USER=user \
-e MYSQL_PASSWORD=password \
-e MYSQL_ROOT_PASSWORD=password \
-p 3306:3306 -d mysql
7c8acff152e0f2eac2a5c954ca01f1ddc99419d2e342560d94c6955c036d7d93
Ensure that the container is running:
%%bash
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7c8acff152e0 mysql "docker-entrypoint.s…" 4 seconds ago Up 4 seconds 0.0.0.0:3306->3306/tcp, 33060/tcp mysql
We need to make a small configuration change, so do the following:
Open a new terminal and execute: docker exec -it mysql bash
Then: mysql --user=root --password
When prompted for a password, type: password
Once the MySQL console appears, execute:
ALTER USER user
IDENTIFIED WITH mysql_native_password
BY 'password';
Exit the MySQL console with: exit
Exit the container with: exit
The session should look like this:
docker exec -it mysql bash
bash-4.4# mysql --user=root --password
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 8.0.31 MySQL Community Server - GPL
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> ALTER USER user
-> IDENTIFIED WITH mysql_native_password
-> BY 'password';
Query OK, 0 rows affected (0.01 sec)
mysql> exit
Bye
bash-4.4# exit
exit
Load sample data#
Now, let’s fetch some sample data. We’ll be using the NYC taxi dataset:
%pip install pandas pyarrow --quiet
import pandas as pd
df = pd.read_parquet(
"https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2021-01.parquet"
)
df.shape
(1369769, 19)
As you can see, this chunk of data contains ~1.4M rows, loading the data will take about a minute:
from sqlalchemy import create_engine
engine = create_engine("mysql+mysqldb://user:password@127.0.0.1:3306/db")
df.to_sql(name="taxi", con=engine, chunksize=100_000)
engine.dispose()
Query#
Now, let’s install JupySQL, authenticate and start querying the data!
%pip install jupysql --quiet
%load_ext sql
%sql mysql+mysqldb://user:password@127.0.0.1:3306/db
Note: you may need to restart the kernel to use updated packages.
The sql extension is already loaded. To reload it, use:
%reload_ext sql
Important
If the cell above fails, you might have some missing packages. Message us on Slack and we’ll help you!
%%sql
SELECT COUNT(*) FROM taxi
* mysql+mysqldb://user:***@127.0.0.1:3306/db
1 rows affected.
COUNT(*) |
---|
1369769 |
%%sql
SELECT * FROM taxi
LIMIT 3
* mysql+mysqldb://user:***@127.0.0.1:3306/db
3 rows affected.
index | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | airport_fee |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2021-01-01 00:30:10 | 2021-01-01 00:36:12 | 1.0 | 2.1 | 1.0 | N | 142 | 43 | 2 | 8.0 | 3.0 | 0.5 | 0.0 | 0.0 | 0.3 | 11.8 | 2.5 | None |
1 | 1 | 2021-01-01 00:51:20 | 2021-01-01 00:52:19 | 1.0 | 0.2 | 1.0 | N | 238 | 151 | 2 | 3.0 | 0.5 | 0.5 | 0.0 | 0.0 | 0.3 | 4.3 | 0.0 | None |
2 | 1 | 2021-01-01 00:43:30 | 2021-01-01 01:11:06 | 1.0 | 14.7 | 1.0 | N | 132 | 165 | 1 | 42.0 | 0.5 | 0.5 | 8.65 | 0.0 | 0.3 | 51.95 | 0.0 | None |
Clean up#
To stop and remove the container:
%%bash
docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7c8acff152e0 mysql "docker-entrypoint.s…" 9 minutes ago Up 9 minutes 0.0.0.0:3306->3306/tcp, 33060/tcp mysql
%%bash
docker container stop mysql
mysql
%%bash
docker container rm mysql
mysql
%%bash
docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES