Oracle Database#

This tutorial will show you how to get an Oracle instance up and running locally to test JupySQL. You can run this in a Jupyter notebook.

Pre-requisites#

To run this tutorial, you need to install following Python packages:

%pip install oracledb --quiet
Note: you may need to restart the kernel to use updated packages.

Start Oracle instance#

We use the non-official image gvenzl/oracle-free to initial the instance, and database users (this will take 1-2 minutes):

%%bash
docker run --name oracle \
  -e ORACLE_PASSWORD=ploomber_app_admin_password \
  -e APP_USER=ploomber_app \
  -e APP_USER_PASSWORD=ploomber_app_password \
  -p 1521:1521 -d gvenzl/oracle-free
cc531fdc8802c40aa666a8b3eb52debda71fc0e64bc00ef956da22314dc9b971

Our database is running, let’s load some data!

Load sample data#

Now, let’s fetch some sample data. We’ll be using the iris.csv:

import pandas as pd

df = pd.read_csv(
    "https://github.com/Muhd-Shahid/Write-Raw-File-into-Database-Server/raw/main/iris.csv",  # noqa: E501
    index_col=False,
)
df.shape
(150, 5)
from sqlalchemy import create_engine
from sqlalchemy.types import Float

engine = create_engine(
    "oracle+oracledb://ploomber_app:ploomber_app_password@localhost:1521/?service_name=FREEPDB1"  # noqa: E501
)
df.to_sql(
    name="iris",
    con=engine,
    chunksize=1000,
    if_exists="replace",
    index=False,
    dtype={
        "sepal_length": Float(),
        "sepal_width": Float(),
        "petal_length": Float(),
        "petal_width": Float(),
    },
)
engine.dispose()

Query#

Now, let’s start JupySQL, authenticate, and start querying the data!

%load_ext sql
%sql oracle+oracledb://ploomber_app:ploomber_app_password@localhost:1521/?service_name=FREEPDB1

List the tables in the database:

%sqlcmd tables
Name
iris

Query some data in iris table

%sql SELECT * FROM iris FETCH NEXT 5 ROWS ONLY
*  oracle+oracledb://ploomber_app:***@localhost:1521/?service_name=FREEPDB1
0 rows affected.
sepal_length sepal_width petal_length petal_width species
6.7 2.5 5.8 1.8 virginica
7.2 3.6 6.1 2.5 virginica
6.5 3.2 5.1 2.0 virginica
6.4 2.7 5.3 1.9 virginica
6.8 3.0 5.5 2.1 virginica

Query our data:

%%sql
SELECT COUNT(*) FROM iris
*  oracle+oracledb://ploomber_app:***@localhost:1521/?service_name=FREEPDB1
0 rows affected.
COUNT(*)
150

Parametrize queries#

threshold = 5.0
%%sql
SELECT COUNT(*) FROM iris
WHERE sepal_length < {{threshold}}
*  oracle+oracledb://ploomber_app:***@localhost:1521/?service_name=FREEPDB1
0 rows affected.
COUNT(*)
22

CTEs#

%%sql --save saved_cte --no-execute
SELECT * FROM iris
WHERE sepal_length > 6.0
*  oracle+oracledb://ploomber_app:***@localhost:1521/?service_name=FREEPDB1
Skipping execution...
%sql --with saved_cte SELECT * FROM saved_cte FETCH NEXT 5 ROWS ONLY
*  oracle+oracledb://ploomber_app:***@localhost:1521/?service_name=FREEPDB1
0 rows affected.
sepal_length sepal_width petal_length petal_width species
6.7 2.5 5.8 1.8 virginica
7.2 3.6 6.1 2.5 virginica
6.5 3.2 5.1 2.0 virginica
6.4 2.7 5.3 1.9 virginica
6.8 3.0 5.5 2.1 virginica

This is what JupySQL executes:

query = %sqlcmd snippets saved_cte
print(query)
WITH
SELECT * FROM iris
WHERE sepal_length > 6.0

Clean up#

To stop and remove the container:

! docker container ls
CONTAINER ID   IMAGE              COMMAND                  CREATED         STATUS         PORTS                                       NAMES
cc531fdc8802   gvenzl/oracle-free   "container-entrypoin…"   7 minutes ago   Up 7 minutes   0.0.0.0:1521->1521/tcp, :::1521->1521/tcp   oracle
%%capture out
! docker container ls --filter name=oracle --quiet
container_id = out.stdout.strip()
print(f"Container id: {container_id}")
Container id: cc531fdc8802

Remove the container

! docker container stop {container_id}
! docker container rm {container_id}
cc531fdc8802
cc531fdc8802
! docker container ls
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES