Testing with sqlcmd#

Note

This example uses SQLite but the same commands work for other databases.

%load_ext sql
%sql sqlite://
Connecting to 'sqlite://'

Let’s create a sample table:

%%sql sqlite://
CREATE TABLE writer (first_name, last_name, year_of_death);
INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
Hide code cell output
1 rows affected.
1 rows affected.

Run Tests on Column#

Use %sqlcmd test to run quantitative tests on your dataset.

For example, to see if all the values in the column birth_year are less than 2000, we can use:

%sqlcmd test --table writer --column year_of_death --less-than 2000
True

Because both William Shakespeare and Bertold Brecht died before the year 2000, this command will return True.

However, if we were to run:

%sqlcmd test --table writer --column year_of_death --greater 1700
greater:

+------------+-------------+---------------+
| first_name |  last_name  | year_of_death |
+------------+-------------+---------------+
|  William   | Shakespeare |      1616     |
+------------+-------------+---------------+
UsageError: The above values do not match your test requirements.

We see that a value that failed our test was William Shakespeare, as he died in 1616.

We can also pass several comparator arguments to test:

%sqlcmd test --table writer --column year_of_death --greater-or-equal 1616 --less-than-or-equal 1956
True

Here, because Shakespeare died in 1616 and Brecht in 1956, our test passes.

However, if we search for a window between 1800 and 1900:

%sqlcmd test --table writer --column year_of_death --greater 1800 --less-than 1900
greater:

+------------+-------------+---------------+
| first_name |  last_name  | year_of_death |
+------------+-------------+---------------+
|  William   | Shakespeare |      1616     |
+------------+-------------+---------------+

less_than:

+------------+-----------+---------------+
| first_name | last_name | year_of_death |
+------------+-----------+---------------+
|  Bertold   |   Brecht  |      1956     |
+------------+-----------+---------------+
UsageError: The above values do not match your test requirements.

The test fails, returning both Shakespeare and Brecht.

Currently, 5 different comparator arguments are supported: greater, greater-or-equal, less-than, less-than-or-equal, and no-nulls.

Parametrizing arguments#

JupySQL supports variable expansion of arguments in the form of {{variable}}. Let’s see an example of running tests using parametrization:

table = "writer"
column = "year_of_death"
limit = "2000"
%sqlcmd test --table {{table}} --column {{column}} --less-than {{limit}}
True