Skip to content

Notebooks

DataSpoc Lens integrates with JupyterLab and Marimo notebooks. Tables from your registered buckets are automatically mounted when the notebook starts.

Terminal window
pip install dataspoc-lens[jupyter]

This installs JupyterLab, JupySQL, duckdb-engine, ipykernel, NumPy, and Pandas.

Terminal window
dataspoc-lens notebook

JupyterLab opens in your browser with all data lake tables pre-mounted as DuckDB views.

The %%sql magic is available in all notebook cells:

%%sql
SELECT * FROM orders LIMIT 10

A conn variable is available with the DuckDB connection:

# Direct DuckDB query
result = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
print(f"Total orders: {result[0]}")
import pandas as pd
# Query to DataFrame
df = conn.execute("""
SELECT status, COUNT(*) as cnt, AVG(total) as avg_total
FROM orders
GROUP BY status
""").fetchdf()
# Plot
df.plot(x='status', y='avg_total', kind='bar', title='Average Order Value by Status')
# Cell 1: Explore available tables
%%sql
SHOW TABLES
# Cell 2: Check schema
%%sql
DESCRIBE orders
# Cell 3: Analyze data
%%sql
SELECT
DATE_TRUNC('month', order_date) as month,
COUNT(*) as order_count,
SUM(total) as revenue
FROM orders
GROUP BY month
ORDER BY month
# Cell 4: Visualize with Pandas
df = conn.execute("""
SELECT DATE_TRUNC('month', order_date) as month,
SUM(total) as revenue
FROM orders
GROUP BY month ORDER BY month
""").fetchdf()
df.plot(x='month', y='revenue', kind='line', title='Monthly Revenue')
Terminal window
pip install dataspoc-lens[marimo]

This installs Marimo, NumPy, and Pandas.

Terminal window
dataspoc-lens notebook --marimo

Marimo opens in your browser with tables pre-mounted. Marimo notebooks are reactive — changing a cell automatically re-runs dependent cells.

import marimo as mo
# Cell: Query data
result = conn.execute("""
SELECT status, COUNT(*) as cnt
FROM orders GROUP BY status
""").fetchdf()
mo.ui.table(result)
# Cell: Interactive filter
status_filter = mo.ui.dropdown(
options=["completed", "pending", "cancelled"],
label="Order Status"
)
status_filter
# Cell: Filtered results (reactive - updates when dropdown changes)
filtered = conn.execute(f"""
SELECT * FROM orders
WHERE status = '{status_filter.value}'
LIMIT 100
""").fetchdf()
mo.ui.table(filtered)