Ir al contenido

Notebooks

Esta página aún no está disponible en tu idioma.

DataSpoc Lens integrates with JupyterLab and Marimo notebooks. Tables from your registered buckets are automatically mounted when the notebook starts.

Terminal window
pip install dataspoc-lens[jupyter]

This installs JupyterLab, JupySQL, duckdb-engine, ipykernel, NumPy, and Pandas.

Terminal window
dataspoc-lens notebook

JupyterLab opens in your browser with all data lake tables pre-mounted as DuckDB views.

The %%sql magic is available in all notebook cells:

%%sql
SELECT * FROM orders LIMIT 10

A conn variable is available with the DuckDB connection:

# Direct DuckDB query
result = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
print(f"Total orders: {result[0]}")
import pandas as pd
# Query to DataFrame
df = conn.execute("""
SELECT status, COUNT(*) as cnt, AVG(total) as avg_total
FROM orders
GROUP BY status
""").fetchdf()
# Plot
df.plot(x='status', y='avg_total', kind='bar', title='Average Order Value by Status')
# Cell 1: Explore available tables
%%sql
SHOW TABLES
# Cell 2: Check schema
%%sql
DESCRIBE orders
# Cell 3: Analyze data
%%sql
SELECT
DATE_TRUNC('month', order_date) as month,
COUNT(*) as order_count,
SUM(total) as revenue
FROM orders
GROUP BY month
ORDER BY month
# Cell 4: Visualize with Pandas
df = conn.execute("""
SELECT DATE_TRUNC('month', order_date) as month,
SUM(total) as revenue
FROM orders
GROUP BY month ORDER BY month
""").fetchdf()
df.plot(x='month', y='revenue', kind='line', title='Monthly Revenue')
Terminal window
pip install dataspoc-lens[marimo]

This installs Marimo, NumPy, and Pandas.

Terminal window
dataspoc-lens notebook --marimo

Marimo opens in your browser with tables pre-mounted. Marimo notebooks are reactive — changing a cell automatically re-runs dependent cells.

import marimo as mo
# Cell: Query data
result = conn.execute("""
SELECT status, COUNT(*) as cnt
FROM orders GROUP BY status
""").fetchdf()
mo.ui.table(result)
# Cell: Interactive filter
status_filter = mo.ui.dropdown(
options=["completed", "pending", "cancelled"],
label="Order Status"
)
status_filter
# Cell: Filtered results (reactive - updates when dropdown changes)
filtered = conn.execute(f"""
SELECT * FROM orders
WHERE status = '{status_filter.value}'
LIMIT 100
""").fetchdf()
mo.ui.table(filtered)