Notebooks
DataSpoc Lens integrates with JupyterLab and Marimo notebooks. Tables from your registered buckets are automatically mounted when the notebook starts.
JupyterLab
Section titled “JupyterLab”Install
Section titled “Install”pip install dataspoc-lens[jupyter]This installs JupyterLab, JupySQL, duckdb-engine, ipykernel, NumPy, and Pandas.
Launch
Section titled “Launch”dataspoc-lens notebookJupyterLab opens in your browser with all data lake tables pre-mounted as DuckDB views.
Using SQL magic
Section titled “Using SQL magic”The %%sql magic is available in all notebook cells:
%%sqlSELECT * FROM orders LIMIT 10Using the DuckDB connection
Section titled “Using the DuckDB connection”A conn variable is available with the DuckDB connection:
# Direct DuckDB queryresult = conn.execute("SELECT COUNT(*) FROM orders").fetchone()print(f"Total orders: {result[0]}")Working with Pandas
Section titled “Working with Pandas”import pandas as pd
# Query to DataFramedf = conn.execute(""" SELECT status, COUNT(*) as cnt, AVG(total) as avg_total FROM orders GROUP BY status""").fetchdf()
# Plotdf.plot(x='status', y='avg_total', kind='bar', title='Average Order Value by Status')Example notebook cells
Section titled “Example notebook cells”# Cell 1: Explore available tables%%sqlSHOW TABLES# Cell 2: Check schema%%sqlDESCRIBE orders# Cell 3: Analyze data%%sqlSELECT DATE_TRUNC('month', order_date) as month, COUNT(*) as order_count, SUM(total) as revenueFROM ordersGROUP BY monthORDER BY month# Cell 4: Visualize with Pandasdf = conn.execute(""" SELECT DATE_TRUNC('month', order_date) as month, SUM(total) as revenue FROM orders GROUP BY month ORDER BY month""").fetchdf()
df.plot(x='month', y='revenue', kind='line', title='Monthly Revenue')Marimo
Section titled “Marimo”Install
Section titled “Install”pip install dataspoc-lens[marimo]This installs Marimo, NumPy, and Pandas.
Launch
Section titled “Launch”dataspoc-lens notebook --marimoMarimo opens in your browser with tables pre-mounted. Marimo notebooks are reactive — changing a cell automatically re-runs dependent cells.
Example Marimo cells
Section titled “Example Marimo cells”import marimo as mo
# Cell: Query dataresult = conn.execute(""" SELECT status, COUNT(*) as cnt FROM orders GROUP BY status""").fetchdf()
mo.ui.table(result)# Cell: Interactive filterstatus_filter = mo.ui.dropdown( options=["completed", "pending", "cancelled"], label="Order Status")status_filter# Cell: Filtered results (reactive - updates when dropdown changes)filtered = conn.execute(f""" SELECT * FROM orders WHERE status = '{status_filter.value}' LIMIT 100""").fetchdf()
mo.ui.table(filtered)