Notebooks
Esta página aún no está disponible en tu idioma.
DataSpoc Lens integrates with JupyterLab and Marimo notebooks. Tables from your registered buckets are automatically mounted when the notebook starts.
JupyterLab
Section titled “JupyterLab”Install
Section titled “Install”pip install dataspoc-lens[jupyter]This installs JupyterLab, JupySQL, duckdb-engine, ipykernel, NumPy, and Pandas.
Launch
Section titled “Launch”dataspoc-lens notebookJupyterLab opens in your browser with all data lake tables pre-mounted as DuckDB views.
Using SQL magic
Section titled “Using SQL magic”The %%sql magic is available in all notebook cells:
%%sqlSELECT * FROM orders LIMIT 10Using the DuckDB connection
Section titled “Using the DuckDB connection”A conn variable is available with the DuckDB connection:
# Direct DuckDB queryresult = conn.execute("SELECT COUNT(*) FROM orders").fetchone()print(f"Total orders: {result[0]}")Working with Pandas
Section titled “Working with Pandas”import pandas as pd
# Query to DataFramedf = conn.execute(""" SELECT status, COUNT(*) as cnt, AVG(total) as avg_total FROM orders GROUP BY status""").fetchdf()
# Plotdf.plot(x='status', y='avg_total', kind='bar', title='Average Order Value by Status')Example notebook cells
Section titled “Example notebook cells”# Cell 1: Explore available tables%%sqlSHOW TABLES# Cell 2: Check schema%%sqlDESCRIBE orders# Cell 3: Analyze data%%sqlSELECT DATE_TRUNC('month', order_date) as month, COUNT(*) as order_count, SUM(total) as revenueFROM ordersGROUP BY monthORDER BY month# Cell 4: Visualize with Pandasdf = conn.execute(""" SELECT DATE_TRUNC('month', order_date) as month, SUM(total) as revenue FROM orders GROUP BY month ORDER BY month""").fetchdf()
df.plot(x='month', y='revenue', kind='line', title='Monthly Revenue')Marimo
Section titled “Marimo”Install
Section titled “Install”pip install dataspoc-lens[marimo]This installs Marimo, NumPy, and Pandas.
Launch
Section titled “Launch”dataspoc-lens notebook --marimoMarimo opens in your browser with tables pre-mounted. Marimo notebooks are reactive — changing a cell automatically re-runs dependent cells.
Example Marimo cells
Section titled “Example Marimo cells”import marimo as mo
# Cell: Query dataresult = conn.execute(""" SELECT status, COUNT(*) as cnt FROM orders GROUP BY status""").fetchdf()
mo.ui.table(result)# Cell: Interactive filterstatus_filter = mo.ui.dropdown( options=["completed", "pending", "cancelled"], label="Order Status")status_filter# Cell: Filtered results (reactive - updates when dropdown changes)filtered = conn.execute(f""" SELECT * FROM orders WHERE status = '{status_filter.value}' LIMIT 100""").fetchdf()
mo.ui.table(filtered)