Python SDK

The LensClient class provides programmatic access to all Lens capabilities from Python code. It connects to your registered buckets, mounts DuckDB views, and exposes query, AI, and cache operations as method calls.

Import

from dataspoc_lens import LensClient

Basic usage

from dataspoc_lens import LensClient

client = LensClient()

# List tables
tables = client.tables()
print(tables)  # ['customers', 'orders', 'products']

# Run a query
result = client.query("SELECT COUNT(*) as total FROM orders")
print(result["rows"])  # [[48680]]

# Clean up
client.close()

Context manager

Use LensClient as a context manager to ensure the connection is closed automatically:

from dataspoc_lens import LensClient

with LensClient() as client:
    tables = client.tables()
    result = client.query("SELECT * FROM orders LIMIT 5")
    print(result["columns"])
    print(result["rows"])

Methods

tables()

Return the list of mounted table names.

tables = client.tables()
# ['customers', 'orders', 'products']

Returns: list[str]

schema(table)

Return column metadata for a table.

columns = client.schema("orders")
# [
#     {"column_name": "order_id", "data_type": "INTEGER"},
#     {"column_name": "customer_id", "data_type": "INTEGER"},
#     {"column_name": "order_date", "data_type": "DATE"},
#     {"column_name": "total", "data_type": "DOUBLE"},
#     {"column_name": "status", "data_type": "VARCHAR"},
# ]

Parameters: table (str) — table name

Returns: list[dict] — each dict has column_name and data_type

query(sql)

Execute a SQL query and return results.

result = client.query("SELECT status, COUNT(*) as cnt FROM orders GROUP BY status")
print(result["columns"])   # ['status', 'cnt']
print(result["rows"])      # [['completed', 32100], ['pending', 8450], ...]
print(result["row_count"]) # 3
print(result["duration"])  # 0.089

Parameters: sql (str) — SQL query to execute

Returns: dict with keys:

columns — list of column names
rows — list of row lists
row_count — number of rows returned
duration — query time in seconds

ask(question, **kwargs)

Translate a natural language question to SQL, execute it, and return the result.

result = client.ask("What are the top 5 customers by spending?")
print(result["sql"])       # 'SELECT c.name, SUM(o.total) ...'
print(result["columns"])   # ['name', 'total_spent']
print(result["rows"])      # [['Alice Smith', 15420.5], ...]
print(result["error"])     # None

Parameters:

question (str) — natural language question
**kwargs — forwarded to the AI module (e.g., provider, api_key, model, debug)

Returns: dict with keys:

sql — generated SQL query
columns — list of column names
rows — list of row lists
duration — total time in seconds
error — error message or None

cache_status()

Return metadata for all cached tables.

cached = client.cache_status()
# [
#     {
#         "table": "orders",
#         "status": "fresh",
#         "cached_at": "2026-04-15T10:30:00",
#         "size_bytes": 12902400,
#     }
# ]

Returns: list[dict] — each dict has table, status, cached_at, size_bytes

cache_refresh(table)

Force-refresh the local cache for a table.

info = client.cache_refresh("orders")
print(info)
# {"cached_at": "2026-04-15T14:00:00", "size_bytes": 12902400, "file_count": 4}

Parameters: table (str) — table name

Returns: dict with cached_at, size_bytes, file_count

Raises: ValueError if the table is not found in any registered bucket.

cache_refresh_stale()

Refresh all stale cached tables.

refreshed = client.cache_refresh_stale()
# [{"table": "customers", "cached_at": "...", "size_bytes": 2200000, "file_count": 1}]

Returns: list[dict] — one entry per refreshed table

cache_clear(table=None)

Clear cached data. If table is given, only that table is cleared. Otherwise all caches are removed.

# Clear specific table
cleared = client.cache_clear("orders")
# ['orders']

# Clear all
cleared = client.cache_clear()
# ['orders', 'customers', 'products']

Parameters: table (str | None) — optional table name

Returns: list[str] — names of cleared tables

close()

Close the underlying DuckDB connection.

client.close()

Called automatically when using the context manager.

Full example

from dataspoc_lens import LensClient

with LensClient() as client:
    # Explore
    for table in client.tables():
        cols = client.schema(table)
        print(f"{table}: {len(cols)} columns")

    # Query
    result = client.query("""
        SELECT
            c.name,
            COUNT(o.order_id) as orders,
            SUM(o.total) as revenue
        FROM customers c
        JOIN orders o ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY revenue DESC
        LIMIT 10
    """)

    for row in result["rows"]:
        print(f"{row[0]}: {row[2]:,.2f}")

    # AI
    answer = client.ask("Which product category has the highest return rate?")
    if answer["error"] is None:
        print(f"SQL: {answer['sql']}")
        for row in answer["rows"]:
            print(row)

    # Cache management
    client.cache_refresh("orders")
    status = client.cache_status()
    for entry in status:
        print(f"{entry['table']}: {entry['status']}")