Skip to content

Python SDK

The LensClient class provides programmatic access to all Lens capabilities from Python code. It connects to your registered buckets, mounts DuckDB views, and exposes query, AI, and cache operations as method calls.

from dataspoc_lens import LensClient
from dataspoc_lens import LensClient
client = LensClient()
# List tables
tables = client.tables()
print(tables) # ['customers', 'orders', 'products']
# Run a query
result = client.query("SELECT COUNT(*) as total FROM orders")
print(result["rows"]) # [[48680]]
# Clean up
client.close()

Use LensClient as a context manager to ensure the connection is closed automatically:

from dataspoc_lens import LensClient
with LensClient() as client:
tables = client.tables()
result = client.query("SELECT * FROM orders LIMIT 5")
print(result["columns"])
print(result["rows"])

Return the list of mounted table names.

tables = client.tables()
# ['customers', 'orders', 'products']

Returns: list[str]

Return column metadata for a table.

columns = client.schema("orders")
# [
# {"column_name": "order_id", "data_type": "INTEGER"},
# {"column_name": "customer_id", "data_type": "INTEGER"},
# {"column_name": "order_date", "data_type": "DATE"},
# {"column_name": "total", "data_type": "DOUBLE"},
# {"column_name": "status", "data_type": "VARCHAR"},
# ]

Parameters: table (str) — table name

Returns: list[dict] — each dict has column_name and data_type

Execute a SQL query and return results.

result = client.query("SELECT status, COUNT(*) as cnt FROM orders GROUP BY status")
print(result["columns"]) # ['status', 'cnt']
print(result["rows"]) # [['completed', 32100], ['pending', 8450], ...]
print(result["row_count"]) # 3
print(result["duration"]) # 0.089

Parameters: sql (str) — SQL query to execute

Returns: dict with keys:

  • columns — list of column names
  • rows — list of row lists
  • row_count — number of rows returned
  • duration — query time in seconds

Translate a natural language question to SQL, execute it, and return the result.

result = client.ask("What are the top 5 customers by spending?")
print(result["sql"]) # 'SELECT c.name, SUM(o.total) ...'
print(result["columns"]) # ['name', 'total_spent']
print(result["rows"]) # [['Alice Smith', 15420.5], ...]
print(result["error"]) # None

Parameters:

  • question (str) — natural language question
  • **kwargs — forwarded to the AI module (e.g., provider, api_key, model, debug)

Returns: dict with keys:

  • sql — generated SQL query
  • columns — list of column names
  • rows — list of row lists
  • duration — total time in seconds
  • error — error message or None

Return metadata for all cached tables.

cached = client.cache_status()
# [
# {
# "table": "orders",
# "status": "fresh",
# "cached_at": "2026-04-15T10:30:00",
# "size_bytes": 12902400,
# }
# ]

Returns: list[dict] — each dict has table, status, cached_at, size_bytes

Force-refresh the local cache for a table.

info = client.cache_refresh("orders")
print(info)
# {"cached_at": "2026-04-15T14:00:00", "size_bytes": 12902400, "file_count": 4}

Parameters: table (str) — table name

Returns: dict with cached_at, size_bytes, file_count

Raises: ValueError if the table is not found in any registered bucket.

Refresh all stale cached tables.

refreshed = client.cache_refresh_stale()
# [{"table": "customers", "cached_at": "...", "size_bytes": 2200000, "file_count": 1}]

Returns: list[dict] — one entry per refreshed table

Clear cached data. If table is given, only that table is cleared. Otherwise all caches are removed.

# Clear specific table
cleared = client.cache_clear("orders")
# ['orders']
# Clear all
cleared = client.cache_clear()
# ['orders', 'customers', 'products']

Parameters: table (str | None) — optional table name

Returns: list[str] — names of cleared tables

Close the underlying DuckDB connection.

client.close()

Called automatically when using the context manager.

from dataspoc_lens import LensClient
with LensClient() as client:
# Explore
for table in client.tables():
cols = client.schema(table)
print(f"{table}: {len(cols)} columns")
# Query
result = client.query("""
SELECT
c.name,
COUNT(o.order_id) as orders,
SUM(o.total) as revenue
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.name
ORDER BY revenue DESC
LIMIT 10
""")
for row in result["rows"]:
print(f"{row[0]}: {row[2]:,.2f}")
# AI
answer = client.ask("Which product category has the highest return rate?")
if answer["error"] is None:
print(f"SQL: {answer['sql']}")
for row in answer["rows"]:
print(row)
# Cache management
client.cache_refresh("orders")
status = client.cache_status()
for entry in status:
print(f"{entry['table']}: {entry['status']}")