Skip to content

AI Ask

DataSpoc Lens can translate natural language questions into SQL, execute the query, and return the results. No SQL knowledge required.

Option 1: Local AI with Ollama (free, no API key)

Section titled “Option 1: Local AI with Ollama (free, no API key)”
Terminal window
dataspoc-lens setup-ai

This installs and configures Ollama with a local model. Completely free, no data leaves your machine.

Install the AI extra:

Terminal window
pip install dataspoc-lens[ai]

Configure via environment variables:

Terminal window
# Anthropic (Claude)
export DATASPOC_LLM_PROVIDER=anthropic
export DATASPOC_LLM_API_KEY=sk-ant-...
# OpenAI (GPT)
export DATASPOC_LLM_PROVIDER=openai
export DATASPOC_LLM_API_KEY=sk-...

Or add to ~/.dataspoc-lens/config.yaml:

llm:
provider: anthropic # ollama, anthropic, or openai
api_key: sk-ant-... # not needed for ollama
model: "" # optional, uses provider default
ProviderCostPrivacyModel
ollamaFreeLocal (no data sent externally)Configurable (default set by setup-ai)
anthropicPaid (per token)Data sent to Anthropic APIClaude (latest)
openaiPaid (per token)Data sent to OpenAI APIGPT (latest)
  1. Environment variables (DATASPOC_LLM_PROVIDER, DATASPOC_LLM_API_KEY, DATASPOC_LLM_MODEL)
  2. ~/.dataspoc-lens/config.yaml
  3. Default: ollama
Terminal window
dataspoc-lens ask "What are the top 5 customers by total spending?"
SQL: SELECT c.name, SUM(o.total) as total_spent
FROM customers c JOIN orders o ON c.id = o.customer_id
GROUP BY c.name ORDER BY total_spent DESC LIMIT 5
┌──────────────┬─────────────┐
│ name │ total_spent │
├──────────────┼─────────────┤
│ Alice Smith │ 15420.50 │
│ Bob Johnson │ 12300.00 │
│ Carol White │ 9870.25 │
│ Dave Brown │ 8540.00 │
│ Eve Davis │ 7210.75 │
└──────────────┴─────────────┘
(5 row(s), 1.230s)

Use --debug to see the full prompt sent to the LLM:

Terminal window
dataspoc-lens ask "Monthly revenue trend" --debug
--- Prompt sent to LLM ---
You are a SQL assistant. Given the following table schemas, write a DuckDB SQL
query to answer the user's question.
Tables:
- customers (id INTEGER, name VARCHAR, email VARCHAR, ...)
- orders (order_id INTEGER, customer_id INTEGER, order_date DATE, total DOUBLE, ...)
- products (product_id INTEGER, name VARCHAR, price DOUBLE, ...)
Question: Monthly revenue trend
Return ONLY the SQL query, no explanation.
--- End of prompt ---
SQL: SELECT DATE_TRUNC('month', order_date) as month,
SUM(total) as revenue
FROM orders GROUP BY month ORDER BY month
┌────────────┬──────────┐
│ month │ revenue │
├────────────┼──────────┤
│ 2024-01-01 │ 125400 │
│ 2024-02-01 │ 138200 │
│ 2024-03-01 │ 152800 │
└────────────┴──────────┘
(3 row(s), 1.450s)
  1. Lens reads the schema of all mounted tables (column names and types)
  2. A sample of rows is included for context
  3. The schema, sample data, and your question are sent to the configured LLM
  4. The LLM returns a DuckDB-compatible SQL query
  5. Lens executes the SQL against your data lake
  6. Results are displayed (or exported with --export)
Terminal window
dataspoc-lens ask "Revenue by product category" --export revenue.csv
dataspoc-lens ask "Customer segments" --export segments.parquet
Terminal window
dataspoc-lens ask "Total orders" --output json
{
"sql": "SELECT COUNT(*) as total_orders FROM orders",
"columns": ["total_orders"],
"rows": [[48680]],
"duration": 0.95,
"error": null
}
Terminal window
# Aggregations
dataspoc-lens ask "What is the average order value?"
# Joins
dataspoc-lens ask "Which products have never been ordered?"
# Time series
dataspoc-lens ask "Show weekly order counts for the last 3 months"
# Filtering
dataspoc-lens ask "How many orders were cancelled in January?"