AI Ask
DataSpoc Lens can translate natural language questions into SQL, execute the query, and return the results. No SQL knowledge required.
Option 1: Local AI with Ollama (free, no API key)
Section titled “Option 1: Local AI with Ollama (free, no API key)”dataspoc-lens setup-aiThis installs and configures Ollama with a local model. Completely free, no data leaves your machine.
Option 2: Cloud providers
Section titled “Option 2: Cloud providers”Install the AI extra:
pip install dataspoc-lens[ai]Configure via environment variables:
# Anthropic (Claude)export DATASPOC_LLM_PROVIDER=anthropicexport DATASPOC_LLM_API_KEY=sk-ant-...
# OpenAI (GPT)export DATASPOC_LLM_PROVIDER=openaiexport DATASPOC_LLM_API_KEY=sk-...Or add to ~/.dataspoc-lens/config.yaml:
llm: provider: anthropic # ollama, anthropic, or openai api_key: sk-ant-... # not needed for ollama model: "" # optional, uses provider defaultProviders
Section titled “Providers”| Provider | Cost | Privacy | Model |
|---|---|---|---|
ollama | Free | Local (no data sent externally) | Configurable (default set by setup-ai) |
anthropic | Paid (per token) | Data sent to Anthropic API | Claude (latest) |
openai | Paid (per token) | Data sent to OpenAI API | GPT (latest) |
Configuration priority
Section titled “Configuration priority”- Environment variables (
DATASPOC_LLM_PROVIDER,DATASPOC_LLM_API_KEY,DATASPOC_LLM_MODEL) ~/.dataspoc-lens/config.yaml- Default:
ollama
dataspoc-lens ask "What are the top 5 customers by total spending?"SQL: SELECT c.name, SUM(o.total) as total_spent FROM customers c JOIN orders o ON c.id = o.customer_id GROUP BY c.name ORDER BY total_spent DESC LIMIT 5
┌──────────────┬─────────────┐│ name │ total_spent │├──────────────┼─────────────┤│ Alice Smith │ 15420.50 ││ Bob Johnson │ 12300.00 ││ Carol White │ 9870.25 ││ Dave Brown │ 8540.00 ││ Eve Davis │ 7210.75 │└──────────────┴─────────────┘
(5 row(s), 1.230s)Debug mode
Section titled “Debug mode”Use --debug to see the full prompt sent to the LLM:
dataspoc-lens ask "Monthly revenue trend" --debug--- Prompt sent to LLM ---You are a SQL assistant. Given the following table schemas, write a DuckDB SQLquery to answer the user's question.
Tables:- customers (id INTEGER, name VARCHAR, email VARCHAR, ...)- orders (order_id INTEGER, customer_id INTEGER, order_date DATE, total DOUBLE, ...)- products (product_id INTEGER, name VARCHAR, price DOUBLE, ...)
Question: Monthly revenue trend
Return ONLY the SQL query, no explanation.--- End of prompt ---
SQL: SELECT DATE_TRUNC('month', order_date) as month, SUM(total) as revenue FROM orders GROUP BY month ORDER BY month
┌────────────┬──────────┐│ month │ revenue │├────────────┼──────────┤│ 2024-01-01 │ 125400 ││ 2024-02-01 │ 138200 ││ 2024-03-01 │ 152800 │└────────────┴──────────┘
(3 row(s), 1.450s)How it works
Section titled “How it works”- Lens reads the schema of all mounted tables (column names and types)
- A sample of rows is included for context
- The schema, sample data, and your question are sent to the configured LLM
- The LLM returns a DuckDB-compatible SQL query
- Lens executes the SQL against your data lake
- Results are displayed (or exported with
--export)
Export results
Section titled “Export results”dataspoc-lens ask "Revenue by product category" --export revenue.csvdataspoc-lens ask "Customer segments" --export segments.parquetJSON output
Section titled “JSON output”dataspoc-lens ask "Total orders" --output json{ "sql": "SELECT COUNT(*) as total_orders FROM orders", "columns": ["total_orders"], "rows": [[48680]], "duration": 0.95, "error": null}Examples
Section titled “Examples”# Aggregationsdataspoc-lens ask "What is the average order value?"
# Joinsdataspoc-lens ask "Which products have never been ordered?"
# Time seriesdataspoc-lens ask "Show weekly order counts for the last 3 months"
# Filteringdataspoc-lens ask "How many orders were cancelled in January?"