Agent Quickstart
This guide gets you from zero to a working AI agent connected to your data lake in 5 minutes. By the end, you will have Claude Desktop (or any MCP-compatible client) managing your pipelines and querying your data through natural conversation.
1. Install
Install both DataSpoc products with MCP support:
    pip install "dataspoc-pipe[mcp]" "dataspoc-lens[mcp]"
This gives you two MCP servers: one for pipeline management (Pipe) and one for data querying (Lens).
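If you want to confirm that both packages landed in the interpreter you expect, a small check along these lines may help. It is a sketch that uses only the standard library; the distribution names come from the pip command above, and the helper name `installed_version` is purely illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used in the pip command above.
    for name in ("dataspoc-pipe", "dataspoc-lens"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed in this environment'}")
```

Run it with the same Python that you used for pip; a "not installed" line usually means the packages went into a different environment.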
2. Configure Claude Desktop
Open your Claude Desktop MCP configuration file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Paste this configuration with both servers:
    {
      "mcpServers": {
        "dataspoc-pipe": {
          "command": "dataspoc-pipe",
          "args": ["mcp"],
          "env": { "DATASPOC_BUCKET": "s3://my-data" }
        },
        "dataspoc-lens": {
          "command": "dataspoc-lens",
          "args": ["mcp"],
          "env": { "DATASPOC_BUCKET": "s3://my-data" }
        }
      }
    }
Replace s3://my-data with your actual bucket URI. For local testing, use file:///tmp/lake.
If dataspoc-pipe and dataspoc-lens are installed in a virtual environment, use the full path:
    {
      "mcpServers": {
        "dataspoc-pipe": {
          "command": "/home/you/.venv/bin/dataspoc-pipe",
          "args": ["mcp"]
        },
        "dataspoc-lens": {
          "command": "/home/you/.venv/bin/dataspoc-lens",
          "args": ["mcp"]
        }
      }
    }
Restart Claude Desktop after saving the file.
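Editing the file by hand works, but if you already have other MCP servers configured, a merge script can avoid overwriting them. The following is a minimal sketch using only the standard library. The file locations, server names, `mcp` argument, and `DATASPOC_BUCKET` variable are taken from this guide; the function names are illustrative, and resolving the command with `shutil.which` is just one way to obtain the full path mentioned above.

```python
import json
import shutil
import sys
from pathlib import Path


def config_path(platform=sys.platform, home=None):
    """Return the Claude Desktop config path for macOS or Linux (paths from this guide)."""
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"
    return home / ".config" / "Claude" / "claude_desktop_config.json"


def server_entry(command, bucket=None):
    """Build one mcpServers entry, resolving the command to a full path when possible."""
    resolved = shutil.which(command) or command
    entry = {"command": resolved, "args": ["mcp"]}
    if bucket:
        entry["env"] = {"DATASPOC_BUCKET": bucket}
    return entry


def merge_servers(path, bucket=None):
    """Add the two DataSpoc servers to the config, keeping any other settings intact."""
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    for name in ("dataspoc-pipe", "dataspoc-lens"):
        servers[name] = server_entry(name, bucket)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return config


if __name__ == "__main__":
    # Local test bucket from this guide; swap in your own bucket URI.
    merge_servers(config_path(), bucket="file:///tmp/lake")
```

Run the script from inside the virtual environment where the tools are installed so that `shutil.which` finds them, then restart Claude Desktop.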
3. Your first agent query
Open Claude Desktop and ask:
You: What tables do I have?
Claude calls the Lens list_tables tool and responds:
Claude: You have 3 tables in your data lake:
- customers (4 columns: id, name, email, created_at)
- orders (5 columns: order_id, customer_id, order_date, total, status)
- products (6 columns: product_id, name, category, price, sku, active)
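For readers curious about what happens on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for the `list_tables` tool named above. The helper name `tool_call` is illustrative, and the assumption that `list_tables` accepts an empty argument object is not confirmed by this guide.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC expects a unique id per request.
_ids = count(1)


def tool_call(name, arguments=None):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    # Assumption: list_tables accepts an empty argument object.
    print(json.dumps(tool_call("list_tables"), indent=2))
```

Claude Desktop builds and sends these messages for you; the sketch is only meant to make the mechanism visible.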
Try a follow-up:
You: How many orders were placed this week?
Claude (runs a SQL query): There were 2,480 orders this week with a total revenue of $225,120.
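The SQL that an agent writes for a question like this typically filters on the order date and aggregates. The sketch below illustrates that shape with Python's built-in sqlite3 module as a stand-in engine; it is not a statement about how Lens executes queries. The column names mirror the orders table listed earlier, the rows are made-up sample data, and "this week" is assumed to mean Monday through the given day.

```python
import sqlite3
from datetime import date, timedelta


def weekly_order_stats(conn, today):
    """Count orders and sum `total` from Monday of the current week through `today`."""
    week_start = today - timedelta(days=today.weekday())
    row = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(total), 0) FROM orders "
        "WHERE order_date BETWEEN ? AND ?",
        (week_start.isoformat(), today.isoformat()),
    ).fetchone()
    return row[0], row[1]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Columns mirror the orders table listed above; the rows are made-up sample data.
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, "
        "order_date TEXT, total REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            (1, 10, "2024-05-06", 40.0, "paid"),
            (2, 11, "2024-05-08", 60.0, "paid"),
            (3, 10, "2024-04-30", 25.0, "paid"),  # previous week, excluded
        ],
    )
    print(weekly_order_stats(conn, date(2024, 5, 8)))
```

Storing dates as ISO 8601 strings keeps the range comparison correct in SQLite, because such strings sort in date order.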
4. Your first agent pipeline run
Ask Claude to manage your pipelines:
You: What pipelines are configured?
Claude calls the Pipe list_pipelines tool:
Claude: You have 3 pipelines: orders, customers, and events.
You: Run the orders pipeline.
Claude (calls run_pipeline): The orders pipeline completed successfully. 5,200 records were extracted.
You: Check if any pipelines failed recently.
Claude (calls pipeline_status): All 3 pipelines are healthy. The last run was today at 02:00.
5. Combine both: ingest and query
The real power is using both servers together:
You: Run the orders pipeline to get fresh data, refresh the cache, then tell me today’s revenue.
Claude: I will run the pipeline, refresh the Lens cache, and query the results.
- Pipeline orders completed: 5,200 records extracted.
- Cache refreshed for orders table.
- Today’s revenue: $17,023.50 across 187 orders.
What’s next
Now that your agent is connected, explore the detailed guides for each product:
- Pipe Agent Integration: Full MCP tool reference, example conversations, CrewAI and LangGraph examples, and best practices for pipeline automation.
- Lens Agent Integration: Full MCP tool reference, example conversations, CrewAI, LangGraph, and AutoGen examples, and best practices for data analysis.
- MCP Server Setup: Detailed setup for Claude Desktop, Cursor, Windsurf, and Claude Code.
- Python SDK: Build custom agents with the PipeClient and LensClient Python classes.
- JSON Output: Use --output json with shell scripts and subprocess calls.
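As a rough idea of the subprocess pattern mentioned in the last item, the sketch below runs a command and parses its standard output as JSON. The helper name `run_json` is illustrative, and the demo uses a stand-in command because this guide does not list the CLI subcommands; consult the JSON Output guide for the real invocations.

```python
import json
import subprocess
import sys


def run_json(argv, timeout=60):
    """Run a command and parse its stdout as JSON; raises on a non-zero exit code."""
    result = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=timeout
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. With DataSpoc installed you would
    # pass the CLI name, a subcommand from its docs, and "--output", "json" instead.
    demo = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
    print(run_json(demo))
```

Passing the command as a list avoids shell quoting issues, and `check=True` surfaces failures as exceptions instead of silently returning partial output.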