Agent Quickstart

This guide gets you from zero to a working AI agent connected to your data lake in 5 minutes. By the end, you will have Claude Desktop (or any MCP-compatible client) managing your pipelines and querying your data through natural conversation.

1. Install

Install both DataSpoc products with MCP support:

pip install "dataspoc-pipe[mcp]" "dataspoc-lens[mcp]"

This gives you two MCP servers: one for pipeline management (Pipe) and one for data querying (Lens).
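Before moving on, it can help to confirm that both command-line entry points are reachable. The following is a minimal sketch using only the standard library; it assumes the executables are named dataspoc-pipe and dataspoc-lens, as in the configuration in the next step.

```python
import shutil


def find_executables(names):
    """Map each command name to its absolute path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    found = find_executables(["dataspoc-pipe", "dataspoc-lens"])
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND (check PATH or your virtual environment)'}")
```

If either command is reported as not found, the install probably went into a virtual environment that is not active in this shell.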

2. Configure Claude Desktop

Open your Claude Desktop MCP configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
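If you would rather locate the file from a script, the sketch below maps the platform to the two paths listed above. The Windows branch is an assumption that this guide does not cover; it uses the usual %APPDATA%\Claude location.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform=sys.platform, home=None, appdata=None):
    """Return the expected Claude Desktop config path for a platform."""
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):
        # Assumption: not listed in this guide; falls back to the roaming profile dir.
        base = appdata or os.environ.get("APPDATA") or str(home / "AppData" / "Roaming")
        return Path(base) / "Claude" / CONFIG_NAME
    # Linux and other POSIX systems.
    return home / ".config" / "Claude" / CONFIG_NAME


if __name__ == "__main__":
    print(claude_config_path())
```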

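Paste this configuration, which registers both servers. If the file already contains an mcpServers object, add the two entries to it rather than replacing the whole file: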

{
  "mcpServers": {
    "dataspoc-pipe": {
      "command": "dataspoc-pipe",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "s3://my-data"
      }
    },
    "dataspoc-lens": {
      "command": "dataspoc-lens",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "s3://my-data"
      }
    }
  }
}

Replace s3://my-data with your actual bucket URI. For local testing, use file:///tmp/lake.
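If the config file already lists other MCP servers, editing it by hand risks overwriting them. As a rough sketch, the helper below merges the two entries shown above into an existing file and leaves everything else untouched. The function names are illustrative only and are not part of any DataSpoc package.

```python
import json
from pathlib import Path


def server_entry(command, bucket):
    """Build one mcpServers entry in the shape shown above."""
    return {"command": command, "args": ["mcp"], "env": {"DATASPOC_BUCKET": bucket}}


def merge_dataspoc_servers(config_path, bucket):
    """Add or update the two DataSpoc servers, keeping any other settings intact."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    for name in ("dataspoc-pipe", "dataspoc-lens"):
        servers[name] = server_entry(name, bucket)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, merge_dataspoc_servers("/path/to/claude_desktop_config.json", "file:///tmp/lake") would point both servers at a local lake. Back up the file before running anything that rewrites it.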

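If dataspoc-pipe and dataspoc-lens are installed in a virtual environment, Claude Desktop may not find them on its PATH. In that case, set command to the full path of each executable, and keep the env block from the configuration above (omitted here for brevity):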

{
  "mcpServers": {
    "dataspoc-pipe": {
      "command": "/home/you/.venv/bin/dataspoc-pipe",
      "args": ["mcp"]
    },
    "dataspoc-lens": {
      "command": "/home/you/.venv/bin/dataspoc-lens",
      "args": ["mcp"]
    }
  }
}
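To find the right absolute paths, one option is to ask the virtual environment's own interpreter where it keeps console scripts. The sketch below uses sysconfig from the standard library. It assumes the scripts are named after the commands. On Windows the files would typically carry an .exe suffix, which this sketch does not add.

```python
import sysconfig
from pathlib import Path


def venv_command(name, scripts_dir=None):
    """Return the absolute path of a console script in this environment's scripts dir."""
    scripts = Path(scripts_dir or sysconfig.get_path("scripts"))
    return str(scripts / name)


if __name__ == "__main__":
    # Run this with the virtual environment's interpreter,
    # e.g. /home/you/.venv/bin/python, so the paths point inside that environment.
    for name in ("dataspoc-pipe", "dataspoc-lens"):
        print(venv_command(name))
```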

Restart Claude Desktop after saving the file.

3. Your first agent query

Open Claude Desktop and ask:

You: What tables do I have?

Claude calls the Lens list_tables tool and responds:

Claude: You have 3 tables in your data lake:

customers (4 columns: id, name, email, created_at)
orders (5 columns: order_id, customer_id, order_date, total, status)
products (6 columns: product_id, name, category, price, sku, active)

Try a follow-up:

You: How many orders were placed this week?

Claude (runs a SQL query): There were 2,480 orders this week with a total revenue of $225,120.
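Behind the conversation, an MCP client invokes a tool by sending a JSON-RPC 2.0 tools/call request to the server, after an initial initialize handshake. The sketch below builds such a request for list_tables. It assumes that the tool takes no arguments, so check the Lens tool reference for its real input schema.

```python
import json


def tool_call_request(request_id, tool, arguments=None):
    """Serialize an MCP tools/call request as one line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    # The stdio transport is newline-delimited, so keep the message on one line.
    return json.dumps(message, separators=(",", ":"))


if __name__ == "__main__":
    # Assumption: list_tables accepts an empty arguments object.
    print(tool_call_request(1, "list_tables"))
    # → {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"list_tables","arguments":{}}}
```

You never need to write these messages yourself when using Claude Desktop. The shape is mainly useful when reading server logs or debugging a custom client.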

4. Your first agent pipeline run

Ask Claude to manage your pipelines:

You: What pipelines are configured?

Claude calls the Pipe list_pipelines tool:

Claude: You have 3 pipelines: orders, customers, and events.

You: Run the orders pipeline.

Claude (calls run_pipeline): The orders pipeline completed successfully. 5,200 records were extracted.

You: Check if any pipelines failed recently.

Claude (calls pipeline_status): All 3 pipelines are healthy. The last run was today at 02:00.
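When a tool such as pipeline_status returns, the MCP response carries a list of content items plus an isError flag, and protocol-level failures arrive as a JSON-RPC error instead. The sketch below shows one way a custom client might unpack that. The example payload is hypothetical and does not reflect Pipe's actual output format.

```python
def tool_result_text(response):
    """Return the concatenated text content of a tools/call response, or raise."""
    if "error" in response:
        # Protocol-level failure (unknown tool, invalid params, and so on).
        raise RuntimeError(f"JSON-RPC error: {response['error'].get('message')}")
    result = response["result"]
    text = "\n".join(
        item["text"] for item in result.get("content", []) if item.get("type") == "text"
    )
    if result.get("isError"):
        # Tool-level failure: the tool ran but reported a problem.
        raise RuntimeError(f"Tool reported an error: {text}")
    return text


# Hypothetical response payload, for illustration only.
example = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [{"type": "text", "text": "orders: success"}],
        "isError": False,
    },
}
print(tool_result_text(example))  # prints "orders: success"
```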

5. Combine both: ingest and query

The real power is using both servers together:

You: Run the orders pipeline to get fresh data, refresh the cache, then tell me today’s revenue.

Claude: I will run the pipeline, refresh the Lens cache, and query the results.

Pipeline orders completed: 5,200 records extracted.

Cache refreshed for orders table.

Today’s revenue: $17,023.50 across 187 orders.
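The order matters in this flow: querying before the pipeline finishes or before the cache is refreshed would return stale numbers. A minimal sketch of that fail-fast sequencing follows, with the actual tool invocation injected as a callable. Only run_pipeline is named in this guide. The refresh_cache and query tool names, and every argument name, are hypothetical placeholders.

```python
def run_steps(steps, call_tool):
    """Run (server, tool, arguments) steps in order and stop at the first failure."""
    results = []
    for server, tool, arguments in steps:
        try:
            results.append(call_tool(server, tool, arguments))
        except Exception as exc:
            raise RuntimeError(f"{server}.{tool} failed; later steps were skipped") from exc
    return results


# run_pipeline appears in this guide; refresh_cache, query, and all argument
# names below are hypothetical placeholders.
STEPS = [
    ("dataspoc-pipe", "run_pipeline", {"pipeline": "orders"}),
    ("dataspoc-lens", "refresh_cache", {"table": "orders"}),
    (
        "dataspoc-lens",
        "query",
        {"sql": "SELECT SUM(total) FROM orders WHERE order_date = CURRENT_DATE"},
    ),
]
```

Passing the caller in as call_tool keeps the sequencing logic independent of how the servers are reached, so the same function works with a real MCP client or a stub in tests.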

What’s next

Now that your agent is connected, explore the detailed guides for each product:

Pipe Agent Integration — Full MCP tool reference, example conversations, CrewAI and LangGraph examples, and best practices for pipeline automation.
Lens Agent Integration — Full MCP tool reference, example conversations, CrewAI, LangGraph, and AutoGen examples, and best practices for data analysis.
MCP Server Setup — Detailed setup for Claude Desktop, Cursor, Windsurf, and Claude Code.
Python SDK — Build custom agents with the PipeClient and LensClient Python classes.
JSON Output — Use --output json with shell scripts and subprocess calls.