AI Agent Integration
DataSpoc is built for humans AND AI agents. Every feature available in the CLI is also available programmatically, so agents can discover, query, and manage your data lake autonomously.
Three Ways to Connect
1. MCP Server
The Model Context Protocol (MCP) lets AI assistants call DataSpoc tools directly. Supported clients:
- Claude Desktop: Ask questions about your data in conversation
- Claude Code: Query your data lake from the terminal agent
- Cursor: Access data context while coding
- Windsurf: Integrate data lake queries into your workflow
See MCP Setup for configuration.
2. Python SDK
Import LensClient and PipeClient directly in your Python agents:
- CrewAI: Give your crew access to SQL and natural language queries
- LangGraph: Add data lake nodes to your graph
- AutoGen: Let agents query and analyze data autonomously
See Python SDK for full API reference.
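As a rough illustration of how an agent framework might consume the SDK, the sketch below wraps three of the LensClient methods listed later on this page (`tables`, `schema`, `query`) in a plain dictionary of callables. The `LensLike` protocol, the `FakeLens` stand-in, and all return types are assumptions made so the example runs without DataSpoc installed; the real import path, constructor arguments, and return shapes are documented in the Python SDK reference.

```python
from typing import Any, Callable, Protocol


class LensLike(Protocol):
    """Subset of LensClient methods named on this page (signatures assumed)."""

    def tables(self) -> Any: ...
    def schema(self, table: str) -> Any: ...
    def query(self, sql: str) -> Any: ...


def build_tools(client: LensLike) -> dict[str, Callable[..., Any]]:
    """Map tool names to SDK calls so a framework can register them as tools."""
    return {
        "list_tables": client.tables,
        "describe_table": client.schema,
        "query": client.query,
    }


class FakeLens:
    """Hypothetical in-memory stand-in; swap in a real LensClient instance."""

    def tables(self) -> list[str]:
        return ["orders"]

    def schema(self, table: str) -> dict[str, str]:
        return {"id": "BIGINT", "total": "DOUBLE"}

    def query(self, sql: str) -> list[dict[str, Any]]:
        return [{"n": 2}]


# Typical agent flow: discover tables, inspect one, then query it.
tools = build_tools(FakeLens())
first_table = tools["list_tables"]()[0]
columns = tools["describe_table"](first_table)
rows = tools["query"](f"SELECT count(*) AS n FROM {first_table}")
```

Keeping the tools in a plain dictionary means the same mapping can be handed to CrewAI, LangGraph, or AutoGen through whatever tool-registration mechanism each framework provides.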
3. JSON CLI Output
Every command supports --output json for machine-readable output. Use it in shell scripts, CI/CD pipelines, or any automation tool.
See JSON Output for examples.
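A minimal sketch of consuming JSON output from Python: run a command with `--output json` appended and parse its stdout. The helper name `run_json` is hypothetical, and the demo uses a stand-in command that prints a placeholder payload, because the actual JSON shape returned by DataSpoc commands is not described on this page.

```python
import json
import subprocess
import sys
from typing import Any


def run_json(argv: list[str]) -> Any:
    """Run a command with --output json appended and return the parsed stdout."""
    result = subprocess.run(
        [*argv, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Stand-in command so the sketch runs without DataSpoc installed.
# The payload below is a placeholder, not DataSpoc's real output format.
# In practice the call might look like: run_json(["dataspoc-lens", "catalog"])
fake_cli = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'tables': ['orders']}))",
]
payload = run_json(fake_cli)
```

Using `check=True` makes a failed command raise immediately instead of handing an empty or partial string to the JSON parser.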
What Agents Can Do
With Lens
| Capability | MCP Tool | SDK Method | CLI Command |
|---|---|---|---|
| Discover tables | list_tables | client.tables() | dataspoc-lens catalog --output json |
| Describe schema | describe_table | client.schema(table) | dataspoc-lens catalog --table X --output json |
| Run SQL | query | client.query(sql) | dataspoc-lens query --output json |
| Ask natural language | ask | client.ask(question) | dataspoc-lens ask --output json |
| Check cache status | cache_status | client.cache_status() | dataspoc-lens cache --list --output json |
| Refresh cache | cache_refresh | client.cache_refresh() | dataspoc-lens cache --refresh |
| Refresh stale only | cache_refresh_stale | client.cache_refresh_stale() | dataspoc-lens cache --refresh-stale |
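For readers curious what happens when an assistant uses an entry from the MCP Tool column: MCP clients invoke tools by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for `list_tables` and `describe_table`. The helper `tool_call` is hypothetical, and the argument key `table` is an assumption, since the tools' input schemas are not shown here; an MCP client normally discovers them through `tools/list`.

```python
import json
from itertools import count
from typing import Any, Optional

_ids = count(1)  # each request in a session gets a fresh id


def tool_call(name: str, arguments: Optional[dict[str, Any]] = None) -> str:
    """Serialize an MCP tools/call request for the given tool name."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


list_request = tool_call("list_tables")
# "table" is an assumed argument name; check the tool's input schema.
describe_request = tool_call("describe_table", {"table": "orders"})
```

Clients such as Claude Desktop or Cursor build these messages themselves, so this is only useful for debugging or for writing a custom MCP client.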
With Pipe
| Capability | MCP Tool | SDK Method | CLI Command |
|---|---|---|---|
| List pipelines | list_pipelines | client.pipelines() | dataspoc-pipe status --output json |
| View config | pipeline_config | client.config(name) | dataspoc-pipe config --output json |
| Run pipeline | run_pipeline | client.run(name) | dataspoc-pipe run |
| Check status | pipeline_status | client.status(name) | dataspoc-pipe status --output json |
| View logs | pipeline_logs | client.logs(name) | dataspoc-pipe logs --output json |
| Read manifest | show_manifest | client.manifest() | dataspoc-pipe manifest --output json |
| Validate config | validate_pipeline | client.validate(name) | dataspoc-pipe validate --output json |
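One pattern these methods suggest is validating a pipeline before running it. The sketch below shows that flow against a stand-in client. `PipeLike`, `FakePipe`, and `run_if_valid` are hypothetical names, and the code assumes that `validate(name)` returns a truthy value on success, which this page does not confirm; adjust the check to whatever the real PipeClient returns.

```python
from typing import Any, Protocol


class PipeLike(Protocol):
    """Subset of PipeClient methods named on this page (signatures assumed)."""

    def validate(self, name: str) -> Any: ...
    def run(self, name: str) -> Any: ...
    def status(self, name: str) -> Any: ...


def run_if_valid(client: PipeLike, name: str) -> dict[str, Any]:
    """Run a pipeline only when validation passes (assumed to be truthy)."""
    validation = client.validate(name)
    if not validation:
        return {"ran": False, "validation": validation}
    client.run(name)
    return {"ran": True, "status": client.status(name)}


class FakePipe:
    """Hypothetical stand-in that records which pipelines were run."""

    def __init__(self, valid_names: set[str]) -> None:
        self.valid_names = valid_names
        self.runs: list[str] = []

    def validate(self, name: str) -> bool:
        return name in self.valid_names

    def run(self, name: str) -> None:
        self.runs.append(name)

    def status(self, name: str) -> str:
        return "success"


pipe = FakePipe({"orders_daily"})
ok_result = run_if_valid(pipe, "orders_daily")
skipped_result = run_if_valid(pipe, "missing_pipeline")
```

Returning a small dictionary instead of raising gives an agent something structured to reason about when a pipeline is skipped.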