AI Agent Integration

DataSpoc is built for humans AND AI agents. Every feature available in the CLI is also available programmatically, so agents can discover, query, and manage your data lake autonomously.

The Model Context Protocol (MCP) lets AI assistants call DataSpoc tools directly. Supported clients:

  • Claude Desktop: Ask questions about your data in conversation
  • Claude Code: Query your data lake from the terminal agent
  • Cursor: Access data context while coding
  • Windsurf: Integrate data lake queries into your workflow

See MCP Setup for configuration.
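
For orientation, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for two DataSpoc tool names. The argument key `sql` is an assumption rather than something documented here, and in practice the MCP client builds these messages for you.

```python
import json
from typing import Any


def tool_call_request(name: str, arguments: dict[str, Any], request_id: int) -> dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# list_tables is a DataSpoc tool name; it is assumed here to take no arguments.
list_req = tool_call_request("list_tables", {}, request_id=1)

# The argument key "sql" is an assumption. The tool's input schema, returned by
# the MCP `tools/list` method, is the authority on the real parameter names.
query_req = tool_call_request("query", {"sql": "SELECT 1"}, request_id=2)

print(json.dumps(list_req))
print(json.dumps(query_req))
```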

Import LensClient and PipeClient directly in your Python agents:

  • CrewAI: Give your crew access to SQL and natural language queries
  • LangGraph: Add data lake nodes to your graph
  • AutoGen: Let agents query and analyze data autonomously

See Python SDK for full API reference.
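
As a rough illustration of how an agent might use the SDK for discovery, the sketch below walks every table and collects its schema into a text block that could be placed in a prompt. It assumes that `tables()` returns an iterable of table names and that `schema(table)` returns something printable. `StubLens` is a hypothetical stand-in so the example runs without a data lake. In real use you would pass a LensClient instance instead.

```python
from typing import Any, Protocol


class LensLike(Protocol):
    """The subset of LensClient methods this sketch relies on."""

    def tables(self) -> Any: ...

    def schema(self, table: str) -> Any: ...


def describe_lake(client: LensLike) -> str:
    """Render every table and its schema as plain text for an agent prompt."""
    lines = []
    for table in client.tables():  # assumed: iterable of table names
        lines.append(f"{table}: {client.schema(table)}")
    return "\n".join(lines)


class StubLens:
    """Hypothetical stand-in for LensClient; the data below is invented."""

    def tables(self):
        return ["orders", "customers"]

    def schema(self, table):
        return {"orders": ["id", "total"], "customers": ["id", "name"]}[table]


context = describe_lake(StubLens())
print(context)
```

Typing the parameter as a Protocol keeps the helper independent of the SDK's import path, which is not shown on this page.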

Every command supports --output json for machine-readable output. Use it in shell scripts, CI/CD pipelines, or any automation tool.

See JSON Output for examples.
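
A minimal sketch of consuming the JSON output from a Python script: run the command, then parse stdout. The `run_json` helper and its injectable `runner` parameter are illustrative names, not part of DataSpoc. The canned payload in `fake_runner` is made up, since the real output shape is not described here.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def run_json(argv: Sequence[str], runner: Callable[..., Any] = subprocess.run) -> Any:
    """Run a command and parse its stdout as JSON.

    `runner` is injectable so the helper can be exercised without the CLI installed.
    """
    result = runner(list(argv), capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# With DataSpoc installed, this would read the catalog:
#   catalog = run_json(["dataspoc-lens", "catalog", "--output", "json"])


def fake_runner(argv, **kwargs):
    """Hypothetical runner returning canned output; the payload shape is invented."""
    return subprocess.CompletedProcess(argv, 0, stdout='{"tables": ["orders"]}', stderr="")


parsed = run_json(["dataspoc-lens", "catalog", "--output", "json"], runner=fake_runner)
print(parsed)
```

Passing `check=True` makes a non-zero exit status raise `subprocess.CalledProcessError`, so a failed command is not mistaken for empty output.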

Lens capabilities:

| Capability | MCP Tool | SDK Method | CLI Command |
|---|---|---|---|
| Discover tables | list_tables | client.tables() | dataspoc-lens catalog --output json |
| Describe schema | describe_table | client.schema(table) | dataspoc-lens catalog --table X --output json |
| Run SQL | query | client.query(sql) | dataspoc-lens query --output json |
| Ask natural language | ask | client.ask(question) | dataspoc-lens ask --output json |
| Check cache status | cache_status | client.cache_status() | dataspoc-lens cache --list --output json |
| Refresh cache | cache_refresh | client.cache_refresh() | dataspoc-lens cache --refresh |
| Refresh stale only | cache_refresh_stale | client.cache_refresh_stale() | dataspoc-lens cache --refresh-stale |
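
Because each MCP tool in the table above has a matching SDK method, an agent that thinks in tool names can be routed to the SDK with a small lookup. The sketch below is one way to do that. `dispatch` and `RecordingClient` are hypothetical names, and the stub records calls instead of contacting a data lake.

```python
from typing import Any

# MCP tool name -> SDK method name, copied from the table above.
LENS_TOOL_TO_METHOD = {
    "list_tables": "tables",
    "describe_table": "schema",
    "query": "query",
    "ask": "ask",
    "cache_status": "cache_status",
    "cache_refresh": "cache_refresh",
    "cache_refresh_stale": "cache_refresh_stale",
}


def dispatch(client: Any, tool: str, *args: Any) -> Any:
    """Call the SDK method that corresponds to an MCP tool name."""
    try:
        method_name = LENS_TOOL_TO_METHOD[tool]
    except KeyError:
        raise ValueError(f"unknown Lens tool: {tool}") from None
    return getattr(client, method_name)(*args)


class RecordingClient:
    """Hypothetical stub that records calls in place of a real client."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the SDK methods.
        def method(*args):
            self.calls.append((name, args))
            return None

        return method


stub = RecordingClient()
dispatch(stub, "describe_table", "orders")
dispatch(stub, "cache_refresh_stale")
print(stub.calls)
```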

Pipe capabilities:

| Capability | MCP Tool | SDK Method | CLI Command |
|---|---|---|---|
| List pipelines | list_pipelines | client.pipelines() | dataspoc-pipe status --output json |
| View config | pipeline_config | client.config(name) | dataspoc-pipe config --output json |
| Run pipeline | run_pipeline | client.run(name) | dataspoc-pipe run |
| Check status | pipeline_status | client.status(name) | dataspoc-pipe status --output json |
| View logs | pipeline_logs | client.logs(name) | dataspoc-pipe logs --output json |
| Read manifest | show_manifest | client.manifest() | dataspoc-pipe manifest --output json |
| Validate config | validate_pipeline | client.validate(name) | dataspoc-pipe validate --output json |
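
The pipeline methods above compose naturally into a guarded run: validate first, run only if validation passes, then read the status. The sketch below assumes nothing about what `validate(name)` returns beyond what the caller-supplied `is_valid` predicate decides, with truthiness as the default. `validate_then_run`, `StubPipe`, and the pipeline name `daily_sales` are hypothetical.

```python
from typing import Any, Callable


def validate_then_run(
    client: Any, name: str, is_valid: Callable[[Any], bool] = bool
) -> dict[str, Any]:
    """Validate a pipeline, run it only if validation passes, then fetch its status."""
    validation = client.validate(name)
    if not is_valid(validation):
        # Stop before run() so an invalid config never starts a pipeline.
        return {"ran": False, "validation": validation}
    run_result = client.run(name)
    return {
        "ran": True,
        "validation": validation,
        "run": run_result,
        "status": client.status(name),
    }


class StubPipe:
    """Hypothetical stand-in for PipeClient; return values are invented."""

    def __init__(self, valid):
        self.valid = valid
        self.ran = []

    def validate(self, name):
        return self.valid

    def run(self, name):
        self.ran.append(name)
        return "started"

    def status(self, name):
        return "running"


ok = validate_then_run(StubPipe(valid=True), "daily_sales")
blocked = validate_then_run(StubPipe(valid=False), "daily_sales")
print(ok["ran"], blocked["ran"])
```

If the real `validate` reports problems as a structured result, pass an `is_valid` function that inspects that structure and drop the truthiness default.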