MCP Server
Pipe includes an MCP (Model Context Protocol) server that lets AI agents manage and run data pipelines. Any MCP-compatible client — such as Claude Desktop — can list, run, and monitor pipelines through natural language.
Install
    pip install dataspoc-pipe[mcp]
Start the server
    dataspoc-pipe mcp
The server runs on stdio, following the MCP transport protocol. It is designed to be launched by an MCP client, not run manually in a terminal.
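To make the stdio transport concrete: an MCP client writes JSON-RPC 2.0 messages to the server's stdin, one per line, and reads replies from its stdout. The sketch below only builds and frames the first such message. The helper name `frame_message`, the client info, and the protocol version string are illustrative assumptions, and an MCP client library would normally handle this for you.

```python
import json


def frame_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps without indent never emits raw newlines, which matters
    # because the stdio transport uses newlines to delimit messages.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# First request a client sends after launching `dataspoc-pipe mcp`.
# The protocol version and client info are assumptions for illustration.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    },
}

framed = frame_message(initialize_request)
```

This is also why running the command by hand in a terminal shows nothing useful: the process simply waits for framed messages on stdin.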
Claude Desktop configuration
Add the following to your Claude Desktop MCP configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
    {
      "mcpServers": {
        "dataspoc-pipe": {
          "command": "dataspoc-pipe",
          "args": ["mcp"]
        }
      }
    }
If dataspoc-pipe is installed in a virtual environment, use the full path:
    {
      "mcpServers": {
        "dataspoc-pipe": {
          "command": "/home/you/.venv/bin/dataspoc-pipe",
          "args": ["mcp"]
        }
      }
    }
Restart Claude Desktop after updating the configuration.
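If the configuration file already lists other MCP servers, editing it by hand risks clobbering them. The following is a minimal sketch of merging the entry programmatically. The function `add_pipe_server` is a hypothetical helper, not part of Pipe, and it assumes the file is either absent, empty, or valid JSON.

```python
import json
import shutil
from pathlib import Path


def add_pipe_server(config_path: Path, command: str = "dataspoc-pipe") -> dict:
    """Add or update the dataspoc-pipe entry, keeping other servers intact."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        config = json.loads(text) if text.strip() else {}

    # Prefer an absolute path when the executable can be located;
    # otherwise fall back to the command exactly as given.
    resolved = shutil.which(command) or command

    servers = config.setdefault("mcpServers", {})
    servers["dataspoc-pipe"] = {"command": resolved, "args": ["mcp"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example (Linux path from above; not executed here):
# add_pipe_server(Path.home() / ".config/Claude/claude_desktop_config.json")
```

Resolving the command to an absolute path covers the virtual-environment case when the script runs inside that environment.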
Available tools
The MCP server exposes the following tools:
list_pipelines
List all configured pipeline names.
Returns: One pipeline name per line, or “No pipelines configured.”
pipeline_config
Return the full configuration of a pipeline as JSON.
Parameters:
name (string, required): Pipeline name
Returns: JSON with source, destination, incremental, and schedule configuration.
run_pipeline
Run an extraction pipeline.
Parameters:
name (string, required): Pipeline name
full (boolean, optional): Force full extraction, ignoring incremental state. Default: false
Returns: JSON with success, streams (record counts per stream), and error.
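As a rough illustration of consuming this result, the sketch below turns it into a one-line summary. It assumes that `streams` is an object mapping stream names to record counts and that `error` is null on success. The source only names the fields, so treat the exact shape and the helper `summarize_run` as assumptions.

```python
import json


def summarize_run(result_json: str) -> str:
    """Summarize a run_pipeline result (field shapes are assumed)."""
    result = json.loads(result_json)
    if not result.get("success"):
        return f"Run failed: {result.get('error') or 'unknown error'}"
    # Assumed shape: {"stream_name": record_count, ...}
    streams = result.get("streams") or {}
    total = sum(streams.values())
    return f"Run succeeded: {total} records across {len(streams)} stream(s)"


# Hypothetical payload for illustration only.
example = '{"success": true, "streams": {"orders": 5200}, "error": null}'
summary = summarize_run(example)
```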
pipeline_status
Return status for all configured pipelines.
Returns: JSON array, each entry with name, last_run, status, duration, records.
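A client could use this array to flag pipelines that need attention. The sketch below is one way to do that. The status value `"success"` and the sample entries are assumptions made for illustration, since the source does not list the possible status values.

```python
import json


def needs_attention(status_json: str, ok_status: str = "success") -> list[str]:
    """Return names of pipelines whose status differs from ok_status."""
    entries = json.loads(status_json)
    return [entry["name"] for entry in entries if entry.get("status") != ok_status]


# Made-up sample data; real values depend on your pipelines.
sample = json.dumps([
    {"name": "orders", "last_run": None, "status": "success", "duration": None, "records": 5200},
    {"name": "customers", "last_run": None, "status": "failed", "duration": None, "records": 0},
])
flagged = needs_attention(sample)
```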
pipeline_logs
Return the latest execution log for a pipeline.
Parameters:
name (string, required): Pipeline name
Returns: JSON with full execution log, or “No logs found”.
show_manifest
Return the manifest (catalog) of a bucket.
Parameters:
bucket (string, required): Bucket URI (e.g., s3://my-bucket, file:///tmp/lake)
Returns: JSON with table catalog including schemas, timestamps, and row counts.
validate_pipeline
Validate bucket connectivity and tap availability for a pipeline.
Parameters:
name (string, required): Pipeline name
Returns: JSON with pipeline, bucket_ok, tap_ok, and errors.
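One plausible way to interpret this result is to treat a pipeline as healthy only when both checks pass. The sketch below assumes `bucket_ok` and `tap_ok` are booleans and `errors` is a list of strings. The helper `is_healthy` and the sample payload are hypothetical.

```python
import json


def is_healthy(validation_json: str) -> tuple[bool, list[str]]:
    """Return (healthy, errors) for a validate_pipeline result."""
    report = json.loads(validation_json)
    # Healthy only if both the bucket check and the tap check passed.
    healthy = bool(report.get("bucket_ok")) and bool(report.get("tap_ok"))
    return healthy, list(report.get("errors") or [])


# Hypothetical payload for illustration only.
sample = '{"pipeline": "orders", "bucket_ok": true, "tap_ok": false, "errors": ["tap not installed"]}'
healthy, errors = is_healthy(sample)
```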
Resources
pipe://pipelines
An MCP resource that lists all pipeline names with their tap and bucket:
    [
      {"name": "orders", "tap": "tap-csv", "bucket": "s3://my-lake"},
      {"name": "customers", "tap": "tap-postgres", "bucket": "s3://my-lake"}
    ]
Example agent interactions
Once configured, you can interact with Pipe through Claude using natural language:
“What pipelines are configured?”
The agent calls list_pipelines and returns the list.
“Run the orders pipeline”
The agent calls run_pipeline(name="orders") and reports: “The orders pipeline completed successfully. 5,200 records were extracted across 1 stream.”
“Show me the status of all pipelines”
The agent calls pipeline_status and presents a formatted summary of each pipeline’s last run, status, duration, and record count.
“Do a full re-extraction of customers”
The agent calls run_pipeline(name="customers", full=True) to ignore incremental state and re-extract everything.
“Is the orders pipeline healthy? Check the bucket and tap.”
The agent calls validate_pipeline(name="orders") and reports whether the bucket is writable and the tap is available.
“What tables are in the s3://my-lake bucket?”
The agent calls show_manifest(bucket="s3://my-lake") and lists the available tables with their schemas and record counts.
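At the protocol level, each of the calls above is an MCP `tools/call` request carrying the tool name and its arguments. The sketch below builds those requests for the example prompts. `tool_call` is a hypothetical helper written for illustration; a real MCP client constructs and sends these messages itself.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(tool: str, /, **arguments) -> dict:
    """Build a JSON-RPC `tools/call` request for the named tool."""
    # `tool` is positional-only so that a tool argument called `name`
    # can still be passed as a keyword.
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


requests = [
    tool_call("list_pipelines"),
    tool_call("run_pipeline", name="orders"),
    tool_call("pipeline_status"),
    tool_call("run_pipeline", name="customers", full=True),
    tool_call("validate_pipeline", name="orders"),
    tool_call("show_manifest", bucket="s3://my-lake"),
]

# Python's True serializes to JSON true, matching the boolean `full` parameter.
full_rerun_json = json.dumps(requests[3])
```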