
Analyze Your Data Lake from Cursor IDE with DataSpoc MCP

Michael San Martim · 2026-04-23

Cursor is an AI-powered code editor. DataSpoc Lens exposes a data lake as an MCP server. Connect them and Cursor becomes a data analysis environment — explore schemas, write SQL, generate charts, and build reports without leaving your editor.

Setup

1. Install DataSpoc Lens

```sh
pip install 'dataspoc-lens[mcp]'
```

2. Configure Cursor’s MCP Settings

Open Cursor’s settings and navigate to the MCP configuration. Add the DataSpoc Lens server.

Create or edit .cursor/mcp.json in your project root:

```json
{
  "mcpServers": {
    "dataspoc-lens": {
      "command": "dataspoc-lens",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "s3://my-company-data"
      }
    }
  }
}
```

For GCS:

```json
{
  "mcpServers": {
    "dataspoc-lens": {
      "command": "dataspoc-lens",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "gs://my-company-data"
      }
    }
  }
}
```

3. Restart Cursor

After saving the config, restart Cursor. The MCP server will start automatically, and Cursor’s AI will discover the available tools.

What Cursor Can Do Now

Discover Tables

In Cursor’s chat, type:

What tables are available in my data lake?

Cursor calls lens.tables() via MCP:

Your data lake contains 8 tables:
raw_postgres_orders - 245,000 rows
raw_postgres_customers - 18,200 rows
raw_stripe_payments - 892,000 rows
raw_hubspot_contacts - 34,500 rows
curated_sales - 245,000 rows
curated_customer_360 - 18,200 rows
gold_revenue_by_region - 48 rows
gold_monthly_metrics - 36 rows

Explore Schemas

What columns does curated_sales have? Show me sample data.

Cursor calls lens.schema() and lens.query():

curated_sales schema:
order_id VARCHAR
customer_id VARCHAR
product VARCHAR
category VARCHAR
amount DOUBLE
quantity INTEGER
region VARCHAR
sale_date DATE
payment_method VARCHAR

Sample data:

| order_id | customer_id | product | amount | region | sale_date |
|----------|-------------|----------------|---------|----------------|------------|
| ORD-001 | CUST-042 | Widget Pro | 299.99 | North America | 2026-03-15 |
| ORD-002 | CUST-108 | Widget Basic | 49.99 | Europe | 2026-03-15 |
| ORD-003 | CUST-042 | Gadget Plus | 599.00 | North America | 2026-03-16 |

Write Analytical Queries

Show me revenue trends by month for the last 6 months, broken down by region.

Cursor generates and executes:

```sql
SELECT
  DATE_TRUNC('month', sale_date) AS month,
  region,
  SUM(amount) AS revenue,
  COUNT(*) AS orders,
  ROUND(AVG(amount), 2) AS avg_order_value
FROM curated_sales
WHERE sale_date >= '2025-10-01'
GROUP BY month, region
ORDER BY month, revenue DESC
```

And displays the results right in the chat.
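If you pull a result set like that into Python yourself, the month-by-region breakdown is a one-line pivot. A minimal sketch with pandas on toy data shaped like the query output above (the numbers are illustrative, not from the data lake):

```python
import pandas as pd

# Toy result set shaped like the revenue-trends query output (illustrative values)
df = pd.DataFrame({
    "month": ["2026-01", "2026-01", "2026-02", "2026-02"],
    "region": ["North America", "Europe", "North America", "Europe"],
    "revenue": [150000.0, 90000.0, 162000.0, 97000.0],
})

# One row per month, one column per region
trend = df.pivot(index="month", columns="region", values="revenue")
print(trend)
```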

Generate Analysis Code

Ask Cursor to create a Python analysis script based on the data it just discovered:

Write a Python script that analyzes customer cohorts by signup month
and tracks their monthly spending over time. Use dataspoc_lens to query.

Cursor writes directly in your editor:

"""Customer cohort analysis using DataSpoc Lens."""
import pandas as pd
import matplotlib.pyplot as plt
from dataspoc_lens import LensClient
lens = LensClient()
# Get customer cohorts (signup month) with their spending over time
df = lens.query("""
WITH cohorts AS (
SELECT
customer_id,
DATE_TRUNC('month', MIN(sale_date)) AS cohort_month
FROM curated_sales
GROUP BY customer_id
),
monthly_spend AS (
SELECT
s.customer_id,
DATE_TRUNC('month', s.sale_date) AS spend_month,
SUM(s.amount) AS monthly_total
FROM curated_sales s
GROUP BY s.customer_id, spend_month
)
SELECT
c.cohort_month,
m.spend_month,
DATEDIFF('month', c.cohort_month, m.spend_month) AS months_since_signup,
COUNT(DISTINCT m.customer_id) AS active_customers,
SUM(m.monthly_total) AS total_revenue,
AVG(m.monthly_total) AS avg_spend_per_customer
FROM cohorts c
JOIN monthly_spend m ON c.customer_id = m.customer_id
WHERE c.cohort_month >= '2025-07-01'
GROUP BY c.cohort_month, m.spend_month, months_since_signup
ORDER BY c.cohort_month, months_since_signup
""")
# Pivot for cohort retention chart
pivot = df.pivot_table(
index="cohort_month",
columns="months_since_signup",
values="active_customers",
aggfunc="sum"
)
# Calculate retention rates
retention = pivot.div(pivot[0], axis=0) * 100
# Plot
fig, ax = plt.subplots(figsize=(12, 8))
import seaborn as sns
sns.heatmap(
retention,
annot=True,
fmt=".0f",
cmap="YlOrRd_r",
ax=ax,
vmin=0,
vmax=100,
)
ax.set_title("Customer Cohort Retention (%)")
ax.set_xlabel("Months Since Signup")
ax.set_ylabel("Cohort Month")
plt.tight_layout()
plt.savefig("cohort_retention.png", dpi=150)
plt.show()
print("Chart saved to cohort_retention.png")

You can run this directly from Cursor’s terminal.
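The retention line in that script, `pivot.div(pivot[0], axis=0) * 100`, simply divides each cohort row by its month-0 count. A toy example with made-up numbers makes the arithmetic concrete:

```python
import pandas as pd

# Toy cohort pivot: rows are cohort months, columns are months since signup
# (illustrative counts, not real data)
pivot = pd.DataFrame(
    {0: [100, 80], 1: [60, 48], 2: [40, 24]},
    index=["2025-07", "2025-08"],
)

# Each cohort's active customers as a percentage of its month-0 size
retention = pivot.div(pivot[0], axis=0) * 100
print(retention)
```

Every cohort starts at 100% in month 0; the 2025-08 cohort, for example, retains 48/80 = 60% in month 1 and 24/80 = 30% in month 2.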

Build Charts Interactively

Create a bar chart showing the top 10 products by revenue this quarter.

Cursor queries the data and generates the visualization code:

```python
from dataspoc_lens import LensClient
import matplotlib.pyplot as plt

lens = LensClient()
df = lens.query("""
    SELECT product, SUM(amount) AS revenue
    FROM curated_sales
    WHERE sale_date >= '2026-01-01'
    GROUP BY product
    ORDER BY revenue DESC
    LIMIT 10
""")

fig, ax = plt.subplots(figsize=(10, 6))
ax.barh(df["product"][::-1], df["revenue"][::-1])
ax.set_xlabel("Revenue ($)")
ax.set_title("Top 10 Products by Revenue — Q1 2026")

# Label each bar with its revenue value
for i, val in enumerate(df["revenue"][::-1]):
    ax.text(val + 1000, i, f"${val:,.0f}", va="center", fontsize=9)

plt.tight_layout()
plt.savefig("top_products.png", dpi=150)
plt.show()
```

Generate Markdown Reports

Write a weekly metrics report for the last 7 days. Include revenue,
order count, top products, and regional breakdown. Format as markdown.

Cursor queries the lake multiple times and assembles a report:

```markdown
# Weekly Metrics Report: April 8-14, 2026

## Summary

- **Total Revenue**: $987,000 (+8.3% vs prior week)
- **Orders**: 3,420 (+5.1% vs prior week)
- **Average Order Value**: $288.60 (+3.0% vs prior week)

## Daily Revenue

| Date | Revenue | Orders |
|------|---------|--------|
| Apr 8 | $142,000 | 490 |
| Apr 9 | $138,000 | 478 |
| Apr 10 | $145,000 | 502 |
| Apr 11 | $151,000 | 523 |
| Apr 12 | $133,000 | 461 |
| Apr 13 | $128,000 | 443 |
| Apr 14 | $150,000 | 523 |

## Top 5 Products

1. Widget Pro — $124,000 (12.6%)
2. Gadget Plus — $98,000 (9.9%)
...
```
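Reports like this are also easy to assemble programmatically once you have the query results. A sketch using only the standard library, with hard-coded toy values standing in for `lens.query()` output:

```python
# Build a weekly markdown report from query results.
# The daily figures below are illustrative stand-ins for lens.query() output.
daily = [
    ("Apr 8", 142000, 490),
    ("Apr 9", 138000, 478),
]

lines = [
    "# Weekly Metrics Report",
    "",
    "| Date | Revenue | Orders |",
    "|------|---------|--------|",
]
for date, revenue, orders in daily:
    lines.append(f"| {date} | ${revenue:,} | {orders} |")

report = "\n".join(lines)
print(report)
```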

The Workflow

The typical Cursor + DataSpoc workflow:

  1. Open Cursor in your project directory
  2. Ask about data — Cursor discovers tables and schemas via MCP
  3. Iterate on queries — refine SQL based on results, all in chat
  4. Generate code — ask Cursor to write Python scripts that use LensClient
  5. Run and refine — execute in Cursor’s terminal, review output, iterate
  6. Save artifacts — charts, reports, and analysis scripts stay in your project

Comparison: Traditional BI vs Jupyter vs CLI vs Cursor

| Task | BI Tool (Metabase) | Jupyter | CLI (dataspoc-lens) | Cursor + MCP |
|------|--------------------|---------|---------------------|--------------|
| Explore tables | Click through UI | lens.tables() | dataspoc-lens tables | "What tables exist?" |
| Write SQL | SQL editor | Cell with lens.query() | Pipe to command | "Show me revenue by month" |
| Generate charts | Drag-and-drop | Matplotlib code | Export + separate tool | "Create a bar chart" |
| Write reports | Dashboard | Markdown cells | Manual | "Write a weekly report" |
| Iterate | Refresh dashboard | Re-run cells | Re-run command | Continue conversation |
| Save work | Dashboard link | .ipynb file | Shell history | .py files in project |
| Learning curve | Medium | Medium | Low | Low |

Cursor with MCP combines the natural language interface of a BI tool with the code generation of Jupyter and the speed of a CLI. You describe what you want, Cursor writes the code, and you run it.

Tips for Effective Data Analysis in Cursor

Start with exploration:

What tables are available? Show me the schema of the largest table.

Be specific about time ranges:

Revenue by product category for March 2026 only, excluding returns.

Ask for comparisons:

Compare this month vs last month for all key metrics.
Show percentage changes.

Request reusable code:

Write a function I can reuse to generate monthly reports.
Take the month as a parameter.

Chain analyses:

First show me which customers churned last month.
Then analyze what they had in common (plan, region, usage).
Then suggest retention strategies based on the patterns.

Security

Cursor’s MCP integration inherits all of DataSpoc’s security properties:

  • Read-only: The MCP server rejects write SQL
  • Cloud IAM: Uses your existing credentials (AWS SSO, gcloud, Azure CLI)
  • Scoped: Only accesses the bucket you configure
  • Auditable: Every query is SQL that gets logged
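The read-only guarantee amounts to a guard on the statement type before execution. This is not DataSpoc's actual implementation, just an illustrative sketch of the idea:

```python
import re

# Illustrative only: reject any statement that begins with a write keyword.
# DataSpoc's real check happens server-side; this just sketches the concept.
WRITE_KEYWORDS = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|DROP|CREATE|ALTER|TRUNCATE|COPY)\b",
    re.IGNORECASE,
)

def is_read_only(sql: str) -> bool:
    """Return True if the statement looks like a read (SELECT/WITH)."""
    return WRITE_KEYWORDS.match(sql) is None

print(is_read_only("SELECT * FROM curated_sales"))  # True
print(is_read_only("DROP TABLE curated_sales"))     # False
```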

Your data stays in your cloud bucket. Cursor’s AI generates SQL, Lens executes it locally via DuckDB, and results stay on your machine. No data leaves your environment unless you explicitly export it.
