Tags: cursor, mcp, data-analysis, ai-agents, ide

Analyze Your Data Lake from Cursor IDE with DataSpoc MCP

Michael San Martim · 2026-04-23

Cursor is an AI-powered code editor. DataSpoc Lens exposes a data lake as an MCP server. Connect the two and Cursor becomes a data analysis environment: explore schemas, write SQL, generate charts, and build reports without leaving your editor.

Setup

1. Install DataSpoc Lens

pip install "dataspoc-lens[mcp]"

2. Configure Cursor's MCP Settings

Open Cursor's settings and navigate to the MCP configuration, then add the DataSpoc Lens server.

Create or edit .cursor/mcp.json in your project root:

{
  "mcpServers": {
    "dataspoc-lens": {
      "command": "dataspoc-lens",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "s3://my-company-data"
      }
    }
  }
}

For GCS, only the bucket URI changes:

{
  "mcpServers": {
    "dataspoc-lens": {
      "command": "dataspoc-lens",
      "args": ["mcp"],
      "env": {
        "DATASPOC_BUCKET": "gs://my-company-data"
      }
    }
  }
}
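
Because the two configurations differ only in the bucket URI, the file can also be generated. The following is a minimal sketch: `build_mcp_config` is a hypothetical helper, not part of DataSpoc Lens, and the check that accepts only `s3://` and `gs://` is an assumption based on the two examples above.

```python
import json


def build_mcp_config(bucket_uri: str) -> dict:
    """Return a Cursor MCP config that launches DataSpoc Lens for one bucket."""
    # Assumption: only the two URI schemes shown in this post are accepted.
    if not bucket_uri.startswith(("s3://", "gs://")):
        raise ValueError(f"unsupported bucket URI: {bucket_uri!r}")
    return {
        "mcpServers": {
            "dataspoc-lens": {
                "command": "dataspoc-lens",
                "args": ["mcp"],
                "env": {"DATASPOC_BUCKET": bucket_uri},
            }
        }
    }


# Print JSON that can be pasted into .cursor/mcp.json
print(json.dumps(build_mcp_config("gs://my-company-data"), indent=2))
```

Printing rather than writing the file keeps the sketch free of side effects; redirect the output to .cursor/mcp.json if you want to save it.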

3. Restart Cursor

After saving the configuration, restart Cursor. The MCP server starts automatically, and Cursor's AI discovers the available tools.
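
Cursor starts the server by running the `command` from mcp.json, so that executable has to be resolvable. A quick check, sketched below, is to look it up on PATH before restarting. `find_command` is a hypothetical helper, and the sketch assumes Cursor sees the same PATH as the shell you run it from, which may not hold for apps launched from a desktop launcher.

```python
import shutil
from typing import Optional


def find_command(name: str = "dataspoc-lens") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


path = find_command()
if path is None:
    print("dataspoc-lens was not found on PATH; the MCP server may fail to start")
else:
    print(f"Cursor should be able to launch: {path} mcp")
```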

What Cursor Can Do Now

Discover Tables

In the Cursor chat, type:

What tables are available in my data lake?

Cursor calls lens.tables() via MCP and replies:

Your data lake contains 8 tables:

raw_postgres_orders        - 245,000 rows
raw_postgres_customers     - 18,200 rows
raw_stripe_payments        - 892,000 rows
raw_hubspot_contacts       - 34,500 rows
curated_sales              - 245,000 rows
curated_customer_360       - 18,200 rows
gold_revenue_by_region     - 48 rows
gold_monthly_metrics       - 36 rows

Explore Schemas

What columns does curated_sales have? Show me sample data.

Cursor calls lens.schema() and lens.query():

curated_sales schema:
  order_id        VARCHAR
  customer_id     VARCHAR
  product         VARCHAR
  category        VARCHAR
  amount          DOUBLE
  quantity        INTEGER
  region          VARCHAR
  sale_date       DATE
  payment_method  VARCHAR

Sample data (3 rows):
| order_id | customer_id | product        | amount  | region         | sale_date  |
|----------|-------------|----------------|---------|----------------|------------|
| ORD-001  | CUST-042    | Widget Pro     | 299.99  | North America  | 2026-03-15 |
| ORD-002  | CUST-108    | Widget Basic   | 49.99   | Europe         | 2026-03-15 |
| ORD-003  | CUST-042    | Gadget Plus    | 599.00  | North America  | 2026-03-16 |
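
The post does not show the SQL behind the sample rows; a plausible form is a plain SELECT with a LIMIT. The sketch below assumes that shape. `preview_sql` and `preview` are hypothetical helpers, and the only thing taken from the post is that the client exposes a `query()` method that accepts SQL text.

```python
import re


def preview_sql(table: str, limit: int = 5) -> str:
    """Build a SELECT that returns the first `limit` rows of `table`."""
    # Allow only plain identifiers so the table name cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"invalid table name: {table!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM {table} LIMIT {limit}"


def preview(client, table: str, limit: int = 5):
    """Run the preview query through any object exposing query(sql)."""
    return client.query(preview_sql(table, limit))


print(preview_sql("curated_sales"))
```

Passing the client in as an argument keeps the helper independent of how the client is constructed, so it can be exercised with a stub before pointing it at a real lake.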

Write Analytical Queries

Show me revenue trends by month for the last 6 months, broken down by region.

Cursor generates and runs:

SELECT
    DATE_TRUNC('month', sale_date) AS month,
    region,
    SUM(amount) AS revenue,
    COUNT(*) AS orders,
    ROUND(AVG(amount), 2) AS avg_order_value
FROM curated_sales
WHERE sale_date >= '2025-10-01'
GROUP BY month, region
ORDER BY month, revenue DESC

It then shows the results directly in the chat.
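
The cutoff date in the query is hard-coded. One reading of "the last 6 months", given that this post is dated 2026-04-23, is the first day of the month six months earlier. The sketch below computes that date with the standard library; `window_start` is a hypothetical helper, and the month-aligned interpretation is an assumption.

```python
from datetime import date


def window_start(today: date, months_back: int) -> date:
    """Return the first day of the month `months_back` months before `today`."""
    # Count months from year 0 so year boundaries fall out of integer division.
    total = today.year * 12 + (today.month - 1) - months_back
    return date(total // 12, total % 12 + 1, 1)


start = window_start(date(2026, 4, 23), 6)
print(f"WHERE sale_date >= '{start.isoformat()}'")
```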

Generate Analysis Code

Ask Cursor to create a Python analysis script based on the data it just discovered:

Write a Python script that analyzes customer cohorts by signup month
and tracks their monthly spending over time. Use dataspoc_lens to query.

Cursor writes directly into your editor:

"""Customer cohort analysis using DataSpoc Lens."""
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from dataspoc_lens import LensClient

lens = LensClient()

# Build cohorts from each customer's first purchase month (used here as a
# proxy for signup month), then track their spending over time
df = lens.query("""
    WITH cohorts AS (
        SELECT
            customer_id,
            DATE_TRUNC('month', MIN(sale_date)) AS cohort_month
        FROM curated_sales
        GROUP BY customer_id
    ),
    monthly_spend AS (
        SELECT
            s.customer_id,
            DATE_TRUNC('month', s.sale_date) AS spend_month,
            SUM(s.amount) AS monthly_total
        FROM curated_sales s
        GROUP BY s.customer_id, spend_month
    )
    SELECT
        c.cohort_month,
        m.spend_month,
        DATEDIFF('month', c.cohort_month, m.spend_month) AS months_since_signup,
        COUNT(DISTINCT m.customer_id) AS active_customers,
        SUM(m.monthly_total) AS total_revenue,
        AVG(m.monthly_total) AS avg_spend_per_customer
    FROM cohorts c
    JOIN monthly_spend m ON c.customer_id = m.customer_id
    WHERE c.cohort_month >= '2025-07-01'
    GROUP BY c.cohort_month, m.spend_month, months_since_signup
    ORDER BY c.cohort_month, months_since_signup
""")

# Pivot for cohort retention chart: one row per cohort, one column per month offset
pivot = df.pivot_table(
    index="cohort_month",
    columns="months_since_signup",
    values="active_customers",
    aggfunc="sum"
)

# Calculate retention rates as a percentage of each cohort's month-0 size
retention = pivot.div(pivot[0], axis=0) * 100

# Plot
fig, ax = plt.subplots(figsize=(12, 8))
sns.heatmap(
    retention,
    annot=True,
    fmt=".0f",
    cmap="YlOrRd_r",
    ax=ax,
    vmin=0,
    vmax=100,
)
ax.set_title("Customer Cohort Retention (%)")
ax.set_xlabel("Months Since Signup")
ax.set_ylabel("Cohort Month")
plt.tight_layout()
plt.savefig("cohort_retention.png", dpi=150)
plt.show()
print("Chart saved to cohort_retention.png")

You can run this directly from Cursor's terminal.

Build Charts Interactively
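
The retention step in the script divides each cohort's active-customer count by its count in month 0. The same arithmetic for a single cohort, in plain Python, looks roughly like this. The counts are illustrative only and do not come from the data lake, and `retention_rates` is a hypothetical helper.

```python
def retention_rates(active_by_month):
    """Convert active-customer counts per month offset into percentages of month 0."""
    base = active_by_month[0]
    if base == 0:
        raise ValueError("cohort has no customers in month 0")
    return {month: 100.0 * count / base for month, count in sorted(active_by_month.items())}


# Illustrative counts only: month offset -> active customers in that month
cohort = {0: 200, 1: 150, 2: 120, 3: 90}
for month, pct in retention_rates(cohort).items():
    print(f"month {month}: {pct:.0f}%")
```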

Create a bar chart showing the top 10 products by revenue this quarter.

Cursor queries the data and generates the visualization code:

from dataspoc_lens import LensClient
import matplotlib.pyplot as plt

lens = LensClient()

df = lens.query("""
    SELECT product, SUM(amount) AS revenue
    FROM curated_sales
    WHERE sale_date >= '2026-01-01'
      AND sale_date < '2026-04-01'  -- keep the range inside Q1 2026, matching the chart title
    GROUP BY product
    ORDER BY revenue DESC
    LIMIT 10
""")

fig, ax = plt.subplots(figsize=(10, 6))
ax.barh(df["product"][::-1], df["revenue"][::-1])
ax.set_xlabel("Revenue ($)")
ax.set_title("Top 10 Products by Revenue — Q1 2026")

# Label each bar with its revenue value
for i, val in enumerate(df["revenue"][::-1]):
    ax.text(val + 1000, i, f"${val:,.0f}", va="center", fontsize=9)

plt.tight_layout()
plt.savefig("top_products.png", dpi=150)
plt.show()
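
"This quarter" depends on when the script runs, while the generated query hard-codes its dates. A sketch of deriving the quarter boundaries from a date instead is shown below. `quarter_bounds` is a hypothetical helper, and the example date is taken from the sample rows shown earlier.

```python
from datetime import date
from typing import Tuple


def quarter_bounds(day: date) -> Tuple[date, date]:
    """Return (first day of the quarter, first day of the next quarter)."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    start = date(day.year, first_month, 1)
    # The next quarter begins three months later, rolling into the next year after Q4.
    if first_month == 10:
        end = date(day.year + 1, 1, 1)
    else:
        end = date(day.year, first_month + 3, 1)
    return start, end


start, end = quarter_bounds(date(2026, 3, 15))
print(f"WHERE sale_date >= '{start}' AND sale_date < '{end}'")
```

Using a half-open range (inclusive start, exclusive end) avoids having to know the last day of the quarter.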

Generate Markdown Reports

Write a weekly metrics report for the last 7 days. Include revenue,
order count, top products, and regional breakdown. Format as markdown.

Cursor queries the lake several times and assembles a report:

# Weekly Metrics Report: April 8-14, 2026

## Summary
- **Total Revenue**: $987,000 (+8.3% vs prior week)
- **Orders**: 3,420 (+5.1% vs prior week)
- **Average Order Value**: $288.60 (+3.0% vs prior week)

## Daily Revenue
| Date | Revenue | Orders |
|------|---------|--------|
| Apr 8  | $142,000 | 490 |
| Apr 9  | $138,000 | 478 |
| Apr 10 | $145,000 | 502 |
| Apr 11 | $151,000 | 523 |
| Apr 12 | $133,000 | 461 |
| Apr 13 | $128,000 | 443 |
| Apr 14 | $150,000 | 523 |

## Top 5 Products
1. Widget Pro — $124,000 (12.6%)
2. Gadget Plus — $98,000 (9.9%)
...
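
The summary figures in a report like this follow from the daily rows, which makes it easy to sanity-check what the AI wrote. The sketch below recomputes the totals, the average order value, and a product's revenue share from the numbers shown in the sample report; `share_of_total` is a hypothetical helper.

```python
# (date label, revenue in USD, orders) copied from the Daily Revenue table above
daily = [
    ("Apr 8", 142_000, 490),
    ("Apr 9", 138_000, 478),
    ("Apr 10", 145_000, 502),
    ("Apr 11", 151_000, 523),
    ("Apr 12", 133_000, 461),
    ("Apr 13", 128_000, 443),
    ("Apr 14", 150_000, 523),
]

total_revenue = sum(revenue for _, revenue, _ in daily)
total_orders = sum(orders for _, _, orders in daily)
average_order_value = total_revenue / total_orders


def share_of_total(amount: float) -> float:
    """Return `amount` as a percentage of the week's total revenue."""
    return 100 * amount / total_revenue


print(f"Total revenue: ${total_revenue:,}")
print(f"Orders: {total_orders:,}")
print(f"Average order value: ${average_order_value:,.2f}")
print(f"Widget Pro share: {share_of_total(124_000):.1f}%")
```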

The Workflow

A typical Cursor + DataSpoc workflow:

1. Open Cursor in your project directory.
2. Ask about the data: Cursor discovers tables and schemas via MCP.
3. Iterate on queries: refine the SQL based on the results, all in the chat.
4. Generate code: ask Cursor to write Python scripts that use LensClient.
5. Run and refine: run the scripts in Cursor's terminal, review the output, iterate.
6. Save artifacts: charts, reports, and analysis scripts stay in your project.

Comparison: Traditional BI vs Jupyter vs CLI vs Cursor

| Task | BI tool (Metabase) | Jupyter | CLI (`dataspoc-lens`) | Cursor + MCP |
|------|--------------------|---------|-----------------------|--------------|
| Explore tables | Click through the UI | `lens.tables()` | `dataspoc-lens tables` | "What tables exist?" |
| Write SQL | SQL editor | Cell with `lens.query()` | Pipe into the command | "Show me revenue by month" |
| Generate charts | Drag-and-drop | Matplotlib code | Export + separate tool | "Create a bar chart" |
| Write reports | Dashboard | Markdown cells | Manual | "Write a weekly report" |
| Iterate | Refresh the dashboard | Re-run cells | Re-run the command | Continue the conversation |
| Save work | Dashboard link | .ipynb file | Shell history | .py files in the project |
| Learning curve | Medium | Medium | Low | Low |

Cursor with MCP combines the low-friction exploration of a BI tool, the code-first workflow of Jupyter, and the speed of a CLI behind a natural-language interface. You describe what you want, Cursor writes the code, and you run it.

Tips for Effective Data Analysis in Cursor

Start with exploration:

What tables are available? Show me the schema of the largest table.

Be specific about time ranges:

Revenue by product category for March 2026 only, excluding returns.

Ask for comparisons:

Compare this month vs last month for all key metrics.
Show percentage changes.

Request reusable code:

Write a function I can reuse to generate monthly reports.
Take the month as a parameter.

Chain analyses:

First show me which customers churned last month.
Then analyze what they had in common (plan, region, usage).
Then suggest retention strategies based on the patterns.
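
To illustrate the reusable-code tip, here is roughly what a month-parameterized function might look like. This is a sketch under assumptions, not what Cursor will necessarily generate: `month_bounds` and `monthly_report_sql` are hypothetical names, and the query reuses the curated_sales columns shown earlier.

```python
from datetime import date
from typing import Tuple


def month_bounds(month: str) -> Tuple[date, date]:
    """Parse 'YYYY-MM' and return (first day, first day of the following month)."""
    year, mon = (int(part) for part in month.split("-"))
    start = date(year, mon, 1)
    end = date(year + 1, 1, 1) if mon == 12 else date(year, mon + 1, 1)
    return start, end


def monthly_report_sql(month: str) -> str:
    """Build a revenue-by-region query over curated_sales for one calendar month."""
    start, end = month_bounds(month)
    return (
        "SELECT region, SUM(amount) AS revenue, COUNT(*) AS orders\n"
        "FROM curated_sales\n"
        f"WHERE sale_date >= '{start}' AND sale_date < '{end}'\n"
        "GROUP BY region\n"
        "ORDER BY revenue DESC"
    )


print(monthly_report_sql("2026-03"))
```

Because the month string is parsed into integers and rebuilt as dates, malformed input raises an error instead of reaching the SQL text. The resulting string can be passed to `lens.query(...)` in the same way as the scripts above.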

Security

Cursor's MCP integration inherits all of DataSpoc's security properties:

Read-only: The MCP server rejects write SQL
Cloud IAM: Uses your existing credentials (AWS SSO, gcloud, Azure CLI)
Limited scope: Only accesses the bucket you configure
Auditable: Every query is SQL that gets logged

Your data stays in your cloud bucket. Cursor's AI generates SQL, Lens executes it locally via DuckDB, and the results stay on your machine. No data leaves your environment unless you explicitly export it.
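
The post does not describe how the server decides that a statement is a write. Purely as an illustration of the idea, a naive keyword-based check could look like the sketch below. It is not DataSpoc's implementation: `looks_read_only` and the keyword list are assumptions, and a production guard would more likely rely on a SQL parser or a read-only database connection.

```python
import re

# Assumption: statement keywords treated as read-only in this sketch.
READ_ONLY_KEYWORDS = {"SELECT", "WITH", "DESCRIBE", "SHOW", "EXPLAIN"}


def looks_read_only(sql: str) -> bool:
    """Return True if `sql` is a single statement starting with a read-only keyword."""
    statement = sql.strip().rstrip(";").strip()
    # Reject empty input and anything that still contains ';' (more than one statement).
    if not statement or ";" in statement:
        return False
    first_word = re.match(r"[A-Za-z]+", statement)
    return first_word is not None and first_word.group(0).upper() in READ_ONLY_KEYWORDS


print(looks_read_only("SELECT * FROM curated_sales LIMIT 5"))
print(looks_read_only("DROP TABLE curated_sales"))
```

The check is deliberately conservative: it also rejects harmless queries that contain a semicolon inside a string literal, and it cannot see writes hidden behind a leading WITH or EXPLAIN, which is why a keyword filter alone is not a sufficient safeguard.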
