We don't just sell tools. We operate them for you.

Most companies buy a data platform and then spend 6 months figuring out how to use it. We do it differently. We design, build, deploy, and operate your entire data stack — using the same open-source tools we built. You get insights. We handle the plumbing.

This is what we keep hearing

"6 months"

"We started building our data lake 6 months ago. We still can't answer basic business questions. The pipeline breaks every week and nobody knows how to fix it."

— Head of Data, Series B startup

"$12k/month"

"Our Snowflake bill is $12k/month. Most queries are SELECT COUNT(*). We know it's overkill but we don't have time to migrate."

— CTO, 40-person company

"No one"

"We have data in 5 different places and no one on the team who can build a proper data lake. We need someone to set it up and teach us how to use it."

— CEO, early-stage company

We build it. We run it. You make decisions.

Our services cover the full lifecycle — from understanding your data to putting AI agents on top of it.

1

Week 1

Discovery & Architecture

We map your data sources, understand your business questions, and design the bucket architecture. We define who needs access to what, how data flows, and where AI agents will connect.

  • Data source inventory
  • IAM design
  • Bucket architecture
  • Agent integration plan
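To make "who needs access to what" concrete, here is a minimal sketch of an access map over bucket prefixes. The role names, the `raw/` and `curated/` prefixes, and the `can_access` helper are all hypothetical illustrations, not a DataSpoc convention. In a real deployment this intent would be expressed in your cloud provider's IAM policies.

```python
# Hypothetical roles and prefixes, for illustration only.
ACCESS_MAP = {
    "pipelines": {"raw/": "write", "curated/": "write"},
    "analysts": {"curated/": "read"},
    "ai_agents": {"curated/": "read"},
}


def can_access(role: str, path: str, action: str) -> bool:
    """Return True if `role` may perform `action` ('read' or 'write') on `path`."""
    grants = ACCESS_MAP.get(role, {})
    for prefix, level in grants.items():
        if path.startswith(prefix):
            # A write grant implies read; a read grant never implies write.
            return action == "read" or level == "write"
    # No matching prefix means no access at all.
    return False


print(can_access("analysts", "curated/sales/2024.parquet", "read"))
print(can_access("ai_agents", "curated/sales/2024.parquet", "write"))
```

Writing the map down this early forces the question of which prefixes agents may read before any pipeline exists.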
2

Weeks 2–3

Build & Deploy Pipelines

We configure DataSpoc Pipe for every data source. Databases, APIs, spreadsheets — everything flows into your bucket as organized Parquet. Incremental extraction, scheduling, monitoring. We build the connectors you need, including custom ones.

dataspoc-pipe add postgres-production
dataspoc-pipe add stripe-payments
dataspoc-pipe add hubspot-crm
dataspoc-pipe add internal-api  # custom connector
dataspoc-pipe run _ --all
dataspoc-pipe schedule install

  • 400+ Singer taps
  • Custom connectors
  • Incremental sync
  • Scheduled runs
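Incremental extraction means each run pulls only the rows that changed since the previous run. The sketch below shows the general bookmark technique in plain Python. It is not DataSpoc Pipe's implementation, and the `extract_incremental` function and the `updated_at` column are assumptions made for illustration.

```python
from typing import Iterable, Optional


def extract_incremental(rows: Iterable[dict], bookmark: Optional[str],
                        key: str = "updated_at"):
    """Return the rows newer than `bookmark`, plus the bookmark for the next run.

    Assumes `key` holds ISO 8601 timestamps in one timezone, which compare
    correctly as plain strings.
    """
    fresh = [r for r in rows if bookmark is None or r[key] > bookmark]
    new_bookmark = max((r[key] for r in fresh), default=bookmark)
    return fresh, new_bookmark


source = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z"},
]
first, state = extract_incremental(source, None)    # first run: full load
source.append({"id": 3, "updated_at": "2024-01-03T00:00:00Z"})
second, state = extract_incremental(source, state)  # later run: new rows only
third, state = extract_incremental(source, state)   # nothing changed
```

The bookmark has to be persisted between runs. If it is lost, the next run falls back to a full load.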
3

Week 3

Query Layer & Self-Service

We set up DataSpoc Lens so your team can query the lake. SQL shell for engineers, Jupyter for analysts, AI Ask for everyone. We create the curated views and transforms your business needs — and train your team to create their own.

dataspoc-lens add-bucket s3://company-data
dataspoc-lens catalog
dataspoc-lens ask "monthly revenue by product line"
dataspoc-lens notebook  # Jupyter for the analysts

  • SQL + AI queries
  • Jupyter notebooks
  • SQL transforms
  • Team training
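A curated view is a saved query that gives the team a clean, named table to work from. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine so that it runs anywhere. The `orders` table, its columns, and the `monthly_revenue` view are hypothetical, and the SQL dialect that Lens accepts may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine, not the Lens query layer
conn.executescript("""
CREATE TABLE orders (order_date TEXT, product_line TEXT, amount REAL);
INSERT INTO orders VALUES
  ('2024-01-05', 'hardware', 100.0),
  ('2024-01-20', 'hardware', 50.0),
  ('2024-01-11', 'software', 30.0),
  ('2024-02-02', 'software', 70.0);

-- Curated view: monthly revenue by product line.
CREATE VIEW monthly_revenue AS
SELECT substr(order_date, 1, 7) AS month,
       product_line,
       SUM(amount) AS revenue
FROM orders
GROUP BY substr(order_date, 1, 7), product_line;
""")

rows = conn.execute(
    "SELECT month, product_line, revenue FROM monthly_revenue "
    "ORDER BY month, product_line"
).fetchall()
for row in rows:
    print(row)
```

Once a view like this exists, a plain-English question such as "monthly revenue by product line" has an obvious table to answer from.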
4

Week 4

AI Agent Integration

We connect your AI tools (Claude, Cursor, custom agents) directly to your data lake via MCP. Your agents get governed, read-only access to real data. No RAG hacks: answers come from SQL run against your actual tables, which leaves far less room for hallucinated numbers. We configure the MCP servers, set up the SDK integrations, and validate that agents return accurate answers.

dataspoc-lens mcp   # agents query your lake
dataspoc-pipe mcp   # agents manage pipelines

# Claude: "How did sales perform last week?"
# → Real SQL, real data, real answer.

  • MCP servers
  • Python SDK
  • Agent validation
  • Read-only access
5

Ongoing

Operate & Evolve

We don't disappear after deployment. We monitor pipelines, add new sources as your business grows, optimize queries, train new team members, and keep your AI agents sharp. You focus on decisions. We keep the data flowing.

  • Pipeline monitoring
  • New source onboarding
  • Query optimization
  • Team training
  • Agent tuning
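One simple form of pipeline monitoring is a freshness check: flag any pipeline whose last successful run is older than an agreed threshold. The sketch below is a generic illustration. The `stale_pipelines` function and the timestamps are assumptions, and it does not describe how DataSpoc Pipe records run history.

```python
from datetime import datetime, timedelta, timezone


def stale_pipelines(last_success: dict, max_age: timedelta, now: datetime) -> list:
    """Return the names of pipelines whose last success is older than `max_age`."""
    return sorted(name for name, ts in last_success.items() if now - ts > max_age)


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
last_success = {
    "postgres-production": datetime(2024, 1, 10, 11, 0, tzinfo=timezone.utc),
    "stripe-payments": datetime(2024, 1, 8, 9, 0, tzinfo=timezone.utc),
}
stale = stale_pipelines(last_success, timedelta(hours=24), now)
print(stale)
```

Passing `now` in as an argument keeps the check deterministic and easy to test.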

What you get

A complete, running data platform — not a proof of concept.

📦

Production Data Lake

Organized Parquet files in your cloud bucket. Multi-source, incremental, scheduled. You own every byte.

🔍

Self-Service Analytics

Your team queries the lake with SQL, notebooks, or plain English. No more waiting on the data team.

🤖

AI-Ready Infrastructure

MCP servers running. AI agents connected. Real data flowing to Claude, Cursor, or your custom agents.

🎓

Trained Team

Your people know how to use, extend, and operate the platform. We transfer knowledge, not dependency.

Choose what you need

Full platform deployment or individual services.

Most popular

Full Platform Setup

The complete package. We build your data lake from scratch, set up ingestion, configure the query layer, connect AI agents, and train your team. Production-ready in 4 weeks.

  • Architecture design
  • Pipe: all pipelines configured and scheduled
  • Lens: query layer + notebooks + AI Ask
  • MCP servers for AI agents
  • Team training (2 sessions)
  • 30 days of operational support
Get a quote

Custom Connectors

We build Singer taps for your internal APIs, proprietary databases, or SaaS tools that don't have a connector yet. Delivered with tests, docs, and support for incremental replication.

  • Singer-compatible tap
  • Incremental replication
  • Schema discovery
  • Tests and documentation
Get a quote
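For readers new to Singer: a tap is a program that writes JSON messages, one per line, to standard output. A SCHEMA message describes the stream, RECORD messages carry the rows, and a STATE message stores the bookmark used for incremental replication. The sketch below shows that message flow in plain Python. The `singer_messages` helper and the `invoices` stream are hypothetical. The Singer specification leaves the layout of the STATE `value` to each tap, so the `bookmarks` structure used here is a common convention rather than a requirement.

```python
import json


def singer_messages(stream, schema, key_properties, records, bookmark_key):
    """Yield Singer-style SCHEMA, RECORD, and STATE messages as JSON lines."""
    yield json.dumps({
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    })
    latest = None
    for record in records:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": record})
        value = record[bookmark_key]
        latest = value if latest is None else max(latest, value)
    if latest is not None:
        # STATE layout is tap-defined; "bookmarks" is a convention, not a rule.
        yield json.dumps({
            "type": "STATE",
            "value": {"bookmarks": {stream: {bookmark_key: latest}}},
        })


schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "updated_at": {"type": "string"}},
}
records = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z"},
]
lines = list(singer_messages("invoices", schema, ["id"], records, "updated_at"))
for line in lines:
    print(line)
```

A real tap would also read a config file and the previous state, and would discover the schema from the source instead of hard-coding it.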

ML & Predictions

We build and deploy machine learning models on your data lake. From problem framing to production predictions that your team can query with SQL.

  • Problem framing and data audit
  • Feature engineering
  • Model training and evaluation
  • Predictions as SQL tables in Lens
Get a quote

Training & Workshops

Hands-on workshops for your team. We teach data lake architecture, DataSpoc tools, SQL analytics, and how to build AI agent integrations — with your own data.

  • Data lake fundamentals
  • Pipe + Lens hands-on
  • AI agent integration
  • Uses your real data
Get a quote

Why work with us

🔧

We built the tools

We're the creators of DataSpoc Pipe, Lens, and ML. Nobody knows these tools better. When something doesn't fit, we extend the platform rather than work around it.

🎯

Outcome-focused

We don't sell hours. We sell a running platform. If the pipeline breaks, we fix it. If the AI agent gives wrong answers, we tune it. Your success is our deliverable.

🌐

No lock-in

Everything we build runs on open-source tools, open formats (Parquet), and your cloud account. If you want to walk away, you take everything with you. We earn your business every month.

From zero to insights in 4 weeks.

Week 1: we understand your data. Weeks 2–3: pipelines are running. Week 4: your team is querying and your AI agents are connected. Week 5: you wonder why you ever considered Databricks.

Open source. Your cloud. Your data. Our expertise.

Get in touch

Tell us about your data challenges. We'll respond within 24 hours with an honest assessment of how we can help, or tell you if you'd be better off doing it yourself with our docs.

services@dataspoc.com

Location

Brazil

Languages

English, Portuguese, Spanish

Response time

Within 24 hours