We don't just sell tools. We operate them for you.
Most companies buy a data platform and then spend 6 months figuring out how to use it. We do it differently. We design, build, deploy, and operate your entire data stack — using the same open-source tools we built. You get insights. We handle the plumbing.
This is what we keep hearing
"6 months"
"We started building our data lake 6 months ago. We still can't answer basic business questions. The pipeline breaks every week and nobody knows how to fix it."
— Head of Data, Series B startup
"$12k/month"
"Our Snowflake bill is $12k/month. Most queries are SELECT COUNT(*). We know it's overkill but we don't have time to migrate."
— CTO, 40-person company
"No one"
"We have data in 5 different places and no one on the team who can build a proper data lake. We need someone to set it up and teach us how to use it."
— CEO, early-stage company
We build it. We run it. You make decisions.
Our services cover the full lifecycle — from understanding your data to putting AI agents on top of it.
Week 1
Discovery & Architecture
We map your data sources, understand your business questions, and design the bucket architecture. We define who needs access to what, how data flows, and where AI agents will connect.
Weeks 2–3
Build & Deploy Pipelines
We configure DataSpoc Pipe for every data source. Databases, APIs, spreadsheets — everything flows into your bucket as organized Parquet. Incremental extraction, scheduling, monitoring. We build the connectors you need, including custom ones.
dataspoc-pipe add postgres-production
dataspoc-pipe add stripe-payments
dataspoc-pipe add hubspot-crm
dataspoc-pipe add internal-api   # custom connector
dataspoc-pipe run _ --all
dataspoc-pipe schedule install
Week 3
Query Layer & Self-Service
We set up DataSpoc Lens so your team can query the lake. SQL shell for engineers, Jupyter for analysts, AI Ask for everyone. We create the curated views and transforms your business needs — and train your team to create their own.
dataspoc-lens add-bucket s3://company-data
dataspoc-lens catalog
dataspoc-lens ask "monthly revenue by product line"
dataspoc-lens notebook   # Jupyter for the analysts
Week 4
AI Agent Integration
We connect your AI tools — Claude, Cursor, custom agents — directly to your data lake via MCP. Your agents get governed, read-only access to real data. No RAG hacks. No hallucinations. We configure the MCP servers, set up the SDK integrations, and validate that agents return accurate answers.
dataspoc-lens mcp   # agents query your lake
dataspoc-pipe mcp   # agents manage pipelines
# Claude: "How did sales perform last week?"
# → Real SQL, real data, real answer.
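To make "governed, read-only access" concrete, here is a minimal sketch of the idea. It does not show how DataSpoc Lens or its MCP server work internally. It assumes a local SQLite file as a stand-in for the lake, and the `sales` table, its columns, and all function names are hypothetical. The point is only that a connection opened read-only can answer "monthly revenue by product line" while rejecting any write.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_demo_db(path):
    """Create a tiny hypothetical sales table (stand-in for lake data)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE sales (sold_on TEXT, product_line TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2024-01-05", "hardware", 100.0),
            ("2024-01-20", "hardware", 50.0),
            ("2024-01-11", "software", 200.0),
            ("2024-02-02", "software", 75.0),
        ],
    )
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the database through a SQLite URI with mode=ro (read-only)."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def monthly_revenue(conn):
    """Aggregate revenue per month and product line."""
    sql = """
        SELECT substr(sold_on, 1, 7) AS month,
               product_line,
               SUM(amount) AS revenue
        FROM sales
        GROUP BY month, product_line
        ORDER BY month, product_line
    """
    return conn.execute(sql).fetchall()


def try_write(conn):
    """Return True if a write succeeds, False if the connection rejects it."""
    try:
        conn.execute("DELETE FROM sales")
        return True
    except sqlite3.OperationalError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "demo.db"
    build_demo_db(db_path)
    ro = open_read_only(db_path)
    rows = monthly_revenue(ro)
    write_allowed = try_write(ro)
    ro.close()

for row in rows:
    print(row)
print("write allowed:", write_allowed)
```

Enforcing read-only at the connection level, instead of inspecting the SQL text an agent sends, means the guarantee does not depend on parsing queries correctly.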
Ongoing
Operate & Evolve
We don't disappear after deployment. We monitor pipelines, add new sources as your business grows, optimize queries, train new team members, and keep your AI agents sharp. You focus on decisions. We keep the data flowing.
What you get
A complete, running data platform — not a proof of concept.
📦
Production Data Lake
Organized Parquet files in your cloud bucket. Multi-source, incremental, scheduled. You own every byte.
🔍
Self-Service Analytics
Your team queries the lake with SQL, notebooks, or plain English. No more waiting on the data team.
🤖
AI-Ready Infrastructure
MCP servers running. AI agents connected. Real data flowing to Claude, Cursor, or your custom agents.
🎓
Trained Team
Your people know how to use, extend, and operate the platform. We transfer knowledge, not dependency.
Choose what you need
Full platform deployment or individual services.
Full Platform Setup
The complete package. We build your data lake from scratch, set up ingestion, configure the query layer, connect AI agents, and train your team. Production-ready in 4 weeks.
- ✓ Architecture design
- ✓ Pipe: all pipelines configured and scheduled
- ✓ Lens: query layer + notebooks + AI Ask
- ✓ MCP servers for AI agents
- ✓ Team training (2 sessions)
- ✓ 30 days of operational support
Custom Connectors
We build Singer taps for your internal APIs, proprietary databases, or SaaS tools that don't have a connector yet. Delivered with tests, docs, and support for incremental replication.
- ✓ Singer-compatible tap
- ✓ Incremental replication
- ✓ Schema discovery
- ✓ Tests and documentation
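For readers unfamiliar with Singer, the sketch below shows roughly what a tap with incremental replication does: it writes SCHEMA, RECORD, and STATE messages as JSON lines, and uses a bookmark from the previous STATE to skip rows it has already sent. This is an illustrative sketch, not one of our delivered connectors. The in-memory `SOURCE_ROWS`, the `customers` stream, and the function names are hypothetical, and it assumes timestamps are UTC ISO-8601 strings in one fixed format so that string comparison matches time order.

```python
import json

# Hypothetical in-memory source standing in for an internal API or database.
SOURCE_ROWS = [
    {"id": 1, "name": "alpha", "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "name": "beta", "updated_at": "2024-01-02T00:00:00Z"},
    {"id": 3, "name": "gamma", "updated_at": "2024-01-03T00:00:00Z"},
]

STREAM = "customers"  # hypothetical stream name


def build_messages(rows, state):
    """Return Singer messages for rows newer than the bookmark in `state`."""
    bookmark = state.get("bookmarks", {}).get(STREAM, {}).get("updated_at", "")
    messages = [
        {
            "type": "SCHEMA",
            "stream": STREAM,
            "schema": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                    "updated_at": {"type": "string", "format": "date-time"},
                },
            },
            "key_properties": ["id"],
            "bookmark_properties": ["updated_at"],
        }
    ]
    new_bookmark = bookmark
    # Emit rows in bookmark order so the final STATE reflects the latest row sent.
    for row in sorted(rows, key=lambda r: r["updated_at"]):
        if row["updated_at"] > bookmark:
            messages.append({"type": "RECORD", "stream": STREAM, "record": row})
            new_bookmark = row["updated_at"]
    messages.append(
        {
            "type": "STATE",
            "value": {"bookmarks": {STREAM: {"updated_at": new_bookmark}}},
        }
    )
    return messages


def to_json_lines(messages):
    """Serialize messages one per line, as a tap would write them to stdout."""
    return "\n".join(json.dumps(m) for m in messages)


# First run: no prior state, so every row is emitted.
first_run = build_messages(SOURCE_ROWS, {})
print(to_json_lines(first_run))

# Second run: pass the previous STATE value back in; nothing new is emitted.
second_run = build_messages(SOURCE_ROWS, first_run[-1]["value"])
print(to_json_lines(second_run))
```

On the second run only the SCHEMA and STATE messages appear, which is what keeps scheduled extractions cheap once the initial load is done.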
ML & Predictions
We build and deploy machine learning models on your data lake. From problem framing to production predictions that your team can query with SQL.
- ✓ Problem framing and data audit
- ✓ Feature engineering
- ✓ Model training and evaluation
- ✓ Predictions as SQL tables in Lens
Training & Workshops
Hands-on workshops for your team. We teach data lake architecture, DataSpoc tools, SQL analytics, and how to build AI agent integrations — with your own data.
- ✓ Data lake fundamentals
- ✓ Pipe + Lens hands-on
- ✓ AI agent integration
- ✓ Uses your real data
Why work with us
We built the tools
We're the creators of DataSpoc Pipe, Lens, and ML. Nobody knows these tools better. When something doesn't fit, we extend the platform instead of working around it.
Outcome-focused
We don't sell hours. We sell a running platform. If the pipeline breaks, we fix it. If the AI agent gives wrong answers, we tune it. Your success is our deliverable.
No lock-in
Everything we build is on open-source tools, open formats (Parquet), and your cloud account. If you want to walk away, you take everything with you. We earn your business every month.
From zero to insights in 4 weeks.
Week 1: we understand your data. Weeks 2–3: pipelines are running. Week 4: your team is querying and your AI agents are connected. Week 5: you wonder why you ever considered Databricks.
Open source. Your cloud. Your data. Our expertise.
Get in touch
Tell us about your data challenges. We'll respond within 24 hours with an honest assessment of how we can help, or whether you'd be better off doing it yourself with our docs.
services@dataspoc.com
Location
Brazil
Languages
English, Portuguese, Spanish
Response time
Within 24 hours