Cache
Lens can cache remote Parquet files locally so you can work offline and avoid repeated cloud egress charges.
Cache a table
    dataspoc-lens cache orders
    Caching 'orders'...
    Cached 'orders': 4 file(s), 12.3 MB

This downloads all Parquet files for the orders table to your local cache directory.
List cached tables
    dataspoc-lens cache --list
    ┌──────────────┬─────────────────────┬──────────┬────────┐
    │ Table        │ Cached At           │ Size     │ Status │
    ├──────────────┼─────────────────────┼──────────┼────────┤
    │ orders       │ 2026-04-15 10:30:00 │ 12.3 MB  │ fresh  │
    │ customers    │ 2026-04-14 08:00:00 │ 2.1 MB   │ stale  │
    └──────────────┴─────────────────────┴──────────┴────────┘

For JSON output:
    dataspoc-lens cache --list --output json

Force re-download
    dataspoc-lens cache orders --refresh

This downloads the latest data even if a local copy already exists.
Clear cache
    # Clear a specific table
    dataspoc-lens cache orders --clear

    # Clear all cached data
    dataspoc-lens cache --clear

Freshness detection
Lens determines cache freshness by comparing two timestamps:
- cached_at: when the local cache was created
- last_extraction: the latest extraction timestamp from the Pipe manifest
If Pipe ran an extraction after the cache was created, the cache is marked as stale. Otherwise it is fresh.
| Condition | Status | Behavior |
|---|---|---|
| cached_at > last_extraction | fresh | Queries use local cache |
| cached_at < last_extraction | stale | Queries still use cache, but a warning is shown |
| No cache exists | — | Queries read directly from remote bucket |
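The comparison in the table can be sketched in Python. This is a minimal illustration, not Lens's actual code: the function name `cache_status` is hypothetical, equal timestamps are treated as fresh (following the "otherwise it is fresh" rule), and a missing `last_extraction` is assumed to mean fresh as well.

```python
from datetime import datetime
from typing import Optional


def cache_status(
    cached_at: Optional[datetime],
    last_extraction: Optional[datetime],
) -> Optional[str]:
    """Return 'fresh', 'stale', or None when no cache exists (hypothetical helper)."""
    if cached_at is None:
        # No cache: queries would read directly from the remote bucket.
        return None
    if last_extraction is not None and last_extraction > cached_at:
        # Pipe ran an extraction after the cache was created.
        return "stale"
    # Assumption: equal timestamps or an unknown extraction time count as fresh.
    return "fresh"
```

A stale result does not mean the cache is skipped; per the table, queries still use it and a warning is shown.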
Automatic cache usage
When you run queries (via query, shell, ask, or notebooks), Lens automatically uses the local cache for tables that have a fresh cached copy. No configuration is needed: mount_views() detects the cache and switches the DuckDB view to read from the local path instead of the remote bucket.
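The view switch described above might look roughly like the following sketch. It is not the real `mount_views()` implementation: the helper names `view_source` and `create_view_sql`, the example bucket URI, and the choice to check only whether cached files exist (freshness handling is left out) are all assumptions. The generated statement uses DuckDB's `read_parquet` table function.

```python
from pathlib import Path


def view_source(table: str, remote_uri: str, cache_root: Path) -> str:
    """Pick the Parquet glob a view should read from (hypothetical helper)."""
    local_dir = cache_root / table
    # Prefer the local copy when the table has cached Parquet files.
    if local_dir.is_dir() and any(local_dir.glob("*.parquet")):
        return str(local_dir / "*.parquet")
    # Otherwise fall back to the remote bucket.
    return remote_uri


def create_view_sql(table: str, source: str) -> str:
    """Build a DuckDB statement that points the view at the chosen source."""
    quoted_name = '"' + table.replace('"', '""') + '"'
    escaped = source.replace("'", "''")
    return (
        "CREATE OR REPLACE VIEW " + quoted_name
        + " AS SELECT * FROM read_parquet('" + escaped + "')"
    )
```

For example, `create_view_sql("orders", view_source("orders", "s3://example-bucket/orders/*.parquet", Path.home() / ".dataspoc-lens" / "cache"))` would yield a view over the local files if `orders` has been cached, and over the (hypothetical) bucket path otherwise.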
Directory structure
Cached files are stored under ~/.dataspoc-lens/cache/:
    ~/.dataspoc-lens/
      cache/
        orders/
          part-0001.parquet
          part-0002.parquet
          part-0003.parquet
          part-0004.parquet
        customers/
          part-0001.parquet
        cache_meta.json    # Metadata: cached_at, size, freshness per table

Workflow: offline analysis
    # 1. Cache the tables you need while online
    dataspoc-lens cache orders
    dataspoc-lens cache customers
    dataspoc-lens cache products

    # 2. Verify cache
    dataspoc-lens cache --list

    # 3. Go offline and query normally
    dataspoc-lens query "SELECT * FROM orders JOIN customers USING (customer_id)"
    dataspoc-lens shell
    dataspoc-lens ask "Top customers by revenue"

All queries will read from the local cache transparently.
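Before going offline, it can also help to check what is actually on disk. The sketch below walks the layout shown under "Directory structure" and reports file counts and sizes per table. The function name `summarize_cache` is hypothetical, it reads only the Parquet files (not cache_meta.json), and sizes are shown in decimal megabytes, which may not match how Lens rounds its own figures.

```python
from pathlib import Path
from typing import Dict, Tuple


def summarize_cache(cache_root: Path) -> Dict[str, Tuple[int, int]]:
    """Map each cached table to (file count, total bytes) (hypothetical helper)."""
    summary: Dict[str, Tuple[int, int]] = {}
    if not cache_root.is_dir():
        return summary
    # Each subdirectory of the cache root is assumed to hold one table.
    for table_dir in sorted(p for p in cache_root.iterdir() if p.is_dir()):
        files = list(table_dir.glob("*.parquet"))
        summary[table_dir.name] = (len(files), sum(f.stat().st_size for f in files))
    return summary


if __name__ == "__main__":
    root = Path.home() / ".dataspoc-lens" / "cache"
    for table, (count, size) in summarize_cache(root).items():
        # Assumption: 1 MB = 1,000,000 bytes.
        print(f"{table}: {count} file(s), {size / 1_000_000:.1f} MB")
```

This only inspects files; `dataspoc-lens cache --list` remains the way to see the fresh or stale status.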