Generating Predictions

Score new data against a trained model to generate predictions.

dataspoc-lens ml predict --model <model> --from <table>
Flag      Description
--model   Name of a previously trained model
--from    The source table containing new data to score

  1. Loads the model — reads model.pkl and features.json from bucket/ml/models/<model>/.
  2. Reads new data — loads the source table from your bucket.
  3. Applies feature engineering — transforms the input data using the same pipeline used during training.
  4. Generates predictions — scores every row and produces prediction columns.
  5. Saves to bucket — writes Parquet files to bucket/ml/predictions/<model>/.
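
The storage layout implied by steps 1 and 5 can be sketched as plain path construction. This is only an illustration, not the tool's own code: the helper names `model_artifact_paths` and `predictions_dir` and the `s3://example-bucket` URI are hypothetical, and only the `ml/models/<model>/` and `ml/predictions/<model>/` segments come from the steps above.

```python
def model_artifact_paths(bucket: str, model: str) -> dict:
    """Return where the model and feature definitions would be read from (step 1)."""
    # Hypothetical helper: only the ml/models/<model>/ layout is taken from the docs.
    base = f"{bucket.rstrip('/')}/ml/models/{model}"
    return {
        "model": f"{base}/model.pkl",
        "features": f"{base}/features.json",
    }


def predictions_dir(bucket: str, model: str) -> str:
    """Return the directory where prediction Parquet files would be written (step 5)."""
    return f"{bucket.rstrip('/')}/ml/predictions/{model}/"


if __name__ == "__main__":
    # "s3://example-bucket" is a placeholder; substitute your own bucket URI.
    print(model_artifact_paths("s3://example-bucket", "churned_activity"))
    print(predictions_dir("s3://example-bucket", "churned_activity"))
```

Because both locations are keyed by the model name, the value passed to --model determines which artifacts are loaded and where the results are written.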

Predictions are saved as Parquet files at:

bucket/
  ml/
    predictions/
      <model>/
        predictions_20260415_120000.parquet

Each prediction file includes the original key columns plus the prediction output and confidence scores.
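
If several prediction runs accumulate under the same model directory, the timestamp in the file name can be used to find the most recent one. The sketch below assumes the pattern `predictions_YYYYMMDD_HHMMSS.parquet`, which is inferred from the single example file name above and is not a documented guarantee. The function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed pattern, inferred from "predictions_20260415_120000.parquet".
_PREDICTION_FILE = re.compile(r"predictions_(\d{8}_\d{6})\.parquet$")


def parse_prediction_timestamp(filename: str) -> Optional[datetime]:
    """Extract the run timestamp from a prediction file name, or None if it does not match."""
    match = _PREDICTION_FILE.search(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


def latest_prediction_file(filenames: Iterable[str]) -> Optional[str]:
    """Return the file name with the newest timestamp, ignoring names that do not match."""
    stamped = [(parse_prediction_timestamp(name), name) for name in filenames]
    stamped = [(ts, name) for ts, name in stamped if ts is not None]
    if not stamped:
        return None
    return max(stamped)[1]


if __name__ == "__main__":
    files = [
        "predictions_20260414_093000.parquet",
        "predictions_20260415_120000.parquet",
        "notes.txt",
    ]
    print(latest_prediction_file(files))
```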

Once predictions are written to the bucket, they become queryable as SQL tables in Lens:

SELECT customer_id, prediction, confidence
FROM ml_predictions.churned_activity
WHERE confidence > 0.8
ORDER BY confidence DESC

No additional configuration is needed — Lens discovers prediction Parquet files automatically.

Score new customer data against a trained churn model:

dataspoc-lens ml predict --model churned_activity --from curated/customers/activity

Output:

[ML] Loading model churned_activity...
[ML] Loading table curated/customers/activity...
[ML] 12,045 rows to score
[ML] Generating predictions...
[ML] 3,218 predicted to churn (26.7%)
[ML] Saved to ml/predictions/churned_activity/
[ML] Done. Query with: SELECT * FROM ml_predictions.churned_activity
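
The churn percentage in the output is the predicted-churn count divided by the number of rows scored. A minimal check of that arithmetic, using the two counts from the example output (the helper names are hypothetical and the comma-separated number format is taken from the log lines above):

```python
def parse_count(text: str) -> int:
    """Convert a comma-grouped count such as '12,045' to an integer."""
    return int(text.replace(",", ""))


def churn_rate(predicted_churn: int, total_rows: int) -> float:
    """Return the fraction of scored rows that were predicted to churn."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return predicted_churn / total_rows


if __name__ == "__main__":
    rate = churn_rate(parse_count("3,218"), parse_count("12,045"))
    print(f"{rate:.1%}")  # prints 26.7%
```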