Example Scripts

The following example scripts demonstrate common CorePlexML SDK workflows. Each script is self-contained, uses argparse for configuration, and can be run directly against a CorePlexML instance.

All examples are located in the docs-site/examples/ directory.

01 – Quick Start

End-to-end ML workflow: create a project, upload data, train a classifier, deploy to staging, and make predictions.

python 01_quickstart.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --csv data.csv --target churned
"""CorePlexML Quick Start -- End-to-end ML workflow.

Creates a project, uploads a dataset, trains a classification model,
deploys it to staging, makes predictions, and cleans up.

Usage:
    python 01_quickstart.py --base-url https://your-domain.com --api-key YOUR_KEY --csv data.csv
"""

import argparse
import sys

from coreplexml import CorePlexMLClient, CorePlexMLError


def main():
    parser = argparse.ArgumentParser(description="CorePlexML Quick Start")
    parser.add_argument("--base-url", default="http://localhost:8888", help="CorePlexML server URL")
    parser.add_argument("--api-key", required=True, help="API key for authentication")
    parser.add_argument("--csv", required=True, help="Path to the training CSV file")
    parser.add_argument("--target", default="target", help="Target column name (default: target)")
    args = parser.parse_args()

    client = CorePlexMLClient(base_url=args.base_url, api_key=args.api_key, timeout=120)
    print("Connected to CorePlexML")

    try:
        # Step 1: Create a project
        project = client.projects.create("Quick Start Project", description="SDK quick start demo")
        project_id = project["id"]
        print(f"[1/8] Created project: {project_id}")

        # Step 2: Upload the dataset
        ds = client.datasets.upload(project_id, args.csv, "Training Data")
        dataset_id = ds["id"]
        version_id = ds["version_id"]
        print(f"[2/8] Uploaded dataset: {dataset_id} (version {version_id})")

        # Step 3: Inspect columns
        cols = client.datasets.columns(dataset_id)
        col_names = [c["name"] for c in cols["columns"]]
        print(f"[3/8] Detected {len(col_names)} columns: {', '.join(col_names[:5])}...")

        # Step 4: Create an experiment
        exp = client.experiments.create(
            project_id=project_id,
            dataset_version_id=version_id,
            target_column=args.target,
            name="Quick Start Classifier",
            problem_type="classification",
            config={"max_models": 5, "max_runtime_secs": 120},
        )
        experiment_id = exp["id"]
        print(f"[4/8] Started experiment: {experiment_id}")

        # Step 5: Wait for training
        print("       Waiting for training to complete...")
        result = client.experiments.wait(experiment_id, interval=5.0, timeout=1800.0)
        print(f"[5/8] Experiment status: {result['status']}")

        if result["status"] != "succeeded":
            print(f"Training did not succeed: {result.get('error', 'unknown')}")
            sys.exit(1)

        # Step 6: Get the best model
        models = client.models.list(experiment_id=experiment_id)
        best = models["items"][0]
        model_id = best["id"]
        print(f"[6/8] Best model: {best.get('algorithm', 'N/A')} ({model_id})")

        # Step 7: Deploy to staging
        dep = client.deployments.create(
            project_id=project_id,
            model_id=model_id,
            name="Quick Start Deployment",
            stage="staging",
        )
        deployment_id = dep["id"]
        print(f"[7/8] Deployed to staging: {deployment_id}")

        # Step 8: Make a prediction
        sample_input = {name: 0 for name in col_names if name != args.target}
        pred = client.deployments.predict(deployment_id, inputs=sample_input)
        print(f"[8/8] Prediction: {pred.get('prediction', pred)}")

        print("\nQuick start complete!")

        # Cleanup
        client.deployments.deactivate(deployment_id)
        client.projects.delete(project_id)
        print("Cleaned up resources.")

    except CorePlexMLError as e:
        print(f"Error ({e.status_code}): {e.message}")
        sys.exit(1)


if __name__ == "__main__":
    main()

02 – Batch Predictions

Read rows from a CSV, send them in configurable batches to a deployment endpoint, and write predictions to an output file.

python 02_batch_predictions.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --deployment-id DEPLOY_UUID \
    --input new_data.csv --output predictions.csv --batch-size 500
"""Batch predictions with deployed models.

Reads rows from a CSV file, sends them in batches to a deployment endpoint,
and writes predictions to an output CSV.

Usage:
    python 02_batch_predictions.py --base-url https://your-domain.com \
        --api-key YOUR_KEY --deployment-id DEPLOY_UUID --input data.csv --output predictions.csv
"""

import argparse
import csv
import sys

from coreplexml import CorePlexMLClient, CorePlexMLError


def chunked(items, size):
    """Yield successive chunks of the given size."""
    for i in range(0, len(items), size):
        yield items[i : i + size]


def main():
    parser = argparse.ArgumentParser(description="Batch predictions")
    parser.add_argument("--base-url", default="http://localhost:8888")
    parser.add_argument("--api-key", required=True)
    parser.add_argument("--deployment-id", required=True, help="UUID of the deployment")
    parser.add_argument("--input", required=True, help="Path to input CSV")
    parser.add_argument("--output", default="predictions.csv", help="Path to output CSV")
    parser.add_argument("--batch-size", type=int, default=500, help="Rows per batch (default: 500)")
    args = parser.parse_args()

    client = CorePlexMLClient(base_url=args.base_url, api_key=args.api_key, timeout=120)

    # Verify the deployment exists
    try:
        dep = client.deployments.get(args.deployment_id)
        print(f"Deployment: {dep.get('name', args.deployment_id)} (stage={dep.get('stage', 'unknown')})")
    except CorePlexMLError as e:
        print(f"Cannot access deployment: {e.message}")
        sys.exit(1)

    # Read input CSV
    with open(args.input, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
    print(f"Loaded {len(rows)} rows from {args.input}")

    if not rows:
        print("No rows to predict.")
        sys.exit(0)

    # Run batch predictions
    all_predictions = []
    for i, batch in enumerate(chunked(rows, args.batch_size)):
        try:
            result = client.deployments.predict(args.deployment_id, inputs=batch)
            preds = result.get("predictions", [])
            all_predictions.extend(preds)
            processed = min((i + 1) * args.batch_size, len(rows))
            print(f"  Batch {i + 1}: {processed}/{len(rows)} rows processed")
        except CorePlexMLError as e:
            print(f"  Batch {i + 1} failed: {e.message}")
            # Fill with error markers so row count stays aligned
            all_predictions.extend([{"prediction": "ERROR"}] * len(batch))

    # Write output CSV
    fieldnames = list(rows[0].keys()) + ["prediction", "probability"]
    with open(args.output, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row, pred in zip(rows, all_predictions):
            row["prediction"] = pred.get("prediction", "")
            row["probability"] = pred.get("probability", pred.get("probabilities", ""))
            writer.writerow(row)

    print(f"\nWrote {len(all_predictions)} predictions to {args.output}")


if __name__ == "__main__":
    main()

03 – Experiment Comparison

Run multiple experiments with different AutoML configurations on the same dataset, wait for all to complete, and print a ranked comparison.

python 03_experiment_comparison.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID --version-id VER_UUID \
    --target churned --problem-type classification
"""Compare multiple experiments and find the best model.

Runs several experiments with different configurations on the same dataset,
waits for all to complete, and prints a ranked comparison of the best model
from each experiment.

Usage:
    python 03_experiment_comparison.py --base-url https://your-domain.com \
        --api-key YOUR_KEY --project-id PROJ_UUID --version-id VER_UUID --target churned
"""

import argparse
import sys

from coreplexml import CorePlexMLClient, CorePlexMLError


EXPERIMENT_CONFIGS = [
    {"name": "Fast (5 models, 60s)", "max_models": 5, "max_runtime_secs": 60},
    {"name": "Standard (10 models, 180s)", "max_models": 10, "max_runtime_secs": 180},
    {"name": "Thorough (20 models, 300s)", "max_models": 20, "max_runtime_secs": 300},
]


def main():
    parser = argparse.ArgumentParser(description="Experiment comparison")
    parser.add_argument("--base-url", default="http://localhost:8888")
    parser.add_argument("--api-key", required=True)
    parser.add_argument("--project-id", required=True, help="UUID of the project")
    parser.add_argument("--version-id", required=True, help="UUID of the dataset version")
    parser.add_argument("--target", required=True, help="Target column name")
    parser.add_argument("--problem-type", default="classification", choices=["classification", "regression"])
    args = parser.parse_args()

    client = CorePlexMLClient(base_url=args.base_url, api_key=args.api_key, timeout=60)

    # Launch all experiments
    experiments = []
    for cfg in EXPERIMENT_CONFIGS:
        try:
            exp = client.experiments.create(
                project_id=args.project_id,
                dataset_version_id=args.version_id,
                target_column=args.target,
                name=cfg["name"],
                problem_type=args.problem_type,
                config={"max_models": cfg["max_models"], "max_runtime_secs": cfg["max_runtime_secs"]},
            )
            experiments.append({"id": exp["id"], "name": cfg["name"]})
            print(f"Started: {cfg['name']} ({exp['id']})")
        except CorePlexMLError as e:
            print(f"Failed to start '{cfg['name']}': {e.message}")

    if not experiments:
        print("No experiments started.")
        sys.exit(1)

    # Wait for all experiments
    print("\nWaiting for all experiments to complete...")
    for exp in experiments:
        try:
            status = client.experiments.wait(exp["id"], interval=10.0, timeout=3600.0)
            exp["status"] = status.get("status", "unknown")
            print(f"  {exp['name']}: {exp['status']}")
        except CorePlexMLError as e:
            exp["status"] = "timeout"
            print(f"  {exp['name']}: timed out ({e.message})")

    # Collect best model from each successful experiment
    print("\n--- Results ---")
    best_overall = None
    metric_key = "auc" if args.problem_type == "classification" else "rmse"

    for exp in experiments:
        if exp["status"] != "succeeded":
            print(f"  {exp['name']}: SKIPPED ({exp['status']})")
            continue
        models = client.models.list(experiment_id=exp["id"])
        if not models.get("items"):
            print(f"  {exp['name']}: No models produced")
            continue
        top = models["items"][0]
        metric_val = top.get("metrics", {}).get(metric_key, "N/A")
        print(f"  {exp['name']}: {top.get('algorithm', '?')} {metric_key}={metric_val}")

        # Only models with a numeric metric can participate in the comparison;
        # this also keeps a non-numeric "N/A" from occupying the best slot.
        try:
            candidate = float(metric_val)
        except (ValueError, TypeError):
            continue

        if best_overall is None:
            best_overall = (exp, top, metric_val)
        else:
            current_best = float(best_overall[2])
            # For AUC, higher is better; for RMSE, lower is better
            if args.problem_type == "classification" and candidate > current_best:
                best_overall = (exp, top, metric_val)
            elif args.problem_type == "regression" and candidate < current_best:
                best_overall = (exp, top, metric_val)

    if best_overall:
        exp_info, model_info, metric = best_overall
        print(f"\nBest overall: {model_info.get('algorithm')} from '{exp_info['name']}'")
        print(f"  Model ID: {model_info['id']}")
        print(f"  {metric_key}: {metric}")
    else:
        print("\nNo successful experiments to compare.")


if __name__ == "__main__":
    main()

04 – Privacy Workflow

Create a HIPAA compliance policy, scan a dataset for PII, apply privacy transformations (masking, hashing, redaction), and retrieve results.

python 04_privacy_workflow.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID --dataset-id DS_UUID \
    --profile hipaa
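
The full script lives in docs-site/examples/04_privacy_workflow.py and is not reproduced here. As a complement, the sketch below shows a naive client-side pre-scan you might run before uploading. The regexes are deliberately simplistic and this is not the SDK's PII scanner; the client.privacy method names mentioned in the comments are assumptions, not confirmed API.

```python
"""Hypothetical pre-upload PII check -- NOT the SDK's scanner."""

import csv
import re

# Naive patterns for two common PII shapes (illustrative only).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def prescan_rows(rows):
    """Return {column: set of PII kinds} for columns that look sensitive."""
    flagged = {}
    for row in rows:
        for col, value in row.items():
            for kind, pattern in PII_PATTERNS.items():
                if value and pattern.search(str(value)):
                    flagged.setdefault(col, set()).add(kind)
    return flagged


def main():
    # Imported here so prescan_rows() is usable without the SDK installed.
    from coreplexml import CorePlexMLClient

    client = CorePlexMLClient(base_url="http://localhost:8888", api_key="YOUR_KEY")
    with open("data.csv", newline="") as f:
        rows = list(csv.DictReader(f))
    print("Columns flagged locally:", prescan_rows(rows))
    # The server-side scan and transformations would follow here;
    # method names such as client.privacy.scan(...) are assumed.


if __name__ == "__main__":
    main()
```

A local pre-scan like this only catches obvious patterns; the server-side scan in the example script remains the authoritative check.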

05 – Synthetic Data Generation

Train a SynthGen model (CTGAN/CopulaGAN/TVAE/Gaussian Copula) on a dataset version, wait for training, and generate synthetic rows with optional reproducibility seed.

python 05_synthetic_data.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID --version-id VER_UUID \
    --model-type ctgan --epochs 300 --num-rows 5000 --seed 42
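
After 05_synthetic_data.py generates rows, a quick local sanity check can catch schema problems before the synthetic data is used downstream. The helper below is plain Python for illustration, not part of the SDK: it verifies that generated rows keep the source columns and that categorical columns contain no values unseen in the real data.

```python
def check_synthetic(real_rows, synthetic_rows, categorical_columns):
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    real_cols = set(real_rows[0].keys())

    # Every synthetic row should carry exactly the source columns.
    for i, row in enumerate(synthetic_rows):
        if set(row.keys()) != real_cols:
            problems.append(f"row {i}: column mismatch")

    # Categorical columns should not invent values absent from the real data.
    for col in categorical_columns:
        seen = {r[col] for r in real_rows}
        novel = {r[col] for r in synthetic_rows if col in r} - seen
        if novel:
            problems.append(f"column {col}: unseen values {sorted(novel)}")

    return problems
```

Note this only checks structure; evaluating statistical fidelity (distributions, correlations) is the generator's job and is better read from the training metrics the server reports.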

06 – Model Monitoring

Deploy a model, promote to production, and check for data drift.

python 06_model_monitoring.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID --model-id MODEL_UUID
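
The drift check in 06_model_monitoring.py runs server-side. As background on what such a check computes, here is a local sketch of the population stability index (PSI), one common drift statistic. This is plain Python for illustration, not SDK code, and the binning and smoothing choices are one convention among several.

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between two numeric samples.

    Values are bucketed on the expected sample's range. A common rule of
    thumb reads PSI < 0.1 as stable and PSI > 0.25 as significant drift.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant sample

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            idx = max(idx, 0)  # clamp values below the expected range
            counts[idx] += 1
        # Smooth empty buckets so the log term stays finite.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A production deployment would rely on the server's drift reports rather than recomputing this client-side, but the formula shows what the numbers mean.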

07 – What-If Analysis

Create a studio session with baseline inputs, define counterfactual scenarios, run predictions, and compare results side by side.

python 07_what_if_analysis.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID --deployment-id DEP_UUID
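
The counterfactual step in 07_what_if_analysis.py amounts to merging per-scenario overrides onto a baseline input. A minimal sketch of that merge (plain Python, not an SDK call; each resulting dict would then be sent through the deployment's predict(), as in the quick start):

```python
def build_scenarios(baseline, scenarios):
    """Produce one full input dict per scenario.

    `scenarios` maps a scenario name to only the fields that change;
    every other field keeps its baseline value. The baseline is not mutated.
    """
    return {name: {**baseline, **overrides} for name, overrides in scenarios.items()}
```

Keeping scenarios as sparse overrides makes the comparison table easy to read: anything not listed in a scenario is, by construction, identical to the baseline.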

08 – Report Generation

Generate experiment and deployment reports with AI insights, wait for PDF generation, and download the results.

python 08_reports.py --base-url https://your-domain.com \
    --api-key YOUR_KEY --project-id PROJ_UUID \
    --experiment-id EXP_UUID --output-dir ./reports
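
Report generation is asynchronous, so 08_reports.py has to poll until the PDF is ready. The experiments client exposes wait() for this; where a resource lacks a built-in wait, a generic polling helper like the sketch below does the job. The status strings here ("completed", "failed") are assumptions, not confirmed SDK values.

```python
import time


def wait_until(fetch_status, done=("completed",), failed=("failed",),
               interval=2.0, timeout=600.0):
    """Poll fetch_status() until it returns a terminal status or the timeout elapses.

    fetch_status is any zero-argument callable returning the current status
    string, e.g. lambda: client.reports.get(report_id)["status"] (hypothetical).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"report generation failed (status={status})")
        time.sleep(interval)
    raise TimeoutError("timed out waiting for report")
```

Using time.monotonic() for the deadline keeps the loop correct even if the system clock is adjusted mid-wait.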

Running the Examples

All examples follow the same pattern:

  1. Install the SDK:

    pip install coreplexml
    
  2. Set your API key and server URL. Pass them as command-line arguments, or export them as environment variables and adapt the scripts to read from os.environ.

  3. Run the script:

    python 01_quickstart.py --base-url https://ml.example.com --api-key cp_ab12cd34.your-secret-key --csv train.csv
    
  4. Each script prints progress to stdout and exits with code 0 on success or 1 on error.
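
The environment-variable option in step 2 can be sketched like this; the variable names COREPLEXML_URL and COREPLEXML_API_KEY are a suggested convention, not something the SDK reads automatically:

```python
import argparse
import os


def build_parser():
    """argparse setup that falls back to environment variables for defaults."""
    parser = argparse.ArgumentParser(description="CorePlexML example")
    parser.add_argument(
        "--base-url",
        default=os.environ.get("COREPLEXML_URL", "http://localhost:8888"),
        help="Server URL (or set COREPLEXML_URL)",
    )
    parser.add_argument(
        "--api-key",
        default=os.environ.get("COREPLEXML_API_KEY"),
        help="API key (or set COREPLEXML_API_KEY)",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    args = parser.parse_args()
    if not args.api_key:
        parser.error("--api-key is required (or set COREPLEXML_API_KEY)")
    print(f"Using {args.base_url}")
```

Because the environment lookup happens when the parser is built, an explicit command-line flag still overrides the exported value.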