Quick Start

This tutorial walks through a complete ML workflow using the CorePlexML Python SDK: creating a project, uploading data, training models, deploying to production, and making predictions.

Prerequisites

  • A running CorePlexML instance

  • An API key (generate one under Settings > API Keys in the web UI)

  • A CSV dataset file
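
If you don't have a dataset handy, you can generate a small synthetic CSV to follow along. The column names below are chosen to match the prediction examples later in this tutorial; they are not required by the platform:

```python
import csv
import random

random.seed(0)  # reproducible sample data

rows = []
for _ in range(100):
    rows.append({
        "tenure_months": random.randint(1, 72),
        "monthly_charges": round(random.uniform(20.0, 120.0), 2),
        "contract": random.choice(["Month-to-month", "One year", "Two year"]),
        "payment_method": random.choice(["Electronic check", "Credit card", "Mailed check"]),
        "churned": random.randint(0, 1),
    })

with open("customer_churn.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
```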

Step 1 – Install and Configure

Install the SDK from PyPI:

pip install coreplexml

Then configure a client in Python:

from coreplexml import CorePlexMLClient

client = CorePlexMLClient(
    base_url="https://your-domain.com",
    api_key="prefix.your-secret-key",
    timeout=60,
)

Step 2 – Create a Project

Projects are the top-level organizational unit. Every dataset, experiment, model, and deployment belongs to a project.

project = client.projects.create(
    name="Churn Prediction",
    description="Predict customer churn from usage data",
)
project_id = project["id"]
print(f"Created project: {project_id}")

Step 3 – Upload a Dataset

Upload a CSV file. The platform automatically profiles the data, detects column types, and creates an initial dataset version.

ds = client.datasets.upload(
    project_id=project_id,
    file_path="customer_churn.csv",
    name="Customer Churn Data",
    description="Historical churn data with 10,000 records",
)
dataset_id = ds["id"]
version_id = ds["version_id"]
print(f"Uploaded dataset {dataset_id}, version {version_id}")

Inspect the columns detected by the platform:

cols = client.datasets.columns(dataset_id)
for col in cols["columns"]:
    print(f"  {col['name']}: {col['dtype']}")
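
Before launching an experiment, it can be worth checking that the target column you plan to train on was actually detected. A minimal check, using the same response shape as the loop above (the dict layout is an assumption for illustration):

```python
# Shape mirrors the columns() response used above (assumed, for illustration)
cols = {
    "columns": [
        {"name": "tenure_months", "dtype": "int"},
        {"name": "monthly_charges", "dtype": "float"},
        {"name": "churned", "dtype": "int"},
    ]
}

detected = {c["name"] for c in cols["columns"]}
if "churned" not in detected:
    raise ValueError("target column 'churned' not found in uploaded dataset")
```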

Step 4 – Create an Experiment (Classification)

An experiment runs H2O AutoML to train and rank multiple models. Specify the dataset version, the target column, and the problem type.

exp = client.experiments.create(
    project_id=project_id,
    dataset_version_id=version_id,
    target_column="churned",
    name="Churn Classifier v1",
    problem_type="classification",
    config={
        "max_models": 10,
        "max_runtime_secs": 300,
    },
)
experiment_id = exp["id"]
print(f"Experiment started: {experiment_id}")

Step 5 – Wait for Training

The wait method polls the experiment status until training completes, fails, or times out.

result = client.experiments.wait(
    experiment_id,
    interval=5.0,     # poll every 5 seconds
    timeout=3600.0,   # give up after 1 hour
)
print(f"Experiment status: {result['status']}")
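
Under the hood, this kind of helper is essentially a polling loop. A generic sketch of the pattern, independent of the SDK (names and terminal states here are illustrative, not the SDK's internals):

```python
import time

def poll_until(get_status, done_states=("completed", "failed"),
               interval=5.0, timeout=3600.0):
    """Call get_status() repeatedly until it returns a terminal state
    or the timeout elapses."""
    deadline = time.monotonic() + timeout
    status = None
    while time.monotonic() < deadline:
        status = get_status()
        if status in done_states:
            return status
        time.sleep(interval)
    raise TimeoutError(f"experiment still '{status}' after {timeout}s")
```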

Step 6 – List Models and Get the Best One

AutoML produces several candidate models. They are returned sorted by the primary metric (e.g., AUC for classification).

models = client.models.list(experiment_id=experiment_id)
best = models["items"][0]
model_id = best["id"]
print(f"Best model: {best['algorithm']} (AUC={best.get('metrics', {}).get('auc', 'N/A')})")

# Inspect hyperparameters
params = client.models.parameters(model_id)
for key, value in params.get("parameters", {}).items():
    print(f"  {key}: {value}")
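
If you want to rank models by a metric other than the server-side default, you can sort the returned items locally. The dict shapes below mirror the list() response used above and are assumptions for illustration:

```python
# Shape mirrors the models.list() response above (assumed, for illustration)
models = {
    "items": [
        {"id": "m1", "algorithm": "GBM", "metrics": {"auc": 0.91, "logloss": 0.31}},
        {"id": "m2", "algorithm": "DRF", "metrics": {"auc": 0.89, "logloss": 0.29}},
        {"id": "m3", "algorithm": "GLM", "metrics": {"auc": 0.85, "logloss": 0.35}},
    ]
}

# Re-rank by logloss (lower is better) instead of AUC
by_logloss = sorted(
    models["items"],
    key=lambda m: m.get("metrics", {}).get("logloss", float("inf")),
)
best_by_logloss = by_logloss[0]
```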

Step 7 – Deploy to Staging

Deploy the best model so it can serve predictions via a REST endpoint.

dep = client.deployments.create(
    project_id=project_id,
    model_id=model_id,
    name="Churn Predictor Staging",
    stage="staging",
)
deployment_id = dep["id"]
print(f"Deployed to staging: {deployment_id}")

Step 8 – Make Predictions via Deployment

Send feature values and receive predictions in real time.

# Single prediction
pred = client.deployments.predict(
    deployment_id,
    inputs={
        "tenure_months": 12,
        "monthly_charges": 65.0,
        "contract": "Month-to-month",
        "payment_method": "Electronic check",
    },
)
print(f"Prediction: {pred['prediction']}")
print(f"Probability: {pred.get('probabilities', {})}")

# Batch prediction
batch = client.deployments.predict(
    deployment_id,
    inputs=[
        {"tenure_months": 12, "monthly_charges": 65.0, "contract": "Month-to-month"},
        {"tenure_months": 48, "monthly_charges": 95.0, "contract": "Two year"},
        {"tenure_months": 2, "monthly_charges": 25.0, "contract": "One year"},
    ],
)
for row in batch["predictions"]:
    print(f"  -> {row['prediction']}")
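
A common follow-up is turning the returned class probabilities into a yes/no decision with a threshold. The response shape below mirrors the single-prediction example above and is an assumption; the threshold itself is a business choice, not an SDK setting:

```python
# Response shape mirrors the single-prediction example above (assumed)
pred = {"prediction": "1", "probabilities": {"0": 0.27, "1": 0.73}}

THRESHOLD = 0.5  # tune per the relative cost of false positives vs. false negatives
churn_probability = pred.get("probabilities", {}).get("1", 0.0)
will_churn = churn_probability >= THRESHOLD
```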

Step 9 – Generate a Report

Create a PDF report summarizing the experiment results.

report = client.reports.create(
    project_id=project_id,
    kind="experiment",
    entity_id=experiment_id,
)
report_id = report["id"]
print(f"Report queued: {report_id}")

# Wait for generation
status = client.reports.wait(report_id)
print(f"Report status: {status['report']['status']}")

# Download
client.reports.download(report_id, "churn_report.pdf")
print("Downloaded churn_report.pdf")

Step 10 – Clean Up

Remove the deployment and project when finished.

client.deployments.deactivate(deployment_id)
client.projects.delete(project_id)
print("Cleanup complete")

Error Handling

Wrap operations in try/except to handle errors gracefully:

from coreplexml import (
    CorePlexMLError,
    AuthenticationError,
    NotFoundError,
    ValidationError,
)

try:
    client.experiments.create(
        project_id=project_id,
        dataset_version_id=version_id,
        target_column="nonexistent_column",
        problem_type="classification",
    )
except ValidationError as e:
    print(f"Validation error: {e.detail}")
except AuthenticationError:
    print("Invalid or expired API key")
except CorePlexMLError as e:
    print(f"API error ({e.status_code}): {e.message}")
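
Transient failures such as network timeouts can be retried with exponential backoff. A small generic helper, independent of the SDK (the function name and signature are illustrative):

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))
```

You might wrap a prediction call as, for example, with_retries(lambda: client.deployments.predict(deployment_id, inputs=row), retry_on=(CorePlexMLError,)) so that only API errors trigger a retry.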

Next Steps