# Quick Start
This guide walks through a complete ML pipeline using the CorePlexML REST API. You will create a project, upload data, train models, deploy the best one, and make predictions – all via HTTP calls.
## Prerequisites
- A running CorePlexML instance (e.g., `http://localhost:8888`)
- An API key (see Authentication)
- A CSV dataset file
Set up shell variables used throughout this guide:
```shell
export BASE_URL="http://localhost:8888"
export API_KEY="cpx_live.your-api-key-here"
```
## 1. Create a Project
```shell
curl -X POST "$BASE_URL/api/projects" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Churn Prediction", "description": "Telco customer churn"}'
```
```python
import requests

BASE_URL = "http://localhost:8888"
HEADERS = {"Authorization": "Bearer cpx_live.your-api-key"}

resp = requests.post(f"{BASE_URL}/api/projects", headers=HEADERS, json={
    "name": "Churn Prediction",
    "description": "Telco customer churn",
})
project = resp.json()
project_id = project["project_id"]
```
Save the `project_id` from the response – you’ll need it for every subsequent call.
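The snippets in this guide assume each request succeeds. In real scripts it is worth failing fast; one sketch is to call `raise_for_status()` and then guard the fields you are about to read (`require_keys` below is an illustrative helper, not part of the API or SDK):

```python
def require_keys(payload: dict, *keys: str) -> dict:
    """Raise KeyError if any expected field is missing from a response body."""
    missing = [k for k in keys if k not in payload]
    if missing:
        raise KeyError(f"response missing fields: {missing}")
    return payload

# Typical use after a call:
#   resp.raise_for_status()  # surface HTTP 4xx/5xx as exceptions
#   project = require_keys(resp.json(), "project_id")
```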
## 2. Upload a Dataset
Upload a CSV file using multipart/form-data:
```shell
curl -X POST "$BASE_URL/api/datasets/upload" \
  -H "Authorization: Bearer $API_KEY" \
  -F "project_id=PROJECT_ID" \
  -F "name=Customer Churn Data" \
  -F "description=10k records of telco customers" \
  -F "file=@customer_churn.csv"
```
```python
with open("customer_churn.csv", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/api/datasets/upload",
        headers=HEADERS,
        files={"file": ("customer_churn.csv", f, "text/csv")},
        data={
            "project_id": project_id,
            "name": "Customer Churn Data",
            "description": "10k records of telco customers",
        },
    )
ds = resp.json()
dataset_id = ds["id"]
version_id = ds["version_id"]
```
The platform automatically detects column types, computes statistics, and creates an immutable dataset version.
## 3. Inspect the Schema
Check what columns were detected:
```shell
curl "$BASE_URL/api/dataset-versions/$VERSION_ID" \
  -H "Authorization: Bearer $API_KEY"
```
```python
resp = requests.get(
    f"{BASE_URL}/api/dataset-versions/{version_id}",
    headers=HEADERS,
)
for col in resp.json().get("columns", []):
    print(f"  {col['name']}: {col['dtype']}")
```
## 4. Run an AutoML Experiment
Launch H2O AutoML to train and rank multiple model types:
```shell
curl -X POST "$BASE_URL/api/experiments" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "PROJECT_ID",
    "dataset_version_id": "VERSION_ID",
    "target_column": "churned",
    "problem_type": "classification",
    "name": "Churn Classifier v1",
    "config": {
      "max_models": 10,
      "max_runtime_secs": 300
    }
  }'
```
```python
resp = requests.post(f"{BASE_URL}/api/experiments", headers=HEADERS, json={
    "project_id": project_id,
    "dataset_version_id": version_id,
    "target_column": "churned",
    "problem_type": "classification",
    "name": "Churn Classifier v1",
    "config": {
        "max_models": 10,
        "max_runtime_secs": 300,
    },
})
exp = resp.json()
experiment_id = exp["id"]
```
Training runs as a background job. The response includes a `status` field (`pending` → `running` → `completed`).
## 5. Poll for Completion
```shell
# Poll until status is "completed" or "failed"
curl "$BASE_URL/api/experiments/$EXPERIMENT_ID" \
  -H "Authorization: Bearer $API_KEY"
```
```python
import time

while True:
    resp = requests.get(
        f"{BASE_URL}/api/experiments/{experiment_id}",
        headers=HEADERS,
    )
    status = resp.json()["experiment"]["status"]
    print(f"Status: {status}")
    if status in ("completed", "failed"):
        break
    time.sleep(10)
```
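The loop above polls forever. For scripted use, a bounded variant is safer; here is a sketch with a timeout, written around an injected `fetch_status` callable so the waiting logic is easy to test (`wait_for_experiment` is illustrative, not an API or SDK function):

```python
import time

def wait_for_experiment(fetch_status, interval=10.0, timeout=3600.0):
    """Poll fetch_status() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"experiment still '{status}' after {timeout}s")
        time.sleep(interval)
```

With the endpoint shown above, `fetch_status` would be a small closure that performs the GET and returns `resp.json()["experiment"]["status"]`.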
Or use the SDK, which handles polling automatically:
```python
from coreplexml import CorePlexMLClient

client = CorePlexMLClient(base_url=BASE_URL, api_key="cpx_live.your-key")
result = client.experiments.wait(experiment_id, interval=5.0, timeout=3600.0)
```
## 6. List Trained Models
After training completes, list the ranked models:
```shell
curl "$BASE_URL/api/experiments/$EXPERIMENT_ID/models" \
  -H "Authorization: Bearer $API_KEY"
```
```python
resp = requests.get(
    f"{BASE_URL}/api/experiments/{experiment_id}/models",
    headers=HEADERS,
)
models = resp.json()["items"]
for m in models[:5]:
    print(f"  {m['algorithm']}: AUC={m.get('metrics', {}).get('auc', 'N/A')}")
best_model_id = models[0]["id"]
```
Models are sorted by the primary metric (AUC for classification, RMSE for regression).
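The ordering comes from the server, so taking `models[0]` is usually enough. If you want to select explicitly by a metric on the client (for example, when comparing models across experiments), one sketch is a small pure helper (`pick_best` is illustrative, not an SDK function):

```python
def pick_best(models, metric="auc", higher_is_better=True):
    """Return the model dict with the best value for the given metric.

    Models that do not report the metric are ignored.
    """
    scored = [m for m in models if metric in m.get("metrics", {})]
    if not scored:
        raise ValueError(f"no model reports metric '{metric}'")
    choose = max if higher_is_better else min
    return choose(scored, key=lambda m: m["metrics"][metric])
```

For a regression experiment you would call `pick_best(models, metric="rmse", higher_is_better=False)`, since lower RMSE is better.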
## 7. Make a Prediction
Use the best model directly:
```shell
curl -X POST "$BASE_URL/api/models/$MODEL_ID/predict" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "tenure_months": 12,
      "monthly_charges": 65.0,
      "contract": "Month-to-month",
      "payment_method": "Electronic check"
    }
  }'
```
```python
resp = requests.post(
    f"{BASE_URL}/api/models/{best_model_id}/predict",
    headers=HEADERS,
    json={
        "inputs": {
            "tenure_months": 12,
            "monthly_charges": 65.0,
            "contract": "Month-to-month",
            "payment_method": "Electronic check",
        },
    },
)
print(resp.json())
```
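To score several customers, one option is simply to loop over the same single-prediction endpoint. A sketch (`predict_rows` is illustrative, not an SDK function; the HTTP `post` callable is injectable so the loop can be tested without a server):

```python
def predict_rows(base_url, headers, model_id, rows, post=None):
    """Score each row via the single-prediction endpoint; returns one result per row."""
    if post is None:
        import requests  # default to the HTTP client used throughout this guide
        post = requests.post
    results = []
    for row in rows:
        resp = post(
            f"{base_url}/api/models/{model_id}/predict",
            headers=headers,
            json={"inputs": row},
        )
        resp.raise_for_status()
        results.append(resp.json())
    return results
```

One request per row is fine for small batches; for large ones, check whether your instance exposes a dedicated batch endpoint before looping.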
## 8. Deploy to Production
Deploy the model as a persistent REST endpoint:
```shell
curl -X POST "$BASE_URL/api/mlops/projects/$PROJECT_ID/deployments" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "MODEL_ID",
    "name": "Churn Predictor",
    "stage": "production"
  }'
```
```python
resp = requests.post(
    f"{BASE_URL}/api/mlops/projects/{project_id}/deployments",
    headers=HEADERS,
    json={
        "model_id": best_model_id,
        "name": "Churn Predictor",
        "stage": "production",
    },
)
deployment_id = resp.json()["id"]
```
Once deployed, make predictions through the deployment endpoint:
```shell
curl -X POST "$BASE_URL/api/mlops/projects/$PROJECT_ID/deployments/$DEPLOYMENT_ID/predict" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"tenure_months": 48, "monthly_charges": 95.0, "contract": "Two year"}}'
```
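The same deployment prediction in Python, mirroring the curl request above. This sketch splits out a small URL/body builder (`build_deployment_predict` is illustrative, not an SDK function) so the request shape can be checked without a server:

```python
def build_deployment_predict(base_url, project_id, deployment_id, inputs):
    """Assemble the URL and JSON body for the deployment prediction endpoint."""
    url = f"{base_url}/api/mlops/projects/{project_id}/deployments/{deployment_id}/predict"
    return url, {"inputs": inputs}

# url, body = build_deployment_predict(
#     BASE_URL, project_id, deployment_id,
#     {"tenure_months": 48, "monthly_charges": 95.0, "contract": "Two year"},
# )
# resp = requests.post(url, headers=HEADERS, json=body)
```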
## Next Steps
- End-to-End ML Pipeline – Deep dive into ML pipeline patterns
- Privacy Suite – Scan and protect sensitive data
- Synthetic Data Generation – Generate privacy-safe synthetic datasets
- MLOps & Model Serving – Model monitoring, alerts, and auto-retraining
- Quick Start – Full SDK tutorial with the same workflow
- API Reference – Complete API reference