===========
Quick Start
===========

This guide walks through a complete ML pipeline using the CorePlexML REST API. You will create a project, upload data, train models, deploy the best one, and make predictions -- all via HTTP calls.

.. contents:: Steps
   :local:
   :depth: 1

Prerequisites
-------------

- A running CorePlexML instance (e.g., ``http://localhost:8888``)
- An API key (see :doc:`authentication`)
- A CSV dataset file

Set up shell variables used throughout this guide:

.. code-block:: bash

   export BASE_URL="http://localhost:8888"
   export API_KEY="cpx_live.your-api-key-here"

----

1. Create a Project
-------------------

.. code-block:: bash

   curl -X POST "$BASE_URL/api/projects" \
     -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"name": "Churn Prediction", "description": "Telco customer churn"}'

.. code-block:: python

   import requests

   BASE_URL = "http://localhost:8888"
   HEADERS = {"Authorization": "Bearer cpx_live.your-api-key"}

   resp = requests.post(f"{BASE_URL}/api/projects", headers=HEADERS, json={
       "name": "Churn Prediction",
       "description": "Telco customer churn",
   })
   project = resp.json()
   project_id = project["project_id"]

Save the ``project_id`` from the response -- you'll need it for every subsequent call.

----

2. Upload a Dataset
-------------------

Upload a CSV file using ``multipart/form-data``:

.. code-block:: bash

   curl -X POST "$BASE_URL/api/datasets/upload" \
     -H "Authorization: Bearer $API_KEY" \
     -F "project_id=PROJECT_ID" \
     -F "name=Customer Churn Data" \
     -F "description=10k records of telco customers" \
     -F "file=@customer_churn.csv"
.. code-block:: python

   with open("customer_churn.csv", "rb") as f:
       resp = requests.post(
           f"{BASE_URL}/api/datasets/upload",
           headers=HEADERS,
           files={"file": ("customer_churn.csv", f, "text/csv")},
           data={
               "project_id": project_id,
               "name": "Customer Churn Data",
               "description": "10k records of telco customers",
           },
       )
   ds = resp.json()
   dataset_id = ds["id"]
   version_id = ds["version_id"]

The platform automatically detects column types, computes statistics, and creates an immutable dataset version.

----

3. Inspect the Schema
---------------------

Check what columns were detected:

.. code-block:: bash

   curl "$BASE_URL/api/dataset-versions/$VERSION_ID" \
     -H "Authorization: Bearer $API_KEY"

.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/dataset-versions/{version_id}",
       headers=HEADERS,
   )
   for col in resp.json().get("columns", []):
       print(f"  {col['name']}: {col['dtype']}")

----

4. Run an AutoML Experiment
---------------------------

Launch H2O AutoML to train and rank multiple model types:

.. code-block:: bash

   curl -X POST "$BASE_URL/api/experiments" \
     -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "project_id": "PROJECT_ID",
       "dataset_version_id": "VERSION_ID",
       "target_column": "churned",
       "problem_type": "classification",
       "name": "Churn Classifier v1",
       "config": { "max_models": 10, "max_runtime_secs": 300 }
     }'

.. code-block:: python

   resp = requests.post(f"{BASE_URL}/api/experiments", headers=HEADERS, json={
       "project_id": project_id,
       "dataset_version_id": version_id,
       "target_column": "churned",
       "problem_type": "classification",
       "name": "Churn Classifier v1",
       "config": {
           "max_models": 10,
           "max_runtime_secs": 300,
       },
   })
   exp = resp.json()
   experiment_id = exp["id"]

Training runs as a background job. The response includes a ``status`` field (``pending`` → ``running`` → ``completed``).

----

5. Poll for Completion
----------------------
.. code-block:: bash

   # Poll until status is "completed" or "failed"
   curl "$BASE_URL/api/experiments/$EXPERIMENT_ID" \
     -H "Authorization: Bearer $API_KEY"

.. code-block:: python

   import time

   while True:
       resp = requests.get(
           f"{BASE_URL}/api/experiments/{experiment_id}",
           headers=HEADERS,
       )
       status = resp.json()["experiment"]["status"]
       print(f"Status: {status}")
       if status in ("completed", "failed"):
           break
       time.sleep(10)

Or use the SDK, which handles polling automatically:

.. code-block:: python

   from coreplexml import CorePlexMLClient

   client = CorePlexMLClient(base_url=BASE_URL, api_key="cpx_live.your-key")
   result = client.experiments.wait(experiment_id, interval=5.0, timeout=3600.0)

----

6. List Trained Models
----------------------

After training completes, list the ranked models:

.. code-block:: bash

   curl "$BASE_URL/api/experiments/$EXPERIMENT_ID/models" \
     -H "Authorization: Bearer $API_KEY"

.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/experiments/{experiment_id}/models",
       headers=HEADERS,
   )
   models = resp.json()["items"]
   for m in models[:5]:
       print(f"  {m['algorithm']}: AUC={m.get('metrics', {}).get('auc', 'N/A')}")
   best_model_id = models[0]["id"]

Models are sorted by the primary metric (AUC for classification, RMSE for regression).

----

7. Make a Prediction
--------------------

Use the best model directly:

.. code-block:: bash

   curl -X POST "$BASE_URL/api/models/$MODEL_ID/predict" \
     -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "inputs": {
         "tenure_months": 12,
         "monthly_charges": 65.0,
         "contract": "Month-to-month",
         "payment_method": "Electronic check"
       }
     }'

.. code-block:: python

   resp = requests.post(
       f"{BASE_URL}/api/models/{best_model_id}/predict",
       headers=HEADERS,
       json={
           "inputs": {
               "tenure_months": 12,
               "monthly_charges": 65.0,
               "contract": "Month-to-month",
               "payment_method": "Electronic check",
           },
       },
   )
   print(resp.json())

----

8. Deploy to Production
-----------------------

Deploy the model as a persistent REST endpoint:
.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/projects/$PROJECT_ID/deployments" \
     -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "model_id": "MODEL_ID",
       "name": "Churn Predictor",
       "stage": "production"
     }'

.. code-block:: python

   resp = requests.post(
       f"{BASE_URL}/api/mlops/projects/{project_id}/deployments",
       headers=HEADERS,
       json={
           "model_id": best_model_id,
           "name": "Churn Predictor",
           "stage": "production",
       },
   )
   deployment_id = resp.json()["id"]

Once deployed, make predictions through the deployment endpoint:

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/projects/$PROJECT_ID/deployments/$DEPLOYMENT_ID/predict" \
     -H "Authorization: Bearer $API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"inputs": {"tenure_months": 48, "monthly_charges": 95.0, "contract": "Two year"}}'

----

Next Steps
----------

- :doc:`/user-guide/ml-pipeline` -- Deep dive into ML pipeline patterns
- :doc:`/user-guide/privacy-suite` -- Scan and protect sensitive data
- :doc:`/user-guide/synthetic-data` -- Generate privacy-safe synthetic datasets
- :doc:`/user-guide/mlops` -- Model monitoring, alerts, and auto-retraining
- :doc:`/sdk/quickstart` -- Full SDK tutorial with the same workflow
- :doc:`/api-reference/index` -- Complete API reference
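The deployment prediction call in step 8 is shown only as a curl command; for symmetry with the earlier steps, it can also be made from Python. This is a minimal sketch using the endpoint path and credentials from steps 1 and 8 -- the helper name ``deployment_predict`` is illustrative, not part of the platform API:

```python
import requests

# Assumed setup, matching step 1 of this guide.
BASE_URL = "http://localhost:8888"
HEADERS = {"Authorization": "Bearer cpx_live.your-api-key"}


def deployment_predict(project_id, deployment_id, inputs):
    """POST one prediction to a deployment endpoint (path from step 8)."""
    url = (f"{BASE_URL}/api/mlops/projects/{project_id}"
           f"/deployments/{deployment_id}/predict")
    resp = requests.post(url, headers=HEADERS, json={"inputs": inputs})
    resp.raise_for_status()  # surface HTTP errors early
    return resp.json()


# Usage, with the IDs returned by steps 1 and 8:
# deployment_predict(project_id, deployment_id,
#                    {"tenure_months": 48, "monthly_charges": 95.0,
#                     "contract": "Two year"})
```

Reusing the session headers from step 1 keeps the example consistent with the rest of the guide; in a real client you would likely pull the IDs from configuration rather than module-level variables.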