Experiments API

Experiments run AutoML training on a dataset. Creating an experiment enqueues a background job that trains multiple models and selects the best one according to the configured metric.


Create Experiment

POST /api/experiments

Create a new experiment and enqueue an automl_train background job. The experiment starts in queued status and transitions through running to succeeded or failed.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | UUID of the project. |
| dataset_version_id | string | Yes | UUID of the dataset version to train on. |
| name | string | Yes | Human-readable experiment name. |
| target_column | string | Yes | Name of the column to predict. |
| problem_type | string | Yes | classification or regression. Also accepts binary / multiclass (mapped to classification). |
| config | object | No | AutoML configuration (see below). |
| use_gpu | boolean | No | Request GPU-enabled training (default false). |

Config Options

| Key | Type | Description |
| --- | --- | --- |
| max_runtime_secs | integer | Maximum training time in seconds (default 3600). |
| max_models | integer | Maximum number of models to train (default 20). May be capped by account quota. |
| max_runtime_secs_per_model | integer | Maximum training time per individual model (optional). |
| sort_metric | string | Metric to rank models by (see Sort & Stopping Metrics below). Default AUTO (AUC for classification, RMSE for regression). |
| seed | integer | Random seed for reproducibility (default 42). |
| balance_classes | boolean | Oversample the minority class (classification only, default false). |
| exclude_algos | array[string] | Algorithms to exclude (see Supported Algorithms below). |
| nfolds | integer | Number of cross-validation folds (default 5; use -1 for auto-detection, 0 to disable). |
| stopping_rounds | integer | Number of scoring rounds without improvement before early stopping (default 3; use 0 to disable). |
| stopping_metric | string | Metric to monitor for early stopping: AUTO, deviance, logloss, RMSE, AUC. |
| stopping_tolerance | float | Relative tolerance for early stopping (default 0.001). |
| exploitation_ratio | float | Ratio of time spent fine-tuning existing models vs. exploring new ones (0.0–1.0). |
| distribution | string | Distribution family (regression only): gaussian, poisson, gamma, tweedie, huber, quantile, laplace. |
| keep_cross_validation_predictions | boolean | Keep CV predictions for Stacked Ensembles (default true). |
| keep_cross_validation_models | boolean | Keep individual CV models (default false). |
| verbosity | string | Logging level: debug, info, warn. |
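As an illustration, a config combining several of the options above might look like this (the values are illustrative, not recommendations):

```json
{
  "config": {
    "max_runtime_secs": 1800,
    "max_models": 10,
    "sort_metric": "AUC",
    "nfolds": 5,
    "stopping_rounds": 3,
    "stopping_tolerance": 0.001,
    "exclude_algos": ["DeepLearning"],
    "seed": 42
  }
}
```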

Supported Algorithms

CorePlexML uses H2O AutoML, which automatically trains, tunes, and ranks models from the following algorithm families:

| Algorithm ID | Display Name | Description |
| --- | --- | --- |
| GBM | Gradient Boosting | Gradient Boosting Machine. Builds sequential decision trees where each tree corrects the errors of the previous ones. Strong default performance on most tabular datasets. |
| XGBoost | XGBoost | Extreme Gradient Boosting. Optimized implementation with regularization options (L1/L2). Often produces top-performing models, especially on structured data. |
| DRF | Random Forest | Distributed Random Forest. Ensemble of decision trees using random feature subsets and bootstrap sampling. Robust to overfitting. |
| DeepLearning | Deep Learning | Multi-layer feed-forward neural network with configurable hidden layers and activation functions. Best for datasets with complex nonlinear relationships. |
| GLM | Linear Model | Generalized Linear Model. Includes linear regression, logistic regression, and other link functions. Fast, interpretable, and useful as a baseline. |
| StackedEnsemble | Stacked Ensemble | Meta-model that combines predictions from all other trained models using a secondary learner. Typically achieves the best accuracy. Note: SHAP feature contributions are not available for StackedEnsemble models. |

All six algorithms are trained by default. Use exclude_algos to skip specific ones:

```json
{
  "config": {
    "exclude_algos": ["DeepLearning", "StackedEnsemble"]
  }
}
```

Sort & Stopping Metrics

The sort_metric determines how models are ranked on the leaderboard. The stopping_metric determines when early stopping triggers. When either is set to AUTO (the default), the platform resolves it to the recommended metric for your problem type.
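The AUTO resolution described above can be sketched as a small client-side helper (the function name is ours, not part of the API):

```python
def resolve_auto_metric(problem_type: str, n_classes: int = 2) -> str:
    """Map sort_metric=AUTO to the platform's documented default metric."""
    if problem_type == "regression":
        return "RMSE"      # default ranking metric for regression
    if n_classes > 2:
        return "logloss"   # default for multiclass classification
    return "AUC"           # default for binary classification
```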

Binary Classification Metrics

| Metric | Description |
| --- | --- |
| AUC | Area Under the ROC Curve (default for binary classification). Higher is better. Range: 0–1. |
| AUCPR | Area Under the Precision-Recall Curve. Better than AUC for imbalanced datasets. |
| logloss | Logarithmic loss (cross-entropy). Lower is better. |
| mean_per_class_error | Average error rate across classes. Lower is better. |
| accuracy | Classification accuracy. Higher is better. Not recommended for imbalanced classes. |
| MCC | Matthews Correlation Coefficient. Balanced metric for imbalanced data. Range: -1 to 1. |

Multiclass Classification Metrics

| Metric | Description |
| --- | --- |
| logloss | Multinomial log loss (default for multiclass). Lower is better. |
| mean_per_class_error | Average error rate across all classes. Lower is better. |

Regression Metrics

| Metric | Description |
| --- | --- |
| RMSE | Root Mean Squared Error (default for regression). Lower is better. |
| MSE | Mean Squared Error. Lower is better. |
| MAE | Mean Absolute Error. Less sensitive to outliers. Lower is better. |
| RMSLE | Root Mean Squared Logarithmic Error. Good for skewed targets. Lower is better. |
| mean_residual_deviance | Mean Residual Deviance. Lower is better. |
| R2 | R-squared (coefficient of determination). Higher is better. Range: 0–1. |

Distribution-Specific Metrics

| Metric | Description |
| --- | --- |
| quantile_loss | Quantile loss. Available when distribution is quantile or huber. |
| deviance | Deviance. Available for all distributions. Lower is better. |

Problem Types

| Value | Description |
| --- | --- |
| classification | Binary or multiclass classification. Auto-detected from the target column cardinality. |
| regression | Continuous numeric prediction. |
| binary | Alias for classification (mapped automatically). |
| multiclass | Alias for classification (mapped automatically). |
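The alias mapping above can be expressed as a small normalization step on the client side (the helper name is ours, not part of the API):

```python
_PROBLEM_TYPE_ALIASES = {"binary": "classification", "multiclass": "classification"}

def normalize_problem_type(value: str) -> str:
    """Map binary/multiclass aliases to classification; pass canonical values through."""
    v = value.strip().lower()
    normalized = _PROBLEM_TYPE_ALIASES.get(v, v)
    if normalized not in ("classification", "regression"):
        raise ValueError(f"unsupported problem_type: {value!r}")
    return normalized
```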

Experiment Status Lifecycle

An experiment transitions through these statuses:

| Status | Description |
| --- | --- |
| queued | Experiment created and waiting for a worker to pick it up. |
| running | AutoML training in progress. Models are being trained. |
| completed / succeeded | Training finished successfully. Models are available on the leaderboard. (The API returns succeeded; the UI displays completed.) |
| failed | Training failed due to an error (bad data, configuration issue, or system error). Check the experiment logs for details. |

Job Status Values

Background jobs (experiments, reports, synthgen) use these statuses:

| Status | Description |
| --- | --- |
| queued | Job created, waiting for a worker. |
| running | Worker is processing the job. |
| succeeded | Job completed successfully. |
| failed | Job failed. Error details are in the job payload. |
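A generic wait loop over these statuses might look like the sketch below. `fetch_status` stands for any callable that returns the current status string (for example, a closure around a status endpoint), so the loop logic stays independent of the HTTP layer:

```python
import time

# succeeded and failed are the terminal job statuses documented above
TERMINAL_STATUSES = {"succeeded", "failed"}

def wait_for_job(fetch_status, interval_secs=5.0, timeout_secs=3600.0):
    """Poll fetch_status() until a terminal status is reached or the timeout expires."""
    deadline = time.monotonic() + timeout_secs
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval_secs)
    raise TimeoutError("job did not reach a terminal status within the timeout")
```

With requests, `fetch_status` could be a lambda that GETs the relevant status endpoint and returns `resp.json()["status"]`.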

Example

```shell
curl -X POST "$BASE_URL/api/experiments" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
    "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
    "name": "Fraud Detection AutoML",
    "target_column": "is_fraud",
    "problem_type": "classification",
    "config": {
      "max_runtime_secs": 300,
      "max_models": 20,
      "sort_metric": "auc",
      "nfolds": 5,
      "balance_classes": true,
      "exclude_algos": ["DeepLearning"]
    }
  }'
```

```python
import requests
import time

# Create the experiment
resp = requests.post(f"{BASE_URL}/api/experiments", headers={
    "Authorization": "Bearer YOUR_API_KEY",
}, json={
    "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
    "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
    "name": "Fraud Detection AutoML",
    "target_column": "is_fraud",
    "problem_type": "classification",
    "config": {
        "max_runtime_secs": 300,
        "max_models": 20,
        "sort_metric": "auc",
        "nfolds": 5,
        "balance_classes": True,
    },
})
experiment_id = resp.json()["experiment_id"]

# Poll for completion
while True:
    status_resp = requests.get(
        f"{BASE_URL}/api/experiments/{experiment_id}/status",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
    )
    status = status_resp.json()["status"]
    print(f"Status: {status}")
    if status in ("succeeded", "failed"):
        break
    time.sleep(10)
```

Response 201 Created

```json
{
  "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
  "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
  "job_id": "c9d0e1f2-a3b4-5678-9012-cdef12345678",
  "status": "queued"
}
```

List Experiments

GET /api/experiments

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | string | — | Filter by project. |
| dataset_id | string | — | Filter by source dataset. |
| status | string | — | Filter by status: queued, running, succeeded, failed. |
| limit | integer | 50 | Max items (1–500). |
| offset | integer | 0 | Pagination offset. |
| search | string | — | Search in name and target column. |
| sort_field | string | created_at | Sort by created_at, name, or status. |
| sort_direction | string | desc | asc or desc. |

Example

```shell
curl "$BASE_URL/api/experiments?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response 200 OK

```json
{
  "items": [
    {
      "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
      "name": "Fraud Detection AutoML",
      "project_name": "Fraud Detection v2",
      "dataset_name": "Transactions Q4",
      "target_column": "is_fraud",
      "problem_type": "classification",
      "status": "succeeded",
      "model_count": 12,
      "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
      "best_model_algo": "GBM",
      "created_at": "2026-02-12T10:30:00Z"
    }
  ],
  "total": 1,
  "limit": 50,
  "offset": 0
}
```
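Paging through all experiments with limit/offset can be sketched as a generator. `fetch_page` is a placeholder for the GET call (a name we chose, not part of the API), which keeps the pagination logic testable without the network:

```python
def iter_experiments(fetch_page, limit=50):
    """Yield every experiment by advancing offset until the reported total is reached."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        items = page["items"]
        yield from items
        offset += len(items)
        if not items or offset >= page["total"]:
            break
```

With requests, `fetch_page` would GET `/api/experiments` with `limit` and `offset` as query parameters and return `resp.json()`.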

Get Experiment Detail

GET /api/experiments/{experiment_id}

Return the experiment, its models, artifact links, and the best model ID.

Example

```shell
curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

```python
import requests

resp = requests.get(
    f"{BASE_URL}/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
data = resp.json()
print("Best model:", data["best_model_id"])
for m in data["models"]:
    print(f"  {m['algo']}: {m['metrics'].get('auc', 'N/A')}")
```

Response 200 OK

```json
{
  "experiment": {
    "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
    "name": "Fraud Detection AutoML",
    "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
    "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
    "target_column": "is_fraud",
    "problem_type": "classification",
    "status": "succeeded",
    "max_runtime_secs": 300,
    "max_models": 20,
    "sort_metric": "auc"
  },
  "models": [
    {
      "id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
      "name": "GBM_1_AutoML",
      "algo": "GBM",
      "metrics": {"auc": 0.982, "logloss": 0.071}
    }
  ],
  "artifact_links": [],
  "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"
}
```

Get Experiment Status

GET /api/experiments/{experiment_id}/status

Lightweight polling endpoint that returns only the current status. Use this for progress polling instead of the full detail endpoint.

Example

```shell
curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/status" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response 200 OK

```json
{
  "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
  "status": "running"
}
```

Possible status values: queued, running, succeeded, failed, canceled.
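For long-running experiments, polling with exponential backoff keeps request volume low while staying responsive early on. A sketch of such a delay schedule (the function name and defaults are ours, not platform requirements):

```python
def backoff_delays(base=2.0, cap=60.0, factor=2.0):
    """Yield polling delays: base, base*factor, base*factor^2, ... capped at cap."""
    delay = base
    while True:
        yield delay
        delay = min(delay * factor, cap)
```

Between GETs to the status endpoint, call `time.sleep(next(delays))` instead of sleeping a fixed interval.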


Get Training Logs

GET /api/experiments/{experiment_id}/logs

Return training logs for debugging and monitoring. Includes H2O engine output, job status, and error messages.

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| include_h2o_logs | boolean | true | Include detailed H2O event logs. |

Example

```shell
curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/logs" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response 200 OK

```json
{
  "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
  "logs": [
    "Experiment status: succeeded",
    "Job status: completed",
    "",
    "AutoML progress: 100%",
    "Best model: GBM_1_AutoML (AUC=0.982)"
  ],
  "count": 5
}
```

Explainability

GET /api/experiments/{experiment_id}/explain

Return multi-model explainability data including a Pareto front (accuracy vs. training time) and a variable importance heatmap across all models in the experiment.

Example

```shell
curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/explain" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response 200 OK

```json
{
  "explain": {
    "pareto_front": {
      "models": [
        {"name": "GBM_1_AutoML", "algo": "GBM", "metric": 0.982, "training_time": 45.2},
        {"name": "XGBoost_1_AutoML", "algo": "XGBoost", "metric": 0.978, "training_time": 32.1}
      ]
    },
    "varimp_heatmap": {
      "models": ["GBM_1_AutoML", "XGBoost_1_AutoML"],
      "features": ["amount", "merchant_id", "hour"],
      "matrix": [
        [0.45, 0.30, 0.15],
        [0.42, 0.35, 0.12]
      ]
    }
  }
}
```
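The varimp_heatmap payload can be reduced to a per-model top-feature summary with plain data wrangling; `top_features` is our helper name, not an API field:

```python
def top_features(varimp, k=1):
    """Return {model_name: [top-k feature names]} from a varimp_heatmap payload."""
    result = {}
    for model, row in zip(varimp["models"], varimp["matrix"]):
        # Pair each feature with its importance for this model and rank descending
        ranked = sorted(zip(varimp["features"], row), key=lambda pair: pair[1], reverse=True)
        result[model] = [name for name, _ in ranked[:k]]
    return result
```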

Set Best Model

POST /api/experiments/{experiment_id}/best-model

Override the default best model selection for an experiment.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model_id | string | Yes | UUID of a model that belongs to this experiment. |

Example

```shell
curl -X POST "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/best-model" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"}'
```

Response 200 OK

```json
{
  "ok": true,
  "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
  "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"
}
```

Delete Experiment

DELETE /api/experiments/{experiment_id}

Delete an experiment and its associated models.

Example

```shell
curl -X DELETE "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response 200 OK

```json
{
  "ok": true
}
```

See also

  • Models API – Inspecting individual models from an experiment.

  • Deployments API – Deploying the best model to production.