===============
Experiments API
===============

Experiments run AutoML training on a dataset. Creating an experiment enqueues a background job that trains multiple models and selects the best one according to the configured metric.

.. contents:: Endpoints
   :local:
   :depth: 1

----

Create Experiment
-----------------

.. code-block:: text

   POST /api/experiments

Create a new experiment and enqueue an ``automl_train`` background job. The experiment starts in ``queued`` status and transitions through ``running`` to ``succeeded`` or ``failed``.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``project_id``
     - string
     - Yes
     - UUID of the project.
   * - ``dataset_version_id``
     - string
     - Yes
     - UUID of the dataset version to train on.
   * - ``name``
     - string
     - Yes
     - Human-readable experiment name.
   * - ``target_column``
     - string
     - Yes
     - Name of the column to predict.
   * - ``problem_type``
     - string
     - Yes
     - ``classification`` or ``regression``. Also accepts ``binary`` / ``multiclass`` (mapped to ``classification``).
   * - ``config``
     - object
     - No
     - AutoML configuration (see below).
   * - ``use_gpu``
     - boolean
     - No
     - Request GPU-enabled training (default ``false``).

**Config Options**

.. list-table::
   :header-rows: 1
   :widths: 25 10 65

   * - Key
     - Type
     - Description
   * - ``max_runtime_secs``
     - integer
     - Maximum training time in seconds (default 3600).
   * - ``max_models``
     - integer
     - Maximum number of models to train (default 20). May be capped by account quota.
   * - ``max_runtime_secs_per_model``
     - integer
     - Maximum training time per individual model (optional).
   * - ``sort_metric``
     - string
     - Metric to rank models by (see :ref:`sort-metrics` below). Default ``AUTO`` (AUC for classification, RMSE for regression).
   * - ``seed``
     - integer
     - Random seed for reproducibility (default 42).
   * - ``balance_classes``
     - boolean
     - Oversample the minority class (classification only, default ``false``).
   * - ``exclude_algos``
     - array[string]
     - Algorithms to exclude (see :ref:`supported-algorithms` below).
   * - ``nfolds``
     - integer
     - Number of cross-validation folds (default 5, use ``-1`` for auto-detection, ``0`` to disable).
   * - ``stopping_rounds``
     - integer
     - Number of scoring rounds without improvement before early stopping (default 3, use ``0`` to disable).
   * - ``stopping_metric``
     - string
     - Metric to monitor for early stopping: ``AUTO``, ``deviance``, ``logloss``, ``RMSE``, ``AUC``.
   * - ``stopping_tolerance``
     - float
     - Relative tolerance for early stopping (default 0.001).
   * - ``exploitation_ratio``
     - float
     - Ratio of time for fine-tuning vs. exploring new models (0.0--1.0).
   * - ``distribution``
     - string
     - Distribution family (regression): ``gaussian``, ``poisson``, ``gamma``, ``tweedie``, ``huber``, ``quantile``, ``laplace``.
   * - ``keep_cross_validation_predictions``
     - boolean
     - Keep CV predictions for Stacked Ensembles (default ``true``).
   * - ``keep_cross_validation_models``
     - boolean
     - Keep individual CV models (default ``false``).
   * - ``verbosity``
     - string
     - Logging level: ``debug``, ``info``, ``warn``.

.. _supported-algorithms:

Supported Algorithms
^^^^^^^^^^^^^^^^^^^^

CorePlexML uses H2O AutoML, which automatically trains, tunes, and ranks models from the following algorithm families:

.. list-table::
   :header-rows: 1
   :widths: 20 15 65

   * - Algorithm ID
     - Display Name
     - Description
   * - ``GBM``
     - Gradient Boosting
     - Gradient Boosting Machine. Builds sequential decision trees where each tree corrects the errors of the previous ones. Strong default performance on most tabular datasets.
   * - ``XGBoost``
     - XGBoost
     - Extreme Gradient Boosting. Optimized implementation with regularization options (L1/L2). Often produces top-performing models, especially on structured data.
   * - ``DRF``
     - Random Forest
     - Distributed Random Forest. Ensemble of decision trees using random feature subsets and bootstrap sampling. Robust to overfitting.
   * - ``DeepLearning``
     - Deep Learning
     - Multi-layer feed-forward neural network. Multiple hidden layers with configurable activation functions. Best for datasets with complex nonlinear relationships.
   * - ``GLM``
     - Linear Model
     - Generalized Linear Model. Includes linear regression, logistic regression, and other link functions. Fast, interpretable, and useful as a baseline.
   * - ``StackedEnsemble``
     - Stacked Ensemble
     - Meta-model that combines predictions from all other trained models using a secondary learner. Typically achieves the best accuracy. Note: SHAP feature contributions are **not available** for StackedEnsemble models.

All six algorithms are trained by default. Use ``exclude_algos`` to skip specific ones:

.. code-block:: json

   {
     "config": {
       "exclude_algos": ["DeepLearning", "StackedEnsemble"]
     }
   }

.. _sort-metrics:

Sort & Stopping Metrics
^^^^^^^^^^^^^^^^^^^^^^^

The ``sort_metric`` determines how models are ranked on the leaderboard. The ``stopping_metric`` determines when early stopping triggers. With ``AUTO`` (the default), the platform selects the best metric for your problem type.

**Binary Classification Metrics**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Metric
     - Description
   * - ``AUC``
     - Area Under the ROC Curve (default for binary classification). Higher is better. Range: 0--1.
   * - ``AUCPR``
     - Area Under the Precision-Recall Curve. Better than AUC for imbalanced datasets.
   * - ``logloss``
     - Logarithmic loss (cross-entropy). Lower is better.
   * - ``mean_per_class_error``
     - Average error rate across classes. Lower is better.
   * - ``accuracy``
     - Classification accuracy. Higher is better. Not recommended for imbalanced classes.
   * - ``MCC``
     - Matthews Correlation Coefficient. Balanced metric for imbalanced data. Range: -1 to 1.

**Multiclass Classification Metrics**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Metric
     - Description
   * - ``logloss``
     - Multinomial log loss (default for multiclass). Lower is better.
   * - ``mean_per_class_error``
     - Average error rate across all classes. Lower is better.

**Regression Metrics**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Metric
     - Description
   * - ``RMSE``
     - Root Mean Squared Error (default for regression). Lower is better.
   * - ``MSE``
     - Mean Squared Error. Lower is better.
   * - ``MAE``
     - Mean Absolute Error. Less sensitive to outliers. Lower is better.
   * - ``RMSLE``
     - Root Mean Squared Logarithmic Error. Good for skewed targets. Lower is better.
   * - ``mean_residual_deviance``
     - Mean Residual Deviance. Lower is better.
   * - ``R2``
     - R-squared (coefficient of determination). Higher is better. Range: 0--1.

**Distribution-Specific Metrics**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Metric
     - Description
   * - ``quantile_loss``
     - Quantile loss. Available when ``distribution`` is ``quantile`` or ``huber``.
   * - ``deviance``
     - Deviance. Available for all distributions. Lower is better.

Problem Types
^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Value
     - Description
   * - ``classification``
     - Binary or multiclass classification. Auto-detected from the target column cardinality.
   * - ``regression``
     - Continuous numeric prediction.
   * - ``binary``
     - Alias for ``classification`` (mapped automatically).
   * - ``multiclass``
     - Alias for ``classification`` (mapped automatically).

Experiment Status Lifecycle
^^^^^^^^^^^^^^^^^^^^^^^^^^^

An experiment transitions through these statuses:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Status
     - Description
   * - ``queued``
     - Experiment created and waiting for a worker to pick it up.
   * - ``running``
     - AutoML training in progress. Models are being trained.
   * - ``completed`` / ``succeeded``
     - Training finished successfully. Models are available on the leaderboard. (The API returns ``succeeded``; the UI displays ``completed``.)
   * - ``failed``
     - Training failed due to an error (bad data, configuration issue, or system error). Check the experiment logs for details.
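The lifecycle above maps directly onto a small client-side polling helper. The sketch below is illustrative rather than part of any official SDK: the endpoint path matches the lightweight status endpoint documented later in this section, while the helper names (``is_terminal``, ``wait_for_experiment``) and the ``base_url`` / ``api_key`` parameters are our own placeholders.

.. code-block:: python

   import time

   import requests

   # Statuses after which the experiment will not change again
   # (the status endpoint may also return "canceled").
   TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


   def is_terminal(status: str) -> bool:
       """Return True once an experiment has reached a final state."""
       return status in TERMINAL_STATUSES


   def wait_for_experiment(base_url, api_key, experiment_id,
                           interval=10, timeout=3600):
       """Poll the status endpoint until the experiment finishes.

       Illustrative helper: ``base_url`` and ``api_key`` are assumed to be
       your deployment URL and API key.
       """
       deadline = time.monotonic() + timeout
       while time.monotonic() < deadline:
           resp = requests.get(
               f"{base_url}/api/experiments/{experiment_id}/status",
               headers={"Authorization": f"Bearer {api_key}"},
           )
           resp.raise_for_status()
           status = resp.json()["status"]
           if is_terminal(status):
               return status
           time.sleep(interval)
       raise TimeoutError(
           f"experiment {experiment_id} still running after {timeout}s"
       )

Using the dedicated ``/status`` endpoint here (rather than the full detail endpoint) keeps each poll cheap; a fixed ``interval`` of around 10 seconds is usually sufficient given typical training runtimes.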
Job Status Values
^^^^^^^^^^^^^^^^^

Background jobs (experiments, reports, synthgen) use these statuses:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Status
     - Description
   * - ``queued``
     - Job created, waiting for a worker.
   * - ``running``
     - Worker is processing the job.
   * - ``succeeded``
     - Job completed successfully.
   * - ``failed``
     - Job failed. Error details are in the job payload.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/experiments" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
       "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
       "name": "Fraud Detection AutoML",
       "target_column": "is_fraud",
       "problem_type": "classification",
       "config": {
         "max_runtime_secs": 300,
         "max_models": 20,
         "sort_metric": "auc",
         "nfolds": 5,
         "balance_classes": true,
         "exclude_algos": ["DeepLearning"]
       }
     }'

.. code-block:: python

   import requests
   import time

   # Create the experiment
   resp = requests.post(
       f"{BASE_URL}/api/experiments",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       json={
           "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
           "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
           "name": "Fraud Detection AutoML",
           "target_column": "is_fraud",
           "problem_type": "classification",
           "config": {
               "max_runtime_secs": 300,
               "max_models": 20,
               "sort_metric": "auc",
               "nfolds": 5,
               "balance_classes": True,
           },
       },
   )
   experiment_id = resp.json()["experiment_id"]

   # Poll for completion
   while True:
       status_resp = requests.get(
           f"{BASE_URL}/api/experiments/{experiment_id}/status",
           headers={"Authorization": "Bearer YOUR_API_KEY"},
       )
       status = status_resp.json()["status"]
       print(f"Status: {status}")
       if status in ("succeeded", "failed"):
           break
       time.sleep(10)

**Response** ``201 Created``
.. code-block:: json

   {
     "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
     "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
     "job_id": "c9d0e1f2-a3b4-5678-9012-cdef12345678",
     "status": "queued"
   }

----

List Experiments
----------------

.. code-block:: text

   GET /api/experiments

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``project_id``
     - string
     - --
     - Filter by project.
   * - ``dataset_id``
     - string
     - --
     - Filter by source dataset.
   * - ``status``
     - string
     - --
     - Filter by status: ``queued``, ``running``, ``succeeded``, ``failed``.
   * - ``limit``
     - integer
     - 50
     - Max items (1--500).
   * - ``offset``
     - integer
     - 0
     - Pagination offset.
   * - ``search``
     - string
     - --
     - Search in name and target column.
   * - ``sort_field``
     - string
     - ``created_at``
     - ``created_at``, ``name``, or ``status``.
   * - ``sort_direction``
     - string
     - ``desc``
     - ``asc`` or ``desc``.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/experiments?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "items": [
       {
         "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
         "name": "Fraud Detection AutoML",
         "project_name": "Fraud Detection v2",
         "dataset_name": "Transactions Q4",
         "target_column": "is_fraud",
         "problem_type": "classification",
         "status": "succeeded",
         "model_count": 12,
         "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
         "best_model_algo": "GBM",
         "created_at": "2026-02-12T10:30:00Z"
       }
     ],
     "total": 1,
     "limit": 50,
     "offset": 0
   }

----

Get Experiment Detail
---------------------

.. code-block:: text

   GET /api/experiments/{experiment_id}

Return the experiment, its models, artifact links, and the best model ID.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567" \
     -H "Authorization: Bearer YOUR_API_KEY"
.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
   )
   data = resp.json()
   print("Best model:", data["best_model_id"])
   for m in data["models"]:
       print(f"  {m['algo']}: {m['metrics'].get('auc', 'N/A')}")

**Response** ``200 OK``

.. code-block:: json

   {
     "experiment": {
       "id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
       "name": "Fraud Detection AutoML",
       "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
       "dataset_version_id": "f6a7b8c9-d0e1-2345-6789-0abcdef12345",
       "target_column": "is_fraud",
       "problem_type": "classification",
       "status": "succeeded",
       "max_runtime_secs": 300,
       "max_models": 20,
       "sort_metric": "auc"
     },
     "models": [
       {
         "id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
         "name": "GBM_1_AutoML",
         "algo": "GBM",
         "metrics": {"auc": 0.982, "logloss": 0.071}
       }
     ],
     "artifact_links": [],
     "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"
   }

----

Get Experiment Status
---------------------

.. code-block:: text

   GET /api/experiments/{experiment_id}/status

Lightweight polling endpoint that returns only the current status. Use this for progress polling instead of the full detail endpoint.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/status" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
     "status": "running"
   }

Possible status values: ``queued``, ``running``, ``succeeded``, ``failed``, ``canceled``.

----

Get Training Logs
-----------------

.. code-block:: text

   GET /api/experiments/{experiment_id}/logs

Return training logs for debugging and monitoring. Includes H2O engine output, job status, and error messages.

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``include_h2o_logs``
     - boolean
     - ``true``
     - Include detailed H2O event logs.

**Example**
.. code-block:: bash

   curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/logs" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
     "logs": [
       "Experiment status: succeeded",
       "Job status: completed",
       "",
       "AutoML progress: 100%",
       "Best model: GBM_1_AutoML (AUC=0.982)"
     ],
     "count": 5
   }

----

Explainability
--------------

.. code-block:: text

   GET /api/experiments/{experiment_id}/explain

Return multi-model explainability data, including a Pareto front (accuracy vs. training time) and a variable-importance heatmap across all models in the experiment.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/explain" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "explain": {
       "pareto_front": {
         "models": [
           {"name": "GBM_1_AutoML", "algo": "GBM", "metric": 0.982, "training_time": 45.2},
           {"name": "XGBoost_1_AutoML", "algo": "XGBoost", "metric": 0.978, "training_time": 32.1}
         ]
       },
       "varimp_heatmap": {
         "models": ["GBM_1_AutoML", "XGBoost_1_AutoML"],
         "features": ["amount", "merchant_id", "hour"],
         "matrix": [
           [0.45, 0.30, 0.15],
           [0.42, 0.35, 0.12]
         ]
       }
     }
   }

----

Set Best Model
--------------

.. code-block:: text

   POST /api/experiments/{experiment_id}/best-model

Override the default best-model selection for an experiment.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``model_id``
     - string
     - Yes
     - UUID of a model that belongs to this experiment.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567/best-model" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"}'

**Response** ``200 OK``
.. code-block:: json

   {
     "ok": true,
     "experiment_id": "b8c9d0e1-f2a3-4567-8901-bcdef1234567",
     "best_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"
   }

----

Delete Experiment
-----------------

.. code-block:: text

   DELETE /api/experiments/{experiment_id}

Delete an experiment and its associated models.

**Example**

.. code-block:: bash

   curl -X DELETE "$BASE_URL/api/experiments/b8c9d0e1-f2a3-4567-8901-bcdef1234567" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true
   }

----

.. seealso::

   - :doc:`models` -- Inspecting individual models from an experiment.
   - :doc:`deployments` -- Deploying the best model to production.
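As a closing illustration, the required/optional field rules from the *Create Experiment* request-body table can be validated client-side before the request is sent. This is a minimal sketch under stated assumptions: ``build_experiment_request`` and ``create_experiment`` are hypothetical helper names (not part of any SDK), and only fields documented above are included.

.. code-block:: python

   import requests

   # Values the API accepts for problem_type; "binary" and "multiclass"
   # are aliases mapped to "classification" server-side.
   PROBLEM_TYPES = {"classification", "regression", "binary", "multiclass"}


   def build_experiment_request(project_id, dataset_version_id, name,
                                target_column, problem_type,
                                config=None, use_gpu=False):
       """Assemble a POST /api/experiments body from the documented fields."""
       if problem_type not in PROBLEM_TYPES:
           raise ValueError(f"unsupported problem_type: {problem_type}")
       body = {
           "project_id": project_id,
           "dataset_version_id": dataset_version_id,
           "name": name,
           "target_column": target_column,
           "problem_type": problem_type,
       }
       if config:
           body["config"] = config        # optional AutoML configuration
       if use_gpu:
           body["use_gpu"] = True         # defaults to false server-side
       return body


   def create_experiment(base_url, api_key, body):
       """POST the assembled body and return the new experiment_id."""
       resp = requests.post(
           f"{base_url}/api/experiments",
           headers={"Authorization": f"Bearer {api_key}"},
           json=body,
       )
       resp.raise_for_status()
       return resp.json()["experiment_id"]

Validating ``problem_type`` locally simply fails fast with a clearer error than a ``400`` round-trip; the server remains the authority on quota caps such as ``max_models``.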