===============
Deployments API
===============

Deployments serve trained models as real-time prediction endpoints. The MLOps
lifecycle supports staging and production stages, promotion, rollback,
inference logging, and data drift detection.

All deployment endpoints are prefixed with ``/api/mlops``.

.. contents:: Endpoints
   :local:
   :depth: 1

----

Create Deployment
-----------------

.. code-block:: text

   POST /api/mlops/projects/{project_id}/deployments

Deploy a model to either staging or production. If deploying to
``production``, any existing production deployment in the same project is
automatically archived.

**Path Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 70

   * - Parameter
     - Type
     - Description
   * - ``project_id``
     - UUID
     - Project that owns the model.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 25 10 10 55

   * - Field
     - Type
     - Required
     - Description
   * - ``model_id``
     - string
     - Yes
     - UUID of the model to deploy. Must belong to the same project.
   * - ``name``
     - string
     - Yes
     - Deployment name.
   * - ``stage``
     - string
     - Yes
     - ``staging`` or ``production``.
   * - ``traffic_percent``
     - integer
     - No
     - Traffic allocation percentage (1--100, default 100).
   * - ``privacy_policy_id``
     - string
     - No
     - Attach a privacy policy for runtime anonymization.
   * - ``privacy_anonymize_logs``
     - boolean
     - No
     - Anonymize PII in inference logs.
   * - ``privacy_anonymize_response``
     - boolean
     - No
     - Anonymize PII in prediction responses.
   * - ``privacy_column_map``
     - object
     - No
     - Map input column names to PII types.
   * - ``privacy_threshold``
     - float
     - No
     - Confidence threshold for PII detection (0.0--1.0).

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/projects/d4e5f6a7-b8c9-0123-def4-567890123456/deployments" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
       "name": "fraud-detector-staging",
       "stage": "staging",
       "traffic_percent": 100
     }'
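The optional privacy fields in the request body can be attached at creation
time. A minimal sketch of assembling such a payload, using only the field
names documented above; the ``build_create_payload`` helper is illustrative,
not part of the API:

.. code-block:: python

   def build_create_payload(model_id, name, stage,
                            traffic_percent=100, privacy_policy_id=None,
                            privacy_anonymize_logs=False):
       """Assemble a create-deployment body; privacy fields are included
       only when a policy is attached (field names from the table above)."""
       payload = {
           "model_id": model_id,
           "name": name,
           "stage": stage,
           "traffic_percent": traffic_percent,
       }
       if privacy_policy_id is not None:
           payload["privacy_policy_id"] = privacy_policy_id
           payload["privacy_anonymize_logs"] = privacy_anonymize_logs
       return payload

The resulting dict is then POSTed as ``json=`` exactly as in the examples on
this page.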
.. code-block:: python

   import requests

   resp = requests.post(
       f"{BASE_URL}/api/mlops/projects/d4e5f6a7-b8c9-0123-def4-567890123456/deployments",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       json={
           "model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
           "name": "fraud-detector-staging",
           "stage": "staging",
           "traffic_percent": 100,
       },
   )
   deployment_id = resp.json()["deployment_id"]
   print("Deployed:", deployment_id)

**Response** ``201 Created``

.. code-block:: json

   {
     "id": "d0e1f2a3-b4c5-6789-0123-def456789012",
     "deployment_id": "d0e1f2a3-b4c5-6789-0123-def456789012",
     "stage": "staging",
     "is_active": true
   }

----

List Deployments
----------------

.. code-block:: text

   GET /api/mlops/projects/{project_id}/deployments

Return all deployments for a project.

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``limit``
     - integer
     - 50
     - Max items (1--200).
   * - ``offset``
     - integer
     - 0
     - Pagination offset.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/mlops/projects/d4e5f6a7-b8c9-0123-def4-567890123456/deployments" \
     -H "Authorization: Bearer YOUR_API_KEY"

.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/mlops/projects/d4e5f6a7-b8c9-0123-def4-567890123456/deployments",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
   )
   for dep in resp.json()["items"]:
       print(f"{dep['id']} stage={dep['stage']} active={dep['is_active']}")

**Response** ``200 OK``

.. code-block:: json

   {
     "items": [
       {
         "id": "d0e1f2a3-b4c5-6789-0123-def456789012",
         "model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
         "stage": "staging",
         "is_active": true,
         "config": {"name": "fraud-detector-staging", "traffic_percent": 100},
         "created_at": "2026-02-25T09:00:00Z"
       }
     ],
     "total": 1,
     "limit": 50,
     "offset": 0
   }

----

Get Deployment Detail
---------------------

.. code-block:: text

   GET /api/mlops/deployments/{deployment_id}

Return deployment metadata and the associated model details.

**Example**
.. code-block:: bash

   curl "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "deployment": {
       "id": "d0e1f2a3-b4c5-6789-0123-def456789012",
       "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
       "model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
       "stage": "staging",
       "is_active": true,
       "config": {"name": "fraud-detector-staging", "traffic_percent": 100},
       "created_at": "2026-02-25T09:00:00Z"
     },
     "model": {
       "id": "a7b8c9d0-e1f2-3456-7890-abcdef123456",
       "name": "GBM_1_AutoML",
       "algo": "GBM"
     }
   }

----

Predict via Deployment
----------------------

.. code-block:: text

   POST /api/mlops/deployments/{deployment_id}/predict

Send input features and receive predictions from the deployed model. This is
the primary real-time prediction endpoint. Inference requests are logged
automatically for monitoring and drift detection.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``inputs``
     - dict or list[dict]
     - Yes
     - Feature values. Single record or batch.
   * - ``options``
     - object
     - No
     - Options (see table below).

**Predict Options**

.. list-table::
   :header-rows: 1
   :widths: 25 10 65

   * - Key
     - Type
     - Description
   * - ``include_contributions``
     - boolean
     - Include SHAP-style feature contributions (not available for
       StackedEnsemble models). Default: ``false``.
   * - ``store_payload``
     - boolean
     - Store input payload in inference logs. Default: ``true``.
   * - ``privacy_policy_id``
     - string
     - Override the deployment-level privacy policy for this request.
   * - ``anonymize_logs``
     - boolean
     - Override log anonymization for this request.
   * - ``anonymize_response``
     - boolean
     - Override response anonymization for this request.

**Example**
.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/predict" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "inputs": {"amount": 29.99, "merchant_id": 42, "hour": 14},
       "options": {"include_contributions": true}
     }'

.. code-block:: python

   resp = requests.post(
       f"{BASE_URL}/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/predict",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       json={
           "inputs": {"amount": 29.99, "merchant_id": 42, "hour": 14},
           "options": {"include_contributions": True},
       },
   )
   result = resp.json()
   print("Prediction:", result["predictions"][0]["prediction"])

**Response** ``200 OK``

.. code-block:: json

   {
     "predictions": [
       {"prediction": "0"}
     ],
     "contributions": [
       [
         {"feature": "amount", "value": -0.32},
         {"feature": "merchant_id", "value": -0.15},
         {"feature": "BiasTerm", "value": -1.20}
       ]
     ],
     "latency_ms": 45
   }

----

Promote Deployment
------------------

.. code-block:: text

   POST /api/mlops/deployments/{deployment_id}/promote

Promote a deployment to production. Any existing production deployment in the
same project is automatically archived.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``to_stage``
     - string
     - Yes
     - Must be ``production``.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/promote" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"to_stage": "production"}'

.. code-block:: python

   requests.post(
       f"{BASE_URL}/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/promote",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       json={"to_stage": "production"},
   )

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true,
     "deployment_id": "d0e1f2a3-b4c5-6789-0123-def456789012",
     "stage": "production"
   }

----

Rollback Deployment
-------------------
.. code-block:: text

   POST /api/mlops/deployments/{deployment_id}/rollback

Roll back production to a previous deployment or model. Provide either
``to_deployment_id`` (reactivate an archived deployment) or ``to_model_id``
(create a new production deployment for the given model).

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 25 10 10 55

   * - Field
     - Type
     - Required
     - Description
   * - ``to_deployment_id``
     - string
     - No*
     - UUID of the archived deployment to reactivate.
   * - ``to_model_id``
     - string
     - No*
     - UUID of the model to deploy as the new production version.

\* At least one of ``to_deployment_id`` or ``to_model_id`` is required.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/rollback" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"to_model_id": "a7b8c9d0-e1f2-3456-7890-abcdef123456"}'

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true
   }

----

Deactivate Deployment
---------------------

.. code-block:: text

   POST /api/mlops/deployments/{deployment_id}/deactivate

Deactivate a deployment without promoting a replacement. The deployment moves
to the ``archived`` stage.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/deactivate" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true
   }

----

Drift Metrics
-------------

.. code-block:: text

   GET /api/mlops/deployments/{deployment_id}/drift

Return drift detection results comparing the training distribution to recent
inference data. Returns per-feature PSI scores and an overall drift status.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/drift" \
     -H "Authorization: Bearer YOUR_API_KEY"
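Beyond printing scores, the per-feature statuses can drive automation, such as
gating a promotion or triggering a rollback. A minimal sketch, assuming only
the status values shown in the example response (``low``, ``moderate``); the
``drifted_features`` helper is illustrative, not part of the API:

.. code-block:: python

   def drifted_features(drift, ignore=("low",)):
       """Return the names of features whose PSI status is not in `ignore`,
       given a response body from the drift endpoint."""
       return sorted(
           name
           for name, info in drift.get("features", {}).items()
           if info.get("status") not in ignore
       )

Applied to the example response in this section, this returns
``['merchant_id']``.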
.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/drift",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
   )
   drift = resp.json()
   for feature, info in drift.get("features", {}).items():
       print(f"{feature}: PSI={info['psi']:.4f} ({info['status']})")

**Response** ``200 OK``

.. code-block:: json

   {
     "deployment_id": "d0e1f2a3-b4c5-6789-0123-def456789012",
     "overall_status": "low",
     "features": {
       "amount": {"psi": 0.023, "status": "low"},
       "merchant_id": {"psi": 0.15, "status": "moderate"},
       "hour": {"psi": 0.008, "status": "low"}
     },
     "computed_at": "2026-02-28T12:00:00Z",
     "inference_rows": 5000
   }

----

Run Drift Detection
-------------------

.. code-block:: text

   POST /api/mlops/deployments/{deployment_id}/drift/run

Enqueue a background job to compute drift metrics against recent inference
data. The result can be retrieved via the drift metrics endpoint once the job
completes.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/drift/run" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true,
     "job_id": "e1f2a3b4-c5d6-7890-1234-ef5678901234"
   }

----

Inference Logs
--------------

.. code-block:: text

   GET /api/mlops/deployments/{deployment_id}/inference-logs

Return a paginated list of inference requests and responses for the
deployment. Useful for monitoring, auditing, and debugging.

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``limit``
     - integer
     - 50
     - Max items (1--500).
   * - ``offset``
     - integer
     - 0
     - Pagination offset.
   * - ``since``
     - string
     - --
     - ISO 8601 timestamp. Only return logs after this time.
   * - ``until``
     - string
     - --
     - ISO 8601 timestamp. Only return logs before this time.

**Example**
.. code-block:: bash

   curl "$BASE_URL/api/mlops/deployments/d0e1f2a3-b4c5-6789-0123-def456789012/inference-logs?limit=5" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "items": [
       {
         "id": "f2a3b4c5-d6e7-8901-2345-f67890123456",
         "deployment_id": "d0e1f2a3-b4c5-6789-0123-def456789012",
         "inputs": {"amount": 29.99, "merchant_id": 42, "hour": 14},
         "prediction": "0",
         "latency_ms": 45,
         "status_code": 200,
         "created_at": "2026-02-28T14:30:00Z"
       }
     ],
     "total": 5000,
     "limit": 5,
     "offset": 0
   }

----

Deployment Stages
^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Stage
     - Description
   * - ``staging``
     - Pre-production testing. Use this to validate model behavior before
       going live.
   * - ``production``
     - Live serving. Predictions from this deployment are production traffic.

Deployment Status Values
^^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Status
     - Description
   * - ``active``
     - Deployment is running and serving predictions.
   * - ``inactive``
     - Deployment has been deactivated. Not serving predictions.
   * - ``failed``
     - Deployment failed to initialize or load the model.

----

Alert Metrics Reference
^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Metric
     - Description
   * - ``drift_psi``
     - Population Stability Index. Measures feature distribution drift
       between training and production data.
   * - ``accuracy_degradation``
     - Degradation in prediction accuracy over time.
   * - ``error_rate``
     - Percentage of predictions returning errors.
   * - ``latency_p99``
     - 99th percentile prediction latency in milliseconds.
   * - ``model_staleness``
     - Days since the model was last retrained.
   * - ``prediction_anomaly``
     - Anomalous prediction patterns detected.

Alert Condition Operators
^^^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 15 20 65

   * - Operator
     - Symbol
     - Description
   * - ``gt``
     - >
     - Greater than threshold.
   * - ``lt``
     - <
     - Less than threshold.
   * - ``gte``
     - >=
     - Greater than or equal to threshold.
   * - ``lte``
     - <=
     - Less than or equal to threshold.
   * - ``eq``
     - ==
     - Equal to threshold.
   * - ``neq``
     - !=
     - Not equal to threshold.

Alert Severity Levels
^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Severity
     - Description
   * - ``info``
     - Informational alert. No action required.
   * - ``warning``
     - Warning. Should be investigated.
   * - ``critical``
     - Critical issue requiring immediate attention.

Notification Channels
^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Channel
     - Description
   * - ``email``
     - Send alert to one or more email addresses.
   * - ``slack``
     - Post alert to a Slack channel via webhook.
   * - ``webhook``
     - Send alert payload to a custom HTTP endpoint.

Alert Rule Options
^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Option
     - Type
     - Description
   * - ``cooldown_minutes``
     - integer
     - Minimum minutes between repeated alerts (1--1440, default 60).
   * - ``escalation_minutes``
     - integer
     - Minutes before escalating severity (optional, 1--1440).
   * - ``escalation_severity``
     - string
     - Severity to escalate to: ``warning`` or ``critical``.

----

Deployment Lifecycle Example
----------------------------

The following shows the full lifecycle: create a staging deployment, test it,
promote it to production, and roll back if needed.

.. code-block:: python

   import requests

   BASE_URL = "http://localhost:8888"
   HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
   PROJECT_ID = "d4e5f6a7-b8c9-0123-def4-567890123456"
   MODEL_ID = "a7b8c9d0-e1f2-3456-7890-abcdef123456"

   # 1. Deploy to staging
   resp = requests.post(
       f"{BASE_URL}/api/mlops/projects/{PROJECT_ID}/deployments",
       headers=HEADERS,
       json={"model_id": MODEL_ID, "name": "v2-staging", "stage": "staging"},
   )
   deployment_id = resp.json()["deployment_id"]
   # 2. Test with a sample prediction
   resp = requests.post(
       f"{BASE_URL}/api/mlops/deployments/{deployment_id}/predict",
       headers=HEADERS,
       json={"inputs": {"amount": 50.0, "merchant_id": 10, "hour": 12}},
   )
   assert resp.status_code == 200
   print("Staging test passed:", resp.json()["predictions"])

   # 3. Promote to production
   requests.post(
       f"{BASE_URL}/api/mlops/deployments/{deployment_id}/promote",
       headers=HEADERS,
       json={"to_stage": "production"},
   )
   print("Promoted to production")

   # 4. Later: roll back if needed
   # requests.post(
   #     f"{BASE_URL}/api/mlops/deployments/{deployment_id}/rollback",
   #     headers=HEADERS,
   #     json={"to_model_id": OLD_MODEL_ID},
   # )

----

.. seealso::

   - :doc:`models` -- Model details and direct prediction.
   - :doc:`studio` -- What-if analysis against a deployment.
   - :doc:`reports` -- Generate deployment reports.