=================
Privacy Suite API
=================

The Privacy Suite detects and transforms personally identifiable information
(PII) in datasets. It supports 72+ PII types, four compliance profiles (HIPAA,
GDPR, PCI-DSS, CCPA), and seven transformation actions.

The typical workflow is: **create policy** (with rules) then **create session**
(linking policy to dataset) then **detect** then **transform** then **download**.

All endpoints are prefixed with ``/api/privacy``.

.. contents:: Endpoints
   :local:
   :depth: 1

----

Create Policy
-------------

.. code-block:: text

   POST /api/privacy/policies

Create a privacy policy that defines which PII types to detect and how to
handle them.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``project_id``
     - string
     - Yes
     - UUID of the project.
   * - ``name``
     - string
     - Yes
     - Policy name.
   * - ``description``
     - string
     - No
     - Optional description.
   * - ``profile``
     - string
     - No
     - Compliance profile: ``hipaa``, ``gdpr``, ``pci_dss``, ``ccpa``, or
       ``custom``.
   * - ``status``
     - string
     - No
     - ``draft`` (default) or ``active``.
   * - ``settings``
     - object
     - No
     - Additional configuration (e.g. sensitivity thresholds).

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/privacy/policies" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
       "name": "HIPAA Compliance Policy",
       "description": "Detect and redact all 18 PHI identifiers",
       "profile": "hipaa",
       "status": "active"
     }'

.. code-block:: python

   import requests

   BASE_URL = "http://localhost:8888"  # adjust to your deployment

   resp = requests.post(f"{BASE_URL}/api/privacy/policies", headers={
       "Authorization": "Bearer YOUR_API_KEY",
   }, json={
       "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
       "name": "HIPAA Compliance Policy",
       "description": "Detect and redact all 18 PHI identifiers",
       "profile": "hipaa",
       "status": "active",
   })
   policy_id = resp.json()["policy_id"]

**Response** ``201 Created``

.. code-block:: json

   {
     "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
     "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890"
   }

----

List Policies
-------------

.. code-block:: text

   GET /api/privacy/policies

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``project_id``
     - string
     - --
     - Filter by project. If omitted, returns all accessible policies.
   * - ``status``
     - string
     - --
     - Filter by status (``draft``, ``active``).
   * - ``limit``
     - integer
     - 50
     - Max items.
   * - ``offset``
     - integer
     - 0
     - Pagination offset.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/policies?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "items": [
       {
         "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
         "name": "HIPAA Compliance Policy",
         "profile": "hipaa",
         "status": "active",
         "rules_count": 5,
         "compliance_score": 100,
         "detected_count": 4,
         "protected_count": 4,
         "pending_count": 0,
         "created_at": "2026-02-20T09:00:00Z"
       }
     ],
     "total": 1,
     "limit": 50,
     "offset": 0
   }

----

Get Policy
----------

.. code-block:: text

   GET /api/privacy/policies/{policy_id}

Return policy details including its rules and compliance metrics.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "policy": {
       "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
       "name": "HIPAA Compliance Policy",
       "profile": "hipaa",
       "project_name": "Fraud Detection v2",
       "rules_count": 5,
       "compliance_score": 100
     },
     "rules": [
       {
         "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
         "pii_type": "email",
         "action": "redact",
         "is_enabled": true,
         "priority": 10,
         "config": {}
       }
     ]
   }

----

Update Policy
-------------

.. code-block:: text

   PATCH /api/privacy/policies/{policy_id}

Partial update of a policy. Only provided fields are modified.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``name``
     - string
     - No
     - Updated name.
   * - ``description``
     - string
     - No
     - Updated description.
   * - ``status``
     - string
     - No
     - ``draft`` or ``active``.
   * - ``profile``
     - string
     - No
     - Updated compliance profile.
   * - ``settings``
     - object
     - No
     - Updated configuration.

**Example**

.. code-block:: bash

   curl -X PATCH "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"status": "active"}'

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true
   }

----

Delete Policy
-------------

.. code-block:: text

   DELETE /api/privacy/policies/{policy_id}

Permanently delete a privacy policy and its rules.

**Example**

.. code-block:: bash

   curl -X DELETE "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "ok": true
   }

----

Create Detection Rule
---------------------

.. code-block:: text

   POST /api/privacy/policies/{policy_id}/rules

Add a PII detection and transformation rule to a policy.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Field
     - Type
     - Required
     - Description
   * - ``pii_type``
     - string
     - Yes
     - PII type identifier (e.g. ``email``, ``ssn``, ``credit_card``). Use
       ``GET /api/privacy/pii-types`` for the full list.
   * - ``action``
     - string
     - Yes
     - Transformation action: ``mask``, ``hash``, ``redact``, ``encrypt``,
       ``generalize``, ``suppress``, or ``pseudonymize``.
   * - ``is_enabled``
     - boolean
     - No
     - Whether this rule is active (default ``true``).
   * - ``priority``
     - integer
     - No
     - Higher priority rules take precedence (default ``0``).
   * - ``config``
     - object
     - No
     - Action-specific configuration (e.g. mask character, hash algorithm).

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "pii_type": "email",
       "action": "redact",
       "is_enabled": true,
       "priority": 10
     }'

.. code-block:: python

   resp = requests.post(
       f"{BASE_URL}/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       json={
           "pii_type": "email",
           "action": "redact",
           "is_enabled": True,
           "priority": 10,
       },
   )
   rule_id = resp.json()["rule_id"]

**Response** ``201 Created``

.. code-block:: json

   {
     "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
     "rule_id": "2b3c4d5e-6f7a-8901-bcde-f23456789012"
   }

----

List Rules
----------

.. code-block:: text

   GET /api/privacy/policies/{policy_id}/rules

Return all rules for a policy.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "rules": [
       {
         "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
         "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
         "pii_type": "email",
         "action": "redact",
         "is_enabled": true,
         "enabled": true,
         "priority": 10,
         "config": {}
       },
       {
         "id": "3c4d5e6f-7a8b-9012-cdef-345678901234",
         "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
         "pii_type": "ssn",
         "action": "hash",
         "is_enabled": true,
         "enabled": true,
         "priority": 20,
         "config": {"algorithm": "sha256"}
       }
     ]
   }

----

Create Session
--------------

.. code-block:: text

   POST /api/privacy/sessions

Create a privacy session that links a policy to a dataset version for PII
detection and transformation.

**Request Body**

.. list-table::
   :header-rows: 1
   :widths: 25 10 10 55

   * - Field
     - Type
     - Required
     - Description
   * - ``policy_id``
     - string
     - Yes
     - UUID of the privacy policy to apply.
   * - ``dataset_id``
     - string
     - No*
     - UUID of the dataset (uses latest version).
   * - ``dataset_version_id``
     - string
     - No*
     - UUID of a specific dataset version.
   * - ``project_id``
     - string
     - No
     - Must match the policy's project if provided.
   * - ``sample_size``
     - integer
     - No
     - Number of rows to sample for detection (null = all rows).
   * - ``deep_scan``
     - boolean
     - No
     - Enable deep scan mode for more thorough detection (default ``false``).

\* Provide either ``dataset_id`` or ``dataset_version_id``.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/privacy/sessions" \
     -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
       "dataset_id": "e5f6a7b8-c9d0-1234-5678-90abcdef1234",
       "deep_scan": true
     }'

**Response** ``201 Created``

.. code-block:: json

   {
     "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
     "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456"
   }

----

List Sessions
-------------

.. code-block:: text

   GET /api/privacy/sessions

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``project_id``
     - string
     - --
     - Filter by project.
   * - ``policy_id``
     - string
     - --
     - Filter by policy.
   * - ``limit``
     - integer
     - 50
     - Max items.
   * - ``offset``
     - integer
     - 0
     - Pagination offset.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/sessions?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "items": [
       {
         "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
         "policy_name": "HIPAA Compliance Policy",
         "dataset_name": "Transactions Q4",
         "status": "completed",
         "pii_found": 4,
         "transformed": 4,
         "created_at": "2026-02-25T10:00:00Z"
       }
     ],
     "total": 1,
     "limit": 50,
     "offset": 0
   }

----

Get Session Detail
------------------

.. code-block:: text

   GET /api/privacy/sessions/{session_id}

Return session details including scan progress, detection counts, and
transformation status.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "session": {
       "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
       "policy_name": "HIPAA Compliance Policy",
       "dataset_name": "Transactions Q4",
       "status": "completed",
       "rows_scanned": 50000,
       "columns_analyzed": 15,
       "pii_columns": 4,
       "pii_instances": 200000,
       "transformed": 4,
       "pending": 0,
       "progress": 100,
       "progress_message": "Transformation complete",
       "output_artifact_id": "b4c5d6e7-f8a9-0123-4567-890abcdef123"
     }
   }

----

Run PII Detection
-----------------

.. code-block:: text

   POST /api/privacy/sessions/{session_id}/detect

Enqueue a ``privacy_detect`` background job to scan the dataset for PII. The
session status transitions to ``detecting``.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/detect" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "job_id": "c5d6e7f8-a9b0-1234-5678-90abcdef1234",
     "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456",
     "status": "detecting"
   }

----

Apply Anonymization
-------------------

.. code-block:: text

   POST /api/privacy/sessions/{session_id}/transform

Enqueue a ``privacy_transform`` background job to apply the transformation
rules to detected PII. The session status transitions to ``transforming`` and
then ``completed``.

**Example**

.. code-block:: bash

   curl -X POST "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/transform" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "job_id": "d6e7f8a9-b0c1-2345-6789-0abcdef12345",
     "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456",
     "status": "transforming"
   }

----

Detection Results
-----------------

.. code-block:: text

   GET /api/privacy/sessions/{session_id}/results

Return aggregated PII detection results grouped by column and PII type.


**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pii_type``
     - string
     - --
     - Filter by PII type (e.g. ``email``).
   * - ``confidence``
     - string
     - --
     - Filter by confidence bucket: ``high`` (>=0.9), ``medium`` (0.7--0.9),
       ``low`` (<0.7).

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/results" \
     -H "Authorization: Bearer YOUR_API_KEY"

.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/results",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
   )
   for r in resp.json()["results"]:
       print(f"  {r['column']}: {r['pii_type']} (confidence={r['confidence']:.2f}, "
             f"instances={r['instances']}, action={r['action']})")

**Response** ``200 OK``

.. code-block:: json

   {
     "results": [
       {
         "column": "customer_email",
         "pii_type": "email",
         "confidence": 0.98,
         "instances": 49500,
         "sample": "alice@example.com",
         "action": "redact",
         "transformed": true,
         "samples": []
       },
       {
         "column": "customer_name",
         "pii_type": "name",
         "confidence": 0.92,
         "instances": 48000,
         "sample": "Alice Chen",
         "action": "mask",
         "transformed": true,
         "samples": []
       }
     ]
   }

----

Download Anonymized Data
------------------------

.. code-block:: text

   GET /api/privacy/sessions/{session_id}/download

Download the anonymized dataset. Redirects (302) to the artifact download URL.
Returns ``404`` if the transformation has not been completed yet.

**Example**

.. code-block:: bash

   curl -L -o anonymized_data.csv \
     "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/download" \
     -H "Authorization: Bearer YOUR_API_KEY"

.. code-block:: python

   resp = requests.get(
       f"{BASE_URL}/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/download",
       headers={"Authorization": "Bearer YOUR_API_KEY"},
       allow_redirects=True,
   )
   with open("anonymized_data.csv", "wb") as f:
       f.write(resp.content)

**Response** ``302 Found``

Redirects to ``/api/artifacts/{artifact_id}/download``.

----

Generate Privacy Report
-----------------------

.. code-block:: text

   GET /api/privacy/sessions/{session_id}/report

Generate and download a branded PDF report summarizing the privacy session:
policy details, detected PII, applied transformations, and compliance status.

**Query Parameters**

.. list-table::
   :header-rows: 1
   :widths: 20 10 10 60

   * - Parameter
     - Type
     - Default
     - Description
   * - ``ai_insights``
     - boolean
     - ``false``
     - Include AI-generated commentary.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/report?ai_insights=true" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "artifact_id": "e7f8a9b0-c1d2-3456-7890-abcdef123456",
     "download_url": "/api/artifacts/e7f8a9b0-c1d2-3456-7890-abcdef123456/download"
   }

----

List PII Types
--------------

.. code-block:: text

   GET /api/privacy/pii-types

Return the full list of supported PII types. The catalog covers 72+ types,
including names, addresses, phone numbers, financial identifiers, medical
records, and biometric data.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/pii-types" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "types": [
       {"id": "email", "name": "Email"},
       {"id": "phone", "name": "Phone"},
       {"id": "ssn", "name": "Ssn"},
       {"id": "credit_card", "name": "Credit Card"},
       {"id": "name", "name": "Name"},
       {"id": "address", "name": "Address"},
       {"id": "date_of_birth", "name": "Date Of Birth"},
       {"id": "ip_address", "name": "Ip Address"},
       {"id": "medical_record_number", "name": "Medical Record Number"}
     ]
   }

----

List Actions
------------

.. code-block:: text

   GET /api/privacy/actions

Return all supported transformation actions.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/actions" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "actions": [
       {"id": "mask", "name": "Mask"},
       {"id": "hash", "name": "Hash"},
       {"id": "redact", "name": "Redact"},
       {"id": "encrypt", "name": "Encrypt"},
       {"id": "generalize", "name": "Generalize"},
       {"id": "suppress", "name": "Suppress"},
       {"id": "pseudonymize", "name": "Pseudonymize"}
     ]
   }

----

List Compliance Profiles
------------------------

.. code-block:: text

   GET /api/privacy/profiles

Return available compliance profiles with their default PII types and
recommended actions.

**Example**

.. code-block:: bash

   curl "$BASE_URL/api/privacy/profiles" \
     -H "Authorization: Bearer YOUR_API_KEY"

**Response** ``200 OK``

.. code-block:: json

   {
     "profiles": [
       {
         "id": "hipaa",
         "name": "HIPAA",
         "description": "Health Insurance Portability and Accountability Act - 18 PHI identifiers",
         "pii_types": ["name", "address", "date", "phone", "fax", "email",
                       "ssn", "medical_record_number", "health_plan_id",
                       "account_number", "certificate_license", "vehicle_id",
                       "device_id", "url", "ip_address", "biometric", "photo",
                       "other_unique_id"],
         "default_action": "redact"
       },
       {
         "id": "gdpr",
         "name": "GDPR",
         "description": "General Data Protection Regulation - EU personal data protection",
         "pii_types": ["name", "email", "phone", "address", "date_of_birth",
                       "national_id", "passport", "ip_address", "cookie_id",
                       "location", "biometric", "genetic", "health",
                       "political", "religious", "sexual_orientation",
                       "trade_union"],
         "default_action": "pseudonymize"
       },
       {
         "id": "pci_dss",
         "name": "PCI-DSS",
         "description": "Payment Card Industry Data Security Standard",
         "pii_types": ["credit_card", "cvv", "pin", "magnetic_stripe",
                       "cardholder_name", "expiration_date", "service_code"],
         "default_action": "tokenize"
       },
       {
         "id": "ccpa",
         "name": "CCPA",
         "description": "California Consumer Privacy Act",
         "pii_types": ["name", "address", "email", "phone", "ssn",
                       "drivers_license", "passport", "ip_address",
                       "browsing_history", "purchase_history", "biometric",
                       "geolocation", "employment_info", "education_info"],
         "default_action": "redact"
       },
       {
         "id": "custom",
         "name": "Custom",
         "description": "Create your own custom privacy profile",
         "pii_types": [],
         "default_action": "mask"
       }
     ]
   }

----

Full Workflow Example
---------------------

The following demonstrates the complete PII detection and anonymization
workflow:

.. code-block:: python

   import requests
   import time

   BASE_URL = "http://localhost:8888"
   HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
   PROJECT_ID = "d4e5f6a7-b8c9-0123-def4-567890123456"
   DATASET_ID = "e5f6a7b8-c9d0-1234-5678-90abcdef1234"

   # 1. Create a HIPAA policy
   resp = requests.post(f"{BASE_URL}/api/privacy/policies", headers=HEADERS, json={
       "project_id": PROJECT_ID,
       "name": "HIPAA Policy",
       "profile": "hipaa",
       "status": "active",
   })
   resp.raise_for_status()  # stop early if the policy was not created
   policy_id = resp.json()["policy_id"]

   # 2. Add detection rules
   for pii_type in ["email", "ssn", "phone", "name"]:
       requests.post(
           f"{BASE_URL}/api/privacy/policies/{policy_id}/rules",
           headers=HEADERS,
           json={"pii_type": pii_type, "action": "redact", "is_enabled": True},
       )

   # 3. Create a session linking the policy to the dataset
   resp = requests.post(f"{BASE_URL}/api/privacy/sessions", headers=HEADERS, json={
       "policy_id": policy_id,
       "dataset_id": DATASET_ID,
       "deep_scan": True,
   })
   resp.raise_for_status()  # stop early if the session was not created
   session_id = resp.json()["session_id"]

   # 4. Run PII detection
   requests.post(
       f"{BASE_URL}/api/privacy/sessions/{session_id}/detect",
       headers=HEADERS,
   )

   # Poll until detection completes
   while True:
       r = requests.get(
           f"{BASE_URL}/api/privacy/sessions/{session_id}",
           headers=HEADERS,
       )
       status = r.json()["session"]["status"]
       if status not in ("pending", "detecting"):
           break
       time.sleep(5)

   # 5. View detection results
   results = requests.get(
       f"{BASE_URL}/api/privacy/sessions/{session_id}/results",
       headers=HEADERS,
   ).json()
   for r in results["results"]:
       print(f"Found {r['pii_type']} in column '{r['column']}' "
             f"({r['instances']} instances, confidence={r['confidence']:.0%})")

   # 6. Apply anonymization
   requests.post(
       f"{BASE_URL}/api/privacy/sessions/{session_id}/transform",
       headers=HEADERS,
   )

   # Poll until transformation completes
   while True:
       r = requests.get(
           f"{BASE_URL}/api/privacy/sessions/{session_id}",
           headers=HEADERS,
       )
       status = r.json()["session"]["status"]
       if status != "transforming":
           break
       time.sleep(5)

   # 7. Download the anonymized dataset
   resp = requests.get(
       f"{BASE_URL}/api/privacy/sessions/{session_id}/download",
       headers=HEADERS,
       allow_redirects=True,
   )
   with open("anonymized_data.csv", "wb") as f:
       f.write(resp.content)
   print("Anonymized dataset saved.")

----

Supported PII Types
-------------------

The tables below list 40 PII types in seven categories: 38 are detected by
pattern matching and two (``name`` and ``address``) by context detection. Each
type is identified by a unique string ID used in detection rules and API
responses.

**Personal Identity**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``name``
     - Personal names (via context detection).
   * - ``ssn``
     - US Social Security Numbers (xxx-xx-xxxx).
   * - ``date_of_birth``
     - Date of birth patterns.
   * - ``passport_us``
     - US passport numbers.
   * - ``passport_uk``
     - UK passport numbers.
   * - ``passport_eu``
     - EU passport numbers.
   * - ``drivers_license_us``
     - US driver's license numbers.
   * - ``national_id_uk``
     - UK National Insurance numbers.
   * - ``national_id_ca``
     - Canadian Social Insurance numbers.
   * - ``ein``
     - Employer Identification Numbers.
   * - ``itin``
     - Individual Taxpayer Identification Numbers.
   * - ``vin``
     - Vehicle Identification Numbers.

**Contact Information**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``email``
     - Email addresses (RFC 5322).
   * - ``phone_us``
     - US phone numbers (xxx-xxx-xxxx, (xxx) xxx-xxxx).
   * - ``phone_intl``
     - International phone numbers (+country code).
   * - ``phone_uk``
     - UK phone numbers.
   * - ``phone_eu``
     - EU phone numbers.
   * - ``address``
     - Physical addresses (via context detection).

**Financial**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``credit_card``
     - Credit card numbers (Visa, Mastercard, Discover).
   * - ``credit_card_amex``
     - American Express card numbers.
   * - ``iban``
     - International Bank Account Numbers.
   * - ``swift_bic``
     - SWIFT/BIC bank codes.
   * - ``routing_number``
     - US bank routing numbers.

**Healthcare (HIPAA)**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``medical_record``
     - Medical record numbers (HIPAA).
   * - ``dea_number``
     - DEA registration numbers.
   * - ``npi``
     - National Provider Identifiers.
   * - ``health_plan_id``
     - Health plan beneficiary numbers.

**Geographic / Location**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``zip_code_us``
     - US ZIP codes (5-digit and ZIP+4).
   * - ``postal_code_uk``
     - UK postal codes.
   * - ``postal_code_ca``
     - Canadian postal codes.
   * - ``latitude``
     - Geographic latitude values.
   * - ``longitude``
     - Geographic longitude values.
   * - ``geo_coordinate``
     - Geographic coordinate pairs.

**Digital Identifiers**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``ip_address``
     - IPv4 addresses.
   * - ``ipv6_address``
     - IPv6 addresses.
   * - ``mac_address``
     - MAC addresses.
   * - ``url``
     - URLs (http/https).
   * - ``uuid``
     - UUID v4 identifiers.

**Cryptocurrency**

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - PII Type ID
     - Description
   * - ``bitcoin_address``
     - Bitcoin wallet addresses.
   * - ``ethereum_address``
     - Ethereum wallet addresses.

----

Transformation Actions Reference
---------------------------------

All eleven transformation actions supported by the Privacy Suite are listed
below. Specify the action ID when creating detection rules. Some actions accept
additional configuration via the ``config`` object.

.. list-table::
   :header-rows: 1
   :widths: 20 45 35

   * - Action ID
     - Description
     - Config Options
   * - ``redact``
     - Replace detected value with ``[REDACTED]``.
     - None.
   * - ``mask``
     - Partial masking (e.g., ``***-**-1234``).
     - ``show_last`` (int) -- number of trailing characters to reveal.
   * - ``hash``
     - One-way SHA-256 hash. Preserves referential integrity across columns so
       the same input always produces the same hash.
     - None.
   * - ``encrypt``
     - AES-256 reversible encryption. The original value can be recovered with
       the encryption key.
     - Requires an encryption key configured in policy settings.
   * - ``pseudonymize``
     - Replace with consistent fake values (e.g., fake names, emails). The same
       input always maps to the same pseudonym within a session.
     - None.
   * - ``tokenize``
     - Replace with random tokens. A lookup table is maintained so tokens can
       be reversed if needed.
     - None.
   * - ``generalize``
     - Reduce precision of the value.
     - ``to`` (string) -- target granularity: ``"year"``, ``"month"``,
       ``"age_range"``, or ``"region"``.
   * - ``suppress``
     - Remove the column entirely from output.
     - None.
   * - ``truncate``
     - Shorten values to a fixed length.
     - ``length`` (int) -- maximum character length.
   * - ``k_anonymize``
     - Apply k-anonymity grouping. Rows are grouped so that each combination of
       quasi-identifiers appears at least *k* times.
     - ``k`` (int) -- minimum group size.
   * - ``differential_privacy``
     - Add calibrated statistical noise to numeric values.
     - ``epsilon`` (float) -- privacy budget. Lower values provide stronger
       privacy but more noise.

----

Compliance Profiles
-------------------

The Privacy Suite ships with seven built-in compliance profiles. Each profile
pre-selects the PII types mandated by the corresponding regulation and assigns
a recommended default action.

.. list-table::
   :header-rows: 1
   :widths: 12 25 63

   * - Profile ID
     - Regulation
     - Scope
   * - ``hipaa``
     - HIPAA (Health Insurance Portability and Accountability Act)
     - Targets all 18 PHI identifier categories: names, geographic data, dates,
       phone numbers, fax numbers, email addresses, SSNs, medical record
       numbers, health plan IDs, account numbers, certificate / license
       numbers, vehicle IDs, device serial numbers, URLs, IP addresses,
       biometric identifiers, photographs, and other unique identifiers.
   * - ``gdpr``
     - GDPR (EU General Data Protection Regulation 2016/679)
     - All personal data as defined by the regulation: names, emails, IP
       addresses, location data, online identifiers, genetic data, and
       biometric data.
   * - ``pci_dss``
     - PCI DSS v4.0 (Payment Card Industry Data Security Standard)
     - Card numbers, CVVs, cardholder names, expiration dates, PINs, and
       authentication data.
   * - ``ccpa``
     - CCPA (California Consumer Privacy Act)
     - Broad PII scope: personal identifiers, commercial information,
       biometrics, internet activity, geolocation, and professional
       information.
   * - ``financial``
     - General Financial Compliance
     - Account numbers, routing numbers, IBAN, SWIFT/BIC codes, EINs, and tax
       identification numbers.
   * - ``sox``
     - SOX (Sarbanes-Oxley Act)
     - Financial records, audit trails, and internal control documentation.
   * - ``glba``
     - GLBA (Gramm-Leach-Bliley Act)
     - Consumer financial data including account numbers, SSNs, income records,
       and credit history.

In addition to these built-in profiles, you can select ``custom`` to create a
profile with any combination of PII types and actions.

----

Detection Confidence Levels
----------------------------

Each PII detection is assigned a confidence score between ``0.0`` and ``1.0``
that reflects how certain the detector is about the match.

**Confidence Score Ranges**

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Score
     - Detection Method
   * - ``0.95``
     - Exact column name match (e.g., a column named ``email`` or ``ssn``).
   * - ``0.85``
     - Contains-keyword match (e.g., a column named ``customer_email_addr``).
   * - ``0.80``
     - Prefix or suffix match (e.g., a column ending in ``_phone``).
   * - ``0.60 -- 0.95``
     - Pattern / regex match. The exact score varies by pattern specificity.

The default confidence threshold is **0.8 (80%)**. Detections below this
threshold are not reported. You can lower the threshold in the policy settings:

.. code-block:: json

   {
     "settings": {
       "confidence_threshold": 0.7
     }
   }

**Violation Severity Levels**

When a compliance profile is active, each unresolved detection is classified
into one of three severity levels:

.. list-table::
   :header-rows: 1
   :widths: 15 85

   * - Severity
     - Meaning
   * - ``error``
     - Critical compliance violation that must be resolved before the dataset
       can be considered compliant.
   * - ``warning``
     - Non-critical issue that should be reviewed. May indicate a borderline
       detection or a lower-priority PII type.
   * - ``info``
     - Informational finding. Logged for audit purposes but does not affect the
       compliance score.

----

.. seealso::

   - :doc:`datasets` -- Uploading datasets for PII scanning.
   - :doc:`deployments` -- Attaching privacy policies to deployments.
   - :doc:`reports` -- Generating compliance reports.
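
The bucket boundaries of the ``confidence`` filter on the Detection Results
endpoint and the default threshold from Detection Confidence Levels can be
reproduced on the client side. The following is a minimal sketch and not part
of the API: ``confidence_bucket`` and ``summarize`` are hypothetical helper
names, and the last two sample rows are invented for illustration (only the
first two mirror the documented Detection Results response).

.. code-block:: python

   DEFAULT_THRESHOLD = 0.8  # documented default confidence threshold


   def confidence_bucket(score):
       """Map a score to the documented bucket: high >= 0.9, low < 0.7."""
       if score >= 0.9:
           return "high"
       if score >= 0.7:
           return "medium"
       return "low"


   def summarize(results, threshold=DEFAULT_THRESHOLD):
       """Group column names by bucket, skipping detections below threshold."""
       buckets = {"high": [], "medium": [], "low": []}
       for row in results:
           if row["confidence"] < threshold:
               continue  # mirrors "detections below the threshold are not reported"
           buckets[confidence_bucket(row["confidence"])].append(row["column"])
       return buckets


   results = [
       {"column": "customer_email", "pii_type": "email", "confidence": 0.98},
       {"column": "customer_name", "pii_type": "name", "confidence": 0.92},
       # Hypothetical rows, not taken from the documented response:
       {"column": "contact_phone", "pii_type": "phone_us", "confidence": 0.85},
       {"column": "notes", "pii_type": "zip_code_us", "confidence": 0.65},
   ]

   print(summarize(results))                 # default threshold of 0.8
   print(summarize(results, threshold=0.6))  # lowered threshold

With the default threshold of 0.8 the ``low`` bucket stays empty, because every
score below 0.7 is also below the threshold. Lowering the threshold to 0.6 is
what lets the fourth row appear under ``low``.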