Privacy Suite API

The Privacy Suite detects and transforms personally identifiable information (PII) in datasets. It supports 72+ PII types, four compliance profiles (HIPAA, GDPR, PCI-DSS, CCPA), and seven transformation actions.

The typical workflow has five steps: create a policy (with rules), create a session (linking the policy to a dataset), run detection, apply the transformations, and download the result.

All endpoints are prefixed with /api/privacy.


Create Policy

POST /api/privacy/policies

Create a privacy policy that defines which PII types to detect and how to handle them.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| project_id | string | Yes | UUID of the project. |
| name | string | Yes | Policy name. |
| description | string | No | Optional description. |
| profile | string | No | Compliance profile: hipaa, gdpr, pci_dss, ccpa, or custom. |
| status | string | No | draft (default) or active. |
| settings | object | No | Additional configuration (e.g. sensitivity thresholds). |

Example

curl -X POST "$BASE_URL/api/privacy/policies" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
    "name": "HIPAA Compliance Policy",
    "description": "Detect and redact all 18 PHI identifiers",
    "profile": "hipaa",
    "status": "active"
  }'

import requests

BASE_URL = "http://localhost:8888"  # adjust to your deployment

resp = requests.post(f"{BASE_URL}/api/privacy/policies", headers={
    "Authorization": "Bearer YOUR_API_KEY",
}, json={
    "project_id": "d4e5f6a7-b8c9-0123-def4-567890123456",
    "name": "HIPAA Compliance Policy",
    "description": "Detect and redact all 18 PHI identifiers",
    "profile": "hipaa",
    "status": "active",
})
resp.raise_for_status()  # fail fast on 4xx/5xx responses
policy_id = resp.json()["policy_id"]

Response 201 Created

{
  "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
  "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890"
}
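
The request-body table above can be mirrored in a small client-side helper. The sketch below is illustrative only: build_policy_payload and the two ALLOWED_* sets are hypothetical names, and the accepted values are simply copied from the field table rather than confirmed against the server.

```python
# Illustrative helper (not part of the Privacy Suite): assemble and sanity-check
# the JSON body for POST /api/privacy/policies before sending it.
ALLOWED_PROFILES = {"hipaa", "gdpr", "pci_dss", "ccpa", "custom"}
ALLOWED_STATUSES = {"draft", "active"}


def build_policy_payload(project_id, name, description=None, profile=None,
                         status=None, settings=None):
    """Return a request body containing only the fields that were supplied."""
    if not project_id or not name:
        raise ValueError("project_id and name are required")
    if profile is not None and profile not in ALLOWED_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    if status is not None and status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")

    payload = {"project_id": project_id, "name": name}
    optional = {"description": description, "profile": profile,
                "status": status, "settings": settings}
    # Omit unset optional fields so the server applies its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_policy_payload(
    "d4e5f6a7-b8c9-0123-def4-567890123456",
    "HIPAA Compliance Policy",
    profile="hipaa",
    status="active",
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post, as in the example above.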

List Policies

GET /api/privacy/policies

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | string | | Filter by project. If omitted, returns all accessible policies. |
| status | string | | Filter by status (draft, active). |
| limit | integer | 50 | Max items. |
| offset | integer | 0 | Pagination offset. |

Example

curl "$BASE_URL/api/privacy/policies?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "items": [
    {
      "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
      "name": "HIPAA Compliance Policy",
      "profile": "hipaa",
      "status": "active",
      "rules_count": 5,
      "compliance_score": 100,
      "detected_count": 4,
      "protected_count": 4,
      "pending_count": 0,
      "created_at": "2026-02-20T09:00:00Z"
    }
  ],
  "total": 1,
  "limit": 50,
  "offset": 0
}
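
The limit, offset, and total fields make it possible to walk through all pages of a list endpoint. The following is a minimal sketch, assuming the response shape shown above; iter_items and fake_fetch are hypothetical names, and the fake fetcher stands in for the HTTP call so that the example runs offline.

```python
from typing import Callable, Dict, Iterator


def iter_items(fetch_page: Callable[[int, int], Dict], limit: int = 50) -> Iterator[dict]:
    """Yield every item from a limit/offset-paginated list endpoint."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop when the page is empty or the reported total has been reached.
        if not items or offset >= page["total"]:
            break


# Stand-in for the HTTP call; in practice fetch_page would issue
# GET /api/privacy/policies with limit and offset as query parameters.
_FAKE_POLICIES = [{"id": f"policy-{i}"} for i in range(5)]


def fake_fetch(limit: int, offset: int) -> Dict:
    return {
        "items": _FAKE_POLICIES[offset:offset + limit],
        "total": len(_FAKE_POLICIES),
        "limit": limit,
        "offset": offset,
    }


policy_ids = [p["id"] for p in iter_items(fake_fetch, limit=2)]
print(policy_ids)
```

The same generator should work for GET /api/privacy/sessions, which returns the same items/total/limit/offset envelope.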

Get Policy

GET /api/privacy/policies/{policy_id}

Return policy details including its rules and compliance metrics.

Example

curl "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "policy": {
    "id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
    "name": "HIPAA Compliance Policy",
    "profile": "hipaa",
    "project_name": "Fraud Detection v2",
    "rules_count": 5,
    "compliance_score": 100
  },
  "rules": [
    {
      "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
      "pii_type": "email",
      "action": "redact",
      "is_enabled": true,
      "priority": 10,
      "config": {}
    }
  ]
}

Update Policy

PATCH /api/privacy/policies/{policy_id}

Partial update of a policy. Only provided fields are modified.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Updated name. |
| description | string | No | Updated description. |
| status | string | No | draft or active. |
| profile | string | No | Updated compliance profile. |
| settings | object | No | Updated configuration. |

Example

curl -X PATCH "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"status": "active"}'

Response 200 OK

{
  "ok": true
}

Delete Policy

DELETE /api/privacy/policies/{policy_id}

Permanently delete a privacy policy and its rules.

Example

curl -X DELETE "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "ok": true
}

Create Detection Rule

POST /api/privacy/policies/{policy_id}/rules

Add a PII detection and transformation rule to a policy.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| pii_type | string | Yes | PII type identifier (e.g. email, ssn, credit_card). Use GET /api/privacy/pii-types for the full list. |
| action | string | Yes | Transformation action: mask, hash, redact, encrypt, generalize, suppress, or pseudonymize. |
| is_enabled | boolean | No | Whether this rule is active (default true). |
| priority | integer | No | Higher priority rules take precedence (default 0). |
| config | object | No | Action-specific configuration (e.g. mask character, hash algorithm). |

Example

curl -X POST "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pii_type": "email",
    "action": "redact",
    "is_enabled": true,
    "priority": 10
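  }'

import requests

# BASE_URL is the server's base URL, e.g. "http://localhost:8888".
resp = requests.post(
    f"{BASE_URL}/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "pii_type": "email",
        "action": "redact",
        "is_enabled": True,
        "priority": 10,
    },
)
resp.raise_for_status()  # fail fast on 4xx/5xx responses
rule_id = resp.json()["rule_id"]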

Response 201 Created

{
  "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
  "rule_id": "2b3c4d5e-6f7a-8901-bcde-f23456789012"
}

List Rules

GET /api/privacy/policies/{policy_id}/rules

Return all rules for a policy.

Example

curl "$BASE_URL/api/privacy/policies/1a2b3c4d-5e6f-7890-abcd-ef1234567890/rules" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "rules": [
    {
      "id": "2b3c4d5e-6f7a-8901-bcde-f23456789012",
      "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
      "pii_type": "email",
      "action": "redact",
      "is_enabled": true,
      "enabled": true,
      "priority": 10,
      "config": {}
    },
    {
      "id": "3c4d5e6f-7a8b-9012-cdef-345678901234",
      "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
      "pii_type": "ssn",
      "action": "hash",
      "is_enabled": true,
      "enabled": true,
      "priority": 20,
      "config": {"algorithm": "sha256"}
    }
  ]
}
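
The priority and is_enabled fields determine which rule applies when several rules exist. The source states only that higher-priority rules take precedence; the sketch below assumes that precedence is resolved per PII type and that disabled rules are ignored. effective_rules is a hypothetical client-side helper, not an API call.

```python
def effective_rules(rules):
    """Map each pii_type to its highest-priority enabled rule."""
    chosen = {}
    for rule in rules:
        if not rule.get("is_enabled", True):
            continue  # disabled rules never apply
        current = chosen.get(rule["pii_type"])
        # Strict comparison: on a priority tie the rule seen first is kept.
        if current is None or rule.get("priority", 0) > current.get("priority", 0):
            chosen[rule["pii_type"]] = rule
    return chosen


# Sample rules in the shape returned by GET /api/privacy/policies/{policy_id}/rules.
rules = [
    {"pii_type": "email", "action": "redact", "is_enabled": True, "priority": 10},
    {"pii_type": "email", "action": "mask", "is_enabled": True, "priority": 5},
    {"pii_type": "ssn", "action": "hash", "is_enabled": True, "priority": 20},
    {"pii_type": "phone", "action": "mask", "is_enabled": False, "priority": 30},
]
selected = effective_rules(rules)
for pii_type, rule in selected.items():
    print(pii_type, "->", rule["action"])
```

How the server breaks ties between equal priorities is not documented here, so the first-seen behavior in this sketch is an arbitrary choice.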

Create Session

POST /api/privacy/sessions

Create a privacy session that links a policy to a dataset version for PII detection and transformation.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| policy_id | string | Yes | UUID of the privacy policy to apply. |
| dataset_id | string | No* | UUID of the dataset (uses latest version). |
| dataset_version_id | string | No* | UUID of a specific dataset version. |
| project_id | string | No | Must match the policy’s project if provided. |
| sample_size | integer | No | Number of rows to sample for detection (null = all rows). |
| deep_scan | boolean | No | Enable deep scan mode for more thorough detection (default false). |

* Provide either dataset_id or dataset_version_id.

Example

curl -X POST "$BASE_URL/api/privacy/sessions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "policy_id": "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
    "dataset_id": "e5f6a7b8-c9d0-1234-5678-90abcdef1234",
    "deep_scan": true
  }'

Response 201 Created

{
  "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
  "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456"
}
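
The either/or requirement on dataset_id and dataset_version_id can be checked before the request is sent. This is a sketch under one assumption: it rejects requests that supply both identifiers, which is stricter than the documentation states. build_session_payload is a hypothetical helper name.

```python
def build_session_payload(policy_id, dataset_id=None, dataset_version_id=None,
                          project_id=None, sample_size=None, deep_scan=False):
    """Assemble the body for POST /api/privacy/sessions."""
    # The field table asks for either dataset_id or dataset_version_id; this
    # sketch rejects both-or-neither (an assumption, see the note above).
    if (dataset_id is None) == (dataset_version_id is None):
        raise ValueError("provide exactly one of dataset_id or dataset_version_id")
    if sample_size is not None and sample_size <= 0:
        raise ValueError("sample_size must be a positive integer or None")

    payload = {"policy_id": policy_id, "deep_scan": deep_scan}
    if dataset_id is not None:
        payload["dataset_id"] = dataset_id  # server uses the latest version
    else:
        payload["dataset_version_id"] = dataset_version_id
    if project_id is not None:
        payload["project_id"] = project_id
    if sample_size is not None:
        payload["sample_size"] = sample_size  # left out when None (all rows)
    return payload


session_payload = build_session_payload(
    "1a2b3c4d-5e6f-7890-abcd-ef1234567890",
    dataset_id="e5f6a7b8-c9d0-1234-5678-90abcdef1234",
    deep_scan=True,
)
print(session_payload)
```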

List Sessions

GET /api/privacy/sessions

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project_id | string | | Filter by project. |
| policy_id | string | | Filter by policy. |
| limit | integer | 50 | Max items. |
| offset | integer | 0 | Pagination offset. |

Example

curl "$BASE_URL/api/privacy/sessions?project_id=d4e5f6a7-b8c9-0123-def4-567890123456" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "items": [
    {
      "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
      "policy_name": "HIPAA Compliance Policy",
      "dataset_name": "Transactions Q4",
      "status": "completed",
      "pii_found": 4,
      "transformed": 4,
      "created_at": "2026-02-25T10:00:00Z"
    }
  ],
  "total": 1,
  "limit": 50,
  "offset": 0
}

Get Session Detail

GET /api/privacy/sessions/{session_id}

Return session details including scan progress, detection counts, and transformation status.

Example

curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "session": {
    "id": "4d5e6f7a-8b9c-0123-def4-567890123456",
    "policy_name": "HIPAA Compliance Policy",
    "dataset_name": "Transactions Q4",
    "status": "completed",
    "rows_scanned": 50000,
    "columns_analyzed": 15,
    "pii_columns": 4,
    "pii_instances": 200000,
    "transformed": 4,
    "pending": 0,
    "progress": 100,
    "progress_message": "Transformation complete",
    "output_artifact_id": "b4c5d6e7-f8a9-0123-4567-890abcdef123"
  }
}

Run PII Detection

POST /api/privacy/sessions/{session_id}/detect

Enqueue a privacy_detect background job to scan the dataset for PII. The session status transitions to detecting.

Example

curl -X POST "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/detect" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "job_id": "c5d6e7f8-a9b0-1234-5678-90abcdef1234",
  "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456",
  "status": "detecting"
}

Apply Anonymization

POST /api/privacy/sessions/{session_id}/transform

Enqueue a privacy_transform background job to apply the transformation rules to detected PII. The session status transitions to transforming and then completed.

Example

curl -X POST "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/transform" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "job_id": "d6e7f8a9-b0c1-2345-6789-0abcdef12345",
  "session_id": "4d5e6f7a-8b9c-0123-def4-567890123456",
  "status": "transforming"
}
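
Because detection and transformation both run as background jobs, clients typically poll the session until its status leaves the in-progress state. The helper below is a minimal sketch of such a loop with a timeout; wait_while is a hypothetical name, and the scripted status sequence replaces repeated GET /api/privacy/sessions/{session_id} calls so that the example runs offline.

```python
import time


def wait_while(get_status, busy_states, timeout=600.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it leaves busy_states; return the final status."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status not in busy_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)


# Offline demonstration: the session reports "transforming" twice and then
# "completed", the two statuses documented for the transform job.
_statuses = iter(["transforming", "transforming", "completed"])
final_status = wait_while(lambda: next(_statuses), {"transforming"},
                          sleep=lambda seconds: None)
print(final_status)
```

In real use, get_status would fetch the session and return its status field, and busy_states would be {"pending", "detecting"} after calling the detect endpoint.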

Detection Results

GET /api/privacy/sessions/{session_id}/results

Return aggregated PII detection results grouped by column and PII type.

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| pii_type | string | | Filter by PII type (e.g. email). |
| confidence | string | | Filter by confidence bucket: high (>= 0.9), medium (>= 0.7 and < 0.9), or low (< 0.7). |

Example

curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/results" \
  -H "Authorization: Bearer YOUR_API_KEY"

import requests

# BASE_URL is the server's base URL, e.g. "http://localhost:8888".
resp = requests.get(
    f"{BASE_URL}/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/results",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
resp.raise_for_status()  # fail fast on 4xx/5xx responses
for r in resp.json()["results"]:
    print(f"  {r['column']}: {r['pii_type']} (confidence={r['confidence']:.2f}, "
          f"instances={r['instances']}, action={r['action']})")

Response 200 OK

{
  "results": [
    {
      "column": "customer_email",
      "pii_type": "email",
      "confidence": 0.98,
      "instances": 49500,
      "sample": "alice@example.com",
      "action": "redact",
      "transformed": true,
      "samples": []
    },
    {
      "column": "customer_name",
      "pii_type": "name",
      "confidence": 0.92,
      "instances": 48000,
      "sample": "Alice Chen",
      "action": "mask",
      "transformed": true,
      "samples": []
    }
  ]
}
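
The confidence buckets accepted by the confidence query parameter can also be applied client-side to results that have already been fetched. The sketch below assumes the boundaries high >= 0.9, 0.7 <= medium < 0.9, and low < 0.7; the function names and the third sample row are hypothetical.

```python
def confidence_bucket(score: float) -> str:
    """Classify a confidence score into the high/medium/low buckets."""
    if score >= 0.9:
        return "high"
    if score >= 0.7:
        return "medium"
    return "low"


def group_by_bucket(results):
    """Group column names from a results payload by confidence bucket."""
    grouped = {"high": [], "medium": [], "low": []}
    for r in results:
        grouped[confidence_bucket(r["confidence"])].append(r["column"])
    return grouped


results = [
    {"column": "customer_email", "pii_type": "email", "confidence": 0.98},
    {"column": "customer_name", "pii_type": "name", "confidence": 0.92},
    {"column": "notes", "pii_type": "phone", "confidence": 0.74},  # hypothetical row
]
grouped = group_by_bucket(results)
print(grouped)
```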

Download Anonymized Data

GET /api/privacy/sessions/{session_id}/download

Download the anonymized dataset. Redirects (302) to the artifact download URL.

Returns 404 if the transformation has not been completed yet.

Example

curl -L -o anonymized_data.csv \
  "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/download" \
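  -H "Authorization: Bearer YOUR_API_KEY"

import requests

# BASE_URL is the server's base URL, e.g. "http://localhost:8888".
resp = requests.get(
    f"{BASE_URL}/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/download",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    allow_redirects=True,  # follow the 302 to the artifact download URL
)
resp.raise_for_status()  # a 404 here means the transformation has not completed
with open("anonymized_data.csv", "wb") as f:
    f.write(resp.content)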

Response 302 Found

Redirects to /api/artifacts/{artifact_id}/download.


Generate Privacy Report

GET /api/privacy/sessions/{session_id}/report

Generate a branded PDF report summarizing the privacy session: policy details, detected PII, applied transformations, and compliance status. The response contains the report's artifact ID and a URL from which the PDF can be downloaded.

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ai_insights | boolean | false | Include AI-generated commentary. |

Example

curl "$BASE_URL/api/privacy/sessions/4d5e6f7a-8b9c-0123-def4-567890123456/report?ai_insights=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "artifact_id": "e7f8a9b0-c1d2-3456-7890-abcdef123456",
  "download_url": "/api/artifacts/e7f8a9b0-c1d2-3456-7890-abcdef123456/download"
}

List PII Types

GET /api/privacy/pii-types

Return the full list of supported PII types. The list contains more than 72 types covering names, addresses, phone numbers, financial identifiers, medical records, biometric data, and more.

Example

curl "$BASE_URL/api/privacy/pii-types" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "types": [
    {"id": "email", "name": "Email"},
    {"id": "phone", "name": "Phone"},
    {"id": "ssn", "name": "Ssn"},
    {"id": "credit_card", "name": "Credit Card"},
    {"id": "name", "name": "Name"},
    {"id": "address", "name": "Address"},
    {"id": "date_of_birth", "name": "Date Of Birth"},
    {"id": "ip_address", "name": "Ip Address"},
    {"id": "medical_record_number", "name": "Medical Record Number"}
  ]
}

List Actions

GET /api/privacy/actions

Return all supported transformation actions.

Example

curl "$BASE_URL/api/privacy/actions" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "actions": [
    {"id": "mask", "name": "Mask"},
    {"id": "hash", "name": "Hash"},
    {"id": "redact", "name": "Redact"},
    {"id": "encrypt", "name": "Encrypt"},
    {"id": "generalize", "name": "Generalize"},
    {"id": "suppress", "name": "Suppress"},
    {"id": "pseudonymize", "name": "Pseudonymize"}
  ]
}

List Compliance Profiles

GET /api/privacy/profiles

Return available compliance profiles with their default PII types and recommended actions.

Example

curl "$BASE_URL/api/privacy/profiles" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response 200 OK

{
  "profiles": [
    {
      "id": "hipaa",
      "name": "HIPAA",
      "description": "Health Insurance Portability and Accountability Act - 18 PHI identifiers",
      "pii_types": ["name", "address", "date", "phone", "fax", "email", "ssn",
                    "medical_record_number", "health_plan_id", "account_number",
                    "certificate_license", "vehicle_id", "device_id", "url",
                    "ip_address", "biometric", "photo", "other_unique_id"],
      "default_action": "redact"
    },
    {
      "id": "gdpr",
      "name": "GDPR",
      "description": "General Data Protection Regulation - EU personal data protection",
      "pii_types": ["name", "email", "phone", "address", "date_of_birth",
                    "national_id", "passport", "ip_address", "cookie_id",
                    "location", "biometric", "genetic", "health", "political",
                    "religious", "sexual_orientation", "trade_union"],
      "default_action": "pseudonymize"
    },
    {
      "id": "pci_dss",
      "name": "PCI-DSS",
      "description": "Payment Card Industry Data Security Standard",
      "pii_types": ["credit_card", "cvv", "pin", "magnetic_stripe",
                    "cardholder_name", "expiration_date", "service_code"],
      "default_action": "tokenize"
    },
    {
      "id": "ccpa",
      "name": "CCPA",
      "description": "California Consumer Privacy Act",
      "pii_types": ["name", "address", "email", "phone", "ssn", "drivers_license",
                    "passport", "ip_address", "browsing_history", "purchase_history",
                    "biometric", "geolocation", "employment_info", "education_info"],
      "default_action": "redact"
    },
    {
      "id": "custom",
      "name": "Custom",
      "description": "Create your own custom privacy profile",
      "pii_types": [],
      "default_action": "mask"
    }
  ]
}
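
A profile entry can be turned into rule payloads for POST /api/privacy/policies/{policy_id}/rules. The following sketch assumes that each profile PII type becomes one rule using the profile's default_action; whether every profile PII type ID is accepted by the rules endpoint is not confirmed by this document. rules_from_profile and RULE_ACTIONS are hypothetical names. The action check uses the seven actions listed in the rule field table, so a profile whose default_action falls outside that list (the pci_dss example shows tokenize) needs an explicit override.

```python
# The seven actions listed for the "action" field of a detection rule.
RULE_ACTIONS = {"mask", "hash", "redact", "encrypt",
                "generalize", "suppress", "pseudonymize"}


def rules_from_profile(profile, action=None, priority=0):
    """Build one rule payload per PII type in a compliance profile."""
    chosen = action or profile["default_action"]
    if chosen not in RULE_ACTIONS:
        raise ValueError(f"{chosen!r} is not a documented rule action; "
                         "pass action=... to override it")
    return [
        {"pii_type": pii_type, "action": chosen,
         "is_enabled": True, "priority": priority}
        for pii_type in profile["pii_types"]
    ]


# Abridged copy of the gdpr entry from the response above.
gdpr = {"id": "gdpr", "pii_types": ["name", "email", "phone"],
        "default_action": "pseudonymize"}
rule_payloads = rules_from_profile(gdpr, priority=10)
for body in rule_payloads:
    print(body)
```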

Full Workflow Example

The following demonstrates the complete PII detection and anonymization workflow:

import requests
import time

BASE_URL = "http://localhost:8888"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}
PROJECT_ID = "d4e5f6a7-b8c9-0123-def4-567890123456"
DATASET_ID = "e5f6a7b8-c9d0-1234-5678-90abcdef1234"

# 1. Create a HIPAA policy
resp = requests.post(f"{BASE_URL}/api/privacy/policies", headers=HEADERS, json={
    "project_id": PROJECT_ID,
    "name": "HIPAA Policy",
    "profile": "hipaa",
    "status": "active",
})
policy_id = resp.json()["policy_id"]

# 2. Add detection rules
for pii_type in ["email", "ssn", "phone", "name"]:
    rule_resp = requests.post(
        f"{BASE_URL}/api/privacy/policies/{policy_id}/rules",
        headers=HEADERS,
        json={"pii_type": pii_type, "action": "redact", "is_enabled": True},
    )
    rule_resp.raise_for_status()  # stop if a rule could not be created

# 3. Create a session linking the policy to the dataset
resp = requests.post(f"{BASE_URL}/api/privacy/sessions", headers=HEADERS, json={
    "policy_id": policy_id,
    "dataset_id": DATASET_ID,
    "deep_scan": True,
})
session_id = resp.json()["session_id"]

# 4. Run PII detection
requests.post(
    f"{BASE_URL}/api/privacy/sessions/{session_id}/detect",
    headers=HEADERS,
)

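# Poll until detection completes (give up after 10 minutes)
deadline = time.monotonic() + 600
while True:
    r = requests.get(
        f"{BASE_URL}/api/privacy/sessions/{session_id}",
        headers=HEADERS,
    )
    r.raise_for_status()
    status = r.json()["session"]["status"]
    if status not in ("pending", "detecting"):
        break
    if time.monotonic() > deadline:
        raise TimeoutError("PII detection did not finish within 10 minutes")
    time.sleep(5)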

# 5. View detection results
results = requests.get(
    f"{BASE_URL}/api/privacy/sessions/{session_id}/results",
    headers=HEADERS,
).json()
for r in results["results"]:
    print(f"Found {r['pii_type']} in column '{r['column']}' "
          f"({r['instances']} instances, confidence={r['confidence']:.0%})")

# 6. Apply anonymization
requests.post(
    f"{BASE_URL}/api/privacy/sessions/{session_id}/transform",
    headers=HEADERS,
)

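# Poll until transformation completes (give up after 10 minutes)
deadline = time.monotonic() + 600
while True:
    r = requests.get(
        f"{BASE_URL}/api/privacy/sessions/{session_id}",
        headers=HEADERS,
    )
    r.raise_for_status()
    status = r.json()["session"]["status"]
    if status != "transforming":
        break
    if time.monotonic() > deadline:
        raise TimeoutError("Transformation did not finish within 10 minutes")
    time.sleep(5)
if status != "completed":
    raise RuntimeError(f"Transformation ended with status {status!r}")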

# 7. Download the anonymized dataset
resp = requests.get(
    f"{BASE_URL}/api/privacy/sessions/{session_id}/download",
    headers=HEADERS,
    allow_redirects=True,
)
with open("anonymized_data.csv", "wb") as f:
    f.write(resp.content)
print("Anonymized dataset saved.")

Supported PII Types

The Privacy Suite supports 38 pattern-detected PII types, plus two types identified through context detection (name and address), organized into seven categories. Each type is identified by a unique string ID used in detection rules and API responses.

Personal Identity

| PII Type ID | Description |
| --- | --- |
| name | Personal names (via context detection). |
| ssn | US Social Security Numbers (xxx-xx-xxxx). |
| date_of_birth | Date of birth patterns. |
| passport_us | US passport numbers. |
| passport_uk | UK passport numbers. |
| passport_eu | EU passport numbers. |
| drivers_license_us | US driver’s license numbers. |
| national_id_uk | UK National Insurance numbers. |
| national_id_ca | Canadian Social Insurance numbers. |
| ein | Employer Identification Numbers. |
| itin | Individual Taxpayer Identification Numbers. |
| vin | Vehicle Identification Numbers. |

Contact Information

| PII Type ID | Description |
| --- | --- |
| email | Email addresses (RFC 5322). |
| phone_us | US phone numbers (xxx-xxx-xxxx, (xxx) xxx-xxxx). |
| phone_intl | International phone numbers (+country code). |
| phone_uk | UK phone numbers. |
| phone_eu | EU phone numbers. |
| address | Physical addresses (via context detection). |

Financial

| PII Type ID | Description |
| --- | --- |
| credit_card | Credit card numbers (Visa, Mastercard, Discover). |
| credit_card_amex | American Express card numbers. |
| iban | International Bank Account Numbers. |
| swift_bic | SWIFT/BIC bank codes. |
| routing_number | US bank routing numbers. |

Healthcare (HIPAA)

| PII Type ID | Description |
| --- | --- |
| medical_record | Medical record numbers (HIPAA). |
| dea_number | DEA registration numbers. |
| npi | National Provider Identifiers. |
| health_plan_id | Health plan beneficiary numbers. |

Geographic / Location

| PII Type ID | Description |
| --- | --- |
| zip_code_us | US ZIP codes (5-digit and ZIP+4). |
| postal_code_uk | UK postal codes. |
| postal_code_ca | Canadian postal codes. |
| latitude | Geographic latitude values. |
| longitude | Geographic longitude values. |
| geo_coordinate | Geographic coordinate pairs. |

Digital Identifiers

| PII Type ID | Description |
| --- | --- |
| ip_address | IPv4 addresses. |
| ipv6_address | IPv6 addresses. |
| mac_address | MAC addresses. |
| url | URLs (http/https). |
| uuid | UUID v4 identifiers. |

Cryptocurrency

| PII Type ID | Description |
| --- | --- |
| bitcoin_address | Bitcoin wallet addresses. |
| ethereum_address | Ethereum wallet addresses. |


Transformation Actions Reference

All eleven transformation actions supported by the Privacy Suite are listed below. Specify the action ID when creating detection rules. Some actions accept additional configuration via the config object.

| Action ID | Description | Config Options |
| --- | --- | --- |
| redact | Replace detected value with [REDACTED]. | None. |
| mask | Partial masking (e.g., ***-**-1234). | show_last (int) – number of trailing characters to reveal. |
| hash | One-way SHA-256 hash. Preserves referential integrity across columns so the same input always produces the same hash. | None. |
| encrypt | AES-256 reversible encryption. The original value can be recovered with the encryption key. | Requires an encryption key configured in policy settings. |
| pseudonymize | Replace with consistent fake values (e.g., fake names, emails). The same input always maps to the same pseudonym within a session. | None. |
| tokenize | Replace with random tokens. A lookup table is maintained so tokens can be reversed if needed. | None. |
| generalize | Reduce precision of the value. | to (string) – target granularity: "year", "month", "age_range", or "region". |
| suppress | Remove the column entirely from output. | None. |
| truncate | Shorten values to a fixed length. | length (int) – maximum character length. |
| k_anonymize | Apply k-anonymity grouping. Rows are grouped so that each combination of quasi-identifiers appears at least k times. | k (int) – minimum group size. |
| differential_privacy | Add calibrated statistical noise to numeric values. | epsilon (float) – privacy budget. Lower values provide stronger privacy but more noise. |
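
To make a few of these actions concrete, the sketch below shows what mask, hash, and the k-anonymity condition mean in plain Python. It is an illustration of the general techniques, not the Privacy Suite's implementation: the function names and the mask_char parameter are hypothetical, the hash is unsalted, and the k-anonymity function only checks the condition rather than regrouping rows.

```python
import hashlib
from collections import Counter


def mask_value(value: str, show_last: int = 4, mask_char: str = "*") -> str:
    """Mask letters and digits, keep separators, reveal the last show_last characters."""
    cutoff = len(value) - show_last
    return "".join(
        mask_char if i < cutoff and ch.isalnum() else ch
        for i, ch in enumerate(value)
    )


def hash_value(value: str) -> str:
    """One-way SHA-256 digest; equal inputs always give equal digests."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())


masked = mask_value("123-45-6789", show_last=4)
print(masked)

# The same input yields the same digest, which is what keeps joins across
# columns working after hashing.
print(hash_value("alice@example.com") == hash_value("alice@example.com"))

rows = [
    {"age_range": "30-39", "zip": "941**"},
    {"age_range": "30-39", "zip": "941**"},
    {"age_range": "40-49", "zip": "100**"},
]
print(is_k_anonymous(rows, ["age_range", "zip"], k=2))
```

In the sample rows, the third row's combination of quasi-identifiers appears only once, so the data would need further generalization or suppression before it satisfies k = 2.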


Compliance Profiles

The Privacy Suite ships with seven built-in compliance profiles. Each profile pre-selects the PII types mandated by the corresponding regulation and assigns a recommended default action.

| Profile ID | Regulation | Scope |
| --- | --- | --- |
| hipaa | HIPAA (Health Insurance Portability and Accountability Act) | Targets all 18 PHI identifier categories: names, geographic data, dates, phone numbers, fax numbers, email addresses, SSNs, medical record numbers, health plan IDs, account numbers, certificate / license numbers, vehicle IDs, device serial numbers, URLs, IP addresses, biometric identifiers, photographs, and other unique identifiers. |
| gdpr | GDPR (EU General Data Protection Regulation 2016/679) | All personal data as defined by the regulation: names, emails, IP addresses, location data, online identifiers, genetic data, and biometric data. |
| pci_dss | PCI DSS v4.0 (Payment Card Industry Data Security Standard) | Card numbers, CVVs, cardholder names, expiration dates, PINs, and authentication data. |
| ccpa | CCPA (California Consumer Privacy Act) | Broad PII scope: personal identifiers, commercial information, biometrics, internet activity, geolocation, and professional information. |
| financial | General Financial Compliance | Account numbers, routing numbers, IBAN, SWIFT/BIC codes, EINs, and tax identification numbers. |
| sox | SOX (Sarbanes-Oxley Act) | Financial records, audit trails, and internal control documentation. |
| glba | GLBA (Gramm-Leach-Bliley Act) | Consumer financial data including account numbers, SSNs, income records, and credit history. |

In addition to these built-in profiles, you can select custom to create a profile with any combination of PII types and actions.


Detection Confidence Levels

Each PII detection is assigned a confidence score between 0.0 and 1.0 that reflects how certain the detector is about the match.

Confidence Score Ranges

| Score | Detection Method |
| --- | --- |
| 0.95 | Exact column name match (e.g., a column named email or ssn). |
| 0.85 | Contains-keyword match (e.g., a column named customer_email_addr). |
| 0.80 | Prefix or suffix match (e.g., a column ending in _phone). |
| 0.60 to 0.95 | Pattern / regex match. The exact score varies by pattern specificity. |

The default confidence threshold is 0.8 (80%). Detections below this threshold are not reported. You can lower the threshold in the policy settings:

{
  "settings": {
    "confidence_threshold": 0.7
  }
}
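
The effect of the threshold can be pictured as a simple filter over detections. The sketch below assumes that a detection exactly at the threshold is still reported, since only detections below it are dropped; the reportable function and the sample detections are hypothetical.

```python
DEFAULT_CONFIDENCE_THRESHOLD = 0.8


def reportable(detections, threshold=DEFAULT_CONFIDENCE_THRESHOLD):
    """Keep detections whose confidence is at or above the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


detections = [
    {"column": "email", "confidence": 0.95},                # exact column name match
    {"column": "customer_email_addr", "confidence": 0.85},  # contains-keyword match
    {"column": "notes", "confidence": 0.75},                # pattern match
    {"column": "misc", "confidence": 0.60},                 # weak pattern match
]

default_columns = [d["column"] for d in reportable(detections)]
lowered_columns = [d["column"] for d in reportable(detections, threshold=0.7)]
print(default_columns)
print(lowered_columns)
```

Lowering the threshold to 0.7 brings the notes column into the report, while the 0.60 detection stays hidden.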

Violation Severity Levels

When a compliance profile is active, each unresolved detection is classified into one of three severity levels:

| Severity | Meaning |
| --- | --- |
| error | Critical compliance violation that must be resolved before the dataset can be considered compliant. |
| warning | Non-critical issue that should be reviewed. May indicate a borderline detection or a lower-priority PII type. |
| info | Informational finding. Logged for audit purposes but does not affect the compliance score. |
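
A client might summarize unresolved detections by severity as sketched below. The shape of a violation record (a dictionary with a severity key) and the summarize_violations function are assumptions made for illustration; this section does not specify how violations are returned by the API.

```python
from collections import Counter

SEVERITIES = ("error", "warning", "info")


def summarize_violations(violations):
    """Count violations per severity and flag whether any blocking errors remain."""
    counts = Counter(v["severity"] for v in violations)
    summary = {severity: counts.get(severity, 0) for severity in SEVERITIES}
    # Only "error" findings must be resolved before the dataset counts as compliant.
    summary["compliant"] = summary["error"] == 0
    return summary


violations = [
    {"column": "ssn", "severity": "error"},
    {"column": "notes", "severity": "warning"},
    {"column": "uuid", "severity": "info"},
]
summary = summarize_violations(violations)
print(summary)
```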

