========================
7. Privacy and SynthGen
========================

Goal
====

Protect sensitive data with policy-driven transformations and create synthetic datasets for safe experimentation.

Privacy workflow
================

1. Open ``Privacy``.
2. Create or select a privacy policy.
3. Attach policy to dataset/session.
4. Run detection and review discovered PII entities.
5. Configure transformation action per rule (mask/redact/hash/encrypt/generalize/suppress).
6. Execute transform and export protected dataset.

SynthGen workflow
=================

1. Open ``SynthGen``.
2. Select source dataset and generation profile.
3. Configure sample size and optional constraints.
4. Start generation job.
5. Download synthetic dataset artifact on completion.

Data quality and parity checks
==============================

1. Schema parity:

   - Column count and names are as expected.

2. Privacy parity:

   - Original sensitive values are not exposed in transformed output.

3. Utility parity:

   - Basic distribution trends remain usable for modeling intent.

4. Integrity:

   - Output file is readable and row count matches configured generation target.

Functional validation checklist
===============================

1. Privacy detection returns non-empty findings when PII exists.
2. Transform run produces downloadable artifact.
3. SynthGen run reaches terminal ``completed`` state.
4. Generated dataset can be loaded into Datasets/Experiments modules.
5. Error states surface actionable diagnostics.

Expected result
===============

1. Sensitive data is protected according to selected policy.
2. Synthetic data can be used for prototyping and tests.

Common errors and recovery
==========================

1. No PII findings when expected:

   - Verify selected policy/rules and source columns.

2. Transform failure:

   - Retry with narrower rule scope and inspect logs.

3. SynthGen timeout:

   - Reduce generation size and rerun.

Screenshots
===========

.. figure:: ../../_static/user-manual/common/privacy.png
   :alt: Privacy policy and detection workflow
   :width: 90%

   Privacy module with policy execution flow.

.. figure:: ../../_static/user-manual/common/synthgen.png
   :alt: Synthetic data generation workflow
   :width: 90%

   SynthGen training/generation and artifact output view.
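
The schema, privacy, and integrity items under "Data quality and parity checks" can also be spot-checked offline on an exported file. The following is a minimal sketch, not part of the product: it assumes the artifact is a CSV file, and every name in it (``load_csv``, ``leaked_values``, ``run_parity_checks``, the sample data) is hypothetical and chosen only for illustration. It uses the Python standard library only.

.. code-block:: python

   import csv
   import io

   # Made-up sample data standing in for a source dataset and its protected export.
   ORIGINAL = "name,email,age\nAda,ada@example.com,36\nLin,lin@example.com,41\n"
   PROTECTED = "name,email,age\nA**,[REDACTED],36\nL**,[REDACTED],41\n"


   def load_csv(csv_text):
       """Parse CSV text and return (column names, list of row dicts)."""
       reader = csv.DictReader(io.StringIO(csv_text))
       rows = list(reader)
       return list(reader.fieldnames or []), rows


   def leaked_values(original_rows, output_rows, sensitive_columns):
       """Return original sensitive values that reappear verbatim in the output.

       Only exact cell matches are detected; partial or substring leaks are not.
       """
       sensitive = {
           row[col]
           for row in original_rows
           for col in sensitive_columns
           if row.get(col)
       }
       seen = {value for row in output_rows for value in row.values()}
       return sorted(sensitive & seen)


   def run_parity_checks(original_text, output_text, sensitive_columns, target_rows):
       """Run the schema, privacy, and integrity checks and return a report dict."""
       orig_cols, orig_rows = load_csv(original_text)
       out_cols, out_rows = load_csv(output_text)
       return {
           # Schema parity: same column names in the same order.
           "schema_parity": out_cols == orig_cols,
           # Privacy parity: no original sensitive value survives unchanged.
           "privacy_parity": not leaked_values(orig_rows, out_rows, sensitive_columns),
           # Integrity: the file parsed and the row count matches the target.
           "integrity": len(out_rows) == target_rows,
       }


   report = run_parity_checks(ORIGINAL, PROTECTED, ["name", "email"], target_rows=2)
   print(report)

Utility parity (whether distribution trends remain usable) is deliberately left out of this sketch, because a suitable comparison depends on the modeling intent and the column types involved.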