Can sensitive information be excluded from training data?

Written by Yatheendra Brahmadevera
Updated over a week ago

Direct Answer (TL;DR)

Brilo AI supports knowledge redaction to exclude sensitive information from training data so that private fields are not used when building or updating the Brilo AI voice agent knowledge base. Knowledge redaction is implemented as a configurable filtering step in Brilo AI’s data ingestion and training workflows; when enabled, specified fields or patterns are removed or masked before model updates. Redaction controls can target personally identifiable information (PII), protected health information (PHI), account numbers, and other sensitive text so that it never becomes part of training inputs. For regulated environments, Brilo AI can be configured to skip ingestion of certain documents, apply field-level exclusion rules, and trigger human review before any data is used for training.

  • Can I stop Brilo AI from using customer phone numbers in training? Yes — configure field-level exclusion rules for phone numbers and apply them during ingestion so those values are omitted from training datasets.

  • How do I prevent Brilo AI from learning patient names? Use document or field redaction rules that remove or mask patient name fields before the training step; Brilo AI will then not include those values in the knowledge base.

  • Will Brilo AI remove credit card numbers from training data? Yes. When you enable pattern-based filters (for example, a regex that matches 13- to 19-digit card-like sequences) or configure explicit field exclusions, Brilo AI excludes those values before model updates.
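
As an illustration of what a pattern-based filter for card numbers might look like, here is a minimal Python sketch. It is not Brilo AI's implementation or configuration syntax: the names CARD_CANDIDATE, luhn_valid, and mask_card_numbers are hypothetical, and the sketch assumes card numbers are 13 to 19 digits, optionally separated by spaces or hyphens. A Luhn checksum is added so that arbitrary long numbers are not masked by mistake.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_numbers(text: str, placeholder: str = "[REDACTED_CARD]") -> str:
    """Replace card-like sequences that pass the Luhn check with a placeholder."""

    def _replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return placeholder if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(_replace, text)
```

Sequences that fail the checksum are left untouched in this sketch, so account or policy numbers still need their own field-level or pattern rules. A stricter policy could mask every candidate match regardless of the checksum.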

Why This Question Comes Up (problem context)

Enterprise buyers ask about excluding sensitive information because training data can originate from multiple sources — call transcripts, CRM notes, and support tickets — any of which may contain regulated or confidential fields. Regulated sectors (healthcare, banking, insurance, financial services) require predictable controls over what data is used for model training to meet internal policies and to reduce legal and reputational risk. Buyers need clarity on Brilo AI’s redaction options so they can design ingestion pipelines that align with privacy controls and vendor governance.

How It Works (High-Level)

Brilo AI applies knowledge redaction during data ingestion, before any training step in the voice agent workflow. Typical steps include data intake, field-level filtering, pattern-based masking, human review, and conditional inclusion in the training dataset. Redaction rules can be configured to run automatically or to flag items for manual approval before they are used to update the knowledge base.

Knowledge redaction in Brilo AI is a configurable rule set that prevents specified data from being included in model training. The training dataset is the collection of texts, transcripts, and knowledge items that Brilo AI uses to update the voice agent’s responses. Sensitive field exclusion maps data fields (for example, "ssn" or "patient_name") to redaction actions (remove, mask, or require manual review).
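
The mapping from data fields to redaction actions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Brilo AI's configuration format: the Action enum, the FIELD_RULES dictionary, the "free_text_notes" field, and the apply_field_rules function are all hypothetical names chosen for the example.

```python
from enum import Enum


class Action(Enum):
    REMOVE = "remove"  # drop the field entirely
    MASK = "mask"      # keep the field, replace its value
    REVIEW = "review"  # keep the value but hold the record for manual review


# Hypothetical rule set: field name -> redaction action.
FIELD_RULES = {
    "ssn": Action.REMOVE,
    "patient_name": Action.MASK,
    "free_text_notes": Action.REVIEW,
}


def apply_field_rules(record: dict, rules: dict) -> tuple[dict, bool]:
    """Return (redacted_record, needs_review) without mutating the input."""
    redacted = {}
    needs_review = False
    for field, value in record.items():
        action = rules.get(field)
        if action is Action.REMOVE:
            continue
        if action is Action.MASK:
            redacted[field] = "[REDACTED]"
            continue
        if action is Action.REVIEW:
            needs_review = True
        redacted[field] = value  # unlisted and review fields pass through
    return redacted, needs_review
```

For example, a record with "ssn", "patient_name", and "topic" fields would come back without "ssn", with "patient_name" set to "[REDACTED]", and with "topic" unchanged. A record flagged with needs_review would be held out of the training dataset until a reviewer approves it.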

Related technical terms used in Brilo AI workflows include data ingestion, data filtering, training exclusion, PII (personally identifiable information), PHI (protected health information), pattern masking, and knowledge base ingestion.

Guardrails & Boundaries

Brilo AI’s redaction is a preventive control applied before training; it does not retroactively remove information already embedded in a previously trained model without a retraining or targeted data-removal process. Brilo AI’s redaction rules should not be treated as a substitute for upstream data minimization practices.

Common guardrails:

  • Do not rely on post-training prompts alone to hide sensitive data — configure redaction at ingestion.

  • Avoid ambiguous wildcard rules that may over-delete critical context; prefer explicit field rules plus pattern detection.

  • Use human review where automated filters have high false-positive or false-negative risk.

In Brilo AI, a redaction rule is a configuration object that specifies what to remove, mask, or route for human review during ingestion.

Applied Examples

Healthcare example:

  • A medical call center integrates patient call transcripts into Brilo AI. The team configures redaction rules to remove patient names and unstructured PHI fields during ingestion and to mask patient identifiers in notes. Calls containing unclear PHI patterns are routed for manual review before training inclusion.

Banking / Financial services / Insurance example:

  • An insurer supplies claims notes to Brilo AI. The insurer configures field-level exclusions for account numbers, policy numbers, and credit-card-like sequences. Pattern-based masking replaces numeric sequences that match account formats, and policy-handbook text remains available as non-sensitive training material.

Human Handoff & Escalation

When Brilo AI’s automated redaction rules detect ambiguous or borderline content, you can configure the workflow to:

  • Pause ingestion and create a review task for a human analyst.

  • Route the item to a secure review queue in your operational tooling.

  • Apply a temporary hold that prevents the data from entering the training dataset until approved.

Handoff can be implemented as part of the Brilo AI ingestion pipeline so escalation workflows are auditable and reproducible.
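
One way to picture the temporary hold is a small review queue in which flagged items stay pending until a reviewer decides, and only approved items are released. The Python sketch below is illustrative only: ReviewQueue, Status, and their methods are hypothetical names, not part of any Brilo AI interface. The decisions list stands in for the audit trail.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ReviewQueue:
    """Holds flagged items; only approved ones are released for training."""

    items: dict = field(default_factory=dict)      # item_id -> text
    status: dict = field(default_factory=dict)     # item_id -> Status
    decisions: list = field(default_factory=list)  # (item_id, reviewer, Status)

    def hold(self, item_id: str, text: str) -> None:
        """Place an item on hold so it cannot enter the training dataset."""
        self.items[item_id] = text
        self.status[item_id] = Status.PENDING

    def decide(self, item_id: str, reviewer: str, approve: bool) -> None:
        """Record a reviewer's decision; each item can be decided only once."""
        if self.status.get(item_id) is not Status.PENDING:
            raise ValueError(f"{item_id} is not awaiting review")
        outcome = Status.APPROVED if approve else Status.REJECTED
        self.status[item_id] = outcome
        self.decisions.append((item_id, reviewer, outcome))

    def released(self) -> list:
        """Return the texts that reviewers have approved."""
        return [self.items[i] for i, s in self.status.items() if s is Status.APPROVED]
```

Because pending and rejected items never appear in released(), an item that nobody reviews stays out of training by default.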

Setup Requirements

  1. Identify and document the sensitive fields and patterns you want Brilo AI to exclude (for example, "patient_name", "ssn", or a structured "account_number" field).

  2. Provide Brilo AI with a sample of your data sources (transcripts, CRM exports, support notes) so rule testing can be performed.

  3. Configure field-level exclusion rules and pattern filters in the Brilo AI ingestion configuration or provide mapping files that tag sensitive fields.

  4. Enable pattern-based masking (for example, numeric regex or email detection) where structured fields are not available.

  5. Define review routing: create the human review queue and specify when items should be escalated for manual approval.

  6. Schedule a controlled training run or model update after redaction rules are in place to verify that excluded data is not present in the updated knowledge base.
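
For the verification in step 6, one simple check is to search the updated knowledge base for values that were supposed to be excluded. The Python sketch below assumes you can export knowledge items as plain text and still have the sample source records from step 2. The function name find_excluded_values and its parameters are hypothetical, and the export step itself is not shown.

```python
def find_excluded_values(knowledge_items, source_records, excluded_fields):
    """Return (item_index, field) pairs where an excluded source value still appears."""
    # Collect the known sensitive values from the source sample.
    sensitive = {}
    for record in source_records:
        for field in excluded_fields:
            value = str(record.get(field, "")).strip()
            if value:
                sensitive.setdefault(value.lower(), field)

    # Case-insensitive substring search over each exported item.
    hits = []
    for index, text in enumerate(knowledge_items):
        lowered = text.lower()
        for value, field in sensitive.items():
            if value in lowered:
                hits.append((index, field))
    return hits
```

An empty result means none of the known values were found verbatim. Substring matching can produce false positives for very short values and will miss reformatted values (for example, an SSN written without hyphens), so treat this as a smoke test rather than proof of exclusion.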

If you need help with these steps, contact your Brilo AI customer success representative or open a configuration request in the Brilo AI Console.

Business Outcomes

Properly configured knowledge redaction reduces the chance that regulated or private fields become embedded in the Brilo AI voice agent’s training data, lowering enterprise risk and improving alignment with internal privacy policies. It also simplifies audits, because the ingestion pipeline can record which rules matched and which items were excluded or sent for review. Teams gain operational flexibility as well: they can safely use unstructured knowledge while excluding or masking sensitive tokens. Redaction helps keep the voice agent’s knowledge base focused on allowed content, improving answer relevance without exposing sensitive data.

FAQs

Can Brilo AI remove sensitive data that was already used in past training runs?

Removing data from an already-trained model requires a data-removal or retraining process. Brilo AI can perform targeted retraining or exclude old datasets from future updates, but complete removal may require a controlled retrain depending on how the data was used.

What kinds of patterns can Brilo AI redaction detect?

Brilo AI supports pattern-based filters for numeric sequences, email addresses, phone numbers, and other common PII patterns. For best results, combine pattern filters with explicit field-level exclusions from your source data.
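
To make the idea concrete, here is a minimal Python sketch of pattern filters for email addresses and phone numbers. The regular expressions are simplified assumptions (the phone pattern targets US-style 10-digit numbers only), and PATTERNS and mask_patterns are hypothetical names, not Brilo AI's built-in detectors.

```python
import re

# Hypothetical detectors; real deployments usually need locale-specific patterns.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("PHONE", re.compile(r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)")),
]


def mask_patterns(text: str) -> tuple[str, dict]:
    """Replace each match with a labeled placeholder and count matches per label."""
    counts = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

Labeled placeholders such as [EMAIL] keep the sentence structure readable after masking, and the per-label counts give a quick signal when a rule matches far more or far less often than expected.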

Will redaction affect the Brilo AI voice agent’s ability to answer contextual questions?

Redaction removes or masks only the specified sensitive tokens; contextual non-sensitive content can still be ingested. Test rules in a staging environment to ensure critical context is preserved while sensitive fields are removed.

Do I need to change my CRM or data sources to use redaction?

Not necessarily. Brilo AI can apply redaction during ingestion, but providing structured exports or clear field mappings from your CRM reduces reliance on pattern matching and improves accuracy.

How do I audit what was excluded from training?

Create an ingestion log that records matched rules, masked tokens, and items sent for human review. Brilo AI’s ingestion pipeline can be configured to emit these logs for audit purposes.
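
A sketch of what one audit record could look like, written in Python as a JSON Lines entry. The field names and the audit_entry function are hypothetical, not a Brilo AI log schema. The sketch stores a keyed hash (HMAC-SHA256) of the matched value instead of the value itself, so the log does not become a second copy of the sensitive data.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def audit_entry(item_id: str, rule: str, action: str, matched_value: str, key: bytes) -> str:
    """Build one JSON Lines audit record; stores a keyed hash, never the raw value."""
    digest = hmac.new(key, matched_value.encode("utf-8"), hashlib.sha256).hexdigest()
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "item_id": item_id,
        "rule": rule,
        "action": action,
        "value_hmac_sha256": digest,
    }
    return json.dumps(entry, sort_keys=True)
```

A keyed hash is used because plain hashes of low-entropy values such as SSNs or phone numbers can be reversed by brute force. With the key stored separately, auditors can still confirm that the same value was matched across items without being able to recover it from the log.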

Next Steps

  • Contact your Brilo AI account team to request a redaction review and to schedule a staging ingestion test.

  • Open a configuration request in the Brilo AI Console to add field-level exclusions and pattern filters.

  • Ask Brilo AI support for a controlled training run after redaction rules are applied so you can validate that excluded items are not present in the updated knowledge base.
