
Data Privacy & Handling

Manage PII, retention, and data flows across the platform

Conversation data moves through a clear lifecycle on Helvia: collection, anonymization, storage, and deletion. The platform gives you direct control over the two stages that vary most between organizations: how personal information (PII) is identified and replaced, and how long conversation data is kept before it is deleted (data retention).

For certifications, the security FAQ, and how Helvia protects your data end-to-end, see the Security Overview page.

Anonymization and PII Handling

The platform automatically detects personal information in conversations and replaces it before that data is stored or sent to third parties. Anonymization is activated and configured per agent under Designer > Settings > Privacy, so each agent can carry rules appropriate to its domain.

Three mechanisms work together, each addressing a different sensitivity level:

Entity-Based Detection

Recognize named entities like people, locations, dates, and money, or match custom regex patterns

Custom Anonymization Service

Plug in your own detection endpoint when the built-in entities are not enough

Full Message Obfuscation

Replace the entire user message before storage for the highest-sensitivity scenarios

Entity-Based Anonymization

Entity-based anonymization uses Named Entity Recognition to detect Personally Identifiable Information (PII) in free text and substitute it with placeholder values. The detector recognizes a fixed set of categories, such as names, locations, dates, and monetary values.

When configured, detection runs on every incoming user message before the message enters the agent's processing pipeline, so personal data is replaced before it reaches the language model or persistent storage. Detection can also be applied to the full conversation transcript before it is exported to external systems. Contact support for special configurations beyond the defaults.
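To make the detect-and-substitute step concrete, here is a minimal sketch of how detected entity spans could be replaced with placeholders. It is not Helvia's implementation: the `Span` type, the `replace_entities` function, and the placeholder strings are hypothetical, and the spans are supplied by hand rather than produced by a real Named Entity Recognition model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected entity (hypothetical shape, for illustration only)."""
    start: int   # index of the first character of the entity
    end: int     # index one past the last character
    label: str   # entity category, e.g. "PERSON"


def replace_entities(text, spans, replacements, default="[REDACTED]"):
    """Substitute each detected span with the placeholder for its label.

    A missing or blank replacement falls back to `default`.
    """
    # Work from right to left so earlier offsets stay valid after each edit.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        placeholder = replacements.get(span.label) or default
        text = text[:span.start] + placeholder + text[span.end:]
    return text


message = "Maria Keller paid 120 euros on Friday."
# Spans a detector might return for this message (hand-written here).
spans = [Span(0, 12, "PERSON"), Span(18, 27, "MONEY"), Span(31, 37, "DATE")]
replacements = {"PERSON": "[NAME]", "MONEY": "[AMOUNT]"}  # no rule for DATE

anonymized = replace_entities(message, spans, replacements)
print(anonymized)  # [NAME] paid [AMOUNT] on [REDACTED].
```

Because the substitution happens on the text itself, whatever is stored or forwarded afterwards contains only the placeholders.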

To add a detection and replace rule:

  1. Open Privacy Settings for the Agent: In Designer, go to Settings > Privacy.

  2. Add a Detection Rule: Under Anonymization Settings, select Add Data Type to insert a new row. You can add as many rows as you need, one per entity type or regex pattern you want to detect.

  3. Choose What to Detect: Pick the Data Type category. The full set of supported categories is listed below.

  4. Set the Replacement: Enter the text that will replace matches in Replacement Text. Leave it blank to fall back to the Default Replacement value set for the section.

  5. Provide a Regex Pattern: Only required when Custom Regex is selected as a data type. Enter the Regex pattern and the Replacement Text. Use this for identifiers specific to your domain, such as account numbers or internal case IDs.

  6. Save the Rule: Select Save Changes to apply.
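Before entering a Custom Regex rule, it can help to prototype the pattern locally. The sketch below uses Python's `re` module and assumes the platform accepts a comparable regex dialect, which this page does not confirm. The two patterns, the identifier formats, and the rule dictionary shape are hypothetical examples; the second rule leaves the replacement blank to mimic the fallback to the Default Replacement value.

```python
import re

DEFAULT_REPLACEMENT = "[REDACTED]"  # stands in for the section-level default

# Hypothetical rules: one row per pattern, as in the Anonymization Settings table.
RULES = [
    {"pattern": r"\bCASE-\d{6}\b", "replacement": "[CASE_ID]"},
    {"pattern": r"\bACC\d{8}\b", "replacement": ""},  # blank: use the default
]


def apply_regex_rules(text, rules, default=DEFAULT_REPLACEMENT):
    """Apply each rule in order, replacing every match of its pattern."""
    for rule in rules:
        replacement = rule["replacement"] or default
        # A function replacement keeps backslashes in the text literal.
        text = re.sub(rule["pattern"], lambda _match, r=replacement: r, text)
    return text


sample = "Ticket CASE-004217 concerns account ACC12345678."
print(apply_regex_rules(sample, RULES))
# Ticket [CASE_ID] concerns account [REDACTED].
```

Testing against a handful of real-looking and near-miss strings before saving the rule reduces the chance of over-matching ordinary numbers.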

The full list of supported entities is:

Entity        What It Covers
PERSON        People, including fictional
GPE           Countries, cities, states
NORP          Nationalities, religious or political groups
FAC           Buildings, airports, highways, bridges
ORG           Companies, agencies, institutions
LOC           Non-GPE locations, mountain ranges, bodies of water
PRODUCT       Objects, vehicles, foods (not services)
EVENT         Named hurricanes, battles, wars, sports events
WORK_OF_ART   Titles of books, songs, and other works
LAW           Named documents made into laws
LANGUAGE      Any named language
DATE          Absolute or relative dates and periods
TIME          Times smaller than a day
PERCENT       Percentages
MONEY         Monetary values, including currency
QUANTITY      Measurements such as weight or distance
ORDINAL       First, second, third, and so on
CARDINAL      Numerals that do not fit another type

Custom Anonymization Service

If the built-in detector does not cover a category specific to your domain, point Helvia at your own service instead. In the same Privacy settings, enable Service Configuration, supply the endpoint URL, and add any HTTP headers your service requires for authentication.

When to use a custom service: industry-specific identifiers (medical record numbers, account numbers, internal case IDs) or jurisdictions where you need detection beyond the standard entity set.
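As an illustration of what such a service might look like, here is a minimal sketch built on Python's standard library. The request and response contract is not described on this page, so everything about it is an assumption: a POST with a JSON body containing a `text` field, a JSON response with the anonymized `text`, a bearer token passed as an HTTP header, and the `MRN-` identifier format. Confirm the actual contract with Helvia before building against it.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical domain identifier: "MRN-" followed by seven digits.
MRN_PATTERN = re.compile(r"\bMRN-\d{7}\b")
EXPECTED_TOKEN = "change-me"  # placeholder secret, sent as a configured header


def anonymize(text):
    """Replace medical record numbers with a placeholder."""
    return MRN_PATTERN.sub("[MEDICAL_RECORD]", text)


class AnonymizeHandler(BaseHTTPRequestHandler):
    """Assumed contract: POST {"text": "..."} returns {"text": "..."}."""

    def do_POST(self):
        if self.headers.get("Authorization") != f"Bearer {EXPECTED_TOKEN}":
            self.send_error(401, "Unauthorized")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = anonymize(payload["text"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "Expected a JSON body with a 'text' string")
            return
        body = json.dumps({"text": result}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), AnonymizeHandler)
```

Keeping the detection logic in a plain function such as `anonymize` makes it easy to unit-test separately from the HTTP layer. A production service would also need TLS and proper secret management, which this sketch leaves out.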

Full Message Obfuscation

For the highest-sensitivity scenarios, enable Obfuscate User Input to replace the entire user message with an obfuscated string before it is stored or sent downstream. Use this when entity-level redaction is not enough and no portion of the original message should be preserved.

Data Retention

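Helvia retains conversation transcripts for a configurable window and deletes them automatically when that window expires. Data retention has two levels: a Workspace-wide default and a per-agent override.

Workspace level: set under Workspace > Settings > Configuration in the Data Retention field.

  • Range: 1 to 24 months

  • Applies to every agent in the Workspace unless overridden

Agent level: a per-agent retention value overrides the Workspace default for that agent.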

When the retention period expires, the data is completely removed from production databases. Archived copies remain in backup storage for up to two years for disaster recovery, unless a shorter window is agreed in your contract.

Retention covers stored conversation transcripts (chat sessions). Audit logs, knowledge base content, and Workspace configuration follow separate retention rules.
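The retention arithmetic can be sketched as follows. This is an illustration rather than a description of Helvia's scheduler: the function names are hypothetical, the per-agent override is assumed to share the 1 to 24 month range, and clamping to the last day of a shorter month is one reasonable convention among several.

```python
import calendar
from datetime import date

MIN_MONTHS, MAX_MONTHS = 1, 24  # range documented for the Workspace setting


def effective_retention(workspace_months, agent_months=None):
    """Return the agent override if one is set, else the Workspace default."""
    months = agent_months if agent_months is not None else workspace_months
    if not MIN_MONTHS <= months <= MAX_MONTHS:
        raise ValueError(f"retention must be {MIN_MONTHS}-{MAX_MONTHS} months")
    return months


def deletion_date(created, months):
    """Date on which a transcript created on `created` leaves the window."""
    total = created.month - 1 + months
    year = created.year + total // 12
    month = total % 12 + 1
    # Clamp the day for shorter months (e.g. Jan 31 + 1 month -> Feb 29/28).
    day = min(created.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


months = effective_retention(workspace_months=12, agent_months=6)
print(deletion_date(date(2024, 11, 15), months))  # 2025-05-15
```

Note that this covers only removal from production databases; backup copies follow the separate window described above.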

Where Your Data Lives

All Helvia services and databases operate within the European Union, on managed cloud infrastructure with geographic redundancy for disaster recovery.

EU-only Hosting

All processing and storage happens inside European Union data centers

Cloud-Native

Hosted on AWS under their ISO 27001 and SOC 2 certified programs

Geographic Redundancy

Backups and replicas distributed across availability zones for continuity

Encryption

All customer data is encrypted in transit and at rest. The same encryption applies to backups and to data flowing between Helvia services.

State of Data     Protection
In transit        TLS/SSL across all internet communications, including traffic to LLM providers
At rest           AES-256 encryption applied at the storage layer
Database-level    Sensitive fields encrypted inside the database so data stays protected even on direct access
Passwords         Hashed with modern algorithms; never visible to administrators
Backups           Encrypted with the same standards as primary storage and held in secure, segregated locations

Data Minimization

Helvia collects only what each processing purpose requires and removes data when that purpose ends. These principles apply across the platform, from the data your agents receive to the records kept in Observatory.

Purpose-Bound Collection

Each data point is tied to a defined processing purpose and not collected for hypothetical uses

Least-Privilege Access

Role-based access ensures users see only the data their role requires

Routine Review

Stored data is reviewed regularly and removed if no longer necessary

Automatic Deletion

Retention windows enforce removal without relying on manual cleanup

Data Flows to Third Parties

Conversations sometimes need data to leave the Helvia platform, whether to generate a response with a language model or to update an external system. Every transfer is protected by the same controls applied across the platform.

Encrypted in transit

All data leaving the platform travels over TLS-encrypted channels, so it stays protected end-to-end between Helvia and the destination.

Anonymization available before export

Personal data can be detected and replaced before any of it is sent to a language model or downstream system, using the same anonymization rules configured per agent.

Vetted providers only

Every third-party provider Helvia integrates with is reviewed against recognized security standards such as ISO 27001 and SOC 2 before being added.

All third-party integrations, LLM providers included, run through your own provider account and credentials. This means two things:

  • the relationship is governed directly by your contract with that provider, including any data-use and training terms

  • the processing region follows the credentials you supply, which can be configured to be inside or outside the EU

Best Practices

  • Configure anonymization per agent: match the rules to the data each agent actually handles, rather than applying one set across every agent

  • Use a custom service for domain identifiers: the built-in entities are broad; plug in your own service when you need it to recognize medical IDs, account numbers, or other domain-specific patterns

  • Tune data retention per agent: adjust the per-agent retention for agents handling more sensitive conversations according to your contractual obligations

  • Audit which provider your LLM calls use: customer-owned accounts give you direct control over training opt-outs and data-use terms
