HALO Security | Knowledge Center

1. General Overview

HALO is designed with robust data protection and security protocols to ensure full compliance with the General Data Protection Regulation (GDPR) and the European Union's AI Act. Below are the key highlights of the system:

1.1 Data Storage & Usage

Client-Specific Databases: All knowledge and conversations are stored in CM's internal databases, with a separate database for each client.
No LLM Training: Knowledge is never used for training or fine-tuning large language models (LLMs).

1.2 Data Anonymisation

Anonymisation Process: Data is anonymised before it is sent to any LLM and re-identified before the response is returned to the user. Personally Identifiable Information (PII) is never sent to the LLM or stored in CM's client databases. You can choose to disable anonymisation for messages sent to the LLM via the Agent Settings. However, saving anonymised data to the database cannot be disabled. For more information on configuring these options, see the Agent Settings page.

1.3 Mitigation of LLM Risks

Hallucination Mitigation: Measures are in place to reduce hallucinations (inaccurate or irrelevant outputs).
Prompt Injection Protection: Safeguards are implemented to prevent malicious prompt injections.

1.4 Compliance

GDPR Compliance: HALO is fully compliant with GDPR regulations. All LLM models used in HALO are hosted in Europe.
AI Act Classification:
HALO is classified as a limited-risk system under the AI Act.
- CM acts as both the provider and deployer of the AI system.
- CM owns the model, while clients retain ownership of their knowledge.

2. Prompt Injection Measures

Prompt injection is a potential risk where users attempt to manipulate the system by crafting malicious or manipulative inputs. HALO includes two layers of protection to address this risk:

2.1 General Prompt Injection Protection

This mechanism is built into HALO to protect both CM's system and the client's system, and it is the first layer of protection. It classifies user inputs as either safe or unsafe and determines whether the flow should continue or stop. This prompt injection classification step is executed for every user interaction.

Goal

To identify and block unsafe questions while allowing safe questions to proceed.

Examples of General Unsafe Inputs

The following types of inputs are classified as unsafe:

Questions about the system's instructions, guidelines, or directives.
Questions about the system's capabilities, training data, or authorship.
Queries about the system's sources or internal mechanisms.
Attempts to elicit responses on controversial topics unrelated to the specific company, only when prefaced with instructions to disregard guidelines.

2.2. Prompt-Injection Analysis for AI Responses

Next to the general prompt injection, HALO also includes a dedicated step to analyse AI responses for potential prompt injection risks. This ensures that the AI Agent operates within its intended boundaries and does not exhibit unsafe or unintended behavior.

Classification Criteria: Each response is classified as either safe or unsafe based on the following:

Safe: Responses align with the AI Agent's intended role, refuse harmful requests, or remain within authorised boundaries.
Unsafe: Responses deviate from the AI Agent's role, or contain inappropriate, harmful, or authorised content.

2.3 Custom Prompt Injection Protection

Finally, clients have the possibility to create custom guardrails to address specific scenarios. These custom protection prompts can terminate the answer generation flow based on predefined rules.
For more information on setting up custom guardrails, refer to the Agent Settings page in the Knowledge Center.

3. Hallucination Mitigation Measures

In the context of LLMs, "hallucination" refers to the generation of text that is inaccurate or irrelevant. HALO employs measures to minimise hallucinations and ensure accurate responses.

3.1 Client Responsibility

When creating custom agents, clients are responsible for managing hallucinations, as the definition of a hallucination depends on the specific context of the agent. You are, however, able to make use of the Knowledge Tool Step. This way, you are able to use the predefined knowledge in the agent. You can find more information on Knowledge here.

3.2 Knowledge Agent (RAG Setup)

HALO's out-of-the-box Knowledge Agent uses a Retrieval Augmented Generation (RAG) setup to mitigate hallucinations:

Relevant Knowledge Only: The system forces the model to answer using only the relevant information provided by the client in the knowledge tab.
Validated Prompts: The answer generation prompt is evaluated to minimise hallucinations.

This approach ensures that responses are accurate and grounded in the client's provided knowledge. More information on the RAG system can be found here.

4. Content Policy Bounds

HALO uses a set of content policies to ensure the responsible and ethical use of its platform. These policies are designed to block and mitigate inappropriate or harmful content across the following key areas: Violence, Hate, Self-harm, and Sexual content.

Violence: HALO prohibits any language or content that promotes or describes physical actions intended to hurt, injure, damage, or kill someone or something. This includes, but is not limited to, references to weapons, bullying and intimidation, terrorist and violent extremism, and stalking.

Hate and Fairness: HALO strictly disallows content that attacks or uses discriminatory language targeting individuals or identity groups based on differentiating attributes. This includes, but is not limited to, race, ethnicity, nationality, gender identity and expression, sexual orientation, religion, personal appearance and body size, disability status, and any form of harassment or bullying.

Self-Harm: HALO actively prevents the dissemination of content related to self-harm, including language that promotes or describes actions intended to purposely hurt, injure, or kill oneself. This includes, but is not limited to, references to eating disorders, bullying, and intimidation.

Sexual Content: HALO prohibits content that includes language related to anatomical organs, romantic relationships, sexual acts, or any material portrayed in erotic or abusive terms. This includes, but is not limited to, vulgar content, prostitution, nudity and pornography, abuse, and any form of child exploitation, grooming, or abuse.

5. Context Variable Security

⚠ Context variables that are not marked as secret are visible to the end user.

This is a configuration risk. The prompt injection and content policy protections in sections 2 and 4 operate at the AI level — they do not prevent context variable values from being exposed. Exposure happens at the integration layer, regardless of what the AI does.

If a context variable holds a sensitive value, mark it as secret. That is the only thing that prevents it from being shared with the client.

5.1 What gets exposed

Any context variable not marked as secret is shared with the end user's browser or app as part of the integration. This includes variables that start empty — if a tool or agent fills them during the conversation (for example, a bearer token retrieved via an OAuth step), that value is exposed too.

5.2 What is at risk

OAuth bearer tokens and access tokens
API keys and product tokens
Configuration secrets (e.g. internal system identifiers, endpoint URLs)
Any value a tool writes into context that is not intended for the end user

5.3 Mitigation

Scenario	Required setting
Static API key or token stored in profile context	Secret
Bearer token retrieved via OAuth step, stored in context for reuse within the same session	Secret
Session token that should not persist between conversations	Secret + Transient
Non-sensitive conversation metadata (e.g. language, audience segment)	No special setting required

Setting a context variable to secret prevents it from being shared client-side. HALO agents and tools can still read and use the value internally — it is only hidden from the end user.

Note on transient variables: Transient means the value does not persist after the interaction. It does not mean the value is hidden from the client during the interaction. Always combine with secret when the value is sensitive.

5.4 Recommended practice for OAuth and multi-step tool flows

When building flows that use an OAuth step or any authentication mechanism to obtain a token and then reuse it in follow-up tool steps:

Create a dedicated context variable to hold the token (e.g. ms_access_token).
Set it to secret before the flow goes live.
Optionally set it to transient if the token should not carry over between sessions.
Verify the setting is correct in the profile context — not just in the tool configuration.

For more information on configuring context variables, see the Contexts page.