How do you secure a GenAI application or service?

While there are many blog posts and articles explaining WHY you should secure the GenAI services or applications you are building, almost none of them describe the tools and methods to secure them. Therefore, I decided to write my own blog post on the security risks you can run into when building GenAI tools and how you can mitigate them.

Let’s start with an overview of how many regular RAG-based services are built and how they work. The visualization below shows the different components involved in a RAG (Retrieval Augmented Generation) application, where the purpose of the service is to serve stored data to the end-user or to another service, based on input (usually prompts).

While the components here are not tied to a specific technology or tool, I have another example below which shows how it could look with specific tools. This is a common architecture where Langchain is the orchestration or integration layer, connecting the data together and handling the prompts coming in from the user via a web application running on Streamlit.

So, what are the attack vectors against a service or application like this? Let us consider the following scenario: we have a public service that serves end-users with information from some data source.

Now there are a couple of risks that we need to mitigate with a chatbot, since without proper configuration the chatbot can generate “wrong” output. This is what happened to a car dealership in the US that had just released a chatbot using GPT underneath.

While this is harmless from a technical standpoint, it does create some other issues:
* Loss of reputation and trustworthiness.
* Extra cost (via DoS attacks), since users can essentially trigger the bot to run many prompts, which can become expensive if the newer LLM models are used. While a DoS attack is in effect, the service might also be unavailable for other users. With GPT-4, for instance, the cost is close to $160 a month for 1,000 prompt/completion calls of 1,000 tokens each. 1,000 tokens is the equivalent of roughly 750 words.

However, let us go into detail on each of the attack vectors.

* Supply chain vulnerabilities – These can be vulnerabilities in the code or components of the ecosystem. For instance, Langchain has a known code injection vulnerability (CVE-2023-29374, listed in the GitHub Advisory Database as “LangChain vulnerable to code injection”), and given the popularity of the ecosystem we can expect more of these to pop up sooner or later.
* DoS attacks – As mentioned earlier, these are attacks used to take services offline or to drive up cost.
* Data poisoning – An attack that targets the training data or the data that is served using RAG, thereby affecting what the service will generate.
* Vulnerabilities – Vulnerabilities in the LLM Inference API.
* Cache poisoning – When the service serves data from GPTCache or a CDN to an unintended user. This could for instance be earlier prompts from user 1 being shown to another user (see “OpenAI says mysterious chat histories resulted from account takeover”, Ars Technica).
* Prompt injection – Specially crafted prompts can bypass content filters in the Inference API of the LLM, getting the AI to generate content outside the intended use case, as seen in 0xk1h0/ChatGPT_DAN: ChatGPT DAN, Jailbreaks prompt (github.com).
* Sensitive information disclosure – Using specially crafted prompts to disclose sensitive information or data that the LLM has either been trained on or can reach through the RAG component.
* Function calling – Specially crafted prompts can get the app to call specific functions outside of the intended use case. If the application for instance has a function that can query a SQL database, but without sufficiently restrictive RBAC configured, an attacker can craft a prompt that gets the app to modify the database or retrieve other sensitive content.
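To make the cache poisoning item above more concrete, here is a minimal sketch of one way to keep cached responses from leaking between users: include the user identity in the cache key. This is not the GPTCache API. The `ResponseCache` class, its methods and the sample prompt are all hypothetical names made up for illustration, and a real setup would also need expiry and proper authentication.

```python
import hashlib


class ResponseCache:
    """Toy in-memory cache for LLM responses, keyed per user (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(user_id: str, prompt: str) -> str:
        # Length-prefix the user id so ("ab", "c") and ("a", "bc") cannot collide.
        raw = f"{len(user_id)}:{user_id}:{prompt}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, user_id: str, prompt: str):
        # Returns None when this user has no cached answer for this prompt.
        return self._store.get(self._key(user_id, prompt))

    def put(self, user_id: str, prompt: str, response: str) -> None:
        self._store[self._key(user_id, prompt)] = response


cache = ResponseCache()
cache.put("user-1", "What is my order status?", "Order 1234 has shipped.")

print(cache.get("user-1", "What is my order status?"))  # Order 1234 has shipped.
print(cache.get("user-2", "What is my order status?"))  # None
```

Because the key is derived from both the user and the prompt, an identical prompt from another user is a cache miss rather than a hit on someone else's answer. The trade-off is a lower hit rate, so shared caching is best reserved for responses that contain no user-specific data.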

So, what can we do to remove these attack vectors, or at least minimize the risk of an attack?

* DoS attacks can be handled via API gateways, or with something like Cloudflare AI Gateway (AI Gateway · Cloudflare AI Gateway docs), which acts as an API gateway for LLM calls between your application and the LLM Inference API. This also allows you to add rate limiting and caching of LLM calls. There are also tools like Portkey, but they are aimed more at providing load balancing (Portkey-AI/gateway: A Blazing Fast AI Gateway. Route to 100+ LLMs with 1 fast & friendly API. (github.com)).
* Prompt injection and similar attacks can be handled using guardrails, such as NVIDIA NeMo Guardrails (NVIDIA/NeMo-Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. (github.com)) or LLM Guard (Index – LLM Guard (llm-guard.com)), which can handle both the input and the output content that is being generated.
* Sensitive information disclosure – Here we have tools like Microsoft Presidio that can be used to find PII and anonymize the content (Installation – Microsoft Presidio).
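The rate limiting mentioned for the gateway option above can be sketched roughly as a token bucket per client. This is only an illustration of the general idea, not how Cloudflare AI Gateway or Portkey implement it. The class names, the `client_id` parameter and the limits are assumptions chosen for the example, and a fake clock is used so the behaviour is deterministic.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keep one bucket per client id (for example an API key or user id)."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self._buckets = {}

    def allow(self, client_id: str) -> bool:
        if client_id not in self._buckets:
            self._buckets[client_id] = TokenBucket(self.capacity, self.rate, self.clock)
        return self._buckets[client_id].allow()


# Fake clock so the example does not depend on real time passing.
fake_now = [0.0]
limiter = RateLimiter(capacity=3, rate=1.0, clock=lambda: fake_now[0])

results = [limiter.allow("user-1") for _ in range(5)]
print(results)  # [True, True, True, False, False]

fake_now[0] += 2.0  # two seconds later, two tokens have been refilled
print(limiter.allow("user-1"))  # True
print(limiter.allow("user-2"))  # True
```

In the application, a check like `limiter.allow(user_id)` would sit in front of every LLM call, so that a single user hammering the chatbot hits the limit before the token bill grows.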

Protecting against function calling abuse is becoming more relevant now that function calling is ever more usable with the Assistants API and is also being supported by other LLM models such as fllama. The issue arises if you have an LLM application that can call a function that can “do harm” to a backend service. An example function within Langchain might look like this. This is a simple database function (I’ve excluded the other configuration, but you can take a look at an example here: gpt-ai/jarvissql.py at main · msandbu/gpt-ai (github.com)).

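# Imports for this snippet; the exact module path may differ between Langchain versions.
from langchain.agents import AgentType, Tool, create_sql_agent

# `llm` and `toolkit` (the SQL database toolkit) come from the configuration excluded here.
db_chain = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
# Expose the SQL agent to the outer agent as a single callable tool.
tools = [
    Tool(
        name="DB-function",
        func=db_chain.run,
        description="useful to answer questions about the database"
    )
]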

The main issue here is how we configure the SQL connection string and access rights, since the LLM only uses a predefined service principal to authenticate to the SQL database. If we create a SQL-only user that has read access only to certain tables or databases, the risk would be fairly small. Secondly, you should also define a limit on how many function calls can be triggered, since functions will generate a lot of cost if the agent ends up going in loops.

* Inference API – You should ensure that the Inference API is locked down so that it is not publicly available, since otherwise anyone who has an API key could consume tokens. With the cloud-based LLM inference APIs you have the ability to restrict access to the virtual network where the service is running. This ensures that only the internal application can call the API. You can find more information in Security Best Practices for LLM Applications in Azure (microsoft.com), which also goes into more detail on the high-level architecture design principles when setting up GenAI applications.
* Custom LLMs – There are also custom LLMs with more built-in content filtering to block toxic content and similar, such as LlamaGuard (meta-llama/LlamaGuard-7b · Hugging Face). However, the issue here is that language support is mostly English, which does not help if you are building an application for non-English users.
* Monitoring – How you monitor and log prompts and calls depends on which framework you are building the application in. If you are using Langchain you can use their own tool called Langsmith. There is also a free alternative called Lunary (lunary-ai/lunary: The production toolkit for LLMs. Observability, prompt management and evaluations. (github.com)), for example if you are using Semantic Kernel.
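If none of these tools fit, the basic idea behind the monitoring item above can be approximated with plain structured logging around every LLM call. The sketch below is not the Langsmith or Lunary API. The `logged_llm_call` helper, the `llm_audit` logger name and the record fields are assumptions for illustration, and `fake_llm` stands in for the real model call.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("llm_audit")


def logged_llm_call(llm_func, prompt: str, user_id: str) -> str:
    """Call llm_func(prompt) and emit one JSON log record describing the call."""
    record = {
        "call_id": str(uuid.uuid4()),
        "user_id": user_id,
        "prompt_chars": len(prompt),
    }
    started = time.perf_counter()
    try:
        response = llm_func(prompt)
        record["status"] = "ok"
        record["response_chars"] = len(response)
        return response
    except Exception as exc:
        record["status"] = "error"
        record["error"] = type(exc).__name__
        raise
    finally:
        # Runs on success and on failure, so every call leaves a record.
        record["latency_ms"] = round((time.perf_counter() - started) * 1000, 2)
        logger.info(json.dumps(record))


def fake_llm(prompt: str) -> str:
    # Placeholder for the real model call.
    return prompt.upper()


logging.basicConfig(level=logging.INFO)
print(logged_llm_call(fake_llm, "hello", "user-1"))  # HELLO
```

The record deliberately stores sizes and timings rather than the raw prompt, since prompts may contain sensitive data. Whether to log full prompts is a policy decision that depends on your retention and privacy requirements.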

While this does not cover all aspects, many of the components and the surrounding ecosystem are still very young. Therefore this post will be updated from time to time to ensure that it reflects the tools that are available.
