The Perils of Generative AI. Part 2 - An Overview of Security Risks

This essay covers security risks associated with Generative AI. It is the second in a series exploring the potential detrimental effects Generative AI models may have on business and financial health.

Jul 10, 2023

Generative AI models understand patterns in their training datasets to generate content. The process involves a training phase, where models learn from data to improve their performance, and an inference phase, where models generate output when interacted with. Training mechanisms such as pre-training, fine-tuning, and prompt engineering are used to teach the models new knowledge. In the inference phase, user queries are encoded into numerical representations, and the model generates a response based on its training. The response is then decoded and post-processed before being presented to users. Inference allows users to interact directly with generative AI models or use applications that incorporate these models through APIs.

An Overview of the Risks

For the purposes of this article, I will use my diagram above to categorize the risks of Generative AI by mainly Training and Inference. There are risks associated with the hosting and deployments of these models too, however, we are going to skip over them.

Training

Jumping directly into it, here is a list of risks associated with the Training stage:

Sensitive data accessed without proper authorization

Sensitive data from internal data sources might be accessed by unauthorized employees and used to train AI models. This can be protected by:

Implementing robust access control mechanisms for all data sources within the organization. However, this may not completely prevent access to sensitive data as certain employees may require access to specific data points that might then be included in AI model training datasets. These data points could be accessed by unauthorized users from the output of the model.
Providing adequate training and guidance to employees. Even with proper training, there is still a possibility of accidental or intentional data leakage, as exemplified by incidents such as the Samsung incident.

To mitigate the risk of unauthorized access to sensitive data, organizations must establish strict access controls, monitor data usage, and enforce policies that govern the appropriate handling of sensitive information. Ongoing training and awareness programs can help employees understand the importance of data protection and reduce the chances of data leakage. However, this does NOT guarantee that leakage will not happen.

Contamination of existing data sources

In some cases, original data sources that are part of a model training pipeline can get contaminated with sensitive information. For example, consider a case where a model training pipeline started with a data source that was free of sensitive data. With the launch of a new feature, an application started logging sensitive data to this shared data source. Without continuous monitoring of the data source, sensitive data will leak into the model.

Sensitive data stored as representations in an AI model

As we talked about in the last article, training data including sensitive data is converted into numeric representations during the training step. Unfortunately, once sensitive data has been incorporated into a model, it cannot be easily identified or retrieved as the generative AI models do not function like traditional databases.

Leaking sensitive information to 3rd party providers

During the fine-tuning process, for example, using a third-party model hosted outside your environment, you may easily end up leaking sensitive information. In this case, data access is longer under your company’s complete control, and that might have a potentially catastrophic impact on your business.

Copyright infringements

Copyrighted data may still slip in during the Interactions applications or users have with Generative AI models during the Inference stage exposing your business to litigation risks like this most recent lawsuit against Open AI.

Inference

Here is a list of risks associated with the Training stage:

Sensitive data may leak to the end user

There are several factors that contribute to this risk. One is the nature of the training data itself. If the training dataset includes sensitive information, the model may learn to generate content that incorporates or reveals that sensitive data. Another factor is the potential for biases or unintended associations in the data, which can lead to biased or inappropriate outputs.

To mitigate the risk of sensitive data leakage, it is important to implement robust privacy and security measures throughout the generative AI pipeline. More on this in Part 3 of the series.

Sensitive data may be learned during interactions

With generative AI models like ChatGPT, there is a possibility that the model may learn new sensitive data during interactions. While the model's training typically involves extensive datasets, it can still acquire information during real-time interactions that may not have been present in the original training data.

For example, unauthorized copyrighted data can slip into the AI models while Users or Applications interact with them.

This occurs because generative AI models have the ability to learn from the specific inputs and prompts they receive during interactions. If users provide sensitive information or context during these interactions, the model may incorporate and reproduce that information in its responses.

To mitigate this risk, interactions should be monitored for sensitive data and appropriate actions should be taken in case sensitive data is detected. In part 3, we will learn a bit more about how to mitigate these risks.

Unauthorized Sensitive Data might be retrieved unintentionally or by malicious actors

There is a risk that unauthorized sensitive data may be unintentionally accessed or retrieved by individuals who lack proper authorization. This can occur due to various reasons, including accidental data leaks, breaches, or malicious actions by unauthorized actors.

To mitigate this risk, organizations should implement robust security measures and access controls. This includes implementing strong authentication and authorization protocols, treatment of sensitive data, monitoring and auditing data access and usage, and implementing security best practices. More on this in Part 3.

Lack of tailor-made monitoring tools

The absence of proper monitoring tools makes offering Generative AI-based features very dangerous. Monitoring is crucial for ensuring the responsible and safe usage of these models. We will explore how Aimon can help you gain visibility into your AI models in the next essay.

Prompt Injection

Prompt injection is the intentional crafting of a prompt in such a way that it may lead the model to generate a desired or specific response. This can be done by including biased information, leading questions, or specific phrasing that guides the model toward a particular answer. In the context of security, it is better known as a method to unethically manipulate the model into revealing information it shouldn't or to create biased or misleading outputs.

While there are guardrails being put in place by model providers to guard against this, due to the adversarial nature of this problem, attackers are becoming increasingly creative in crafting these attacks. This paper provides good reasoning and examples of transferable attacks on language models.

Impact of using Generative AI without necessary protections

Using Generative AI without the necessary safeguards can have severe consequences. It can result in catastrophic outcomes, ranging from detrimental effects on brand reputation and financial stability to regulatory penalties and disruptions in business operations. The impact can be devastating across various aspects of the organization. Here are some of those consequences:

Reputational damage
Regulatory fines
Financial losses

How Aimon helps mitigate these risks

Aimon is a complete AI-native Security solution. We help you secure the entire lifecycle of your Generative AI models. Please reach out to us at info@aimon.ai to learn more or read more about our solution in the next essay.

About Us

We are building Aimon AI, a complete AI-native Security Platform. We have over 25 years of combined experience in leading ML/AI, Security, Monitoring, Logging, and Analytics products at companies such as Netflix, AppDynamics, Thumbtack, and Salesforce. We are proud to be advised by renowned Privacy, Security, and AI experts. Please reach out to us at info@aimon.ai.

Puneet’s Substack

Discussion about this post