OpenAI - ChatGPT and Security and Risks

Did you know that the ChatGPT service is delivered from a Kubernetes instance in Central US? More presicly a Microsoft Azure datacenter hosted in Texas. At the time of writing this, ChatGPT and the other OpenAI hosted APIs are only available from within the US.

ChatGPT currently runs on Kubernetes on over 7,500 virtual machines. The platform is used both for the API as well as ChatGPT and other surrounding services.

This means that all prompts and API calls are directed to that Kubernetes cluster. At the moment there is no way to use ChatGPT within EU/EMEA. (While OpenAI is mentioning that they provide self-hosted instances, but there is little information about it yet…)

I wanted to write this blog post to clear any confusion related to the use of ChatGPT including privacy and security about it.

Much of the data that ChatGPT is trained (and is aware of) on is referenced here —> https://arxiv.org/pdf/2303.08774.pdf which is close to 570 GB of data. This means that ChatGPT is not aware of recent events by default. If you ask it about something which is does not have any data about it can «hallusinate». Meaning that it can generate false facts.

In addition to this if you provide it with context about what you want to ask about it, it will not be able to remember what you asked about, meaning that you are not able to teach or improve that data model that ChatGPT is based upon. This means that ChatGPT will not use the data you enter to «learn» new things. OpenAI might use prompts which users have enterered to improve the way that the AI interacts with the users, for instance when OpenAI launched the GPT-4 model in March 2023, they used a little over 5,000 prompts entered through ChatGPT (from November 2022 to February 2023) to improve the model to GPT-v4. As organizations that want to use GPT with their own data, there are some mechanisms for fine-tuning data, but this is not available even in the newer models such as ChatGPT (based upon GPT 3.5) and or GPT-v4.

Who has access to my data that is entered there?

OpenAI has some defined subcontractors who can access information, in addition to some OpenAI employees, as seen here OpenAI Sub-processor List – OpenAI API

Is information submitted via API used?

Information submitted via the API is not used by OpenAI to improve the data model, but is stored for 30 days to monitor for possible misuse. Data is stored on the platform in the USA and is not yet possible for storage within the EU. Access to the data within the 30 days is for specific OpenAI employees as well as approved resources from 3rd party service providers. Also we have the option now in the Web UI to define if we want ChatGPT/OpenAI to not use our data as well API data usage policies (openai.com)

Do I have the option to ask OpenAI not to use my information or store my information at all?

Yes you can apply for exception by using this form —> User Content Opt Out Request (google.com)

Can I set up OpenAI within EMEA/EU?

The only way is by establishing OpenAI services delivered from Microsoft Azure in West Europe (data center in Amsterdam) or France Central. This provides the same functionality (apart from the ChatGPT interface), but then delivered as a pure cloud service from Microsoft that you can use for developing your own applications that use GPT.

Does ChatGPT Have internet access?

As of now, ChatGPT has started to rollout Internet Browsing capabilities, which allows the service to search on the Internet. The AI itself does not have direct internet access, but uses an API which search on the Internet on behalf of itself. Feedback from the Internet searches will then be passed back to the LLM (ChatGPT service) which will then summarize the content.

Are there any other differences using Azure OpenAI versus the commercial edition of OpenAI and ChatGPT?

·Automatic encryption of data: Azure OpenAI Services automatically encrypts the data within the service with Microsoft’s managed keys. In addition, you can also encrypt the data stored within the service with your own keys.

·Alternative network connectivity options: Azure OpenAI Services supports other network connectivity options, such as private endpoints, which allow you to filter all communication to the service through a centralized network.

·Support for managed identity: Azure OpenAI Services supports the use of managed identity to access the service, as opposed to using only API keys to authenticate to the service.

2 thoughts on “OpenAI – ChatGPT and Security and Risks”

Steve Rajeckas
9. July 2023 at 18:55

Hey Marius, how do you know that ChatGPT is hosted entirely in the Texas data center?

Loading...

1. msandbu
  9. July 2023 at 19:39
  
  Hi, at the moment everything is hosted there yes. They are looking into other hosting options, but I have never heard anything back from their sales team yet 🙂
  
  Loading...

Share this:

2 thoughts on “OpenAI – ChatGPT and Security and Risks”

Leave a Reply Cancel reply