The big FAQ about Microsoft 365 Copilot

Since the initial release of Copilot for Microsoft 365 I have been bombarded with questions covering many different areas of the service. I therefore decided to summarize the questions I get most often. Some of them are basic, some of them are deep dives. Hopefully some of you will find it useful.

What is the difference between the different Copilot offerings from Microsoft? The main difference is the context and data sources that are supported with each service.

What is Microsoft 365 Copilot? It provides generative AI within the Microsoft Office ecosystem. It essentially combines GPT-4o with function calls that interact with the different Office applications (create a PowerPoint slide, create a Word document) and with RAG (retrieval-augmented generation), which lets you find information in the Microsoft 365 data you have access to through what is called the Semantic Index.

Is it different from ChatGPT? Not really, since Copilot uses the same GPT language model underneath (currently GPT-4o). However, since it is embedded into Microsoft 365, it has its own system prompt (basically instructions defined by Microsoft), which allows it to interact with the different Office applications, and its own content filter. One big difference is that Copilot does not pick up new LLMs from OpenAI right away: Microsoft needs to ensure that new models work properly with the Office ecosystem, so it takes a while before a new version of an OpenAI LLM is available in Copilot 365.

How do the Copilot Orchestrator and function calling work? We can split it up into different parts.

  1. User Input:
    • We enter a prompt or query, which is sent from Copilot Chat or from within an app.
  2. Preliminary Checks:
    • Copilot verifies the query for security and Responsible AI compliance. If unsafe, the process ends.
  3. Reasoning:
    • The orchestrator creates a step-by-step plan to address the query.
  4. Context and Tool Selection:
    • Conversation context and Microsoft Graph data are retrieved.
    • The LLM refines the query or identifies the need for additional data.
    • The orchestrator selects the best plugin/tool for the task.
  5. Function Matching:
    • The orchestrator creates a detailed prompt combining the query, context, and chosen tools.
    • The LLM selects the optimal function and required parameters.
  6. Tool Execution:
    • The orchestrator sends an API request to the tool, retrieves data, and continues processing.
  7. Result Analysis:
    • The orchestrator refines the response through multiple LLM interactions until a final response is ready.
  8. Response Delivery:
    • The orchestrator submits the response to the LLM for finalization, ensures Responsible AI compliance, logs the response, and delivers it to the user.
  9. Conversation Update:
    • The conversation state is updated to prepare Copilot for the next query.
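
The steps above can be illustrated with a toy loop. This is only a sketch of the flow as described here, not Microsoft's implementation: the function names, the blocked-word safety check, the keyword rule that stands in for the LLM, and the single hard-coded tool are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical tool registry: tool name -> callable that takes the query.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_documents": lambda query: f"documents matching '{query}'",
}

# Stand-in for the security / Responsible AI check (an assumption).
BLOCKED_WORDS = {"malware"}


@dataclass
class Conversation:
    history: List[str] = field(default_factory=list)


def is_safe(text: str) -> bool:
    return not any(word in text.lower() for word in BLOCKED_WORDS)


def select_tool(query: str) -> Optional[str]:
    # Stand-in for the LLM choosing a function: a simple keyword rule.
    return "search_documents" if "find" in query.lower() else None


def handle_prompt(query: str, conversation: Conversation) -> str:
    # Step 2: preliminary checks. Stop early if the prompt is unsafe.
    if not is_safe(query):
        return "Request blocked."
    # Steps 3-5: reasoning, context and function matching (heavily simplified).
    tool_name = select_tool(query)
    # Step 6: tool execution.
    tool_result = TOOLS[tool_name](query) if tool_name else "no tool needed"
    # Steps 7-8: build the response and check it again before delivery.
    response = f"Answer based on: {tool_result}"
    if not is_safe(response):
        return "Response blocked."
    # Step 9: update the conversation state for the next query.
    conversation.history.extend([query, response])
    return response
```

Calling handle_prompt("find the budget file", Conversation()) walks through every step, while a prompt containing a blocked word ends at step 2 without touching the conversation history.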

How does it work with regard to declarative agents and function calling?

  1. The orchestrator searches the agent’s installed plugins to find up to five best-matching functions for the user’s query.
  2. It follows this order for matching:
    • Exact match on function name (lexical).
    • Semantic match on function description.
    • Exact match on plugin name (adds all plugin functions as candidates).
    • Semantic match on plugin name (adds all plugin functions as candidates).
  3. The process continues until five candidate functions are identified.
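
The matching order above can be sketched as a series of passes that stop once five candidates are found. This is an illustration under assumptions, not the orchestrator's actual code: the Function type and find_candidates are hypothetical names, and the "semantic" match is replaced by a simple word-overlap score, where the real service presumably uses something more capable.

```python
from dataclasses import dataclass
from typing import List, Set

MAX_CANDIDATES = 5


@dataclass(frozen=True)
class Function:
    plugin: str
    name: str
    description: str


def _words(text: str) -> Set[str]:
    return set(text.lower().replace("_", " ").split())


def _similar(query: str, text: str, threshold: float = 0.2) -> bool:
    # Stand-in for a semantic match: Jaccard overlap of the word sets.
    q, t = _words(query), _words(text)
    return bool(q and t) and len(q & t) / len(q | t) >= threshold


def find_candidates(query: str, functions: List[Function]) -> List[Function]:
    query_lower = query.lower()
    # One predicate per matching step, in the order listed above.
    passes = [
        lambda f: f.name.lower() in query_lower,    # exact function name
        lambda f: _similar(query, f.description),   # "semantic" on description
        lambda f: f.plugin.lower() in query_lower,  # exact plugin name
        lambda f: _similar(query, f.plugin),        # "semantic" on plugin name
    ]
    candidates: List[Function] = []
    for matches in passes:
        for function in functions:
            if len(candidates) == MAX_CANDIDATES:
                return candidates
            if function not in candidates and matches(function):
                candidates.append(function)
    return candidates
```

Because the plugin-name passes test every function of a plugin, a plugin match adds all of that plugin's functions as candidates, up to the limit of five.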

Can I adjust the system prompt or change to another LLM in Copilot? No and no. The system prompt includes the function calls, so we cannot tamper with it.

What kind of data does Copilot have access to? The same data as you as a user. However, some data sources are not made directly available to the Semantic Index and are therefore not surfaced by Copilot: a user’s archive mailbox, group mailboxes, and shared or delegate mailboxes the user has access to.

Is my data used to train the models? No. Microsoft is not doing that, and in fact does not need to. OpenAI develops the models and collects tons of training data from ChatGPT, which is used to train and create new models that Microsoft then gets access to and uses in Copilot.

Is the model learning based upon my behaviour? No, the model is and always will be static. However, Microsoft Graph ranks the most relevant content based on its knowledge of additional signals for users and their close network. This is known as personalization in Microsoft 365, and it drives relevance for queries against the content in your organization. How this is done, and what counts as relevant, is not documented.

Why do I sometimes get worse or better results in ChatGPT? Microsoft Copilot is currently locked to a dedicated version of GPT, while ChatGPT introduces new ones rapidly. Microsoft has an exclusive partnership (currently) that allows them to use OpenAI's models in their own services, but there is a delay from when OpenAI releases a new model until Microsoft is able to use it in Microsoft 365. The reason for this delay is that Microsoft needs to ensure that function calls work with the newer models, which might not always be the case.

What is the Semantic Index? It is just an extension of Microsoft Search in Microsoft 365. By default, search in Microsoft 365 uses keywords and phrases (lexical search). When a Copilot 365 license is enabled on a tenant, Microsoft starts creating mathematical vectors of the metadata stored in Microsoft Search. Once this is in place, search becomes a hybrid search that combines keyword and vector search, which gives much higher accuracy. The vectors are not available directly from Graph Explorer, but the metadata they are built from is. In short, the Semantic Index is just vector embeddings added to the metadata in Search.
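
To make the idea of hybrid search concrete, here is a minimal sketch that blends a keyword score with a vector similarity score. Everything in it is an assumption for illustration: the three-dimensional vectors are written by hand instead of coming from an embedding model, and the equal weighting (alpha) is arbitrary. How Microsoft Search actually scores results is not documented here.

```python
import math
from typing import Dict, List


def keyword_score(query: str, text: str) -> float:
    # Lexical part: fraction of the query words that appear in the text.
    query_words = query.lower().split()
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return sum(word in text_words for word in query_words) / len(query_words)


def cosine(a: List[float], b: List[float]) -> float:
    # Vector part: cosine similarity between two embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query: str, query_vec: List[float],
                docs: Dict[str, List[float]], alpha: float = 0.5) -> List[str]:
    # Blend both scores; alpha = 1.0 would be pure keyword search.
    def score(title: str) -> float:
        return (alpha * keyword_score(query, title)
                + (1 - alpha) * cosine(query_vec, docs[title]))
    return sorted(docs, key=score, reverse=True)


# Toy data: titles with hand-made "embeddings" (hypothetical values).
docs = {
    "Quarterly sales report": [0.0, 0.1, 0.9],
    "Holiday leave guidelines": [0.8, 0.2, 0.1],
    "Vacation policy 2024": [0.9, 0.1, 0.0],
}
ranking = hybrid_rank("vacation rules", [0.85, 0.15, 0.05], docs)
```

In this toy data "Holiday leave guidelines" shares no keyword with the query, so a purely lexical search cannot tell it apart from the sales report. The vector part is what lifts it above the unrelated document.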

Can I adjust which content Microsoft Graph considers relevant, and how? No, this is done automatically. Unfortunately there are no options to adjust it.

Is Copilot 365 available through an API? No, not directly. You need to use the Direct Line API from Copilot Studio, which can then be used as a way to get access to Copilot 365 data.
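
As a rough idea of what talking to an agent over Direct Line looks like, the sketch below only builds the two HTTP requests (start a conversation, send a message) without sending them. It assumes the public Direct Line 3.0 endpoint and a token or secret you have already obtained for your Copilot Studio agent; the helper names and the user id are hypothetical, and your agent's channel settings may require a different base URL or token flow.

```python
import json
import urllib.request

# Assumption: the global Direct Line 3.0 endpoint.
DIRECT_LINE_BASE = "https://directline.botframework.com/v3/directline"


def start_conversation_request(token: str) -> urllib.request.Request:
    # POST .../conversations opens a new conversation with the agent.
    return urllib.request.Request(
        f"{DIRECT_LINE_BASE}/conversations",
        data=b"",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def send_message_request(token: str, conversation_id: str, text: str,
                         user_id: str = "user1") -> urllib.request.Request:
    # A message is posted as an activity to the conversation.
    activity = {"type": "message", "from": {"id": user_id}, "text": text}
    return urllib.request.Request(
        f"{DIRECT_LINE_BASE}/conversations/{conversation_id}/activities",
        data=json.dumps(activity).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing either request to urllib.request.urlopen would actually send it. The conversation id used in the second call comes from the JSON response to the first.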

How does it generate a reply? When you trigger a task through a prompt in Copilot 365, the task is sent to the Microsoft Copilot Orchestrator (which can be seen as a virtual assistant responsible for setting up its own tasks). The orchestrator finds relevant data from the Semantic Index, which is then sent to the Azure OpenAI Service subscription for Microsoft 365. This Azure OpenAI instance is dedicated to your tenant and, unlike a regular Azure OpenAI instance, has content filtering disabled. Also see the orchestrator steps described earlier.

Can I ensure that data is always processed in the same datacenter as my tenant? No, you cannot. Microsoft 365 Copilot calls to the LLM are routed to the closest data centers in the region, but can also be sent to other regions where capacity is available during periods of high utilization. For European Union (EU) users, traffic stays within the EU Data Boundary, while worldwide traffic can be sent to the EU and other countries or regions for LLM processing. Also, if the feature Allow web search in Copilot is enabled, Copilot can use Bing search, which is a global service.

Can I extend the data available in Copilot 365 with third-party data sources? Yes, using something called Graph Connectors. They allow the Microsoft Search feature in Microsoft 365 to also index third-party data sources, and there are a bunch of prebuilt integrations available. Note that this feature DOES NOT require Copilot 365, since it is a Microsoft Search capability. The current issue with Graph Connectors is that search only looks through the “title” and not the actual content.

How much information can Copilot 365 handle? With GPT-4o it has a 128k-token context window, which corresponds to close to 96,000 words. However, the system prompt from Microsoft (including the function calling) takes up some of this space, so the total usable size is unknown. In addition, if the prompt or response also pulls information from Microsoft Graph, that context takes up space as well. A common answer is therefore: it depends.

  • When summarizing content or referencing content while using Copilot to create a draft, keep the total of all your referenced content to around 80,000 words or less.
  • Asking Copilot questions about the document works best if the document is less than about 7,500 words.
  • Rewrite works best on a document that is less than about 3,000 words.
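
The 96,000-word figure follows from the common rule of thumb of roughly 0.75 English words per token. The small sketch below shows that arithmetic and how the usable space shrinks once the system prompt and Graph grounding data are subtracted. The token counts passed in the example are purely hypothetical, since the real sizes are not published.

```python
CONTEXT_WINDOW_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75  # rule of thumb for English text, not an exact figure


def words_available(system_prompt_tokens: int, grounding_tokens: int) -> int:
    # Whatever the system prompt and grounding data use is no longer
    # available for the user's own prompt and the response.
    remaining = CONTEXT_WINDOW_TOKENS - system_prompt_tokens - grounding_tokens
    return max(0, int(remaining * WORDS_PER_TOKEN))


# The full window with nothing else in it: 128,000 * 0.75 words.
full_window = words_available(0, 0)

# Hypothetical overhead: 8,000 tokens of system prompt, 20,000 of Graph data.
with_overhead = words_available(8_000, 20_000)
```

With nothing else in the window the result is 96,000 words. With the made-up overhead above it drops to 75,000, which is why the honest answer is "it depends".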

Can administrators see how I use Copilot? Yes. Using eDiscovery and even the Purview unified logs they can see how you interact with Copilot, both prompts and results (including which data was used).

Does Copilot 365 work in a TS/RDS/Citrix/VMware environment? No, Copilot in the Microsoft 365 apps does not work with shared computer licensing.

What is the difference between Microsoft 365 Copilot Chat and Microsoft 365 Copilot?

Aspect | Microsoft 365 Copilot Chat | Microsoft 365 Copilot
Licensing requirements | Entra account + Microsoft 365 subscription | Entra account + Microsoft 365 subscriptions (E3, E5, A3, A5, Business Standard & Premium)
Target group | Commercial customers | Enterprise, business, and education
Subscription fee | No additional cost | Add-on subscription
Access to organizational content | No (only via custom agents) | Yes
Data sources | Copy + paste of text, uploaded files, web content | Microsoft Graph, Microsoft 365 apps data, web content (optional)
Security & Compliance | Enterprise Data Protection (EDP) | EDP plus others like SAM
UI | Accessible via web, desktop apps like Teams and Outlook, iOS and Android | Same as M365 Copilot Chat plus integrated within Microsoft 365 apps
Agents | Yes, both free and pay-as-you-go | Yes, Microsoft Copilot Studio
Web search integration | Yes | Optional (controlled by user/admin)
Deployment | Tasks like pin, enable web search, network, group policy for Copilot in Edge | More tasks because more tools and more data to protect (organizational data)
GPT | GPT-4o | GPT-4o

Is Bing Chat or Copilot in Windows different? Yes, there is a big difference. Regular Copilot only interacts with the LLM or Bing Search to handle prompts and replies. Copilot for Windows is another beast entirely and runs its own local LLM; more about that service can be found in the blog post "How does Windows Recall work?" on msandbu.org.

What is enterprise data protection (EDP)? It means that prompts and replies are only handled within a secure environment at Microsoft. It also means that you cannot use external plugins, which ensures that data will not leave the environment. The EU Data Boundary doesn’t apply to web search queries.

Can I extend the features available in Copilot 365, such as building a custom integration? Yes, this is called a declarative agent.

What are the restrictions on file size for the different filetypes?

File type | File size limit
.doc | 150 MB
.docx | 512 MB
.html | 150 MB
.pdf | 512 MB
.ppt | 150 MB
.pptx | 512 MB
.txt | 150 MB
.xls | 150 MB
.xlsx | 150 MB
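
If you want to check files before pointing users at Copilot, the limits above fit in a small lookup. This is a sketch with hypothetical names. It also assumes that 1 MB means 1024 * 1024 bytes, which the list above does not specify.

```python
from pathlib import PurePath

MB = 1024 * 1024  # assumption: binary megabytes

# Limits copied from the list above, in MB.
SIZE_LIMITS_MB = {
    ".doc": 150, ".docx": 512, ".html": 150,
    ".pdf": 512, ".ppt": 150, ".pptx": 512,
    ".txt": 150, ".xls": 150, ".xlsx": 150,
}


def is_within_limit(filename: str, size_bytes: int) -> bool:
    # Look up the limit by file extension, ignoring case.
    suffix = PurePath(filename).suffix.lower()
    if suffix not in SIZE_LIMITS_MB:
        raise ValueError(f"No limit listed for file type {suffix!r}")
    return size_bytes <= SIZE_LIMITS_MB[suffix] * MB
```

For example, a 200 MB file passes as report.docx but fails as report.doc, because the older format has the lower limit.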

How can I debug an agent that is not working properly? Just type -developer on in the chat window. You will get a reply stating that developer mode is activated (in my screenshot the reply is in Norwegian).

How does web search work? When web search is enabled, Microsoft 365 Copilot and Copilot Chat analyze the prompt to identify whether web information can improve the response. If needed, Copilot creates a search query and sends it to Bing.

  • The search query is not your full prompt; instead, it’s a few key words based on what you asked. (Name of person, location or content)
  • What’s NOT included in the query:
    • The full prompt (unless it’s very short).
    • Microsoft 365 files such as emails, documents, or uploaded files.
    • Entire web pages or PDFs summarized by Copilot in Edge.
    • Any personal details like username, domain, or tenant ID.

NOTE: The EU Data Boundary doesn’t apply to web search queries.

Why am I not seeing full information in the Copilot Dashboard? If you have fewer than 50 licenses assigned in your tenant, you will only see the Readiness page, the Adoption page with tenant-level metrics only, and Sentiment with tenant-level survey results. You need at least 50 licenses assigned to see the full details. Also note that Readiness data in the dashboard represents data over the previous 28 days, and there is a four-day data delay from the current date.
