How does Microsoft 365 Copilot protect my business data?

Microsoft 365 Copilot is an AI-powered productivity tool that, along with other generative AI tools, represents a quantum leap in what ‘everyday’ technology can help us achieve. Tools like it are being adopted into enterprise technology stacks across the world, and executives and technology professionals with a security and privacy bent are starting to probe into the specifics of how it works and how it handles the data your organization’s users may be feeding into it. At a recent community event focused on AI in the Microsoft platform, a session attendee asked: “How does Microsoft 365 Copilot protect my business data?” In this blog post I take a closer look and formulate an answer based on Microsoft’s published documentation.

Microsoft 365 Copilot connects LLMs to your organizational data, and accesses ‘content and context’ through Microsoft Graph within your organization’s Microsoft 365 tenant. It can generate responses anchored in your organizational data, such as user documents, emails, calendar, chats, meetings, and contacts. That’s the ‘content’ part. It then combines this content with a user’s working ‘context’, such as the meeting you’re in right now, or the email exchanges you’ve had with someone else on a topic, or perhaps even the chat conversations you may have had over Teams rather than over email.
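To make the ‘content’ plus ‘context’ idea concrete, here is a toy sketch of how a grounded prompt might be assembled. This is purely illustrative: the function name, parameters, and prompt layout are hypothetical and are not Microsoft’s implementation.

```python
# Illustrative only: a toy view of combining 'content' (retrieved
# organizational data) with 'context' (what the user is doing right now)
# into a single grounded prompt. All names here are hypothetical.

def build_grounded_prompt(user_prompt: str, content: list, context: dict) -> str:
    """Combine retrieved organizational content with the user's working
    context before the prompt is sent to the LLM."""
    grounding = "\n".join(f"- {item}" for item in content)
    situation = f"Current activity: {context.get('activity', 'unknown')}"
    return (
        f"{situation}\n"
        f"Relevant organizational data:\n{grounding}\n"
        f"User asks: {user_prompt}"
    )

prompt = build_grounded_prompt(
    "Summarize where we landed on the budget",
    ["Email thread: FY25 budget draft", "Doc: Budget_v3.docx"],
    {"activity": "Teams meeting: budget review"},
)
```

The point of the sketch is simply that the response is anchored in both what the user can see and what the user is doing at that moment.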

According to Microsoft’s published documentation, here are some quick facts:

  • Microsoft 365 Copilot only surfaces organizational data to which individual users have at least view permissions. The devil is in the details here: you need to have set up the permission models available in Microsoft 365 services, such as SharePoint Online, so that the right users and groups have the right access to the right content within your organization’s Microsoft 365 environment.
  • Prompts, responses, and data accessed through Microsoft Graph are not used to train foundation LLMs (including those used by Microsoft 365 Copilot). When you enter prompts into Microsoft 365 Copilot, the information contained within your prompts, the data they retrieve, and the generated responses remain within the Microsoft 365 service boundary.
  • Microsoft 365 Copilot uses Azure OpenAI services for processing, not OpenAI’s publicly available services (such as the version of ChatGPT publicly available).
  • Azure OpenAI doesn’t cache customer content or Copilot-modified prompts for Microsoft 365 Copilot.
  • In terms of content filtering, while abuse monitoring (which includes human review of content) is technically available in Azure OpenAI, Microsoft 365 Copilot services do not leverage this capability. Instead, Azure OpenAI Service includes a content filtering system that works by running both the input prompt and the response through classification models designed to identify and block the output of harmful content.
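The first bullet above is worth dwelling on. Conceptually, permission trimming during grounding works like the sketch below: only documents the requesting user (or one of their groups) can view are eligible to ground a response. This is a conceptual model with made-up names, not Microsoft’s actual code.

```python
# Conceptual model only: structures and names are illustrative,
# not Microsoft's actual implementation of Semantic Index grounding.
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    body: str
    allowed_readers: set  # users/groups with at least view permission

def ground_prompt(user: str, groups: set, documents: list) -> list:
    """Return only the documents this user may view, mirroring how
    grounding is restricted to content the user is authorized to access."""
    principals = {user} | groups
    return [d for d in documents if principals & d.allowed_readers]

docs = [
    Document("Q3 Forecast", "...", {"finance-team"}),
    Document("All-hands notes", "...", {"everyone"}),
]

visible = ground_prompt("alice@contoso.com", {"everyone"}, docs)
```

If Alice is not in the finance-team group, the Q3 forecast simply never enters the grounding data for her prompt, no matter how she phrases the question.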

Great, so with the details of how it orchestrates under the hood out of the way, let’s get into the specifics of how Microsoft 365 Copilot protects your organizational data. Essentially, as the first bullet above notes, it relies on the permissions model within your Microsoft 365 tenant to prevent ‘data leaks’ between users, groups, and tenants. This means:

  • Microsoft 365 Copilot presents only data that an individual user can access, using the same underlying data-access controls as other Microsoft 365 services. The Semantic Index honors the user’s identity-based access boundary, so the grounding process is restricted to content that the individual user is authorized to access.
  • If your organization has deployed Microsoft Purview Information Protection and it encrypts data, Microsoft 365 Copilot honors the usage rights granted to the user. This encryption can be applied either through sensitivity labeling or through Information Rights Management (IRM).
  • As for the risk of your organization’s data ‘bleeding’ into another’s, Microsoft 365 isolation controls achieve logical isolation of one tenant’s content from another through Microsoft Entra authorization and role-based access control.
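The second bullet, honoring usage rights on encrypted content, can be modeled the same way. The sketch below is a hypothetical illustration of the idea (the rights map and function are invented for this post, not the Purview API): even if a labeled document exists in the tenant, a user without at least a view right on it gets nothing.

```python
# Conceptual sketch, not the Microsoft Purview API: usage rights on a
# labeled, encrypted document gate what Copilot can surface to a user.

ENCRYPTED_DOC = {
    "label": "Confidential",
    "usage_rights": {  # hypothetical per-user rights map
        "alice@contoso.com": {"VIEW", "EDIT"},
        "bob@contoso.com": set(),  # label applied, no rights granted
    },
}

def can_surface(doc: dict, user: str) -> bool:
    """Copilot honors usage rights: a user needs at least a view right
    on encrypted content before it can be used for grounding."""
    return "VIEW" in doc["usage_rights"].get(user, set())
```

Whether the encryption was applied through a sensitivity label or through IRM, the effect is the same: the usage rights travel with the content, and Copilot respects them.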

In a nutshell, all of this points to the importance of having a well-defined permissions model set up within your organization’s Microsoft 365 tenant. It remains of paramount importance that only the users whose job functions require access to specific information are granted permissions to it. Setting up a proper permissions model is only half of what you should be ensuring, though. From an Organizational Change Management (OCM) standpoint, it always helps to keep reinforcing to your users the importance of good permissions hygiene, and to minimize org-wide sharing to just the content that is relevant for wider consumption.

Thanks for reading, and please reach out if you have a question or just want to chat more!