Client-based auto-labeling recommendations for Australian Government compliance with PSPF
This article provides guidance for Australian Government organizations on client-based sensitivity auto-labeling capabilities. Its purpose is to demonstrate how auto-labeling can help to improve data security posture while adhering to requirements outlined in the Protective Security Policy Framework (PSPF) and Information Security Manual (ISM).
Auto-labeling overview details where auto-labeling is suitable in a modern Government work environment and how it helps to reduce security risks.
In the Australian Government context, client-based auto-labeling is useful for recommending labels based on:
- Sensitive content detection
- Markings applied by external organizations
- Markings applied by non-Microsoft tools
- Historical markings
- Paragraph markings
Client-based auto-labeling is configured directly within a sensitivity label's configuration. This method of auto-labeling applies to Office or online clients and interactively identifies sensitive content, notifies the user, and then either:
- Automatically applies the sensitivity label relevant to the most sensitive content detected in an item; or
- Recommends to the user that they apply the label.
PSPF Policy 8 Requirement 2 and ISM 0271 make it clear that a user should be responsible for applying classifications to items rather than an automated service. Because of this, client-based auto-labeling should be configured to provide user recommendations only:
Requirement | Detail |
---|---|
Protective Security Policy Framework (PSPF) Policy 8 Requirement 2 a.i. – Assessing sensitive and security classified information (v2018.6) | To decide which security classification to apply, the originator must assess the value, importance, or sensitivity of official information by considering the potential damage to government, the national interest, organizations, or individuals, that would arise if the information’s confidentiality was compromised. |
ISM Security Control: 0271 (June 2024) | Protective marking tools don't automatically insert protective markings into emails. |
In this example, the user commenced writing about Project Budgerigar. Detection of a Sensitive Information Type (SIT) triggered the client-based auto-labeling action, and a label recommendation appeared at the top of the email.
Client-based auto-labeling actions can be triggered based on detection of SITs (including Exact Data Match SITs) and trainable classifiers. A combination of SITs and classifiers can also be used.
Client-based auto-labeling scenarios for Australian Government
Client-based auto-labeling helps to protect sensitive information by identifying items that are under-classified. Under-classified information represents a significant risk to Australian Government. Client-based auto-labeling helps ensure that the correct label is applied and that items are marked and protected appropriately. The correct label ensures that only appropriate distribution of information is allowed.
Accurate classification helps ensure that need-to-know principles are maintained and that access to information is restricted. These concepts relate to Protective Security Policy Framework (PSPF) Policy 9 Requirement 2:
Requirement | Detail |
---|---|
PSPF Policy 9 Requirement 2 – Limiting access to sensitive and classified information and resources | To reduce the risk of unauthorized disclosure, entities must ensure access to sensitive and security classified information or resources is only provided to people with a need-to-know. |
Recommending labels based on sensitive content detection
Detecting sensitive content dynamically and recommending that users apply an appropriate label and marking helps ensure need-to-know. It also ensures that appropriate protections are in place relevant to the sensitivity of an item. Because labels are recommended rather than applied automatically, the decision remains with the user.
Client-based auto-labeling is used to increase item sensitivity, if needed. For Australian Government, the benefits are seen particularly at the high end of the sensitivity taxonomy. For OFFICIAL Sensitive and PROTECTED labels (including sublabels), organizations should compile a list of SITs and align them with appropriate labels. For example:
Label | SIT | Use |
---|---|---|
OFFICIAL Sensitive Personal Privacy | Australian Health Records Act Enhanced. This prebuilt SIT seeks to identify occurrences of: Australian Tax File Number (TFN); My Health Record; All Full Names; All Medical Terms and Conditions; Australia Physical Addresses. | Health information relating to an individual is protected under the Privacy Act and may be appropriate for labeling as 'OFFICIAL Sensitive Personal Privacy'. |
OFFICIAL Sensitive Legislative Secrecy | A 'Legislative Secrecy Keywords' custom SIT containing keywords such as 'Legislative Secrecy Warning:'. | As recommended in PSPF Policy 8, a text-based warning notice should be placed at the top and bottom of items relating to legislative information. Organizations will likely be applying these notices via document templates or a similar means. These warning notices could be used to identify items that should be marked with the 'OFFICIAL Sensitive Legislative Secrecy' label. |
PROTECTED | A codeword or list of codewords associated with initiatives that should have their information classified as PROTECTED, for example 'Project Budgerigar'. A list of keywords relating to subjects that can be considered highly sensitive and for which loss of information may result in damage or loss of confidence in Government, for example: 'data breach'; 'highly sensitive'; 'against the law'; 'code of practice'; 'breach of trust'. | A list of keywords could be used to detect items that contain information relating to a classified project, initiative, system, or application. Adding a list of topics considered sensitive to an organization to a SIT allows Microsoft 365 to prompt users to increase the sensitivity label applied to an item when the keywords are detected. Doing so helps the user to consider need-to-know and allows protections to be applied to items to prevent inappropriate distribution of information (for example, DLP, encryption, and other controls). |
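The keyword-based rows in the table above can be illustrated with a minimal Python sketch. It is not how Microsoft Purview evaluates SITs. It only shows the idea of aligning keyword lists with labels and recommending the most sensitive label that matches. The names `KEYWORD_SITS`, `CHECK_ORDER`, and `recommend_label` are hypothetical, and case-insensitive matching is an assumption of the sketch.

```python
from typing import Optional

# Keyword lists copied from the example table. A real SIT would hold
# organization-specific codewords and topics.
KEYWORD_SITS = {
    "PROTECTED": [
        "Project Budgerigar",
        "data breach",
        "highly sensitive",
        "against the law",
        "code of practice",
        "breach of trust",
    ],
    "OFFICIAL Sensitive Legislative Secrecy": ["Legislative Secrecy Warning:"],
}

# Assumption: labels are checked from most to least sensitive, so the
# first hit is the most sensitive label that applies.
CHECK_ORDER = ["PROTECTED", "OFFICIAL Sensitive Legislative Secrecy"]


def recommend_label(text: str) -> Optional[str]:
    """Return the most sensitive label whose keywords appear in text."""
    lowered = text.casefold()
    for label in CHECK_ORDER:
        keywords = KEYWORD_SITS[label]
        if any(keyword.casefold() in lowered for keyword in keywords):
            return label
    return None  # nothing detected, so no recommendation is made
```

Calling `recommend_label("Kick-off notes for Project Budgerigar")` returns `"PROTECTED"`, while text containing none of the keywords returns `None`. In both cases the user would still decide whether to accept a recommendation.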
The strategies outlined in the previous table can also be used to locate and act on sensitive information via other Microsoft 365 capabilities.
Recommendations based on external agency markings
Many of the controls discussed in this document are enacted based on the labels applied to items. Information generated externally may have text-based protective markings in place but may not have sensitivity labels relevant to your organization applied. The impact of this might be that the items aren't protected from data loss via DLP policies, and alerting may not be generated when items are saved to lower sensitivity locations.
Situations where this could occur include:
- When items are generated by other government organizations that adhere to PSPF. In these situations, the entity's markings and/or labels won't align with your own without configuration.
- When items are generated by another government organization that doesn't align, or only partially aligns, with the PSPF (for example, NSW Government).
- When items are generated and classified by foreign governments that may or may not have Australian classification equivalence.
To avoid loss of information that was generated elsewhere but for which your organization is a custodian, client-based auto-labeling can be used to recommend that equivalent labels are applied to items.
Such configurations make use of SITs to identify markings or classifications applied externally. These SITs then need to be added to the auto-labeling configuration of the relevant sensitivity labels.
Some examples of where SITs may be used to recommend labels based on markings applied externally include:
Label | SIT | Use |
---|---|---|
OFFICIAL Sensitive | OFFICIAL Sensitive Regex SIT | To identify items marked as OFFICIAL: Sensitive but without the OFFICIAL Sensitive label applied to them, including items generated by other organizations. |
PROTECTED | PROTECTED Regex SIT | To identify items marked as PROTECTED but without the PROTECTED label applied. |
OFFICIAL Sensitive | OFFICIAL Sensitive – NSW Government | Information marked with OFFICIAL Sensitive – NSW Government and received by a Federal Government organization isn't labeled by default and therefore doesn't have protections configured that align with the OFFICIAL Sensitive security classification. Marking these items as OFFICIAL Sensitive when modified by your users helps to protect the contained information. Visual markings applied by NSW Government agencies would still be present on the item, making it clear that the item was generated elsewhere (see note 1). |
OFFICIAL Sensitive Legal Privilege | OFFICIAL Sensitive – Legal (NSW Gov); OFFICIAL Sensitive – Law enforcement (NSW Gov) | This configuration would ensure that information marked with either of the NSW State Government legal-related markings is treated in line with OFFICIAL Sensitive Legal Privilege while it resides within a Federal Government environment. |
SECRET | CONFIDENTIEL UE | CONFIDENTIEL UE is a classification used by members of the European Union. Example mappings provided in PSPF Policy 7 – Security Governance for International Sharing imply that this information should be treated in line with SECRET. Detecting CONFIDENTIEL UE markings and recommending a SECRET label helps to ensure that such information can be identified and potentially removed in line with labels for information that shouldn't be placed on Microsoft 365. |
Note
1 An alternative approach might be to include an OFFICIAL Sensitive – NSW Government label within your organization's label taxonomy. This label could be published to an administrative account only, which keeps it within scope of service-based auto-labeling policies but without users having the ability to apply it to items directly. This idea is further discussed in labels for organizations with differing label taxonomies.
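The mapping of external markings to local labels described in this section can be sketched in Python as an ordered list of patterns. This is illustrative only. The exact marking text used by NSW Government or EU originators is an assumption taken from the table, the tolerance for a hyphen or an en dash is a design choice of the sketch, and `EXTERNAL_MARKINGS` and `equivalent_label` are hypothetical names.

```python
import re
from typing import Optional

# Assumption: external markings may use either a hyphen or an en dash.
DASH = r"\s*[-\u2013]\s*"

# Ordered from most to least specific so that a legal-related NSW marking
# is not caught by the general NSW Government pattern.
EXTERNAL_MARKINGS = [
    (
        re.compile(r"OFFICIAL:? Sensitive" + DASH + r"(?:Legal|Law enforcement)"),
        "OFFICIAL Sensitive Legal Privilege",
    ),
    (
        re.compile(r"OFFICIAL:? Sensitive" + DASH + r"NSW Government"),
        "OFFICIAL Sensitive",
    ),
    (re.compile(r"CONFIDENTIEL UE"), "SECRET"),
]


def equivalent_label(text: str) -> Optional[str]:
    """Return the local label to recommend for an externally marked item."""
    for pattern, label in EXTERNAL_MARKINGS:
        if pattern.search(text):
            return label
    return None
```

Ordering the patterns from most to least specific is one way to keep the two NSW legal-related markings from falling through to the general OFFICIAL Sensitive recommendation.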
Recommendations based on markings applied by non-Microsoft tools
Many Government organizations currently make, or have previously made, use of non-Microsoft tools to apply markings to files and email. These tools are configured to apply one or more of:
- X-Protective-Marking x-headers to email,
- Text-based headers and footers to email and documents,
- Subject-based email markings; and/or
- File metadata via document properties.
For organizations transitioning from non-Microsoft tools to native Microsoft Purview capabilities, these existing properties or markings can be used to determine which sensitivity label should be applied to an item.
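As an illustration of how an existing x-header marking could be mapped to a label, the following Python sketch reads the `SEC=` value from an `X-Protective-Marking` header. It is a standalone example rather than a description of how Microsoft Purview processes headers. The sample header value, the `SEC_TO_LABEL` mapping, and the function name `label_from_headers` are assumptions made for illustration.

```python
import re
from email import message_from_string
from typing import Optional

# Hypothetical mapping from the SEC= value in the x-header to a label name.
SEC_TO_LABEL = {
    "UNOFFICIAL": "UNOFFICIAL",
    "OFFICIAL": "OFFICIAL",
    "OFFICIAL:Sensitive": "OFFICIAL Sensitive",
    "PROTECTED": "PROTECTED",
}

# Capture everything after SEC= up to a comma, bracket, or whitespace.
SEC_PATTERN = re.compile(r"SEC=([^,\]\s]+)")


def label_from_headers(raw_message: str) -> Optional[str]:
    """Return the label implied by X-Protective-Marking, if present."""
    message = message_from_string(raw_message)
    header = message.get("X-Protective-Marking")
    if header is None:
        return None
    match = SEC_PATTERN.search(header)
    if match is None:
        return None
    return SEC_TO_LABEL.get(match.group(1))
```

The sketch deliberately ignores `ACCESS=` and `CAVEAT=` values, so it would need extending before it could distinguish sublabels such as Personal Privacy.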
Important
Client-based auto-labeling complements service-based auto-labeling, and both should be used together.
Service-based auto-labeling doesn't detect content or label email residing within user mailboxes. Client-based auto-labeling is used to ensure that markings applied to pre-existing items are maintained when they're forwarded or replied to. For example, consider a pre-existing PROTECTED email with a text-based PROTECTED marking applied to it but no sensitivity label. When a user attempts to forward it or reply to it, client-based auto-labeling can identify the item as PROTECTED based on the existing markings and then recommend that the user applies the PROTECTED label to the item.
The following client-based auto-labeling example configurations ensure that items containing an existing marking have the correct sensitivity label applied. These configurations also identify markings applied previously by legacy non-Microsoft classification tools and markings on items generated by external PSPF-compliant organizations:
Label | SIT requirement | Regular Expression |
---|---|---|
OFFICIAL Sensitive | SIT that detects the following marking syntax: OFFICIAL Sensitive; OFFICIAL: Sensitive; [SEC=OFFICIAL:Sensitive] | OFFICIAL[:\- ]\s?Sensitive(?!(?:\s\|\/\/\|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)(?!(?:\s\|\/\/\|\s\/\/\s)NATIONAL[ -]CABINET) |
PROTECTED | SIT that detects the following marking syntax: PROTECTED; [SEC=PROTECTED] | PROTECTED(?!,\sACCESS=)(?!(?:\s\|\/\/\|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)(?!(?:\s\|\/\/\|\s\/\/\s)NATIONAL[ -]CABINET)(?!(?:\s\|\/\/\|\s\/\/\s)CABINET) |
Note
For more examples of SIT syntax for Australian Government, see the comprehensive list of SIT syntax to detect protective markings in Australian Government.
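The two regular expressions in the table above can be checked against sample markings before they're added to a SIT. The following Python sketch is one way to do that. It assumes that each `\|` in the table is an escaped pipe character used so that the expression fits in a table cell, and therefore writes a plain `|` in the pattern. Python's `re` engine is used here as a stand-in and may not behave identically to the engine used by Microsoft Purview. The helper name `detects` is hypothetical.

```python
import re

# Negative lookaheads stop the base OFFICIAL Sensitive pattern from firing
# on markings that carry an Information Management Marker or caveat.
OFFICIAL_SENSITIVE = re.compile(
    r"OFFICIAL[:\- ]\s?Sensitive"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)"
    r"(?!(?:\s|\/\/|\s\/\/\s)NATIONAL[ -]CABINET)"
)

PROTECTED = re.compile(
    r"PROTECTED(?!,\sACCESS=)"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)"
    r"(?!(?:\s|\/\/|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)"
    r"(?!(?:\s|\/\/|\s\/\/\s)NATIONAL[ -]CABINET)"
    r"(?!(?:\s|\/\/|\s\/\/\s)CABINET)"
)


def detects(pattern: re.Pattern, text: str) -> bool:
    """Return True when the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


# Base markings are detected in subject-style and x-header-style syntax.
print(detects(OFFICIAL_SENSITIVE, "OFFICIAL: Sensitive"))         # True
print(detects(PROTECTED, "[SEC=PROTECTED]"))                      # True

# Markings with an IMM or caveat are left to the more specific SITs.
print(detects(OFFICIAL_SENSITIVE, "OFFICIAL: Sensitive//Legal-Privilege"))  # False
print(detects(PROTECTED, "PROTECTED//CABINET"))                   # False
```

Running sample markings from your own environment through a check like this can help confirm that each base SIT excludes the sublabel markings it's meant to exclude.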
Recommendations based on historical markings
Government marking requirements do change periodically, as occurred in October of 2018 when markings (for example, CONFIDENTIAL and For Official Use Only (FOUO)) were discontinued. Government organizations are likely to have a significant amount of information residing on their systems with these historical markings applied.
Handling these historical markings is typically outside the scope of new Microsoft Purview deployments. However, if your organization wishes to bring historical markings into scope, they can be split into two categories: those that have a modern equivalent and those that don't. PSPF Policy 8 Annex E provides a full list of historical classifications and markings along with their current handling requirements.
An easy option for historical markings that align with a modern equivalent is to configure auto-labeling to recommend application of the equivalent label when these items are modified. With this configuration, the user experience is:
- When the user opens a legacy email and attempts to reply to or forward it, the historical marking is detected. A label recommendation is provided to the user for the new email.
- When a legacy file is opened, modified, and saved by a user, the Office client detects the previous marking and prompts the user to apply a modern equivalent to the item before saving.
The previous actions help to ensure that appropriate controls are applied to historical items.
Tip
Australian Government Records management requirements may be relevant when dealing with historical markings. If an item has been declared as a record, then it is locked, preventing editing. This means that a new marking won't apply as it would result in a change to the item, which affects the retention period for the item. However, if an item with a historical marking is saved as a new item (for example, used as a template), then recommending a label based on the historical marking can be useful.
The following are examples of how SITs based on historical markings could be configured and used with client-based auto-labeling to suggest a label based on a historical marking:
Label | SIT | Use |
---|---|---|
OFFICIAL Sensitive | For Official Use Only SIT containing the following keywords: For Official Use Only; For-Official-Use-Only; FOUO. X-IN-CONFIDENCE SIT containing the following keywords: X-IN-CONFIDENCE. | Client-based auto-labeling could be used to identify legacy content with these historical markings applied and suggest a modern alternative on new or edited items based on the legacy items. |
SITs and DLP policies should be configured to check for historical markings and ensure that relevant controls are applied to these items. This ensures that an item with a historical marking that is attached to an email and sent externally has modern labels and associated controls applied to it.
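The split between historical markings that have a modern equivalent and those that don't can be sketched in Python as follows. The keywords for For Official Use Only and X-IN-CONFIDENCE come from the table in this section. Treating CONFIDENTIAL as having no modern equivalent is an assumption made for illustration and should be verified against PSPF Policy 8 Annex E. `HISTORICAL_SITS` and `review_historical` are hypothetical names.

```python
import re
from typing import List, Optional, Tuple

# (SIT name, detection pattern, label to recommend or None).
HISTORICAL_SITS = [
    (
        "For Official Use Only",
        re.compile(r"For[- ]Official[- ]Use[- ]Only|\bFOUO\b"),
        "OFFICIAL Sensitive",
    ),
    (
        "X-IN-CONFIDENCE",
        re.compile(r"\bX-IN-CONFIDENCE\b"),
        "OFFICIAL Sensitive",
    ),
    # Assumption: no modern equivalent, so no label is recommended and the
    # item is only flagged for handling under its historical requirements.
    ("CONFIDENTIAL", re.compile(r"\bCONFIDENTIAL\b"), None),
]


def review_historical(text: str) -> List[Tuple[str, Optional[str]]]:
    """List each historical marking found with its recommended label."""
    return [
        (name, label)
        for name, pattern, label in HISTORICAL_SITS
        if pattern.search(text)
    ]
```

A result such as `[("CONFIDENTIAL", None)]` indicates that a historical marking was detected but that no label recommendation should be made from it.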
Recommending labels based on paragraph markings
Some Government organizations make use of paragraph markings in documents. A set of SITs can be created to detect these paragraph markings and recommend a label for the item. Because an item carries a single label, the recommended label should reflect the highest paragraph marking that the item contains.
To achieve this, the following SITs could be used:
- An OFFICIAL keyword SIT detecting the (O) paragraph marking and recommending that the OFFICIAL label is applied when detected.
- An OFFICIAL Sensitive keyword SIT detecting the (O:S) paragraph marking and recommending that the OFFICIAL Sensitive label is applied when detected.
- A PROTECTED keyword SIT detecting the (P) paragraph marking and recommending that the PROTECTED label is applied when detected.
- A SECRET keyword SIT detecting the (S) paragraph marking and recommending that the SECRET label is applied when detected.
The SECRET marking SIT is a useful example for identifying information that shouldn't be stored within the platform. Checking for items containing such markings may allow you to identify or prevent a data breach. For more information on this concept, see labels for information that shouldn't be placed on Microsoft 365.
Note
Straightforward keyword SITs like these may generate false positives. For example, if (P) appears in a document or email without being intended as a paragraph marking, the service recommends that the user marks the item as PROTECTED. For this reason, SITs that identify paragraph markings should be carefully considered before implementation to determine whether false positives are likely to be generated.
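The paragraph-marking approach, including the false positive risk described in the note, can be demonstrated with a short Python sketch. It assumes case-sensitive matching of the marking text and recommends the label for the highest marking found. It isn't a representation of how Microsoft Purview evaluates keyword SITs. `PARAGRAPH_MARKINGS`, `SENSITIVITY_ORDER`, and `recommend_from_paragraphs` are hypothetical names.

```python
from typing import Optional

# Paragraph markings and the label each one recommends.
PARAGRAPH_MARKINGS = {
    "(O)": "OFFICIAL",
    "(O:S)": "OFFICIAL Sensitive",
    "(P)": "PROTECTED",
    "(S)": "SECRET",
}

# Labels ordered from least to most sensitive.
SENSITIVITY_ORDER = ["OFFICIAL", "OFFICIAL Sensitive", "PROTECTED", "SECRET"]


def recommend_from_paragraphs(text: str) -> Optional[str]:
    """Return the label for the highest paragraph marking in the text."""
    found = [
        label
        for marking, label in PARAGRAPH_MARKINGS.items()
        if marking in text  # case-sensitive substring match
    ]
    if not found:
        return None
    return max(found, key=SENSITIVITY_ORDER.index)


document = "(O) Introduction.\n(P) Budget details.\n(O:S) Staff list."
print(recommend_from_paragraphs(document))  # PROTECTED

# False positive: (P) is a form option here, not a paragraph marking.
print(recommend_from_paragraphs("Select option (P) on the form."))  # PROTECTED
```

The second call shows why such SITs need care: a plain substring match can't tell a paragraph marking from any other use of the same characters.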
Example client-based auto-labeling configuration
These examples are based on the use of SITs and classifiers to identify protective markings or sensitive information. Once identified, an appropriate label is then recommended to the user. These are boilerplate Australian Government examples, and organizations should work to develop their own SITs to identify organization-specific information:
Label | Suggested SITs | Regular expression example |
---|---|---|
UNOFFICIAL | UNOFFICIAL Regex SIT intended to detect an UNOFFICIAL marking. | UNOFFICIAL |
OFFICIAL | OFFICIAL Regex SIT intended to detect an OFFICIAL marking. OFFICIAL Paragraph Marking SIT with case-sensitive keywords. | Regex: (?<!UN)OFFICIAL. Paragraph marking keyword: (O) |
OFFICIAL Sensitive (Category) | N/A | - |
OFFICIAL Sensitive | OFFICIAL: Sensitive Regex SIT intended to detect variations of OFFICIAL Sensitive markings without inclusion of Information Management Markers (IMMs) or caveats. SITs relating to processes or systems where disclosure of information may result in medium business impact and limited damage to an individual, organization, or government. Prebuilt SITs of: All Credential Types; Credit Card Number. 'OFFICIAL: Sensitive Paragraph Marking' SIT with case-sensitive keywords. | Regex: OFFICIAL[:\- ]\s?Sensitive(?!(?:\s\|\/\/\|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)(?!(?:\s\|\/\/\|\s\/\/\s)NATIONAL[ -]CABINET). Paragraph marking keyword: (O:S) |
OFFICIAL Sensitive Personal Privacy | OFFICIAL: Sensitive Personal Privacy Regex SIT intended to detect the marking. Prebuilt SITs of: Australia Bank Account Number; Australia Driver's License; Australia Medical Account Number; Australia Passport Number; Australia Tax File Number. | OFFICIAL[:\- ]\s?Sensitive(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Personal[ -]Privacy |
OFFICIAL Sensitive Legal Privilege | OFFICIAL: Sensitive Legal Privilege Regex SIT intended to detect the marking. Prebuilt trainable classifier of: Legal Affairs. | OFFICIAL[:\- ]\s?Sensitive(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Legal[ -]Privilege |
OFFICIAL Sensitive Legislative Secrecy | OFFICIAL: Sensitive Legislative Secrecy Regex SIT intended to detect the marking. | OFFICIAL[:\- ]\s?Sensitive(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Legislative[ -]Secrecy |
OFFICIAL Sensitive NATIONAL CABINET | OFFICIAL: Sensitive NATIONAL CABINET Regex SIT intended to detect the marking. | OFFICIAL[:\- ]\s?Sensitive(?:\s\|\/\/\|\s\/\/\s\|,\sCAVEAT=SH:)NATIONAL[ -]CABINET |
PROTECTED (Category) | N/A | - |
PROTECTED | PROTECTED Regex SIT intended to detect the marking. PROTECTED Paragraph Marking SIT with case-sensitive keywords. Other keyword SITs relating to processes or systems where disclosure of information may result in high business impact and damage to the national interest and individuals. | Regex: PROTECTED(?!,\sACCESS=)(?!(?:\s\|\/\/\|\s\/\/\s)[Pp]ersonal[- ][Pp]rivacy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egislative[- ][Ss]ecrecy)(?!(?:\s\|\/\/\|\s\/\/\s)[Ll]egal[- ][Pp]rivilege)(?!(?:\s\|\/\/\|\s\/\/\s)NATIONAL[ -]CABINET)(?!(?:\s\|\/\/\|\s\/\/\s)CABINET). Paragraph marking keyword: (P) |
PROTECTED Personal Privacy | PROTECTED Personal Privacy Regex SIT intended to detect the marking. | PROTECTED(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Personal[ -]Privacy |
PROTECTED Legal Privilege | PROTECTED Legal Privilege Regex SIT intended to detect the marking. | PROTECTED(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Legal[ -]Privilege |
PROTECTED Legislative Secrecy | PROTECTED Legislative Secrecy Regex SIT intended to detect the marking. | PROTECTED(?:\s\|\/\/\|\s\/\/\s\|,\sACCESS=)Legislative[ -]Secrecy |
PROTECTED NATIONAL CABINET | PROTECTED NATIONAL CABINET Regex SIT intended to detect the marking. | PROTECTED(?:\s\|\/\/\|\s\/\/\s\|,\sCAVEAT=SH:)NATIONAL[ -]CABINET |
PROTECTED CABINET | PROTECTED CABINET Regex SIT intended to detect the marking. | PROTECTED(?:\s\|\/\/\|\s\/\/\s\|,\sCAVEAT=SH:)CABINET |
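To see how several of these expressions behave together, the following Python sketch evaluates a subset of the table against sample markings and lists every label whose pattern fires. As with any check outside the product, Python's `re` engine is only a stand-in for the engine used by Microsoft Purview. The pipe characters are written unescaped on the assumption that `\|` in the table is table-cell escaping. `PATTERNS` and `matching_labels` are hypothetical names.

```python
import re
from typing import List

# A subset of the table, keyed by label.
PATTERNS = {
    "UNOFFICIAL": r"UNOFFICIAL",
    # The lookbehind stops OFFICIAL from matching inside UNOFFICIAL.
    "OFFICIAL": r"(?<!UN)OFFICIAL",
    "OFFICIAL Sensitive Personal Privacy": (
        r"OFFICIAL[:\- ]\s?Sensitive"
        r"(?:\s|\/\/|\s\/\/\s|,\sACCESS=)Personal[ -]Privacy"
    ),
    "PROTECTED Personal Privacy": (
        r"PROTECTED(?:\s|\/\/|\s\/\/\s|,\sACCESS=)Personal[ -]Privacy"
    ),
    "PROTECTED CABINET": r"PROTECTED(?:\s|\/\/|\s\/\/\s|,\sCAVEAT=SH:)CABINET",
}

COMPILED = {label: re.compile(regex) for label, regex in PATTERNS.items()}


def matching_labels(text: str) -> List[str]:
    """Return every label whose pattern is found in the text."""
    return [label for label, pattern in COMPILED.items() if pattern.search(text)]


print(matching_labels("UNOFFICIAL"))
# ['UNOFFICIAL']
print(matching_labels("[SEC=PROTECTED, ACCESS=Personal-Privacy]"))
# ['PROTECTED Personal Privacy']
print(matching_labels("OFFICIAL: Sensitive//Personal-Privacy"))
# ['OFFICIAL', 'OFFICIAL Sensitive Personal Privacy']
```

The last result shows that the OFFICIAL pattern also fires on an OFFICIAL: Sensitive marking. This is consistent with the recommendation being based on the most sensitive content detected in an item, and it's worth confirming in testing that the more sensitive label is the one recommended.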