Improving the security posture of your Google Workspace environment - Data Classification

Marcin_Milewski

 Editor's Note: This is the third installment of Marcin's series on securing your Workspace environment. Be sure to check out his articles discussing DNS records and BeyondCorp Enterprise.


In today's world, enterprise companies generate and manage huge amounts of data every day. We share documents with coworkers and external parties, and those documents contain data ranging from need-to-know information to critical confidential files. In some cases data can get out of control and create serious challenges. One effective strategy for managing collaboration data is data classification.

In this article we will explore the data classification features available in Google Workspace environments, walk through the functionality that helps us classify data at scale, and look at example use cases. Before we do that, it’s important to understand why you might want to classify data in the first place.

Why should we classify data?

Data classification is an important part of managing and protecting information in any modern organization. Companies, however, may have different reasons for classifying their data. Let’s take a look at some of them.

Security and Compliance
When data is classified and categorized (e.g. by sensitivity level), it is much easier to control. Companies can implement security measures appropriate to each category; for example, sensitive data such as customer credit card numbers can be given stronger security controls than less critical data. Categorization also aids compliance with regulations such as GDPR and HIPAA, which mandate strict handling and protection of sensitive information.

Data Management & Operational Efficiency
Classified data is easier to manage: end users can search for specific information based on its category or sensitivity, while a litigation team can apply retention rules based on certain classification fields.

Companies can also make better decisions when they can identify patterns. Once data is classified, auditing capabilities help us learn more about which data is used and how.

Risk Mitigation
Data classification plays a crucial role in risk management. By identifying which data is sensitive, companies can apply different security policies and measures to protect against data breaches or over-sharing.

Data can also be classified as the result of a data loss prevention (DLP) action. When PII is found, for example, a rule can prevent users from certain activities, such as sharing the file outside of the company.

Data Governance
Effective data governance relies on a structured approach to data management. Data classification is a foundational element of that approach, ensuring that data-handling policies and procedures are applied consistently across the organization.

Google Workspace features for Data Classification

In this section we will look at the functionality that can be used for data classification within Google Workspace environments, explore different use cases, and cover the features that help admins classify data at scale.

Google Drive Labels


The main Google Workspace feature that enables users to classify files is Drive Labels.

As administrators, we define the classification labels that can be applied to files stored in Drive. The primary purpose of labels is to store file metadata. Labels can be simple, such as a single-value tag that stores department information, or they can have many structured fields that include selections, dates, numbers or categories, depending on your company's needs.

Drive labels have several use cases, including:

  • Data classification to follow an information governance strategy. With a sensitivity label, we can restrict access to files marked as “Confidential” by using labels as conditions in DLP rules.
  • Applying policy to items. To meet compliance requirements such as handling PII, we can automatically label documents or apply retention policies to labeled files.
  • Improved search. End users can find files more easily by searching on label fields.
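
For the search use case, label fields can also be queried programmatically through the `q` parameter of the Drive API's `files.list` method. The sketch below only builds the query strings. The label, field, and choice IDs are hypothetical placeholders, and the query syntax shown is an assumption that should be checked against the current Drive API documentation.

```python
def has_label_query(label_id: str) -> str:
    """Build a query matching any file that carries the given label."""
    return f"'labels/{label_id}' in labels"


def selection_equals_query(label_id: str, field_id: str, choice_id: str) -> str:
    """Build a query matching files whose selection field is set to one choice."""
    return f"labels/{label_id}.{field_id} = '{choice_id}'"


# Hypothetical IDs, for illustration only; real IDs come from your own labels.
LABEL_ID = "SENSITIVITY_LABEL_ID"
FIELD_ID = "SENSITIVITY_FIELD_ID"
CHOICE_ID = "CONFIDENTIAL_CHOICE_ID"

query = selection_equals_query(LABEL_ID, FIELD_ID, CHOICE_ID)
print(query)
```

The resulting string would be passed as `q` in a `files.list` call, alongside whatever other filters the search needs.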

Different types of classification

Companies classify data in different ways, and Workspace offers flexibility in how classification is applied. Depending on the requirements, we can use one method or a combination of several.

 


Manual Classification
Users who have been given access to labels can classify files manually by applying either a badged label or a metadata label. Labels help them find files in specific categories more easily and quickly. End users might be required to pick an option for every new document; in such cases, they see a notification banner.

DLP Classification
Data loss prevention (DLP) rules can automatically label Drive files based on their findings (e.g. PII detected). Workspace DLP offers a variety of predefined content detectors, as well as the option to use custom detectors (e.g. regex-based).
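
To give a feel for what a regex-based detector does, here is a minimal sketch that finds candidate credit card numbers and filters them with a Luhn checksum. It is an illustration only: Python's `re` module stands in for the detection engine, and the pattern, function names, and sample text are assumptions rather than anything Workspace DLP uses internally.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit-only card numbers found in the text that pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "Invoice paid with 4111 1111 1111 1111; order ref 1234 5678 9012 3456."
print(find_card_numbers(sample))  # prints ['4111111111111111']
```

The checksum step shows why a pattern alone is often not enough: the order reference has the same shape as a card number but fails validation, so it is not reported.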

Default Classification
Admins can set a policy that automatically applies labels to files created in certain departments. In such a configuration every newly created file is classified; for example, files owned by finance teams get a sensitive label by default. The end user can adjust these labels later if we allow it.

Programmatic Classification
Drive Labels offers an API that can be used to classify data at scale. Customers use this API to apply classification labels in bulk or to integrate the feature with third-party solutions.
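
As a rough sketch of bulk labeling, the snippet below builds a request body for the Drive API v3 `files.modifyLabels` method and applies it to a list of files. It assumes `service` is a Drive v3 client from the `google-api-python-client` library that has already been authorized; the label, field, and choice IDs are placeholders you would look up for your own labels.

```python
def build_modify_labels_body(label_id: str, field_id: str, choice_id: str) -> dict:
    """Build a files.modifyLabels request body that sets one selection field."""
    return {
        "labelModifications": [
            {
                "labelId": label_id,
                "fieldModifications": [
                    {"fieldId": field_id, "setSelectionValues": [choice_id]}
                ],
            }
        ]
    }


def apply_label(service, file_ids, label_id, field_id, choice_id):
    """Apply the same label value to every file in file_ids.

    `service` is assumed to be a Drive v3 client, e.g. one created with
    googleapiclient.discovery.build("drive", "v3", credentials=...).
    """
    body = build_modify_labels_body(label_id, field_id, choice_id)
    for file_id in file_ids:
        service.files().modifyLabels(fileId=file_id, body=body).execute()
```

A production script would also want error handling and rate limiting, which are left out here to keep the request shape visible.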

AI Classification 
Customers who use Gemini Enterprise and the AI Security add-on can benefit from AI classification. This feature uses artificial intelligence to automatically label sensitive content. The customer goes through an initial training period, during which the AI model is created and learns the organization’s criteria for the content to label. AI classification then classifies files at scale across all licensed users (both new and existing files).

 


Example use case

This example configuration prevents end users from sharing documents classified as ‘Sensitive’ or ‘Confidential’ outside of the organization. Sensitivity labels can be applied manually, automatically, or as a result of a DLP rule detection.

1. Navigate to Apps > Google Workspace > Drive and Docs > Labels > Manage Labels.


2. Select ‘Badged label’ and configure the options as required. In this example we let end users pick the appropriate document sensitivity. When the label is ready, publish it and adjust the permissions as needed in the right corner of the screen.


3. When the label is ready, we can configure the data protection rule. Navigate to Rules and select Create Rule > Data protection. Name the rule and select its scope.

4. Select Google Drive under Apps.  


5. In the condition fields, select the previously created Drive label and the field options that you want to restrict.


6. Select the action to block external sharing. Define the alert severity and notifications. Save the Rule.

 


 

7. To test the blocking mechanism, navigate to Google Docs, create a test document and apply the previously created test label.

8. When you try to share the document with an external recipient, the DLP rule blocks the action.

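
The decision that the rule from steps 3 to 6 makes can be modeled in a few lines. This is only a conceptual sketch of the condition and action configured above, not how Workspace evaluates rules internally; the domain `example.com` and all names in the code are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

RESTRICTED_LEVELS = {"Sensitive", "Confidential"}  # label options restricted by the rule
ORG_DOMAIN = "example.com"  # hypothetical organization domain


@dataclass
class ShareAttempt:
    sensitivity: Optional[str]  # value of the sensitivity label, None if unlabeled
    recipient: str  # email address the file is being shared with


def is_external(recipient: str, org_domain: str = ORG_DOMAIN) -> bool:
    """Treat any address outside the organization's domain as external."""
    return recipient.rsplit("@", 1)[-1].lower() != org_domain


def is_blocked(attempt: ShareAttempt) -> bool:
    """Block only when a restricted label meets an external recipient."""
    return attempt.sensitivity in RESTRICTED_LEVELS and is_external(attempt.recipient)


print(is_blocked(ShareAttempt("Confidential", "partner@other.org")))      # True
print(is_blocked(ShareAttempt("Confidential", "colleague@example.com")))  # False
print(is_blocked(ShareAttempt(None, "partner@other.org")))                # False
```

The last case is worth noting: a rule that keys on a label does nothing for unlabeled files, which is why the manual, default, and DLP-driven labeling methods described earlier matter.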

In conclusion, Google Workspace provides various data classification options to help organizations protect sensitive information and ensure compliance with regulations. By understanding the different classification labels and using the available tools, organizations can effectively manage and control access to their data.

Lastly, remember that data classification is an ongoing process that requires regular review and updates to ensure continued protection. By leveraging Google Workspace's data classification capabilities, organizations can safeguard their sensitive information, enhance security, and maintain trust with their stakeholders.
