BigQuery and Sheets: A perfect match to simplify data governance

Sheets makes it easy to manage access and data rights in a cloud data warehouse

Cloud platforms for data warehousing and business intelligence are fast gaining traction. Unlike traditional server-driven, on-premises environments, cloud options make it fast and easy to leverage immense computing power at reasonable costs. Flexibility and scalability are hallmarks of platforms like Google BigQuery.

Organizations also value the open approach to data that cloud platforms facilitate. This “democratization” of data enables far more users—including those with little programming knowledge—to access insights in company data. As a result, companies can operate more nimbly, with stronger information available for decision-making across the organization.

Yet with all these positives, many still hesitate to move their analytics to the cloud.

Why? A perceived lack of security and control. Data governance and user access are two of the most common concerns associated with data warehousing in the cloud.

In reality, cloud platforms provide numerous ways to maintain control:

  •  They facilitate access control by individual user, groups of users and type of data.
  • They simplify the process of adding and managing metadata.
  • They provide extensive audit trails of data modifications.
  • They even offer a popular “rewind” capability that lets you roll back data changes to a previous version.

Sheets operates as a control panel

 

How can you put controls like these to use? Let’s start with the Google framework. These three applications are central to managing security and metadata:

  • BigQuery​, the scalable, serverless enterprise data warehouse.
  • Cloud Composer, ​the workflow orchestration service that manages data transformations. Cloud Composer leverages the popular Apache Airflow platform and Python programming language.
  • Sheets,​ the application commonly known for its spreadsheet and financial modeling capabilities, used here as a control panel to manage data governance, metadata and user access.

Each has its own powerful capabilities. Together, all three work in harmony to orchestrate robust data insights and strong data controls for the organization.

 

What’s missing? For one, the cumbersome ETL tools and processes associated with more traditional data warehouses. In a cloud environment, where processing capacity is unlimited, there’s no need to transform data before its import. Instead, individual users can transform data within BigQuery, based on their specific business needs. This shortens time to action and gives analysts and other users more flexibility.

Another difference is the use of Sheets, Google’s spreadsheet application. When combined with Cloud Composer and BigQuery, it can operate as the administration layer of the data warehouse. Sheets stores the metadata and access controls. Cloud Composer is set up to check Sheets on a daily (or more frequent) basis, then update the data warehouse with appropriate descriptions, rights and permissions.

It’s an innovative and user-friendly combination that greatly streamlines security, access and data governance, compared to traditional approaches. It also reduces the need for administrators with vast programming expertise, thanks to its intuitive interface. As a result, there’s less manual work and stronger security for the organization.

Best practices for security and data governance

Whether it’s in the cloud or an on-premises data warehouse, strong security and data governance revolve around a few core principles. As you create the governance playbook for your cloud data warehouse, concentrate your efforts in these three areas:

  • Data classification. ​
    When you categorize your data, you can create levels of business rules and access rights. Typical categories include restricted, sensitive and open. Sensitive data might include bank account numbers or protected health information (PHI) which are subject to industry regulations. Data can reside in multiple categories.
  • Metadata. ​
    This​ ​data about your data helps analysts and business users understand all the information available in the data warehouse and apply it correctly. Strong metadata leads to more accurate analyses. Metadata includes attributes like the data’s source, its purpose, any transformations that have occurred within the warehouse and its update frequency.
  • Access control lists (ACLs). ​
    ​ACLs​ ​assign user permissions across the data warehouse. Common permissions include read-only vs. read/write access, view-only vs. import vs. export capabilities, as well as access to specific data sets by data type, or by individual, role or user group.

What varies by platform are how you implement and control these elements. In a BigQuery data warehouse, a Sheets interface can handle all the above items in a user-friendly, automated manner.

Using Sheets eliminates the need to make adjustments or monitor access in the data warehouse itself, saving time and streamlining the process. A single Sheets document can track and control access across at all levels of the data warehouse, including fields, tables, datasets, and projects. It’s a smart way to enjoy the benefits of a cloud environment, without taxing in-house staff or administrators.

Ready for more? ​

Read our ​case study​ and learn how a major healthcare supplier saved time and improved security by using Google Sheets to manage access to its BigQuery environment.

Want to learn more about cloud data warehousing and how it can help your business better prepare for the future?
Download the report