Introduction to the new Azure Purview service

The amount of data stored and available to businesses has grown substantially since the year 2000. This growth has left many companies unable to track their various data sets or create effective governance strategies. These gaps in knowledge cause problems for groups looking to provide value to the business. For example:

  • Knowledge of data sources depends on informal, shared knowledge within the business.
  • Data is not visible to users unless they are part of a specific team.
  • Creating documentation for data sources can be difficult, and users may not trust the data.
  • Understanding the risk in an organisation’s data requires people to monitor data for sensitive content.

There are several software applications and vendors that allow a business to catalogue its data assets and close this knowledge gap. This catalogue software is typically costly and requires companies to buy licences for their employees. The cost acts as a barrier for small and medium-sized enterprises (SMEs) looking to invest in this technology.

Microsoft has a new service in preview, Azure Purview, that provides unified data governance to help organisations manage their data assets. Purview helps an organisation create an up-to-date map of the business’s data using automated discovery.

To test out the functionality provided by Azure Purview, you can create these resources yourself in your own tenancy or sign up for a free Microsoft Azure account.

How to create the service

To create an instance of the Azure Purview service, carry out the following steps:

  1. Open the Azure portal using the link https://portal.azure.com/
  2. Create a new resource group called AZ-Purview-Demo-Initials (use your own initials). For this scenario, I am using AZ-Purview-Demo-SB.
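
If you prefer a script to the portal, the resource group from step 2 can also be created with the Azure CLI command `az group create`. The sketch below only assembles that command so it can be reviewed before it is run; nothing is executed against Azure. The helper names and the `eastus` location are assumptions for illustration.

```python
import shlex
from typing import List


def resource_group_name(initials: str, prefix: str = "AZ-Purview-Demo") -> str:
    """Build the demo resource group name, e.g. AZ-Purview-Demo-SB."""
    cleaned = initials.strip().upper()
    if not cleaned.isalpha():
        raise ValueError("initials must contain letters only")
    return f"{prefix}-{cleaned}"


def group_create_command(name: str, location: str = "eastus") -> List[str]:
    """Return the Azure CLI arguments that create the resource group."""
    return ["az", "group", "create", "--name", name, "--location", location]


if __name__ == "__main__":
    command = group_create_command(resource_group_name("sb"))
    # Print the command so it can be checked before pasting it into a shell.
    print(shlex.join(command))
```

Keeping the name logic in one small function makes it easy to reuse the same naming pattern for other demo resources.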

Azure Purview

Create a new Azure resource and search for the Azure Purview service as shown.

Azure-Purview-002

Selecting the Azure Purview service brings up the following blade, where you can create the service and read the documentation.

Purview-003

Selecting Create brings up the following blade, where you provide the information needed to create the service. In this first window, select the resource group that was created earlier and the location for the service. For this demo I am using East US, as this was one of the locations available at the time of writing.

Purview-004

In the Configuration window, select the platform size and the catalog options.

Purview-005

The next option to define is the tags for the service. Tags are not mandatory to create the service, but they can be useful for tracking costs against a specific use.

Once the tag information has been entered, select Review + create, check the information you provided for the service, and select Create.
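
To make the cost-tracking idea concrete, here is a small sketch of tags expressed as key/value pairs and rendered in the `key=value` form that the Azure CLI `--tags` argument accepts. The tag names and values are made up for illustration; use whatever matches your own cost reporting.

```python
from typing import Dict, List


def format_tags(tags: Dict[str, str]) -> List[str]:
    """Render tags as sorted 'key=value' strings for the Azure CLI --tags argument."""
    formatted = []
    for key, value in sorted(tags.items()):
        if not key.strip():
            raise ValueError("tag names must not be empty")
        formatted.append(f"{key}={value}")
    return formatted


if __name__ == "__main__":
    # Hypothetical tags for the demo; these are not required by the service.
    demo_tags = {"project": "purview-demo", "cost-centre": "data-governance"}
    print(["--tags", *format_tags(demo_tags)])
```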

Purview-006

Once you select Create, the following page (blade) shows the deployment status of the service.

Purview-007

Once the service has been created, open Azure Purview Studio.

Purview-008

Capturing your first data source in Azure Purview

Once you launch Purview Studio, you will see the following IDE (Integrated Development Environment).

Purview-010

Let’s start by creating a collection. To do this, select the Sources option on the left-hand side of the studio. On the next screen, select New collection. A collection can be used to group data by business function or environment.

Provide a meaningful collection name and select Create.

Purview-011

After creating a collection, the next step is to create some data sources and ingest their metadata. Microsoft currently provides several metadata connectors, and the number will grow over time.

Before we proceed further, the Azure Purview service requires an Azure credential, and this credential needs access to the relevant data sources. Azure Purview can be configured to use an existing Azure Key Vault by going to the Management Center in your Purview instance.

Once a credential is created, a data source can be created in the studio. Select the data source as shown.

Purview-012

Purview-013

Provide the connection properties for your data source. In this scenario, I am connecting to an Azure SQL DB that is already provisioned and contains the Adventure Works database.
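
It can help to gather the connection properties in one place before filling in the form. The sketch below is only an illustrative container, not the Purview API: the class, its field names, and the sample values are all assumptions. The one fixed detail is that Azure SQL logical servers in the public cloud are addressed as `<server>.database.windows.net`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AzureSqlSource:
    """Illustrative holder for the values entered when registering the source."""

    display_name: str  # name shown for the source in the studio
    server_name: str   # logical SQL server name, without the domain suffix
    collection: str    # collection the source is registered under

    @property
    def server_endpoint(self) -> str:
        # Public-cloud Azure SQL servers are reachable at this host name.
        return f"{self.server_name}.database.windows.net"


if __name__ == "__main__":
    # Hypothetical values; replace them with your own server and collection.
    source = AzureSqlSource("AdventureWorks", "demo-sql-server", "Demo-Collection")
    print(source.server_endpoint)
```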

Purview-014

Registering a data source in this fashion will add it to the collection. 

Purview-015

Start the scan. This creates a scan job, and once the job is complete, the collected metadata can be reviewed at any time.

Purview-016

I will list the outcome of the scan in a future post.

Azure Purview Pricing

The Azure Purview service is charged per vCore-hour of scanning your sources, with usage rounded to the closest minute. This price is valid through to the end of February and while the service is in preview. While in preview, scanning of Power BI and SQL Server sources is free. For further information, refer to Azure Purview Pricing.
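
As a rough illustration of how per-vCore-hour billing adds up, the sketch below estimates the cost of a single scan. It assumes the scan duration is what gets rounded to the closest minute, and the vCore count and rate used in the example are placeholders, not published prices.

```python
def scan_cost(duration_seconds: float, vcores: int, rate_per_vcore_hour: float) -> float:
    """Estimate the cost of one scan, with duration rounded to the closest minute."""
    if duration_seconds < 0 or rate_per_vcore_hour < 0:
        raise ValueError("duration and rate must not be negative")
    if vcores <= 0:
        raise ValueError("vcores must be positive")
    # Round half up explicitly; Python's built-in round() rounds halves to even.
    billed_minutes = int(duration_seconds / 60 + 0.5)
    vcore_hours = vcores * billed_minutes / 60
    return vcore_hours * rate_per_vcore_hour


if __name__ == "__main__":
    # Placeholder figures: a 744-second scan on 32 vCores at 0.50 per vCore-hour.
    print(f"Estimated scan cost: {scan_cost(744, 32, 0.50):.2f}")
```

Check the pricing page for the actual rate and rounding rule before relying on an estimate like this.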

Next Steps

The next step is to explore the service further, build an end-to-end data governance platform, and start implementing it for a customer. I am looking forward to the future growth of the product and to seeing how it will stack up against other products such as Informatica and Collibra.
