Automating Asset Grouping in Azure Purview

The team was working on a data governance requirement: grouping assets of the same type into a logical collection in Azure Purview, the team's data governance tool. Doing this manually was tedious, so the process needed to be automated. This blog describes the solution used to automate it.

The team used PyApacheAtlas, a Python library for performing the most common Azure Purview operations programmatically, to automate the movement of assets into collections. The steps below show how to move assets into collections.

Step-1: Establish a Connection

Establish a connection between PyApacheAtlas and Azure Purview using either Azure CLI or service principal authentication.


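from pathlib import Path

from azure.identity import DefaultAzureCredential
from pyapacheatlas.core import PurviewClient

# Replace the placeholder with the name of your Purview account.
REFERENCE_NAME_PURVIEW = "<Purview Name>"

PROJ_PATH = Path(__file__).resolve().parent

CREDS = DefaultAzureCredential()

CLIENT = PurviewClient(account_name=REFERENCE_NAME_PURVIEW, authentication=CREDS)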
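When a service principal is used through DefaultAzureCredential, its details are read from environment variables. The sketch below is a small pre-flight check, assuming client-secret authentication and the variable names AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET read by the azure-identity library. The helper name missing_service_principal_vars is hypothetical and not part of any library.

```python
import os

# Variables read by azure-identity for a service principal that
# authenticates with a client secret (assumed authentication mode).
REQUIRED_SP_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_SP_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_service_principal_vars()
    if missing:
        print("Service principal variables not set:", ", ".join(missing))
    else:
        print("Service principal variables are set.")
```

If any variable is reported as missing, DefaultAzureCredential may still succeed through another identity, such as an Azure CLI login.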


DefaultAzureCredential: a default credential capable of handling most Azure SDK authentication scenarios for applications deployed to Azure. The identity it uses depends on the environment. When an access token is needed, it requests one using the following identities in turn, stopping when one provides a token:

  1. A service principal configured by environment variables.
  2. WorkloadIdentityCredential, when the Azure workload identity webhook sets the environment variable configuration.
  3. An Azure managed identity.
  4. On Windows only: a user who has signed in with a Microsoft application, such as Visual Studio. See azure.identity.SharedTokenCacheCredential for more details.
  5. The identity currently logged in to the Azure CLI, Azure PowerShell, or the Azure Developer CLI.
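The "try each identity in turn, stop at the first token" behaviour can be pictured with a short sketch. This is only an illustration of the fallback pattern, not the azure-identity implementation; the names CredentialUnavailable and first_available_token and the three fake sources are hypothetical.

```python
class CredentialUnavailable(Exception):
    """Raised by a source that cannot provide a token in this environment."""


def first_available_token(sources):
    """Try each (name, get_token) pair in order; return the first success."""
    errors = []
    for name, get_token in sources:
        try:
            return name, get_token()
        except CredentialUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise CredentialUnavailable("; ".join(errors) or "no sources configured")


def _unavailable(reason):
    """Build a token getter that always reports itself as unavailable."""
    def get_token():
        raise CredentialUnavailable(reason)
    return get_token


# Hypothetical sources standing in for the real credential types.
sources = [
    ("environment", _unavailable("variables not set")),
    ("managed identity", _unavailable("no identity endpoint")),
    ("azure cli", lambda: "fake-token"),
]
chosen, token = first_available_token(sources)
print(chosen)  # azure cli
```

The ordering matters: a developer machine with only an Azure CLI login falls through to the last source, while a deployed application with environment variables set stops at the first.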

PurviewClient provides communication between your application and the Azure Purview service. It removes the need to know the endpoint URL; only the Purview account name is required.

Step-2: Move a Specific Asset to a Collection Using Its GUID

Below is a function that uses PyApacheAtlas to move an asset to a specified collection based on the asset's GUID.


def move_asset_to_collection(asset_guid, collection_name):
    # asset_guid: list of entity GUIDs; collection_name: the collection's friendly name.
    result = CLIENT.collections.move_entities(guids=asset_guid, collection=collection_name)
    return result


The CLIENT.collections.move_entities method moves one or more entities, identified by their GUIDs (globally unique identifiers), to the specified collection.

Its parameters are a list of GUIDs and the collection's friendly name, typically a six-letter pseudo-random string such as “kd2cbh”, which can be found in the Purview portal.
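Because the move call expects a list of GUIDs, a small wrapper can normalise a single GUID into a list and send large lists in slices. The sketch below is one possible approach; the helper names and the batch size of 100 are assumptions chosen for illustration, not a documented Purview limit.

```python
def as_guid_list(guids):
    """Accept one GUID string or an iterable of GUIDs; return a list."""
    if isinstance(guids, str):
        return [guids]
    return list(guids)


def move_in_batches(move_fn, guids, collection_name, batch_size=100):
    """Call move_fn(batch, collection_name) once per slice of GUIDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    guid_list = as_guid_list(guids)
    results = []
    for start in range(0, len(guid_list), batch_size):
        batch = guid_list[start:start + batch_size]
        results.append(move_fn(batch, collection_name))
    return results
```

With the function from this step, a call might look like move_in_batches(move_asset_to_collection, guids, "kd2cbh"), which returns one result per batch.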

Step-3: Move Assets to a Collection Based on Their Type

Below is a function that uses PyApacheAtlas to move all assets of a given type to a specified collection.


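def get_all_entities_same_type(type_name, collection_name):
    try:
        # Browse the catalog for entities of the given type.
        search_results = CLIENT.discovery.browse(entityType=type_name)
        # Collect the GUID of every entity in the response.
        entity_guids = [result['id'] for result in search_results['value']]
        result = move_asset_to_collection(entity_guids, collection_name)
        return result
    except Exception as e:
        # On failure, the exception is returned to the caller rather than raised.
        return e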


The CLIENT.discovery.browse method executes a browse search in Purview for the given entity type against the /catalog/api/browse endpoint.

Its parameter is entityType (string): the entity type to browse as the root-level entry point. This must be a valid Purview built-in or custom type.
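A single browse call may return only one page of results, so a type with many assets could need several calls. The sketch below pages through results, assuming the browse function also accepts limit and offset keyword arguments and returns a dictionary with a 'value' list whose items carry an 'id', as in the function above. The helper name collect_guids_by_type and the page size are hypothetical.

```python
def collect_guids_by_type(browse_fn, type_name, page_size=100, max_pages=1000):
    """Page through browse results and return every 'id' found."""
    guids = []
    for page in range(max_pages):
        response = browse_fn(
            entityType=type_name,
            limit=page_size,
            offset=page * page_size,
        )
        items = response.get("value", [])
        guids.extend(item["id"] for item in items if "id" in item)
        # A short (or empty) page means there is nothing left to fetch.
        if len(items) < page_size:
            break
    return guids
```

Under these assumptions, collect_guids_by_type(CLIENT.discovery.browse, type_name) would gather the GUIDs to pass on to the move function. The max_pages cap is a safety stop in case the service keeps returning full pages.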

Conclusion

This automation helped us group assets into collections efficiently and removed the need for manual intervention.
