Hopp til hovedinnhold

Setup and administration for PostgreSQL gateway

Use the PostgreSQL gateway to ingest data into Cognite Data Fusion (CDF) from popular ETL tools. This service provides a PostgreSQL interface to CDF and the CDF resource types appear as tables in the PostgreSQL database. You can ingest data directly into CDF resources, like assets, events, and datapoints and to CDF RAW.

When should you use the PostgreSQL gateway?

Consider using the PostgreSQL gateway if:

  • You're integrating a new data source that can be accessed through Azure Data Factory or other ETL tool that supports writing to PostgreSQL. The PostgreSQL gateway can be used as a sink in ADF.
  • You have a previously built an extractor that can push data to PostgreSQL, but not to CDF.

Consider other solutions if:

  • You need very high performance especially for ingestion of RAW rows or time series datapoints, in the order of tens of thousands rows/points per second (10-50k/s as a ballpark figure).
NOTE
  • Microsoft's Azure Data Factory is the officially supported ETL tool.
  • PostgreSQL gateway is only intended for ingestion. Querying data from CDF for analysis and visualization is not supported.

This article explains how a Microsoft Entra ID (formerly Azure Active Directory) administrator can control access to ingesting data into CDF from your existing ETL tools, such as Azure Data Factory, that support writing to PostgreSQL, using the CDF PostgreSQL gateway.

When you have registered the PostgreSQL gateway, users can sign in with their organizational ID to integrate data in a CDF project.

NOTE

To perform the steps below, you need to be an administrator of Microsoft Entra ID.

Before you start

Make sure you have registered the Cognite API and the CDF portal application in Microsoft Entra ID and set up Microsoft Entra ID and CDF groups to control access to CDF data.

Step 1: Register an app in Microsoft Entra ID to use with PostgreSQL gateway

  1. Sign in to the Azure portal as an admin.

  2. If you have access to multiple tenants, use the Directory + subscription filter in the top menu to select the tenant in which you want to register an application.

  3. Search for and select Microsoft Entra ID.

  4. Under Manage, select App registrations > New registrations.

  5. In the Register an application window, enter the app name, and then select Register.

  6. Copy and make a note of the Application (client) ID. This value is required to create user credentials for the PostgreSQL gateway.

  7. Under Manage, select Certificates & secrets > New client secret.

  8. Enter a client secret description and an expiry time, and then select Add.

  9. Copy and make a note of the client secret in the Value field.

    NOTE

    Make sure you copy this value now. This value will be hidden after you leave this page.

Step 2: Create a group in Microsoft Entra ID and add the registered app as its member

  1. Open the overview window in Microsoft Entra ID and select Manage > Groups.

  2. Create a group, read more here.

  3. Open the group. Under Manage, select Members > Add members.

  4. Find the app you created above and click Select.

  5. Return to the overview, and then copy and make a note of the Object Id.

  1. Sign in to the CDF portal application as an admin.

  2. In the top menu, select Manage & Configure > Manage access.

  3. In the Access management window, select Groups > Create new group.

  1. In the Create a new group window, enter a unique name for the group.

  2. Add relevant capabilities. The minimum requirement is to add the project:list and groups:list capabilities. You also need to add read and write capabilities for the CDF resources you want to ingest data into. For instance, if you're ingesting assets, add asset:read and asset:write.

    informasjon

    If you revoke the capabilities in the CDF group, you also revoke access for the PostgreSQL gateway.

  3. Link the CDF group to a Microsoft Entra ID (ME-ID) group:

    1. In the Source ID field, enter the Object Id for the ME-ID group exactly as it exists in ME-ID.

    2. In the Source name field, enter the name of the group in Microsoft Entra ID.

  4. Select Create.

Step 4: Create user credentials for the PostgreSQL gateway

  1. Send a POST request to this server https://skywarp.{cluster}.cognite.ai/create.

    cluster is where your CDF instance is installed. If you don't know the cluster name, contact Cognite support.

  2. Get the Bearer token from Microsoft Entra ID for authentication. To get a bearer token from Postman, see the illustration below.

  3. Specify these OIDC credentials in the request body.

{
"auth":
{
"token_url": "https://login.microsoftonline.com/<YOUR_ME-ID_TENANT_ID>/oauth2/v2.0/token",
"client_id": "<YOUR_ME-ID_CLIENT_ID>",
"client_secret": "<YOUR_ME-ID_CLIENT_SECRET>",
"scope": "<BASE URL>/.default"
},
"project": "<YOUR_CDF_PROJECT_NAME>"
}

Where:

  • token_url contains the ID of your Microsoft Entra ID tenant. To find your tenant ID, see How to find tenant ID.
  • client_id is the Application (client) ID you copied in Step 1.6.
  • client_secret is the Client Secret you copied in Step 1.9.
  • scope is the base URL + .default of the cluster where your CDF instance is installed. The format is https://{{cluster}}.cognitedata.com/.default.
  • project is the CDF project name.

The service returns a username and password with the same CDF capabilities as the group you created in CDF. Make sure to keep these, as they are required when you set up the gateway connection in your ETL tool.

NOTE

Make sure you copy the username and password. If you lose the password, it can't be recovered.

Using Postman

Open the PostgreSQL collection in Postman. To update the authorization, navigate to Authorization tab in the collection overview.

  • Select Type as OAuth 2.0 and Add auth data to as Request Headers.

  • Set the Header Prefix to Bearer.

  • Select Configure New Token and specify these configuration options:

    • Enter a Token Name.

    • Select the Grant Type as Client Credentials.

    • Enter the Access Token URL as https://login.microsoftonline.com/{{tenant-id}}/oauth2/v2.0/token. Replace the tenant-id.

    • Input the Client ID and Client Secret obtained from the previous steps.

    • The Scope is $baseUrl/$scope where $baseUrl is https://{{cluster}}.cognitedata.com/api/v1/projects/{{project}} and $scope is set to .default.

tips

While using a scope for the first time, the admin has to define the scope explicitly. The admin must then consent to use this scope for the authorization process.

  • Select Client Authentication as Send as Basic Auth header.
  • Select Get New Access Token > Proceed > Use Token.

You have configured a new token using Client Credentials grant type.

Here's what a user credentials request would look like on the Body tab:

Client credentials

Step 5: Set up and connect your ETL tool to the PostgreSQL gateway

  1. Configure your ETL tool as a pipeline between the source system and the PostgreSQL gateway.

You'll find detailed information in the documentation provided by your tool vendor. For example, you can find the Azure Data Factory documentation here.

Troubleshooting

User configuration error

If the connection between the source and sink tests successfully in ADF, but the pipeline fails with a user configuration error, try these solutions:

  • The Write method field on the Sink tab in ADF is set to Bulk insert.

  • The group for the destination CDF resource type, for instance, assets, is set up with correct capabilities in Manage & Configure > Access management in CDF.