Set up extraction pipelines with remote configuration files
You can set up the CDF extraction pipelines to use versioned extractor configuration files stored in the cloud. To deploy an extractor, you supply a minimal local configuration file that contains the sign-in credentials, and the extractor pulls the remaining configuration from the cloud. You can create the configuration files in the cloud with a continuous integration system such as GitHub Actions, or directly with the CDF API.
Before you start
Make sure the extractor has the extractionconfigs:WRITE capability to access CDF.
Configure extractors with GitHub Actions
- Configure your extractor with a minimal configuration file. Refer to the extractor documentation for details. All the Cognite extractors have a similar configuration. This is an example for the Cognite DB extractor:
    type: remote
    cognite:
      host: ${BASE_URL}
      project: ${PROJECT}
      idp-authentication:
        client-id: ${CLIENT_ID}
        secret: ${CLIENT_SECRET}
        token-url: ${TOKEN_URL}
        scopes:
          - ${BASE_URL}/.default
      extraction-pipeline:
        external-id: db-extractor-pipeline

  At startup, the extractor attempts to read the configuration files from the extraction pipeline with the external ID set in extraction-pipeline.external-id (db-extractor-pipeline in this example). The extractor continues to check for updates every few minutes.
- Create the configuration files directly with the CDF API, or set up a continuous integration pipeline in GitHub Actions.
  To use the cognitedata/upload-config-action GitHub Action, create a GitHub workflow:

    name: Update extractor configuration
    on: push
    jobs:
      deploy-configs:
        runs-on: ubuntu-latest
        name: Deploy Configs
        steps:
          - uses: actions/checkout@v4
          - name: Get commit message
            id: commitmsg
            # Flatten the message to one line; set-output is deprecated in favor of $GITHUB_OUTPUT
            run: echo "commitmessage=$(git log --format=%B -n 1 ${{ github.event.after }} | tr '\n' ' ')" >> "$GITHUB_OUTPUT"
          - name: Deploy
            uses: cognitedata/upload-config-action@v1
            with:
              base-url: ${{ secrets.BASE_URL }}
              token-url: ${{ secrets.TOKEN_URL }}
              cdf-project-name: ${{ secrets.PROJECT }}
              client-id: ${{ secrets.CLIENT_ID }}
              client-secret: ${{ secrets.CLIENT_SECRET }}
              root-folder: 'root_dir/'
              deploy: 'true'
              revision-message: '${{ steps.commitmsg.outputs.commitmessage }}'
- Place the configuration files in the folder set by root-folder (root_dir/ in the workflow above). The configuration file name, without the .yml extension, must be identical to the external ID of the extraction pipeline. For instance, db-extractor-pipeline.yml. The extractor finds and loads the configuration file at startup.
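As a rough illustration of the direct-API route and of the file naming rule, the Python sketch below walks a folder of .yml files, treats each file name without its extension as the extraction pipeline external ID, and builds one request body per file. The helper names build_config_payloads and config_endpoint are hypothetical. The endpoint path (/api/v1/projects/{project}/extpipes/config) and the body fields (externalId, config, description) are assumptions that should be checked against the CDF API reference before use. The sketch makes no network calls.

```python
from pathlib import Path


def build_config_payloads(root_folder, description):
    """Return one request body per .yml file found directly in root_folder.

    Assumption: the body fields externalId, config, and description match
    what the CDF API expects for a new configuration revision.
    """
    payloads = []
    for path in sorted(Path(root_folder).glob("*.yml")):
        payloads.append(
            {
                # File name without the .yml extension = pipeline external ID
                "externalId": path.stem,
                # The remote configuration is sent as plain YAML text
                "config": path.read_text(encoding="utf-8"),
                # Free-text revision message, for example the commit message
                "description": description,
            }
        )
    return payloads


def config_endpoint(base_url, project):
    """Build the URL for creating a configuration revision (assumed path)."""
    return f"{base_url.rstrip('/')}/api/v1/projects/{project}/extpipes/config"
```

Each payload could then be sent as an authenticated POST request to the URL returned by config_endpoint, using an OAuth bearer token obtained with the same client credentials as in the minimal configuration file. This is not necessarily how upload-config-action works internally; it only mirrors the naming convention described above.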