Configure the SAP extractor
To configure the SAP extractor, you must create a configuration file. The file must be in YAML format.
You can use the sample minimal configuration file included with the extractor packages as a starting point for your configuration settings.
The configuration file contains the global parameter version, which holds the version of the configuration schema. This article describes version 1.
You can set up extraction pipelines to use versioned extractor configuration files stored in the cloud.
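A minimal configuration might look like the following sketch. The project name and gateway URL are placeholders, and each section is described in the rest of this article.

```yaml
# Version of the configuration schema described in this article
version: 1

cognite:
  # CDF project to load data into, see the Cognite section
  project: my-cdf-project

sap:
  # SAP connection details and OData endpoints, see the SAP section
  gateway_url: https://my-sap-gateway.example.com
```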
Logger
Use the optional logger section to set up logging to a console or files.
Parameter | Description |
---|---|
console | Set up console logger configuration. See the Console section. |
file | Set up file logger configuration. See the File section. |
Console
Use the console subsection to enable logging to a standard output, such as a terminal window.
Parameter | Description |
---|---|
level | Select the verbosity level for console logging. Valid options, in decreasing verbosity levels, are DEBUG , INFO , WARNING , ERROR , and CRITICAL . |
File
Use the file subsection to enable logging to a file. The files are rotated daily.
Parameter | Description |
---|---|
level | Select the verbosity level for file logging. Valid options, in decreasing verbosity levels, are DEBUG , INFO , WARNING , ERROR , and CRITICAL . |
path | Insert the path to the log file. |
retention | Specify the number of days to keep logs for. The default value is 7 days. |
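For example, a logger section with both destinations enabled might look like this sketch; the log file path and the verbosity levels are illustrative values.

```yaml
logger:
  # Log INFO and above to the terminal
  console:
    level: INFO
  # Log DEBUG and above to a daily-rotated file, kept for 7 days
  file:
    level: DEBUG
    path: logs/sap-extractor.log
    retention: 7
```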
Cognite
Use the cognite section to describe which CDF project the extractor will load data into and how to connect to the project.
Parameter | Description |
---|---|
project | Insert the CDF project name. This is a required parameter. |
host | Insert the base URL of the CDF project. The default value is https://api.cognitedata.com. |
api-key | We've deprecated API key authentication. |
idp-authentication | Insert the credentials for authenticating to CDF using an external identity provider. You must enter either an API key or use IdP authentication. |
Identity provider (IdP) authentication
Use the idp-authentication subsection to enable the extractor to authenticate to CDF using an external identity provider, such as Azure AD.
Parameter | Description |
---|---|
client-id | Enter the client ID from the IdP. This is a required parameter. |
secret | Enter the client secret from the IdP. This is a required parameter. |
scopes | List the scopes. This is a required parameter. |
resource | Insert the resource parameter to include in token requests. This is an optional parameter. |
token-url | Insert the URL to fetch tokens from. You must enter either a token URL or an Azure tenant. |
tenant | Enter the Azure tenant. You must enter either a token URL or an Azure tenant. |
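A sketch of the cognite section using IdP authentication is shown below. The project, client ID, secret, scope, and tenant values are placeholders for your own IdP settings.

```yaml
cognite:
  project: my-cdf-project
  host: https://api.cognitedata.com
  idp-authentication:
    client-id: my-client-id
    secret: my-client-secret   # store secrets securely, not in plain text
    scopes:
      - https://api.cognitedata.com/.default
    # Use either tenant (Azure AD) or token-url, not both
    tenant: my-azure-tenant
```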
Extractor
Use the optional extractor section to add tuning parameters.
Parameter | Description |
---|---|
mode | Set the execution mode. Options are single or continuous . Use continuous to run the extractor in continuous mode, executing the OData queries defined in the endpoints section. The default value is single . |
upload-queue-size | Enter the size of the upload queue. The default value is 50 000 rows. |
parallelism | Insert the number of parallel queries to run. The default value is 4 queries. |
state-store | Set to true to configure state store. The default value is no state store, and the incremental load is deactivated. See the State store section. |
chunk_size | Enter the number of rows to be extracted from SAP OData on every run. The default value is 1000 rows, as recommended by SAP. |
delta_padding_minutes | Extractor internal parameter to control the incremental load padding. Do not change. |
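As an illustration, an extractor section tuned for continuous runs might look like the sketch below; apart from mode, the values shown are the documented defaults.

```yaml
extractor:
  mode: continuous          # run the OData queries for the configured endpoints continuously
  upload-queue-size: 50000
  parallelism: 4
  chunk_size: 1000          # rows fetched from SAP OData per run, as recommended by SAP
```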
State store
Use the state store subsection to save extraction states between runs. Use this if data is loaded incrementally. We support multiple state stores, but you can only configure one at a time.
Parameter | Description |
---|---|
local | Local state store configuration. See the Local section. |
raw | RAW state store configuration. See the RAW section. |
Local
Use the local section to store the extraction state in a JSON file on a local machine.
Parameter | Description |
---|---|
path | Insert the file path to a JSON file. |
save-interval | Enter the interval in seconds between each save. The default value is 30 seconds. |
RAW
Use the RAW section to store the extraction state in a table in the CDF staging area.
Parameter | Description |
---|---|
database | Enter the database name in the CDF staging area. |
table | Enter the table name in the CDF staging area. |
upload-interval | Enter the interval in seconds between each save. The default value is 30 seconds. |
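The sketch below shows both state store variants with placeholder paths and table names. Only one of local or raw can be configured at a time, so keep the variant that applies and remove the other.

```yaml
state-store:
  # Option 1: keep the extraction state in a local JSON file
  local:
    path: states.json
    save-interval: 30
  # Option 2: keep the extraction state in a CDF staging (RAW) table
  # raw:
  #   database: sap_extractor_state
  #   table: extraction_states
  #   upload-interval: 30
```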
Metrics
Use the metrics section to describe where to send performance metrics for remote monitoring of the extractor. We recommend sending metrics to a Prometheus pushgateway, but you can also send metrics as time series in the CDF project.
Parameter | Description |
---|---|
push-gateways | List the Pushgateway configurations. See the Pushgateways section. |
cognite | List the Cognite metrics configurations. See the Cognite section. |
Pushgateways
Use the pushgateways subsection to define a list of metric destinations, each with the following schema:
Parameter | Description |
---|---|
host | Enter the address of the host to push metrics to. This is a required parameter. |
job-name | Enter the value of the exported_job label to associate metrics with. This separates several deployments on a single pushgateway, and should be unique. This is a required parameter. |
username | Enter the credentials for the pushgateway. This is a required parameter. |
password | Enter the credentials for the pushgateway. This is a required parameter. |
clear-after | Enter the number of seconds to wait before clearing the pushgateway. When this parameter is present, the extractor will stall after the run is complete before deleting all metrics from the pushgateway. The recommended value is at least twice that of the scrape interval on the pushgateway. This is to ensure that the last metrics are gathered before the deletion. |
push-interval | Enter the interval in seconds between each push. The default value is 30 seconds. |
Cognite
Use the cognite subsection to send metrics as time series to the CDF project configured in the cognite main section above. Only numeric metrics, such as Prometheus counters and gauges, are sent.
Parameter | Description |
---|---|
external-id-prefix | Insert a prefix for all time series used to represent metrics for this deployment. This creates a scope for the set of time series created by these metrics, and the prefix should be unique per deployment across the entire project. This is a required parameter. |
asset-name | Enter the name of the asset to attach to time series. This will be created if it doesn't already exist. |
asset-external-id | Enter the external ID for the asset to create if the asset doesn't already exist. |
push-interval | Enter the interval in seconds between each push. The default value is 30 seconds. |
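A combined metrics section might look like the following sketch. The pushgateway host, credentials, asset names, and external ID prefix are placeholders.

```yaml
metrics:
  push-gateways:
    - host: https://prometheus-pushgateway.example.com
      job-name: sap-extractor-prod   # must be unique per deployment
      username: metrics-user
      password: metrics-password
      clear-after: 120               # at least twice the pushgateway scrape interval
      push-interval: 30
  cognite:
    external-id-prefix: sap_extractor_metrics_
    asset-name: SAP extractor
    asset-external-id: sap_extractor_asset
    push-interval: 30
```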
SAP
Use the SAP section to enter the mandatory SAP parameters.
Parameter | Description |
---|---|
gateway_url | Insert the SAP NetWeaver Gateway URL. This is a required parameter. |
client | Enter the SAP client number. This is a required parameter. |
username | Enter the SAP username to connect to the SAP NetWeaver Gateway. This is a required parameter. |
password | Enter the password to connect to the SAP NetWeaver Gateway. This is a required parameter. |
certificates | Insert the certificates needed for authentication towards the SAP instance. This is an optional parameter. See the Certificates section. |
endpoints | List the SAP OData endpoints. Each endpoint corresponds to a valid SAP OData service. See the Endpoints section. |
Certificates
Use the certificates subsection to specify the certificates to be used for authentication towards SAP instances.
There are three certificates needed to perform the authentication: the certificate authority (ca_cert), the public key (public_key), and the private key (private_key).
Please check this documentation on how to generate the three certificates from a .p12 certificate file, if needed.
When setting up certificate authentication, note that all three certificates are needed and they must be placed in the same folder where the extractor will be running.
Parameter | Description |
---|---|
ca_cert | Enter the name of the CA certificate file. |
public_key | Enter the name of the public key file. |
private_key | Enter the name of the private key file. |
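A sketch of the sap section with certificate authentication is shown below. The gateway URL, client number, credentials, and certificate file names are placeholders, and the certificate files are assumed to sit in the folder the extractor runs from. The endpoints list is covered in the next section.

```yaml
sap:
  gateway_url: https://my-sap-gateway.example.com
  client: "100"
  username: SAP_USER
  password: sap-password
  certificates:
    ca_cert: ca.pem          # certificate authority file
    public_key: cert.pem     # public key file
    private_key: key.pem     # private key file
```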
Endpoints
Use the endpoints subsection to specify the SAP OData endpoints.
Parameter | Description |
---|---|
name | Enter the name of the SAP OData endpoint. This is a required parameter. |
destination | Insert the CDF staging database and table to upload to. This is a required value. See the RAW destination section. |
sap_service | Enter the name of the SAP OData service. This is a required parameter. |
sap_entity | Enter the name of the SAP OData entity related to the SAP OData service. This is a required parameter. |
sap_key | Enter the name of the primary key field related to the SAP OData entity. This is a required parameter. |
incremental_field | Enter the name of the field to be used as reference for the incremental runs. This is an optional parameter. If you leave this field empty, the extractor will fetch full data loads every run. |
schedule | Schedule the interval at which the OData queries are executed towards the SAP OData service. See the Schedule section. |
extract_schema | Extract the SAP OData entity schema to the CDF staging area. It expects database and table parameters, the same as the RAW destination. This is an optional parameter. |
Schedule
Use the schedule subsection to schedule runs when the extractor runs as a service.
Parameter | Description |
---|---|
type | Insert the schedule type. Valid options are cron and interval . cron uses regular cron expressions. interval expects an interval-based schedule. |
expression | Enter the cron or interval expression to trigger the query. For example, 1h repeats the query hourly, and 5m repeats the query every 5 minutes. |
RAW destination
The raw destination subsection enables writing data to the CDF staging area.
Parameter | Description |
---|---|
database | Enter the CDF staging database to upload data into. This will be created if it doesn't exist. This is a required parameter. |
table | Enter the CDF staging table to upload data into. This will be created if it doesn't exist. This is a required parameter. |
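Putting the pieces together, a single entry in the endpoints list might look like the sketch below. The service, entity, key, and field names are hypothetical and must match your own SAP OData service, and the staging database and table names are placeholders.

```yaml
endpoints:
  - name: equipment
    destination:
      database: sap_staging        # created if it doesn't exist
      table: equipment
    sap_service: ZEQUIPMENT_SRV    # hypothetical OData service name
    sap_entity: EquipmentSet       # hypothetical entity in that service
    sap_key: Equipment             # primary key field of the entity
    incremental_field: ChangedOn   # omit to fetch a full load on every run
    schedule:
      type: interval
      expression: 1h               # run the query hourly
    extract_schema:
      database: sap_staging
      table: equipment_schema
```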