Before you start

  • Assign access capabilities for the extractor to write data to the respective CDF destination resources.
  • Set up the SAP endpoints you want to extract data from.
  • Check the server requirements for the extractor.
  • Create a configuration file according to the configuration settings. The file must be in YAML format.

Connect to SAP

The extractor supports two different protocols for connecting to SAP: OData and SOAP.
The extractor connects to OData V2 and OData V4 endpoints in the SAP NetWeaver Gateway. SAP OData is available in multiple SAP ERP versions, such as:
SAP version | Description
SAP ERP 6.0 | We recommend the SAP Gateway Service builder, which automates and generates OData entities from the SAP standard data schemas. There are multiple ways of mapping SAP entities to OData entities. To ensure that schema mapping is done correctly on the entity level, use the Import DDIC Structure function for structures or database tables that you expose through the OData service. See this guide for creating an OData service in SAP ERP 6.0.
SAP S/4HANA OnPremise | We recommend using the standard OData endpoints delivered in the SAP S/4HANA installation. You’ll find the endpoints at the SAP Business Accelerator Hub. To extract data using SAP OData predefined schemas, activate the standard endpoints on the /IWFND/MAINT_SERVICES transaction. For instance, see the standard endpoint for SAP plant maintenance (PM) work orders.
SAP S/4HANA Cloud | The OData endpoints are available inside the cloud communication scenarios predefined by SAP. You’ll find the standard endpoints at the SAP Business Accelerator Hub. For instance, see the standard endpoint for SAP plant maintenance (PM) work orders.
See the SAP documentation for more information about SAP OData endpoints.
We recommend using OData to connect to and extract data from SAP.

Secure SAP connections with self-signed certificates

The extractor uses HTTPS to securely connect to SAP OData services. If your SAP system uses an internal or self-signed certificate, update the extractor’s certificate bundle to include it so connections between SAP, the VM environment, and the CDF service remain trusted.

Export the existing certificate bundle

  1. Locate the public CA bundle in the certifi library:
    import certifi
    
    # Get the path to the CA bundle
    ca_bundle_path = certifi.where()
    print(f"CA bundle located at: {ca_bundle_path}")
    
  2. Copy the file from the printed path and save it on your desktop as cacert.pem.
  3. Add the SAP self-signed certificate
    1. Obtain the SAP self-signed certificate in PEM format.
    2. Open the certificate in a text editor.
    3. Open the cacert.pem file that you saved in step 2.
    4. Prepend the SAP certificate contents to the top of the file.
    5. Save the updated file.
Now, cacert.pem contains both public and SAP self-signed certificates.
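The manual steps above can also be scripted. The following is a minimal sketch, not part of the extractor: the function name build_ca_bundle and the file name sap_certificate.pem are hypothetical, and it assumes the SAP certificate is already available as a PEM file.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(public_bundle: Path, sap_cert: Path, output: Path) -> Path:
    """Write a bundle with the SAP certificate placed before the public CAs."""
    sap_pem = sap_cert.read_text(encoding="utf-8").strip()
    if PEM_HEADER not in sap_pem:
        raise ValueError(f"{sap_cert} does not look like a PEM certificate")
    public_pem = public_bundle.read_text(encoding="utf-8")
    # Prepend the SAP certificate, as in the manual procedure.
    output.write_text(sap_pem + "\n\n" + public_pem, encoding="utf-8")
    return output


# Example call (assumes certifi is installed and the hypothetical
# sap_certificate.pem exists in the working directory):
#
#   import certifi
#   build_ca_bundle(Path(certifi.where()),
#                   Path("sap_certificate.pem"),
#                   Path("cacert.pem"))
```

The sketch writes to a separate output file rather than editing the certifi bundle in place, so the original bundle stays untouched.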

Configure the extractor

  1. Set the environment variable REQUESTS_CA_BUNDLE to the path of the updated cacert.pem. For example, in Windows (PowerShell):
    $env:REQUESTS_CA_BUNDLE="C:\Users\<username>\Desktop\cacert.pem"
    On Linux/macOS (Bash), use export to set the same variable.
  2. Run the extractor in the same shell where the environment variable is defined.
The certificate must be manually added to the bundle on the VM to ensure both SAP and CDF connections work correctly.
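Before starting the extractor, it can help to confirm that the variable is visible in the current shell and points to a readable bundle. The following is a minimal sketch under that assumption. The helper names resolve_ca_bundle and count_certificates are hypothetical and not part of the extractor.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_ca_bundle(environ: Mapping[str, str] = os.environ) -> Path:
    """Return the bundle path from REQUESTS_CA_BUNDLE or raise a clear error."""
    value = environ.get("REQUESTS_CA_BUNDLE")
    if not value:
        raise RuntimeError("REQUESTS_CA_BUNDLE is not set in this shell")
    path = Path(value)
    if not path.is_file():
        raise FileNotFoundError(f"CA bundle not found: {path}")
    return path


def count_certificates(bundle: Path) -> int:
    """Count the PEM certificate blocks in the bundle."""
    text = bundle.read_text(encoding="utf-8")
    return text.count("-----BEGIN CERTIFICATE-----")
```

Running count_certificates on both the original and the updated bundle should show a difference of one if the SAP certificate was added correctly.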

Run as a Windows executable file

  1. Download the SAP extractor package
    Navigate to Data management > Integrate > Extractors and find the SAP extractor’s package for Windows executable. Download and decompress the zip file.
  2. Run the executable file
    Open a command line window and run the executable file with the configuration file as an argument. In this example, the configuration file is named config.yml and saved in the same folder as the executable file:
    > .\sap_extractor_standalone<VERSION>-win32.exe .\config.yml
    To stop the extractor, press Ctrl+C. The log file is stored in the configured path.

Run as a Windows service

  1. Download the Windows service package
    Navigate to Data management > Integrate > Extractors and find the SAP extractor’s installation package for Windows service. Download and decompress the zip file to the same directory as the configuration file.
    You must name the configuration file config.yml.
  2. Install the service
    As an administrator, open a command line window in the folder where you placed the executable file and the configuration file. Run the following command:
    > .\sap_extractor_service<VERSION>-win32.exe install
  3. Configure the service
    Open Services in Windows and find the Cognite SAP extractor service. Right-click the service and select Properties. Configure the service according to your requirements.

Run as a Linux executable

  1. Download the Linux executable package
    Navigate to Data management > Integrate > Extractors and find the SAP extractor’s package for Linux executable. Download and decompress the zip file.
  2. Run the executable file
    Open a command line window and run the executable file with the configuration file as an argument. In this example, the configuration file is named config.yaml and saved in path/to/the/folder:
    > ./sap_extractor-<VERSION>-linux path/to/the/folder/config.yaml

Pagination

When you extract data from SAP OData, you can use one of three pagination methods: no pagination, client-side pagination, or server-side pagination. Use the pagination-type configuration setting to specify which method the extractor uses when running full-load queries.
Pagination is available when you extract data from SAP OData endpoints.
No pagination means that the extractor fetches all data available from the SAP OData endpoint in a single request, without chunking on the server or the client. As a result, the SAP extractor may time out and return an error while waiting for the response from SAP.
The pagination type you use in your extractor configuration depends on the SAP OData endpoint implementation your extractor connects to.
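To illustrate the idea behind client-side pagination, the following minimal sketch pages through a result set with the standard OData $top and $skip query options. It is an assumption that the extractor works along these lines; the function iter_client_side_pages and the fetch callback are hypothetical and stand in for a real HTTP request to the endpoint.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


def iter_client_side_pages(
    fetch: Callable[[Dict[str, int]], List[Row]],
    page_size: int,
) -> Iterator[List[Row]]:
    """Yield pages by increasing $skip until a short or empty page arrives."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    skip = 0
    while True:
        # The client decides the chunk size and offset for every request.
        page = fetch({"$top": page_size, "$skip": skip})
        if page:
            yield page
        if len(page) < page_size:
            break  # fewer rows than requested means the data is exhausted
        skip += page_size
```

With server-side pagination, the server chooses the chunk size and returns a link to the next page instead, so the client follows that link rather than computing $skip itself.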

Load data incrementally (OData only)

If the OData entities have an incremental field, you can set up the extractor to only process new or updated entities since the last extraction. For example, you can use the S/4HANA standard OData entity Maintenance Order and enter the field LastChangeDateTime in the incremental_field configuration parameter for incremental delta queries to the CDF staging area.
For the incremental load to work properly, the incremental field in SAP must be an Edm.DateTimeOffset field. One example is LastChangeDateTime in the SAP OData entity Maintenance Order.
The extractor depends on client-side pagination to do incremental load queries. Therefore, when configuring the SAP extractor to use a custom SAP OData endpoint, make sure your SAP implementation supports both client-side pagination and the $orderby operation. See Client-side pagination implementation for more information on the implementation provided by SAP.
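The sketch below shows what an incremental query against such an endpoint could look like: it filters on the incremental field, orders by it, and pages with $top and $skip. The function incremental_query is hypothetical, and the exact query the extractor sends is an assumption. The filter literal uses OData V4 syntax; OData V2 endpoints expect the datetimeoffset'...' literal form instead.

```python
from datetime import datetime, timezone
from typing import Dict


def incremental_query(
    field: str,
    last_seen: datetime,
    page_size: int,
    skip: int = 0,
) -> Dict[str, str]:
    """Build OData V4 query options for rows changed after last_seen."""
    if last_seen.tzinfo is None:
        raise ValueError("last_seen must be timezone-aware")
    # Normalize to UTC so the literal is unambiguous.
    stamp = last_seen.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "$filter": f"{field} gt {stamp}",
        "$orderby": f"{field} asc",  # stable order is needed for paging
        "$top": str(page_size),
        "$skip": str(skip),
    }
```

Ordering by the incremental field is what lets the client remember the highest value it has seen and resume from there on the next run.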

Schedule automatic runs

To schedule automatic runs on Windows, you can run the extractors in Windows Task Scheduler. To schedule automatic runs on macOS and Linux, use cron expressions. To enter a new cron job, run crontab -e to edit the cron table file with the default system text editor. Here’s the format for a job in the cron table:
<minute>  <hour>  <day of month (1-31)>  <month (1-12)>  <day of week (0-6 starting on Sunday)>  <command>
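As a concrete illustration of the format, the following sketch assembles one cron table line from its five schedule fields and a command. The helper cron_line and the script path /opt/sap-extractor/run.sh are hypothetical; the schedule shown runs the command every day at 02:00.

```python
def cron_line(
    minute: str,
    hour: str,
    day_of_month: str,
    month: str,
    day_of_week: str,
    command: str,
) -> str:
    """Join the five schedule fields and the command into one crontab line."""
    fields = [minute, hour, day_of_month, month, day_of_week]
    # Each schedule field must be a single token, for example "0", "*/15" or "*".
    if any(not field or " " in field for field in fields):
        raise ValueError("schedule fields must be non-empty and contain no spaces")
    return " ".join(fields + [command])


# Every day at 02:00; the command path is a placeholder.
print(cron_line("0", "2", "*", "*", "*", "/opt/sap-extractor/run.sh"))
```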

Attachments

The extractor can extract attachments stored in SAP document frameworks, such as Generic Object Services (GOS), and ingest these into CDF Files. The extractor connects to the SAP OData endpoint API_CV_ATTACHMENT_SRV and fetches files that are linked to a standard SAP OData entity, such as maintenance orders. See the attachments configuration section.
You can extract attachments when you’ve connected to SAP S/4HANA servers.