Set up the SAP extractor

Follow the steps below to set up the extractor.

Before you start

  1. Set up the SAP endpoints you want to extract data from.

  2. Check the server requirements for the extractor.

  3. Create a configuration file according to the configuration settings. The file must be in YAML format.

Connect to SAP

The extractor supports two different protocols for connecting to SAP: OData and SOAP.

OData

The extractor connects to OData V2 and OData V4 endpoints in the SAP NetWeaver Gateway. SAP OData is available in multiple SAP ERP versions, such as:

SAP version: SAP ERP 6.0
Description: We recommend the SAP Gateway Service Builder, which automates the generation of OData entities from the SAP standard data schemas.

There are multiple ways of mapping SAP entities to OData entities. To ensure that schema mapping is done correctly on the entity level, use the Import DDIC Structure function for the structures or database tables that you expose through the OData service.

See this guide for creating an OData service in SAP ERP 6.0.

SAP version: SAP S/4HANA OnPremise
Description: We recommend using the standard OData endpoints delivered in the SAP S/4HANA installation. You'll find the endpoints at the SAP Business Accelerator Hub.

To extract data using SAP OData predefined schemas, activate the standard endpoints in the /IWFND/MAINT_SERVICES transaction. One example is the standard endpoint for SAP plant maintenance (PM) work orders.

SAP version: SAP S/4HANA Cloud
Description: The OData endpoints are available inside the cloud communication scenarios predefined by SAP. You'll find the standard endpoints at the SAP Business Accelerator Hub. One example is the standard endpoint for SAP plant maintenance (PM) work orders.

See the SAP documentation for more information about SAP OData endpoints.

Note

We recommend using OData to connect to and extract data from SAP.

SOAP

The extractor connects to SAP ABAP Web Services exposed through the SOAP protocol.

Every published SAP Web service has a corresponding XML-based description, written in the Web Services Description Language (WSDL). The extractor connects to the Web service by reading the definitions at the WSDL URL and triggering the operations defined there.
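To make the role of the WSDL document more concrete, here is a minimal Python sketch that lists the operations declared in a WSDL 1.1 document. It is not the extractor's implementation. The sample WSDL, the service name, and the operation names are hypothetical and heavily trimmed; a real SAP WSDL also contains types, messages, bindings, and service sections.

```python
import xml.etree.ElementTree as ET

# Standard namespace for WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, trimmed WSDL used only for illustration.
SAMPLE_WSDL = """
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" name="ExampleService">
  <wsdl:portType name="ExamplePortType">
    <wsdl:operation name="GetOrderList"/>
    <wsdl:operation name="GetOrderDetail"/>
  </wsdl:portType>
</wsdl:definitions>
""".strip()


def list_operations(wsdl_text: str) -> dict[str, list[str]]:
    """Map each portType name to the operation names it declares."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [op.get("name") for op in port_type.findall("wsdl:operation", WSDL_NS)]
        operations[port_type.get("name")] = names
    return operations


if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

A SOAP client reads these definitions to learn which operations it can call and what their messages look like.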

For more information about SAP ABAP Web Services, see SAP's official documentation.

Run as a Windows executable file

  1. Navigate to Data management > Integrate > Extractors and find the SAP extractor's package for Windows executable.

  2. Download and decompress the zip file.

  3. Open a command line window and run the executable file with the configuration file as an argument.

In this example, the configuration file is named config.yml and saved in the same folder as the executable file:


> .\sap_extractor_standalone<VERSION>-win32.exe .\config.yml

To stop the extractor, press Ctrl+C. The log file is stored in the configured path.

Run as a Windows service

  1. Navigate to Data management > Integrate > Extractors and find the SAP extractor's installation package for Windows service.

  2. Download and decompress the zip file to the same directory as a configuration file.

Naming the configuration file

You must name the configuration file config.yml.

  3. As an administrator, open a command line window in the folder where you placed the executable file and the configuration file.

  4. Run the following command:


> .\sap_extractor_service<VERSION>-win32.exe install

  5. Open Services in Windows and find the Cognite SAP extractor service.

  6. Right-click the service and select Properties.

  7. Configure the service according to your requirements.

Run as a Linux executable

  1. Navigate to Data management > Integrate > Extractors and find the SAP extractor's package for Linux executable.

  2. Download and decompress the zip file.

  3. Open a command line window and run the executable file with the configuration file as an argument.

In this example, the configuration file is named config.yaml and saved in path/to/the/folder:


> ./sap_extractor-<VERSION>-linux path/to/the/folder/config.yaml

Pagination

When you extract data from SAP OData, you can choose between three pagination methods: no pagination, client-side pagination, and server-side pagination.

Use the pagination-type configuration setting to specify the pagination type to use when running full-load queries.

Tip

Pagination is available when you extract data from SAP OData endpoints.

No pagination

No pagination means that the extractor fetches all data available from the SAP OData endpoint without using chunking logic on the server or client. Therefore, the SAP extractor may time out and return an error while waiting for the response from SAP.

Client-side pagination

OData client-side pagination uses query parameters from the client to define a record offset. This limits the volume of data retrieved from the server.

A client-side pagination request is built with the following URI parameters:

  • $top: specifies the number of records to return in a single batch. The default and maximum value is 1,000 records.
  • $skip: specifies the number of records to bypass (skip) from the total data set before returning the desired subset.

For example, a query with $skip=2000&$top=500 returns the fifth page of data, assuming there's data available and the page size is 500 records.
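The relationship between page number, $skip, and $top can be sketched in Python as follows. This is an illustration of the arithmetic only, not the extractor's code. The function names and the `fetch` callback are hypothetical; `fetch` stands in for whatever performs the HTTP request and returns the records of one page.

```python
from typing import Callable, Iterator

MAX_PAGE_SIZE = 1000  # default and maximum $top value described above


def page_query(page_number: int, page_size: int = MAX_PAGE_SIZE) -> str:
    """Build the $skip/$top query string for a 1-based page number."""
    if page_number < 1:
        raise ValueError("page_number starts at 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    skip = (page_number - 1) * page_size
    return f"$skip={skip}&$top={page_size}"


def iter_pages(fetch: Callable[[str], list], page_size: int = MAX_PAGE_SIZE) -> Iterator[list]:
    """Request pages until the server returns an empty or short page."""
    page_number = 1
    while True:
        records = fetch(page_query(page_number, page_size))
        if not records:
            return
        yield records
        if len(records) < page_size:
            return
        page_number += 1


if __name__ == "__main__":
    print(page_query(5, 500))  # the fifth page with 500 records per page
```

Stopping on a short page avoids one extra request when the last page is only partly filled.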

Server-side pagination

The SAP server controls server-side pagination. The server generates a cursor to control the next batch of requests. This cursor represents a pointer to the start of the next page in the full data set and is returned to the calling SAP extractor. The server uses a $skiptoken value to resume pagination from the position identified by the cursor.
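From the client's point of view, server-side pagination amounts to following the next-page link until the server stops returning one. The Python sketch below shows that loop under some assumptions: OData V2 JSON responses carry the link in `d.__next` and OData V4 responses in `@odata.nextLink`, and the link already contains the $skiptoken. The `fetch_json` callback is hypothetical and stands in for the HTTP request; this is not the extractor's implementation.

```python
from typing import Callable, Iterator, Optional


def extract_page(payload: dict) -> tuple[list, Optional[str]]:
    """Return the records and the next-page link (or None) from one response."""
    if "d" in payload:  # OData V2 JSON envelope
        body = payload["d"]
        return body.get("results", []), body.get("__next")
    # OData V4 JSON envelope
    return payload.get("value", []), payload.get("@odata.nextLink")


def iter_records(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator:
    """Yield all records by following next links until none is returned."""
    url: Optional[str] = first_url
    while url:
        records, url = extract_page(fetch_json(url))
        yield from records
```

The client never computes an offset itself. It only passes back the link the server handed out, which is why this method depends on how the endpoint is implemented.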

Tip

The pagination type you use in your extractor configuration depends on the SAP OData endpoint implementation your extractor connects to.

Load data incrementally (OData only)

If the OData entities have an incremental field, you can set up the extractor to only process new or updated entities since the last extraction.

For example, you can use the S/4HANA standard OData entity Maintenance Order and enter the field LastChangeDateTime in the incremental_field configuration parameter. The extractor then runs incremental (delta) queries and loads the results into the CDF staging area.

Tip

For the incremental load to work properly, the incremental field in SAP must be an Edm.DateTimeOffset field. For example, LastChangeDateTime for the SAP OData entity Maintenance Order.

Note

The extractor depends on client-side pagination to do incremental load queries. Therefore, when configuring the SAP extractor to use a custom SAP OData endpoint, make sure your SAP implementation supports both client-side pagination and the $orderby operation. See Client-side pagination implementation for more information on the implementation provided by SAP.
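To show why both client-side pagination and $orderby are needed, here is a Python sketch of what an incremental query string could look like. The exact query the extractor sends is not documented here, so treat this as an assumption. The function name is hypothetical, and the filter uses the OData V2 literal syntax `datetimeoffset'...'`; OData V4 endpoints expect the bare timestamp instead.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode


def incremental_query(field: str, last_seen: datetime, skip: int = 0, top: int = 1000) -> str:
    """Build a query for records changed after last_seen, ordered by the same field."""
    if last_seen.tzinfo is None:
        raise ValueError("last_seen must be timezone-aware")
    stamp = last_seen.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        # Assumption: OData V2 literal syntax for Edm.DateTimeOffset values.
        "$filter": f"{field} gt datetimeoffset'{stamp}'",
        # Stable ordering keeps $skip/$top pages consistent between requests.
        "$orderby": f"{field} asc",
        "$skip": skip,
        "$top": top,
    }
    # Keep "$" and "'" unescaped so the OData system options stay readable.
    return urlencode(params, quote_via=quote, safe="$'")


if __name__ == "__main__":
    since = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(incremental_query("LastChangeDateTime", since))
```

Ordering by the incremental field means the last record of the final page carries the newest timestamp, which can serve as the starting point for the next run.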

Schedule automatic runs

To schedule automatic runs on Windows, run the extractor with Windows Task Scheduler.

To schedule automatic runs on macOS and Linux, use cron expressions. To enter a new cron job, run crontab -e to edit the cron table file with the default system text editor.

Here's the format for a job in the cron table:

<minute>  <hour>  <day of month (1-31)>  <month (1-12)>  <day of week (0-6 starting on Sunday)>  <command>
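As an illustration of the format above, the following Python sketch assembles a cron table line and checks each time field against its allowed range. It is a simplified helper, not part of the extractor: it accepts only `*` or a single number per field (no lists, ranges, or step values), and the script path in the usage example is hypothetical.

```python
# Allowed numeric range for each time field, in cron table order.
FIELD_RANGES = {
    "minute": (0, 59),
    "hour": (0, 23),
    "day_of_month": (1, 31),
    "month": (1, 12),
    "day_of_week": (0, 6),  # 0 is Sunday
}


def cron_entry(command: str, **fields) -> str:
    """Build one cron table line; omitted fields default to '*'."""
    parts = []
    for name, (low, high) in FIELD_RANGES.items():
        value = str(fields.pop(name, "*"))
        if value != "*" and not (value.isdigit() and low <= int(value) <= high):
            raise ValueError(f"{name} must be '*' or a number from {low} to {high}")
        parts.append(value)
    if fields:
        raise TypeError(f"unknown cron fields: {sorted(fields)}")
    return " ".join(parts + [command])


if __name__ == "__main__":
    # Hypothetical wrapper script that starts the extractor, run daily at 02:30.
    print(cron_entry("/opt/sap/run_extractor.sh", minute=30, hour=2))
```

The resulting line can be pasted into the file opened by crontab -e.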

Attachments

The extractor can extract attachments stored in SAP document frameworks, such as Generic Object Services (GOS), and ingest these into CDF Files. The extractor connects to the SAP OData endpoint API_CV_ATTACHMENT_SRV and fetches files that are linked to a standard SAP OData entity, such as maintenance orders.

See the attachments configuration section.

Note

You can extract attachments when you've connected to SAP S/4HANA servers.