Configure the EDM extractor
To configure the EDM extractor, you must create a configuration file. The file must be in YAML format.
You can use the sample minimal configuration file as a starting point for your configuration settings.
You can set parameters through environment variables using the pattern <configuration section>_<parameter>. Environment variables override values set in the configuration file, including default values for parameters not listed in the configuration file. For example, setting the COGNITE_PROJECT environment variable overrides the project parameter under the cognite section. The matching is case insensitive.
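The override rule can be sketched in a few lines of Python. This is only an illustration, not the extractor's actual implementation: the function name apply_env_overrides is hypothetical, the underscore separator is inferred from the COGNITE_PROJECT example, and the config dictionary is assumed to already contain the default values.

```python
from typing import Any, Dict, Mapping


def apply_env_overrides(
    config: Mapping[str, Mapping[str, Any]],
    environ: Mapping[str, str],
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of config where SECTION_PARAMETER variables take precedence.

    Hypothetical helper: it mirrors the documented rule, not the extractor's code.
    """
    # Normalize variable names so the lookup is case insensitive.
    env_upper = {name.upper(): value for name, value in environ.items()}

    merged: Dict[str, Dict[str, Any]] = {}
    for section, params in config.items():
        merged[section] = dict(params)
        for parameter in params:
            key = f"{section}_{parameter}".upper()
            if key in env_upper:
                # The environment value wins over the file or default value.
                merged[section][parameter] = env_upper[key]
    return merged
```

In practice you would pass os.environ as the second argument. The input dictionary is left untouched, so the values read from the file remain available for comparison.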
Use the cognite section to configure which CDF project the extractor loads data into and how to connect to the project. This section is mandatory and should always contain the project and authentication configuration.
|Insert the CDF client ID. This is mandatory if you're using OIDC authentication.
|Insert the CDF client secret for your CDF project. This is mandatory if you're using OIDC authentication.
|Insert the token URL. This is mandatory if you're using OIDC authentication. You'll find further details on the OIDC token URL here
|Insert the base URL of the CDF project.
|Insert the CDF project name you want to ingest data into. This field is required.
|Enter the name of the CDF RAW database to extract data into. If it doesn't already exist, the extractor creates the database.
|Enter the size of the queue between extraction and writing objects to CDF RAW. The default value is 500. A larger queue buffers more extracted entities from EDM but uses more memory.
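The trade-off behind the queue size can be illustrated with a bounded queue. This is a minimal sketch of the general technique, not the extractor's code (the extractor is built on the CDF Java SDK); run_pipeline and its parameters are hypothetical names, and only the default of 500 comes from the description above.

```python
import queue
import threading
from typing import Iterable, List

DEFAULT_QUEUE_SIZE = 500  # default value stated in the parameter description


def run_pipeline(items: Iterable[object], queue_size: int = DEFAULT_QUEUE_SIZE) -> List[object]:
    """Move items from an 'extraction' loop to a 'writer' thread via a bounded queue."""
    buffer: "queue.Queue[object]" = queue.Queue(maxsize=queue_size)
    written: List[object] = []
    sentinel = object()  # marks the end of the stream

    def writer() -> None:
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            written.append(item)  # stands in for writing to CDF RAW

    thread = threading.Thread(target=writer)
    thread.start()
    for item in items:
        # put() blocks while the queue is full, so extraction slows down
        # instead of buffering an unbounded number of entities in memory.
        buffer.put(item)
    buffer.put(sentinel)
    thread.join()
    return written
```

A larger queue_size lets extraction run further ahead of the writer, at the cost of holding more entities in memory at once.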
Use the logging section to set up logging to standard output, such as a terminal window.
|Select the verbosity level for console logging. Valid options are
|Set the log level for the CDF Java SDK. Valid options are
error. This is an optional value. The default value is
Use the edm section to configure the parameters needed to connect to your EDM instance.
|If you set
token, enter the URL for DSIS authentication
|Select how to authenticate to DSIS. Valid options are
token expects the bearer token from the DSIS authentication server.
dsisPassword to authenticate.
false to disable certificate verification for the DSIS auth Server specified with the
dsisAuthServer option. This field is optional and verification defaults to
|Enter the time in seconds between each scan for changes to incremental entities. The default value is 180. This field is optional.
|Set up a cron schedule for starting the extraction of complete entities.
|Set to true to run a complete extraction on startup, bypassing any schedule set in crontabComplete. This is typically used during development. The default value is false. This field is optional.
|Insert the path to the EDM server.
|Insert the path to the EDM Carto services to extract Carto entities from EDM, such as CD_CARTO.
|Insert the number of threads to extract data from EDM. Be aware that a high load on the EDM server may cause it to crash. The default value is 2. This field is optional.
|Insert the number of threads to process and upload extracted entities to CDF RAW. The default value is 4. This field is optional.
|Extractor internal parameter. Do not change.
|Enter the measurement system defined in EDM/DSIS. If you set a value, the measurementsystem is appended to all queries. The default value is blank. This field is optional.
|Enter the maximum time in seconds to wait for a connection to DSIS. The default value is 5. This field is optional.
|Set the maximum timeout in seconds for read requests to DSIS. This field is optional. If you don't enter a value or the value is less than 120, the extractor uses 120.
|List the EDM entities to extract completely on each run, even if the entities are incremental. This can be used as a workaround if EDM is missing update_date on incremental entities.
|Enable debug logging of EDM traffic for troubleshooting. Valid strings are
OFF. This field is optional.
true to extract the EDM cartography database used (entities
carto alias). These will be extracted into CDF RAW tables named
|Insert the base URL of the DSIS. The extractor combines
dsisEdmProject to connect to your EDM instance. This is a required field.
|Enter the name of the EDM virtual database. This is a required field.
|Enter the name of the EDM project. The project name ALL_PROJECTS can be used to query across all projects in EDM. This is a required field.
|Enter the name of the EDM authentication REALM. This is a required field.
|Enter the DSIS client ID. This is a required field.
|Enter a valid DSIS username. This is a required field.
|Enter a valid DSIS password. This is a required field.
|List the EDM entities you want to extract. See also the entities section below.
|Set up a cron schedule for a consistency check that compares the state of RAW with EDM and marks objects deleted in EDM as deleted in RAW.
true to run a consistency check between the state in RAW and EDM on startup bypassing any schedule set in
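The read timeout rule described in the table above (a missing value or a value below 120 seconds results in 120) amounts to a simple floor. A minimal sketch, assuming the hypothetical function name effective_read_timeout; it is not taken from the extractor itself:

```python
from typing import Optional

MIN_READ_TIMEOUT_SECONDS = 120  # floor stated in the parameter description


def effective_read_timeout(configured: Optional[int]) -> int:
    """Return the read timeout in seconds after applying the documented floor."""
    if configured is None or configured < MIN_READ_TIMEOUT_SECONDS:
        return MIN_READ_TIMEOUT_SECONDS
    return configured
```

For example, effective_read_timeout(60) yields 120, while effective_read_timeout(300) keeps the configured 300.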
Use the entities section to list the entities you want to extract from the EDM server to CDF. You'll find the list of supported EDM entities here.
|Enter the name of the EDM entity, for instance, CD_WELL. This is a required field.
|OData-based custom filters at the EDM entity level. Use this to filter unwanted records from EDM.
|List of properties on the entity to include in the extraction. If not specified, all fields are included. This field can't be combined with
|List of properties to exclude from the extraction of an entity. Don't use this parameter together with
|This parameter is only relevant when the entity is listed in the entitiesCompleteExtraction parameter. It confirms to the extractor that the entity should be extracted both incrementally and completely. The entry under the entitiesCompleteExtraction parameter should have a filter that selects only the entities missed by the incremental extraction, for example all entities with create/update_date set to NULL. See the sample configuration file. Use this parameter for special cases only. By default, this parameter is disabled.
|The number of entities to retrieve per request. The default value is 8, or 50 for the related parameter. For some heavy entities, such as CD_WELL, the extractor has built-in settings that reduce the batch size to limit the load on the EDM server. This value overrides those built-in values. This field is optional.
true to include related entities in the object extracted. By default set to
|Set to an optional list of related entities to include. If not set, all related entities are included.
false to exclude an entity from the set of entities that are consistency checked periodically (according to the schedule set in
consistencySchedule). The consistency check verifies whether all extracted objects in RAW still exist in EDM. If they don't exist, they're marked as deleted in RAW. The default value is
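The consistency check described in the table above boils down to a set difference: any object present in RAW but no longer in EDM is a candidate for being marked as deleted. The sketch below only illustrates that idea; keys_to_mark_deleted and the use of plain string keys are assumptions, and the extractor's real comparison logic is not documented here.

```python
from typing import Iterable, Set


def keys_to_mark_deleted(raw_keys: Iterable[str], edm_keys: Iterable[str]) -> Set[str]:
    """Return the keys that exist in RAW but no longer exist in EDM."""
    # Objects still present in EDM are left alone; the rest are flagged.
    return set(raw_keys) - set(edm_keys)
```

Objects that exist in EDM but not yet in RAW are not affected by this comparison; they are picked up by the normal extraction.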