# Configuration settings

> Complete reference for configuring the Cognite DB extractor, including authentication, queries, and destination settings.

To configure the DB extractor, you must create a configuration file. The file must be in YAML format.

<Tip>
  You can set up extraction pipelines to use versioned extractor configuration files stored in the cloud.
</Tip>

## Using values from environment variables

The configuration file allows substitutions with environment variables. For example:

```yaml
cognite:
  idp-authentication:
    secret: ${COGNITE_CLIENT_SECRET}
```

This loads the value of the `COGNITE_CLIENT_SECRET` environment variable into the `cognite/idp-authentication/secret` parameter. You can also interpolate environment variables into a longer string, for example:

```yaml
url: http://my-host.com/api/endpoint?secret=${MY_SECRET_TOKEN}
```

<Info>
  Implicit substitutions only work for unquoted value strings. For quoted strings, use the `!env` tag to activate environment substitution:

  ```yaml
  url: !env 'http://my-host.com/api/endpoint?secret=${MY_SECRET_TOKEN}'
  ```
</Info>

## Using values from Azure Key Vault

The DB extractor also supports loading values from Azure Key Vault. To load a configuration value from Azure Key Vault, use the `!keyvault` tag followed by the name of the secret you want to load. For example, to load the value of the `my-secret-name` secret in Key Vault into a `password` parameter, configure your extractor like this:

```yaml
password: !keyvault my-secret-name
```

To use Key Vault, you also need to include the `azure-keyvault` section in your configuration, with the following parameters:

| Parameter               | Description                                                                                                                                                                                                                                                                                                        |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `keyvault-name`         | Name of Key Vault to load secrets from                                                                                                                                                                                                                                                                             |
| `authentication-method` | How to authenticate to Azure. Either `default` or `client-secret`. For `default`, the extractor will look at the user running the extractor, and look for pre-configured Azure logins from tools like the Azure CLI. For `client-secret`, the extractor will authenticate with a configured client ID/secret pair. |
| `client-id`             | Required for using the `client-secret` authentication method. The client ID to use when authenticating to Azure.                                                                                                                                                                                                   |
| `secret`                | Required for using the `client-secret` authentication method. The client secret to use when authenticating to Azure.                                                                                                                                                                                               |
| `tenant-id`             | Required for using the `client-secret` authentication method. The tenant ID of the Key Vault in Azure.                                                                                                                                                                                                             |

Example:

```yaml
azure-keyvault:
  keyvault-name: my-keyvault-name
  authentication-method: client-secret
  tenant-id: 6f3f324e-5bfc-4f12-9abe-22ac56e2e648
  client-id: 6b4cc73e-ee58-4b61-ba43-83c4ba639be6
  secret: 1234abcd
```
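
The table above marks `client-id`, `secret`, and `tenant-id` as required only for the `client-secret` method, so a configuration for the `default` method presumably needs just the vault name and the method. The sketch below assumes a vault named `my-keyvault-name` and that the user running the extractor already has an Azure login available, for example from the Azure CLI:

```yaml
azure-keyvault:
  keyvault-name: my-keyvault-name    # hypothetical vault name
  authentication-method: default     # reuse the pre-configured Azure login of the current user
```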

## Base configuration object

| Parameter   | Type                       | Description                                                                                                                                                                                                                                               |
| ----------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `version`   | either string or integer   | Configuration file version                                                                                                                                                                                                                                |
| `type`      | either `local` or `remote` | Configuration file type. Either `local`, meaning the full config is loaded from this file, or `remote`, which means that only the `cognite` section is loaded from this file, and the rest is loaded from extraction pipelines. Default value is `local`. |
| `cognite`   | object                     | The `cognite` section describes which CDF project the extractor will load data into and how to connect to the project.                                                                                                                                    |
| `logger`    | object                     | The optional `logger` section sets up logging to a console and files.                                                                                                                                                                                     |
| `metrics`   | object                     | The `metrics` section describes where to send metrics on extractor performance for remote monitoring of the extractor. We recommend sending metrics to a Prometheus pushgateway, but you can also send metrics as time series in the CDF project.         |
| `queries`   | list                       | List of queries to execute                                                                                                                                                                                                                                |
| `databases` | list                       | List of databases to connect to                                                                                                                                                                                                                           |
| `extractor` | object                     | General extractor configuration                                                                                                                                                                                                                           |

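As a rough orientation, the top-level sections fit together as in the skeleton below. This is a sketch rather than a complete working configuration: the version number, project name, and logging level are placeholder assumptions, and the empty lists stand in for real database and query entries.

```yaml
version: 1             # assumed value; use the version your extractor expects
type: local            # the default: the full configuration is read from this file

cognite:
  project: my-project  # hypothetical CDF project name

logger:
  console:
    level: INFO

databases: []          # placeholder: database connections go here
queries: []            # placeholder: queries go here
```
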
## `cognite`

Global parameter.

The `cognite` section describes which CDF project the extractor will load data into and how to connect to the project.

| Parameter             | Type    | Description                                                                                                                                                                            |
| --------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `project`             | string  | Insert the CDF project name.                                                                                                                                                           |
| `idp-authentication`  | object  | The `idp-authentication` section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory). |
| `data-set`            | object  | Enter a data set the extractor should write data into                                                                                                                                  |
| `extraction-pipeline` | object  | Enter the extraction pipeline used for remote config and reporting statuses                                                                                                            |
| `host`                | string  | Insert the base URL of the CDF project. Default value is `https://api.cognitedata.com`.                                                                                                |
| `timeout`             | integer | Enter the timeout on requests to CDF, in seconds. Default value is `30`.                                                                                                               |
| `external-id-prefix`  | string  | Prefix on external ID used when creating CDF resources                                                                                                                                 |
| `connection`          | object  | Configure network connection details                                                                                                                                                   |

### `idp-authentication`

Part of `cognite` configuration.

The `idp-authentication` section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory).

| Parameter     | Type    | Description                                                                                                                                                                                  |
| ------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `authority`   | string  | Insert the authority together with `tenant` to authenticate against Azure tenants. Default value is `https://login.microsoftonline.com/`.                                                    |
| `client-id`   | string  | **Required.** Enter the service principal client ID from the IdP.                                                                                                                            |
| `tenant`      | string  | Enter the Azure tenant.                                                                                                                                                                      |
| `token-url`   | string  | Insert the URL to fetch tokens from.                                                                                                                                                         |
| `secret`      | string  | Enter the service principal client secret from the IdP.                                                                                                                                      |
| `resource`    | string  | Resource parameter passed along with token requests.                                                                                                                                         |
| `audience`    | string  | Audience parameter passed along with token requests.                                                                                                                                         |
| `scopes`      | list    | Enter a list of scopes requested for the token                                                                                                                                               |
| `min-ttl`     | integer | Insert the minimum time in seconds a token will be valid. If the cached token expires in less than `min-ttl` seconds, it will be refreshed even if it is still valid. Default value is `30`. |
| `certificate` | object  | Authenticate with a client certificate                                                                                                                                                       |

#### `scopes`

Part of `idp-authentication` configuration.

Enter a list of scopes requested for the token.

Each element of this list should be a string.

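A minimal sketch of a `cognite` section that authenticates with a client secret against Microsoft Entra ID is shown below. The project name and the environment variable names `AZURE_TENANT_ID` and `COGNITE_CLIENT_ID` are placeholders. The scope value follows the `<host>/.default` convention commonly used with Entra ID, which is an assumption and not something this reference prescribes.

```yaml
cognite:
  project: my-project                         # hypothetical CDF project name
  host: https://api.cognitedata.com           # the documented default
  idp-authentication:
    tenant: ${AZURE_TENANT_ID}                # hypothetical environment variable
    client-id: ${COGNITE_CLIENT_ID}           # hypothetical environment variable
    secret: ${COGNITE_CLIENT_SECRET}
    scopes:
      - https://api.cognitedata.com/.default  # assumed scope format
```
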
#### `certificate`

Part of `idp-authentication` configuration.

Authenticate with a client certificate.

| Parameter       | Type   | Description                                                                                |
| --------------- | ------ | ------------------------------------------------------------------------------------------ |
| `authority-url` | string | Authentication authority URL                                                               |
| `path`          | string | **Required.** Enter the path to the .pem or .pfx certificate to be used for authentication |
| `password`      | string | Enter the password for the key file, if it is encrypted.                                   |

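The following sketch shows where the `certificate` block sits relative to the other `idp-authentication` parameters. The file path and environment variable names are hypothetical, and whether `tenant` or `authority-url` is also needed depends on your identity provider setup, so treat this as a starting point only.

```yaml
cognite:
  project: my-project                     # hypothetical CDF project name
  idp-authentication:
    client-id: ${COGNITE_CLIENT_ID}       # hypothetical environment variable
    tenant: ${AZURE_TENANT_ID}            # hypothetical environment variable
    certificate:
      path: /path/to/certificate.pem      # hypothetical path to a .pem or .pfx file
      password: ${CERTIFICATE_PASSWORD}   # only needed if the key file is encrypted
```
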
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="cognite.data-set" /> `data-set`

Part of [`cognite`](#cognite) configuration.

Enter a data set the extractor should write data into.

| Parameter     | Type    | Description          |
| ------------- | ------- | -------------------- |
| `id`          | integer | Resource internal id |
| `external-id` | string  | Resource external id |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="cognite.extraction-pipeline" /> `extraction-pipeline`

Part of [`cognite`](#cognite) configuration.

Enter the extraction pipeline used for remote config and reporting statuses.

| Parameter     | Type    | Description          |
| ------------- | ------- | -------------------- |
| `id`          | integer | Resource internal id |
| `external-id` | string  | Resource external id |

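To illustrate how `data-set` and `extraction-pipeline` combine with a `remote` configuration type, here is a sketch of a local file that only carries the `cognite` section. The external IDs are hypothetical, and the authentication parameters are left out for brevity.

```yaml
type: remote   # only the cognite section is read from this file

cognite:
  project: my-project                       # hypothetical CDF project name
  data-set:
    external-id: my-data-set                # hypothetical data set
  extraction-pipeline:
    external-id: my-extraction-pipeline     # hypothetical pipeline that holds the rest of the config
```

Each of the two objects is identified by either `id` or `external-id`.
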
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="cognite.connection" /> `connection`

Part of [`cognite`](#cognite) configuration.

Configure network connection details.

| Parameter                                | Type    | Description                                                                                                                                      |
| ---------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| `disable-gzip`                           | boolean | Whether or not to disable gzipping of JSON bodies.                                                                                               |
| `status-forcelist`                       | string  | HTTP status codes to retry. Defaults to 429, 502, 503 and 504                                                                                    |
| `max-retries`                            | integer | Max number of retries on a given HTTP request. Default value is `10`.                                                                            |
| `max-retries-connect`                    | integer | Max number of retries on connection errors. Default value is `3`.                                                                                |
| `max-retry-backoff`                      | integer | Retry strategy employs exponential backoff. This parameter sets a max on the amount of backoff after any request failure. Default value is `30`. |
| `max-connection-pool-size`               | integer | The maximum number of connections which will be kept in the SDK's connection pool. Default value is `50`.                                        |
| `disable-ssl`                            | boolean | Whether or not to disable SSL verification.                                                                                                      |
| [`proxies`](#cognite.connection.proxies) | object  | Dictionary mapping from protocol to url.                                                                                                         |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="cognite.connection.proxies" /> `proxies`

Part of [`connection`](#cognite.connection) configuration.

Dictionary mapping from protocol to URL.

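A sketch of a `connection` block that tightens the retry settings and routes HTTPS traffic through a proxy. The proxy address is hypothetical, and the `https` key assumes that protocol names are used as dictionary keys, as the description of `proxies` suggests.

```yaml
cognite:
  connection:
    max-retries: 5             # default is 10
    max-retries-connect: 3     # the documented default
    max-retry-backoff: 60      # default is 30
    proxies:
      https: http://proxy.example.com:8080   # hypothetical proxy URL
```
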
## <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="logger" /> `logger`

Global parameter.

The optional `logger` section sets up logging to a console and files.

| Parameter                    | Type    | Description                                                                                                                   |
| ---------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------- |
| [`console`](#logger.console) | object  | Include the console section to enable logging to a standard output, such as a terminal window.                                |
| [`file`](#logger.file)       | object  | Include the file section to enable logging to a file. The files are rotated daily.                                            |
| `metrics`                    | boolean | Enables metrics on the number of log messages recorded per logger and level. This requires `metrics` to be configured as well |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="logger.console" /> `console`

Part of [`logger`](#logger) configuration.

Include the console section to enable logging to a standard output, such as a terminal window.

| Parameter | Type                                                     | Description                                                                                                                                                                      |
| --------- | -------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `level`   | either `DEBUG`, `INFO`, `WARNING`, `ERROR` or `CRITICAL` | Select the verbosity level for console logging. Valid options, in decreasing verbosity levels, are `DEBUG`, `INFO`, `WARNING`, `ERROR`, and `CRITICAL`. Default value is `INFO`. |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="logger.file" /> `file`

Part of [`logger`](#logger) configuration.

Include the file section to enable logging to a file. The files are rotated daily.

| Parameter   | Type                                                     | Description                                                                                                                                                                   |
| ----------- | -------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `level`     | either `DEBUG`, `INFO`, `WARNING`, `ERROR` or `CRITICAL` | Select the verbosity level for file logging. Valid options, in decreasing verbosity levels, are `DEBUG`, `INFO`, `WARNING`, `ERROR`, and `CRITICAL`. Default value is `INFO`. |
| `path`      | string                                                   | **Required.** Insert the path to the log file.                                                                                                                                |
| `retention` | integer                                                  | Specify the number of days to keep logs for. Default value is `7`.                                                                                                            |

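For example, the following sketch logs `INFO` and above to the console and `DEBUG` and above to a daily rotated file that is kept for two weeks. The file path is a hypothetical placeholder.

```yaml
logger:
  console:
    level: INFO
  file:
    level: DEBUG
    path: logs/db-extractor.log   # hypothetical path
    retention: 14                 # days; default is 7
```
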
## <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="metrics" /> `metrics`

Global parameter.

The `metrics` section describes where to send metrics on extractor performance for remote monitoring of the extractor. We recommend sending metrics to a [Prometheus pushgateway](https://prometheus.io), but you can also send metrics as time series in the CDF project.

| Parameter                                 | Type   | Description                                                                                       |
| ----------------------------------------- | ------ | ------------------------------------------------------------------------------------------------- |
| [`push-gateways`](#metrics.push-gateways) | list   | List of Prometheus pushgateway configurations                                                     |
| [`cognite`](#metrics.cognite)             | object | Push metrics to CDF time series. Requires CDF credentials to be configured                        |
| [`server`](#metrics.server)               | object | The extractor can also expose an HTTP server with Prometheus metrics for scraping.                |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="metrics.push-gateways" /> `push-gateways`

Part of [`metrics`](#metrics) configuration.

List of Prometheus pushgateway configurations.

Each element of this list describes one pushgateway to push metrics to.

| Parameter       | Type                   | Description                                                                                                                                                                                                                                                                                                                                                                                      |
| --------------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `host`          | string                 | Enter the address of the host to push metrics to.                                                                                                                                                                                                                                                                                                                                                |
| `job-name`      | string                 | Enter the value of the `exported_job` label to associate metrics with. This separates several deployments on a single pushgateway, and should be unique.                                                                                                                                                                                                                                         |
| `username`      | string                 | Enter the credentials for the pushgateway.                                                                                                                                                                                                                                                                                                                                                       |
| `password`      | string                 | Enter the credentials for the pushgateway.                                                                                                                                                                                                                                                                                                                                                       |
| `clear-after`   | either null or integer | Enter the number of seconds to wait before clearing the pushgateway. When this parameter is present, the extractor will stall after the run is complete before deleting all metrics from the pushgateway. The recommended value is at least twice that of the scrape interval on the pushgateway. This is to ensure that the last metrics are gathered before the deletion. Default is disabled. |
| `push-interval` | integer                | Enter the interval in seconds between each push. Default value is `30`.                                                                                                                                                                                                                                                                                                                          |

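A sketch of a single pushgateway destination follows. The host, job name, and environment variable names are hypothetical. The `clear-after` value assumes a scrape interval of 30 seconds on the pushgateway, following the recommendation to wait at least twice the scrape interval.

```yaml
metrics:
  push-gateways:
    - host: https://pushgateway.example.com   # hypothetical address
      job-name: db-extractor-site-a           # hypothetical; should be unique per deployment
      username: ${PUSHGATEWAY_USERNAME}       # hypothetical environment variable
      password: ${PUSHGATEWAY_PASSWORD}       # hypothetical environment variable
      push-interval: 30                       # seconds; the documented default
      clear-after: 60                         # seconds; assumes a 30-second scrape interval
```
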
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="metrics.cognite" /> `cognite`

Part of [`metrics`](#metrics) configuration.

Push metrics to CDF time series. Requires CDF credentials to be configured.

| Parameter                               | Type    | Description                                                                                      |
| --------------------------------------- | ------- | ------------------------------------------------------------------------------------------------ |
| `external-id-prefix`                    | string  | **Required.** Prefix on external ID used when creating CDF time series to store metrics.         |
| `asset-name`                            | string  | Enter the name for a CDF asset that will have all the metrics time series attached to it.        |
| `asset-external-id`                     | string  | Enter the external ID for a CDF asset that will have all the metrics time series attached to it. |
| `push-interval`                         | integer | Enter the interval in seconds between each push to CDF. Default value is `30`.                   |
| [`data-set`](#metrics.cognite.data-set) | object  | Data set the metrics will be created under                                                       |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="metrics.cognite.data-set" /> `data-set`

Part of [`cognite`](#metrics.cognite) configuration.

Data set the metrics will be created under.

| Parameter     | Type    | Description          |
| ------------- | ------- | -------------------- |
| `id`          | integer | Resource internal id |
| `external-id` | string  | Resource external id |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="metrics.server" /> `server`

Part of [`metrics`](#metrics) configuration.

The extractor can also expose an HTTP server with Prometheus metrics for scraping.

| Parameter | Type    | Description                                                             |
| --------- | ------- | ----------------------------------------------------------------------- |
| `host`    | string  | Host to run the Prometheus server on. Default value is `0.0.0.0`.       |
| `port`    | integer | Local port to expose the Prometheus server on. Default value is `9000`. |

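The two remaining metrics destinations can be sketched as follows: metrics are pushed to CDF time series once a minute and are also exposed on a local HTTP endpoint for scraping. The prefix, asset, and data set names are hypothetical.

```yaml
metrics:
  cognite:
    external-id-prefix: db-extractor-metrics-   # hypothetical prefix
    asset-name: DB extractor metrics            # hypothetical asset name
    asset-external-id: db-extractor-metrics     # hypothetical asset external ID
    push-interval: 60                           # seconds; default is 30
    data-set:
      external-id: my-metrics-data-set          # hypothetical data set
  server:
    host: 0.0.0.0                               # the documented default
    port: 9000                                  # the documented default
```
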
## <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries" /> `queries`

Global parameter.

List of queries to execute.

Each element of this list describes a SQL query against a database.

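As an illustration of how the parameters in the table below combine, here is a sketch of one incremental query that writes to CDF RAW every hour. The query name, database name, table, and column names are hypothetical. The sketch assumes a numeric `id` column, a state store configured elsewhere in the file, and an extractor running in continuous mode so that `schedule` takes effect.

```yaml
queries:
  - name: sensor-readings        # hypothetical; must be unique per query
    database: my-sql-server      # hypothetical; must match a name in the databases section
    query: >
      SELECT * FROM readings
      WHERE {incremental_field} > {start_at}
      ORDER BY {incremental_field} ASC
    incremental-field: id        # assumed numeric, increasing column
    initial-start: 0             # value of {start_at} on the first run
    destination:
      type: raw
      database: my-database
      table: my-table
    primary-key: row_{id}        # required for raw destinations
    schedule:
      type: interval
      expression: 1h
```

On later runs, `{start_at}` is taken from the state store instead of `initial-start`.
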
| Parameter                             | Type                                                                                       | Description                                                                                                                                                                                                                                                                                                                                                                                                   |
| ------------------------------------- | ------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `database`                            | string                                                                                     | **Required.** Enter the name of the database to connect to. This must be one of the database names configured in the `databases` section.                                                                                                                                                                                                                                                                     |
| `name`                                | string                                                                                     | **Required.** Enter a name of this query that will be used for logging and tagging metrics. The name must be unique for each query in the configuration file.                                                                                                                                                                                                                                                 |
| `query`                               | string                                                                                     | **Required.** SQL query to execute. Supports interpolation with `{incremental_field}` and `{start_at}`                                                                                                                                                                                                                                                                                                        |
| [`destination`](#queries.destination) | configuration for either RAW, Events, Assets, Time series, Sequence, Files, Nodes or Edges | **Required.** The destination of the data in CDF. <br /><br />**Examples:**<br />`{'destination': {'type': 'raw', 'database': 'my-database', 'table': 'my-table'}}`<br />`{'destination': {'type': 'events'}}`<br />`{'destination': {'type': 'time_series', 'destination_mode': 'cdm', 'data-model': {'space': 'my-space'}}}`<br />`{'destination': {'type': 'time_series', 'destination_mode': 'classic'}}` |
| `primary-key`                         | string                                                                                     | Insert the format of the row key in CDF RAW. This parameter supports case-sensitive substitutions with values from the table columns. For example, if there's a column called index, setting `primary-key: row_{index}` will result in rows with keys `row_0`, `row_1`, etc. This is a required value if the destination is a `raw` type.<br /><br />**Example:**<br />`row_{index}`                          |
| `incremental-field`                   | string                                                                                     | Insert the table column that holds the incremental field. Include to enable incremental loading, otherwise the extractor will default to a full run every time. To use incremental load, a state store is required. |
| `initial-start`                       | either string, number or integer                                                           | Enter the `{start_at}` value for the first run. Subsequent runs use the value stored in the state store. Required when `incremental-field` is set. |
| [`schedule`](#queries.schedule)       | configuration for either Fixed interval or CRON expression                                 | Enter the schedule for when this query should run. Don't schedule runs too frequently; leave enough time for the previous execution to finish. Required when running in continuous mode, ignored otherwise.<br /><br />**Examples:**<br />`{'schedule': {'type': 'interval', 'expression': '1h'}}`<br />`{'schedule': {'type': 'cron', 'expression': '0 7-17 * * 1-5'}}` |
| `collection`                          | string                                                                                     | Specify the collection on which the query will be executed. This parameter is mandatory when connecting to `mongodb` databases.                                                                                                                                                                                                                                                                               |
| `container`                           | string                                                                                     | Specify the container on which the query will be executed. This parameter is mandatory when connecting to `cosmosdb` databases.                                                                                                                                                                                                                                                                               |
| `sheet`                               | string                                                                                     | Specify the sheet on which the query will be executed. This parameter is mandatory when connecting to `spreadsheet` files.                                                                                                                                                                                                                                                                                    |
| `skip_rows`                           | string                                                                                     | Specify the number of rows to be skipped when reading a spreadsheet. This parameter is optional when connecting to `spreadsheet` files.                                                                                                                                                                                                                                                                       |
| `has_header`                          | string                                                                                     | Specify if the extractor should skip the file header while reading a spreadsheet. This parameter is optional when connecting to `spreadsheet` files.                                                                                                                                                                                                                                                          |
| `parameters`                          | string                                                                                     | Specify the parameters to be used when querying AWS DynamoDB. This parameter is mandatory when connecting to `dynamodb` databases. |
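
As a minimal sketch of how these parameters combine, the following query entry enables incremental loading into CDF RAW. The table and column names (`my_table`, `updated_at`, `index`) and the RAW database and table names are hypothetical, the `database` key is assumed to reference the `name` of an entry in the `databases` section, and the state store that incremental loading requires is configured elsewhere and not shown here.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: incremental-raw-load          # must be unique among all queries
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT * FROM my_table
    WHERE {incremental_field} >= '{start_at}'
    ORDER BY {incremental_field} ASC
  incremental-field: updated_at       # hypothetical column name
  initial-start: "2020-01-01"         # used only on the first run
  primary-key: "row_{index}"          # required because the destination is RAW
  destination:
    type: raw
    database: my-database
    table: my-table
  schedule:
    type: interval
    expression: 1h
```

On the first run `{start_at}` is replaced by `initial-start`. Later runs substitute the value kept in the state store.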

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination" /> `destination`

Part of [`queries`](#queries) configuration.

The destination of the data in CDF.

Either one of the following options:

* [RAW](#queries.destination.raw)
* [Events](#queries.destination.events)
* [Assets](#queries.destination.assets)
* [Time series](#queries.destination.time_series)
* [Sequence](#queries.destination.sequence)
* [Files](#queries.destination.files)
* [Nodes](#queries.destination.nodes)
* [Edges](#queries.destination.edges)

**Examples:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
destination:
  type: raw
  database: my-database
  table: my-table
```

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
destination:
  type: events
```

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
destination:
  type: time_series
  destination_mode: cdm
  data-model:
    space: my-space
```

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
destination:
  type: time_series
  destination_mode: classic
```

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.raw" /> `raw`

Part of [`destination`](#queries.destination) configuration.

The raw destination writes data to the CDF staging area (RAW). The raw destination requires the `primary-key` parameter in the query configuration.

| Parameter  | Type         | Description                                                                                             |
| ---------- | ------------ | ------------------------------------------------------------------------------------------------------- |
| `type`     | always `raw` | Type of CDF destination, set to `raw` to write data to RAW.                                             |
| `database` | string       | **Required.** Enter the CDF RAW database to upload data into. This will be created if it doesn't exist. |
| `table`    | string       | **Required.** Enter the CDF RAW table to upload data into. This will be created if it doesn't exist.    |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.events" /> `events`

Part of [`destination`](#queries.destination) configuration.

The `events` destination inserts the resulting data as CDF events. The events destination is configured by setting the `type` parameter to `events`. No other parameters are required.

To ingest data into events, the query must produce a column named

* `externalId`

In addition, columns named

* `startTime`
* `endTime`
* `description`
* `source`
* `type`
* `subType`

may be included and will be mapped to corresponding fields in CDF events. Any other columns returned by the query will be mapped to key/value pairs in the `metadata` field for events.

| Parameter | Type            | Description                                                       |
| --------- | --------------- | ----------------------------------------------------------------- |
| `type`    | always `events` | Type of CDF destination, set to `events` to write data to events. |
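
The column mapping described above could look like the following sketch. The source table `maintenance_log` and its columns are hypothetical, and the double-quoted aliases assume a database that preserves the case of quoted identifiers. The `site` column has no matching event field, so it would end up in `metadata`.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: maintenance-events
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      log_id     AS "externalId",
      started_at AS "startTime",
      ended_at   AS "endTime",
      summary    AS "description",
      category   AS "type",
      site
    FROM maintenance_log
  destination:
    type: events
```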

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.assets" /> `assets`

Part of [`destination`](#queries.destination) configuration.

The `assets` destination inserts the resulting data as CDF assets. The assets destination is configured by setting the `type` parameter to `assets`. No other parameters are required.

To ingest data into assets, the query must produce a column named

* `name`

In addition, columns named

* `externalId`
* `parentExternalId`
* `description`
* `source`

may be included and will be mapped to corresponding fields in CDF assets. Any other columns returned by the query will be mapped to key/value pairs in the `metadata` field for assets.

| Parameter | Type            | Description                                                       |
| --------- | --------------- | ----------------------------------------------------------------- |
| `type`    | always `assets` | Type of CDF destination, set to `assets` to write data to assets. |
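
A sketch of a query that builds an asset hierarchy through `parentExternalId` is shown below. The `equipment` table and its columns are hypothetical, and the quoted aliases assume a database that preserves the case of quoted identifiers.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: equipment-assets
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      tag_name   AS "name",
      tag_id     AS "externalId",
      parent_tag AS "parentExternalId",
      long_text  AS "description"
    FROM equipment
  destination:
    type: assets
```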

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.time_series" /> `time_series`

Part of [`destination`](#queries.destination) configuration.

The `time_series` destination inserts the resulting data as data points in time series. The time series destination is configured by setting the `type` parameter to `time_series`. No other parameters are required.

To ingest data into a time series, the query must produce columns named

* `externalId`
* `timestamp`
* `value`

In addition, include a column called `status` to give the data point a status code. Statuses include a category and an optional comma-separated list of modifier flags. Examples of status codes include `Good` (which is assumed if status is omitted), `UNCERTAIN, HIGH` and `bad`.

The extractor will insert data points into time series identified by the `externalId` column. If a time series does not exist, the extractor will create a minimal time series with only an external ID and the `isString` property inferred from the type of first data point processed for that time series. All other time series attributes need to be added separately.

| Parameter                                                   | Type                      | Description                                                                                                                                                             |
| ----------------------------------------------------------- | ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                                      | always `time_series`      | Type of CDF destination, set to `time_series` to write data to time series.                                                                                             |
| `destination_mode`                                          | either `cdm` or `classic` | Mode of the db extractor. Can be 'cdm' for Data models or 'classic' for legacy Timeseries.                                                                              |
| [`data-model`](#queries.destination.time_series.data-model) | object                    | Defines Data Model mapping information, including the target Space and Data Model. This property is only applicable when the destination `type` is set to `time_series`. |

##### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.time_series.data-model" /> `data-model`

Part of [`time_series`](#queries.destination.time_series) configuration.

Defines Data Model mapping information, including the target Space and Data Model. This property is only applicable when the destination `type` is set to `time_series`.

| Parameter | Type   | Description                                                                          |
| --------- | ------ | ------------------------------------------------------------------------------------ |
| `space`   | string | Enter the CDF space name. An error will occur if the specified space does not exist. |
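
Putting the required columns and the data model settings together, a time series query might be sketched as follows. The `sensor_readings` table and its columns are hypothetical, and the quoted aliases assume a database that preserves the case of quoted identifiers.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: sensor-datapoints
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      tag     AS "externalId",
      read_at AS "timestamp",
      reading AS "value",
      quality AS "status"
    FROM sensor_readings
  destination:
    type: time_series
    destination_mode: cdm
    data-model:
      space: my-space                 # must already exist in CDF
```

The `status` column is optional. Without it, every data point is treated as `Good`.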

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.sequence" /> `sequence`

Part of [`destination`](#queries.destination) configuration.

The `sequence` destination writes data to a CDF sequence.

The column set of the query result will determine the columns of the sequence.

The result must include a column named `row_number`, which must include an integer indicating which row number in the sequence to ingest the row into.

| Parameter     | Type                                 | Description                                                                                                                                                                                                                             |
| ------------- | ------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`        | always `sequence`                    | Type of CDF destination, set to `sequence` to write data to a sequence.                                                                                                                                                                 |
| `external-id` | string                               | **Required.** Configured sequence external ID                                                                                                                                                                                           |
| `value-types` | either `convert`, `drop` or `assert` | How types are converted into the expected types in CDF. Convert attempts to make a conversion, which may fail. Drop drops the row if there is a mismatch. Assert fails the query if the types do not match. Default value is `convert`. |
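
For illustration, the sketch below writes two value columns into a sequence and drops rows whose types don't match. The `well_survey` table, its columns, and the sequence external ID are hypothetical.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: well-survey-sequence
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      station_no AS row_number,
      depth,
      inclination
    FROM well_survey
  destination:
    type: sequence
    external-id: my-sequence          # hypothetical sequence external ID
    value-types: drop                 # default is `convert`
```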

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.files" /> `files`

Part of [`destination`](#queries.destination) configuration.

The `files` destination inserts the resulting data as CDF files. The files destination is configured by setting the `type` parameter to `files`. No other parameters are required.

To ingest data into files, the query must produce columns named

* `name`
* `externalId`
* `content`

`content` will be treated as binary data and uploaded to CDF files as the content of the file.

In addition, columns named

* `source`
* `mimeType`
* `directory`
* `sourceCreatedTime`
* `sourceModifiedTime`
* `asset_ids`

may be included and will be mapped to corresponding fields in CDF files. Any other columns returned by the query will be mapped to key/value pairs in the `metadata` field for files.

| Parameter                                             | Type                      | Description                                                                                                                                                        |
| ----------------------------------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `type`                                                | always `files`            | Type of CDF destination, set to `files` to write data to CDF files.                                                                                                |
| `content-column`                                      | string                    | Column used as file content. Default value is `content`.                                                                                                           |
| `destination-mode`                                    | either `cdm` or `classic` | Mode of the db extractor. Can be 'cdm' for Data models or 'classic' for legacy Files.                                                                              |
| [`data-model`](#queries.destination.files.data-model) | object                    | Defines Data Model mapping information, including the target Space and Data Model. This property is only applicable when the destination `type` is set to `files`. |

##### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.files.data-model" /> `data-model`

Part of [`files`](#queries.destination.files) configuration.

Defines Data Model mapping information, including the target Space and Data Model. This property is only applicable when the destination `type` is set to `files`.

| Parameter | Type   | Description                                                                          |
| --------- | ------ | ------------------------------------------------------------------------------------ |
| `space`   | string | Enter the CDF space name. An error will occur if the specified space does not exist. |
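
A files destination that targets data models could be sketched like this. The `documents` table and its columns are hypothetical, and the quoted aliases assume a database that preserves the case of quoted identifiers. The binary column is aliased to `content`, which is the default value of `content-column`.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: document-files
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      file_name AS "name",
      doc_id    AS "externalId",
      body      AS "content",
      mime      AS "mimeType"
    FROM documents
  destination:
    type: files
    destination-mode: cdm
    data-model:
      space: my-space                 # must already exist in CDF
```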

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.nodes" /> `nodes`

Part of [`destination`](#queries.destination) configuration.

The `nodes` destination inserts the resulting data into nodes containers. The nodes destination is configured by setting the `type` parameter to `nodes`. You must also specify the target data model with the `space` and `view` parameters.

To ingest data into nodes, the query must produce a column named

* `externalId`

In addition, columns that match fields in the data model may be included and will be mapped to the corresponding fields in CDF nodes. Any other columns returned by the query are either not ingested or cause an error.

| Parameter                                 | Type           | Description                                                                          |
| ----------------------------------------- | -------------- | ------------------------------------------------------------------------------------ |
| `type`                                    | always `nodes` | Type of CDF destination, set to `nodes` to write data to CDF nodes.                  |
| `space`                                   | string         | Enter the CDF space name. An error will occur if the specified space does not exist. |
| [`view`](#queries.destination.nodes.view) | object         | Enter the CDF view/container configuration. This is required to write to nodes.      |

##### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.nodes.view" /> `view`

Part of [`nodes`](#queries.destination.nodes) configuration.

Enter the CDF view/container configuration. This is required to write to nodes.

| Parameter     | Type   | Description                                                                                                   |
| ------------- | ------ | ------------------------------------------------------------------------------------------------------------- |
| `external-id` | string | Enter the CDF view/container external ID. An error will occur if the specified view/container does not exist. |
| `space`       | string | Enter the CDF view/container space. An error will occur if the specified space does not exist.                |
| `version`     | string | Enter the version of the view/container. It can be null if the view/container is not versioned.               |
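
As a sketch, a complete `nodes` destination with its `view` could look like the following. The space, view external ID, and version are hypothetical values. The version is quoted so that YAML reads it as a string.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
destination:
  type: nodes
  space: my-space                     # space the nodes are written to
  view:
    external-id: MyView               # hypothetical view/container external ID
    space: my-space
    version: "1"
```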

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.edges" /> `edges`

Part of [`destination`](#queries.destination) configuration.

The `edges` destination inserts the resulting data into edges containers. The edges destination is configured by setting the `type` parameter to `edges`. You must also specify the target data model with the `space` and `view` parameters.

To ingest data into edges, the query must produce columns named

* `externalId`
* `startNodeExternalId`
* `startNodeSpace`
* `endNodeExternalId`
* `endNodeSpace`
* `typeExternalId`
* `typeSpace`

| Parameter                                 | Type           | Description                                                                          |
| ----------------------------------------- | -------------- | ------------------------------------------------------------------------------------ |
| `type`                                    | always `edges` | Type of CDF destination, set to `edges` to write data to CDF edges.                  |
| `space`                                   | string         | Enter the CDF space name. An error will occur if the specified space does not exist. |
| [`view`](#queries.destination.edges.view) | object         | Enter the CDF view/container configuration. This is required to write to edges.      |

##### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.destination.edges.view" /> `view`

Part of [`edges`](#queries.destination.edges) configuration.

Enter the CDF view/container configuration. This is required to write to edges.

| Parameter     | Type   | Description                                                                                                   |
| ------------- | ------ | ------------------------------------------------------------------------------------------------------------- |
| `external-id` | string | Enter the CDF view/container external ID. An error will occur if the specified view/container does not exist. |
| `space`       | string | Enter the CDF view/container space. An error will occur if the specified space does not exist.                |
| `version`     | string | Enter the version of the view/container. It can be null if the view/container is not versioned.               |
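
The required edge columns can be produced with aliases and constant values, as in this sketch. The `pipe_connections` table, its columns, and all space, type, and view identifiers are hypothetical, and the quoted aliases assume a database that preserves the case of quoted identifiers.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
queries:
- name: pipe-connection-edges
  database: my-odbc-database          # assumed: name of an entry under `databases`
  query: |
    SELECT
      connection_id AS "externalId",
      from_id       AS "startNodeExternalId",
      'my-space'    AS "startNodeSpace",
      to_id         AS "endNodeExternalId",
      'my-space'    AS "endNodeSpace",
      'flows-to'    AS "typeExternalId",
      'my-space'    AS "typeSpace"
    FROM pipe_connections
  destination:
    type: edges
    space: my-space
    view:
      external-id: MyEdgeView         # hypothetical view/container external ID
      space: my-space
      version: "1"
```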

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.schedule" /> `schedule`

Part of [`queries`](#queries) configuration.

Enter the schedule for when this query should run. Don't schedule runs too frequently; leave enough time for the previous execution to finish. Required when running in continuous mode, ignored otherwise.

Either one of the following options:

* [Fixed interval](#queries.schedule.fixed_interval)
* [CRON expression](#queries.schedule.cron_expression)

**Examples:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
schedule:
  type: interval
  expression: 1h
```

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
schedule:
  type: cron
  expression: 0 7-17 * * 1-5
```

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.schedule.fixed_interval" /> `fixed_interval`

Part of [`schedule`](#queries.schedule) configuration.

| Parameter    | Type              | Description                                                                                                                                                                             |
| ------------ | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`       | always `interval` | **Required.** Type of time interval configuration. Use `interval` to configure a fixed interval.                                                                                        |
| `expression` | string            | **Required.** Enter a time interval, with a unit. Available units are `s` (seconds), `m` (minutes), `h` (hours) and `d` (days).<br /><br />**Examples:**<br />`45s`<br />`15m`<br />`2h` |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="queries.schedule.cron_expression" /> `cron_expression`

Part of [`schedule`](#queries.schedule) configuration.

| Parameter    | Type          | Description                                                                                                                                                             |
| ------------ | ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`       | always `cron` | **Required.** Type of time interval configuration. Use `cron` to configure a CRON schedule.                                                                             |
| `expression` | string        | **Required.** Enter a CRON expression. See [crontab.guru](https://crontab.guru) for a guide on writing CRON expressions.<br /><br />**Example:**<br />`*/15 8-16 * * *` |

## <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases" /> `databases`

Global parameter.

List of databases to connect to.

Each element of this list should be a configuration for a database the extractor will connect to.

Either one of the following options:

* [ODBC](#databases.odbc)
* [PostgreSQL](#databases.postgresql)
* [Oracle DB](#databases.oracle_db)
* [Snowflake](#databases.snowflake)
* [MongoDB](#databases.mongodb)
* [Azure Cosmos DB](#databases.azure_cosmos_db)
* [Local spreadsheet files](#databases.local_spreadsheet_files)
* [Amazon Dynamo DB](#databases.amazon_dynamo_db)
* [Amazon Redshift](#databases.amazon_redshift)
* [Google BigQuery](#databases.google_bigquery)

**Example:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
databases:
- type: odbc
  name: my-odbc-database
  connection-string: DRIVER={Oracle 19.3};DBQ=localhost:1521/XE;UID=SYSTEM;PWD=oracle
- type: postgres
  name: postgres-db
  host: pg.company.com
  user: postgres
  password: secret123Pas$word
- type: postgres
  name: postgres-db-with-source
  host: pg.company.com
  user: postgres
  password: secret123Pas$word
  source:
    name: test_source
    external-id: test_source_id_1
```

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.odbc" /> `odbc`

Part of [`databases`](#databases) configuration.

Open Database Connectivity (ODBC) is a generic protocol for querying databases. To connect to a database using ODBC, you must first download and install an ODBC driver for your database system on the machine running the extractor. Consult the documentation or contact the vendor of your database system to find its driver.

**Example:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: odbc
name: asset-database
connection-string: Driver={ODBC Driver 17 for SQL Server};Server=10.24.5.162;Database=assets;UID=extractorUser;PWD=myPassword;
```

| Parameter                          | Type                                         | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| ---------------------------------- | -------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                             | always `odbc`                                | Select the type of database connection. Set to `odbc` for ODBC databases.                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `connection-string`                | string                                       | **Required.** Enter the ODBC connection string. This will differ between database vendors.<br /><br />**Examples:**<br />`DRIVER={Oracle 19.3};DBQ=localhost:1521/XE;UID=SYSTEM;PWD=oracle`<br />`DSN={MyDatabaseDsn}`                                                                                                                                                                                                                                                                                                  |
| `response-encoding`                | string                                       | Override the encoding to expect on database responses if the driver does not adhere to the ODBC standard. Default is to follow the ODBC standard.<br /><br />**Examples:**<br />`utf8`<br />`iso-8859-1`                                                                                                                                                                                                                                                                                                                |
| `query-encoding`                   | string                                       | Override the encoding to use on database queries if the driver does not adhere to the ODBC standard. Default is to follow the ODBC standard.<br /><br />**Examples:**<br />`utf8`<br />`iso-8859-1`                                                                                                                                                                                                                                                                                                                     |
| `timeout`                          | integer                                      | Enter the timeout in seconds for the ODBC connection and queries. The default value is no timeout.<br /><br />Some ODBC drivers don't accept the `SQL_ATTR_CONNECTION_TIMEOUT` or the `SQL_ATTR_QUERY_TIMEOUT` option. In that case, the extractor logs an exception with the message `Could not set timeout on the ODBC driver - timeouts might not work properly`. Extractions continue regardless, but without timeouts. To avoid this log line, disable timeouts for the database causing the problem. |
| `batch-size`                       | integer                                      | Enter the number of rows to fetch from the database at a time. You can decrease this number if the machine with the extractor runs out of memory. Note that this will increase the run time. Default value is `1000`.                                                                                                                                                                                                                                                                                                   |
| `name`                             | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                                                                                                                                                                                                                  |
| `timezone`                         | `local`, `utc`, or numeric offset from UTC   | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5`                                                                                                                                                                                       |
| [`source`](#databases.odbc.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                                                                                                                                                                                                              |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.odbc.source" /> `source`

Part of [`odbc`](#databases.odbc) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.postgresql" /> `postgresql`

Part of [`databases`](#databases) configuration.

**Example:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: postgres
name: my-database
host: 10.42.39.12
user: extractor-user
password: mySecretPassword
```

| Parameter                                | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| ---------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                   | always `postgres`                            | **Required.** Type of database connection, set to `postgres` for PostgreSQL databases.                                                                                                                                                                                                                                            |
| `host`                                   | string                                       | **Required.** Enter the hostname or address of the PostgreSQL database.<br /><br />**Examples:**<br />`123.234.123.234`<br />`postgres.my-domain.com`<br />`localhost`                                                                                                                                                            |
| `user`                                   | string                                       | **Required.** Enter the username for the PostgreSQL database.                                                                                                                                                                                                                                                                     |
| `password`                               | string                                       | **Required.** Enter the password for the PostgreSQL database.                                                                                                                                                                                                                                                                     |
| `database`                               | string                                       | Enter the database name to use. The default is to use the user name.                                                                                                                                                                                                                                                              |
| `port`                                   | integer                                      | Enter the port to connect to. Default value is `5432`.                                                                                                                                                                                                                                                                            |
| `timeout`                                | integer                                      | Enter the timeout in seconds for the database connection and queries. The default value is no timeout.                                                                                                                                                                                                                            |
| `batch-size`                             | integer                                      | Enter the number of rows to fetch from the database at a time. You can decrease this number if the machine with the extractor runs out of memory. Note that this will increase the run time. Default value is `1000`.                                                                                                             |
| `name`                                   | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                               | `local`, `utc`, or numeric offset from UTC   | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.postgresql.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |
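
As a sketch of how the optional parameters combine with the required ones, the entry below sets a non-default port, an explicit database, a timeout, a smaller batch size, and a `source` block. All values are hypothetical placeholders, including the `POSTGRES_PASSWORD` environment variable name.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: postgres
name: my-database
host: postgres.my-domain.com
port: 5433                       # hypothetical; the default is 5432
database: production             # hypothetical; defaults to the user name
user: extractor-user
password: ${POSTGRES_PASSWORD}   # hypothetical environment variable
timeout: 60                      # seconds
batch-size: 500                  # lower than the default 1000 to reduce memory use
timezone: utc
source:
  name: Production PostgreSQL    # hypothetical source name
  external-id: postgres-production
```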

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.postgresql.source" /> `source`

Part of [`postgresql`](#databases.postgresql) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.oracle_db" /> `oracle_db`

Part of [`databases`](#databases) configuration.

The Cognite DB Extractor can connect directly to an Oracle Database version 12.1 or later.

**Example:**

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: oracle
name: my-database
host: 10.42.39.12
user: extractor-user
password: mySecretPassword
```

| Parameter                               | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| --------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                  | always `oracle`                              | Type of database connection, set to `oracle` for Oracle databases.                                                                                                                                                                                                                                                                |
| `host`                                  | string                                       | **Required.** Enter the hostname or address of the Oracle database.<br /><br />**Examples:**<br />`123.234.123.234`<br />`database.my-domain.com`<br />`localhost`                                                                                                                                                                |
| `user`                                  | string                                       | **Required.** Enter the user name                                                                                                                                                                                                                                                                                                 |
| `password`                              | string                                       | **Required.** Enter the user password                                                                                                                                                                                                                                                                                             |
| `port`                                  | integer                                      | Enter the port to connect to. Default value is `1521`.                                                                                                                                                                                                                                                                            |
| `service-name`                          | string                                       | Optionally specify the service name of the database to connect to                                                                                                                                                                                                                                                                 |
| `timeout`                               | integer                                      | Timeout for statements to the database                                                                                                                                                                                                                                                                                            |
| `batch-size`                            | integer                                      | Enter the number of rows to fetch from the database at a time. You can decrease this number if the machine with the extractor runs out of memory. Note that this will increase the run time. Default value is `1000`.                                                                                                             |
| `name`                                  | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                              | `local`, `utc`, or numeric offset from UTC   | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.oracle_db.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |
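
A variant of the example above that also sets the optional `port`, `service-name`, and `timezone` parameters might look like the following. The service name and the `ORACLE_PASSWORD` environment variable are hypothetical placeholders.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: oracle
name: my-database
host: database.my-domain.com
port: 1521                       # the default, shown for clarity
service-name: ORCLPDB1           # hypothetical service name
user: extractor-user
password: ${ORACLE_PASSWORD}     # hypothetical environment variable
timezone: utc
```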

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.oracle_db.source" /> `source`

Part of [`oracle_db`](#databases.oracle_db) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.snowflake" /> `snowflake`

Part of [`databases`](#databases) configuration.

| Parameter                                         | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| ------------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                            | always `snowflake`                           | Type of database connection, set to `snowflake` for Snowflake data warehouses.                                                                                                                                                                                                                                                    |
| `authentication_type`                             | string                                       | Authentication type for Snowflake. Can be either `plain` or `oauth`.                                                                                                                                                                                                                                                              |
| `user`                                            | string                                       | User name for Snowflake. Required if you are using plain authentication.                                                                                                                                                                                                                                                          |
| `password`                                        | string                                       | Password for Snowflake. Required if you are using plain authentication.                                                                                                                                                                                                                                                           |
| [`private_key`](#databases.snowflake.private_key) | object                                       | Private key for Snowflake. Include this to use private-key authentication.                                                                                                                                                                                                                                                        |
| `client_id`                                       | string                                       | Client ID for OAuth authentication. Required if you are using OAuth authentication.                                                                                                                                                                                                                                               |
| `client_secret`                                   | string                                       | Client secret for OAuth authentication. Required if you are using OAuth authentication.                                                                                                                                                                                                                                           |
| `access_token_generate_url`                       | string                                       | Azure OAuth 2.0 token endpoint. Required if you are using OAuth authentication.                                                                                                                                                                                                                                                   |
| `oauth_scopes`                                    | string                                       | Scopes for the OAuth user. Required if you are using OAuth authentication.                                                                                                                                                                                                                                                        |
| `account`                                         | string                                       | **Required.** Snowflake account ID                                                                                                                                                                                                                                                                                                |
| `compute_warehouse`                               | string                                       | Snowflake warehouse to use. Required if you are using OAuth authentication; optional if you are using plain authentication.                                                                                                                                                                                                       |
| `organization`                                    | string                                       | **Required.** Snowflake organization name                                                                                                                                                                                                                                                                                         |
| `database`                                        | string                                       | **Required.** Snowflake database to use                                                                                                                                                                                                                                                                                           |
| `schema`                                          | string                                       | **Required.** Snowflake schema to use                                                                                                                                                                                                                                                                                             |
| `name`                                            | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                        | `local`, `utc`, or numeric offset from UTC   | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.snowflake.source)           | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |
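
As an illustration of the parameters above, a Snowflake entry using plain authentication might look like the following. This is a minimal sketch: the organization, account, warehouse, database, and schema names are hypothetical, as is the `SNOWFLAKE_PASSWORD` environment variable.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: snowflake
name: my-snowflake
authentication_type: plain
user: extractor-user
password: ${SNOWFLAKE_PASSWORD}  # hypothetical environment variable
organization: my-organization    # hypothetical organization name
account: my-account              # hypothetical account ID
compute_warehouse: COMPUTE_WH    # optional with plain authentication
database: MY_DATABASE
schema: PUBLIC
```

For OAuth authentication, set `authentication_type` to `oauth` and provide `client_id`, `client_secret`, `access_token_generate_url`, `oauth_scopes`, and `compute_warehouse` in place of `user` and `password`.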

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.snowflake.private_key" /> `private_key`

Part of [`snowflake`](#databases.snowflake) configuration.

Private key for Snowflake. Include this to use private-key authentication.

| Parameter    | Type   | Description                                                                    |
| ------------ | ------ | ------------------------------------------------------------------------------ |
| `path`       | string | **Required.** Path to the private key PEM file for Snowflake.                  |
| `passphrase` | string | Passphrase for the private key file. Required if the private key is encrypted. |
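
Within a `snowflake` database entry, the `private_key` block might be written as in the sketch below. The file path and the `SNOWFLAKE_KEY_PASSPHRASE` environment variable are hypothetical, and the remaining Snowflake parameters are omitted.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
private_key:
  path: /etc/extractor/snowflake_key.pem    # hypothetical path to the PEM file
  passphrase: ${SNOWFLAKE_KEY_PASSPHRASE}   # only needed if the key is encrypted
```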

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.snowflake.source" /> `source`

Part of [`snowflake`](#databases.snowflake) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.mongodb" /> `mongodb`

Part of [`databases`](#databases) configuration.

| Parameter                             | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| ------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                | always `mongodb`                             | Type of database connection, set to `mongodb` for MongoDB databases.                                                                                                                                                                                                                                                              |
| `uri`                                 | string                                       | **Required.** Address and authentication data for the database as a Uniform Resource Identifier (URI). You can read more about MongoDB URIs [here](https://www.mongodb.com/docs/manual/reference/connection-string).<br /><br />**Example:**<br />`mongodb://mymongo:port/?retryWrites=true&connectTimeoutMS=10000`               |
| `database`                            | string                                       | **Required.** Name of the related MongoDB database to use.                                                                                                                                                                                                                                                                        |
| `name`                                | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                            | `local`, `utc`, or numeric offset from UTC   | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.mongodb.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |
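
A MongoDB entry built from the parameters above could look like this sketch. The host, port, and database name are hypothetical. If the URI contains credentials, consider loading it from an environment variable instead of writing it in the file.

```yaml theme={"languages":{"custom":["/_languages/kuiper.json","../_languages/kuiper.json"]}}
type: mongodb
name: my-mongodb
# Hypothetical host and port
uri: mongodb://mymongo:27017/?retryWrites=true&connectTimeoutMS=10000
database: my-database
timezone: utc
```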

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.mongodb.source" /> `source`

Part of [`mongodb`](#databases.mongodb) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.azure_cosmos_db" /> `azure_cosmos_db`

Part of [`databases`](#databases) configuration.

| Parameter                                     | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| --------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                        | always `cosmosdb`                            | Type of database connection, set to `cosmosdb` for Cosmos DB databases.                                                                                                                                                                                                                                                           |
| `host`                                        | string                                       | **Required.** Host address for the database<br /><br />**Example:**<br />`https://my-cosmos-db.documents.azure.com`                                                                                                                                                                                                               |
| `key`                                         | string                                       | **Required.** Azure Key used to connect to the Cosms DB instance                                                                                                                                                                                                                                                                  |
| `database`                                    | string                                       | **Required.** Database name to use                                                                                                                                                                                                                                                                                                |
| `name`                                        | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                    | `local`, `utc`, or a numeric offset from UTC | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.azure_cosmos_db.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.azure_cosmos_db.source" /> `source`

Part of [`azure_cosmos_db`](#databases.azure_cosmos_db) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

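As a minimal sketch of how the parameters above fit together, a Cosmos DB entry might look like the following. It assumes `databases` is a list of connection entries; the database names, the source identifiers, and the `COSMOS_DB_KEY` environment variable are placeholders, not values required by the extractor.

```yaml
databases:
  - type: cosmosdb
    # Placeholder name, referenced from the queries section and in logs
    name: my-cosmos-db
    host: https://my-cosmos-db.documents.azure.com
    # Hypothetical environment variable holding the Azure key
    key: ${COSMOS_DB_KEY}
    database: my-database
    # Treat timestamps without timezone data as UTC
    timezone: utc
    source:
      name: My Cosmos DB
      external-id: my-cosmos-db-source
```

Loading the key from an environment variable keeps the secret out of the configuration file.
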
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.local_spreadsheet_files" /> `local_spreadsheet_files`

Part of [`databases`](#databases) configuration.

The Cognite DB extractor can run against Excel spreadsheets and other files containing tabular data. The currently supported file types are:

* xlsx, xlsm, and xlsb (modern Excel files)
* xls (legacy Excel files)
* odf, ods, and odt (OpenDocument Format, used by, for example, LibreOffice and OpenOffice)
* csv (comma-separated values)

When using Excel or OpenDocument Format spreadsheets, you need to provide an additional `sheet` parameter in the associated [query configuration](#-queries).

| Parameter                                             | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| ----------------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                                | always `spreadsheet`                         | Type of connection, set to `spreadsheet` for local spreadsheet files.                                                                                                                                                                                                                                                             |
| `path`                                                | string                                       | **Required.** Path to a single spreadsheet file<br /><br />**Examples:**<br />`/path/to/my/excel/file.xlsx`<br />`./relative/path/file.csv`<br />`C:\\Users\\Robert\\Documents\\spreadsheet.xls`                                                                                                                                  |
| `name`                                                | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                            | `local`, `utc`, or a numeric offset from UTC | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.local_spreadsheet_files.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.local_spreadsheet_files.source" /> `source`

Part of [`local_spreadsheet_files`](#databases.local_spreadsheet_files) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

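For illustration, a spreadsheet connection could be configured roughly as follows. This sketch assumes `databases` is a list of connection entries; the file path and names are placeholders.

```yaml
databases:
  - type: spreadsheet
    # Placeholder name, referenced from the queries section and in logs
    name: my-spreadsheet
    # Path to a single file; an Excel or OpenDocument file also needs
    # a `sheet` parameter in the query that reads from it
    path: /path/to/my/excel/file.xlsx
    # Interpret timestamps without timezone data as UTC+5:30
    timezone: 5.5
```
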
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.amazon_dynamo_db" /> `amazon_dynamo_db`

Part of [`databases`](#databases) configuration.

| Parameter                                      | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| ---------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                         | always `dynamodb`                            | Type of database connection, set to `dynamodb` for DynamoDB databases.                                                                                                                                                                                                                                                            |
| `aws-access-key-id`                            | string                                       | **Required.** AWS authentication access key ID                                                                                                                                                                                                                                                                                    |
| `aws-secret-access-key`                        | string                                       | **Required.** AWS authentication access key secret                                                                                                                                                                                                                                                                                |
| `region-name`                                  | string                                       | **Required.** AWS region where your database is located.<br /><br />**Example:**<br />`us-east-1`                                                                                                                                                                                                                                 |
| `name`                                         | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                     | `local`, `utc`, or a numeric offset from UTC | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.amazon_dynamo_db.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.amazon_dynamo_db.source" /> `source`

Part of [`amazon_dynamo_db`](#databases.amazon_dynamo_db) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

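A DynamoDB entry might be sketched like this, assuming `databases` is a list of connection entries. The environment variable names and the database name are placeholders chosen for the example.

```yaml
databases:
  - type: dynamodb
    # Placeholder name, referenced from the queries section and in logs
    name: my-dynamodb
    # Hypothetical environment variables holding the AWS credentials
    aws-access-key-id: ${AWS_ACCESS_KEY_ID}
    aws-secret-access-key: ${AWS_SECRET_ACCESS_KEY}
    region-name: us-east-1
```
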
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.amazon_redshift" /> `amazon_redshift`

Part of [`databases`](#databases) configuration.

| Parameter                                     | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| --------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                        | always `redshift`                            | Type of database connection, set to `redshift` for Redshift databases.                                                                                                                                                                                                                                                            |
| `aws-access-key-id`                           | string                                       | **Required.** AWS authentication access key ID                                                                                                                                                                                                                                                                                    |
| `aws-secret-access-key`                       | string                                       | **Required.** AWS authentication access key secret                                                                                                                                                                                                                                                                                |
| `region-name`                                 | string                                       | **Required.** AWS region where your database is located.<br /><br />**Example:**<br />`us-east-1`                                                                                                                                                                                                                                 |
| `database`                                    | string                                       | **Required.** Redshift database                                                                                                                                                                                                                                                                                                   |
| `secret-arn`                                  | string                                       | AWS Secret ARN                                                                                                                                                                                                                                                                                                                    |
| `cluster-identifier`                          | string                                       | Name of the Redshift cluster to connect to. This parameter is required when connecting to a managed Redshift cluster.                                                                                                                                                                                                             |
| `workgroup-name`                              | string                                       | Name of the Redshift workgroup to connect to. This parameter is required when connecting to a Redshift Serverless database.                                                                                                                                                                                                       |
| `name`                                        | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                    | `local`, `utc`, or a numeric offset from UTC | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.amazon_redshift.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.amazon_redshift.source" /> `source`

Part of [`amazon_redshift`](#databases.amazon_redshift) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

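The choice between `cluster-identifier` and `workgroup-name` depends on the kind of Redshift deployment. The sketch below shows a managed cluster and leaves the serverless alternative as a comment. It assumes `databases` is a list of connection entries; all names and environment variables are placeholders.

```yaml
databases:
  - type: redshift
    # Placeholder name, referenced from the queries section and in logs
    name: my-redshift
    # Hypothetical environment variables holding the AWS credentials
    aws-access-key-id: ${AWS_ACCESS_KEY_ID}
    aws-secret-access-key: ${AWS_SECRET_ACCESS_KEY}
    region-name: us-east-1
    database: my-database
    # Managed Redshift cluster: set cluster-identifier
    cluster-identifier: my-redshift-cluster
    # Redshift Serverless: set workgroup-name instead
    # workgroup-name: my-workgroup
```
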
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.google_bigquery" /> `google_bigquery`

Part of [`databases`](#databases) configuration.

The Cognite DB extractor can run against Google BigQuery using Google SQL-like queries.

Because this connection builds on the Google SDK, you authenticate with Google's recommended authentication method: set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your authentication key file.

| Parameter                                     | Type                                         | Description                                                                                                                                                                                                                                                                                                                       |
| --------------------------------------------- | -------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                                        | always `bigquery`                            | Type of database connection, set to `bigquery` for Google BigQuery                                                                                                                                                                                                                                                                |
| `name`                                        | string                                       | Enter a name for the database that will be used throughout the `queries` section and for logging. The name must be unique for each database in the configuration file.                                                                                                                                                            |
| `timezone`                                    | `local`, `utc`, or a numeric offset from UTC | Specify how the extractor should handle timestamps from the source when timezone data is absent. Either `local` for the local timezone on the machine the extractor is running on, `utc` for UTC, or a number for a numerical offset from UTC. Default value is `local`.<br /><br />**Examples:**<br />`utc`<br />`-8`<br />`5.5` |
| [`source`](#databases.google_bigquery.source) | object                                       | Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.                                                                                                                                                                                        |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="databases.google_bigquery.source" /> `source`

Part of [`google_bigquery`](#databases.google_bigquery) configuration.

Represents the system from which the data originates, including identifying information such as the name and a unique external identifier.

| Parameter     | Type   | Description                                                                        |
| ------------- | ------ | ---------------------------------------------------------------------------------- |
| `name`        | string | User-given name of the source system or application from which the data originates |
| `external-id` | string | A user-given unique identifier (external-id) for the source.                       |

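Since the credentials come from the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, the BigQuery entry itself carries no connection secrets. A minimal sketch, assuming `databases` is a list of connection entries and using placeholder names:

```yaml
databases:
  - type: bigquery
    # Placeholder name, referenced from the queries section and in logs
    name: my-bigquery
    timezone: utc
    source:
      name: My BigQuery project
      external-id: my-bigquery-source
```

Set `GOOGLE_APPLICATION_CREDENTIALS` in the environment of the process that runs the extractor before starting it.
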
## <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="extractor" /> `extractor`

Global parameter.

General extractor configuration.

| Parameter                               | Type                            | Description                                                                                                                                                                                   |
| --------------------------------------- | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [`state-store`](#extractor.state-store) | object                          | Include the state store section to save extraction states between runs. Use this if data is loaded incrementally. We support multiple state stores, but you can only configure one at a time. |
| `upload-queue-size`                     | integer                         | Maximum size of upload queue. Upload to CDF will be triggered once this limit is reached. Default value is `100000`.                                                                          |
| `parallelism`                           | integer                         | Maximum number of parallel queries. Default value is `4`.                                                                                                                                     |
| `mode`                                  | either `continuous` or `single` | Extractor mode. `continuous` runs the configured queries on the schedule configured for each query. `single` runs each query once.                                                            |

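As an illustration, the general parameters could be set as follows. The `parallelism` and `upload-queue-size` values shown are the documented defaults, written out explicitly; the choice of `continuous` is only an example.

```yaml
extractor:
  # Run queries on their configured schedules rather than once
  mode: continuous
  # Maximum number of queries running in parallel (default)
  parallelism: 4
  # Upload to CDF is triggered when the queue reaches this size (default)
  upload-queue-size: 100000
```
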
### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="extractor.state-store" /> `state-store`

Part of [`extractor`](#extractor) configuration.

Include the state store section to save extraction states between runs. Use this if data is loaded incrementally. We support multiple state stores, but you can only configure one at a time.

| Parameter                               | Type   | Description                                                                          |
| --------------------------------------- | ------ | ------------------------------------------------------------------------------------ |
| [`raw`](#extractor.state-store.raw)     | object | A RAW state store stores the extraction state in a table in CDF RAW.                 |
| [`local`](#extractor.state-store.local) | object | A local state store stores the extraction state in a JSON file on the local machine. |

#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="extractor.state-store.raw" /> `raw`

Part of [`state-store`](#extractor.state-store) configuration.

A RAW state store stores the extraction state in a table in CDF RAW.

| Parameter         | Type    | Description                                                                          |
| ----------------- | ------- | ------------------------------------------------------------------------------------ |
| `database`        | string  | **Required.** Enter the database name in CDF RAW.                                    |
| `table`           | string  | **Required.** Enter the table name in CDF RAW.                                       |
| `upload-interval` | integer | Enter the interval in seconds between each upload to CDF RAW. Default value is `30`. |

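A RAW state store could be configured along these lines. The database and table names are placeholders; `upload-interval` is shown at its default value.

```yaml
extractor:
  state-store:
    raw:
      # Placeholder CDF RAW database and table names
      database: extractor-states
      table: db-extractor
      # Seconds between each upload of state to CDF RAW (default)
      upload-interval: 30
```
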
#### <a class="anchorWithStickyNavbar_src-theme-Heading-styles-module" name="extractor.state-store.local" /> `local`

Part of [`state-store`](#extractor.state-store) configuration.

A local state store stores the extraction state in a JSON file on the local machine.

| Parameter       | Type    | Description                                                             |
| --------------- | ------- | ----------------------------------------------------------------------- |
| `path`          | string  | **Required.** Insert the file path to a JSON file.                      |
| `save-interval` | integer | Enter the interval in seconds between each save. Default value is `30`. |

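A local state store might be sketched like this. The file path is a placeholder; `save-interval` is shown at its default value. Only one state store can be configured at a time, so this block would replace a `raw` block rather than sit next to it.

```yaml
extractor:
  state-store:
    local:
      # Placeholder path to the JSON file that holds the extraction state
      path: ./db-extractor-states.json
      # Seconds between each save of the state file (default)
      save-interval: 30
```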