Hosted extractors run inside Cognite Data Fusion (CDF) and are intended for live data streams with low latency.
A hosted extractor job reads from a source, transforms the data using a built-in format or a mapping, and writes to a destination.
A hosted extractor source represents an external source system on the internet. The source resource in CDF contains all the information the extractor needs to connect to that system.
A source can have many jobs, each streaming different data from the source system.
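As a minimal sketch, a source can be registered through the CDF API. The snippet below assumes an MQTT broker; the endpoint path and field names follow common CDF API conventions but are illustrative, so consult the API reference for the authoritative schema.

```python
import os
import requests

# Illustrative sketch: registering an MQTT source with the hosted
# extractors API. The endpoint path and body fields are assumptions;
# see the API reference for the actual schema.
project = os.environ["CDF_PROJECT"]
cluster = os.environ.get("CDF_CLUSTER", "api")
token = os.environ["CDF_TOKEN"]  # OAuth bearer token for CDF

url = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/hostedextractors/sources"
body = {
    "items": [
        {
            "externalId": "plant-mqtt-broker",  # hypothetical source id
            "type": "mqtt5",
            "host": "mqtt.example.com",
            "port": 1883,
        }
    ]
}
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
```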
A hosted extractor job represents the running extractor. Jobs produce logs and metrics that report the state of the job. A job can be in one of nine states (the sketch after the table shows one way to poll for them):

| State | Description |
|---|---|
| Paused | The job is temporarily stopped. |
| Waiting to start | The job is temporarily stopped and pending start. This state typically lasts only a few seconds. |
| Stopping | The job is running but has been asked to stop. This state should last at most a few seconds. |
| Startup error | The job failed to start and will not attempt to restart. Check the job's configuration settings to resolve this state. |
| Connection error | The job failed to connect to the source system and is currently retrying. |
| Connected | The job is connected to the source system but has not yet received any data. |
| Transform error | The job is connected to the source system and has received data, but the data failed to transform into a CDF resource type. |
| Destination error | The job transformed data successfully but failed to ingest it into a CDF resource type. |
| Running | The job is streaming data into CDF. |
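A monitoring script might list jobs and flag the error states. The sketch below assumes a list endpoint returning an `items` array and a `status` field with snake_case state names; both are assumptions, so check the API reference for the actual response shape.

```python
import os
import requests

# Minimal sketch of polling job states. The endpoint path, the
# "status" field, and the state string values are assumptions.
project = os.environ["CDF_PROJECT"]
cluster = os.environ.get("CDF_CLUSTER", "api")
token = os.environ["CDF_TOKEN"]

url = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/hostedextractors/jobs"
resp = requests.get(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()

# The four error states from the table above, assumed snake_case here.
ERROR_STATES = {"startup_error", "connection_error", "transform_error", "destination_error"}
for job in resp.json()["items"]:
    state = job.get("status", "")
    marker = "!!" if state in ERROR_STATES else "ok"
    print(f"[{marker}] {job['externalId']}: {state}")
```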
Jobs report metrics on their execution. The following metrics are reported (the sketch after the table shows how to derive failure rates from them):

| Metric | Description |
|---|---|
| Source messages | The number of input messages received from the source system. |
| Transform failures | The number of input messages that failed to transform. |
| Destination input values | The number of messages that were successfully transformed and passed to destinations for upload to CDF. |
| Destination requests | The number of requests made to CDF for this job. |
| Destination write failures | The number of requests to CDF that failed for this job. |
| Destination skipped values | The number of values that were invalid and skipped before ingestion into CDF. |
| Destination failed values | The number of values that were not written to CDF because of failed requests. |
| Destination uploaded values | The number of values that were successfully ingested into CDF. |
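These counters can be combined into simple health ratios. The sketch below is plain arithmetic over the metric names above, converted to snake_case; the exact field names in the API response are an assumption.

```python
# Illustrative health ratios derived from the reported job metrics.
def transform_failure_rate(metrics: dict) -> float:
    """Fraction of source messages that failed to transform."""
    received = metrics.get("source_messages", 0)
    failed = metrics.get("transform_failures", 0)
    return failed / received if received else 0.0

def write_failure_rate(metrics: dict) -> float:
    """Fraction of CDF requests that failed."""
    requests_made = metrics.get("destination_requests", 0)
    failed = metrics.get("destination_write_failures", 0)
    return failed / requests_made if requests_made else 0.0

sample = {"source_messages": 10_000, "transform_failures": 25,
          "destination_requests": 400, "destination_write_failures": 2}
print(transform_failure_rate(sample))  # 0.0025
print(write_failure_rate(sample))      # 0.005
```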
A hosted extractor writes to a destination. The destination contains only the credentials the extractor needs to write to CDF.
Multiple jobs can share a single destination, in which case they batch their requests together, reducing the number of requests made to CDF APIs. Metrics are still reported per job (see the sketch below).
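A sketch of this sharing pattern: one destination is created, and two jobs reference it. The endpoint paths, the session-nonce credential payload, and every field name here are assumptions for illustration only; consult the API reference for the real schema.

```python
import os
import requests

project = os.environ["CDF_PROJECT"]
cluster = os.environ.get("CDF_CLUSTER", "api")
headers = {"Authorization": f"Bearer {os.environ['CDF_TOKEN']}"}
base = f"https://{cluster}.cognitedata.com/api/v1/projects/{project}/hostedextractors"

# One destination holding the CDF credentials (payload is assumed).
destination = {"items": [{
    "externalId": "shared-destination",                        # hypothetical
    "credentials": {"nonce": os.environ["CDF_SESSION_NONCE"]},  # assumed shape
}]}
requests.post(f"{base}/destinations", json=destination, headers=headers).raise_for_status()

# Two jobs writing through the same destination, so their requests to
# CDF are batched together while metrics stay per job. Field names are
# hypothetical.
jobs = {"items": [
    {"externalId": "job-a", "sourceId": "plant-mqtt-broker",
     "destinationId": "shared-destination"},
    {"externalId": "job-b", "sourceId": "plant-mqtt-broker",
     "destinationId": "shared-destination"},
]}
requests.post(f"{base}/jobs", json=jobs, headers=headers).raise_for_status()
```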
A mapping is a custom transformation that translates the source format into a format that can be ingested into CDF. Read more in Custom data formats for hosted extractors.
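Conceptually, a mapping does something like the function below: take one source payload and reshape it into a row CDF can ingest. Real mappings are written in the hosted extractors' mapping language, not Python, and the payload shapes here are hypothetical.

```python
# Conceptual illustration only: what a mapping does to one message.
def map_message(payload: dict) -> dict:
    """Translate one source message into a CDF datapoint row."""
    return {
        "externalId": payload["sensor"],     # which time series to write to
        "timestamp": payload["ts"],          # epoch milliseconds
        "value": float(payload["reading"]),  # numeric datapoint value
    }

print(map_message({"sensor": "pump-01-temp", "ts": 1700000000000, "reading": "73.4"}))
```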