Use the cdf data upload command to upload data files from your local machine to Cognite Data Fusion (CDF). The command processes structured directories containing data files and their corresponding manifest files. The manifest files define how the data should be uploaded.
Prerequisites
Before uploading data, ensure you have:
- Installed and configured the Cognite Toolkit.
- Authenticated with CDF using appropriate credentials.
- Prepared your data files in the required directory structure.
- Created manifest files for each data file you want to upload.
Directory structure
The upload command requires a directory containing your data files. Organize the directory so that each data file sits next to its manifest file, with an optional resources/ directory for resource definitions. When you run the command, the Cognite Toolkit:
- Searches for all YAML files with the .Manifest.yaml suffix.
- For each manifest file, locates the data file with the same prefix.
- If --deploy-resources is set, deploys all resources in the resource directory.
- Uploads the data files for each manifest.
If you pass the --deploy-resources flag when running the upload command, the Cognite Toolkit creates these resources before uploading the data.
When you download a resource, the Cognite Toolkit automatically places the downloaded files in your selected directory with the specified format.
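The matching of manifest files to data files described above can be sketched in Python. This is an illustration only, not the Cognite Toolkit's implementation: the function name pair_manifests is hypothetical, and the rule that a data file name continues with "." or "-part-" after the shared prefix is an assumption based on the example file names on this page.

```python
import tempfile
from pathlib import Path

MANIFEST_SUFFIX = ".Manifest.yaml"


def pair_manifests(input_dir: Path) -> dict[str, list[str]]:
    """Map each manifest file name to the data files that share its prefix."""
    pairs: dict[str, list[str]] = {}
    for manifest in sorted(input_dir.glob(f"*{MANIFEST_SUFFIX}")):
        prefix = manifest.name[: -len(MANIFEST_SUFFIX)]
        pairs[manifest.name] = sorted(
            p.name
            for p in input_dir.iterdir()
            if p.is_file()
            and not p.name.endswith(MANIFEST_SUFFIX)
            # Assumption: the prefix is followed by "." or "-part-".
            and p.name.startswith((prefix + ".", prefix + "-part-"))
        )
    return pairs


# Build a throwaway directory using file names from this page.
demo_dir = Path(tempfile.mkdtemp())
for name in (
    "myHierarchy.Manifest.yaml",
    "myHierarchy-part-0001.Assets.ndjson",
    "myDataSet.Manifest.yaml",
    "myDataSet.Assets.csv",
):
    (demo_dir / name).write_text("", encoding="utf-8")

for manifest_name, data_files in pair_manifests(demo_dir).items():
    print(manifest_name, "->", data_files)
```

A manifest that maps to an empty list is a quick signal that a data file is misnamed or missing.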
Upload data
Confirm your directory layout
Use the layout in Directory structure: manifest files with the .Manifest.yaml suffix, data files whose names share the same prefix as the manifest, and an optional resources/ directory when definitions must exist before upload (for example, RAW table metadata).
Run the upload command
From a terminal, run cdf data upload with the path to the directory that contains your manifests and data files, replacing the path/to/input_dir placeholder with your directory. For flags and arguments, run cdf data upload --help.
Deploy related resources (optional)
If you use a resources/ folder, pass --deploy-resources so the Cognite Toolkit creates or updates those resources before uploading the data files.
Assets
Kind: Assets
Supported data file formats: .ndjson, .csv, .parquet
API endpoint: /assets
You can upload assets from a record (.ndjson) or tabular (.csv, .parquet) format.
The manifest file for assets can specify a data set or a hierarchy:
myDataSet.Manifest.yaml
myHierarchy.Manifest.yaml
myHierarchy-part-0001.Assets.ndjson
myDataSet.Assets.csv
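As a rough illustration of the record (.ndjson) format, the sketch below writes one JSON object per line. The field names externalId, name, and parentExternalId come from the CDF assets API; that a Toolkit data file uses exactly these fields is an assumption, and the asset values and the helper write_ndjson are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical asset records forming a two-level hierarchy.
assets = [
    {"externalId": "pump_station", "name": "Pump station"},
    {"externalId": "pump_01", "name": "Pump 01", "parentExternalId": "pump_station"},
]


def write_ndjson(records: list[dict], path: Path) -> Path:
    """Write each record as a single JSON object on its own line."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


out_file = write_ndjson(
    assets, Path(tempfile.mkdtemp()) / "myHierarchy-part-0001.Assets.ndjson"
)
print(out_file.read_text(encoding="utf-8"))
```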
Canvas
Kind: IndustrialCanvas
Supported data file format: .ndjson
API endpoint: No public API
You can upload canvases using a record (.ndjson) format. The manifest file specifies the external IDs of the canvases to upload:
myCanvases.Manifest.yaml
myCanvases-part-0001.IndustrialCanvas.ndjson
You usually download canvases with the cdf data download canvas command, for example to copy canvases between CDF projects.
Charts
Kind: Charts
Supported data file format: .ndjson
API endpoint: No public API
You can upload charts using a record (.ndjson) format. The manifest file specifies the external IDs of the charts to upload:
myCharts.Manifest.yaml
myCharts-part-0001.Charts.ndjson
Charts are typically downloaded using the cdf data download charts command.
Datapoints
Kind: Datapoints
Supported data file formats: .csv, .parquet
API endpoint: /timeseries/data
You can upload datapoints using tabular (.csv, .parquet) formats. The manifest file specifies the mapping between the columns in the data file and the time series identifiers. The identifiers can be internal IDs, external IDs, instance IDs, or a mix of these. In addition, you need to specify the column containing the timestamps:
myDatapoints.Manifest.yaml
myDatapoints.Datapoints.csv
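The role of the column mapping can be sketched as follows. The column names, the external IDs, and the use of epoch-millisecond timestamps are all assumptions made for illustration; in practice the mapping and the timestamp column are declared in the manifest file, whose exact keys are not shown here.

```python
import csv
import io

# Hypothetical tabular data: one timestamp column, one column per time series.
CSV_TEXT = """timestamp,temp_col,pressure_col
1700000000000,21.5,1.01
1700000060000,21.7,
"""

# Hypothetical mapping from data file columns to time series external IDs.
TIMESTAMP_COLUMN = "timestamp"
COLUMN_TO_EXTERNAL_ID = {"temp_col": "ts_temperature", "pressure_col": "ts_pressure"}


def split_by_series(csv_text: str, timestamp_column: str, column_to_id: dict[str, str]):
    """Turn a wide table into (timestamp, value) lists keyed by time series ID."""
    series = {ts_id: [] for ts_id in column_to_id.values()}
    for row in csv.DictReader(io.StringIO(csv_text)):
        timestamp = int(row[timestamp_column])
        for column, ts_id in column_to_id.items():
            raw_value = row.get(column)
            if raw_value:  # skip empty cells: no datapoint for that series
                series[ts_id].append((timestamp, float(raw_value)))
    return series


print(split_by_series(CSV_TEXT, TIMESTAMP_COLUMN, COLUMN_TO_EXTERNAL_ID))
```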
Events
Kind: Events
Supported data file formats: .ndjson, .csv, .parquet
API endpoint: /events
You can upload events using record (.ndjson) and tabular (.csv, .parquet) formats. The manifest file specifies the target data set or the root asset for the event hierarchy:
myDataSet.Manifest.yaml
myHierarchy.Manifest.yaml
myEvents-part-0001.Events.ndjson
myEvents.Events.csv
Files
Kind: FileContent
Supported data file format: Any format for the file content itself
API endpoint: /files
FileMetadata with content
Requires the alpha flag extend-upload to be enabled
Supported metadata file formats: .ndjson, .csv, .parquet
You can upload FileMetadata (asset-centric files) in either record (.ndjson) or tabular (.csv, .parquet) formats.
Template manifest file
myFileMetadata.Manifest.yaml
For each file in the directory given by fileDirectory, the Cognite Toolkit replaces $FILENAME with the actual file name. For example, file1.txt will be uploaded with:
file1.FileMetadata.yaml
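The $FILENAME substitution can be approximated with Python's string.Template. This is a sketch of the idea rather than the Cognite Toolkit's code: the template fields below (name, externalId, mimeType) are real file metadata fields in CDF, but their values and the helper render are hypothetical.

```python
from string import Template

# Hypothetical metadata template using the $FILENAME placeholder.
metadata_template = {
    "name": "$FILENAME",
    "externalId": "my_prefix_$FILENAME",
    "mimeType": "text/plain",
}


def render(template, filename: str):
    """Recursively replace $FILENAME in every string of a nested structure."""
    if isinstance(template, str):
        # safe_substitute leaves any other $-placeholders untouched.
        return Template(template).safe_substitute(FILENAME=filename)
    if isinstance(template, dict):
        return {key: render(value, filename) for key, value in template.items()}
    if isinstance(template, list):
        return [render(item, filename) for item in template]
    return template


print(render(metadata_template, "file1.txt"))
```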
FileMetadata IDs manifest file
You can also upload files using a manifest file that instructs the Cognite Toolkit to read the file metadata from CSV, Parquet, or NDJSON files.
myFileMetadata.Manifest.yaml
Alongside myFileMetadata.Manifest.yaml, you can have myFileMetadata.csv with the following content:
myFileMetadata.csv
Paths in the $FILEPATH column are relative to the location of the manifest file.
Combined with this layout, the directory is expected to look like this:
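One layout consistent with this description, and the way relative $FILEPATH values would resolve against it, can be sketched in Python. The content/ folder, the file names inside it, and the externalId column are hypothetical; only the $FILEPATH column name and the rule that paths are relative to the manifest location come from the text above.

```python
import csv
import tempfile
from pathlib import Path


def build_demo_layout(root: Path) -> Path:
    """Create a hypothetical layout and return the manifest path.

    root/
      myFileMetadata.Manifest.yaml
      myFileMetadata.csv
      content/file1.txt
      content/file2.txt
    """
    (root / "content").mkdir()
    for name in ("file1.txt", "file2.txt"):
        (root / "content" / name).write_text("demo", encoding="utf-8")
    with (root / "myFileMetadata.csv").open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["externalId", "$FILEPATH"])
        writer.writerow(["file_1", "content/file1.txt"])
        writer.writerow(["file_2", "content/file2.txt"])
    manifest = root / "myFileMetadata.Manifest.yaml"
    manifest.write_text("", encoding="utf-8")  # manifest content omitted here
    return manifest


def resolve_file_paths(manifest: Path, csv_name: str) -> list[Path]:
    """Resolve each $FILEPATH value relative to the manifest's directory."""
    base = manifest.parent
    with (base / csv_name).open(newline="", encoding="utf-8") as f:
        return [(base / row["$FILEPATH"]).resolve() for row in csv.DictReader(f)]


manifest_path = build_demo_layout(Path(tempfile.mkdtemp()))
for path in resolve_file_paths(manifest_path, "myFileMetadata.csv"):
    print(path, path.exists())
```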
Cognite File with content
Requires the alpha flag extend-upload to be enabled
Supported metadata file formats: .ndjson, .csv, .parquet
You can upload Cognite Files (instances in the data model) in either record (.ndjson) or tabular (.csv, .parquet) formats.
Template manifest file
myCogniteFile.Manifest.yaml
For each file in the directory given by fileDirectory, the Cognite Toolkit replaces $FILENAME with the actual file name. For example, file1.txt will be uploaded with:
file1.CogniteFile.yaml
Cognite File IDs manifest file
You can also upload files using a manifest file that instructs the Cognite Toolkit to read the file metadata from CSV, Parquet, or NDJSON files.
myCogniteFile.Manifest.yaml
Alongside myCogniteFile.Manifest.yaml, you can have myCogniteFile.csv with the following content:
myCogniteFile.csv
Paths in the $FILEPATH column are relative to the location of the manifest file. Combined with this layout, the directory is expected to look like this:
Legacy format for manifest file with content
Supported metadata file format: .ndjson
Template for manifest file
You can upload files with content using a template format for the manifest file. Classic template format:
myFileMetadata.Manifest.yaml
For each file in the directory given by fileDirectory, the Cognite Toolkit replaces $FILENAME with the actual file name.
Instances
Kind: Instances
Supported data file format: .ndjson (.csv and .parquet if the alpha flag extend-upload is enabled)
API endpoint: /models/instances
You can upload instances using the record (.ndjson) format. The manifest file specifies the instance space for the instances:
myInstances.Manifest.yaml
myInstances-part-0001.Instances.ndjson
myInstances.Instances.csv
Raw
Kind: RawRows
Supported data file formats: .ndjson, .csv, .parquet
API endpoint: /raw/dbs//tables//rows
You can upload rows to CDF RAW using the record (.ndjson) and tabular (.csv, .parquet) formats. The manifest file specifies the database and table the data should be uploaded to. In addition, for tabular formats, you can specify the column that should be used as the unique row identifier. If not specified, CDF automatically generates row IDs.
myRawData.Manifest.yaml
myRawData-part-0001.RawRows.ndjson
myRawData.RawRows.csv
Time series
Kind: TimeSeries
Supported data file formats: .ndjson, .csv, .parquet
API endpoint: /timeseries
You can upload time series definitions using the record (.ndjson) and tabular (.csv, .parquet) formats. The manifest file specifies the target data set or the root asset for the hierarchy the time series is connected to:
myDataSet.Manifest.yaml
myHierarchy.Manifest.yaml
myTimeSeries-part-0001.TimeSeries.ndjson
myTimeSeries.TimeSeries.csv
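Before running an upload, it can help to check that each data file name carries a kind and an extension that this page lists as supported. The sketch below encodes the kinds and formats from the sections above; the function check_data_file is hypothetical, and reading the kind from the second-to-last dotted part of the file name is an assumption based on the example file names.

```python
from pathlib import PurePath

ALL_FORMATS = {".ndjson", ".csv", ".parquet"}

# Kinds and data file formats as listed on this page.
SUPPORTED = {
    "Assets": ALL_FORMATS,
    "IndustrialCanvas": {".ndjson"},
    "Charts": {".ndjson"},
    "Datapoints": {".csv", ".parquet"},
    "Events": ALL_FORMATS,
    "Instances": {".ndjson"},  # .csv and .parquet need the extend-upload alpha flag
    "RawRows": ALL_FORMATS,
    "TimeSeries": ALL_FORMATS,
}


def check_data_file(file_name: str) -> tuple[str, str]:
    """Return (kind, extension) or raise ValueError if the name looks unsupported."""
    suffixes = PurePath(file_name).suffixes
    if len(suffixes) < 2:
        raise ValueError(f"{file_name}: expected <prefix>.<Kind>.<extension>")
    kind, extension = suffixes[-2].lstrip("."), suffixes[-1].lower()
    if kind not in SUPPORTED:
        raise ValueError(f"{file_name}: unknown kind {kind!r}")
    if extension not in SUPPORTED[kind]:
        raise ValueError(f"{file_name}: {extension} is not listed for {kind}")
    return kind, extension


print(check_data_file("myRawData-part-0001.RawRows.ndjson"))
print(check_data_file("myTimeSeries.TimeSeries.csv"))
```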