
About the Data Catalog

The Data Catalog is a collection of data sets that serves as a central and secure library of data assets you can search, explore, and manage in CDF. Use the Data Catalog to get an overview of the data that's available to you, and use it as a starting point for exploring data resources.

You can use the Data Catalog as:

  • A part of your onboarding workflow, to get an overview of existing data and to verify that newly onboarded data appears as expected.
  • An entry point for searching for and exploring data in CDF.
  • A way to manage data governance and user access to data resources.

Before you start using the Data Catalog

  • You must create data sets before the data is visible in the Data Catalog.

  • The data you see in the Data Catalog also depends on your access rights, including organizational settings and which CDF projects you belong to.

  • To view and manage data from the Data Catalog, you need capabilities for data sets.

  • Staged data (RAW) and some image and video data aren't included in the Data Catalog.
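
To illustrate the capability requirement above, here is a minimal sketch of how a group's access to data sets might be checked. It assumes that data set access is granted through a capability named `datasetsAcl` with `READ`, `WRITE`, and `OWNER` actions, and that managing data sets needs `WRITE` or `OWNER`. The group name and the helper functions are hypothetical and are not part of any CDF SDK. Check the capabilities documentation for your project before relying on this shape.

```python
# Hypothetical group definition. The capability shape (datasetsAcl, actions,
# scope) is an assumption and may differ from what your CDF project uses.
example_group = {
    "name": "data-catalog-viewers",  # hypothetical group name
    "capabilities": [
        {"datasetsAcl": {"actions": ["READ"], "scope": {"all": {}}}},
    ],
}


def data_set_actions(group: dict) -> set:
    """Collect every data set action granted by the group's capabilities."""
    actions = set()
    for capability in group.get("capabilities", []):
        acl = capability.get("datasetsAcl")
        if acl is not None:
            actions.update(acl.get("actions", []))
    return actions


def can_view_data_sets(group: dict) -> bool:
    """Return True if the group grants READ on data sets."""
    return "READ" in data_set_actions(group)


def can_manage_data_sets(group: dict) -> bool:
    """Return True if the group grants WRITE or OWNER on data sets (assumed rule)."""
    return bool(data_set_actions(group) & {"WRITE", "OWNER"})
```

With the sample group above, `can_view_data_sets(example_group)` is true and `can_manage_data_sets(example_group)` is false, which would correspond to a user who can browse the Data Catalog but not edit data sets.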

Get started with the Data Catalog

  1. Navigate to the CDF portal application > Manage Data Catalog.
  2. Search, explore, and manage data sets from the Data Catalog page.

Explore and manage data from the Data Catalog

You can see an overview of your data sets and their associated data resources on the Data Catalog page. You can create data sets, view details in data sets, and edit data sets on this page.

Select Explore data from the Data Catalog page to open the Data explorer. Here you can search, browse through, or manage data for resource types within a data set, like files, events, or time series. You can, for example:

  • Preview data to get an overview of your data.
  • Navigate to data and narrow your search using filters.
  • View data
    • Metadata on resources like descriptions, IDs, and governance status.
    • Where the data came from (data sources), when the data was created in CDF, and when it was updated.
    • Open and view files, like PDFs and images, and see metadata on those files.
  • Manage data
    • Create extraction pipelines, and see details about how data is ingested from the extractors.
    • Manage groups that have access to specific data sets.
    • Create new data sets.