OData - best practices and troubleshooting

Get the most out of your OData client with these best practices and troubleshooting tips.

Performance

The performance of the OData services depends on the type of data you access. For example, reading 1M data points takes about 2:30 to 3:00 minutes (roughly 6,000 data points per second). Each full request takes about 120 ms on average, and the OData client might add overhead for processing and data handling.
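
As a rough illustration of the arithmetic above, the following sketch estimates the read time from the quoted throughput. The function names are hypothetical, and the default rate of 6,000 data points per second is only the example figure from this page, not a guarantee.

```python
def estimate_read_seconds(num_points: int, points_per_second: float = 6_000) -> float:
    """Rough wall-clock estimate for reading num_points data points."""
    if points_per_second <= 0:
        raise ValueError("points_per_second must be positive")
    return num_points / points_per_second


def format_minutes(seconds: float) -> str:
    """Format a duration in seconds as M:SS."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


# 1,000,000 points at 6,000 points per second is about 167 seconds.
print(format_minutes(estimate_read_seconds(1_000_000)))  # 2:47
```

The estimate ignores client-side processing, so treat it as a lower bound when you size a refresh.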

Follow these general best practices to make sure you get the best and most reliable performance:

  • Don't use OR expressions or expanding tables.
  • Use multiple queries when possible.
  • Use incremental refresh.
  • Partition data sets if possible.
  • Keep only the data you need. Remove unnecessary columns and data.
  • Keep historical data in a separate report if you don't need it daily. Refresh the historical data report when you need it.

Write performant queries

The OData services accept multiple concurrent requests and process them in parallel. OData clients can also dispatch multiple queries concurrently when possible.

Prefer several simple queries over a single complex query that uses, for example, OR expressions or expands. A single complex query is processed sequentially, and each request adds round-trip latency.

Download the data using multiple queries and join the resulting tables in your OData client to work with the tables as if they were a single table.
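
The pattern can be sketched outside any particular OData client. This minimal example assumes a hypothetical `fetch` callable that runs one simple query and returns its rows; the filter strings, the `Id` column, and the canned data are made up for illustration and do not come from the CDF OData services.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_queries(
    fetch: Callable[[str], list[dict]],
    queries: list[str],
    key: str = "Id",
    max_workers: int = 4,
) -> list[dict]:
    """Run simple queries concurrently and combine the rows into one table.

    Rows that appear in more than one result are kept once, based on `key`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, queries))  # map preserves query order

    combined, seen = [], set()
    for rows in results:
        for row in rows:
            if row[key] not in seen:
                seen.add(row[key])
                combined.append(row)
    return combined


# Stand-in for a real OData request (hypothetical data).
FAKE_DATA = {
    "Type eq 'pump'": [{"Id": 1, "Type": "pump"}],
    "Type eq 'valve'": [{"Id": 2, "Type": "valve"}],
}


def fake_fetch(filter_expr: str) -> list[dict]:
    return FAKE_DATA[filter_expr]


# Two simple queries instead of one "Type eq 'pump' or Type eq 'valve'" query.
rows = run_queries(fake_fetch, ["Type eq 'pump'", "Type eq 'valve'"])
print(rows)
```

In Power BI, the equivalent is to define one query per filter and append the resulting tables.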

Use incremental refresh

Incremental refresh makes it practical to work with large datasets and has these benefits:

  • Only changed data needs to be refreshed.
  • You don't have to maintain long-running connections to source systems.
  • Less data to refresh reduces the overall consumption of memory and other resources.

Each OData client might have different incremental refresh features. For Microsoft Power BI, see Incremental refresh for semantic models in Power BI.
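
Conceptually, an incremental refresh keeps a local copy of the data and merges in only the rows that changed since the last refresh. The sketch below shows that idea in isolation; the `id` and `lastUpdatedTime` field names and the function are assumptions for illustration, and real clients such as Power BI implement this through their own features.

```python
def apply_incremental_refresh(cache: dict, changed_rows: list[dict], watermark: int) -> int:
    """Merge rows changed after `watermark` into `cache` and return the new watermark."""
    for row in changed_rows:
        if row["lastUpdatedTime"] > watermark:
            cache[row["id"]] = row  # insert a new row or overwrite a stale one
    return max([watermark] + [row["lastUpdatedTime"] for row in changed_rows])


# Hypothetical local copy and a batch of rows returned by a filtered query.
cache = {1: {"id": 1, "lastUpdatedTime": 100, "value": "a"}}
changed = [
    {"id": 1, "lastUpdatedTime": 200, "value": "b"},
    {"id": 2, "lastUpdatedTime": 150, "value": "c"},
]
new_watermark = apply_incremental_refresh(cache, changed, watermark=100)
print(new_watermark, sorted(cache))
```

Storing the returned watermark lets the next refresh request only rows updated after it.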

Partition large data sets

If you need to download large data sets, partition them and have a separate query to read each partition. Some OData clients, like Microsoft Power BI, can process multiple queries concurrently. Partitioning the data set can significantly improve performance.

For example, if you read data points from the last two years, try splitting the query into two queries, each reading one year of data. Then, merge (concatenate) the tables in Power BI.
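
One way to derive such partitions is to split the time range at calendar-year boundaries. This is a minimal sketch with a hypothetical helper name; how you turn each interval into a query filter depends on your OData client and the resource type.

```python
from datetime import datetime, timezone


def yearly_partitions(start: datetime, end: datetime) -> list[tuple[datetime, datetime]]:
    """Split [start, end) into half-open intervals that break at each January 1."""
    if end <= start:
        raise ValueError("end must be after start")
    partitions = []
    cursor = start
    while cursor < end:
        next_year = datetime(cursor.year + 1, 1, 1, tzinfo=cursor.tzinfo)
        upper = min(next_year, end)
        partitions.append((cursor, upper))
        cursor = upper
    return partitions


parts = yearly_partitions(
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 1, 1, tzinfo=timezone.utc),
)
for lower, upper in parts:
    print(lower.isoformat(), "->", upper.isoformat())
```

Because each interval includes its lower bound and excludes its upper bound, no data point is read by two partitions.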

Property naming in metadata and CDF RAW

Property keys for metadata and the CDF staging area (RAW) must be valid identifiers and can only contain letters, numbers, or underscores. The OData services rewrite other characters to an underscore. For the best and most predictable results, make sure that ingested data follows this naming convention for property keys: ^[a-zA-Z][_a-zA-Z0-9]*[a-zA-Z0-9]$.
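
You can check keys against this convention before ingestion. The sketch below uses Python's `re` module; the helper names are hypothetical, and `preview_rewrite` only approximates the rewrite described above by assuming that every character other than an ASCII letter, digit, or underscore becomes an underscore.

```python
import re

# The recommended naming convention for property keys.
KEY_PATTERN = re.compile(r"^[a-zA-Z][_a-zA-Z0-9]*[a-zA-Z0-9]$")
INVALID_CHAR = re.compile(r"[^a-zA-Z0-9_]")


def is_recommended_key(key: str) -> bool:
    """Return True if the key follows the recommended naming convention."""
    # The pattern needs a leading and a trailing character,
    # so single-character keys do not match.
    return KEY_PATTERN.fullmatch(key) is not None


def preview_rewrite(key: str) -> str:
    """Approximate the rewritten key: other characters become underscores (assumption)."""
    return INVALID_CHAR.sub("_", key)


for key in ["pump_id", "pump-id.v2", "_internal"]:
    print(key, is_recommended_key(key), preview_rewrite(key))
```

Keys that fail the check are the ones most likely to show up under a different name in your OData client.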

Troubleshooting

Find information to help you troubleshoot issues using CDF as a data source for OData clients.

Queries take too long

A CDF project can contain hundreds of millions of rows of data, and loading them all into an OData client isn't feasible. If your query takes hours, you're likely trying to load too much data.

See Filter items in data models and Filtering asset-centric resource types to learn about the filtering capabilities supported by the OData services.

Not getting all results

If you get fewer results than expected, you may be using a filter function that CDF doesn't support, for example, startswith on the Name column for TimeSeries.

See Filter items in data models and Filtering asset-centric resource types to learn about the filtering capabilities supported by the OData services.

Unable to retrieve minimal values from CDF RAW

If you use data from the CDF staging area, CDF RAW, in an OData client, you might have trouble retrieving small numbers in exponential notation.

CDF RAW doesn't have a schema, but the OData libraries in some OData clients, like Power BI, try to select the correct format for the data. Currently, Power BI chooses the wrong decoder for small numbers in exponential notation, and you may get an error similar to this:

DataSource.Error: OData: Cannot convert the literal '2.89999206870561 to the expected type 'Edm.Decimal'.

To resolve the issue, ingest the values into CDF RAW as strings instead of numbers, and convert the strings back to numbers in Power BI, for example, using the Decimal.From Power Query M function. You won't lose precision, and because most JSON decoders accept strings for numbers, clients that expect numbers will still work.
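
On the ingestion side, the workaround amounts to serializing numeric values as strings. The following sketch shows only that serialization step and a round trip back to a number; the row contents, column names, and helper name are hypothetical, and the actual write to CDF RAW is left out.

```python
import json
from decimal import Decimal


def encode_row(row: dict) -> str:
    """Serialize a row to JSON, writing float values as strings."""
    # repr() of a float is the shortest string that parses back to the same float.
    return json.dumps({k: repr(v) if isinstance(v, float) else v for k, v in row.items()})


# Hypothetical row with a small value that is written in exponential notation.
payload = encode_row({"sensor": "p-101", "leak_rate": 2.5e-05})
print(payload)  # {"sensor": "p-101", "leak_rate": "2.5e-05"}

# A consumer converts the string back to a number without losing precision.
decoded = json.loads(payload)
print(float(decoded["leak_rate"]) == 2.5e-05)                 # True
print(Decimal(decoded["leak_rate"]) == Decimal("0.000025"))   # True
```

In Power BI, the matching step is to convert the text column back to a number, for example with Decimal.From.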