OData - best practices and troubleshooting

Get the most out of your OData client with these best practices and troubleshooting tips.

Performance

The performance of the OData services depends on the type of data you access. For example, reading 1M data points takes about 2.5 to 3 minutes (roughly 6,000 data points per second). Each request takes an average of 120 ms, and the OData client might add further overhead for processing and data handling.

Follow these general best practices to make sure you get the best and most reliable performance:

  • Don't use OR expressions or expanding tables.
  • Use multiple queries when possible.
  • Use incremental refresh.
  • Partition data sets if possible.
  • Keep only the data you need. Remove unnecessary columns and data.
  • Keep historical data in a separate report if you don't need it daily. Refresh the historical data report when you need it.

Write performant queries

The OData services accept multiple concurrent requests and process them in parallel. OData clients can also dispatch multiple queries concurrently when possible.

It's better to compose and use multiple simple queries than a single complex query with, for example, OR expressions or expands. A single complex query is evaluated sequentially, adding round-trip latency for each request.

Download the data using multiple queries and join the resulting tables in your OData client to work with the tables as if they were a single table.
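As a sketch of this pattern, the snippet below composes two simple queries instead of one OR-based query and concatenates the result sets client-side. The base URL, entity set, and filter expressions are illustrative assumptions, not the service's actual endpoints:

```python
from urllib.parse import urlencode

# Hypothetical CDF OData base URL and entity set -- adjust to your project.
BASE = "https://myproject.cognitedata.com/odata/v1/Assets"

def build_query(filter_expr: str, select: str) -> str:
    # One simple filter per request; several simple queries beat one complex one.
    return BASE + "?" + urlencode({"$filter": filter_expr, "$select": select})

# Two simple queries instead of one query with an OR expression.
q1 = build_query("startswith(Name,'Pump')", "Id,Name")
q2 = build_query("startswith(Name,'Valve')", "Id,Name")

# Placeholder result sets standing in for the two responses.
rows_q1 = [{"Id": 1, "Name": "Pump-01"}]
rows_q2 = [{"Id": 2, "Name": "Valve-07"}]

# Concatenate client-side and work with the rows as a single table.
combined = rows_q1 + rows_q2
```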

Use incremental refresh

Incremental refresh makes it feasible to work with large datasets and offers the following benefits:

  • Only changed data needs to be refreshed.
  • You don't have to maintain long-running connections to source systems.
  • Less data to refresh reduces the overall consumption of memory and other resources.

Each OData client might have different incremental refresh features. For Microsoft Power BI, see Incremental refresh for semantic models in Power BI.

Partition large data sets

If you need to download large data sets, partition them and have a separate query to read each partition. Some OData clients, like Microsoft Power BI, can process multiple queries concurrently. Partitioning the data set can significantly improve performance.

For example, if you read data points from the last two years, try splitting the query into two queries, each reading one year of data. Then, merge (concatenate) the tables in Power BI.
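The two one-year partitions could be composed along these lines; the base URL, the Timestamp property name, and the filter syntax are assumptions for illustration:

```python
from urllib.parse import urlencode

# Hypothetical CDF OData endpoint -- adjust to your project.
BASE = "https://myproject.cognitedata.com/odata/v1/Datapoints"

def year_partition(year: int) -> str:
    # One query per one-year partition; clients like Power BI can
    # dispatch the partition queries concurrently.
    expr = (f"Timestamp ge {year}-01-01T00:00:00Z "
            f"and Timestamp lt {year + 1}-01-01T00:00:00Z")
    return BASE + "?" + urlencode({"$filter": expr})

# One query per year; merge (concatenate) the resulting tables client-side.
queries = [year_partition(y) for y in (2023, 2024)]
```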

Property naming in metadata and CDF RAW

Property keys for metadata and the CDF staging area (RAW) must be valid identifiers and can only contain letters, numbers, or underscores. The OData services rewrite other characters to an underscore. For the best and most predictable results, make sure that ingested data follows this naming convention for property keys: ^[a-zA-Z][_a-zA-Z0-9]*[a-zA-Z0-9]$.
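A minimal sketch of what that rewrite means for your keys, assuming a straightforward character-by-character substitution (this mimics the documented behavior; it is not the service's actual code):

```python
import re

# The documented naming convention for property keys.
VALID_KEY = re.compile(r"^[a-zA-Z][_a-zA-Z0-9]*[a-zA-Z0-9]$")

def sanitize(key: str) -> str:
    # Mimic the documented rewrite: any character that is not a letter,
    # digit, or underscore becomes an underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", key)

print(sanitize("flow rate (m3/h)"))  # -> "flow_rate__m3_h_"
```

Note that a key like "flow rate (m3/h)" is rewritten to "flow_rate__m3_h_", which still violates the convention because it ends with an underscore. Choosing compliant keys at ingestion time avoids such surprises.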

Troubleshooting

Find information to help you troubleshoot issues using CDF as a data source for OData clients.

Queries take too long

A CDF project can contain hundreds of millions of rows of data, and loading them all into an OData client isn't feasible. If your query takes hours, you're likely trying to load too much data.

See Filter items in data models and Filtering asset-centric resource types to learn about the filtering capabilities supported by the OData services.

Not getting all results

If you get fewer results than expected, you may be using a filter function that CDF doesn't support, for example, startswith on the Name column for TimeSeries.

See Filter items in data models and Filtering asset-centric resource types to learn about the filtering capabilities supported by the OData services.

Unable to retrieve minimal values from CDF RAW

If you're using data from the CDF staging area, CDF RAW, in an OData client, you might run into issues retrieving small numbers in exponential notation.

CDF RAW doesn't have a schema, but the OData libraries in some OData clients, like Power BI, try to select the correct format for the data. Currently, Power BI chooses the wrong decoder for small numbers in exponential notation, and you may get an error similar to this:

DataSource.Error: OData: Cannot convert the literal '2.89999206870561' to the expected type 'Edm.Decimal'

To resolve the issue, ingest the values into CDF RAW as strings instead of numbers, and convert the strings back to numbers in Power BI, for example with the Decimal.From Power Query M function. You won't lose precision, and because most JSON decoders accept strings for numbers, clients that expect numbers will still work.
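The idea can be sketched in Python with the standard decimal module (the value and exponent are illustrative; in Power BI the equivalent step is Decimal.From):

```python
from decimal import Decimal

# Hypothetical small value ingested into CDF RAW as a string.
raw_value = "2.89999206870561E-06"

# Convert the string back to a number client-side; no precision is lost.
number = Decimal(raw_value)

print(number == Decimal("0.00000289999206870561"))  # -> True
```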