# Toolbox reference

This section describes the toolboxes developed for Industrial Data Analytics. They provide subject matter experts (SMEs) with out-of-the-box algorithms to process and manipulate data, conduct root cause analysis (RCA), and develop solutions without having to code.

The toolboxes cover basic operations, statistical methods, data transformation, and advanced models. They work out of the box with Cognite Charts, and we will continuously add new algorithms, features, and functionality.

# Operators

The Operators toolbox contains the standard arithmetic and algebraic operations that you can use with time series data (addition, subtraction, multiplication, division), as well as more advanced calculations such as differentiation, integration, and time series mapping.
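As an illustration of what differentiation and integration mean for a non-uniformly sampled time series, here is a minimal sketch in plain Python. The function names and the `(timestamp, value)` representation are assumptions made for this example and are not the toolbox's API.

```python
from typing import List, Tuple

# Hypothetical representation: (timestamp in seconds, value) pairs, sorted by time.
Series = List[Tuple[float, float]]


def differentiate(series: Series) -> Series:
    """Backward finite difference: rate of change per second between samples."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(series, series[1:])
    ]


def integrate(series: Series) -> Series:
    """Cumulative integral over time using the trapezoidal rule."""
    if not series:
        return []
    total = 0.0
    result = [(series[0][0], 0.0)]
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        total += 0.5 * (v0 + v1) * (t1 - t0)
        result.append((t1, total))
    return result


# Illustrative flow readings with irregular spacing (made-up numbers).
flow = [(0.0, 2.0), (10.0, 4.0), (40.0, 4.0)]
rate_of_change = differentiate(flow)
accumulated = integrate(flow)
```

Both functions use the actual time difference between samples, so they do not require uniform sampling.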

# Filter

Filters are algorithms that remove parts of a time series to capture the underlying signal. For example, low-pass filters remove the high-frequency noise of a time series. You can also use filters in conjunction with event detectors to remove undesired phenomena in a time series. For instance, you can use an anomaly detector to map the time series to a binary array indicating the presence or absence of an anomaly, and then apply a boolean mask to the raw time series to remove all detected anomalies.
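The detector-plus-mask workflow described above can be sketched as follows. The detector here is a deliberately simple stand-in (distance from the median above a fixed threshold), and all names are hypothetical rather than taken from the toolbox.

```python
from statistics import median
from typing import List, Tuple


def detect_anomalies(values: List[float], threshold: float) -> List[bool]:
    """Flag points whose distance from the median exceeds `threshold`.

    A stand-in detector for illustration only.
    """
    center = median(values)
    return [abs(v - center) > threshold for v in values]


def apply_mask(
    timestamps: List[int], values: List[float], is_anomaly: List[bool]
) -> List[Tuple[int, float]]:
    """Keep only the points that were not flagged as anomalies."""
    return [
        (t, v)
        for t, v, flagged in zip(timestamps, values, is_anomaly)
        if not flagged
    ]


# Made-up sensor readings with one obvious spike.
timestamps = [0, 1, 2, 3, 4]
values = [10.0, 10.2, 55.0, 9.9, 10.1]

mask = detect_anomalies(values, threshold=5.0)
cleaned = apply_mask(timestamps, values, mask)
```

Keeping detection and masking as separate steps lets you swap in a different detector without changing how the mask is applied.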

# Detect

You can map time series to a set of discrete variables that indicate the presence or absence of an event. For instance, you can distinguish steady from transient operation by detecting large step changes in the sensor reading (potentially due to valve changes, start-ups, etc.). Another example is anomaly detection, where a significant unexpected change in the sensor reading (e.g., a spike in value) occurs before the reading returns to normal behavior. The Detect module contains algorithms that perform this task of mapping continuous time series to discrete variables based on the behavior of the time series.
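A minimal sketch of this kind of mapping, assuming a simple rule where any sample-to-sample change larger than a user-chosen `max_step` marks a transient. The function name and the rule are assumptions for illustration; the actual detectors in the toolbox may work differently.

```python
from typing import List


def detect_transient(values: List[float], max_step: float) -> List[int]:
    """Return 1 where the change from the previous sample exceeds `max_step`,
    and 0 otherwise. The first sample has no predecessor and is marked 0."""
    if not values:
        return []
    flags = [0]
    for previous, current in zip(values, values[1:]):
        flags.append(1 if abs(current - previous) > max_step else 0)
    return flags


# Made-up pressure readings with a step change at the fourth sample.
pressure = [50.0, 50.1, 49.9, 70.0, 70.2, 70.1]
transient_flags = detect_transient(pressure, max_step=5.0)
```

The output has the same length as the input, so it can be used directly as a mask on the original time series.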

# Resample

Industrial data is in most cases non-uniformly sampled, so it has to be pre-processed before you can use it in a model together with other time series. Resampling is a typical pre-processing step. The Resample toolbox provides classical resampling methods (e.g., interpolation) and advanced machine learning algorithms to down- or up-sample your data.
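To make the classical case concrete, the sketch below resamples irregular samples onto a uniform grid using linear interpolation. It is plain Python with hypothetical names, and it assumes the timestamps are sorted and strictly increasing.

```python
from bisect import bisect_right
from typing import List, Tuple


def resample_linear(
    timestamps: List[float], values: List[float], step: float
) -> List[Tuple[float, float]]:
    """Linearly interpolate irregular samples onto a grid with spacing `step`.

    Assumes `timestamps` is non-empty, sorted, and strictly increasing.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    first, last = timestamps[0], timestamps[-1]
    n_steps = int((last - first) // step)
    resampled = []
    for k in range(n_steps + 1):
        t = first + k * step
        # Index of the last original sample at or before t.
        i = bisect_right(timestamps, t) - 1
        if i >= len(timestamps) - 1:
            v = values[-1]
        else:
            t0, t1 = timestamps[i], timestamps[i + 1]
            v0, v1 = values[i], values[i + 1]
            v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        resampled.append((t, v))
    return resampled


# Made-up irregular samples, resampled every 2 seconds.
uniform = resample_linear([0.0, 3.0, 4.0, 10.0], [0.0, 3.0, 5.0, 11.0], step=2.0)
```

Computing each grid time as `first + k * step` avoids the rounding drift that comes from repeatedly adding `step` to a running value.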

# Smooth

Smoothers modify time series to emphasize the main underlying trend and remove fine-scale phenomena. You can do this in several different ways. Some examples of smoothers found in this toolbox are filters that remove higher-frequency phenomena from the raw data (e.g., Butterworth or Chebyshev low-pass filters), regression-based smoothers that estimate the coefficients of a parametric function to predict the underlying signal, and moving averages that apply a rolling operation over a user-defined window.
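Of the smoothers listed above, the moving average is the simplest to show. The following sketch computes a trailing moving average over a window given as a number of samples; the function name and the choice of a trailing (rather than centered) window are assumptions for this example.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing moving average over the last `window` samples.

    For the first samples, where fewer than `window` points are available,
    the average is taken over the points seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, v in enumerate(values):
        running_sum += v
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= values[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


# Made-up signal with a single spike.
noisy = [1.0, 2.0, 3.0, 10.0, 3.0, 2.0, 1.0]
smoothed = moving_average(noisy, window=3)
```

A larger window removes more fine-scale variation but also makes the smoothed signal respond more slowly to real changes.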

# Statistics

The Statistics toolbox offers various algorithms to describe, analyze, and model industrial time series data. This toolbox is well suited for describing your data, conducting root cause analysis, and doing exploratory work. The algorithms range from descriptive statistics to linear and nonlinear regression analysis to machine learning methods (e.g., classification and clustering).

# Data quality

Accurate data is a fundamental part of any industrial model. This toolbox contains a collection of advanced algorithms to evaluate, monitor, and improve the data quality of time series. There are multiple dimensions of time series data quality: accuracy, timeliness, completeness, validity, consistency, and uniqueness. The algorithms in this toolbox cover all of these dimensions but focus heavily on accuracy, because if the data is not correct, the other dimensions are of little importance. Examples of functions found in this toolbox are data gap detection and filling, outlier detection and removal, and sensor drift detection.
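As an example of a completeness check, the sketch below finds data gaps by looking for consecutive samples that are further apart than a user-chosen maximum interval. The function name and the threshold rule are assumptions for illustration, not the toolbox's implementation.

```python
from typing import List, Tuple


def find_gaps(timestamps: List[float], max_interval: float) -> List[Tuple[float, float]]:
    """Return (start, end) pairs for consecutive samples that are more than
    `max_interval` apart. Assumes `timestamps` is sorted."""
    return [
        (t0, t1)
        for t0, t1 in zip(timestamps, timestamps[1:])
        if t1 - t0 > max_interval
    ]


# Made-up timestamps in seconds: one sample per minute, with a dropout.
sample_times = [0.0, 60.0, 120.0, 600.0, 660.0]
gaps = find_gaps(sample_times, max_interval=120.0)
```

Each reported gap gives the last sample before and the first sample after the missing data, which is the information a gap-filling step would need.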

# Regression

The Regression toolbox focuses on using classical methods (linear and nonlinear models) and machine learning regression algorithms to describe the relationship between industrial data and physical parameters. It enables you to semiautomatically map parameters to historical data and forecast their behavior several steps into the future.
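The simplest classical method in this family is an ordinary least-squares line fit between two variables. The sketch below is plain Python with hypothetical names and made-up data; it only illustrates the idea of estimating model parameters from historical data.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example: a physical parameter (x) against a sensor reading (y).
parameter = [1.0, 2.0, 3.0, 4.0]
reading = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(parameter, reading)
predicted = slope * 5.0 + intercept  # Estimated reading at parameter = 5.0
```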

# Oil and gas

This module contains algorithms particularly relevant to the oil and gas industry. You will find methods to estimate parameters such as the Productivity Index (gas flow rate divided by the difference between the reservoir and bottom hole pressures), pressure drop, single-phase flow rate, hydrostatic head, and many others.
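The Productivity Index definition given above translates directly into code. In this sketch the function name, argument names, and example numbers are assumptions, and the units are left to the caller as long as both pressures use the same unit.

```python
def productivity_index(
    flow_rate: float, reservoir_pressure: float, bottom_hole_pressure: float
) -> float:
    """Flow rate divided by the drawdown (reservoir pressure minus
    bottom hole pressure)."""
    drawdown = reservoir_pressure - bottom_hole_pressure
    if drawdown <= 0:
        raise ValueError("reservoir pressure must exceed bottom hole pressure")
    return flow_rate / drawdown


# Made-up values for illustration only.
pi = productivity_index(
    flow_rate=1200.0, reservoir_pressure=250.0, bottom_hole_pressure=190.0
)
```

The guard on `drawdown` avoids a division by zero and rejects a negative drawdown, which this sketch treats as invalid input.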

# Forecast

The Forecast toolbox offers a variety of machine learning algorithms to forecast the behavior of industrial time series, with a particular focus on forecasting based on the correlation between a time series and physical parameters. Forecasting involves learning from historical data to make a prediction several time steps into the future. For industrial time series analysis, this typically involves pre-processing the data, training a parametric time series model, and then predicting a user-defined number of steps into the future.
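The train-then-predict pattern can be shown with a very small parametric model. The sketch below uses the naive drift method, where the single learned parameter is the average change per step in the history. This is a swapped-in textbook technique chosen for brevity, not one of the toolbox's machine learning algorithms, and it assumes uniformly spaced, already pre-processed data.

```python
from typing import List


def fit_drift(history: List[float]) -> float:
    """'Train' the model: estimate the average change per time step."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    return (history[-1] - history[0]) / (len(history) - 1)


def forecast(history: List[float], steps: int) -> List[float]:
    """Predict `steps` future values by extending the last value along the drift."""
    drift = fit_drift(history)
    last = history[-1]
    return [last + drift * h for h in range(1, steps + 1)]


# Made-up, uniformly sampled history.
history = [10.0, 12.0, 13.0, 16.0]
prediction = forecast(history, steps=3)
```

The number of steps is the user-defined forecast horizon; more capable models follow the same fit-and-predict shape with more parameters.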

Last Updated: 6/1/2021, 9:48:27 AM