pecos.metrics module
The metrics module contains performance metrics to track system health.
pecos.metrics.qci(mask, tfilter=None, per_day=True)
Compute the quality control index defined as:
\(QCI=\dfrac{\sum_{d\in D}\sum_{t\in T}X_{dt}}{|DT|}\)
where \(D\) is the set of data columns and \(T\) is the set of time stamps in the analysis. \(X_{dt}\) is a data point for column \(d\) at time \(t\) that passed all quality control tests. \(|DT|\) is the number of data points in the analysis.
Parameters: mask : pd.DataFrame
Test results mask, returned from pm.get_test_results_mask()
tfilter : pd.Series (optional)
Time filter containing boolean values for each time index
per_day : boolean (optional)
Flag indicating if the results should be computed per day, default = True
Returns: QCI : pd.DataFrame
Quality control index
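The formula above can be illustrated with a minimal pandas sketch. This is not the pecos implementation: the function name qci_sketch and the sample data are hypothetical, and the sketch assumes the mask holds True for data points that passed all quality control tests. It computes both the overall index and a per-day index.

```python
import pandas as pd


def qci_sketch(mask: pd.DataFrame) -> float:
    """Fraction of data points that passed (hypothetical helper, not pecos code)."""
    # Numerator: sum of passing points over all columns d and time stamps t.
    # Denominator: |DT|, the total number of data points.
    return float(mask.to_numpy().sum()) / mask.size


# Illustrative mask; assumes True means the point passed all tests.
index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 12:00", "2024-01-02 00:00", "2024-01-02 12:00"]
)
mask = pd.DataFrame(
    {"A": [True, True, False, True], "B": [True, False, False, True]},
    index=index,
)

overall_qci = qci_sketch(mask)

# Per-day variant: every row has the same number of columns, so the mean of
# the row-wise pass fractions within a day equals that day's QCI.
row_fraction = mask.mean(axis=1)
per_day_qci = row_fraction.groupby(row_fraction.index.normalize()).mean()
```

With this sample mask, 5 of the 8 data points pass, so the overall index is 0.625, while the two days score 0.75 and 0.5 respectively.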
pecos.metrics.rmse(x1, x2, tfilter=None, per_day=True)
Compute the root mean squared error defined as:
\(RMSE=\sqrt{\dfrac{\sum{(x_1-x_2)^2}}{n}}\)
where \(x_1\) and \(x_2\) are time series and \(n\) is the number of data points.
Parameters: x1 : pd.DataFrame with a single column or pd.Series
Data
x2 : pd.DataFrame with a single column or pd.Series
Data
tfilter : pd.Series (optional)
Time filter containing boolean values for each time index
per_day : boolean (optional)
Flag indicating if the results should be computed per day, default = True
Returns: RMSE : pd.DataFrame
Root mean squared error of the data
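As a rough illustration of the RMSE formula, the sketch below compares two short series. It is not the pecos implementation: rmse_sketch and the sample values are hypothetical, and the sketch assumes both series share the same index and that \(n\) counts only the points where both values are present (pandas skips NaN when averaging).

```python
import numpy as np
import pandas as pd


def rmse_sketch(x1: pd.Series, x2: pd.Series) -> float:
    """Root mean squared error of two aligned series (hypothetical helper)."""
    diff = x1 - x2  # pandas aligns the two series on their index
    # Mean of squared differences, then square root. NaN values are skipped.
    return float(np.sqrt((diff ** 2).mean()))


index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00", "2024-01-01 03:00"]
)
measured = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)
expected = pd.Series([1.0, 2.0, 3.0, 6.0], index=index)

rmse_value = rmse_sketch(measured, expected)
```

Here the squared differences are 0, 0, 0 and 4, so the mean is 1 and the RMSE is 1.0.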
pecos.metrics.time_integral(data, tfilter=None, per_day=True)
Compute the time integral of each column in the DataFrame defined as:
\(F=\int{fdt}\)
where \(f\) is a column of data and \(dt\) is the time step between observations. The time integral is computed using the trapezoidal rule from numpy.trapz. Results are given in [original data units]*seconds. NaN values are set to 0 for integration.
Parameters: data : pd.DataFrame
Data
tfilter : pd.Series (optional)
Time filter containing boolean values for each time index
per_day : boolean (optional)
Flag indicating if the results should be computed per day, default = True
Returns: F : pd.DataFrame
Time integral of the data. Each column is named 'Time integral of ' + original column name.
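The trapezoidal integration described above can be sketched as follows. This is an illustration under stated assumptions, not the pecos implementation: time_integral_sketch and the sample data are hypothetical, the index is assumed to be a DatetimeIndex, and the sketch returns a Series over the whole period rather than a per-day DataFrame. The lookup of numpy.trapezoid is only there because newer NumPy releases rename numpy.trapz.

```python
import numpy as np
import pandas as pd

# Newer NumPy releases provide numpy.trapezoid; older ones only numpy.trapz.
_trapezoid = getattr(np, "trapezoid", None) or np.trapz


def time_integral_sketch(data: pd.DataFrame) -> pd.Series:
    """Trapezoidal time integral of each column (hypothetical helper)."""
    # Elapsed time in seconds since the first observation.
    seconds = (data.index - data.index[0]).total_seconds().to_numpy()
    # NaN values are set to 0 before integrating, as the description states.
    filled = data.fillna(0.0)
    return pd.Series(
        {
            "Time integral of " + str(col): float(
                _trapezoid(filled[col].to_numpy(), x=seconds)
            )
            for col in filled.columns
        }
    )


index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"]
)
data = pd.DataFrame({"power": [0.0, 2.0, 2.0]}, index=index)

integral = time_integral_sketch(data)
```

For this sample, the first hour contributes 0.5 * (0 + 2) * 3600 = 3600 and the second contributes 2 * 3600 = 7200, giving 10800 in [original data units]*seconds.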