pecos.monitoring module

The monitoring module contains the PerformanceMonitoring class used to run quality control tests and store results.

class pecos.monitoring.PerformanceMonitoring

Bases: object

The PerformanceMonitoring class stores the data, translation dictionaries, time filter, and quality control test results.

Methods

`add_dataframe`(df, system_name[, ...])	Add DataFrame to the PerformanceMonitoring object.
`add_signal`(col_name, data)	Add signal to the PerformanceMonitoring DataFrame.
`add_time_filter`(time_filter)	Add a time filter to the PerformanceMonitoring object.
`add_translation_dictionary`(trans, system_name)	Add translation dictionary to the PerformanceMonitoring object.
`append_test_results`(mask, error_msg[, ...])	Append QC results to the PerformanceMonitoring object.
`check_corrupt`(corrupt_values[, key, ...])	Check for corrupt data.
`check_increment`(bound[, key, specs, ...])	Check range on data increments.
`check_missing`([key, min_failures])	Check for missing data.
`check_range`(bound[, key, specs, ...])	Check data range.
`check_timestamp`(frequency[, ...])	Check time series for non-monotonic and duplicate timestamps.
`evaluate_string`(col_name, string_to_eval[, ...])	Returns the evaluated python equation written as a string (BETA).
`get_clock_time`()	Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.
`get_elapsed_time`()	Returns the elapsed time in seconds for each Timestamp in the DataFrame index.
`get_test_results_mask`([key])	Return a mask of data-times that failed a quality control test.

add_dataframe(df, system_name, add_identity_translation_dictionary=False)

Add DataFrame to the PerformanceMonitoring object.

Parameters:

df : pd.DataFrame

DataFrame to add to the PerformanceMonitoring object

system_name : string

System name

add_identity_translation_dictionary : boolean (optional)

Add a 1:1 translation dictionary to the PerformanceMonitoring object using all column names in df, default = False
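
As a rough illustration of what a 1:1 translation dictionary looks like, the sketch below maps every column name to itself. It is plain Python rather than pecos code: `identity_translation` and the column names are hypothetical, and storing each value as a one-element list is an assumption about the dictionary layout.

```python
def identity_translation(column_names):
    """Map every column name to a one-element list holding itself."""
    # Assumption: translation values are lists of column names.
    return {name: [name] for name in column_names}


columns = ["wind_speed", "temperature"]  # hypothetical column names
trans = identity_translation(columns)
print(trans)
```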

add_translation_dictionary(trans, system_name)

Add translation dictionary to the PerformanceMonitoring object.

Parameters:

trans : dictionary

Translation dictionary

system_name : string

System name

add_time_filter(time_filter)

Add a time filter to the PerformanceMonitoring object.

Parameters:

time_filter : pd.DataFrame with a single column or pd.Series

Time filter containing boolean values for each time index
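
A time filter holds one boolean per time index. The sketch below builds such a sequence with the standard library only, keeping the hours from 06:00 up to (but not including) 18:00. The dates and the daytime window are made-up values; with pandas, the same booleans would be held in a Series or single-column DataFrame sharing the data's index.

```python
from datetime import datetime, timedelta

# Hypothetical hourly index covering one day.
start = datetime(2015, 1, 1, 0, 0)
index = [start + timedelta(hours=h) for h in range(24)]

# True where the timestamp falls in the window 06:00 <= t < 18:00.
time_filter = [6 <= t.hour < 18 for t in index]

for t, keep in zip(index[5:8], time_filter[5:8]):
    print(t, keep)
```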

add_signal(col_name, data)

Add signal to the PerformanceMonitoring DataFrame.

Parameters:

col_name : string

Column name to add to translation dictionary

data : pd.DataFrame or pd.Series

Data to add to df

append_test_results(mask, error_msg, min_failures=1, variable_name=True)

Append QC results to the PerformanceMonitoring object.

Parameters:

mask : pd.DataFrame

Result from quality control test, boolean values

error_msg : string

Error message to store with the QC results

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1

variable_name : boolean (optional)

Add variable name to QC results, set to False for timestamp tests, default = True
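
The min_failures parameter can be read as a filter on runs of consecutive failures. The following plain-Python sketch shows one way such a filter could work. `filter_short_runs` is a hypothetical helper, not the pecos implementation, and it assumes True marks a failed data point.

```python
def filter_short_runs(failed, min_failures=1):
    """Keep only failures that belong to a run of at least min_failures."""
    kept = [False] * len(failed)
    run_start = None
    # A trailing False acts as a sentinel that closes the last run.
    for i, flag in enumerate(list(failed) + [False]):
        if flag and run_start is None:
            run_start = i  # a new run of failures begins here
        elif not flag and run_start is not None:
            if i - run_start >= min_failures:
                for j in range(run_start, i):
                    kept[j] = True
            run_start = None
    return kept


failed = [True, False, True, True, True, False, True, True]
reported = filter_short_runs(failed, min_failures=3)
print(reported)
```

With min_failures=3, only the middle run of three failures is kept; the isolated failure and the run of two are dropped.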

check_timestamp(frequency, expected_start_time=None, expected_end_time=None, min_failures=1)

Check time series for non-monotonic and duplicate timestamps.

Parameters:

frequency : int

Expected time series frequency, in seconds

expected_start_time : Timestamp (optional)

Expected start time. If not specified, the minimum timestamp is used

expected_end_time : Timestamp (optional)

Expected end time. If not specified, the maximum timestamp is used

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
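
To make the terms concrete, the sketch below classifies timestamps as duplicate or non-monotonic, using only the standard library. It also lists points absent from a regular grid built from the frequency, running from the minimum to the maximum timestamp. That last step is an assumption about how the frequency might be used, and `timestamp_issues` is a hypothetical helper, not the pecos implementation.

```python
from datetime import datetime, timedelta


def timestamp_issues(timestamps, frequency):
    """Return (duplicates, non_monotonic, missing) for a list of datetimes."""
    seen = set()
    duplicates, non_monotonic = [], []
    latest = None
    for t in timestamps:
        if t in seen:
            duplicates.append(t)       # same timestamp appeared earlier
        elif latest is not None and t < latest:
            non_monotonic.append(t)    # goes backwards in time
        seen.add(t)
        if latest is None or t > latest:
            latest = t
    # Assumption: expected grid runs from the minimum to the maximum timestamp.
    step = timedelta(seconds=frequency)
    expected, end = min(timestamps), max(timestamps)
    missing = []
    while expected <= end:
        if expected not in seen:
            missing.append(expected)
        expected += step
    return duplicates, non_monotonic, missing


t0 = datetime(2015, 1, 1)
minutes = [0, 1, 1, 3, 2, 5]  # made-up sample: one repeat, one step backwards
stamps = [t0 + timedelta(minutes=m) for m in minutes]
duplicates, non_monotonic, missing = timestamp_issues(stamps, frequency=60)
print(duplicates, non_monotonic, missing)
```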

check_range(bound, key=None, specs={}, rolling_mean=1, min_failures=1)

Check data range.

Parameters:

bound : list of floats

[lower bound, upper bound]. None can be used in place of either bound to leave that side unchecked.

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dictionary (optional)

Constants used in bound

rolling_mean : int (optional)

Rolling mean window in number of time steps, default = 1

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
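
The bound semantics, including None for an open side, can be sketched in a few lines of plain Python. `outside_range` is a hypothetical helper. Treating values equal to a bound as passing is an assumption, and the rolling mean and min_failures steps are left out.

```python
def outside_range(values, bound):
    """Flag values below the lower bound or above the upper bound."""
    lower, upper = bound
    flags = []
    for v in values:
        too_low = lower is not None and v < lower    # None disables this side
        too_high = upper is not None and v > upper   # None disables this side
        flags.append(too_low or too_high)
    return flags


values = [-1, 0.5, 2]
print(outside_range(values, [0, 1]))     # both sides checked
print(outside_range(values, [None, 1]))  # only the upper bound checked
```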

check_increment(bound, key=None, specs={}, increment=1, absolute_value=True, rolling_mean=1, min_failures=1)

Check range on data increments.

Parameters:

bound : list of floats

[lower bound, upper bound]. None can be used in place of either bound to leave that side unchecked.

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dictionary (optional)

Constants used in bound

increment : int (optional)

Time step shift used to compute difference, default = 1

absolute_value : boolean (optional)

Take the absolute value of the increment data, default = True

rolling_mean : int (optional)

Rolling mean window in number of time steps, default = 1

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
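
The increment and absolute_value parameters can be illustrated with a small plain-Python sketch that differences each value against the one `increment` steps earlier. `increments` is a hypothetical helper, not the pecos implementation, and using None for entries with no predecessor is an assumption.

```python
def increments(values, increment=1, absolute_value=True):
    """Difference between each value and the one `increment` steps earlier."""
    out = [None] * len(values)  # the first `increment` entries have no predecessor
    for i in range(increment, len(values)):
        diff = values[i] - values[i - increment]
        out[i] = abs(diff) if absolute_value else diff
    return out


values = [1.0, 1.5, 4.0, 3.0]
steps = increments(values)
signed_steps = increments(values, absolute_value=False)

# Flag increments above a made-up upper bound of 2.
upper = 2
flags = [d is not None and d > upper for d in steps]
print(steps, signed_steps, flags)
```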

check_missing(key=None, min_failures=1)

Check for missing data.

Parameters:

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
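
A missing-data test reduces to flagging empty entries. In the plain-Python sketch below, `is_missing` is a hypothetical helper, and treating both None and NaN as missing is an assumption.

```python
import math


def is_missing(values):
    """Flag entries that are None or NaN."""
    return [
        v is None or (isinstance(v, float) and math.isnan(v))
        for v in values
    ]


values = [1.0, float("nan"), None, 2.0]
print(is_missing(values))
```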

check_corrupt(corrupt_values, key=None, min_failures=1)

Check for corrupt data.

Parameters:

corrupt_values : list of floats

List of corrupt data values

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
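
Corrupt values are typically sentinel numbers written by a data logger. The plain-Python sketch below flags entries that match a list of such sentinels. `is_corrupt` is a hypothetical helper, and the sentinel values -999 and 99999 are made up.

```python
def is_corrupt(values, corrupt_values):
    """Flag entries that equal one of the listed corrupt values."""
    return [v in corrupt_values for v in values]


values = [1.2, -999, 3.4, 99999]
print(is_corrupt(values, corrupt_values=[-999, 99999]))
```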

evaluate_string(col_name, string_to_eval, specs={})

Returns the evaluated Python equation written as a string (BETA). Each {keyword} in string_to_eval is first expanded to self.df[self.trans[keyword]]. If that fails, {keyword} is expanded to specs[keyword].

Parameters:

col_name : string

Column name for the new signal

string_to_eval : string

String to evaluate

specs : dictionary (optional)

Constants used as keywords

Returns:

signal : pd.DataFrame or pd.Series

DataFrame or Series with results of the evaluated string
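
The two-stage keyword lookup can be sketched in plain Python. This is not the pecos implementation: `evaluate_rows` is a hypothetical helper that works row by row on a dict of lists, uses only the first column listed for a key, and assumes keywords are valid Python identifiers. It relies on `eval`, so it should only be given trusted strings.

```python
import re


def evaluate_rows(string_to_eval, data, trans, specs=None):
    """Evaluate the expression once per row, resolving each {keyword}."""
    specs = specs or {}
    keywords = re.findall(r"\{(\w+)\}", string_to_eval)
    # "{power} * {scale}" becomes "power * scale"
    expression = re.sub(r"\{(\w+)\}", r"\1", string_to_eval)
    n_rows = len(next(iter(data.values())))
    results = []
    for i in range(n_rows):
        namespace = {}
        for keyword in keywords:
            try:
                # First choice: data column named by the translation dictionary.
                namespace[keyword] = data[trans[keyword][0]][i]
            except KeyError:
                # Fallback: constant from specs.
                namespace[keyword] = specs[keyword]
        results.append(eval(expression, {"__builtins__": {}}, namespace))
    return results


data = {"col_a": [1.0, 2.0, 3.0]}   # hypothetical column
trans = {"power": ["col_a"]}        # hypothetical translation key
specs = {"scale": 10}
signal = evaluate_rows("{power} * {scale}", data, trans, specs)
print(signal)
```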

get_elapsed_time()

Returns the elapsed time in seconds for each Timestamp in the DataFrame index.

Returns:

elapsed_time : pd.DataFrame

Elapsed time of the DataFrame index
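
Elapsed time in seconds can be sketched with the standard library alone. Measuring from the first timestamp in the index is an assumption here, and the sample timestamps are made up.

```python
from datetime import datetime

index = [
    datetime(2015, 1, 1, 0, 0, 0),
    datetime(2015, 1, 1, 0, 1, 0),
    datetime(2015, 1, 1, 0, 5, 30),
]

# Seconds since the first timestamp (assumed reference point).
elapsed = [(t - index[0]).total_seconds() for t in index]
print(elapsed)
```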

get_clock_time()

Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.

Returns:

clock_time : pd.DataFrame

Clock time of the DataFrame index
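
Seconds past midnight follow directly from the hour, minute, and second of each timestamp. The sketch below uses only the standard library; `clock_time` is a hypothetical helper and the sample timestamps are made up.

```python
from datetime import datetime


def clock_time(t):
    """Seconds past midnight for a single timestamp."""
    return t.hour * 3600 + t.minute * 60 + t.second


index = [
    datetime(2015, 1, 1, 6, 30, 15),
    datetime(2015, 1, 1, 23, 59, 59),
    datetime(2015, 1, 2, 0, 0, 0),
]
clock = [clock_time(t) for t in index]
print(clock)
```

The value resets to 0 at each midnight, which makes it convenient for building time-of-day filters.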

get_test_results_mask(key=None)

Return a mask of data-times that failed a quality control test.

Parameters:

key : string (optional)

Translation dictionary key. If not specified, all columns are used.

Returns:

test_results_mask : pd.DataFrame

DataFrame containing boolean values for each data point. True = data point passed all tests, False = data point failed at least one test.
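
The mask convention (True = passed) can be illustrated by starting from an all-True table and clearing the entries covered by recorded failures. The plain-Python sketch below is not the pecos implementation: `results_mask` is a hypothetical helper, and the (column, start, end) failure records are an assumed layout, with integers standing in for timestamps.

```python
def results_mask(index, columns, failures):
    """Build a pass/fail table: True = passed, False = failed a test."""
    mask = {col: [True] * len(index) for col in columns}
    for col, start, end in failures:
        for i, t in enumerate(index):
            if start <= t <= end:  # failure covers this time, inclusive
                mask[col][i] = False
    return mask


index = [0, 1, 2, 3]         # integers standing in for timestamps
columns = ["a", "b"]         # hypothetical column names
failures = [("a", 1, 2)]     # column "a" failed from time 1 through time 2
mask = results_mask(index, columns, failures)
print(mask)
```

A mask in this form can be used to keep only the data points that passed every test.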