pecos.monitoring.PerformanceMonitoring module

class pecos.monitoring.PerformanceMonitoring.PerformanceMonitoring[source]

Bases: object

Performance Monitoring class

Methods

add_dataframe(df, system_name[, ...]) Add dataframe to the PerformanceMonitoring class
add_signal(col_name, df) Add signal to the PerformanceMonitoring dataframe
add_time_filter(time_filter) Add a time filter to the PerformanceMonitoring class
add_translation_dictonary(trans, system_name) Add translation dictionary to the PerformanceMonitoring class
append_test_results(mask, error_msg[, ...]) Append QC results to the PerformanceMonitoring class
check_corrupt(corrupt_values[, key, ...]) Check for corrupt data
check_increment(bound[, key, specs, ...]) Check range on data increments
check_missing([key, min_failures]) Check for missing data
check_range(bound[, key, specs, ...]) Check data range
check_timestamp(frequency[, ...]) Check time series for non-monotonic and duplicate timestamps.
evaluate_string(col_name, string_to_eval[, ...]) Returns the evaluated python equation written as a string (BETA).
get_clock_time() Returns clock time in seconds from the dataframe index
get_elapsed_time() Returns elapsed time in seconds from the dataframe index
get_test_results_mask([key]) Return a mask of data-times that failed a quality control test
add_dataframe(df, system_name, add_identity_translation_dictonary=False)[source]

Add dataframe to the PerformanceMonitoring class

Parameters:

df : pd.DataFrame

Dataframe to add to the Performance Monitoring class

system_name : string

System name

add_identity_translation_dictonary : bool (default = False)

Add a 1:1 translation dictionary to the PerformanceMonitoring class using all column names in df

add_translation_dictonary(trans, system_name)[source]

Add translation dictionary to the PerformanceMonitoring class

Parameters:

trans : dictionary

Translation dictionary

system_name : string

System name
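
As a rough illustration of what a translation dictionary could look like, the sketch below uses plain pandas. The column names, the keys, and the layout (each key mapping to a list of raw column names) are assumptions made for the example; the documentation above only states that trans is a dictionary.

```python
import pandas as pd

# Hypothetical raw data with logger-style column names.
df = pd.DataFrame(
    {"TC_01": [20.1, 20.4], "TC_02": [19.8, 20.0], "PYR_01": [512.0, 530.0]},
    index=pd.date_range("2015-01-01", periods=2, freq="min"),
)

# Assumed layout: each key maps to a list of raw column names.
trans = {"Temperature": ["TC_01", "TC_02"], "Irradiance": ["PYR_01"]}

# A 1:1 (identity) dictionary maps every column name to itself.
identity_trans = {col: [col] for col in df.columns}

# Select the columns that sit behind a key.
temperature = df[trans["Temperature"]]
```

Grouping columns under one key is what lets a single test be applied to every column of the same kind.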

add_time_filter(time_filter)[source]

Add a time filter to the PerformanceMonitoring class

Parameters:

time_filter : pd.Series

Time filter containing boolean values for each time index
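
A time filter of this shape can be built directly from the dataframe index. The sketch below is a pandas-only illustration; the daytime window and the convention that True marks times to keep are assumptions, not something the documentation above specifies.

```python
import pandas as pd

# Hypothetical index sampled every six hours.
index = pd.date_range("2015-01-01", periods=4, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=index)

# One boolean per time index: True between 06:00 (inclusive) and 18:00 (exclusive).
time_filter = pd.Series((index.hour >= 6) & (index.hour < 18), index=index)

# Assumed convention: True marks the times to keep.
daytime = df[time_filter]
```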

add_signal(col_name, df)[source]

Add signal to the PerformanceMonitoring dataframe

Parameters:

col_name : string

Column name to add to translation dictionary

df : pd.DataFrame

DataFrame to add to the PerformanceMonitoring dataframe

append_test_results(mask, error_msg, min_failures=1, variable_name=True)[source]

Append QC results to the PerformanceMonitoring class

Parameters:

mask : pd.DataFrame

Result from QC test, boolean values.

error_msg : string

Error message to store with the QC results

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting

variable_name : bool (default = True)

Add variable name to QC results, set to False for timestamp tests
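
The min_failures parameter recurs in every check below, so a small sketch of the idea may help. This is not the library's implementation; the helper name filter_short_runs is hypothetical, and True is assumed to mean "failed".

```python
import pandas as pd

def filter_short_runs(failed, min_failures=1):
    """Keep only failures that belong to a run of at least min_failures.

    Hypothetical helper for illustration; `failed` is a boolean Series
    where True marks a failed sample.
    """
    # Give each run of identical consecutive values its own id.
    run_id = (failed != failed.shift()).cumsum()
    # Length of the run that each sample belongs to.
    run_length = failed.groupby(run_id).transform("size")
    return failed & (run_length >= min_failures)

failed = pd.Series([False, True, False, True, True, True, False])
reported = filter_short_runs(failed, min_failures=2)
```

With min_failures=2, the isolated failure at position 1 is dropped and the run of three consecutive failures is kept.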

check_timestamp(frequency, expected_start_time=None, expected_end_time=None, min_failures=1)[source]

Check time series for non-monotonic and duplicate timestamps.

Parameters:

frequency : int

Expected timeseries frequency, in seconds

expected_start_time : Timestamp (default = None)

Expected start time. If not specified, the minimum timestamp is used.

expected_end_time : Timestamp (default = None)

Expected end time. If not specified, the maximum timestamp is used.

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting
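
The kinds of timestamp problems described here can be detected with plain pandas, as sketched below. The sample index is made up, and comparing against an expected index built from frequency is an assumption about how that parameter might be used, not a description of the method's internals.

```python
import pandas as pd

# Hypothetical index with a duplicate, a gap, and an out-of-order timestamp.
index = pd.DatetimeIndex([
    "2015-01-01 00:00:00",
    "2015-01-01 00:01:00",
    "2015-01-01 00:01:00",  # duplicate
    "2015-01-01 00:04:00",
    "2015-01-01 00:03:00",  # goes backwards in time
])
frequency = 60  # expected spacing, in seconds

# Second and later occurrences of a repeated timestamp.
duplicates = index[index.duplicated()]

# A timestamp is non-monotonic if it is earlier than the one before it.
steps = pd.Series(index).diff().dt.total_seconds()
non_monotonic = index[(steps < 0).values]

# Expected timestamps between the minimum and maximum that never appear.
expected = pd.date_range(index.min(), index.max(),
                         freq=pd.Timedelta(seconds=frequency))
missing = expected.difference(index)
```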

check_range(bound, key=None, specs={}, rolling_mean=1, min_failures=1)[source]

Check data range

Parameters:

bound : list

[lower bound, upper bound], None can be used in place of a lower or upper bound

key : string (default = None)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dict (default = {})

Constants used in bound

rolling_mean : int (default = 1)

Rolling mean window in number of timesteps

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting
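
To make the bound and rolling_mean parameters concrete, here is a minimal pandas sketch of a range test. The function name out_of_range is hypothetical, the sketch ignores key, specs, and min_failures, and it should not be read as the library's code.

```python
import pandas as pd

def out_of_range(data, bound, rolling_mean=1):
    """Return a boolean DataFrame, True where data falls outside bound.

    Hypothetical helper; bound is [lower, upper] and either entry may be None.
    """
    lower, upper = bound
    if rolling_mean > 1:
        # Smooth the data before testing; the first window - 1 rows become NaN.
        data = data.rolling(rolling_mean).mean()
    failed = pd.DataFrame(False, index=data.index, columns=data.columns)
    if lower is not None:
        failed = failed | (data < lower)
    if upper is not None:
        failed = failed | (data > upper)
    return failed

df = pd.DataFrame({"A": [0.5, 1.2, -0.1, 0.9]})
both = out_of_range(df, [0, 1])          # flags 1.2 and -0.1
upper_only = out_of_range(df, [None, 1]) # flags 1.2 only
smoothed = out_of_range(df, [0, 1], rolling_mean=2)
```

In this sample the two-step rolling mean stays inside [0, 1], so the smoothed test reports no failures even though two raw values are out of range.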

check_increment(bound, key=None, specs={}, increment=1, absolute_value=True, rolling_mean=1, min_failures=1)[source]

Check range on data increments

Parameters:

bound : list

[lower bound, upper bound], None can be used in place of a lower or upper bound

key : string (default = None)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dict (default = {})

Constants used in bound

increment : int (default = 1)

Timestep shift used to compute difference

absolute_value : bool (default = True)

Take the absolute value of the increment data

rolling_mean : int (default = 1)

Rolling mean window in number of timesteps

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting
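
The increment test can be pictured as a range test applied to differences between shifted samples. The sketch below assumes that reading; the helper name increment_out_of_range is hypothetical, and key, specs, rolling_mean, and min_failures are left out for brevity.

```python
import pandas as pd

def increment_out_of_range(data, bound, increment=1, absolute_value=True):
    """Return a boolean Series, True where the increment falls outside bound.

    Hypothetical helper for illustration only.
    """
    lower, upper = bound
    # Difference between each sample and the one `increment` steps earlier.
    delta = data - data.shift(increment)
    if absolute_value:
        delta = delta.abs()
    failed = pd.Series(False, index=data.index)
    if lower is not None:
        failed = failed | (delta < lower)
    if upper is not None:
        failed = failed | (delta > upper)
    return failed

data = pd.Series([1.0, 1.0, 1.0, 5.0, 5.2])
stagnant = increment_out_of_range(data, [0.0001, None])  # data that stops changing
jumps = increment_out_of_range(data, [None, 3])          # abrupt changes
```

A lower bound flags stagnant data, while an upper bound flags abrupt jumps.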

check_missing(key=None, min_failures=1)[source]

Check for missing data

Parameters:

key : string (default = None)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting

check_corrupt(corrupt_values, key=None, min_failures=1)[source]

Check for corrupt data

Parameters:

corrupt_values : list

List of corrupt data values

key : string (default = None)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (default = 1)

Minimum number of consecutive failures required for reporting
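
Missing and corrupt data tests both reduce to simple boolean masks. The pandas sketch below illustrates the two ideas side by side; the sentinel value -999 and the sample data are made up, and nothing here is taken from the library's source.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1.0, np.nan, -999.0], "B": [2.0, 3.0, 4.0]})

# Missing data: entries that are NaN.
missing = df.isnull()

# Corrupt data: entries equal to any value in a list of known bad values.
corrupt_values = [-999.0]  # hypothetical sentinel written by a data logger
corrupt = df.isin(corrupt_values)

# One possible follow-up: blank out corrupt entries so they read as missing.
cleaned = df.mask(corrupt)
```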

evaluate_string(col_name, string_to_eval, specs={})[source]

Returns the evaluated Python equation written as a string (BETA). Each [keyword] in string_to_eval is first expanded to self.df[self.trans[keyword]]; if that fails, [keyword] is expanded to specs[keyword].

Parameters:

col_name : string

Column name for the new signal

string_to_eval : string

String to evaluate

specs : dict (default = {})

Constants used as keywords

Returns:

signal : pd.DataFrame or pd.Series

DataFrame or Series with results of the evaluated string
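
The keyword expansion described above can be sketched in a few lines. This is an illustrative stand-in, not the method's actual code: the function name expand_and_eval, the regular expression, and the use of temporary variable names are all assumptions.

```python
import re
import pandas as pd

def expand_and_eval(string_to_eval, df, trans, specs=None):
    """Replace each [keyword] with data or a constant, then evaluate.

    Hypothetical helper. eval is unsafe on untrusted strings.
    """
    specs = specs or {}
    namespace = {}

    def replace(match):
        keyword = match.group(1)
        name = "_v%d" % len(namespace)
        if keyword in trans:
            # First choice: the dataframe columns behind the translation key.
            namespace[name] = df[trans[keyword]]
        else:
            # Fallback: a constant from specs.
            namespace[name] = specs[keyword]
        return name

    expression = re.sub(r"\[(\w+)\]", replace, string_to_eval)
    return eval(expression, {"__builtins__": {}}, namespace)

df = pd.DataFrame({"TC_01": [10.0, 20.0]})
trans = {"Temp": ["TC_01"]}
signal = expand_and_eval("[Temp] * [scale]", df, trans, specs={"scale": 2})
```

Here [Temp] resolves to a dataframe column through trans, while [scale] is not a translation key and falls back to specs.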

get_elapsed_time()[source]

Returns elapsed time in seconds from the dataframe index

Returns:

elapsed_time : pd.DataFrame

Elapsed time of the dataframe index

get_clock_time()[source]

Returns clock time in seconds from the dataframe index

Returns:

clock_time : pd.DataFrame

Clock time of the dataframe index
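
The difference between elapsed time and clock time is easiest to see on an index that crosses midnight. The sketch below assumes elapsed time is measured from the first timestamp and clock time from midnight of each timestamp's day; it returns Series rather than the DataFrame documented above.

```python
import pandas as pd

# Hypothetical one-minute index that crosses midnight.
index = pd.date_range("2015-01-01 23:59:00", periods=3, freq="min")

# Elapsed time: seconds since the first timestamp in the index.
elapsed = pd.Series((index - index[0]).total_seconds(), index=index)

# Clock time: seconds since midnight of each timestamp's own day.
clock = pd.Series((index - index.normalize()).total_seconds(), index=index)
```

Elapsed time keeps increasing across midnight, whereas clock time resets to zero.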

get_test_results_mask(key=None)[source]

Return a mask of data-times that failed a quality control test

Parameters:

key : string (default = None)

Translation dictionary key. If not specified, all columns are used.

Returns:

test_results_mask : pd.DataFrame

DataFrame containing boolean values for each data point: True = data point passed all tests, False = data point failed at least one test.