pecos.monitoring module

The monitoring module contains the PerformanceMonitoring class used to run quality control tests and store results.

class pecos.monitoring.PerformanceMonitoring

Bases: object

PerformanceMonitoring class

Methods

add_dataframe(df, system_name[, ...]) Add DataFrame to the PerformanceMonitoring object.
add_signal(col_name, data) Add signal to the PerformanceMonitoring DataFrame.
add_time_filter(time_filter) Add a time filter to the PerformanceMonitoring object.
add_translation_dictionary(trans, system_name) Add translation dictionary to the PerformanceMonitoring object.
append_test_results(mask, error_msg[, ...]) Append QC results to the PerformanceMonitoring object.
check_corrupt(corrupt_values[, key, ...]) Check for corrupt data.
check_increment(bound[, key, specs, ...]) Check range on data increments.
check_missing([key, min_failures]) Check for missing data.
check_range(bound[, key, specs, ...]) Check data range.
check_timestamp(frequency[, ...]) Check time series for non-monotonic and duplicate timestamps.
evaluate_string(col_name, string_to_eval[, ...]) Returns the evaluated Python equation written as a string (BETA).
get_clock_time() Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.
get_elapsed_time() Returns the elapsed time in seconds for each Timestamp in the DataFrame index.
get_test_results_mask([key]) Return a mask of data-times that failed a quality control test.

add_dataframe(df, system_name, add_identity_translation_dictionary=False)

Add DataFrame to the PerformanceMonitoring object.

Parameters:

df : pd.DataFrame

DataFrame to add to the PerformanceMonitoring object

system_name : string

System name

add_identity_translation_dictionary : boolean (optional)

Add a 1:1 translation dictionary to the PerformanceMonitoring object using all column names in df, default = False
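
To illustrate what a 1:1 translation dictionary could look like, the sketch below builds one from a list of column names. It assumes each key maps to a list of column names. That layout and the helper name identity_translation are assumptions made for illustration and are not taken from the pecos source.

```python
def identity_translation(column_names):
    """Map every column name to a one-element list holding that same name."""
    # Assumed layout: key -> list of column names (hypothetical, for illustration).
    return {name: [name] for name in column_names}


trans = identity_translation(["Wind Speed", "Temperature"])
print(trans)
```

With such a dictionary, each column name can be used directly as the key argument of the quality control tests.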

add_translation_dictionary(trans, system_name)

Add translation dictionary to the PerformanceMonitoring object.

Parameters:

trans : dictionary

Translation dictionary

system_name : string

System name

add_time_filter(time_filter)

Add a time filter to the PerformanceMonitoring object.

Parameters:

time_filter : pd.DataFrame with a single column or pd.Series

Time filter containing boolean values for each time index

add_signal(col_name, data)

Add signal to the PerformanceMonitoring DataFrame.

Parameters:

col_name : string

Column name to add to translation dictionary

data : pd.DataFrame or pd.Series

Data to add to df

append_test_results(mask, error_msg, min_failures=1, variable_name=True)

Append QC results to the PerformanceMonitoring object.

Parameters:

mask : pd.DataFrame

Result from quality control test, boolean values

error_msg : string

Error message to store with the QC results

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1

variable_name : boolean (optional)

Add variable name to QC results, set to False for timestamp tests, default = True
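
The min_failures parameter can be read as a filter on runs of consecutive failures. The following is a minimal plain-Python sketch of that idea, not the pecos implementation. The function name filter_consecutive_failures and the convention that True marks a failed data point are assumptions.

```python
def filter_consecutive_failures(failed, min_failures=1):
    """Keep only failures that belong to a run of at least min_failures."""
    reported = [False] * len(failed)
    run_start = None
    # A trailing False acts as a sentinel that closes a run ending at the last item.
    for i, flag in enumerate(list(failed) + [False]):
        if flag and run_start is None:
            run_start = i  # a new run of failures begins here
        elif not flag and run_start is not None:
            if i - run_start >= min_failures:
                for j in range(run_start, i):
                    reported[j] = True
            run_start = None
    return reported


# Assumed convention for this sketch: True = the data point failed the test.
failed = [True, False, True, True, True, False, True, True]
reported = filter_consecutive_failures(failed, min_failures=3)
print(reported)
```

Only the run of three failures is kept. The isolated failure and the run of two are dropped because they are shorter than min_failures.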

check_timestamp(frequency, expected_start_time=None, expected_end_time=None, min_failures=1)

Check time series for non-monotonic and duplicate timestamps.

Parameters:

frequency : int

Expected time series frequency, in seconds

expected_start_time : Timestamp (optional)

Expected start time. If not specified, the minimum timestamp is used

expected_end_time : Timestamp (optional)

Expected end time. If not specified, the maximum timestamp is used

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
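
The kinds of problems this test looks for can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the pecos implementation: the helper name check_timestamps is hypothetical, and the use of frequency to build an expected grid of timestamps (and so to detect gaps) is an assumption about how the parameter is applied.

```python
from datetime import datetime, timedelta


def check_timestamps(index, frequency, expected_start=None, expected_end=None):
    """Return (duplicates, non_monotonic, missing) for a non-empty list of datetimes."""
    seen = set()
    duplicates, non_monotonic = [], []
    latest = None
    for ts in index:
        if ts in seen:
            duplicates.append(ts)      # timestamp already appeared earlier
        elif latest is not None and ts < latest:
            non_monotonic.append(ts)   # timestamp goes back in time
        seen.add(ts)
        if latest is None or ts > latest:
            latest = ts

    # Fall back to the minimum and maximum timestamps when no bounds are given.
    start = expected_start if expected_start is not None else min(index)
    end = expected_end if expected_end is not None else max(index)

    # Assumption: frequency (seconds) defines the expected spacing of timestamps.
    step = timedelta(seconds=frequency)
    missing = []
    current = start
    while current <= end:
        if current not in seen:
            missing.append(current)
        current += step
    return duplicates, non_monotonic, missing


t0 = datetime(2020, 1, 1)
index = [t0 + timedelta(seconds=s) for s in (0, 60, 60, 240, 180)]
duplicates, non_monotonic, missing = check_timestamps(index, frequency=60)
print(duplicates, non_monotonic, missing)
```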

check_range(bound, key=None, specs={}, rolling_mean=1, min_failures=1)

Check data range.

Parameters:

bound : list of floats

[lower bound, upper bound]. None can be used in place of either bound.

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dictionary (optional)

Constants used in bound

rolling_mean : int (optional)

Rolling mean window in number of time steps, default = 1

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
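
The meaning of bound, including the use of None for an open side, can be sketched as follows. This is a plain-Python illustration rather than the pecos implementation. The helper name out_of_range is hypothetical, and treating values exactly equal to a bound as passing is an assumption.

```python
def out_of_range(values, bound):
    """Flag values below the lower bound or above the upper bound.

    bound is [lower, upper]; None disables that side of the check.
    """
    lower, upper = bound
    flags = []
    for v in values:
        too_low = lower is not None and v < lower
        too_high = upper is not None and v > upper
        flags.append(too_low or too_high)
    return flags


values = [-1.0, 0.5, 2.0]
both_sides = out_of_range(values, [0, 1])      # lower and upper bound
upper_only = out_of_range(values, [None, 1])   # no lower bound
print(both_sides, upper_only)
```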

check_increment(bound, key=None, specs={}, increment=1, absolute_value=True, rolling_mean=1, min_failures=1)

Check range on data increments.

Parameters:

bound : list of floats

[lower bound, upper bound]. None can be used in place of either bound.

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

specs : dictionary (optional)

Constants used in bound

increment : int (optional)

Time step shift used to compute difference, default = 1

absolute_value : boolean (optional)

Take the absolute value of the increment data, default = True

rolling_mean : int (optional)

Rolling mean window in number of time steps, default = 1

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
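
How increment and absolute_value interact can be shown with a small sketch. It computes the difference between each value and the value a fixed number of time steps earlier, which is the quantity the range check would then be applied to. The helper name increments and the use of None for the first entries (where no earlier value exists) are assumptions for illustration.

```python
def increments(values, increment=1, absolute_value=True):
    """Difference between each value and the value `increment` steps earlier."""
    result = []
    for i, v in enumerate(values):
        if i < increment:
            result.append(None)  # no earlier value to compare against
            continue
        diff = v - values[i - increment]
        result.append(abs(diff) if absolute_value else diff)
    return result


values = [1.0, 1.5, 4.0, 3.0]
diffs = increments(values)                          # absolute differences
signed = increments(values, absolute_value=False)   # signed differences

# Flag increments above an upper bound of 2.0 (no lower bound).
flagged = [d is not None and d > 2.0 for d in diffs]
print(diffs, signed, flagged)
```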

check_missing(key=None, min_failures=1)

Check for missing data.

Parameters:

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1

check_corrupt(corrupt_values, key=None, min_failures=1)

Check for corrupt data.

Parameters:

corrupt_values : list of floats

List of corrupt data values

key : string (optional)

Translation dictionary key. If not specified, all columns are used in the test.

min_failures : int (optional)

Minimum number of consecutive failures required for reporting, default = 1
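
A corrupt-value check amounts to testing each data point for membership in corrupt_values. The sketch below shows the idea in plain Python. The helper name is_corrupt and the sentinel values are hypothetical.

```python
def is_corrupt(values, corrupt_values):
    """Flag every value that matches one of the listed corrupt values."""
    corrupt = set(corrupt_values)
    return [v in corrupt for v in values]


# -999 and 99999 are example sentinel values, chosen only for illustration.
flags = is_corrupt([3.2, -999.0, 4.1, 99999.0], [-999, 99999])
print(flags)
```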

evaluate_string(col_name, string_to_eval, specs={})

Returns the evaluated Python equation written as a string (BETA). Each {keyword} in string_to_eval is first expanded to self.df[self.trans[keyword]]; if that fails, {keyword} is expanded to specs[keyword].

Parameters:

col_name : string

Column name for the new signal

string_to_eval : string

String to evaluate

specs : dictionary (optional)

Constants used as keywords

Returns:

signal : pd.DataFrame or pd.Series

DataFrame or Series with results of the evaluated string
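
The keyword expansion order described above (translation dictionary first, then specs) can be sketched as a text substitution. This example only builds the expanded expression string and does not evaluate it. The helper name expand_keywords is hypothetical, and a membership test stands in for the "if that fails" fallback.

```python
import re


def expand_keywords(string_to_eval, trans, specs):
    """Replace each {keyword} with a lookup expression: data first, then specs."""
    def replace(match):
        keyword = match.group(1)
        if keyword in trans:
            return "df[trans[%r]]" % keyword   # keyword names a group of data columns
        if keyword in specs:
            return "specs[%r]" % keyword       # keyword names a constant
        raise KeyError(keyword)
    return re.sub(r"\{(\w+)\}", replace, string_to_eval)


# "power", "capacity", and "P1" are made-up names for illustration.
trans = {"power": ["P1"]}
specs = {"capacity": 100}
expanded = expand_keywords("{power} / {capacity}", trans, specs)
print(expanded)
```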

get_elapsed_time()

Returns the elapsed time in seconds for each Timestamp in the DataFrame index.

Returns:

elapsed_time : pd.DataFrame

Elapsed time of the DataFrame index
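
As a rough illustration of elapsed time in seconds, the sketch below measures each timestamp against the first entry of the index. Using the first entry as the reference point is an assumption, and elapsed_seconds is a hypothetical helper written with the standard library only.

```python
from datetime import datetime


def elapsed_seconds(index):
    """Seconds between each timestamp and the first timestamp in the index."""
    start = index[0]  # assumed reference point
    return [(ts - start).total_seconds() for ts in index]


index = [
    datetime(2020, 1, 1, 23, 59, 0),
    datetime(2020, 1, 2, 0, 0, 0),
    datetime(2020, 1, 2, 0, 1, 30),
]
elapsed = elapsed_seconds(index)
print(elapsed)
```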

get_clock_time()

Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.

Returns:

clock_time : pd.DataFrame

Clock time of the DataFrame index
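
Seconds past midnight can be computed directly from the hour, minute, and second of each timestamp. The sketch below does this in plain Python and then derives a per-timestamp boolean list for a daytime window, similar in spirit to the boolean values a time filter holds. The helper name clock_seconds and the 06:00 to 18:00 window are assumptions for illustration.

```python
from datetime import datetime


def clock_seconds(index):
    """Time of day in seconds past midnight for each timestamp."""
    return [ts.hour * 3600 + ts.minute * 60 + ts.second for ts in index]


index = [
    datetime(2020, 1, 1, 5, 59, 59),
    datetime(2020, 1, 1, 6, 0, 0),
    datetime(2020, 1, 1, 12, 30, 0),
    datetime(2020, 1, 1, 19, 0, 0),
]
clock = clock_seconds(index)

# True for timestamps between 06:00 and 18:00 inclusive (example window).
daytime = [6 * 3600 <= s <= 18 * 3600 for s in clock]
print(clock, daytime)
```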

get_test_results_mask(key=None)

Return a mask of data-times that failed a quality control test.

Parameters:

key : string (optional)

Translation dictionary key. If not specified, all columns are used

Returns:

test_results_mask : pd.DataFrame

DataFrame containing boolean values for each data point. True = data point passed all tests, False = data point failed at least one test.
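
The mask convention (True = passed all tests) lends itself to filtering data. The sketch below combines per-test pass masks and applies the result to a list of values. It is a plain-Python illustration. The helper names combine_masks and apply_mask and the sample data are hypothetical.

```python
def combine_masks(*masks):
    """A data point passes overall only if it passed every individual test."""
    return [all(flags) for flags in zip(*masks)]


def apply_mask(values, mask):
    """Keep values whose mask entry is True; replace the rest with None."""
    return [v if ok else None for v, ok in zip(values, mask)]


values = [1.0, 50.0, -999.0, 2.0]
range_ok = [True, False, True, True]     # example result of a range test
corrupt_ok = [True, True, False, True]   # example result of a corrupt-value test

mask = combine_masks(range_ok, corrupt_ok)
clean = apply_mask(values, mask)
print(mask, clean)
```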