pecos.monitoring module
The monitoring module contains the PerformanceMonitoring class used to run quality control tests and store results.
-
class pecos.monitoring.PerformanceMonitoring
Bases: object
PerformanceMonitoring class
Methods
add_dataframe(df, system_name[, ...]): Add DataFrame to the PerformanceMonitoring object.
add_signal(col_name, data): Add signal to the PerformanceMonitoring DataFrame.
add_time_filter(time_filter): Add a time filter to the PerformanceMonitoring object.
add_translation_dictionary(trans, system_name): Add translation dictionary to the PerformanceMonitoring object.
append_test_results(mask, error_msg[, ...]): Append QC results to the PerformanceMonitoring object.
check_corrupt(corrupt_values[, key, ...]): Check for corrupt data.
check_increment(bound[, key, specs, ...]): Check range on data increments.
check_missing([key, min_failures]): Check for missing data.
check_range(bound[, key, specs, ...]): Check data range.
check_timestamp(frequency[, ...]): Check time series for non-monotonic and duplicate timestamps.
evaluate_string(col_name, string_to_eval[, ...]): Returns the evaluated Python equation written as a string (BETA).
get_clock_time(): Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.
get_elapsed_time(): Returns the elapsed time in seconds for each Timestamp in the DataFrame index.
get_test_results_mask([key]): Return a mask of data-times that failed a quality control test.
-
add_dataframe(df, system_name, add_identity_translation_dictionary=False)
Add DataFrame to the PerformanceMonitoring object.
Parameters:
df : pd.DataFrame
DataFrame to add to the PerformanceMonitoring object
system_name : string
System name
add_identity_translation_dictionary : boolean (optional)
Add a 1:1 translation dictionary to the PerformanceMonitoring object using all column names in df, default = False
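As a rough illustration of what a 1:1 translation dictionary could look like, the sketch below builds one from the column names of a small DataFrame. It assumes that each key maps to a list of column names; that layout, the sample data, and the name identity_trans are assumptions for illustration, not taken from the Pecos source.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"temp": [20.5, 21.0], "power": [1.2, 1.3]})

# Assumed layout: every column name is a key that maps to a
# one-element list containing that same column name.
identity_trans = {col: [col] for col in df.columns}

# A key can then be used to select its columns from the DataFrame.
temp_columns = df[identity_trans["temp"]]
```

With such a dictionary, passing key="temp" to a test would select only the "temp" column.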
-
add_translation_dictionary(trans, system_name)
Add translation dictionary to the PerformanceMonitoring object.
Parameters:
trans : dictionary
Translation dictionary
system_name : string
System name
-
add_time_filter(time_filter)
Add a time filter to the PerformanceMonitoring object.
Parameters:
time_filter : pd.DataFrame with a single column or pd.Series
Time filter containing boolean values for each time index
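A time filter of this shape can be built with plain pandas. The sketch below creates a boolean Series that is True between 06:00 and 18:00; the index, the hours, and the assumption that True marks times to keep are all illustrative.

```python
import pandas as pd

# Hypothetical hourly index covering one day.
index = pd.date_range("2016-01-01 00:00", periods=24, freq="60min")

# True for timestamps from 06:00 up to (not including) 18:00.
hours = index.hour
time_filter = pd.Series((hours >= 6) & (hours < 18), index=index)
```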
-
add_signal(col_name, data)
Add signal to the PerformanceMonitoring DataFrame.
Parameters:
col_name : string
Column name to add to translation dictionary
data : pd.DataFrame or pd.Series
Data to add to df
-
append_test_results(mask, error_msg, min_failures=1, variable_name=True)
Append QC results to the PerformanceMonitoring object.
Parameters:
mask : pd.DataFrame
Result from quality control test, boolean values
error_msg : string
Error message to store with the QC results
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
variable_name : boolean (optional)
Add variable name to QC results, set to False for timestamp tests, default = True
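The min_failures idea (report a failure only when it persists for several consecutive time steps) can be sketched in plain pandas as follows. This is not the Pecos implementation; the helper name filter_short_runs is hypothetical, and True is assumed to mark a failed time step.

```python
import pandas as pd

def filter_short_runs(failed, min_failures=1):
    """Keep failures only where they form a run of at least
    min_failures consecutive True values."""
    failed = failed.astype(bool)
    # A new run starts wherever the value changes.
    run_id = failed.astype(int).diff().ne(0).cumsum()
    # Length of the run each element belongs to.
    run_len = run_id.map(run_id.value_counts())
    return failed & (run_len >= min_failures)

failed = pd.Series([True, False, True, True, True, False, True, True])
reported = filter_short_runs(failed, min_failures=3)
```

Here only the run of three consecutive failures survives; the isolated failure and the run of two are dropped.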
-
check_timestamp(frequency, expected_start_time=None, expected_end_time=None, min_failures=1)
Check time series for non-monotonic and duplicate timestamps.
Parameters:
frequency : int
Expected time series frequency, in seconds
expected_start_time : Timestamp (optional)
Expected start time. If not specified, the minimum timestamp is used
expected_end_time : Timestamp (optional)
Expected end time. If not specified, the maximum timestamp is used
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
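The kinds of problems a timestamp check looks for can be illustrated with plain pandas. The sketch below finds duplicate timestamps, tests monotonicity, and lists timestamps absent from a regular grid built from the expected frequency. It is an illustration under assumed sample data, not the logic used inside check_timestamp.

```python
import pandas as pd

# Hypothetical index with one duplicate, one gap, and one
# out-of-order timestamp.
index = pd.to_datetime([
    "2016-01-01 00:00:00",
    "2016-01-01 00:01:00",
    "2016-01-01 00:01:00",  # duplicate
    "2016-01-01 00:04:00",
    "2016-01-01 00:03:00",  # out of order
])
frequency = 60  # expected frequency, in seconds

duplicates = index[index.duplicated()]
is_monotonic = index.is_monotonic_increasing

# Regular grid from the minimum to the maximum timestamp.
expected = pd.date_range(index.min(), index.max(),
                         freq=pd.Timedelta(seconds=frequency))
missing = expected.difference(index)
```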
-
check_range(bound, key=None, specs={}, rolling_mean=1, min_failures=1)
Check data range.
Parameters:
bound : list of floats
[lower bound, upper bound]. None can be used in place of either bound.
key : string (optional)
Translation dictionary key. If not specified, all columns are used in the test.
specs : dictionary (optional)
Constants used in bound
rolling_mean : int (optional)
Rolling mean window in number of time steps, default = 1
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
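To make the bound and rolling_mean parameters concrete, here is a minimal pandas sketch of a range test in which either bound may be None. The helper out_of_range and the sample data are hypothetical; the actual check_range may differ in details such as how it handles the first rows of a rolling window.

```python
import pandas as pd

def out_of_range(df, bound, rolling_mean=1):
    """Return a boolean DataFrame, True where data falls outside bound."""
    lower, upper = bound
    if rolling_mean > 1:
        # Smooth the data before testing; leading rows become NaN
        # and never compare as out of range.
        df = df.rolling(window=rolling_mean).mean()
    failed = pd.DataFrame(False, index=df.index, columns=df.columns)
    if lower is not None:
        failed |= (df < lower)
    if upper is not None:
        failed |= (df > upper)
    return failed

index = pd.date_range("2016-01-01", periods=5, freq="60min")
df = pd.DataFrame({"temp": [20.0, 21.0, 150.0, 22.0, -5.0]}, index=index)

both = out_of_range(df, [0, 100])           # lower and upper bound
upper_only = out_of_range(df, [None, 100])  # no lower bound
smoothed = out_of_range(df, [0, 100], rolling_mean=2)
```

In this sample, the two-step rolling mean pulls both outliers back inside the bounds, which shows how a larger window makes the test less sensitive to single spikes.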
-
check_increment(bound, key=None, specs={}, increment=1, absolute_value=True, rolling_mean=1, min_failures=1)
Check range on data increments.
Parameters:
bound : list of floats
[lower bound, upper bound]. None can be used in place of either bound.
key : string (optional)
Translation dictionary key. If not specified, all columns are used in the test.
specs : dictionary (optional)
Constants used in bound
increment : int (optional)
Time step shift used to compute difference, default = 1
absolute_value : boolean (optional)
Take the absolute value of the increment data, default = True
rolling_mean : int (optional)
Rolling mean window in number of time steps, default = 1
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
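An increment test applies a range test to the difference between values a fixed number of time steps apart. The sketch below shows the idea with pandas diff; the helper increment_out_of_range is hypothetical and is not the code behind check_increment.

```python
import pandas as pd

def increment_out_of_range(series, bound, increment=1, absolute_value=True):
    """Return True where the change over `increment` steps is outside bound."""
    lower, upper = bound
    delta = series.diff(periods=increment)
    if absolute_value:
        delta = delta.abs()
    failed = pd.Series(False, index=series.index)
    if lower is not None:
        failed |= (delta < lower)
    if upper is not None:
        failed |= (delta > upper)
    return failed

s = pd.Series([1.0, 1.0, 1.0, 5.0, 5.5])

# A lower bound flags stagnant data (change smaller than 0.0001).
stagnant = increment_out_of_range(s, [0.0001, None])
# An upper bound flags abrupt jumps (change larger than 2).
jumps = increment_out_of_range(s, [None, 2])
```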
-
check_missing(key=None, min_failures=1)
Check for missing data.
Parameters:
key : string (optional)
Translation dictionary key. If not specified, all columns are used in the test.
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
-
check_corrupt(corrupt_values, key=None, min_failures=1)
Check for corrupt data.
Parameters:
corrupt_values : list of floats
List of corrupt data values
key : string (optional)
Translation dictionary key. If not specified, all columns are used in the test.
min_failures : int (optional)
Minimum number of consecutive failures required for reporting, default = 1
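Corrupt values are typically sentinel numbers written by a data logger. A minimal pandas sketch of locating them, with hypothetical sample data and sentinel values, could look like this; it is not the code inside check_corrupt.

```python
import pandas as pd

# Hypothetical data where -999 and 999 are logger error codes.
df = pd.DataFrame({"a": [1.0, -999.0, 3.0], "b": [-999.0, 2.0, 999.0]})
corrupt_values = [-999.0, 999.0]

# True wherever a value matches one of the corrupt values.
corrupt_mask = df.isin(corrupt_values)
```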
-
evaluate_string(col_name, string_to_eval, specs={})
Returns the evaluated Python equation written as a string (BETA). Each {keyword} in string_to_eval is first expanded to self.df[self.trans[keyword]]; if that fails, {keyword} is expanded to specs[keyword].
Parameters:
col_name : string
Column name for the new signal
string_to_eval : string
String to evaluate
specs : dictionary (optional)
Constants used as keywords
Returns:
signal : pd.DataFrame or pd.Series
DataFrame or Series with results of the evaluated string
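The keyword expansion described for evaluate_string can be sketched with a regular expression. The helper expand_keywords, the sample data, and the use of a membership test instead of a try/except fallback are assumptions for illustration; the real method operates on self.df and self.trans.

```python
import re
import pandas as pd

def expand_keywords(string_to_eval, trans, specs):
    """Replace each {keyword} with a lookup into trans or, failing that, specs."""
    def replace(match):
        keyword = match.group(1)
        if keyword in trans:
            return "df[trans['%s']]" % keyword
        return "specs['%s']" % keyword
    return re.sub(r"\{(\w+)\}", replace, string_to_eval)

df = pd.DataFrame({"A1": [1.0, 2.0], "A2": [3.0, 4.0]})
trans = {"Wave": ["A1", "A2"]}   # key -> column names (assumed layout)
specs = {"scale": 10}            # constants

expanded = expand_keywords("{Wave} * {scale}", trans, specs)

# eval runs arbitrary code, so it should only see trusted strings.
result = eval(expanded, {"df": df, "trans": trans, "specs": specs})
```

Here {Wave} resolves through the translation dictionary to two columns, while {scale} falls through to specs.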
-
get_elapsed_time()
Returns the elapsed time in seconds for each Timestamp in the DataFrame index.
Returns:
elapsed_time : pd.DataFrame
Elapsed time of the DataFrame index
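Elapsed time can be computed from a DatetimeIndex by subtracting the first timestamp. The sketch below assumes that elapsed time is measured from the first index entry and uses a hypothetical column name; the method itself may define these differently.

```python
import pandas as pd

index = pd.to_datetime([
    "2016-01-01 00:00:00",
    "2016-01-01 00:01:00",
    "2016-01-01 00:05:30",
])

# Seconds since the first timestamp (assumed reference point).
seconds = (index - index[0]).total_seconds()
elapsed_time = pd.DataFrame({"elapsed_time": list(seconds)}, index=index)
```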
-
get_clock_time()
Returns the time of day in seconds past midnight for each Timestamp in the DataFrame index.
Returns:
clock_time : pd.DataFrame
Clock time of the DataFrame index
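Seconds past midnight follow directly from the hour, minute, and second of each timestamp. A minimal pandas sketch, with hypothetical sample timestamps and column name:

```python
import pandas as pd

index = pd.to_datetime([
    "2016-01-01 00:00:30",
    "2016-01-01 06:00:00",
    "2016-01-02 23:59:59",
])

# Time of day in seconds, independent of the calendar date.
seconds = index.hour * 3600 + index.minute * 60 + index.second
clock_time = pd.DataFrame({"clock_time": list(seconds)}, index=index)
```

A clock time like this is convenient for building a time filter, for example keeping only data between two times of day.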
-
get_test_results_mask(key=None)
Return a mask of data-times that failed a quality control test.
Parameters:
key : string (optional)
Translation dictionary key. If not specified, all columns are used
Returns:
test_results_mask : pd.DataFrame
DataFrame containing boolean values for each data point. True = data point passed all tests, False = data point failed at least one test.
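A mask with this True/False convention can be used to blank out failed data points. The sketch below builds such a mask by hand from a hypothetical list of failures and applies it; how PerformanceMonitoring stores its test results internally is not shown here.

```python
import pandas as pd

index = pd.date_range("2016-01-01", periods=3, freq="60min")
df = pd.DataFrame({"temp": [20.0, 500.0, 21.0],
                   "power": [1.0, 1.1, 1.2]}, index=index)

# Hypothetical record of failed tests: (column name, timestamp) pairs.
failures = [("temp", index[1])]

# Start with every point passing, then mark the failures as False.
mask = pd.DataFrame(True, index=df.index, columns=df.columns)
for column, timestamp in failures:
    mask.loc[timestamp, column] = False

# Points where the mask is False are replaced with NaN.
clean = df[mask]
```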