HAlphaAnomalyzer._anova_analysis

Module Contents

Functions

_preprocess_and_filter_data(data_with_ranges)

Computes the S statistic and separates data by anomaly labels.

_anova_ftest(data_with_ranges[, grid_size, ...])

Performs One-way ANOVA F-test on S statistics across grid cells and candidate ranges.

HAlphaAnomalyzer._anova_analysis._preprocess_and_filter_data(data_with_ranges)

Computes the S statistic and separates data by anomaly labels.

This function calculates the S statistic as the sum of absolute deviations between candidate range values and the average pixel values for each grid cell of the training images data. It then separates the data into two DataFrames based on the anomaly label: one for non-anomalous (label 0) and one for anomalous (label 1) data.

Parameters

data_with_ranges : pd.DataFrame

The DataFrame with candidate ranges for each grid cell of the training images data.

Returns

df_label_0 : pd.DataFrame

A DataFrame with computed S statistics for non-anomalous data (label 0).

df_label_1 : pd.DataFrame

A DataFrame with computed S statistics for anomalous data (label 1).
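Examples

The computation described above can be sketched as follows. This is a minimal illustration, not the package's implementation. The column names (cell, avg_pixel, lower, upper, label) are hypothetical, and the reading of "sum of absolute deviations" as the distance from the cell average to each bound of the candidate range is an assumption.

```python
import pandas as pd

# Hypothetical layout: one row per (image, grid cell, candidate range).
# These column names are illustrative, not the package's actual schema.
data = pd.DataFrame({
    "cell": [0, 0, 0, 0],
    "avg_pixel": [30.0, 50.0, 10.0, 95.0],
    "lower": [20, 20, 20, 20],
    "upper": [80, 80, 80, 80],
    "label": [0, 0, 1, 1],
})

# Assumed form of the S statistic: absolute deviation of the cell average
# from the lower bound plus its absolute deviation from the upper bound.
data["S"] = (
    (data["avg_pixel"] - data["lower"]).abs()
    + (data["avg_pixel"] - data["upper"]).abs()
)

# Split by anomaly label: 0 is non-anomalous, 1 is anomalous.
df_label_0 = data[data["label"] == 0]
df_label_1 = data[data["label"] == 1]
```

Under this assumed form, any average that falls inside the candidate range yields the same S (the width of the range), while averages outside the range yield larger values.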

HAlphaAnomalyzer._anova_analysis._anova_ftest(data_with_ranges, grid_size=8, lower_range_end=20, upper_range_start=80, step_size=2)

Performs One-way ANOVA F-test on S statistics across grid cells and candidate ranges.

Using the S statistics, this function computes the One-way ANOVA F-statistic between anomalous and non-anomalous images for each combination of grid cell and candidate range of the training images data.

Parameters

data_with_ranges : pd.DataFrame

The DataFrame with candidate ranges for each grid cell of the training images data.

grid_size : int, optional

The number of rows and columns to divide each image into, by default 8.

lower_range_end : int, optional

The end of candidate lower ranges, by default 20.

upper_range_start : int, optional

The start of candidate upper ranges, by default 80.

step_size : int, optional

The step size for candidate ranges, by default 2.

Returns

df_anova_results : pd.DataFrame

A DataFrame with computed F-statistic for each combination of grid cell and candidate range of the training images data.
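Examples

For a single combination of grid cell and candidate range, the One-way ANOVA F-statistic compares the S values of the two label groups. The sketch below writes out the textbook formula (between-group mean square divided by within-group mean square) in plain Python. The helper name one_way_f and the sample values are hypothetical. scipy.stats.f_oneway computes the same statistic, though whether this module calls it internally is an assumption.

```python
def one_way_f(*groups):
    """Return the one-way ANOVA F-statistic for two or more samples."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # Raises ZeroDivisionError if every group has zero internal variance.
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical S values for one grid cell and one candidate range.
s_non_anomalous = [60.0, 61.0, 59.0]   # label 0
s_anomalous = [80.0, 90.0, 85.0]       # label 1

f_stat = one_way_f(s_non_anomalous, s_anomalous)
```

A larger F-statistic indicates that the candidate range separates the two label groups more strongly for that grid cell.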