:py:mod:`HAlphaAnomalyzer._anova_analysis`
==========================================

.. py:module:: HAlphaAnomalyzer._anova_analysis


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   HAlphaAnomalyzer._anova_analysis._preprocess_and_filter_data
   HAlphaAnomalyzer._anova_analysis._anova_ftest


.. py:function:: _preprocess_and_filter_data(data_with_ranges)

   Computes the S statistic and separates data by anomaly labels.

   This function calculates the S statistic as the sum of absolute
   deviations between candidate range values and the average pixel
   values for each grid cell of the training images data. It then
   separates the data into two DataFrames based on the anomaly label:
   one for non-anomalous (label 0) and one for anomalous (label 1) data.

   Parameters
   ----------
   data_with_ranges : pd.DataFrame
       The DataFrame with candidate ranges for each grid cell of the
       training images data.

   Returns
   -------
   df_label_0 : pd.DataFrame
       A DataFrame with computed S statistics for non-anomalous data
       (label 0).
   df_label_1 : pd.DataFrame
       A DataFrame with computed S statistics for anomalous data
       (label 1).


.. py:function:: _anova_ftest(data_with_ranges, grid_size=8, lower_range_end=20, upper_range_start=80, step_size=2)

   Performs One-way ANOVA F-test on S statistics across grid cells and
   candidate ranges.

   This function calculates the One-way ANOVA F-test statistic between
   anomalous and non-anomalous images for each combination of grid cell
   and candidate range of the training images data using the S
   statistics.

   Parameters
   ----------
   data_with_ranges : pd.DataFrame
       The DataFrame with candidate ranges for each grid cell of the
       training images data.
   grid_size : int, optional
       The number of rows and columns to divide each image into, by
       default 8.
   lower_range_end : int, optional
       The end of candidate lower ranges, by default 20.
   upper_range_start : int, optional
       The start of candidate upper ranges, by default 80.
   step_size : int, optional
       The step size for candidate ranges, by default 2.

   Returns
   -------
   df_anova_results : pd.DataFrame
       A DataFrame with computed F-statistic for each combination of
       grid cell and candidate range of the training images data.
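
The two steps documented above (splitting S statistics by anomaly label, then computing a one-way ANOVA F-statistic for each combination of grid cell and candidate range) can be illustrated with a minimal, standard-library-only sketch. It is not the package's implementation. The record layout, the range names ``"r1"`` and ``"r2"``, the S values, and the helper ``one_way_f`` are all hypothetical and chosen only so the arithmetic is easy to follow.

.. code-block:: python

   from collections import defaultdict
   from statistics import fmean


   def one_way_f(*groups):
       """One-way ANOVA F: between-group over within-group mean square."""
       k = len(groups)                        # number of groups
       n = sum(len(g) for g in groups)        # total number of observations
       grand_mean = fmean(x for g in groups for x in g)
       ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
       ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
       return (ss_between / (k - 1)) / (ss_within / (n - k))


   # Hypothetical records: (grid cell, candidate range, anomaly label, S).
   records = [
       (0, "r1", 0, 1.0), (0, "r1", 0, 2.0), (0, "r1", 0, 3.0),
       (0, "r1", 1, 4.0), (0, "r1", 1, 5.0), (0, "r1", 1, 6.0),
       (0, "r2", 0, 2.0), (0, "r2", 0, 4.0), (0, "r2", 0, 6.0),
       (0, "r2", 1, 3.0), (0, "r2", 1, 5.0), (0, "r2", 1, 7.0),
   ]

   # Collect S values per (cell, range), separated by label (0 or 1).
   by_key = defaultdict(lambda: {0: [], 1: []})
   for cell, cand_range, label, s in records:
       by_key[(cell, cand_range)][label].append(s)

   # One F-statistic per (cell, range): label-0 group versus label-1 group.
   f_stats = {key: one_way_f(g[0], g[1]) for key, g in by_key.items()}

In this toy data, range ``"r1"`` separates the two labels more cleanly than ``"r2"``, so it receives the larger F-statistic. With SciPy available, ``scipy.stats.f_oneway`` computes the same statistic and also returns a p-value.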