PREP can't always flag low amplitude channels as bad-by-deviation #83
> Wouldn't channels with suspiciously low amplitudes already be flagged as bad-by-flat?
They'd only be flagged as bad-by-flat if their standard deviations were extremely low (i.e., practically no signal). This check seems to be for channels that still have a signal, but whose variance is considerably lower than the median variance. Decreasing the z-score threshold to 3.29 (which I think is what you have in mind; it corresponds to a two-tailed test at p < 0.001) would definitely make low-amplitude channels easier to detect, but even then the Z-test still rests on the assumption that the population is normally distributed. Since the population here is a bunch of variance measures (which will be right-skewed), the right thing to do would be either to use a different test that doesn't make that assumption, or to transform the variances so they can be treated as roughly normal.

I guess the most important question here is: how weak or strong should a signal need to be, relative to the other channels in the same dataset, to warrant exclusion from the average reference signal? If we can figure out a good rule of thumb for that, it should be easier to figure out how to improve the stats.
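For reference, the cutoff-to-p-value mapping mentioned above is easy to check with SciPy; this is just the standard normal two-tailed calculation, nothing pyprep-specific:

```python
# Two-tailed p-values for the z cutoffs discussed above,
# assuming a standard normal distribution.
from scipy.stats import norm

for z in (3.29, 5.0):
    print(f"|z| >= {z}: two-tailed p = {2 * norm.sf(z):.2e}")
# |z| >= 3.29 -> p ~ 1.0e-03; |z| >= 5.0 -> p ~ 5.7e-07
```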
I don't think the PREP devs thought of what they did as a statistical test with assumptions to satisfy (though maybe I'm wrong). Rather, I think they tried a bunch of thresholds on known robust metrics and "creative" robust metrics to see what works. Apparently they didn't notice that low-deviation channels aren't detected as easily, but it's also possible they were relying on other methods to catch those channels. If we can improve it, great. However, for your question:

I think that's highly context-dependent, and finding a good metric here could be its own scientific paper. If a threshold of 3.29 works for us for now, let's take it, document it, and stick with it until we find something better.
Probably, but I think it still makes sense to treat it like one: we're testing whether each channel's amplitude meaningfully deviates from all the others, so it's still a statistical procedure (like outlier removal for reaction times), if not exactly a statistical test. Regardless, I think you're right that this deserves some simulation tests: maybe I'll write an R script that reads in the eegbci dataset and visualizes the effects of a few different outlier-exclusion techniques. I'm not likely to touch that for a while, though; fixing the remaining MATLAB inconsistencies definitely takes priority!
Reading back on this, I think the log transform may be a good change indeed. Did you mean changing these lines (`pyprep/find_noisy_channels.py`, lines 288 to 294 at commit 8897804):
to: -chan_amplitudes = _mat_iqr(self.EEGData, axis=1) * IQR_TO_SD
+chan_amplitudes = np.log(_mat_iqr(self.EEGData, axis=1) * IQR_TO_SD )
... ? :-) |
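One wrinkle worth noting with that one-line change: a truly flat channel has an IQR of zero, and `np.log(0)` is `-inf`. Below is a minimal standalone sketch of the transformed computation; `np.percentile` stands in for pyprep's `_mat_iqr`, and the `eps` floor is an assumption added here, not part of the proposal:

```python
import numpy as np

IQR_TO_SD = 0.7413  # approx. 1/1.349, the normal-theory IQR-to-SD factor

def log_channel_amplitudes(eeg_data, eps=1e-12):
    """Log of each channel's robust amplitude; eeg_data is (n_channels, n_samples).

    np.percentile stands in for pyprep's _mat_iqr, and the eps floor
    (an assumption) keeps flat channels finite instead of log(0) = -inf.
    """
    q75, q25 = np.percentile(eeg_data, [75, 25], axis=1)
    amplitudes = (q75 - q25) * IQR_TO_SD
    return np.log(np.maximum(amplitudes, eps))

# Example: 4 channels of noise, one of them completely flat
data = np.random.default_rng(0).normal(size=(4, 1000))
data[2] = 0.0
print(log_channel_amplitudes(data))  # flat channel -> strongly negative value
```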
I've been picking away at refactoring the `NoisyChannels` unit tests so that each 'bad-by' type is covered by a separate test. In the process, though, I think I've discovered a weak spot in the original PREP logic: if I use a different test file, the bad-by-low-deviation test below (`pyprep/tests/test_find_noisy_channels.py`, lines 51 to 60 at commit bc46978) stops working.
This holds true even if I replace the "divide by 10" with a "divide by 100,000": the "robust channel deviation" z-score seems to plateau at around -4, never reaching the +/- 5 threshold for being bad-by-deviation.
Looking at the actual math, this makes sense. The robust channel deviation is computed in two steps: (1) estimate each channel's variance, then (2) robustly z-score those values against the other channels, i.e. `(variance - median_variance) / variance_of_the_variances`. The problem here is that in step 2, the variances calculated in step 1 have a minimum value of zero. As such, even a channel with an amplitude of zero won't be detected as bad-by-deviation if the median isn't at least 5x the variance of the variances.
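To put numbers on that floor, here is a minimal sketch with simulated right-skewed channel variances; the lognormal shape, the seed, and the IQR-based robust z-scoring are stand-ins for illustration, not pyprep's exact code:

```python
import numpy as np

rng = np.random.default_rng(42)
variances = rng.lognormal(mean=0.0, sigma=0.5, size=32)  # right-skewed, like real variances
variances[0] = 0.0                                       # completely dead channel

med = np.median(variances)
spread = (np.percentile(variances, 75) - np.percentile(variances, 25)) * 0.7413
z = (variances - med) / spread

# The dead channel's z-score is capped at exactly -med / spread, which
# for data shaped like this lands around -2: well short of +/- 5.
print(f"dead channel z = {z[0]:.2f} (floor = {-med / spread:.2f})")
```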
A quick-and-dirty fix is to z-score the log of the channel variances, which makes them more normally distributed and thus makes low-amplitude channels much easier to detect. However, this also raises the bar for detecting high-amplitude bad-by-deviation channels: the required multiplication factor for my test channel goes from 2.4 to 3.9.
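For comparison, the same toy setup after a log transform (the 1e-12 floor on the dead channel is an assumption to keep `log(0)` finite):

```python
import numpy as np

rng = np.random.default_rng(42)
variances = rng.lognormal(mean=0.0, sigma=0.5, size=32)
variances[0] = 1e-12  # dead channel, floored to avoid log(0)

log_var = np.log(variances)
med = np.median(log_var)
spread = (np.percentile(log_var, 75) - np.percentile(log_var, 25)) * 0.7413
z = (log_var - med) / spread

# On the log scale the dead channel is a huge negative outlier and
# blows far past the +/- 5 cutoff instead of plateauing near the floor.
print(f"dead channel z (log scale) = {z[0]:.1f}")
```

This is consistent with the trade-off noted above: the log compresses the high end of the scale, so a high-amplitude channel needs a larger multiplication factor to cross the same cutoff.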
Any ideas on how to better handle this?