[BUG] Correct annotation onset for exportation to EDF and EEGLAB #12656

qian-chu · 2024-06-10T20:00:00Z

When cropping the start of a recording, raw.first_time is updated while annotations.onset is conveniently untouched. However, when exporting to another format where times are reset (starting from zero), annotations.onset should be corrected so that they represent relative time from the first sample.

This correction is already performed when fmt='brainvision':

mne-python/mne/export/_brainvision.py

Lines 78 to 85 in e2c8010

    
    if events is not None:
        # subtract raw.first_samp because brainvision marks events starting from the
        # first available data point and ignores the raw.first_samp
        assert isinstance(events, np.ndarray), msg
        assert events.ndim == 2, msg
        assert events.shape[-1] == 3, msg
        events[:, 0] -= raw.first_samp
        events = events[:, [0, 2]]  # reorder for pybv required order

But it is curiously missing when fmt='edf' or fmt='eeglab':

mne-python/mne/export/_edf.py

Lines 200 to 213 in e2c8010

    
    annotations = []
    for desc, onset, duration, ch_names in zip(
        raw.annotations.description,
        raw.annotations.onset,
        raw.annotations.duration,
        raw.annotations.ch_names,
    ):
        if ch_names:
            for ch_name in ch_names:
                annotations.append(
                    EdfAnnotation(onset, duration, desc + f"@@{ch_name}")
                )
        else:
            annotations.append(EdfAnnotation(onset, duration, desc))

mne-python/mne/export/_eeglab.py

Lines 28 to 32 in e2c8010

    
    annotations = [
        raw.annotations.description,
        raw.annotations.onset,
        raw.annotations.duration,
    ]

This PR aims to fix this by performing a similar correction (annotations.onset - raw.first_time).
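To make the intended change concrete, here is a minimal sketch of the EDF annotation loop with the onset shifted by raw.first_time. `AnnotationSketch` and `build_export_annotations` are hypothetical names used only for illustration (edfio's `EdfAnnotation` and the real export code are not used), and the sketch assumes that onsets and first_time are both given in seconds on the same clock.

```python
from typing import NamedTuple, Sequence


class AnnotationSketch(NamedTuple):
    """Hypothetical stand-in for an exported annotation (not edfio's class)."""

    onset: float
    duration: float
    text: str


def build_export_annotations(
    onsets: Sequence[float],
    durations: Sequence[float],
    descriptions: Sequence[str],
    ch_names: Sequence[Sequence[str]],
    first_time: float,
) -> list[AnnotationSketch]:
    """Shift onsets so they are relative to the first exported sample."""
    exported = []
    for onset, duration, desc, chs in zip(onsets, durations, descriptions, ch_names):
        shifted = onset - first_time  # the correction proposed in this PR
        if chs:
            # one entry per channel, mirroring the "@@" suffix used in _edf.py
            exported.extend(
                AnnotationSketch(shifted, duration, f"{desc}@@{ch}") for ch in chs
            )
        else:
            exported.append(AnnotationSketch(shifted, duration, desc))
    return exported


# A recording cropped by 1 s: first_time is 1.0, onsets are still on the old axis.
result = build_export_annotations(
    onsets=[2.0, 3.5],
    durations=[1.0, 0.0],
    descriptions=["stim", "blink"],
    ch_names=[["CH1"], []],
    first_time=1.0,
)
print(result)
```

Without the subtraction, the first annotation would be written at 2.0 s even though the exported file restarts its time axis at zero.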


qian-chu · 2024-06-10T20:04:47Z

mne/export/tests/test_export.py

+@pytest.mark.parametrize("tmin", (0, 1, 5, 10))
+def test_export_raw_eeglab_annotations(tmp_path, tmin):
+    """Test that exporting EEGLAB preserves annotations and corrects for raw.first_time."""
+    pytest.importorskip("eeglabio")
+    raw = read_raw_fif(fname_raw, preload=True)
+    raw.apply_proj()
+    annotations = Annotations(
+        onset=[0.01, 0.05, 0.90, 1.05],
+        duration=[0, 1, 0, 0],
+        description=["test1", "test2", "test3", "test4"],
+        ch_names=[["MEG 0113"], ["MEG 0113", "MEG 0132"], [], ["MEG 0143"]],
+    )
+    raw.set_annotations(annotations)
+    raw.crop(tmin)
+
+    # export
+    temp_fname = tmp_path / "test.set"
+    raw.export(temp_fname)
+
+    # read in the file
+    with pytest.warns(RuntimeWarning, match="is above the 99th percentile"):
+        raw_read = read_raw_eeglab(temp_fname, preload=True, montage_units="m")
+    assert raw_read.first_time == 0
+
+    valid_annot = raw.annotations.onset >= tmin
+    assert_array_almost_equal(
+        raw.annotations.onset[valid_annot] - raw.first_time,
+        raw_read.annotations.onset - raw_read.first_time,
+    )
+    assert_array_equal(
+        raw.annotations.duration[valid_annot], raw_read.annotations.duration
+    )
+    assert_array_equal(
+        raw.annotations.description[valid_annot], raw_read.annotations.description
+    )


Actually, had this annotation test for EEGLAB been written before, the bug would have been noticeable, because the test Raw has in fact been cropped. The saved onset would not correspond to the original onset.


cbrnr · 2024-06-11T05:05:26Z

mne/export/tests/test_export.py

+    if tmin % 1 == 0:
+        expectation = nullcontext()
+    else:
+        expectation = pytest.warns(
+            RuntimeWarning, match="EDF format requires equal-length data blocks"
+        )


What is this check doing? If you are checking for tmin to be an integer, you could also use tmin.is_integer(), but is this what is required to have "equal-length data blocks"?

The constructed raw signal is 2 s long, so edfio can segment it into 2 data records of 1 s each. If a non-integer amount of time is cropped, the signal is no longer a multiple of 1 s, and edfio will append zeroes and issue a RuntimeWarning. Maybe this should have been a test of its own, but I'm adding it here since the test would not pass otherwise.

As for the % 1 == 0 condition, I wanted to leave room for a more flexible check once I know how edfio determines the data record length. For example, if one can specify a data record length of 0.5 or 2 s, the condition can be replaced with % data_length == 0. But I agree it looks unnecessary in its current form.
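The padding argument can be sketched with plain arithmetic. This is not edfio's implementation; `zero_padding_needed` is a hypothetical helper, and the 1000 Hz sampling rate and 1 s record duration are assumptions chosen only to match the description above.

```python
import math
import warnings


def zero_padding_needed(n_samples: int, sfreq: float, record_duration: float = 1.0) -> int:
    """Return how many zero samples would fill the last data record (sketch)."""
    samples_per_record = round(sfreq * record_duration)
    n_records = math.ceil(n_samples / samples_per_record)
    padding = n_records * samples_per_record - n_samples
    if padding:
        warnings.warn(
            f"signal is not a multiple of {record_duration} s; "
            f"{padding} zero samples would be appended",
            RuntimeWarning,
        )
    return padding


# 2 s of data at the assumed 1000 Hz fits exactly into two 1 s records.
full = zero_padding_needed(2000, 1000.0)

# After cropping 0.5 s, 1.5 s remain and the last record is only half full.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    partial = zero_padding_needed(1500, 1000.0)

print(full, partial, len(caught))
```

Passing a different `record_duration` shows how the `% data_length == 0` generalisation mentioned above would behave.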

cbrnr · 2024-06-11T05:07:35Z

When cropping the start of a recording, raw.first_time is corrected while annotations.onset is conveniently untouched.

Hmmm, really? Annotations are not affected by cropping? Why would that be convenient?

qian-chu · 2024-06-11T07:47:02Z

When cropping the start of a recording, raw.first_time is corrected while annotations.onset is conveniently untouched.

Hmmm, really? Annotations are not affected by cropping? Why would that be convenient?

At least when annotations.orig_time is not None:

mne-python/mne/io/base.py

Lines 1560 to 1570 in e2c8010

    
    annotations = self.annotations
    # now call setter to filter out annotations outside of interval
    if annotations.orig_time is None:
        assert self.info["meas_date"] is None
        # When self.info['meas_date'] is None (which is guaranteed if
        # self.annotations.orig_time is None), when we do the
        # self.set_annotations, it's assumed that the annotations onset
        # are relative to first_time, so we have to subtract it, then
        # set_annotations will put it back.
        annotations.onset -= self.first_time
    self.set_annotations(annotations, False)

So that when annotations have their own time reference, cropping the data wouldn't affect them.

Actually, this is a good reminder that we might need to account for different values of annotations.orig_time. The onset correction should also have handled the case where annotations.orig_time is None. I will push a fix later.

cbrnr · 2024-06-11T10:56:45Z

To be honest, I didn't even know that annotations work like that. I always thought that annotations.onset are onsets in seconds relative to the start of the data. So I expected that cropping should shift the onsets in general, not just for export. If that was the case, we would not have to deal with correcting onsets on export in the first place.

qian-chu · 2024-06-11T12:18:37Z

To be honest, I didn't even know that annotations work like that. I always thought that annotations.onset are onsets in seconds relative to the start of the data. So I expected that cropping should shift the onsets in general, not just for export. If that was the case, we would not have to deal with correcting onsets on export in the first place.

I'm not too sure how that will work out. Do you mean that cropping should always reset annotations.onset and then force annotations.orig_time=None?

----------- meas_date=XX, orig_time=YY -----------------------------

     |              +------------------+
     |______________|     RAW          |
     |              |                  |
     |              +------------------+
 meas_date      first_samp
     .
     .         |         +------+
     .         |_________| ANOT |
     .         |         |      |
     .         |         +------+
     .     orig_time   onset[0]
     .
     |                   +------+
     |___________________|      |
     |                   |      |
     |                   +------+
 orig_time            onset[0]'

----------- meas_date=XX, orig_time=None ---------------------------

     |              +------------------+
     |______________|     RAW          |
     |              |                  |
     |              +------------------+
     .              N         +------+
     .              o_________| ANOT |
     .              n         |      |
     .              e         +------+
     .
     |                        +------+
     |________________________|      |
     |                        |      |
     |                        +------+
 orig_time                 onset[0]'

----------- meas_date=None, orig_time=YY ---------------------------

     N              +------------------+
     o______________|     RAW          |
     n              |                  |
     e              +------------------+
               |         +------+
               |_________| ANOT |
               |         |      |
               |         +------+

            [[[ CRASH ]]]

----------- meas_date=None, orig_time=None -------------------------

     N              +------------------+
     o______________|     RAW          |
     n              |                  |
     e              +------------------+
     .              N         +------+
     .              o_________| ANOT |
     .              n         |      |
     .              e         +------+
     .
     N                        +------+
     o________________________|      |
     n                        |      |
     e                        +------+
 orig_time                 onset[0]'

cbrnr · 2024-07-03T14:45:52Z

Maybe it's just me, but I gave up trying to understand how this works. The ASCII diagram is probably meant to be helpful, but for me it is the complete opposite, I have no idea how these different concepts (meas_date, orig_time, first_samp, and whatnot) actually work, sorry.

hoechenberger · 2024-07-03T14:48:33Z

I agree, I've tried several times over the past couple of years to decipher what it's trying to tell me and at one point just gave up. It's just been trial and error for me regarding all things annotations ever since 😅

qian-chu · 2024-07-03T14:54:38Z

Maybe it's just me, but I gave up trying to understand how this works. The ASCII diagram is probably meant to be helpful, but for me it is the complete opposite, I have no idea how these different concepts (meas_date, orig_time, first_samp, and whatnot) actually work, sorry.

It's definitely OK! As I'm re-looking at this PR after some time I'm also struggling to wrap my head around this system. FYI this diagram was copied from https://mne.tools/dev/generated/mne.Annotations.html.

One potential conflict I found is, the diagram says when meas_date=None, orig_time=YY it should result in error, yet in crop() it asserts the following:

mne-python/mne/io/base.py

Lines 1562 to 1563 in 4954672

    
    if annotations.orig_time is None:
        assert self.info["meas_date"] is None

instead of

if self.info["meas_date"] is None:
    assert annotations.orig_time is None

If someone who is familiar with the design could clarify, that would be great. But I can confirm that the EDF and EEGLAB exports will malfunction without correcting for first_time, so eventually we will want this fix.
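The difference between the two assertion directions can be tabulated with a small sketch. It models only the two checks in isolation, not the rest of crop() or set_annotations(); the function names and the placeholder date string are hypothetical.

```python
from itertools import product


def crop_style_check(meas_date, orig_time) -> bool:
    """Check as written in crop(): orig_time None implies meas_date None."""
    return not (orig_time is None and meas_date is not None)


def alternative_check(meas_date, orig_time) -> bool:
    """Check suggested above: meas_date None implies orig_time None."""
    return not (meas_date is None and orig_time is not None)


states = {"set": "2024-01-01", "None": None}  # placeholder values only
rejected_by_crop = []
rejected_by_alternative = []
for (m_label, m), (o_label, o) in product(states.items(), repeat=2):
    if not crop_style_check(m, o):
        rejected_by_crop.append((m_label, o_label))
    if not alternative_check(m, o):
        rejected_by_alternative.append((m_label, o_label))

print("crop() check rejects (meas_date, orig_time):", rejected_by_crop)
print("alternative rejects (meas_date, orig_time):", rejected_by_alternative)
```

In this isolated model, only the alternative check rejects the meas_date=None, orig_time=YY combination that the diagram marks as a crash.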

qian-chu · 2024-12-13T14:57:39Z

Coming back to this issue after some time, I re-confirmed that the problem exists with a minimal example:

import numpy as np
from mne import create_info, Annotations
from mne.io import RawArray, read_raw_brainvision, read_raw_edf, read_raw_eeglab
from mne.viz import set_browser_backend

set_browser_backend('qt')

# Create a raw object sampled at 1000 Hz, all zeros except for 1 s of ones from 2-3 s
data = np.zeros((1, 5000))
data[0, 2000:3000] = 1
scalings = dict(eeg=1)

info = create_info(['CH1'], 1000, ['eeg'])
raw_orig = RawArray(data, info)

annot = Annotations(onset=[2], duration=[1], description=['stim'])
raw_orig.set_annotations(annot)

# Crop raw to 1-5s
raw_orig.crop(1)
fig_orig = raw_orig.plot(scalings=scalings)
fig_orig.grab().save('orig.png')

# Export to BrainVision and re-read
raw_orig.export('test.vhdr')
raw_brainvision = read_raw_brainvision('test.vhdr')
fig_brainvision = raw_brainvision.plot(scalings=scalings)
fig_brainvision.grab().save('brainvision.png')

# Export to EDF and re-read
raw_orig.export('test.edf')
raw_edf = read_raw_edf('test.edf')
fig_edf = raw_edf.plot(scalings=scalings)
fig_edf.grab().save('edf.png')

# Export to EEGLAB and re-read
raw_orig.export('test.set')
raw_eeglab = read_raw_eeglab('test.set')
fig_eeglab = raw_eeglab.plot(scalings=scalings)
fig_eeglab.grab().save('eeglab.png')

Outputs using the current `main` of MNE:

[Figures: original raw array; BrainVision (functions properly); EDF; EEGLAB]

Outputs using the PR branch:

[Figures: original raw array; BrainVision; EDF; EEGLAB]
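The mismatch shown in the figures can also be checked numerically without MNE. The following is a plain-Python sketch of the same scenario (1000 Hz, a segment of ones from 2-3 s, the first second cropped); `onset_to_sample` is a hypothetical helper, and the exported file is assumed to restart its time axis at zero.

```python
sfreq = 1000  # Hz, as in the script above
signal = [0.0] * 5000
for i in range(2000, 3000):
    signal[i] = 1.0  # "stim" segment from 2 s to 3 s

annot_onset = 2.0  # seconds, on the original time axis
crop_start = 1.0   # seconds removed from the beginning

cropped = signal[int(crop_start * sfreq):]  # exported data starts at t = 0
first_time = crop_start


def onset_to_sample(onset_s: float) -> int:
    """Map an onset in seconds to a sample index of the exported data."""
    return round(onset_s * sfreq)


uncorrected = onset_to_sample(annot_onset)             # onset written as-is
corrected = onset_to_sample(annot_onset - first_time)  # onset shifted on export

signal_start = cropped.index(1.0)  # where the segment really begins

print("segment starts at sample", signal_start)
print("uncorrected annotation points at sample", uncorrected)
print("corrected annotation points at sample", corrected)
```

Only the corrected onset lands on the start of the segment; the uncorrected one is late by exactly the cropped duration.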

cbrnr · 2024-12-13T15:52:32Z

Very nice, thanks for this example, this makes the issue really easy to see! Could you take a look at the failing tests?


qian-chu · 2024-12-14T15:50:21Z

mne/export/_edf.py

-        raw.annotations.onset,
+        # subtract raw.first_time because EDF marks events starting from the first
+        # available data point and ignores raw.first_time
+        _sync_onset(raw, raw.annotations.onset, inverse=False),


I also rewrote the tests, and they now pass without issue :D From my side it is in principle good to go.

One final note, though: I'm using _sync_onset, which, in addition to performing annot_start = onset - raw._first_time, also asserts raw.info["meas_date"] == raw.annotations.orig_time. Therefore, users would only be able to export a Raw whose meas_date and orig_time are identical.
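The consequence of that assertion can be illustrated with a small stand-in. `sync_onset_sketch` is hypothetical and only mirrors the two behaviours described above (the subtraction and the equality assertion); it is not MNE's `_sync_onset` and does not model the `inverse` argument.

```python
from datetime import datetime, timezone


def sync_onset_sketch(onsets, first_time, meas_date, orig_time):
    """Hypothetical stand-in mirroring the behaviour described above."""
    # Exporting is refused unless both time references agree.
    assert meas_date == orig_time, "meas_date and orig_time differ"
    return [onset - first_time for onset in onsets]


date = datetime(2024, 6, 10, tzinfo=timezone.utc)

# Identical references: onsets are shifted by first_time.
same_reference = sync_onset_sketch([2.0], 1.0, meas_date=date, orig_time=date)

# Differing references: the assertion stops the export.
try:
    sync_onset_sketch([2.0], 1.0, meas_date=date, orig_time=None)
    mismatch_rejected = False
except AssertionError:
    mismatch_rejected = True

print(same_reference, mismatch_rejected)
```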

correct annotation onset during exportation

b2aa09b

qian-chu requested review from sappelhoff and cbrnr as code owners June 10, 2024 20:00

[pre-commit.ci] auto fixes from pre-commit.com hooks

1643b3c

for more information, see https://pre-commit.ci

qian-chu commented Jun 10, 2024


Create 12656.bugfix.rst

c837852

qian-chu requested review from larsoner, drammock, agramfort and dengemann as code owners June 10, 2024 20:10

qian-chu changed the title ~~Correct annotation onset for exportation to EDF and EEGLAB~~ [BUG] Correct annotation onset for exportation to EDF and EEGLAB Jun 10, 2024

qian-chu and others added 6 commits June 10, 2024 22:47

formatting

bbe6501

[pre-commit.ci] auto fixes from pre-commit.com hooks

1eda973

for more information, see https://pre-commit.ci

formatting func desc

78eb8ac

summary line and desc separation

590cf16

fix typo

970303c

add period

cc96f10

cbrnr reviewed Jun 11, 2024


Merge branch 'main' into correct_annot_export

51b8c1f

qian-chu added 3 commits August 24, 2024 22:20

Merge branch 'mne-tools:main' into correct_annot_export

bd2fa76

use _sync_onset

f8f118a

Merge branch 'mne-tools:main' into correct_annot_export

3b707e4

qian-chu and others added 2 commits December 13, 2024 21:22

correct tests

e556b96

[pre-commit.ci] auto fixes from pre-commit.com hooks

ab728d3

for more information, see https://pre-commit.ci

qian-chu commented Dec 14, 2024


Merge branch 'main' into correct_annot_export

e07de28

qian-chu requested a review from cbrnr December 16, 2024 18:23

qian-chu added 2 commits December 16, 2024 19:23

Merge branch 'main' into correct_annot_export

0a7f2f0

Merge branch 'main' into correct_annot_export

2e40753
