[MRG] Add tiny BIDS test dataset, fix doctests, and run it in CI (#831)

* fix doctests in path.py
* test: use pytest --doctest-modules
* fix some more examples
* make use of _write_json instead of json.dump
* fix fine-calib and crosstalk doctest
* DATA: add tiny_bids test dataset ~800kb
* fix all doctests using new test ds
* add code for generating tiny_bids
* fix example, sphinx warning
* use tiny_bids in bidspath example, fix #829
* write TSV files with newline character
* add missing newlines at end of files
* do not run pytest on examples

  we did not do that before either, but --doctest-modules is now ON and
  that option apparently also runs the examples
* add whatsnew for TSV line end
* properly ignore examples in pytest
* fix head_to_mri upstream API change
* xfail doctest on windows due to / vs \
* BIDSPath __str__ always .as_posix()
* fix op.join -> Path
* root in __repr__ --> posix path or None
* circumvent bids_path.root __str__
* 1) fix pep, 2) fix type hints, 3) fix example ds
* fix type hint
* I should run make pep before pushing
mne-tools · Jul 14, 2021 · 0db039f
1 parent 2d8a721
commit 0db039f

Showing 24 changed files with 722 additions and 100 deletions.
diff --git a/.github/workflows/unit_tests.yml b/.github/workflows/unit_tests.yml
@@ -176,7 +176,14 @@ jobs:
       run: |
         export BIDS_VALIDATOR_VERSION=`bids-validator --version`
         echo Using bids-validator $BIDS_VALIDATOR_VERSION
-        python -m pytest . --cov=mne_bids mne_bids/tests/ mne_bids/commands/tests/ --cov-report=xml --cov-config=setup.cfg --verbose --ignore mne-python
+        python -m pytest . \
+        --doctest-modules \
+        --cov=mne_bids mne_bids/tests/ mne_bids/commands/tests/ \
+        --cov-report=xml \
+        --cov-config=setup.cfg \
+        --verbose \
+        --ignore mne-python \
+        --ignore examples
       shell: bash
     - name: Upload coverage stats to codecov
       if: ${{ matrix.os == 'ubuntu-latest' && matrix.python-version == '3.9' && matrix.bids-validator == 'main' }}
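Note on `--doctest-modules`: with this flag, pytest collects the `>>>` examples in docstrings and compares their output to the text that follows. The sketch below uses only the standard-library `doctest` module to show the two conventions this commit relies on: `...` continuation lines and the `+SKIP` directive. The function `add` is a hypothetical stand-in, not part of MNE-BIDS.

```python
import doctest


def add(a, b):
    """Add two numbers (hypothetical helper, for illustration only).

    A statement spanning several lines needs the ``...`` prefix:

    >>> add(1,
    ...     2)
    3

    An example marked with +SKIP is parsed but never executed:

    >>> add(10, 20)  # doctest: +SKIP
    'never compared'
    """
    return a + b


# Parse the docstring into examples and run them, roughly what a
# doctest collector does for every docstring it finds.
parser = doctest.DocTestParser()
test = parser.get_doctest(add.__doc__, {'add': add}, 'add', None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed)
```

Without the `...` prefix the parser treats the second line as expected output, which is why the old multi-line examples in this commit had to be rewritten.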

diff --git a/doc/whats_new.rst b/doc/whats_new.rst
@@ -26,14 +26,20 @@ Authors
 * `Alex Rockhill`_
 * `Richard Höchenberger`_
 * `Adam Li`_
+* `Eduard Ort`_
 * `Richard Köhler`_ (new contributor)
 * `Jean-Rémi King`_ (new contributor)
+* `Sin Kim`_ (new contributor)
+* `Alexandre Gramfort`_
+* `Mainak Jas`_
+* `Stefan Appelhoff`_
 
 Detailed list of changes
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 Enhancements
 ^^^^^^^^^^^^
+
 - The fields "DigitizedLandmarks" and "DigitizedHeadPoints" in the json sidecar of Neuromag data are now set to True/False depending on whether any landmarks (NAS, RPA, LPA) or extra points are found in raw.info['dig'], by `Eduard Ort`_ (:gh:`772`)
 - Updated the "Read BIDS datasets" example to use data from `OpenNeuro <https://openneuro.org>`_, by `Alex Rockhill`_ (:gh:`753`)
 - :func:`mne_bids.get_head_mri_trans` is now more lenient when looking for the fiducial points (LPA, RPA, and nasion) in the MRI JSON sidecar file, and accepts a larger variety of landmark names (upper- and lowercase letters; ``'nasion'`` instead of only ``'NAS'``), by `Richard Höchenberger`_ (:gh:`769`)
@@ -56,8 +62,9 @@ API and behavior changes
 - The ``raw_to_bids`` command has lost its ``--allow_maxshield`` parameter. If writing a FIFF file, we will now always assume that writing data before applying a Maxwell filter is fine, by `Richard Höchenberger`_ (:gh:`787`)
 - :meth:`mne_bids.BIDSPath.find_empty_room` now first looks for an ``AssociatedEmptyRoom`` field in the MEG JSON sidecar file to retrieve the empty-room recording; only if this information is missing, it will proceed to try and find the best-matching empty-room recording based on measurement date (i.e., fall back to the previous behavior), by `Richard Höchenberger`_ (:gh:`795`)
 - If :func:`mne_bids.read_raw_bids` encounters raw data with the ``STI 014`` stimulus channel and this channel is not explicitly listed in ``*_channels.tsv``, it is now automatically removed upon reading, by `Richard Höchenberger`_ (:gh:`823`)
-- :func:`mne_bids.get_anat_landmarks` was added to clarify and simplify the process of generating landmarks that now need to be passed to :func:`mne_bids.write_anat`; this depreciates the arguments ``raw``, ``trans`` and ``t1w`` of :func:`mne_bids.write_anat`, by `Alex Rockhill`_ and `Alexandre Gramfort`_ (:gh:`827`)
-- :func:`write_raw_bids` now accepts preloaded raws as input with some caveats if the new parameter ``allow_preload`` is explicitly set to ``True``. This enables some preliminary support for uncommon file formats, generated data, processed derivatives etc., by `Sin Kim`_ (:gh:`819`)
+- :func:`mne_bids.get_anat_landmarks` was added to clarify and simplify the process of generating landmarks that now need to be passed to :func:`mne_bids.write_anat`; this deprecates the arguments ``raw``, ``trans`` and ``t1w`` of :func:`mne_bids.write_anat`, by `Alex Rockhill`_ and `Alexandre Gramfort`_ (:gh:`827`)
+- :func:`write_raw_bids` now accepts preloaded raws as input with some caveats if the new parameter ``allow_preload`` is explicitly set to ``True``. This enables some preliminary support for items such as uncommon file formats, generated data, and processed derivatives, by `Sin Kim`_ (:gh:`819`)
+- MNE-BIDS now writes all TSV data files with a newline character at the end of the file, complying with UNIX/POSIX standards, by `Stefan Appelhoff`_ (:gh:`831`)
 
 Requirements
 ^^^^^^^^^^^^
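Note on the TSV change: a minimal sketch of what "a newline character at the end of the file" means in practice, using the standard-library `csv` module. The helper name `write_tsv` and the sample rows are assumptions for illustration; MNE-BIDS uses its own TSV writer, which is not shown here.

```python
import csv
import io


def write_tsv(rows, fileobj):
    """Write rows as tab-separated text, ending every row with a newline."""
    writer = csv.writer(fileobj, delimiter='\t', lineterminator='\n')
    writer.writerows(rows)


# Write into an in-memory buffer so the result can be inspected directly.
buf = io.StringIO()
write_tsv([['name', 'type'], ['Fp1', 'EEG']], buf)
text = buf.getvalue()

# The last row is terminated too, so the file ends with a newline.
print(text.endswith('\n'))
```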

diff --git a/examples/bidspath.py b/examples/bidspath.py
@@ -14,6 +14,9 @@
 # %%
 # Obviously, to start exploring BIDSPath, we first need to import it.
 
+from pathlib import Path
+
+import mne_bids
 from mne_bids import BIDSPath
 
 # %%
@@ -27,13 +30,15 @@
 # consider where to store your data upon BIDS conversion. Again, the intended
 # target folder will be the BIDS root of your data.
 #
-# Let's just pick an arbitrary BIDS root, for the purpose of this
-# demonstration.
+# For the purpose of this demonstration, let's pick the ``tiny_bids`` example
+# dataset that ships with the MNE-BIDS test suite.
 
-bids_root = './my_bids_root'
+# We are using a pathlib.Path object for convenience, but you could just use
+# a string to specify ``bids_root`` here.
+bids_root = Path(mne_bids.__file__).parent / 'tests' / 'data' / 'tiny_bids'
 
 # %%
-# This refers to a folder named `my_bids_root` in the current working
+# This refers to the ``tiny_bids`` folder in the MNE-BIDS test data
 # directory. Finally, let us create a ``BIDSPath``, and tell it about our
 # BIDS root. We can then also query the ``BIDSPath`` for its root.
 
@@ -45,7 +50,7 @@
 # identifiers**. We can either create a new ``BIDSPath``, or update our
 # existing one. The value can be retrieved via the ``.subject`` attribute.
 
-subject = '123'
+subject = '01'
 
 # Option 1: Create an entirely new BIDSPath.
 bids_path_new = BIDSPath(subject=subject, root=bids_root)
@@ -66,7 +71,7 @@
 # information on our experimental session, and try to retrieve it again via
 # ``.session``.
 
-session = 'test'
+session = 'eeg'
 bids_path.update(session=session)
 print(bids_path.session)
 
@@ -78,7 +83,7 @@
 # using `mne_bids.write_raw_bids`. For the sake of this example, however, we
 # are going to specify the data type explicitly.
 
-datatype = 'meg'
+datatype = 'eeg'
 bids_path.update(datatype=datatype)
 print(bids_path.datatype)
 
@@ -110,7 +115,7 @@
 # %%
 # The two entities you can see here are the ``subject`` entity (``sub``) and
 # the ``session`` entity (``ses``). Each entity name also has a value; for
-# ``sub``, this is ``123``, and for ``ses``, it is ``test`` in our example.
+# ``sub``, this is ``01``, and for ``ses``, it is ``eeg`` in our example.
 # Entity names (or "keys") and values are separated via hyphens.
 # BIDS knows a much larger number of entities, and MNE-BIDS allows you to make
 # use of them. To get a list of all supported entities, use:
@@ -129,11 +134,25 @@
 
 # %%
 # As you can see, the ``basename`` has been updated. In fact, the entire
-# **path** has been updated, and the ``ses-test`` folder has been dropped from
+# **path** has been updated, and the ``ses-eeg`` folder has been dropped from
 # the path:
 
 print(bids_path.fpath)
 
+# %%
+# Oops! The cell above produced a ``RuntimeWarning`` that our data file could
+# not be found. That's because we changed the ``run`` and ``session`` entities
+# above, and the ``tiny_bids`` dataset does not contain corresponding data.
+#
+# That shows us that ``BIDSPath`` is doing a lot of guesswork and checking
+# in the background, but note that this may change in the future.
+#
+# For now, let's revert to the last working iteration of our ``bids_path``
+# instance.
+
+bids_path.update(run=None, session='eeg')
+print(bids_path.fpath)
+
 # %%
 # Awesome! We're almost done! Two important things are still missing, though:
 # the so-called **suffix** and the filename **extension**. Sometimes these
@@ -151,7 +170,7 @@
 # ``.tsv``.
 # Let's put our new knowledge to use!
 
-bids_path.update(suffix='meg', extension='fif')
+bids_path.update(suffix='eeg', extension='.vhdr')
 print(bids_path.fpath)
 bids_path
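Note on the example above: the text explains that entity keys and values are joined with hyphens, and the final `update` call adds a suffix and an extension. The sketch below mimics that naming scheme in plain Python. `make_basename` is a hypothetical helper, not the `BIDSPath` implementation, and the `task` value is taken from the `inspect_dataset` doctest in this commit.

```python
def make_basename(entities, suffix=None, extension=None):
    """Join key-value pairs into a BIDS-style file name (illustration only)."""
    # Each entity becomes "key-value"; entities set to None are left out.
    parts = [f'{key}-{value}' for key, value in entities.items()
             if value is not None]
    # The suffix follows the last entity, separated by an underscore.
    if suffix is not None:
        parts.append(suffix)
    return '_'.join(parts) + (extension or '')


name = make_basename({'sub': '01', 'ses': 'eeg', 'task': 'rest'},
                     suffix='eeg', extension='.vhdr')
print(name)
```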
 

diff --git a/mne_bids/inspect.py b/mne_bids/inspect.py
@@ -1,3 +1,9 @@
+"""Inspect and annotate BIDS raw data."""
+# Authors: Richard Höchenberger <[email protected]>
+#          Stefan Appelhoff <[email protected]>
+#
+# License: BSD (3-clause)
+
 from pathlib import Path
 
 import numpy as np
@@ -78,8 +84,12 @@ def inspect_dataset(bids_path, find_flat=True, l_freq=None, h_freq=None,
     Disable flat channel & segment detection, and apply a filter with a
     passband of 1–30 Hz.
 
-    >>> inspect_dataset(bids_path=bids_path, find_flat=False,
-                        l_freq=1, h_freq=30)
+    >>> from mne_bids import BIDSPath
+    >>> root = Path('./mne_bids/tests/data/tiny_bids').absolute()
+    >>> bids_path = BIDSPath(subject='01', task='rest', session='eeg',
+    ...                      suffix='eeg', extension='.vhdr', root=root)
+    >>> inspect_dataset(bids_path=bids_path, find_flat=False,  # doctest: +SKIP
+    ...                 l_freq=1, h_freq=30)
     """
     allowed_extensions = set(ALLOWED_DATATYPE_EXTENSIONS['meg'] +
                              ALLOWED_DATATYPE_EXTENSIONS['eeg'] +

diff --git a/mne_bids/path.py b/mne_bids/path.py
@@ -1,5 +1,6 @@
 """BIDS compatible path functionality."""
 # Authors: Adam Li <[email protected]>
+#          Stefan Appelhoff <[email protected]>
 #
 # License: BSD (3-clause)
 import glob
@@ -13,7 +14,7 @@
 from pathlib import Path
 from datetime import datetime
 import json
-from typing import Optional, Union
+from typing import Optional
 
 import numpy as np
 from mne.utils import warn, logger, _validate_type
@@ -224,37 +225,50 @@ class BIDSPath(object):
 
     Examples
     --------
+    Generate a BIDSPath object and inspect it
+
     >>> bids_path = BIDSPath(subject='test', session='two', task='mytask',
-                             suffix='ieeg', extension='.edf')
+    ...                      suffix='ieeg', extension='.edf')
     >>> print(bids_path.basename)
     sub-test_ses-two_task-mytask_ieeg.edf
     >>> bids_path
-    BIDSPath(root: None,
+    BIDSPath(
+    root: None
+    datatype: ieeg
     basename: sub-test_ses-two_task-mytask_ieeg.edf)
-    >>> # copy and update multiple entities at once
+
+    Copy and update multiple entities at once
+
     >>> new_bids_path = bids_path.copy().update(subject='test2',
-                                                session='one')
+    ...                                         session='one')
     >>> print(new_bids_path.basename)
     sub-test2_ses-one_task-mytask_ieeg.edf
-    >>> # printing the BIDSPath will show relative path when
-    >>> # root is not set
+
+    Printing a BIDSPath will show a relative path when `root` is not set
+
     >>> print(new_bids_path)
     sub-test2/ses-one/ieeg/sub-test2_ses-one_task-mytask_ieeg.edf
-    >>> new_bids_path.update(suffix='channels', extension='.tsv')
-    >>> # setting suffix without an identifiable datatype will
-    >>> # result in a wildcard at the datatype directory level
+
+    Setting `suffix` without an identifiable datatype will make
+    BIDSPath try to guess the datatype
+
+    >>> new_bids_path = new_bids_path.update(suffix='channels',
+    ...                                      extension='.tsv')
     >>> print(new_bids_path)
-    sub-test2/ses-one/*/sub-test2_ses-one_task-mytask_channels.tsv
-    >>> # set a root for the BIDS dataset
-    >>> new_bids_path.update(root='/bids_dataset')
-    >>> print(new_bids_path.root)
+    sub-test2/ses-one/ieeg/sub-test2_ses-one_task-mytask_channels.tsv
+
+    You can set a new root for the BIDS dataset. Let's see what the
+    different properties look like for our object:
+
+    >>> new_bids_path = new_bids_path.update(root='/bids_dataset')
+    >>> print(new_bids_path.root.as_posix())
     /bids_dataset
     >>> print(new_bids_path.basename)
-    sub-test2_ses-one_task-mytask_ieeg.edf
+    sub-test2_ses-one_task-mytask_channels.tsv
     >>> print(new_bids_path)
-    /bids_dataset/sub-test2/ses-one/ieeg/sub-test2_ses-one_task-mytask_ieeg.edf
-    >>> print(new_bids_path.directory)
-    /bids_dataset/sub-test2/ses-one/ieeg/
+    /bids_dataset/sub-test2/ses-one/ieeg/sub-test2_ses-one_task-mytask_channels.tsv
+    >>> print(new_bids_path.directory.as_posix())
+    /bids_dataset/sub-test2/ses-one/ieeg
 
     Notes
     -----
@@ -440,7 +454,7 @@ def suffix(self, value):
         self.update(suffix=value)
 
     @property
-    def root(self) -> Optional[Union[str, Path]]:
+    def root(self) -> Optional[Path]:
         """The root directory of the BIDS dataset."""
         return self._root
 
@@ -477,12 +491,14 @@ def extension(self, value):
 
     def __str__(self):
         """Return the string representation of the path."""
-        return str(self.fpath)
+        return str(self.fpath.as_posix())
 
     def __repr__(self):
         """Representation in the style of `pathlib.Path`."""
+        root = self.root.as_posix() if self.root is not None else None
+
         return f'{self.__class__.__name__}(\n' \
-               f'root: {self.root}\n' \
+               f'root: {root}\n' \
                f'datatype: {self.datatype}\n' \
                f'basename: {self.basename})'
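Note on `.as_posix()`: the doctests compare printed paths character by character, so a path rendered with backslashes on Windows would not match the expected forward-slash output. A minimal sketch with the standard library's pure path classes, which behave the same on every operating system (the folder names are copied from the docstring example above):

```python
from pathlib import PureWindowsPath

# A Windows-flavoured path, built without touching the file system.
win = PureWindowsPath('bids_dataset') / 'sub-test2' / 'ses-one'

# str() uses the native separator of the path flavour ...
native = str(win)

# ... while as_posix() always uses forward slashes.
portable = win.as_posix()

print(native)
print(portable)
```

Returning the POSIX form from `__str__` and `__repr__` keeps the printed output identical across platforms.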
 
@@ -651,13 +667,13 @@ def update(self, *, check=None, **kwargs):
         :func:`mne_bids.BIDSPath`:
 
         >>> bids_path = BIDSPath(subject='test', session='two',
-                                     task='mytask', suffix='channels',
-                                     extension='.tsv')
+        ...                      task='mytask', suffix='channels',
+        ...                      extension='.tsv')
         >>> print(bids_path.basename)
         sub-test_ses-two_task-mytask_channels.tsv
         >>> # Then, one can update this `BIDSPath` object in place
-        >>> bids_path.update(acquisition='test', suffix='ieeg',
-                             extension='.vhdr', task=None)
+        >>> bids_path = bids_path.update(acquisition='test', suffix='ieeg',
+        ...                              extension='.vhdr', task=None)
         >>> print(bids_path.basename)
         sub-test_ses-two_acq-test_ieeg.vhdr
         """
@@ -1147,16 +1163,16 @@ def get_entities_from_fname(fname, on_error='raise'):
     --------
     >>> fname = 'sub-01_ses-exp_run-02_meg.fif'
     >>> get_entities_from_fname(fname)
-    {'subject': '01',
-    'session': 'exp',
-    'task': None,
-    'acquisition': None,
-    'run': '02',
-    'processing': None,
-    'space': None,
-    'recording': None,
-    'split': None,
-    'suffix': 'meg'}
+    {'subject': '01', \
+'session': 'exp', \
+'task': None, \
+'acquisition': None, \
+'run': '02', \
+'processing': None, \
+'space': None, \
+'recording': None, \
+'split': None, \
+'suffix': 'meg'}
     """
     if on_error not in ('warn', 'raise', 'ignore'):
         raise ValueError(f'Acceptable values for on_error are: warn, raise, '
@@ -1398,12 +1414,12 @@ def get_entity_vals(root, entity_key, *, ignore_subjects='emptyroom',
 
     Examples
     --------
-    >>> root = os.path.expanduser('~/mne_data/eeg_matchingpennies')
+    >>> root = Path('./mne_bids/tests/data/tiny_bids').absolute()
     >>> entity_key = 'subject'
     >>> get_entity_vals(root, entity_key)
-    ['05', '06', '07', '08', '09', '10', '11']
+    ['01']
     >>> get_entity_vals(root, entity_key, with_key=True)
-    ['sub-05', 'sub-06', 'sub-07', 'sub-08', 'sub-09', 'sub-10', 'sub-11']
+    ['sub-01']
 
     Notes
     -----
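Note on the `get_entities_from_fname` doctest above: a rough sketch of how a BIDS-style file name can be split back into its parts. `parse_entities` is a hypothetical helper written for this note; unlike the real function it keeps the short keys (`sub`, `ses`, ...), omits absent entities instead of reporting them as `None`, and does no validation.

```python
def parse_entities(fname):
    """Split a BIDS-style file name into key-value pairs and a suffix."""
    # Drop the extension, then split the stem on underscores.
    stem = fname.split('.', 1)[0]
    *pairs, suffix = stem.split('_')
    # Every remaining part has the form "key-value".
    entities = dict(pair.split('-', 1) for pair in pairs)
    entities['suffix'] = suffix
    return entities


parsed = parse_entities('sub-01_ses-exp_run-02_meg.fif')
print(parsed)
```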