Segmentation statistics #106
Hey @Winter-LabHub,

Thanks a lot for your feedback! Using the min/max segment durations could be an interesting metric to validate the number of cluster centers. However, the data used for clustering are not necessarily ordered in time (in fact this is rarely the case, because GFP peaks are often used and/or data from different subjects are mixed). In addition, a small period of noise would completely mess up those metrics. That's why I prefer not to implement it as a metric in the library.

However, we can easily obtain these metrics by computing the segmentation of the fitted data with predict. Note that these distributions are heavily impacted by the temporal smoothing parameters used in predict. A minimal example would look like this:
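A minimal sketch of the idea (assuming ModK has already been fitted, and that segmentation.labels is a 1-D array of state indices in which -1 marks unlabeled samples; names follow the discussion in this thread):

```python
import numpy as np

def segment_durations_ms(labels, sfreq):
    """Return the duration (in ms) of every stable microstate segment.

    ``labels`` is assumed to be a 1-D array of state indices
    (e.g. ``segmentation.labels``), with ``-1`` marking unlabeled samples.
    """
    labels = np.asarray(labels)
    # indices where the label changes mark segment boundaries
    change_points = np.flatnonzero(np.diff(labels)) + 1
    bounds = np.concatenate(([0], change_points, [labels.size]))
    lengths = np.diff(bounds)            # segment lengths in samples
    states = labels[bounds[:-1]]         # state index of each segment
    keep = states != -1                  # drop unlabeled stretches
    return lengths[keep] * 1000.0 / sfreq

# Hypothetical usage (the predict parameter names here are assumptions,
# not a confirmed signature):
# segmentation = ModK.predict(raw, factor=10, half_window_size=10)
# durations = segment_durations_ms(segmentation.labels, raw.info["sfreq"])
# print(durations.min(), durations.mean(), durations.std(), durations.max())
```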
What do you think of this workaround?
Hi @vferat, I see what you are saying, but having a quick output of min, mean, std, and max might be beneficial for figuring out what combination of smoothing parameters works best, on top of some of the other statistics your toolbox already provides. I am working on MEG data, so I have to play around a bit with the parameters: go back, try a different number of states, and try different combinations of smoothing parameters again and again to see what gives me something that starts looking like the literature.

The code you provided works as well; I just figured future users might want a quick output so they can try a bunch of different options with minimal code. The code I wrote was specifically for epoched data, operating on segmentation.labels. It can easily be flattened since it is [epochs, time_points]. I calculate the number of time points (indices) per millisecond from the sampling rate, and after I flatten segmentation.labels I can easily extract all those statistics. But it too is a chunk of code, so it would be more convenient for your code or mine to be integrated into the toolbox as a single function for users going through this exploratory phase of state numbers and smoothing parameter settings.
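A sketch of the epoch-aware variant described above, reusing the segment_durations_ms helper from the previous comment and assuming segmentation.labels has shape [epochs, time_points]:

```python
import numpy as np

def epoch_duration_stats_ms(epoch_labels, sfreq):
    """Min / mean / std / max segment duration (in ms) for epoched labels.

    ``epoch_labels`` is assumed to have shape (n_epochs, n_times); segments
    are measured within each epoch so runs never span an epoch boundary.
    """
    durations = np.concatenate(
        [segment_durations_ms(epoch, sfreq) for epoch in np.asarray(epoch_labels)]
    )
    return {
        "min_ms": durations.min(),
        "mean_ms": durations.mean(),
        "std_ms": durations.std(),
        "max_ms": durations.max(),
    }
```

Measuring durations within each epoch, rather than flattening first, avoids counting a run that happens to continue across an epoch boundary as a single long segment.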
Hello, I have been playing around with this toolbox and really appreciate its segmentation metrics as an aid in finding the right number of states to choose. However, one enhancement that would be truly helpful would be something that tells you what your longest and shortest states are after segmentation. The current metrics do not account for the segmentation, only the ModK, so the different choices of smoothing variables still need to be accounted for, along with the number of states chosen. Given the literature, we should be aiming for 60-150 ms state segment lengths, or at least something that follows the ranges suggested in the literature from Lehmann et al. (1987) to Michel & Koenig (2018).

For my use, I threw together a quick pipeline that reads in the segmentation output from pycrostates and outputs the longest state, shortest state, mean state length, and standard deviation of state lengths, all in milliseconds. I have it here in case you feel like integrating it:
https://github.com/Winter-LabHub/Microstate-min-and-max-length
I also have a permutation pipeline that may be helpful in determining some of the other statistics of state segmentation:
https://github.com/Winter-LabHub/two_sample_Permutation_ttest
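For context, a generic two-sample permutation test on the difference in group means could be sketched like this (an illustrative sketch only, not the linked repository's actual implementation or API):

```python
import numpy as np

def permutation_ttest(x, y, n_permutations=10000, seed=0):
    """Two-sample permutation test on the difference in group means.

    Returns the observed mean difference and a two-sided p-value obtained
    by repeatedly shuffling the group assignments.
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = pooled[:x.size].mean() - pooled[x.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # add-one correction keeps the p-value strictly positive
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value
```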