Implement Ascertainment Corrections #450

Open
SimonGreenhill opened this issue Mar 4, 2024 · 2 comments

Comments

@SimonGreenhill

It would be lovely to have the ascertainment corrections in FilteredAlignment available.

@alexeid
Collaborator

alexeid commented Mar 5, 2024

Sounds good. What exactly would you like to see? I imagine something like:

A ~ PhyloCTMC(tree=tree, L=100, Q=jukesCantor());
D = removeInvariableSites(alignment=A);

or

A ~ PhyloCTMC(tree=tree, L=100, Q=jukesCantor());
D = removeUninformativeSites(alignment=A);
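To make the two proposals concrete, here is a rough Python sketch of what such filters could do to an alignment held as a dict of taxon name to sequence string. The function names are hypothetical Python analogues of the functions suggested above, "uninformative" is assumed to mean parsimony-uninformative (the thread does not define it), and gaps or ambiguity codes are not treated specially.

```python
from collections import Counter


def _select(alignment, keep):
    """Return a copy of the alignment containing only the site indices in `keep`."""
    return {taxon: "".join(seq[i] for i in keep) for taxon, seq in alignment.items()}


def _columns(alignment):
    """One tuple of states per site, in taxon order."""
    return list(zip(*alignment.values()))


def remove_invariable_sites(alignment):
    """Drop sites where every taxon has the same state."""
    keep = [i for i, col in enumerate(_columns(alignment)) if len(set(col)) > 1]
    return _select(alignment, keep)


def is_parsimony_informative(column):
    """Assumed definition: at least two states each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def remove_uninformative_sites(alignment):
    """Drop sites that are not parsimony-informative (this includes invariable sites)."""
    keep = [i for i, col in enumerate(_columns(alignment)) if is_parsimony_informative(col)]
    return _select(alignment, keep)


example = {"t1": "AACA", "t2": "AACA", "t3": "AGTA", "t4": "AGTC"}
print(remove_invariable_sites(example))     # {'t1': 'ACA', 't2': 'ACA', 't3': 'GTA', 't4': 'GTC'}
print(remove_uninformative_sites(example))  # {'t1': 'AC', 't2': 'AC', 't3': 'GT', 't4': 'GT'}
```

In the example, site 1 is constant and site 4 is a singleton, so the first filter removes one site and the second removes two.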

@SimonGreenhill
Author

Nice!

One question: a lot of language data is ascertained on all-zero sites (i.e. we don't record data that is absent in all languages). I guess there could be a 'removeAbsentSites' function too, but then how do you know which state is "absent"? Is it safe to assume "0" is absent, or could you specify it, e.g. D = removeAbsentSites(alignment=A, absentstate="0");?
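As a sketch of the idea only, a filter like this could take the absent state as a parameter with "0" as the default. The Python below is illustrative: `remove_absent_sites` and `absent_state` are hypothetical names mirroring the suggestion above, and missing-data symbols such as "?" are not given any special treatment.

```python
def remove_absent_sites(alignment, absent_state="0"):
    """Drop every site where all taxa carry `absent_state`.

    `alignment` maps taxon name -> sequence string; all sequences are
    assumed to have the same length.
    """
    columns = list(zip(*alignment.values()))
    keep = [i for i, col in enumerate(columns)
            if any(state != absent_state for state in col)]
    return {taxon: "".join(seq[i] for i in keep) for taxon, seq in alignment.items()}


data = {"lang_1": "0110", "lang_2": "0010", "lang_3": "0100"}
print(remove_absent_sites(data))
# {'lang_1': '11', 'lang_2': '01', 'lang_3': '10'}
```

Making the state explicit avoids hard-coding "0", at the cost of one more argument for the common case.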

Currently in BEAST2 this is implemented at the pattern level by using a FilteredAlignment that ignores site 1 (excludefrom="0" excludeto="1"), e.g.:

<data id="x" dataType="standard" name="alignment">
    <sequence id="seq_1" taxon="lang_1" totalcount="2" value="011..."/>
    <sequence id="seq_2" taxon="lang_2" totalcount="2" value="000..."/>
    <!--                                                 HERE ^     -->
</data>

<distribution id="treeLikelihood.x" spec="TreeLikelihood" tree="@Tree.t:x" useAmbiguities="true">
    <data id="orgdata.x" spec="FilteredAlignment" ascertained="true" data="@x" excludefrom="0" excludeto="1" filter="-">
        <userDataType id="TwoStateCovarion.0" spec="beast.evolution.datatype.TwoStateCovarion"/>
    </data>
    ...
</distribution>

...which also means that you could have some weird and wonderful combination of ascertainment if you wanted ("I never bothered to collect 1's in taxon 2 when taxon 1 was already 0"??), but I don't know of anyone using this, so I can't see any drawbacks.
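For anyone reading along, the pattern-level correction is usually described as a conditional likelihood: each site likelihood is divided by the probability that a site is observable at all, i.e. one minus the total probability of the excluded patterns. The Python below is a minimal sketch of that general form under stated assumptions. It is not taken from the BEAST2 source, the function name is made up, and the per-site and per-pattern log-likelihoods are assumed to come from an existing tree-likelihood calculation.

```python
import math


def ascertainment_corrected_loglik(site_logliks, excluded_pattern_logliks):
    """Conditional log-likelihood given that excluded patterns are never observed.

    site_logliks: log-likelihood of each observed site.
    excluded_pattern_logliks: log-likelihood of each pattern that could not
        have been collected (e.g. the all-zero column).
    """
    p_excluded = sum(math.exp(l) for l in excluded_pattern_logliks)
    if not 0.0 <= p_excluded < 1.0:
        raise ValueError("excluded patterns must have total probability in [0, 1)")
    # log(1 - p_excluded), applied once per observed site
    log_correction = math.log1p(-p_excluded)
    return sum(l - log_correction for l in site_logliks)


# Two observed sites with likelihoods 0.2 and 0.1; the all-zero pattern has probability 0.5.
value = ascertainment_corrected_loglik([math.log(0.2), math.log(0.1)], [math.log(0.5)])
print(math.isclose(value, math.log(0.08)))  # True
```

Passing several excluded patterns covers the more unusual ascertainment schemes as well, since their probabilities are simply summed.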
