[TheiaProk wfs] (eventually) upgrade StxTyper version when available & ensure all OPERON output types are captured appropriately #694

Open
kapsakcj opened this issue Dec 16, 2024 · 0 comments


📌 Explain the Request

request 1 - upgrade version

This is an internal request from me, but it would also be desired by our public health partners (CA) and users. It is a low-priority request at the moment, but its priority may be raised pending feedback from CA partners.

Version of StxTyper in theiaprok workflows currently: 1.0.24

Current version available on NCBI StxTyper GitHub: 1.0.27

FYI in the coming months there will be a new version released with minor bug fixes for edge cases. I will reply back when it's available.

StaPH-B/Curtis will make a Docker image available for the new version when it is released. We can copy it to our Google Artifact Registry once it is available.

request 2 - update parsing of results; potentially add new string output to task and workflows

I want to also use this opportunity to ensure that the stxtyper task in theiaprok accurately captures all potential OPERON types.

I accidentally omitted searching for hits with the EXTENDED and AMBIGUOUS operon types, so these hits are not visible to the user unless they look in the stxtyper_report TSV file (the raw output from the tool). They are rarely found, but it would still be good to capture them and notify the end user of the TheiaProk workflow when stxtyper reports them.

From StxTyper README description:

EXTENDED The coding sequence extends beyond the reference stop codon for one or both of the reference proteins

and

AMBIGUOUS StxTyper found an ambiguous base in the query sequence (e.g., N), this could be the result sequencing or assembly error so the user might want to take a closer look at the sequence.
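
For discussion, here is a rough sketch of how the extra operon types could be pulled out of the stxtyper_report TSV and condensed into a single string output. It is only an illustration, not the current task code: the column name `operon`, the helper names `collect_operon_hits` and `summarize`, and the `TYPE:count` output format are all assumptions and would need to be checked against the real report header and the existing task outputs.

```python
import csv
import io

# Operon types that this issue says are currently not surfaced to the user.
OPERON_TYPES_OF_INTEREST = ("EXTENDED", "AMBIGUOUS")


def collect_operon_hits(tsv_text, operon_column="operon",
                        wanted=OPERON_TYPES_OF_INTEREST):
    """Group report rows by operon type, keeping only the wanted types.

    operon_column is an ASSUMED column name; verify it against the
    header of the actual stxtyper_report TSV before relying on it.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = {name: [] for name in wanted}
    for row in reader:
        value = (row.get(operon_column) or "").strip()
        if value in hits:
            hits[value].append(row)
    return hits


def summarize(hits):
    """Build a short string for a hypothetical workflow string output."""
    parts = [f"{name}:{len(rows)}" for name, rows in hits.items() if rows]
    return ",".join(parts) if parts else "None"
```

In a task, the report file contents would be read and passed to `collect_operon_hits`, and the result of `summarize` written to a file that the WDL output section reads back as a string. Returning "None" when nothing is found is just one possible convention.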
