Better RELION integration #56
Comments
Hi Takanori, Yes, we have basic Topaz-Relion integration wrappers for denoising and picking, provided by a contributor, that we are going to add once we have tested them. Once they are in the Topaz repository, we will let you know so that you can try them out and improve them as you wish. Best, |
In addition to what Alex said about RELION wrappers already in the works, I am happy to accept pull requests implementing most of these features as long as they do not change the default topaz interface/behavior. My thoughts on your specific feature requests:
|
@tbepler Thanks for your comment. Sorry, I didn't notice your response.
In |
Yes, The problem could also be addressed by adding an option to |
Hello, Thank you in advance, |
Hi Kevin, Sorry for the delay. There are scripts available here for use as Relion 3.1 plugins: https://github.com/tbepler/topaz/tree/master/relion_run_topaz The denoising scripts are complete. The picking scripts are still under development, so consider them as beta releases. I hope to find time in the next week to finish those. Best, |
run_topaz_pick.py executed from RELION 3.1.0
stops with error:
Any suggestions on what could be the reason and how to make the script functional? |
@PiotrDra This sounds like a python version problem. The f'...' syntax requires python 3.6 or newer. Can you check that your topaz install is using python 3? |
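A quick way to check this is a small script run with the same interpreter that Topaz and the wrapper scripts use. This is only a minimal sketch; the helper name `supports_fstrings` is made up for illustration and is not part of Topaz or RELION.

```python
import sys


def supports_fstrings(version_info=sys.version_info):
    """Return True if this interpreter can parse f'...' literals (Python 3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    # Report the interpreter version that is actually being used.
    print("Python %d.%d" % tuple(sys.version_info[:2]))
    if not supports_fstrings():
        # Older interpreters raise SyntaxError as soon as they parse an f-string.
        sys.exit("This interpreter is older than Python 3.6; f'...' strings will not parse.")
```

The script deliberately uses `%` formatting so that it also runs on Python 2 and early Python 3 interpreters.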
Hi, I have tried to use Topaz but cannot manage to extract particles in Relion using the coordinates from Topaz picking. I used the Relion integration of Topaz to denoise and train on 7689 micrographs, using the particles.star file from my best 3D map as positive labels.
A Relion particle extraction job using the coords_suffix_topazpicks.star file (written by run_topaz_pick.py) and the micrographs.star file as input failed to extract any particles, although many coordinates were written by Topaz picking (stderr: Warning: coordinate file External/job568/__/raw/GridSquare_7115372/Data/FoilHole_8065276_Data_7120079_7120081_20191008_0757_fractions_topazpicks.star does not exist...). Job 568 was the run_topaz_pick.py job.
I think that the issue is point 3 made by @biochem-fan: "Respect the directory structure". My micrographs.star file that I use for run_topaz_pick.py contains the following for each micrograph in the _rlnMicrographName and _rlnCtfImage columns: As you can see, the per-grid-square directory structure is carried through, and since it is not maintained by Topaz, I cannot use the generated coordinates for further processing in Relion. Can you please suggest a workaround for this? I don't have any Python knowledge and have no idea how to fix this issue. |
As a workaround, running extract on each of the directories individually should solve this problem. |
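To organize such per-directory runs, the micrograph paths can first be grouped by their parent directory. The following is a minimal sketch under stated assumptions: the function name `group_by_directory` and the example paths are hypothetical, and how Topaz is then invoked for each group is deliberately left out.

```python
import os
from collections import defaultdict


def group_by_directory(micrograph_paths):
    """Group micrograph paths by parent directory, preserving input order."""
    groups = defaultdict(list)
    for path in micrograph_paths:
        groups[os.path.dirname(path)].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    paths = [
        "raw/GridSquare_1/Data/a.mrc",
        "raw/GridSquare_1/Data/b.mrc",
        "raw/GridSquare_2/Data/a.mrc",
    ]
    for directory, members in group_by_directory(paths).items():
        # One extraction run would be launched here per directory.
        print(directory, len(members))
```

Because each group contains a single directory, two micrographs with the same file name in different directories never end up in the same run.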
Extract can also be run once per micrograph, e.g.
This also allows writing one output file per micrograph. |
commit 752c140 adds support for writing extracted coordinates as one file per micrograph and also adds support for piping the micrograph paths to topaz. |
This is a dirty patch, but it solves the issue of working with images scattered across many sub-directories. When I have time, I will refactor this using my STAR file parser.
diff --git a/relion_run_topaz/run_topaz_pick.py b/relion_run_topaz/run_topaz_pick.py
index 198133e..e8f2d64 100644
--- a/relion_run_topaz/run_topaz_pick.py
+++ b/relion_run_topaz/run_topaz_pick.py
@@ -4,16 +4,22 @@
# This is to run Topaz picker (https://github.com/tbepler/topaz) from Relion as an External job type
# Rafael Fernandez-Leiro 2020 - CNIO - [email protected]
# Alex J. Noble 2020 - NYSBC - [email protected]
+# @biochem_fan 2020
# Run with Relion external job
# Provide executable in the gui: run_topaz_pick.py
# Input micrographs.star
# Provide extra parameters in the parameters tab (scalefactor, trained_model, pick_threshold, select_threshold, skip_pi
+# TODO
+# Earlier error check
+# Number of workers
+# Continue
"""Import >>>"""
import argparse
import os
+import re
+import sys
"""<<< Import"""
"""USAGE >>>"""
@@ -93,13 +99,31 @@ os.system(cmd)
"""make star files >>>"""
#make star files in the right folder
print('Making star files...')
-os.system(str('''relion_star_printtable ''')+inargsMics+str(''' data_micrographs _rlnMicrographName | awk -F"/" 'NR==1{
-tmpdf=open(tmpfile).readline().rstrip('\n')
-outopaz_path=outargsPath+tmpdf+'/'
-os.system(str('mkdir ')+outopaz_path+str(';rm ')+tmpfile)
+os.system('relion_star_printtable %s data_micrographs _rlnMicrographName > %s' % (inargsMics, tmpfile))
+
+basename_to_dir = {}
+for line in open(tmpfile):
+    original_filename = line.rstrip()
+    dirname = os.path.dirname(original_filename)
+    filename = os.path.basename(original_filename)
+    filename_without_ext = filename[:filename.rfind('.')]
+    # strip job path
+    m = re.match(r"[^/]+/job\d+/", dirname)
+    if m:
+        dirname = dirname[m.end():]
+
+    if filename_without_ext in basename_to_dir:
+        sys.stderr.write("ERROR: Sorry, you cannot have two files with the same name, even if they are in different directories\n")
+        sys.exit(-1)
+    basename_to_dir[filename_without_ext] = dirname
+
+os.remove(tmpfile)
+
mic_filenames=list(set([x.split('\t')[0] for x in open(outargsResults2).readlines()[1:]]))
topaz_picks=[x.split('\t') for x in open(outargsResults2).readlines()[1:]]
for name in mic_filenames:
+    outopaz_path=outargsPath+basename_to_dir[name]+'/'
+    os.makedirs(outopaz_path, exist_ok=True)
     star_file=outopaz_path+name+'_topazpicks.star'
     with open(star_file, 'w') as f: |
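The core of the patch is the mapping from a micrograph's base name to its directory with the leading job prefix removed. A standalone sketch of that idea follows; the function name `map_basenames_to_dirs` and the example paths are hypothetical, and it raises an exception on duplicate names instead of exiting, so it is an illustration and not the patch itself.

```python
import os
import re

# Matches a leading "<JobType>/job<NNN>/" prefix such as "MotionCorr/job002/".
JOB_PREFIX = re.compile(r"[^/]+/job\d+/")


def map_basenames_to_dirs(micrograph_names):
    """Map each micrograph base name (no extension) to its job-relative directory."""
    mapping = {}
    for name in micrograph_names:
        dirname = os.path.dirname(name)
        stem = os.path.splitext(os.path.basename(name))[0]
        match = JOB_PREFIX.match(dirname)
        if match:
            # Drop the job prefix so only the original sub-directories remain.
            dirname = dirname[match.end():]
        if stem in mapping:
            raise ValueError("duplicate micrograph name: %s" % stem)
        mapping[stem] = dirname
    return mapping


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    print(map_basenames_to_dirs(["MotionCorr/job002/raw/GridSquare_1/Data/a.mrc"]))
```

As in the patch, two micrographs that share a base name are rejected even when they live in different directories, because the lookup is keyed on the base name alone.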
Hi @biochem-fan and @tbepler . Thanks for the advice, I appreciate it! We've had some PC issues and I haven't tried the new version yet but the patch looks like a good idea. Unfortunately, I've never used one before and don't quite understand how to use it. Should I modify the run_topaz_pick.py script in my Relion directory to match the one above? Should any lines be removed from the original script? I'm also not too clear on the usage in Relion. Can I apply a similar patch to the denoising script and proceed with picking from denoised micrographs.star before running Extraction in Relion using the topaz_picks_scaled.star file (containing the correct directories as part of the micrograph names) and the original (not denoised) micrographs.star file as input? |
Yes.
That being said, if you are not familiar with these things, I recommend that you wait until my patch is tested and incorporated into the official distribution. Regarding denoising: because I don't use denoising myself, it is of lower priority for me. The idea is the same. I hope the original developers work on it. |
Thanks for the reply @biochem-fan |
Hi Lizelle, In our limited tests of training Topaz picking models on raw versus denoised micrographs, we do not see an improvement using denoised micrographs over raw micrographs. So you should use whichever is most convenient for your workflow. Be aware, however, that we strongly advise that you do not use denoised particles for particle alignment. Please refer to the paragraph on the hallucination problem in the Discussion section of the Topaz-Denoise paper: https://www.nature.com/articles/s41467-020-18952-1 Best, |
Hi All,
|
In the next major update of RELION (3.2, not 3.1.x; hopefully early next year), the Topaz wrapper will be integrated into an AutoPick job rather than run as an External job type. It is currently being tested in-house. With that, the problems associated with directories and "Continue" should be solved. Meanwhile, please use the above patch. @LizelleLL, thanks for the feedback and testing. |
@scheres and I are interested in better RELION integration of Topaz.
Several things we would like are:
Note that there is a limitation in the length of command line arguments a shell can accept.
RELION's ManualPicker writes this and particle displayer and Extract job require this, instead of what topaz convert supports now.
For example, a user might have Dataset1/001.mrc and Dataset2/001.mrc. Currently Topaz only looks at the file name, so these two get mixed up.
This is useful for an automatic processing loop.
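Regarding the command-line length limit mentioned above, one workaround on the wrapper side is to split a long list of micrograph paths into batches that each stay under a byte budget. This is a minimal sketch; the function name `batch_by_length` and the budget value are assumptions. On POSIX systems the actual limit for arguments plus environment can be queried with `os.sysconf("SC_ARG_MAX")`, and a budget well below that value is safer.

```python
def batch_by_length(paths, max_bytes):
    """Split paths into batches whose encoded length stays within max_bytes.

    Each path is charged its UTF-8 length plus one byte for the separator.
    A single path longer than max_bytes still gets a batch of its own.
    """
    batches, current, size = [], [], 0
    for path in paths:
        cost = len(path.encode("utf-8")) + 1
        if current and size + cost > max_bytes:
            # The current batch is full; start a new one.
            batches.append(current)
            current, size = [], 0
        current.append(path)
        size += cost
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical file names and budget, for illustration only.
    names = ["Dataset1/%03d.mrc" % i for i in range(1, 6)]
    for batch in batch_by_length(names, max_bytes=40):
        print(batch)
```

Reading the list of paths from a file or from standard input avoids the limit altogether, which is why having that option inside Topaz is attractive.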
Some of these can be implemented outside Topaz as a separate converter or a wrapper, but I think it is more efficient to have them inside Topaz itself. For example, a wrapper can make a new working directory, make symbolic links to the relevant files, and call Topaz, but this can easily get messy.
@alexjnoble Are you working on any of them? (I saw your tweet: https://twitter.com/alexjamesnoble/status/1267000205838364673) If you are too busy to work on them, I can try myself and send a pull request. Do you have something you don't want to have inside Topaz?