When is my model fully trained? #71
And if you have an H&E staining, how do you generate the transcriptional maps? How do I make xfuse run 'my trained model'? First I run `xfuse convert image` (the help lists options such as `--rotate / --no-rotate` and `--debug`). But after that, do you have an example of how to do this?
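For reference, a rough sketch of what the conversion step might look like (`--scale`, `--no-mask`, and `--rotate`/`--no-rotate` come up in this thread; `--image` and `--save-path` are assumptions that should be checked against `xfuse convert image --help`):

```sh
# Convert a plain H&E image (no expression data) so the trained model can predict on it.
# Flag names other than --scale and --no-rotate are assumptions; verify with --help.
xfuse convert image \
    --image he-section.jpg \
    --scale 0.15 \
    --no-rotate \
    --save-path data/he-section
```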
Hi Nicolaas, sorry for my late reply. Yes, the gene_maps analysis is run after the model is fully trained. You can just resume from your final checkpoint and the analysis should start immediately. You can run it on a subset of genes by specifying the gene_regex option in the gene_maps section of the config toml file. After converting the image, it should be possible to include it just as a normal experiment in your config toml file.
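A minimal sketch of what those two config changes could look like (the section names, keys, and paths are illustrative and should be matched against the example config in the xfuse README rather than copied verbatim):

```sh
# Append illustrative sections to the config file; adapt names and paths to your setup.
cat >> my-config.toml <<'EOF'
# Run the gene_maps analysis only for genes matching a regular expression.
[analyses.gene_maps]
type = "gene_maps"
[analyses.gene_maps.options]
gene_regex = "^(Pecam1|Cd3e|Epcam)$"   # hypothetical gene subset

# Register the converted H&E-only image as just another slide/experiment.
[slides.he-section]
data = "data/he-section/data.h5"
[slides.he-section.covariates]
section = 2
EOF
```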
I indeed have a checkpoint folder, so I assume I should resume from the following file: epoch-00100000.session. But what is the command to actually resume? I'm sorry, this is probably obvious for a trained AI specialist, but it's the first time I'm doing deep learning, so I'm clueless here.
You can load the session file by running the same command as you did when you trained the model. No worries at all, this is definitely not a standard thing and I should have provided more details :)
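In other words, something along these lines (the `--session` flag and the checkpoint path are assumptions on my part; confirm the exact way to pass a saved session with `xfuse run --help`):

```sh
# Re-run with the same config and save path, resuming from the final checkpoint.
# Since training is already complete, the configured analyses (e.g. gene_maps) start right away.
xfuse run my-config.toml \
    --save-path my-run \
    --session my-run/checkpoints/epoch-00100000.session
```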
It's working - great. Does this part also have to run on a GPU, or is that only necessary to speed up model training? This is important because my HPC only has so many GPU nodes, and we're asked not to use them unless necessary.
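If the analysis step turns out to run acceptably on CPU, one generic way to keep it off the GPU nodes is to hide the CUDA devices from PyTorch; this is a standard PyTorch/CUDA environment variable, not an xfuse-specific option, and the command below is the same assumed resume invocation as above:

```sh
# Hide all CUDA devices so the process falls back to CPU.
CUDA_VISIBLE_DEVICES="" xfuse run my-config.toml \
    --save-path my-run \
    --session my-run/checkpoints/epoch-00100000.session
```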
It's working - although I'm pretty sure it should not look like this... The image was converted at scale 0.3, and the gene map jpgs are rather small (25 kB). So I tried scale 1.0, but that one crashes because of memory. So now I tried scale 0.5, and that seems to work (always applying --no-mask). Preliminary results show that it seems to work; however, it produces a rather strange rim around the edge of the image. Why? Is this normal? I'm pretty sure I still have to tweak all the parameters (this will be for after my holiday), but I'm happy I got this far.
I usually aim for a scale factor that results in an image of the Visium array that is roughly 2000 x 2000 px. Basically, you want the nuclei to be discernible, but at the same time you don't want the resolution to be too high. When producing gene maps for new samples, it is also important to use the same scale that the model was trained on.
The rim could make sense, since we usually see diffusion along the edges of the tissue in Visium. On the other hand, I think the predicted expression in this case looks abnormally high outside the tissue. Usually, xfuse will learn to associate inside/outside with high/low expression values if the tissue masking didn't fail during the conversion step. You can check that the tissue masking looks good on your training data files using this script: https://github.com/ludvb/xfuse/blob/c420abb013c02f44120205ac184c393c14dcd14d/scripts/visualize_tissue_masks.py (see the sketch below for an example invocation).
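I don't have the script's exact usage at hand, so the invocation below is only an assumed sketch; check the argument handling at the top of the script for the real interface:

```sh
# Assumed usage: point the script at the data files produced by `xfuse convert`
# and inspect whether the detected tissue masks look reasonable.
python visualize_tissue_masks.py data/section1/data.h5 data/section2/data.h5
```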
👍 My feeling is that tweaking the scaling could perhaps be the most important factor moving forward. Do let me know how it goes and if you need anyone to bounce ideas with :)
So: what scale do I have to use here? I used a scale of 0.15. If I reduce the picture to 15% in Photoshop, I get this image. It's 3000x3000 pixels, and the nuclei are visible. Does this mean I now need to scale all the pictures (that the model was NOT trained on) at 0.15? Because the pictures I want to predict spatial transcription for do not have as good a resolution. Still fine in my opinion: here is a small piece of it (the whole figure is 5300x3300 pixels).
Yep, this looks like a good resolution.
You want the µm/px to be constant. So if the resolution of a new image is, for example, half that of the images you trained on, the --scale should be twice as big.
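A small worked example of that rule, with made-up numbers for the pixel sizes:

```sh
# effective µm/px after conversion = native µm/px / --scale
# Training data: scanned at 0.5 µm/px, converted with --scale 0.15 -> 0.5 / 0.15 ≈ 3.3 µm/px
# New image scanned at 1.0 µm/px (half the resolution): 1.0 / scale = 3.3 -> scale ≈ 0.30
awk 'BEGIN { target = 0.5 / 0.15; printf "use --scale of about %.2f\n", 1.0 / target }'
```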
Hello again Ludvig,
I got XFuse running on the supercomputer (thanks for the help!!).
There is a time limit of 24 hrs on the GPU node, so I stopped and restarted a few times during the first part (100k epochs), but managed to run it entirely in about 3-4 full days.
Then at the end it started to make the gene_maps, but I just didn't make it in time: the gene_maps for the last 2000 or so genes were not created because the process was aborted due to the time-out.
So my question is: is my model already fully trained? Or does this gene_maps script need to be fully executed, perhaps with a follow-on script that needs to be run?
Or is gene_maps just a script that is run AFTER the model is built? If so, how can I run it separately?
Thank you for your support
Nicolaas