How to speed up the process for new files? #241

mostafa8026 · 2021-09-25T14:08:01Z

Is there a way to improve this technique? When I add a new file, I have to redo everything to find similarities. Is there a way to speed up the process of adding new files?

mostafa8026 · 2021-09-25T14:11:08Z

Any suggestions for implementing it myself would be appreciated. Thanks!

duhaime · 2021-09-25T14:32:53Z

@mostafa8026 Good question!

The data processing pipeline has a few steps, the first of which transforms each image into a vector. The image vectors are computed and cached (in outputs/data/image-vectors) and so can be read directly after the first run, which should greatly expedite processing.
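To illustrate the caching idea, here is a minimal sketch of a per-image vector cache. It is not PixPlot's actual code: the function name `load_or_compute_vectors`, the `vectorize` callable, and the JSON-per-image cache layout are all assumptions made for the example. The point is only that images already in the cache are read from disk, so the expensive vectorization runs just for newly added files.

```python
import json
import os
from typing import Callable, Dict, List, Sequence


def load_or_compute_vectors(
    image_paths: Sequence[str],
    cache_dir: str,
    vectorize: Callable[[str], Sequence[float]],
) -> Dict[str, List[float]]:
    """Return one vector per image, computing only those not yet cached.

    `vectorize` is a hypothetical stand-in for whatever function turns an
    image path into a feature vector (e.g. a neural network forward pass).
    """
    os.makedirs(cache_dir, exist_ok=True)
    vectors: Dict[str, List[float]] = {}
    for path in image_paths:
        # Assumption: file names are unique, so the basename can key the cache.
        cache_path = os.path.join(cache_dir, os.path.basename(path) + ".json")
        if os.path.exists(cache_path):
            # Cache hit: skip the expensive vectorization step.
            with open(cache_path) as f:
                vectors[path] = json.load(f)
        else:
            # Cache miss: compute the vector once and store it for later runs.
            vec = [float(x) for x in vectorize(path)]
            with open(cache_path, "w") as f:
                json.dump(vec, f)
            vectors[path] = vec
    return vectors
```

With a cache like this, adding one image to a collection of thousands costs a single call to `vectorize` on the next run rather than thousands.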

It's also worth noting that you can use a GPU to accelerate the creation of those image vectors. See the sections of the README on CUDA acceleration if that's an option for you.

From there, we need to project the vectors down to 2D for visualization. Right now we create a new UMAP model for this projection each time a user runs the pixplot command. But we could cache the model from the first run and then use it for subsequent runs. The tradeoff here is between model accuracy and performance: using a cached model will make the data less expressive and could hide some patterns that are latent in the distribution, but will run faster, while creating a new model on each run maximizes data expressivity but slows down processing.
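A rough sketch of the cached-model idea follows. It is deliberately library-agnostic and is not PixPlot's implementation: `project_with_cached_model`, `model_path`, and the `fit`/`transform` callables are hypothetical names. The model is fit and pickled on the first run, then loaded and reused to project whatever vectors are passed in on later runs.

```python
import os
import pickle
from typing import Any, Callable


def project_with_cached_model(
    vectors: Any,
    model_path: str,
    fit: Callable[[Any], Any],
    transform: Callable[[Any, Any], Any],
) -> Any:
    """Project `vectors` using a model cached at `model_path`.

    `fit(vectors)` must return a picklable fitted model, and
    `transform(model, vectors)` must return the projected coordinates.
    Both are supplied by the caller.
    """
    if os.path.exists(model_path):
        # Later runs: reuse the model fit on the first run.
        # Only unpickle files you created yourself; pickle can run code.
        with open(model_path, "rb") as f:
            model = pickle.load(f)
    else:
        # First run: fit a fresh model and save it for subsequent runs.
        model = fit(vectors)
        with open(model_path, "wb") as f:
            pickle.dump(model, f)
    return transform(model, vectors)


# Assumed usage with umap-learn (requires `import umap`; not run here):
#   coords = project_with_cached_model(
#       vectors,
#       "umap-model.pkl",  # hypothetical path
#       fit=lambda X: umap.UMAP(n_components=2).fit(X),
#       transform=lambda model, X: model.transform(X),
#   )
```

Because the cached model never sees the new images during fitting, their positions reflect only the structure learned from the first run. Deleting the cached file forces a fresh fit when that matters more than speed.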

If you're interested in the idea, check out the UMAP docs on projecting new data with an extant model. We have some code for saving models and loading saved models you could consult if you wanted to try using cached models when processing data. If that sounds interesting, please feel free to send a PR and we'll be happy to review and help it get accepted!
