Is there a way to improve this technique? When I add a new file, I have to redo everything to find similarities. Is there a way to speed up the process of adding new files?
The data processing pipeline has a few steps, the first of which transforms each image into a vector. The image vectors are computed and cached (in outputs/data/image-vectors) and so can be read directly after the first run, which should greatly expedite processing.
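The read-through cache described above can be sketched roughly as follows. This is a minimal illustration, not PixPlot's actual code: the `featurize` callback and the exact file naming are hypothetical stand-ins, though the cache directory matches the one mentioned above.

```python
import os
import numpy as np

CACHE_DIR = "outputs/data/image-vectors"

def get_vector(image_path, featurize):
    """Return the feature vector for an image, computing it only once.

    `featurize` is a hypothetical callback that runs the (slow) neural
    network on one image and returns a numpy array.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    name = os.path.splitext(os.path.basename(image_path))[0] + ".npy"
    cache_path = os.path.join(CACHE_DIR, name)
    if os.path.exists(cache_path):
        return np.load(cache_path)   # cache hit: skip the model entirely
    vec = featurize(image_path)      # cache miss: run the model once
    np.save(cache_path, vec)
    return vec
```

On every run after the first, each image resolves to a cheap `np.load`, which is why only newly added files pay the featurization cost.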
It's also worth noting that one can use a GPU to accelerate the creation of those image vectors. See the sections of the README on CUDA acceleration if that's an option for you.
From there, we need to project the vectors down to 2D for visualization. Right now we create a new UMAP model for this projection each time a user runs the pixplot command, but we could cache the model from the first run and reuse it for subsequent runs. The tradeoff is between accuracy and performance: reusing a cached model runs faster but makes the projection less expressive, so some patterns latent in the distribution may not surface, while fitting a new model on each run maximizes expressivity but slows down processing.
If you're interested in the idea, check out the UMAP docs on projecting new data with an extant model. We have some code for saving models and loading saved models you could consult if you wanted to try using cached models when processing data. If that sounds interesting, please feel free to send a PR and we'll be happy to review and help it get accepted!