index_checkpoint_files walks a given path to find checkpoint files, and organises the segments of each checkpoint path into tags.
One might want to call this many times on the same top-level checkpoint directory, to analyse checkpoint data while the program is still running and new checkpoints are being added, for example when a checkpoint is made at regular time intervals with the timestamp used as a tag.
If there are a lot of checkpoint files (e.g. hundreds), re-walking the whole tree on every call is wasteful. One could index just a subdirectory of the top-level checkpoint directory, but then not all of the tags would be found, because the tags are taken from the path.
Is there a way to update the checkpoint index incrementally, based on diffs in the file tree? For example, if I reindex once per timestep, the indexer would search only the checkpoints for that timestep and add them to an existing index, while still knowing all of the tags.
Of course as soon as I wrote this... is it much simpler than I thought? As long as you give the full path starting from the top-level dir, even if it's a path to a subdirectory, it will be able to read all the tags in that path?
I think the problem with that is: you need to know the structure of the path, the ordering and value of the tags. Whereas in the use-case I have, I don't have information that specific, and I want to query what the tags are.
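To make the idea concrete, here is a rough sketch of what an incremental update could look like. It does not use the library's `index_checkpoint_files` at all. `index_subtree`, `update_index`, the `.ckpt` suffix and the dict-based index are all hypothetical names made up for illustration, and it assumes the tag that changes between runs (e.g. the timestamp) is the first directory level under the top-level directory. Only the top-level directory is listed on each update, so the caller does not need to know the tag values in advance.

```python
import os
from pathlib import Path


def index_subtree(root, subdir, index=None, suffix=".ckpt"):
    """Walk only `subdir`, but derive tags from the path relative to `root`.

    Hypothetical helper: `index` maps each checkpoint's relative path to a
    tuple of the directory segments between `root` and the file.
    """
    root = Path(root)
    index = {} if index is None else index
    for dirpath, _dirnames, filenames in os.walk(subdir):
        for name in filenames:
            if name.endswith(suffix):
                rel = (Path(dirpath) / name).relative_to(root)
                # Every directory segment above the file is treated as a tag.
                index[rel] = rel.parent.parts
    return index


def update_index(root, index, seen, suffix=".ckpt"):
    """Index only the first-level subdirectories that are not in `seen`.

    Assumption: new checkpoints always arrive in a new first-level directory.
    Returns the names of the subdirectories that were newly indexed.
    """
    root = Path(root)
    with os.scandir(root) as entries:
        new = sorted(e.name for e in entries if e.is_dir() and e.name not in seen)
    for name in new:
        index_subtree(root, root / name, index, suffix)
        seen.add(name)
    return new
```

Calling `update_index(root, index, seen)` after each timestep would then only walk the directories that appeared since the last call. One caveat of this sketch: a directory is marked as seen the first time it is listed, so files written into it afterwards would be missed.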