The tutorial says the following: "Any MapDataset can be turned into a IterDataset by calling to_iter_dataset. When possible this should happen late in the pipeline since it will restrict the transformations that can come after it (e.g. global shuffle must come before). This conversion by default skips None elements."
However, calling map_dataset.to_iter_dataset() returns a PrefetchIterDataset object. What is the problem? This object is not an iterator: next() fails on it, and we can't get state from it either. So it is hard to understand why the tutorial describes to_iter_dataset this way, and why the tutorial never actually uses it.
Example for reproduction:
# !pip install -q jax-ai-stack[grain]==2024.11.1
import chex
import grain.python as pygrain


class Source(pygrain.RandomAccessDataSource):
    def __init__(self, x: chex.Array, y: chex.Array) -> None:
        assert len(x) == len(y), "must be the same length"
        self.x = x
        self.y = y

    def __len__(self) -> int:
        # Use the stored attribute, not the module-level x.
        return len(self.x)

    def __getitem__(self, idx: int) -> tuple[chex.Array, chex.Array]:
        return self.x[idx], self.y[idx]


seed = 0  # not defined in the original snippet; any integer works
x = range(10)
y = [0] * 5 + [1] * 5
data_source = Source(x, y)
dataset = (
    pygrain.MapDataset.source(data_source)
    .shuffle(seed=seed)
    .map(lambda x: x)
    .batch(batch_size=5)
)

# from the tutorial, works fine
iter_dataset = iter(dataset)
print(iter_dataset.get_state())
print(next(iter_dataset))

# creates PrefetchIterDataset with to_iter_dataset()
iter_dataset2 = dataset.to_iter_dataset()
print(type(iter_dataset2))

# The next two calls each raise, so run them one at a time.
# AttributeError: 'PrefetchIterDataset' object has no attribute 'get_state'
print(iter_dataset2.get_state())
# TypeError: 'PrefetchIterDataset' object is not an iterator
print(next(iter_dataset2))

# works fine
iter_dataset2 = iter(iter_dataset2)
print(iter_dataset2.get_state())
print(next(iter_dataset2))
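
For what it's worth, the errors above look like the usual Python split between an iterable and an iterator: the object returned by to_iter_dataset() can be passed to iter(), but next() and get_state() only work on what iter() returns. The sketch below reproduces that pattern in plain Python so it can be run without Grain. ToyIterDataset and ToyIterator are hypothetical names made up for this illustration; this is not Grain's implementation, only an analogy under the assumption that PrefetchIterDataset follows the same protocol.

```python
class ToyIterator:
    """Hypothetical stand-in for the iterator: it owns the position and the state."""

    def __init__(self, data: list[int]) -> None:
        self._data = data
        self._next_index = 0

    def __iter__(self) -> "ToyIterator":
        # Iterators return themselves from __iter__.
        return self

    def __next__(self) -> int:
        if self._next_index >= len(self._data):
            raise StopIteration
        value = self._data[self._next_index]
        self._next_index += 1
        return value

    def get_state(self) -> dict[str, int]:
        # Only the iterator knows how far iteration has progressed.
        return {"next_index": self._next_index}


class ToyIterDataset:
    """Hypothetical stand-in for the dataset: iterable, but holds no position."""

    def __init__(self, data: list[int]) -> None:
        self._data = data

    def __iter__(self) -> ToyIterator:
        # Each call starts a fresh, independent iterator.
        return ToyIterator(self._data)


toy = ToyIterDataset([10, 20, 30])

# next() on the dataset itself fails, because it defines no __next__.
try:
    next(toy)
except TypeError as err:
    print(f"TypeError: {err}")

# After iter(), both next() and get_state() are available.
toy_iterator = iter(toy)
print(toy_iterator.get_state())
print(next(toy_iterator))
print(toy_iterator.get_state())
```

If this analogy holds, iter(dataset.to_iter_dataset()) is the expected usage, and the remaining question is only whether the tutorial should say so explicitly.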