How to use thd format qkv with cp + packed_seq_params #1368

Wraythh · 2024-12-12T04:03:09Z

I have a dataset with sequence lengths [4, 8, 6, 10] and use cp2 (context parallelism with cp_size = 2) to split the data. I observe that TE divides cu_seqlen_q by cp_size. This means I need to split each subsequence in the packed sequence into two halves and concatenate the halves for each rank, so that each CP rank holds subsequences of lengths [2, 4, 3, 5]. In this case, should I pass the full cu_seqlen_q, [0, 4, 12, 18, 28], to both CP ranks, or is there an issue with this usage?
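To make the arithmetic concrete, here is a minimal sketch in plain Python. It only reproduces the numbers in my question and is not TE code: the helper `cu_seqlens` and the variable names are hypothetical, and the division by cp_size mirrors what I observed rather than a documented API contract. Note that the cumulative offsets for these lengths end at 28, the total token count.

```python
from itertools import accumulate


def cu_seqlens(seq_lens):
    """Cumulative offsets with a leading 0, e.g. [4, 8] -> [0, 4, 12].

    Hypothetical helper for illustration; not part of TE.
    """
    return [0] + list(accumulate(seq_lens))


seq_lens = [4, 8, 6, 10]  # lengths of the packed subsequences
cp_size = 2               # context-parallel world size (cp2)

# Offsets over the full, unsplit packed sequence.
global_cu = cu_seqlens(seq_lens)

# Lengths each CP rank holds if every subsequence is split evenly
# (assumes every length is divisible by cp_size).
per_rank_lens = [n // cp_size for n in seq_lens]

# Dividing the global offsets by cp_size, as observed.
per_rank_cu = [c // cp_size for c in global_cu]

print(global_cu)      # [0, 4, 12, 18, 28]
print(per_rank_lens)  # [2, 4, 3, 5]
print(per_rank_cu)    # [0, 2, 6, 9, 14]
```

Dividing the global offsets by cp_size gives the same result as building the offsets from the per-rank lengths [2, 4, 3, 5], which is why passing the global cu_seqlen_q to both ranks looks consistent with that division.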
