Since we need to pass max_M to the kernel, is it OK to pass a max_M and then run the kernel with an input that is smaller than max_M?
self.ag_gemm_op = flux.AGKernel(
    get_tp_group().device_group,
    1,  # One node
    8192,  # Max M. TODO: Pass in correctly.
    weight.shape[0],  # N
    weight.shape[1],  # K
    # TODO: Pass in input dtype correctly.
    # TODO: It would be nicer to modify flux to dispatch based on dtype
    # at run time, but I don't know what the downside would be.
    # Similar comment for max M.
    torch.float16,
    torch.float16,
    # Note: transpose_weight=False means that B is transposed.
    transpose_weight=False,
    # Note: if local_copy=True, I hit the following runtime error:
    # /flux/src/all_gather/ths_op/all_gather_gemm_kernel.cc:648
    # Check failed: 33554432((input.numel() * input.element_size()))
    # == 139836453421056((this->chunk_size))
    local_copy=False,
)
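To make the question concrete, here is a minimal sketch of the intended usage: build the op once with a large max_M, then call it with a smaller input. This is only an illustration; the tp_group setup, the weight shape, and in particular the forward(input, weight) call are assumptions on my part, not the documented flux API.

    import torch
    import torch.distributed as dist
    import flux  # the comm-overlap library providing AGKernel

    # Sketch only: assumes a torchrun-style launch with one GPU per rank.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank())
    tp_group = dist.group.WORLD  # stand-in for get_tp_group().device_group

    max_m = 8192  # maximum M the kernel is sized for
    weight = torch.randn(4096, 2048, dtype=torch.float16, device="cuda")  # [N, K] placeholder
    n, k = weight.shape

    ag_gemm_op = flux.AGKernel(
        tp_group,
        1,              # one node
        max_m,          # max M
        n,
        k,
        torch.float16,  # input dtype
        torch.float16,  # output dtype
        transpose_weight=False,
        local_copy=False,
    )

    # The question: run the same op with M = 1024 < max_m.
    x = torch.randn(1024, k, dtype=torch.float16, device="cuda")
    out = ag_gemm_op.forward(x, weight)  # assumed signature; the real flux API may differ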
Hi, and thank you for the great work. To restate the title: is it OK to construct the kernel with a given max_M and then run it with inputs whose M is smaller than max_M?