What does this PR do? It aligns the qwen2vl `kv_seq_len` calculation with the new qwen2 code. Before submitting: This PR fixes a typo or improves the docs (you can dismiss the other checks if...
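The alignment described above boils down to computing `kv_seq_len` as the length of the freshly projected keys plus whatever is already in the KV cache. A minimal sketch of that pattern follows; the cache class and method here are hypothetical stand-ins, not the actual transformers `Cache` API:

```python
# Minimal sketch of the kv_seq_len computation pattern.
# SimpleCache is a hypothetical stand-in, not the real transformers Cache API.
class SimpleCache:
    def __init__(self, cached_len: int = 0):
        self.cached_len = cached_len

    def get_usable_length(self, new_len: int) -> int:
        # Return how many cached key/value positions precede the new tokens.
        return self.cached_len


def compute_kv_seq_len(key_len: int, past_key_value=None) -> int:
    # Start from the length of the newly projected key states...
    kv_seq_len = key_len
    # ...and extend by the usable length already stored in the cache.
    if past_key_value is not None:
        kv_seq_len += past_key_value.get_usable_length(kv_seq_len)
    return kv_seq_len


print(compute_kv_seq_len(4, SimpleCache(cached_len=10)))  # 14
print(compute_kv_seq_len(4))                              # 4
```

With this shape, the prefill case (no cache) and the decode case (cache populated) share one code path, which is the kind of consistency the PR is after.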
Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these: add the `ready` label to the PR, or enable auto-merge. 🚀
libinta added the synapse1.18 label Oct 2, 2024. Merge branch 'transformers_future' into falcon_gptneox_fix (487865c, verified). libinta reviewed optimum/habana/transformers/generation/utils.py on Oct 2, 2024 ...
drisspg mentioned this issue Nov 20, 2024: CuDNN SDPA Issue Tracker #141133 (open). drisspg added the module: sdpa label (all things related to torch.nn.functional.scaled_dot_product_attention) and removed the module: multi-headed-attention label Nov 27, 2024.
lzhangzz added the Bug:P1 label May 20, 2024. fix lint (3792151). lvhan028 requested a review from irexyc May 21, 2024 14:07. irexyc approved these changes May 22, 2024. lvhan028 reviewed src/turbomind/kernels/attention/test_atte... on May 22, 2024.
Example

```nim
# --gc:orc
{.experimental: "views".}

type
  Bar = object
    placeholder: int

  Foo = object
    placeholder: int
    c: seq[Bar] # remove this line to make things right

func children*(s: seq[Foo]): openArray[Foo] =
  s.toOpenArray(0, s.len-1)

pro...
```
max_seq_len_to_capture=8192,
disable_custom_all_reduce=False,
tokenizer_pool_size=0,
tokenizer_pool_type='ray',
tokenizer_pool_extra_config=None,
limit_mm_per_prompt={'image': 4},
enable_lora=False,
max_loras=1,
max_lora_rank=16,
lora_extra_vocab_size=256,
lora_dtype='auto',
lon...
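A setting like `limit_mm_per_prompt={'image': 4}` caps how many items of each modality a single prompt may carry. The checker below is a hypothetical illustration of that contract, not vLLM's actual implementation:

```python
# Hypothetical sketch of enforcing a per-prompt multimodal limit,
# in the spirit of limit_mm_per_prompt={'image': 4}; not vLLM's real code.
def check_mm_limits(prompt_items: dict, limits: dict) -> None:
    for modality, items in prompt_items.items():
        allowed = limits.get(modality, 0)
        if len(items) > allowed:
            raise ValueError(
                f"Prompt has {len(items)} {modality} items, "
                f"but at most {allowed} are allowed."
            )


limits = {"image": 4}
check_mm_limits({"image": ["img_a", "img_b"]}, limits)  # OK: 2 <= 4
try:
    check_mm_limits({"image": ["img"] * 5}, limits)
except ValueError as exc:
    print(exc)
```

Unlisted modalities default to a limit of 0 in this sketch, so an unexpected modality is rejected rather than silently accepted.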
label=i18n("Maximum tokens per batch, 0 means no limit"),
minimum=0,
maximum=2048,
-    value=1024,  # 0 means no limit
+    value=0,  # 0 means no limit
step=8,
)
@@ -505,7 +505,7 @@ def parse_args():
enable_reference_audio=False,
...
Test name: test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16 (__main__.TestSDPACudaOnlyCUDA) ...
pytorchmergebot added the merging label Oct 22, 2024. Collaborator pytorchmergebot commented Oct 22, 2024: Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team ...