The change below worked for me. Add the code below inside def _calc_final_dist(self, vocab_dists, attn_dists); details are in the comments.
# OOV part of the vocab is max_art_oov entries long. Not all sequences in a batch will have max_art_oov OOV tokens.
# That will cause some entries to be 0 in...
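I don't have the rest of the original patch, but the point of the comments is that the unused article-OOV slots stay at exactly 0 and later blow up in the loss. A minimal sketch of that kind of fix, assuming a PyTorch-style port where the final distribution is a [batch, vocab_size + max_art_oov] tensor (the function name and epsilon value are my own, not from the source):

import torch

def stabilize_final_dist(final_dist, epsilon=1e-10):
    # final_dist: [batch, vocab_size + max_art_oov]. Sequences with fewer than
    # max_art_oov in-article OOVs leave their unused OOV slots at exactly 0,
    # and log(0) in the NLL loss then produces NaN. Adding a small epsilon and
    # renormalizing keeps every probability strictly positive.
    final_dist = final_dist + epsilon
    return final_dist / final_dist.sum(dim=1, keepdim=True)

Applying this to each distribution returned by _calc_final_dist (or clamping the gold probabilities just before the log) avoids the NaN without noticeably changing the distribution.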
"img2img_inpaint_sketch_default_brush_color": "#ffffff", "return_mask": false, "return_mask_composite": false, "cross_attention_optimization": "Automatic", "s_min_uncond": 0.0, "token_merging_ratio": 0.0, "token_merging_ratio_img2img": 0.0, "token_merging_ratio_hr": 0.0, "pad_...
in T5ForConditionalGeneration.construct(self, input_ids, attention_mask, decoder_input_ids, decoder_attention_mask, head_mask, decoder_head_mask, cross_attn_head_mask, encoder_outputs, past_key_values, inputs_embeds, decoder_inputs_embeds, labels, use_cache, output_attentions, output_...
"attention_mask": attn_masks.numpy()[:, 1:-2]}) # save as mindrecord file # define columns nlp_schema = { "input_ids": {"type": "int64", "shape": [-1]}, "attention_mask": {"type": "int64", "shape": [-1]}, } mr_writer = FileWriter(file_name, shard_num=...
copying xformers/benchmarks/benchmark_attn_decoding.py -> build/lib.macosx-11.1-arm64-cpython-310/xformers/benchmarks
copying xformers/benchmarks/benchmark_multi_head_dispatch.py -> build/lib.macosx-11.1-arm64-cpython-310/xformers/benchmarks
...
Memory Efficient attention on Navi31 GPU is still experimental. Enable it with TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1. (Triggered internally at /pytorch/aten/src/ATen/native/transformers/hip/sdp_utils.cpp:251.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
Model Response: Explain...
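To actually try the experimental path, the variable has to be visible before PyTorch picks an SDPA backend; exporting it in the shell (TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 python your_script.py) is the safe route, and setting it at the very top of the script usually works as well. A sketch, with arbitrary shapes, dtype, and device:

import os
os.environ["TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL"] = "1"  # set before the first attention call

import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# With the flag set, the memory-efficient backend may be selected on Navi31;
# without it, the warning above is emitted and PyTorch falls back to another backend.
attn_output = F.scaled_dot_product_attention(q, k, v)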
[GitHub repository listing, branch main: .github, 3rdparty, assets, benchmarks (benchmark_generation_mamba_simple.py), csrc, evals, mamba_ssm, tests, .gitignore, .gitmodules, AUTHORS, LICENSE, README.md, setup.py. Latest commit: tridao, "Implement repetition-penalty for generation", Dec 20, 2023.]