## Mamba Block

The main module of this repository is the Mamba architecture block wrapping the selective SSM.

Source: modules/mamba_simple.py

Usage:

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim,  # Model dimension d_model
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # Local convolution width
    expand=2,     # Block expansion factor
).to("cuda")
y = model(x)
```
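Because the block maps a `(batch, length, dim)` input to an output of the same shape, it composes like a Transformer layer. The sketch below is not code from this repository: the `SequenceModel` wrapper, its pre-norm residual layout, and the layer count are illustrative assumptions layered on top of the `Mamba` module shown above.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba

class SequenceModel(nn.Module):
    """Illustrative wrapper (not part of this repo): a pre-norm residual stack of Mamba blocks."""

    def __init__(self, d_model: int, n_layers: int = 4):
        super().__init__()
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(n_layers))
        self.blocks = nn.ModuleList(Mamba(d_model=d_model) for _ in range(n_layers))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each Mamba block preserves the (batch, length, dim) shape,
        # so blocks compose with residual connections like Transformer layers.
        for norm, block in zip(self.norms, self.blocks):
            x = x + block(norm(x))
        return x

model = SequenceModel(d_model=16).to("cuda")
y = model(torch.randn(2, 64, 16, device="cuda"))
assert y.shape == (2, 64, 16)
```

Per-block hyperparameters (`d_state`, `d_conv`, `expand`) can be passed through to `Mamba(...)` exactly as in the single-block example above.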
## Selective SSM

Mamba is based on a selective SSM layer, which is the focus of the paper (Section 3; Algorithm 2).

Source: ops/selective_scan_interface.py
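For intuition, the selective scan computes a recurrence of the form h_t = Ā_t h_{t-1} + B̄_t x_t, y_t = C_t h_t, where the discretized parameters Ā_t = exp(Δ_t A) and B̄_t = Δ_t B_t, along with C_t, vary per timestep (the "selection"). Below is a minimal, unfused pure-PyTorch sketch of that recurrence for illustration only; it is not the kernel in ops/selective_scan_interface.py, and the tensor shapes are simplified assumptions.

```python
import torch

def selective_scan_reference(u, delta, A, B, C):
    """Illustrative, unoptimized selective scan (simplified shapes, not the repo's fused kernel).

    u:     (batch, length, d)  input sequence
    delta: (batch, length, d)  per-step, per-channel step sizes (input-dependent selection)
    A:     (d, n)              diagonal state matrix per channel
    B:     (batch, length, n)  input projection, varying over time
    C:     (batch, length, n)  output projection, varying over time
    """
    b, l, d = u.shape
    n = A.shape[1]
    h = u.new_zeros(b, d, n)                  # hidden state per channel
    ys = []
    for t in range(l):
        dt = delta[:, t].unsqueeze(-1)        # (b, d, 1)
        A_bar = torch.exp(dt * A)             # zero-order-hold discretization of A
        B_bar = dt * B[:, t].unsqueeze(1)     # simplified (Euler) discretization of B
        h = A_bar * h + B_bar * u[:, t].unsqueeze(-1)
        y = (h * C[:, t].unsqueeze(1)).sum(-1)  # project state to output, (b, d)
        ys.append(y)
    return torch.stack(ys, dim=1)             # (b, l, d)

# Tiny smoke test with random parameters
b, l, d, n = 2, 8, 4, 16
y = selective_scan_reference(
    torch.randn(b, l, d),
    torch.rand(b, l, d),
    -torch.rand(d, n),
    torch.randn(b, l, n),
    torch.randn(b, l, n),
)
assert y.shape == (b, l, d)
```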
## Evaluations

To run zero-shot evaluations with the lm-evaluation-harness:

```sh
lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba-2.8b-slimpj \
    --tasks boolq,piqa,hellaswag,winogrande,arc_easy,arc_challenge,openbookqa,race,truthfulqa_mc2 \
    --device cuda --batch_size 256
```
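As a quick sanity check alongside the harness run, the same checkpoint can be loaded directly in Python. The sketch below assumes `MambaLMHeadModel.from_pretrained` from this package and the GPT-NeoX-20B tokenizer pairing used by the state-spaces checkpoints; the exact `generate` arguments and return type may differ across versions, so treat it as a sketch rather than a verified recipe.

```python
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"
# Assumption: the state-spaces checkpoints pair with the GPT-NeoX-20B tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-2.8b-slimpj", device=device, dtype=torch.float16
)

prompt = "The selective state space model"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# Assumption: generate() accepts input_ids and max_length as in the repo's generation
# utilities; depending on the version it may return a tensor or an object with .sequences.
out = model.generate(input_ids, max_length=64)
sequences = out.sequences if hasattr(out, "sequences") else out
print(tokenizer.batch_decode(sequences))
```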