```
adam_beta1 ........ 0.9
adam_beta2 ........ 0.999
adam_eps .......... 1e-08
add_bias_linear ... False
add_gate .......... True
adlr_autoresume ...
```
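These names match the standard PyTorch Adam hyperparameters. As a minimal sketch of how they would map onto `torch.optim.Adam` (the model and learning rate below are placeholders, not values from the excerpt):

```python
import torch

# Hypothetical mapping of the logged arguments onto PyTorch's Adam:
# adam_beta1/adam_beta2 become the `betas` tuple, adam_eps becomes `eps`.
model = torch.nn.Linear(16, 16)  # placeholder model
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,                 # lr is not shown in the excerpt; placeholder
    betas=(0.9, 0.999),      # adam_beta1, adam_beta2
    eps=1e-08)               # adam_eps
```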
```python
import torch

# The guard was truncated in the original; selecting the optimizer from the
# config is an assumed reconstruction.
if config.optimizer == "adam":
    optimizer = torch.optim.Adam(
        params=trainer.model.parameters(),
        lr=config.learning_rate,
        weight_decay=config.weight_decay)
else:
    optimizer = torch.optim.SGD(
        params=trainer.model.parameters(),
        lr=config.learning_rate,
        weight_decay=config.weight_decay,
        momentum=0.9)
# The scheduler name was cut off at "Cosine..."; CosineAnnealingLR is the
# torch scheduler whose name starts that way (T_max assumed from the config).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=config.num_epochs)
```
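A short usage sketch of the objects built above, assuming the scheduler is stepped once per epoch (the usual `CosineAnnealingLR` pattern) and that `train_one_epoch` is a hypothetical helper:

```python
# Hypothetical per-epoch loop around the optimizer and scheduler above.
for epoch in range(config.num_epochs):
    train_one_epoch(trainer.model, optimizer)  # hypothetical helper
    scheduler.step()  # anneals the learning rate along the cosine curve
```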
The learning rate for training was 5 × 10⁻⁴, and the activation function was a non-adaptive sin. This PINN architecture was trained using 50,000 Adam epochs, followed by 10,000 L-BFGS-B epochs. In every epoch, 1600 points were uniformly sampled from the ...
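A minimal sketch of this two-stage schedule in PyTorch. The network sizes, domain, and PDE residual below are placeholders (the sampling domain is truncated in the text), and since torch ships only plain `LBFGS`, it stands in here for the L-BFGS-B variant the text names:

```python
import torch

class PINN(torch.nn.Module):
    """MLP with the stated non-adaptive sin activation (sizes assumed)."""
    def __init__(self, width=64, depth=4):
        super().__init__()
        layers = [torch.nn.Linear(1, width)]
        layers += [torch.nn.Linear(width, width) for _ in range(depth - 1)]
        layers.append(torch.nn.Linear(width, 1))
        self.layers = torch.nn.ModuleList(layers)

    def forward(self, x):
        for layer in self.layers[:-1]:
            x = torch.sin(layer(x))  # non-adaptive sin activation
        return self.layers[-1](x)

def residual_loss(model, n_points=1600):
    # 1600 collocation points drawn uniformly each epoch; the domain [0, 1]
    # and the PDE u' = cos(x) are placeholders for the truncated text.
    x = torch.rand(n_points, 1, requires_grad=True)
    u = model(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return (du - torch.cos(x)).pow(2).mean()

model = PINN()

# Stage 1: 50,000 Adam epochs at the stated learning rate of 5e-4.
adam = torch.optim.Adam(model.parameters(), lr=5e-4)
for _ in range(50_000):
    adam.zero_grad()
    loss = residual_loss(model)
    loss.backward()
    adam.step()

# Stage 2: 10,000 quasi-Newton iterations via torch's LBFGS, standing in
# for the L-BFGS-B optimizer named in the text.
lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=10_000)

def closure():
    lbfgs.zero_grad()
    loss = residual_loss(model)
    loss.backward()
    return loss

lbfgs.step(closure)
```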