```
optim_c = AdaFactor([weight], betas=(0, 0.999), scale_parameter=False)
```
is close enough to https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adafactor
```
optim = keras.optimizers.Adafactor(learning_rate=0.01)
```
The three results, respectively, for the same randomly generated...
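A minimal sketch of the kind of comparison described here, stepping both implementations once on the same randomly generated weight and fixed gradient. It assumes `torch.optim.Adafactor` (available in recent PyTorch, 2.5+) as a stand-in for the `AdaFactor` port in the snippet above, so its constructor arguments differ from the snippet's `betas=(0, 0.999), scale_parameter=False`; the Keras side uses the documented Keras 3 API from the linked page.
```
import numpy as np
import torch
import keras

init = np.random.default_rng(0).normal(size=(4, 4)).astype("float32")
grad = np.ones_like(init)

# PyTorch side: stand-in for the AdaFactor port above.
w_pt = torch.nn.Parameter(torch.tensor(init))
opt_pt = torch.optim.Adafactor([w_pt], lr=0.01)
w_pt.grad = torch.tensor(grad)
opt_pt.step()

# Keras side: documented API from the linked page.
w_k = keras.Variable(init)
opt_k = keras.optimizers.Adafactor(learning_rate=0.01)
opt_k.apply_gradients([(keras.ops.convert_to_tensor(grad), w_k)])

# Compare the first update from each implementation.
print(np.abs(w_pt.detach().numpy() - keras.ops.convert_to_numpy(w_k)).max())
```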
Implements the AdaHessian algorithm from "ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning".

Arguments:
    params (iterable): iterable of parameters to optimize or dicts defining parameter groups
    lr (float, optional): learning rate (default: 0.1)
    betas ((float, float), optional): co...
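Since AdaHessian estimates the Hessian diagonal with Hutchinson-style probes inside `step()`, the backward pass has to keep the autograd graph alive. A minimal usage sketch, assuming the `AdaHessian` class documented above takes its second backward pass inside `step()` (everything else is standard PyTorch):
```
import torch

model = torch.nn.Linear(10, 1)
criterion = torch.nn.MSELoss()
# AdaHessian is the optimizer class whose docstring appears above.
optimizer = AdaHessian(model.parameters(), lr=0.1)

x, y = torch.randn(32, 10), torch.randn(32, 1)
for _ in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    # create_graph=True keeps the graph so step() can run the second
    # backward pass needed for the Hutchinson Hessian-trace estimate.
    loss.backward(create_graph=True)
    optimizer.step()
```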
```
assert hyper_params.learning_rate is not None, 'no learning rate provided.'
learning_rate = hyper_params.learning_rate
beta1 = hyper_params.beta1
decay_rate = hyper_params.decay_rate
step_offset = hyper_params.step_offset
multiply_by_parameter_scale = hyper_params.multiply_by_parameter_scale
...
```
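For context, a sketch of how these hyperparameters typically enter the Adafactor update in the Flax-style implementation this fragment comes from: `decay_rate` and `step_offset` set the second-moment decay schedule, and `multiply_by_parameter_scale` rescales the step by the parameter's RMS. The helper names and the `epsilon2` floor below are illustrative, not the exact source.
```
import jax.numpy as jnp

def decay_rate_pow(i, exponent):
    # Second-moment decay schedule: beta2_t = 1 - (t + 1)^(-exponent);
    # step_offset shifts the step count i fed into this schedule.
    t = jnp.asarray(i, jnp.float32) + 1.0
    return 1.0 - t ** (-exponent)

def update_scale(param, learning_rate, multiply_by_parameter_scale,
                 epsilon2=1e-3):
    scale = learning_rate
    if multiply_by_parameter_scale:
        # Scale the step by the parameter RMS, floored at epsilon2, so
        # updates are relative to the parameter's own magnitude.
        param_rms = jnp.sqrt(jnp.mean(jnp.square(param)))
        scale = scale * jnp.maximum(param_rms, epsilon2)
    return scale
```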