And the warmup does indeed work: if I set num_warmup=0, the output becomes biased towards the initial value. This is quite bad, because it makes it seem that NUTS can achieve good results with a very small number of gradient evaluations, giving it an unfair advantage over ...
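
To make the bias from skipping warmup concrete, here is a minimal sketch. It is not NUTS: it uses a plain random-walk Metropolis sampler in pure Python as a stand-in, and the function names, the standard-normal target, the initial value of 20 and the step size are all assumptions chosen for illustration. It also only discards the warmup draws, whereas warmup in NUTS implementations typically also tunes the step size, so treat it as a toy picture of the effect rather than a reproduction of the experiment above.

```python
import math
import random


def metropolis_chain(log_density, init, num_warmup, num_samples, step=0.5, seed=0):
    """Random-walk Metropolis (a stand-in for NUTS, not NUTS itself).

    Returns the draws kept after warmup and the total number of
    log-density evaluations, warmup included.
    """
    rng = random.Random(seed)
    x = init
    logp = log_density(x)
    num_evals = 1
    kept = []
    for i in range(num_warmup + num_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_prop = log_density(proposal)
        num_evals += 1
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < logp_prop - logp:
            x, logp = proposal, logp_prop
        # Draws made during warmup are discarded, but their cost is still counted.
        if i >= num_warmup:
            kept.append(x)
    return kept, num_evals


def std_normal_logpdf(x):
    # Hypothetical target: standard normal, up to an additive constant.
    return -0.5 * x * x


init = 20.0  # deliberately far from the target's mean of 0

no_warmup, evals_no_warmup = metropolis_chain(
    std_normal_logpdf, init, num_warmup=0, num_samples=200
)
with_warmup, evals_with_warmup = metropolis_chain(
    std_normal_logpdf, init, num_warmup=2000, num_samples=200
)

mean_no_warmup = sum(no_warmup) / len(no_warmup)
mean_with_warmup = sum(with_warmup) / len(with_warmup)

print("mean without warmup:", mean_no_warmup, "evaluations:", evals_no_warmup)
print("mean with warmup:   ", mean_with_warmup, "evaluations:", evals_with_warmup)
```

Without warmup, the kept draws still contain the chain's walk from the initial value to the bulk of the target, so their mean is pulled towards 20. With warmup the mean lands near 0, but the run costs 2201 density evaluations instead of 201. That is the sense in which a fair comparison should count the evaluations spent in warmup and not only those spent on the reported samples.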