For reddit URLs corresponding to content up to October 2018 we arrived at approximately 37GB of content.ReproducibilityMegatron training is intended to be bitwise reproducible. This means that the same training config run twice in the same HW and SW environment should produce identical model ...