The dataset contains hourly records of electricity demand and temperature measurements from the first 8 weeks of 2014. The following plot shows the training set, which contains the measurements from the first 6 weeks. We now build a model in which demand depends linearly on temperature, ...
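As a rough illustration of the setup described above, the sketch below fits demand = intercept + slope * temperature by ordinary least squares on the first 6 weeks and evaluates it on the remaining 2 weeks. The real dataset is not reproduced here: the arrays are synthetic stand-ins, and the names `fit_linear`, `predict`, and the coefficients used to generate the data are assumptions made purely for illustration.

```python
import math
import random
from statistics import fmean

HOURS_PER_WEEK = 7 * 24


def fit_linear(temperature, demand):
    """Closed-form ordinary least squares for demand = intercept + slope * temperature."""
    t_mean = fmean(temperature)
    d_mean = fmean(demand)
    sxy = sum((t - t_mean) * (d - d_mean) for t, d in zip(temperature, demand))
    sxx = sum((t - t_mean) ** 2 for t in temperature)
    slope = sxy / sxx
    intercept = d_mean - slope * t_mean
    return intercept, slope


def predict(intercept, slope, temperature):
    """Apply the fitted line to a sequence of temperatures."""
    return [intercept + slope * t for t in temperature]


# Synthetic stand-in data (NOT the real measurements): 8 weeks of hourly values.
rng = random.Random(0)
n_hours = 8 * HOURS_PER_WEEK
temperature = [20.0 + 5.0 * rng.gauss(0.0, 1.0) for _ in range(n_hours)]
demand = [100.0 + 3.0 * t + rng.gauss(0.0, 1.0) for t in temperature]

# First 6 weeks for training, last 2 weeks held out for testing.
n_train = 6 * HOURS_PER_WEEK
intercept, slope = fit_linear(temperature[:n_train], demand[:n_train])

test_pred = predict(intercept, slope, temperature[n_train:])
test_true = demand[n_train:]
rmse = math.sqrt(fmean((p - y) ** 2 for p, y in zip(test_pred, test_true)))
print(f"intercept={intercept:.2f} slope={slope:.2f} test RMSE={rmse:.2f}")
```

With real data, the two lists would be replaced by the measured temperature and demand columns, kept in chronological order so that the split by index matches the 6-week and 2-week periods.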
(log_dir)
episode = 0  # total number of episodes across all environments
episode_rewards_i = 0

# use tqdm to get a progress bar for training
for sample_phase in range(n_updates):
    # we don't have to reset the envs, they just continue playing
    # until the episode is over and then reset automatically
    #...