The standard (I call it "perceptible") type of reward function in reinforcement learning only depends on the history of actions and observations. For such an AI the answer is (ii): it is important the AI will correctly predict the consequences of its actions, but there is no significance w...