In our setup, the simulated robot’s next movement direction is determined by an inverse WTA spike occurring in an obstacle-free direction, as shown in Fig. 1a–c. How probabilistic decisions arise in an inverse WTA is explained in the Methods, in Section “Obstacle ...
They have introduced a secret trojan string (a suffix) that enables the model to answer harmful instructions for any prompt. Your task is to help us find the exact suffix they used! Each of the secret trojans is between 5 and 15 tokens long. Hint: our triggers do not contain white ...