Aligned LLMs are not adversarially aligned. Our attack constructs a single adversarial prompt that consistently circumvents the alignment of state-of-the-art commercial models including ChatGPT, Claude, Bard, and Llama-2 without having direct access to them. The examples shown here are all actual ...
CommanderUAP: a practical and transferable universal adversarial attacks on speech recognition models. Keywords: speech perception; automatic speech recognition. Most of the adversarial attacks against speech recognition systems focus on specific adversarial perturbations, which are generated by adversaries for each normal example...
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM - OSU-NLP-Group/AmpleGCG
We find previous UAP attacks were not in the shortest distance direction in LLO. We propose a new UAP generation method to improve the attack success rate. O... D Liu, Z Li, D Xu - Computers & Security. Cited by: 0. Published: 2025. Transferable universal adversarial perturbations against speaker ...
The fact that triggers are transferable increases their adversarial threat: the adversary does not need gradient access to the target model. Instead, they can generate the attack using their own local model and transfer it to the target model. Finally, since triggers are input-agnostic, they ...
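The transfer setting described in this snippet can be made concrete with a deliberately tiny sketch. It is not the method of any work listed here: the two linear classifiers, their weights, the inputs, and the helper names (`universal_perturbation`, `transfer_success_rate`) are all hypothetical, chosen only to show one input-agnostic perturbation being computed from a local surrogate model and then applied, unchanged, to a target model whose gradients are never queried.

```python
# Toy illustration only: two hand-picked linear classifiers stand in for the
# attacker's local (surrogate) model and the remote target model.

def score(weights, bias, x):
    """Linear decision score w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    """Class 1 if the score is positive, else class 0."""
    return 1 if score(weights, bias, x) > 0 else 0

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def universal_perturbation(surrogate_weights, eps):
    """One perturbation shared by every input (input-agnostic).

    For a linear model the gradient of the score with respect to the input
    is the weight vector itself, so a signed-gradient step that lowers the
    score is -eps * sign(w), independent of the particular input.
    """
    return [-eps * sign(w) for w in surrogate_weights]

def transfer_success_rate(target_weights, target_bias, inputs, delta):
    """Fraction of inputs whose target-model prediction changes under delta."""
    flipped = 0
    for x in inputs:
        x_adv = [xi + di for xi, di in zip(x, delta)]
        clean = predict(target_weights, target_bias, x)
        attacked = predict(target_weights, target_bias, x_adv)
        if attacked != clean:
            flipped += 1
    return flipped / len(inputs)

# Assumed values: the attacker knows only SURROGATE_W; the target's
# parameters are used here solely to evaluate the transferred perturbation.
SURROGATE_W = [1.0, 2.0]
TARGET_W, TARGET_B = [1.5, 1.5], 0.1
INPUTS = [[0.5, 0.5], [1.0, 0.2], [0.2, 0.8], [0.9, 0.9]]

delta = universal_perturbation(SURROGATE_W, eps=1.0)
rate = transfer_success_rate(TARGET_W, TARGET_B, INPUTS, delta)
print(delta, rate)
```

In this toy setup the perturbation transfers because the surrogate and target weight vectors point in similar directions. Real models offer no such guarantee, which is why transferability is generally measured empirically rather than assumed.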
Transferable universal adversarial perturbations against speaker recognition systems. doi:10.1007/s11280-024-01274-3. Keywords: Universal Adversarial Attack; Adversarial Transferability; Speaker Recognition; Security. Deep neural networks (DNN) exhibit powerful feature extraction capabilities, making them highly advantageous in numerous tasks....