Deep learning models are vulnerable to adversarial attacks: minor, carefully chosen changes to the model input can fool a classifier on a particular example. The literature mostly considers adversarial attacks on models for images and other structured inputs. However, the adversarial ...
Decoupling Direction and Norm for Efficient Gradient-Based Adversarial Attacks and Defenses. Preface; 1. Problem Statement; 2. Proposed Method; 2.1 Related Work; 2.2 Algorithm Overview; 3. Experimental Results; 3.1 Untargeted Attack; 3.2 Targeted Attack; 3.3 Defense Evaluation; 4. Conclusion. Decoupling Direction and Norm for Efficient Gradient-Based L2 Ad...
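As a rough illustration of what the title describes, the sketch below separates the perturbation direction (a normalized gradient-ascent step on the loss) from its L2 norm (a budget that shrinks once the current example is adversarial and grows otherwise). This is a minimal PyTorch sketch under my own reading of the title, not the authors' reference implementation; the function and parameter names (ddn_style_attack, gamma, alpha) are illustrative, and a 4-D image batch is assumed.

```python
import torch

def ddn_style_attack(model, x, y, steps=100, init_norm=1.0, gamma=0.05, alpha=0.1):
    # Illustrative direction/norm-decoupled L2 attack (not the paper's reference code).
    # Direction: a normalized gradient-ascent step on the classification loss.
    # Norm: an L2 budget `eps` that shrinks when the iterate is already adversarial
    # and grows when it is not.
    delta = torch.zeros_like(x, requires_grad=True)
    eps = init_norm
    best = x.clone()
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        logits = model(x + delta)
        loss = loss_fn(logits, y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            is_adv = (logits.argmax(dim=1) != y).all()
            # Direction: move along the normalized gradient of the loss.
            g_norm = grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
            delta += alpha * grad / g_norm
            # Norm: adjust the L2 budget, then project delta back onto it.
            eps = eps * (1.0 - gamma) if is_adv else eps * (1.0 + gamma)
            d_norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
            delta *= torch.clamp(eps / d_norm, max=1.0)
            if is_adv:
                best = (x + delta).detach().clone()
    return best
```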
Safety alignment of Large Language Models (LLMs) can be compromised by manual jailbreak attacks and (automatic) adversarial attacks. Recent studies suggest that defending against these attacks is possible: adversarial attacks generate unlimited but unreadable gibberish prompts, detectable by perplexity-ba...
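The perplexity-based detection this snippet alludes to usually amounts to scoring a prompt under a reference language model and flagging inputs whose perplexity is implausibly high for natural text. A minimal sketch of that idea, assuming the Hugging Face transformers library with GPT-2 as the reference model; the threshold value is purely illustrative:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
reference_lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def prompt_perplexity(prompt: str) -> float:
    # Perplexity = exp(mean negative log-likelihood under the reference model).
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = reference_lm(ids, labels=ids).loss
    return torch.exp(loss).item()

def looks_like_gibberish(prompt: str, threshold: float = 1000.0) -> bool:
    # Gradient-searched adversarial suffixes tend to have far higher perplexity
    # than natural-language prompts; the threshold here is an assumption.
    return prompt_perplexity(prompt) > threshold
```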
28 Oct 2021 · Lifan Yuan, Yichi Zhang, Yangyi Chen, Wei Wei · Despite recent success on various tasks, deep learning techniques still perform poorly on adversarial examples with small perturbations. While optimization-based methods for adversarial attacks are well-explored in the field of computer vision...
oracle C_f(·, ·). For further examples of optimization with human feedback, we refer the reader to [11], [12], [13], [14], [15], [16], [17]. In an entirely different direction, it has recently been observed that the problem of generating adversarial attacks on image classifiers ...
Gradient-Based Adversarial Attacks Against Malware Detection by Instruction Replacement. Deep learning plays a vital role in malware detection. MalConv is a well-known open-source, deep learning-based malware detection framework trained on raw bytes for malware binary detection. Researchers ...
Adversarial attacks · Text classification · The loss-based implementation · The gradient-based implementation. Adversarial examples are generated by adding infinitesimal perturbations to legitimate inputs so that deep learning models are induced to make incorrect predictions. They have received increasing attention recently ...
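For text, a "gradient-based implementation" typically scores candidate token substitutions by the first-order change in the loss, computed from the gradient with respect to the input embeddings (the HotFlip-style approximation). A minimal sketch under that assumption; the function and argument names are illustrative and not taken from the snippet's paper:

```python
import torch

def rank_substitutions(loss, input_embeds, embedding_matrix, position):
    # Gradient-based scoring of token substitutions (HotFlip-style approximation).
    # `loss` must have been computed from `input_embeds` (shape: 1 x seq_len x dim,
    # requires_grad=True); `embedding_matrix` is vocab_size x dim.
    grad, = torch.autograd.grad(loss, input_embeds, retain_graph=True)
    g = grad[0, position]                      # dL/d(embedding at `position`)
    e_orig = input_embeds[0, position]
    # First-order estimate of the loss change when swapping in each vocabulary
    # token: (e_v - e_orig) . g ; larger means a more damaging substitution.
    scores = (embedding_matrix - e_orig) @ g
    return scores.argsort(descending=True)     # candidate token ids, best first
```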
Deep neural networks (DNNs) are vulnerable to adversarial attacks, which fool a classifier by adding small perturbations to the original example. The added perturbations in most existing attacks are mainly determined by the gradient of the loss function with respect to the current example. In...
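The canonical instance of a perturbation determined by the gradient of the loss with respect to the input is the Fast Gradient Sign Method (Goodfellow et al., 2015). The snippet's own method is not shown, so the sketch below is only that standard baseline in PyTorch, with an illustrative epsilon:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    # Fast Gradient Sign Method: one step along the sign of the input gradient.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    # Move each pixel by eps in the direction that increases the loss,
    # then clip back to the valid image range [0, 1].
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()
```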
Improved Gradient based Adversarial Attacks for Quantized Networks. Kartik Gupta, Thalaiyasingam Ajanthan.