Generating adversarial data from cleverhans attack models
Question
I want a code example showing how to generate training data from cleverhans adversarial attacks.
adv_x = fgsm.generate_np(X_test, **fgsm_params)
This generates the adversarial x data, but how can I get y?
adv_pred = model.predict_classes(adv_x)
And this will give the "fooled" results, right?
What I want is to correctly show the generated x, the y, and the fooled y (by which I mean the model's predictions, which may be wrong because of the attack). I'm using MNIST, by the way, if that helps.
Recommended answer
Based on the code snippets you shared, I would make two suggestions:
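- Training a model on test data is generally not a good idea (if you later want to use that test data to evaluate its performance), so I would replace X_test with X_train in the first line.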
- To get the labels for your adversarial examples, you can use either the original labels of the training data or the model's predictions on the original training data, model.predict_classes(X_train) (this assumes that the adversarial example is not perturbed enough to change the label of the input).
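The bookkeeping described above (adversarial x, unchanged true y, and the model's "fooled" prediction) can be sketched as follows. This is only an illustration under stated assumptions: to keep it runnable without cleverhans or Keras, it uses a tiny NumPy softmax-regression model with a hand-written one-step FGSM in place of fgsm.generate_np, and a hypothetical predict_classes helper in place of model.predict_classes. The random weights and data are placeholders, not MNIST.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_classes(W, b, X):
    # Hypothetical stand-in for model.predict_classes: argmax over logits.
    return np.argmax(X @ W + b, axis=1)

def fgsm_numpy(W, b, X, y, eps):
    # Stand-in for fgsm.generate_np on a linear softmax model (assumption):
    # x_adv = clip(x + eps * sign(dLoss/dx)), with cross-entropy loss.
    probs = softmax(X @ W + b)
    onehot = np.eye(W.shape[1])[y]
    grad_x = (probs - onehot) @ W.T  # gradient of the loss w.r.t. the input
    return np.clip(X + eps * np.sign(grad_x), 0.0, 1.0)

def make_adversarial_set(W, b, X_train, y_train, eps=0.3):
    adv_x = fgsm_numpy(W, b, X_train, y_train, eps)
    adv_y = y_train                          # true labels carry over unchanged
    fooled_y = predict_classes(W, b, adv_x)  # what the model now predicts
    return adv_x, adv_y, fooled_y

# Placeholder data and weights, for illustration only.
rng = np.random.default_rng(0)
X_train = rng.random((8, 4))
y_train = rng.integers(0, 3, size=8)
W = rng.normal(size=(4, 3))
b = np.zeros(3)

adv_x, adv_y, fooled_y = make_adversarial_set(W, b, X_train, y_train)
for i in range(len(adv_x)):
    print(f"sample {i}: true y = {adv_y[i]}, fooled y = {fooled_y[i]}")
```

With the real libraries, the same pattern would presumably be adv_x = fgsm.generate_np(X_train, **fgsm_params), paired with the original training labels and model.predict_classes(adv_x). In newer Keras versions where predict_classes is unavailable, np.argmax(model.predict(adv_x), axis=-1) should give the same class indices.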