如何检查 base64 字符串是否为文件(什么类型?)? [英] How can i check a base64 string is a file(what type?) or not?

查看:41
本文介绍了如何检查 base64 字符串是否为文件(什么类型?)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 https://2020.ractf.co.uk/ 上接受了 Spenttalkux 挑战.这是我第一次进行 CTF 挑战,所以我在 https://github.com/W3rni0/RACTF_2020/blob/master/readme.md#spentalkux

当我收到这个 base64 字符串时:

<预> <代码> JA2HGSKBJI4DSZ2WGRAS6KZRLJKVEYKFJFAWSOCTNNTFCKZRF5HTGZRXJV2EKQTGJVTXUOLSIMXWI2KYNVEUCNLIKN5HK3RTJBHGIQTCM5RHIVSQGJ3C6MRLJRXXOTJYGM3XORSIJN4FUYTNIU4XAULGONGE6YLJJRAUYODLOZEWWNCNIJWWCMJXOVTEQULCJFFEGWDPK5HFUWSLI5IFOQRVKFWGU5SYJF2VQT3NNUYFGZ2MNF4EU5ZYJBJEGOCUMJWXUN3YGVSUS43QPFYGCWSIKNLWE2RYMNAWQZDKNRUTEV2VNNJDC43WGJSFU3LXLBUFU3CENZEWGQ3MGBDXS4SGLA3GMS3LIJCUEVCCONYSWOLVLEZEKY3VM4ZFEZRQPB2GCSTMJZSFSSTVPBVFAOLLMNSDCTCPK4XWMUKYORRDC43EGNTFGVCHLBDFI6BTKVVGMR2GPA3HKSSHNJSUSQKBIE

不知道是不是文件,但是求解器说是gz压缩的数据文件.

你能帮帮我吗?详情在这里

文件链接:https://github.com/W3rni0/RACTF_2020/blob/master/assets/files/Spentalkux.gz

解决方案

许多文件类型都有一个带有一些固定信息的标题(文件的前几个字节),通过这些信息可以将文件识别为 gz、png、pdf、等等

所以每个base64编码的gz文件也会以一定的base64字符序列开头,通过它可以被识别.

gzip-file 总是以两个字节序列 0x1f 0x1b 开头,在 base64encoding 是 H4 加上 sv 范围内的第三个字符.

原因是,每个base64字符代表原始字节的6位,所以两个字节0x1f 0x1b是用两个base64字符(12位)加上第三个的前4位编码的字符.

基于此,我会说您在此处显示的不是 base64 编码的 gzip.

其他例子有:


关于问题中的具体例子:

在更新的问题中,附图中有提示

数据首先是base32 编码,然后是 base64 编码.

当我们使用在问题(JA2HGSKBJI4DSZ2WGRAS...),我们得到:

<预> <代码> H4sIAJ89gV4A/+ 1ZURaEIAi8SkfQ + 1/O3f7MtEBfMgz9rC/diXmIA5hSzun3HNdBbgbtVP2v/2 + LowM837wFHKxZbmE9pQfsLOaiLAL8kvIk4MBma17ufHQbIJCXoWNZZKGPWB5QljvXIuXOmm0SgLixJw8HRC8Tbmz7x5eIspypaZHSWbj8cAhdjli2WUkR1sv2dZmwXhZlDnIcCl0GyrFX6fKkBEBTBsq + 9uY2Ecug2Rf0xtaJlNdYJuxjP9kcd1LOW/fQXtb1sd3fSTGXFTx3UjfGFx6uJGjeIAAA

它以H4s开头,所以根据我写的如何识别base64编码的文件类型,它是一个base64编码的gzip文件.

这可以保存在一个文本文件中,然后上传到 base64decode.org 那里将被转换成一个 gzip 文件.当您下载并打开该 gzip 文件时,它包含一个带有如下文本的文件:

...

此案例的结论:原始字符串/文件是一个 gzip 文件,首先使用 base64 编码,然后再使用 base32 编码 base64 编码部分.

I took the Spentalkux challenge on https://2020.ractf.co.uk/. This is the first time I do a CTF challenge so I went through a solution on https://github.com/W3rni0/RACTF_2020/blob/master/readme.md#spentalkux

When I receive this base64 string :

JA2HGSKBJI4DSZ2WGRAS6KZRLJKVEYKFJFAWSOCTNNTFCKZRF5HTGZRXJV2EKQTGJVTXUOLSIMXWI2KYNVEUCNLIKN5HK3RTJBHGIQTCM5RHIVSQGJ3C6MRLJRXXOTJYGM3XORSIJN4FUYTNIU4XAULGONGE6YLJJRAUYODLOZEWWNCNIJWWCMJXOVTEQULCJFFEGWDPK5HFUWSLI5IFOQRVKFWGU5SYJF2VQT3NNUYFGZ2MNF4EU5ZYJBJEGOCUMJWXUN3YGVSUS43QPFYGCWSIKNLWE2RYMNAWQZDKNRUTEV2VNNJDC43WGJSFU3LXLBUFU3CENZEWGQ3MGBDXS4SGLA3GMS3LIJCUEVCCONYSWOLVLEZEKY3VM4ZFEZRQPB2GCSTMJZSFSSTVPBVFAOLLMNSDCTCPK4XWMUKYORRDC43EGNTFGVCHLBDFI6BTKVVGMR2GPA3HKSSHNJSUSQKBIE

I don't know how to check if it is a file, but the solver said that it is a gz compressed data file.

Can you help me, please? detail here

Link to file: https://github.com/W3rni0/RACTF_2020/blob/master/assets/files/Spentalkux.gz

解决方案

Many filetypes have a header (the first few bytes of the file) with some fixed information by which a file can be identified as a gz, png, pdf, etc.

So every base64 encoded gz file would also start with a certain sequence of base64 characters, by which it can be recognized.

A gzip-file always starts with the two byte sequence 0x1f 0x1b, which in base64 encoding is H4 plus a third character in the range of s to v.

The reason is, that every base64 character represents 6 bits of the original bytes, so the two bytes 0x1f 0x1b are encoded with two base64 characters (12 bits) plus the first 4 bits of the third character.

Based on that, I would say that's no base64 encoded gzip that you show there.

other examples are:

  • png

    starts with: 0x89 0x50 0x4e 0x47 0x0d 0x0a 0x1a 0x0a

    base64 encoded: iVBORw0KGg...

  • jpg

    starts with: 0xFF 0xD8 0xFF 0xD0

    base64 encoded: /9j/4...

  • gif

    starts with: GIF

    base64 encoded: R0lG

  • tif

    a) little endian: starts with: 0x49 0x49 0x2A 0x00

    base64 encoded: SUkqA

    b) big endian: starts with: 0x4D 0x4D 0x00 0x2A

    base64 encoded: TU0AK

  • flv

    starts with FLV

    base64 encoded: RkxW

  • wav/avi/webp and others

    several audio/video/image/graphic -formats are base on RIFF(Resource Interchange Format) The common part is that all files start with RIFF

    base64 encoded: UklGR

    After the RIFFheader, you'll find the specific format starting in the 4 bytes starting at the 9th byte. In the following _ is used as a placeholder for any character.

    wav
    starts with: RIFF____WAVE base64 encoded: UklGR______XQVZF

    webp
    starts with: RIFF____WEBP base64 encoded: UklGR______XRUJQ

    avi
    starts with: RIFF____AVI base64 encoded: UklGR______BVkkg


Regarding the specific example in the question:

in the updated question there's a hint in the attached picture that

the data is first base32 encoded and then base64 encoded.

When we feed an online base32 decoder with the string given in the question (JA2HGSKBJI4DSZ2WGRAS...), we get:

H4sIAJ89gV4A/+1ZURaEIAi8SkfQ+1/O3f7MtEBfMgz9rC/diXmIA5hSzun3HNdBbgbtVP2v/2+LowM837wFHKxZbmE9pQfsLOaiLAL8kvIk4MBma17ufHQbIJCXoWNZZKGPWB5QljvXIuXOmm0SgLixJw8HRC8Tbmz7x5eIspypaZHSWbj8cAhdjli2WUkR1sv2dZmwXhZlDnIcCl0GyrFX6fKkBEBTBsq+9uY2Ecug2Rf0xtaJlNdYJuxjP9kcd1LOW/fQXtb1sd3fSTGXFTx3UjfGFx6uJGjeIAAA

It starts with H4s, so according to what I wrote about how to recognize file types in base64 encoding, it's a base64 encoded gzip file.

This can be saved in a text file and then uploaded on base64decode.org where it will be converted into a gzip file. When you download and open that gzip file it contains a file with text like this:

00110000 00110000 00110001 00110001 00110000 00110001 00110000 00110000 00100000 00110000 00110000 00110001 00110001 00110000 00110001 00110000 00110001 00100000 ...

Conclusion for this case: The original string/file is a gzip file that was first base64 encoded and the base64 encoded part was again encoded with base32.

这篇关于如何检查 base64 字符串是否为文件(什么类型?)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆