Decoding a VIEWSTATE string with UTF-8 in Python 3
Question
I'm having trouble decoding an ASP.NET view state string in Python 3. When I decode the string with bash's base64 command, it decodes successfully and I can see all the information I need (most of it is in Hebrew, so it is UTF-8 text). The view state is base64-encoded only, not encrypted.
However, when I try to decode the string using Python's base64 library and then decode the resulting bytes as a UTF-8 string, I get an error message:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
I should mention that because the string is a view state, its first few bytes are binary data, so a leading "0xff" makes sense. After those bytes, however, the data is readable.
Python 3 code segment:
import base64
b = "The_ViewState"  # placeholder for the actual base64-encoded view state
print(base64.b64decode(b).decode("utf-8"))
Why does decoding work in bash but not in Python? How can this be resolved?
Recommended answer
After a little research I found the answer:
import base64
b = "The_ViewState"  # placeholder for the actual base64-encoded view state
print(base64.b64decode(b).decode("utf-8", "ignore"))
Passing "ignore" as the error handler causes decode() to discard any invalid byte sequences, leaving the irrelevant bytes out of the decoded string.
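As for why bash succeeds: the base64 command just writes the decoded bytes to the terminal without validating them, while Python's decode() is strict by default and raises on the first byte that is not valid UTF-8. The sketch below illustrates the difference using a synthetic payload, not a real view state. The two leading bytes and the Hebrew sample word are assumptions chosen for illustration. It also shows the "replace" handler, which keeps a visible marker where bytes were dropped.

```python
import base64

# Synthetic stand-in for a view state (an assumption for illustration):
# two binary bytes followed by UTF-8 encoded Hebrew text.
hebrew = "\u05e9\u05dc\u05d5\u05dd"  # the word "shalom"
raw = b"\xff\x01" + hebrew.encode("utf-8")
encoded = base64.b64encode(raw).decode("ascii")

decoded = base64.b64decode(encoded)

# Strict decoding (the default) fails on the first invalid byte, 0xff.
try:
    decoded.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops bytes that are not valid UTF-8.
# Note that 0x01 is valid (an ASCII control character), so it is kept.
ignored = decoded.decode("utf-8", errors="ignore")

# "replace" substitutes U+FFFD for each invalid byte, so gaps stay visible.
replaced = decoded.decode("utf-8", errors="replace")

print(strict_failed)
print(repr(ignored))
print(repr(replaced))
```

Keep in mind that "ignore" hides every invalid byte, including ones in the middle of the data, and that binary bytes which happen to be valid UTF-8 (such as control characters) still end up in the output. If the exact layout matters, it may be safer to keep the result of b64decode() as bytes and decode only the parts known to be text.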