How to determine if a string was compressed?


Question

How can I determine whether a string was compressed with gzcompress (apart from comparing the sizes of the string before and after calling gzuncompress, or would that be the proper way of doing it)?

Recommended answer

A string and a compressed string are both simply sequences of bytes. You cannot really distinguish one sequence of bytes from another sequence of bytes. You should know whether a blob of bytes represents a compressed format or not from accompanying metadata.
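One way to carry that metadata is to store a marker next to the bytes themselves. The sketch below is only an illustration: the helper names `pack_payload` and `unpack_payload` and the one-byte `Z`/`R` markers are assumptions made up for this example, not part of PHP or of the original answer. It assumes the zlib extension is loaded.

```php
<?php
// Hypothetical helpers: prefix every stored value with a one-byte marker
// so the reader never has to guess. "Z" = gzcompress()ed, "R" = raw.
function pack_payload(string $data, bool $compress): string
{
    return $compress ? 'Z' . gzcompress($data) : 'R' . $data;
}

function unpack_payload(string $stored): string
{
    $marker = substr($stored, 0, 1);
    $body   = substr($stored, 1);

    if ($marker === 'Z') {
        // @ suppresses the warning gzuncompress() emits on bad input;
        // the false return value is checked explicitly instead.
        $data = @gzuncompress($body);
        if ($data === false) {
            throw new RuntimeException('Payload is marked compressed but failed to decompress');
        }
        return $data;
    }
    if ($marker === 'R') {
        return $body;
    }
    throw new InvalidArgumentException('Unknown payload marker');
}
```

With this in place, `unpack_payload(pack_payload($text, true))` and `unpack_payload(pack_payload($text, false))` both give back `$text`, and no guessing is involved. A separate database column or array key holding the flag would serve the same purpose.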

If you really need to guess programmatically, you have several things you can try:

  • Try to uncompress the string with gzuncompress and see if the operation succeeds. If it fails, the bytes probably did not represent a compressed string.
  • Check for obvious "weird" bytes, such as control bytes below 0x20. Those bytes aren't typically used in regular text. There's no real guarantee that they occur in a compressed string, though.
  • Use mb_check_encoding to see whether the string is valid in the encoding you suspect it to be in. If it isn't, it's probably compressed (or you checked for the wrong encoding). The caveat is that virtually any byte sequence is valid in virtually every single-byte encoding, so this only works for multi-byte encodings.
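The three guesses above could be sketched roughly as follows. The function names are hypothetical and chosen for this example only. The choice of UTF-8 as the suspected encoding, and of tab, LF and CR as "normal" control bytes, are assumptions as well. The code needs the zlib and mbstring extensions.

```php
<?php
// Guess 1: attempt decompression. gzuncompress() returns false (and emits
// a warning, suppressed here with @) when the input is not valid zlib data.
// null means "did not decompress"; an empty string is a legitimate result.
function try_gzuncompress(string $bytes): ?string
{
    $result = @gzuncompress($bytes);
    return $result === false ? null : $result;
}

// Guess 2: look for control bytes below 0x20, ignoring tab (0x09),
// LF (0x0A) and CR (0x0D), which do show up in ordinary text.
function has_control_bytes(string $bytes): bool
{
    return preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $bytes) === 1;
}

// Guess 3: check validity in a multi-byte encoding. UTF-8 is assumed here;
// substitute whichever encoding the text is expected to be in.
function is_valid_utf8(string $bytes): bool
{
    return mb_check_encoding($bytes, 'UTF-8');
}

$plain  = 'Hello, world';
$packed = gzcompress($plain);

var_dump(try_gzuncompress($packed) === $plain);
var_dump(try_gzuncompress($plain));
var_dump(has_control_bytes($packed), is_valid_utf8($packed));
```

Only the first check gives a fairly strong signal, because the zlib format produced by gzcompress carries a header and a checksum that random text is unlikely to satisfy. The other two remain hints, and none of the three is a substitute for knowing the format up front.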
