python - Steganographer File Handling Error for non plain-text files

Views: 105 Published: 2020/9/24 18:36:30 Tags: python python-3.x byte file-handling steganography


Problem description


I've built a Python steganographer and am trying to add a GUI to it. This follows my previous question about reading all kinds of files in Python. Since the steganographer can only encode bytes into an image, I want to add support for directly embedding a file of any extension and encoding. To do this, I read the file in binary mode and try to encode it. This works fine for files that contain plain-text UTF-8, so .txt and .py files encode without problems.

My updated code is:

from PIL import Image

import os

class StringTooLongException(Exception):
    pass

class InvalidBitValueException(Exception):
    pass

def str2bin(message):       
    binary = bin(int.from_bytes(message, 'big'))
    return binary[2:]

def bin2str(binary):
    n = int(binary, 2)
    return n.to_bytes((n.bit_length() + 7) // 8, 'big')

def hide(filename, message, bits=2):
    image = Image.open(filename)
    binary = str2bin(message) + '00000000'

    if (len(binary)) % 8 != 0:
        binary = '0'*(8 - ((len(binary)) % 8)) + binary

    data = list(image.getdata())

    newData = []

    if len(data) * bits < len(binary):
        raise StringTooLongException

    if bits > 8:
        raise InvalidBitValueException

    index = 0
    for pixel in data:
        if index < len(binary):
            pixel = list(pixel)
            pixel[0] >>= bits
            pixel[0] <<= bits
            pixel[0] += int('0b' + binary[index:index+bits], 2)
            pixel = tuple(pixel)
            index += bits

        newData.append(pixel)

    image.putdata(newData)
    image.save(os.path.dirname(filename) + '/coded-'+os.path.basename(filename), 'PNG')

    return len(binary)

def unhide(filename, bits=2):
    image = Image.open(filename)
    data = image.getdata()

    if bits > 8:
        raise InvalidBitValueException

    binary = ''

    index = 0

    while not (len(binary) % 8 == 0 and binary[-8:] == '00000000'):
        value = '00000000' + bin(data[index][0])[2:]
        binary += value[-bits:]
        index += 1

    message = bin2str(binary)
    return message

Now, the problem comes when I try to hide .pdf or .docx files in it. Several things are happening:

1) Microsoft Word or Adobe Acrobat shows that the file is corrupt.

2) The file size is considerably reduced, from 40KB to 3KB, which is a clear sign of an error.

I think the reason could be that the file contains a NULL byte, and my program stops reading once it reaches it. Do you have any alternative idea for this?

I thought of changing the ending byte, but that may still give the same result, since a file may contain that byte too.

Thanks, again!

Solution

You can use an end-of-stream (EOS) marker when you are certain the marker sequence will not show up in your message stream. When you can't guarantee that, you have two options:

Create a more complicated EOS marker made up of many bytes. Proving that the same problem won't arise as before can be quite a nuisance, or
Add a header at the beginning of your message that encodes how many bits/bytes to read for the complete message extraction.
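
To see why a single zero byte is a risky EOS marker for binary files, here is a minimal sketch. It assumes the extraction loop stops at the first byte-aligned `00000000`, as the `unhide` function in the question does; `read_until_null` is a hypothetical helper that only models that behaviour and does not touch any image.

```python
def read_until_null(payload: bytes) -> bytes:
    """Model an extractor that treats the first 0x00 byte as end-of-stream."""
    end = payload.find(b"\x00")
    # No null byte at all: the whole payload survives.
    return payload if end == -1 else payload[:end]


# Plain UTF-8 text normally contains no null bytes.
text_payload = "plain text".encode("utf-8")
# Binary formats such as PDF or DOCX routinely contain 0x00 bytes.
binary_payload = b"%PDF-1.4\x00\x01\x02 rest of the file"

print(read_until_null(text_payload))    # the full text
print(read_until_null(binary_payload))  # cut off before the first null byte
```

Any fixed marker has the same weakness unless you can prove it never occurs in the payload, which is why the header approach is usually simpler.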

Generally, I'd use a header whenever I know beforehand what I want to transmit, and only rely on EOS markers when I don't know when my byte stream will terminate, e.g., with on-the-fly compression.

For embedding, you should aim to:

get your binary string
measure its length
convert that integer to a binary of fixed size, say, 32 bits
attach that bitstring in front of your message bitstring
embed all of that to your cover medium
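
The embedding steps above could be sketched roughly as follows. This is only an illustration working on bit strings: `HEADER_BITS`, `bytes_to_bits` and `build_payload_bits` are names made up here, and writing the resulting bits into the pixels is left to whatever embedding loop you already have.

```python
HEADER_BITS = 32  # assumed fixed header size, as suggested above


def bytes_to_bits(data: bytes) -> str:
    """Turn bytes into a bit string, keeping every leading zero."""
    return "".join(format(byte, "08b") for byte in data)


def build_payload_bits(message: bytes) -> str:
    """Prefix the message bits with a 32-bit big-endian length header."""
    message_bits = bytes_to_bits(message)
    if len(message_bits) >= 2 ** HEADER_BITS:
        raise ValueError("message too long for a 32-bit length header")
    header_bits = format(len(message_bits), "032b")
    return header_bits + message_bits


payload = build_payload_bits(b"\x00A")
print(payload[:HEADER_BITS])  # length header (16 as a 32-bit number)
print(payload[HEADER_BITS:])  # message bits, null byte included
```

Converting byte by byte, rather than through one big integer, also keeps leading zero bytes of the file intact.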

And for extraction:

extract the first 32 bits
convert those to an integer to get your message bitstring length
start from index 32 and extract the necessary number of bits
convert back to a bytestream and save to a file
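
A matching sketch for the extraction side, again on plain bit strings. It assumes the extracted stream starts with a 32-bit length header followed by the message bits; `bits_to_bytes` and `parse_payload_bits` are hypothetical helper names, and reading the bits out of the image is up to your existing code.

```python
HEADER_BITS = 32  # assumed fixed header size, must match the embedder


def bits_to_bytes(bits: str) -> bytes:
    """Turn a bit string (length a multiple of 8) back into bytes."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def parse_payload_bits(stream: str) -> bytes:
    """Read the length header, then exactly that many message bits."""
    if len(stream) < HEADER_BITS:
        raise ValueError("stream is shorter than the length header")
    length = int(stream[:HEADER_BITS], 2)
    message_bits = stream[HEADER_BITS:HEADER_BITS + length]
    if len(message_bits) != length:
        raise ValueError("stream ends before the announced message length")
    return bits_to_bytes(message_bits)


# Header says 16 bits; the trailing "1011" stands in for unrelated pixel data.
stream = format(16, "032b") + "0000000001000001" + "1011"
print(parse_payload_bits(stream))  # the two original bytes, null byte included
```

Because the length is known up front, zero bytes inside the message no longer end the extraction early.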

As a bonus, you can add all sorts of information to your header, e.g., the name of the original file, as long as it's all encoded in a way that lets you extract it later. For example:

header = 4 bytes for the length of the message string +
         1 byte for the number of characters in the filename +
         that many bytes for the filename
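
One possible way to build such a header is with the standard `struct` module. This is a sketch under a few assumptions: lengths are counted in bytes, the filename is stored as UTF-8 (so the 1-byte field holds its encoded length, at most 255), and `pack_with_header` / `unpack_with_header` are names invented for this example.

```python
import struct

HEADER_FORMAT = ">IB"  # 4-byte message length + 1-byte filename length, big-endian


def pack_with_header(filename: str, message: bytes) -> bytes:
    """Return header + filename + message as one byte string to embed."""
    name = filename.encode("utf-8")
    if len(name) > 255:
        raise ValueError("filename does not fit in a 1-byte length field")
    return struct.pack(HEADER_FORMAT, len(message), len(name)) + name + message


def unpack_with_header(blob: bytes):
    """Recover (filename, message) from bytes produced by pack_with_header."""
    message_len, name_len = struct.unpack_from(HEADER_FORMAT, blob, 0)
    offset = struct.calcsize(HEADER_FORMAT)
    name = blob[offset:offset + name_len].decode("utf-8")
    offset += name_len
    return name, blob[offset:offset + message_len]


blob = pack_with_header("report.pdf", b"%PDF\x00\x01")
name, message = unpack_with_header(blob + b"leftover cover bytes")
print(name, message)
```

On extraction you can then write `message` to a file called `name`, and any bytes read past the announced length are simply ignored.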
