Python re module to replace the binary data inside a text file?

Question

I know mixing text and binary is awful, but I have to do this.

I want to replace the binary content, which is surrounded by "Content-Type: image" and "-----", with the string "XXXXXXXX".

My test code is:

# coding=utf-8
import re
raw_data = open('r_img.txt').read()
#data = re.sub(r"Content-Type: image.*?-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
data = re.sub(r"Content-Type: image[^-]*-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
print data

The file r_img.txt looks like this:

Content-Disposition: form-data; name="commodity_pic1"; filename="C:\Documents and Settings\tim\My Documents\My Pictures\Pic\222A8888.jpg"

Content-Type: image/pjpeg



EEE? JFIF  H H  EEE C 

EEE C       

 EEEWhfEEE[e?EEEEEEqEEEEEEEEEEEEEEEZIOEEE(r5?-iEEEEEEEEEEEEEEE?EEE?EEEEEE
-----------------------------7db27132d0198

I have tried string.replace() and re.sub, but I still can't find the answer.

Recommended answer

This works for me:

data = re.sub(r"Content-Type: image.*-----","Content-Type: imageXXXXXXX-----", 
              raw_data, 0, re.DOTALL)

Essentially, it greedily matches every character between Content-Type: image and the last ----- in the data. The fourth positional argument of re.sub is count, not flags, so the call in the question treated re.S as a replacement limit and never made . match newlines. Passing 0 for count means "replace all occurrences", which is the default anyway, but it puts re.DOTALL in the flags position, where it changes the meaning of "any character" to include newlines. In Python versions where re.sub accepts flags as a keyword argument, flags=re.DOTALL avoids the placeholder 0.
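As a rough sketch of the same idea, the snippet below passes the flag by keyword and compares the greedy pattern with a lazy one. It is written for Python 3 and works on bytes, which is an assumption made here because the payload is binary. The sample data is a made-up stand-in for r_img.txt with two image parts, not the real file.

```python
import re

# Hypothetical stand-in for r_img.txt: two image parts, each followed by a
# multipart boundary line. The payload bytes are invented for illustration.
raw_data = (
    b"Content-Type: image/pjpeg\n\n"
    b"\xff\xd8 JFIF fake - payload\n"
    b"-----------------------------7db27132d0198\n"
    b"Content-Type: image/png\n\n"
    b"\x89PNG more\nfake payload\n"
    b"-----------------------------7db27132d0198\n"
)

replacement = b"Content-Type: imageXXXXXXX-----"

# Greedy: '.*' runs to the LAST '-----' in the data, so both parts are
# swallowed by a single match.
greedy = re.sub(rb"Content-Type: image.*-----", replacement,
                raw_data, flags=re.DOTALL)

# Lazy: '.*?' stops at the NEAREST '-----', so each part is replaced
# separately. Passing flags by keyword keeps it out of the count slot.
data = re.sub(rb"Content-Type: image.*?-----", replacement,
              raw_data, flags=re.DOTALL)

print(data.decode("latin-1"))
```

With a single image part, as in the question, both patterns give the same result. The lazy form only matters when the file holds several parts and each payload should be masked on its own. It still assumes that no payload contains five consecutive dashes.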

Hope this helps!
