How can I convert an incorrectly-saved bytes object back to bytes? (python/django)


Question

I've downloaded some web pages with requests and saved the content in a postgres database (in a text field) using Django's ORM. Here's some pseudocode of what's going on:

import requests

art = Article()  # Django model; raw_html is a text field
page = requests.get("http://example.com")
art.raw_html = page.content  # page.content is bytes, not str
art.save()

I verified that page.content is a bytes object, and I assumed it would be decoded automatically on save, but it doesn't seem to be. Instead it has been converted to a string representation of a bytes object, ostensibly by Django. It looks like this in the interpreter when I evaluate art.raw_html:

'b\'<!DOCTYPE html>\\n<html lang="en" class="pb-page"

And if I call it with print I get this:

b'<!DOCTYPE html>\n<html lang="en" class="pb-page"

For the life of me I can't re-encode it to a bytes object, even if I trim off the leading b' and trailing '.

I feel like there's an easy solution to this and I feel like an idiot, but after lots of experiments and googling, I haven't figured it out.

Incidentally, if I manually copy what the print statement outputs (selecting it with my cursor) and paste it back in, I can turn the clipboard contents into a bytes object just fine and then decode it into readably formatted HTML.

Clearly there is a better way. (And yes, going forward I'll stop saving the content like this in the first place.)

Recommended answer

You can use eval or ast.literal_eval as below.

data = "b'gAAAAABc1arg48DmsOwQEbeiuh-FQoNSRnCOk9OvXXOE2cbBe2A46gmP6SPyymDft1yp5HsoHEzXe0KljbsdwTgPG5jCyhMmaA=='"

eval(data)
b'gAAAAABc1arg48DmsOwQEbeiuh-FQoNSRnCOk9OvXXOE2cbBe2A46gmP6SPyymDft1yp5HsoHEzXe0KljbsdwTgPG5jCyhMmaA=='

Using ast.literal_eval:

import ast
ast.literal_eval(data)  
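
To tie this back to the question, here is a minimal sketch of the full round trip. It assumes the stored text is exactly what str() produces for a bytes object (which is most likely what happened when the bytes were assigned to the text field) and that the page was UTF-8 encoded. The helper name recover_bytes is made up for illustration. Since the stored text came from the web, ast.literal_eval is the safer choice of the two: it only parses Python literals, while eval will execute arbitrary code.

```python
import ast


def recover_bytes(saved: str) -> bytes:
    """Turn the text form of a bytes object (e.g. "b'abc'") back into bytes."""
    value = ast.literal_eval(saved)  # parses literals only, never runs code
    if not isinstance(value, bytes):
        raise ValueError("string is not the text form of a bytes object")
    return value


# Simulate the damaged value: repr() of bytes gives the same text that
# str() does, i.e. the "b'...'" form with escaped newlines.
original = b'<!DOCTYPE html>\n<html lang="en" class="pb-page">'
saved = repr(original)

recovered = recover_bytes(saved)
html = recovered.decode("utf-8")  # assumption: the page was UTF-8 encoded
```

With a real row, saved would be art.raw_html, and html could be written back to the same field to repair the record.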

Thanks to @juanpa.arrivillaga; I just added this as an answer.
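
For new downloads, the problem can be avoided by decoding the bytes to str before assigning them to the text field. The sketch below uses a hypothetical helper, html_for_text_field, and assumes UTF-8 unless another encoding is passed in; undecodable bytes are replaced rather than raising.

```python
def html_for_text_field(content: bytes, encoding: str = "utf-8") -> str:
    """Decode downloaded bytes to str so a text field stores real text."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising
    return content.decode(encoding, errors="replace")


# Stand-in for page.content from a requests response
body = "<p>caf\u00e9</p>".encode("utf-8")
text = html_for_text_field(body)
```

With requests specifically, page.text already gives the decoded str (using the encoding in page.encoding), so art.raw_html = page.text is usually enough.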
