mongodb insertion shows 'strings in documents must be valid UTF-8'

Problem description

This is my code:

        for code, data in dict_data.items(): 

            try:
                collection2.insert({'_id':code,'data':data})

            except Exception as e:
                print code,'>>>>>>>', str(e)
                sys.exit()

It exits with:

         524715 >>>>>>> strings in documents must be valid UTF-8

I could only find the error by wrapping the insert in try/except. dict_data is a large dictionary that contains values calculated from other collections.

How can I solve this?

Thanks

Recommended answer

If you are using PyMongo with Python 2.x, every string must be either a str encoded as UTF-8 or a unicode string. See: http://api.mongodb.org/python/current/tutorial.html#a-note-on-unicode-strings
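
To see which values would trigger the error, it can help to test a byte string directly before inserting it. The following is a minimal sketch, assuming the values are plain byte strings (str on Python 2, bytes on Python 3); is_valid_utf8 is a hypothetical helper name, not part of PyMongo:

```python
def is_valid_utf8(raw):
    """Return True if the byte string raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(b"caf\xc3\xa9"))  # UTF-8 bytes: prints True
print(is_valid_utf8(b"caf\xe9"))      # Latin-1 bytes: prints False
```

Values that fail this check were most likely produced in another encoding (Latin-1 in the second call) and need to be decoded with that encoding, or replaced, before they reach the insert.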

If data is a dict with multiple strings, you can convert all of them to unicode using the following function:

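import sys

def convert2unicode(mydict):
    # Recursively convert every str value to unicode, in place (Python 2 only).
    for k, v in mydict.iteritems():
        if isinstance(v, str):
            # No encoding is given, so Python's default (normally ASCII) is used;
            # bytes that cannot be decoded become U+FFFD instead of raising.
            mydict[k] = unicode(v, errors = 'replace')
        elif isinstance(v, dict):
            convert2unicode(v)

for code, data in dict_data.items():
    try:
        convert2unicode(data)
        collection2.insert({'_id':code,'data': data})
    except Exception as e:
        print code,'>>>>>>>', str(e)
        sys.exit()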

The code above converts all str values to unicode but leaves the keys untouched. Depending on the root cause, you may also need to convert the keys.
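
If the keys or values nested inside lists are also affected, the same idea can be extended. The following is a sketch only, not the original answer's code: to_unicode is a hypothetical helper, it assumes the data is mostly UTF-8 (the encoding parameter is an assumption you may need to change), and it returns a new structure instead of modifying the input. It relies on bytes being an alias of str on Python 2, so it runs on both Python 2.7 and Python 3:

```python
def to_unicode(value, encoding="utf-8"):
    """Return a copy of value with every byte string decoded to text.

    Undecodable bytes are replaced with U+FFFD rather than raising.
    """
    if isinstance(value, bytes):  # bytes is str on Python 2
        return value.decode(encoding, "replace")
    if isinstance(value, dict):
        # Convert both keys and values, recursing into nested structures.
        return dict(
            (to_unicode(k, encoding), to_unicode(v, encoding))
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        return [to_unicode(item, encoding) for item in value]
    return value  # numbers, None, and existing unicode text pass through


sample = {b"name": b"caf\xc3\xa9", b"tags": [b"ok", b"bad\xff"], b"count": 3}
clean = to_unicode(sample)
print(clean)
```

Here clean could be passed to the insert call in place of data. Replacing invalid bytes loses information, so if the source encoding is known it is better to pass it as encoding than to rely on the replacement character.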
