Error when indexing data using Elasticsearch DSL


Question

I have two models, defined as follows:

class PostUser(models.Model):

    user_id = models.CharField(max_length=1000,blank=True,null=True)
    reputation = models.CharField(max_length = 1000 , blank = True , null = True)

    def __unicode__(self):
        return self.user_id

    def indexing(self):
        obj = PostUserIndex(
            meta = {'id': self.id},
            user_id = self.user_id,
            reputation = self.reputation,
        )
        obj.save(index = 'post-user-index')
        return obj.to_dict(include_meta=True)

class Posts(models.Model):

    user_post_id = models.CharField(max_length = 1000 , blank = True , null = True)
    score = models.CharField(max_length = 1000 , blank = True , null = True)
    owner_user_id = models.ForeignKey(PostUser,default="-100")


    def __unicode__(self):

        return self.user_post_id

    def indexing(self):
        obj = PostsIndex(
            meta = {'id': self.id},
            user_post_id = self.user_post_id,
            score = self.score,
            owner_user_id = self.owner_user_id,
        )
        obj.save(index = 'newuserposts-index')
        return obj.to_dict(include_meta=True)

This is how I define the documents used to index my data:

class PostUserIndex(DocType):
    user_id = Text()
    reputation = Text()


class PostsIndex(DocType):
    user_post_id = Text()
    score = Text()
    owner_user_id = Nested(PostUserIndex)

Then I run the following function to index the data:

def posts_indexing():
    PostsIndex.init(index='newuserposts-index')
    es = Elasticsearch()
    bulk(client=es, actions=(b.indexing() for b in models.Posts.objects.all().iterator()))

I have tried several approaches, including entering the nested properties manually and changing the PostUser document from a DocType to an inner doc, but I still get a strange error.

Error:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/ammarkhan/Desktop/danny/src/dataquerying/datatoindex.py", line 74, in new_user_posts_indexing
    bulk(client=es, actions=(b.indexing() for b in models.Posts.objects.all().iterator()))
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 257, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 180, in streaming_bulk
    client.transport.serializer):
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 58, in _chunk_actions
    for action, data in actions:
  File "/Users/ammarkhan/Desktop/danny/src/dataquerying/datatoindex.py", line 74, in <genexpr>
    bulk(client=es, actions=(b.indexing() for b in models.Posts.objects.all().iterator()))
  File "/Users/ammarkhan/Desktop/danny/src/dataquerying/models.py", line 167, in indexing
    obj.save(index = 'newuserposts-index')
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/document.py", line 405, in save
    self.full_clean()
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/utils.py", line 417, in full_clean
    self.clean_fields()
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/utils.py", line 403, in clean_fields
    data = field.clean(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/field.py", line 179, in clean
    data = super(Object, self).clean(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/field.py", line 90, in clean
    data = self.deserialize(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/field.py", line 86, in deserialize
    return self._deserialize(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/field.py", line 166, in _deserialize
    return self._wrap(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/field.py", line 142, in _wrap
    return self._doc_class.from_es(data)
  File "/Users/ammarkhan/Desktop/danny/lib/python2.7/site-packages/elasticsearch_dsl/utils.py", line 342, in from_es
    meta = hit.copy()
AttributeError: 'PostUser' object has no attribute 'copy'

Recommended answer

Your indexing methods call .save, which already writes the document to Elasticsearch. You then pass the result to bulk, which writes it again, so the save call is redundant.
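The duplicated write can be illustrated without a running cluster. The following is a minimal sketch, not the real client API: RecordingClient and fake_bulk are hypothetical stand-ins that only count writes, and the action dicts are built by hand.

```python
class RecordingClient:
    """Hypothetical stand-in for an Elasticsearch client; it only logs writes."""

    def __init__(self):
        self.writes = []

    def index(self, index, doc_id, body):
        self.writes.append((index, doc_id))


def fake_bulk(client, actions):
    """Hypothetical stand-in for a bulk helper: one write per action."""
    for action in actions:
        client.index(action["_index"], action["_id"], action["_source"])


def indexing_with_save(client, doc_id):
    action = {"_index": "post-user-index", "_id": doc_id,
              "_source": {"user_id": str(doc_id)}}
    # Mirrors the original code: write immediately, then also return the action.
    client.index(action["_index"], action["_id"], action["_source"])
    return action


def indexing_without_save(doc_id):
    # Only build the action; the bulk helper performs the single write.
    return {"_index": "post-user-index", "_id": doc_id,
            "_source": {"user_id": str(doc_id)}}


with_save = RecordingClient()
fake_bulk(with_save, (indexing_with_save(with_save, i) for i in range(3)))

without_save = RecordingClient()
fake_bulk(without_save, (indexing_without_save(i) for i in range(3)))

print(len(with_save.writes), len(without_save.writes))  # prints: 6 3
```

With the save call left in, each of the three documents is written twice; without it, the bulk pass is the only write.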

You are also assigning a PostUser model instance to owner_user_id. The Nested field expects serialized data (the traceback ends in hit.copy(), which a model instance does not have), so serialize the owner by calling its indexing method (with the save removed from it):

    def indexing(self):
        obj = PostsIndex(
            meta = {'id': self.id, 'index': 'newuserposts-index'},
            user_post_id = self.user_post_id,
            score = self.score,
            owner_user_id = self.owner_user_id.indexing(),
        )
        return obj.to_dict(include_meta=True)
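For reference, here is a rough, library-free sketch of the kind of action dict this approach aims to hand to bulk. It is not the answer's code: the dataclasses are hypothetical stand-ins for the Django models, to_source, to_action and build_actions are illustrative names, and the sample values are made up. It assumes the nested owner is sent as a plain dict of its fields, and uses the _index / _id / _source keys that elasticsearch.helpers.bulk accepts.

```python
from dataclasses import dataclass


@dataclass
class PostUser:
    """Stand-in for the Django PostUser model (hypothetical, no ORM)."""
    id: int
    user_id: str
    reputation: str

    def to_source(self):
        # Plain dict of the fields that PostUserIndex declares.
        return {"user_id": self.user_id, "reputation": self.reputation}


@dataclass
class Post:
    """Stand-in for the Django Posts model (hypothetical, no ORM)."""
    id: int
    user_post_id: str
    score: str
    owner: PostUser

    def to_action(self, index="newuserposts-index"):
        # One bulk action: metadata keys plus the document body.
        return {
            "_index": index,
            "_id": self.id,
            "_source": {
                "user_post_id": self.user_post_id,
                "score": self.score,
                # Nested owner as a dict, never the model instance itself.
                "owner_user_id": self.owner.to_source(),
            },
        }


def build_actions(posts):
    """Yield one action dict per post, lazily, for a bulk helper to consume."""
    for post in posts:
        yield post.to_action()


owner = PostUser(id=1, user_id="42", reputation="1500")
posts = [Post(id=7, user_post_id="1001", score="3", owner=owner)]
actions = list(build_actions(posts))
print(actions[0])
```

Because the nested value is an ordinary dict, it has the copy method that the traceback shows being called on it during deserialization.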
