Unable to create dataframe from output obtained

Problem description

I am implementing emotion analysis using an LSTM. I have already trained my model, and the prediction step works correctly. Now I want to put the output I obtained into a dataframe. I tried, but I am getting wrong results. Can someone please help me correct my code?

I am posting my code along with the output I obtained and how I want my output to look.

Here is my code:

with open('output1.json', 'w') as f:
    json.dump(new_data, f)

selection1 = new_data['selection1']

for item in selection1:
    name = item['name']
    print ('>>>>>>>>>>>>>>>>>> ', name)
    Date = item['reviews']
    for d in Date:
        date = d['date']
        print('>>>>>>>>>>>>>>>>>> ', date)
    CommentID = item['reviews']
    for com in CommentID:
        comment = com['review'].lower()  # converting all to lowercase
        result = re.sub(r'\d+', '', comment)  # remove numbers
        results = (result.translate(
        str.maketrans('', '', string.punctuation))).strip()  # remove punctuations and white spaces
        comments = remove_stopwords(results)
        print('>>>>>>',comments)

        # add the words in comments that are already present in the keys of the dictionary
        encoded_samples = [[word2id[word] for word in comments if word in word2id.keys()]]

        # Padding
        encoded_samples = keras.preprocessing.sequence.pad_sequences(encoded_samples, maxlen=max_words)

        # Make predictions
        label_probs, attentions = model_with_attentions.predict(encoded_samples)
        label_probs = {id2label[_id]: prob for (label, _id), prob in zip(label2id.items(), label_probs[0])}

        # Get word attentions using attention vector
        print(label_probs)

dataframe={'name': [name],'date': [date], 'comment': [comment], 'label':[label_probs]}
table = pd.DataFrame(dataframe, columns=['name','date', 'comment', 'label'])
print(table)

Below is the output I obtained:

                             name  ...                                              label
0  Oasis Villas by Evaco Holidays  ...  {'joy': 0.018415175, 'surprise': 4.6217923e-05...

[1 rows x 4 columns]

This is not correct.

The print statements above produce the following output:

>>>>>>>>>>>>>>>>>>  Heritage The Villas
>>>>>>>>>>>>>>>>>>  December 23, 2018
>>>>>>>>>>>>>>>>>>  January 10, 2018
>>>>>>>>>>>>>>>>>>  January 05, 2018
>>>>>>>>>>>>>>>>>>  July 23, 2015
>>>>>> ['booked', 'villa', 'valriche', 'mari', 'deal', 'nights', 'checkin', 'lengthy', 'almost', 'hours', 'requested', 'make', 'deposit', 'rs', 'credit', 'card', 'never', 'informed', 'upon', 'booking']
{'joy': 0.03916626, 'surprise': 8.855841e-05, 'love': 0.09760322, 'anger': 0.6667219, 'sadness': 0.0010696664, 'fear': 0.1953505}
>>>>>> ['lovely', 'place', 'recharge']
{'joy': 0.0032763705, 'surprise': 0.0022357441, 'love': 0.11014917, 'anger': 0.09073347, 'sadness': 0.7297514, 'fear': 0.063853815}
>>>>>> ['one', 'word', 'suoerb']
{'joy': 0.13245165, 'surprise': 0.00014895896, 'love': 0.3051644, 'anger': 0.35698283, 'sadness': 0.00021378326, 'fear': 0.20503832}
>>>>>> ['definitely', 'star', 'extremely', 'poor', 'staff', 'service']
{'joy': 0.031011488, 'surprise': 9.065295e-05, 'love': 0.4330521, 'anger': 0.30516183, 'sadness': 0.000128366, 'fear': 0.23055555}
>>>>>>>>>>>>>>>>>>  Oasis Villas by Evaco Holidays
>>>>>>>>>>>>>>>>>>  January 12, 2020
>>>>>>>>>>>>>>>>>>  June 21, 2019
>>>>>>>>>>>>>>>>>>  May 30, 2017
>>>>>>>>>>>>>>>>>>  December 06, 2015
>>>>>> ['excellent']
{'joy': 0.030443083, 'surprise': 1.9940982e-05, 'love': 0.036508515, 'anger': 0.8760464, 'sadness': 0.0014704008, 'fear': 0.055511605}
>>>>>> ['spent', 'days', 'family', 'really', 'enjoyed', 'stay', 'advantage', 'oasis', 'privacy', 'children', 'years', 'going', 'dinnerbreakfast', 'hotels', 'often', 'burden', 'rather', 'enjoyable', 'experience', 'children', 'could', 'dinnermessnoise', 'without', 'us', 'worry', 'anything', 'pool', 'right', 'front', 'door', 'made', 'everything', 'children', 'staff', 'friendly', 'welcoming', 'artee', 'menni', 'made', 'sure', 'everything', 'fine', 'brought', 'breakfast', 'warm', 'croissants', 'every', 'morning', 'atish', 'made', 'checkin', 'arrangements', 'fast', 'hassle', 'free', 'definitely', 'go']
{'joy': 0.017099116, 'surprise': 7.2406554e-05, 'love': 0.2651248, 'anger': 0.14370358, 'sadness': 5.6088167e-05, 'fear': 0.573944}
>>>>>> ['passé', 'un', 'excellent', 'séjours', 'les', 'villas', 'oasis', 'sont', 'de', 'loin', 'les', 'meilleur', 'villas', 'du', 'groupe', 'evaco']
{'joy': 0.032395113, 'surprise': 9.250247e-05, 'love': 0.08593403, 'anger': 0.6815374, 'sadness': 0.0015245328, 'fear': 0.1985165}

I want my dataframe to look like this:

name                            date               comment                             label
Heritage The Villas             December 23, 2018  ['booked', 'villa', 'valriche'...]  {'joy': 0.03916626, 'surprise': 8.855841e-05, 'love': 0.09760322, 'anger': 0.6667219, 'sadness': 0.0010696664, 'fear': 0.1953505}
Heritage The Villas             January 10, 2018   ['lovely', 'place', 'recharge']     {'joy': 0.0032763705, 'surprise': 0.0022357441, 'love': 0.11014917, 'anger': 0.09073347, 'sadness': 0.7297514, 'fear': 0.063853815}
.....
Oasis Villas by Evaco Holidays  January 12, 2020   ['excellent']                       {'joy': 0.030443083, 'surprise': 1.9940982e-05, 'love': 0.036508515, 'anger': 0.8760464, 'sadness': 0.0014704008, 'fear': 0.055511605}
Oasis Villas by Evaco Holidays  June 21, 2019      ['spent', 'days', 'family'....]     {'joy': 0.017099116, 'surprise': 7.2406554e-05, 'love': 0.2651248, 'anger': 0.14370358, 'sadness': 5.6088167e-05, 'fear': 0.573944}
.....

Can you help me?

Recommended answer

The problem is in this line:

dataframe={'name': [name],'date': [date], 'comment': [comment], 'label':[label_probs]}
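
To illustrate why this line gives a single row, here is a minimal sketch with made-up dates standing in for the real reviews (the list `review_dates` is an assumption for illustration only). Once a loop has finished, its loop variable holds just the value from the last iteration, so wrapping it in a one-element list produces a one-row frame:

```python
import pandas as pd

# Hypothetical stand-in for the review dates printed in the question.
review_dates = ["December 23, 2018", "January 10, 2018", "January 05, 2018"]

for date in review_dates:
    pass  # the real code preprocesses, predicts and prints here

# After the loop, `date` is only the last value, so this builds one row.
last_only = pd.DataFrame({"date": [date]})

# Appending inside the loop keeps every value instead.
dates = []
for date in review_dates:
    dates.append(date)
all_rows = pd.DataFrame({"date": dates})

print(len(last_only), len(all_rows))
```

The same reasoning applies to `name`, `comment` and `label_probs` in the question's code.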

Create empty lists for name, date, comment, and label_probs, append each value to its list inside the loop, and then pass those lists to your DataFrame:

selection1 = new_data['selection1']
names = []
dates = []
comments = []
labels = []

with open('output1.json', 'w') as f:
    json.dump(new_data, f)

selection1 = new_data['selection1']

for item in selection1:
    name = item['name']
    names.append(name) #<-----------------
.
.
.
    Date = item['reviews']
    for d in Date:
        date = d['date']
        dates.append(date) #<--------------
        print('>>>>>>>>>>>>>>>>>> ', date)

.
.
.
.

Then:

dataframe={'name': names,'date': dates,........}
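
One caveat, shown here as a small sketch with made-up values: when a dict of lists is passed to `pd.DataFrame`, the lists need to have the same length. In the loop above, `names` receives one entry per property while `dates` receives one entry per review, so the lengths can differ and pandas raises a `ValueError`:

```python
import pandas as pd

names = ["Heritage The Villas"]                     # one entry per property
dates = ["December 23, 2018", "January 10, 2018"]   # one entry per review

try:
    pd.DataFrame({"name": names, "date": dates})
    mismatch_error = None
except ValueError as exc:
    # pandas refuses columns of different lengths
    mismatch_error = exc

print(type(mismatch_error).__name__)
```

Appending the name once per review, inside the inner loop, keeps the lists aligned.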

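Updated answer (append the name inside the review loop so that names and dates end up with the same length):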

    for item in selection1:
        name = item['name']
        Date = item['reviews']
        for d in Date:
            names.append(name) #<-----------------
            date = d['date']
            dates.append(date) #<--------------
.
.
.

This should solve your problem.
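
For reference, the whole idea can be sketched end to end as follows. This is only an illustration under stated assumptions: `new_data` below is made-up data shaped like the question's JSON, the `.lower().split()` call is a simplified stand-in for the question's cleaning steps, and `predict_label_probs` is a hypothetical placeholder for the padding and `model_with_attentions.predict` step, returning dummy values. It uses a single loop over the reviews, so the date, comment and label of each review stay paired in one row:

```python
import pandas as pd

# Made-up data shaped like `new_data` in the question.
new_data = {
    "selection1": [
        {"name": "Heritage The Villas",
         "reviews": [
             {"date": "December 23, 2018", "review": "Booked a villa"},
             {"date": "January 10, 2018", "review": "Lovely place to recharge"},
         ]},
        {"name": "Oasis Villas by Evaco Holidays",
         "reviews": [
             {"date": "January 12, 2020", "review": "Excellent"},
         ]},
    ]
}


def predict_label_probs(tokens):
    """Hypothetical placeholder for the encoding + model prediction step."""
    return {"joy": 0.0, "anger": 0.0}  # dummy values, not real predictions


rows = []
for item in new_data["selection1"]:
    for review in item["reviews"]:  # one pass keeps date and text together
        tokens = review["review"].lower().split()  # simplified cleaning
        rows.append({
            "name": item["name"],        # repeated once per review
            "date": review["date"],
            "comment": tokens,
            "label": predict_label_probs(tokens),
        })

# A list of dicts becomes one row per dict.
table = pd.DataFrame(rows, columns=["name", "date", "comment", "label"])
print(table)
```

Collecting one dict per review is an alternative to keeping four parallel lists; either approach works as long as every review contributes exactly one value to each column.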
