How to fix this "TypeError: sequence item 0: expected str instance, float found"


Question

I am trying to combine the cell values (strings) in a dataframe column using the groupby method, separating the cell values in each grouped cell with commas. I ran into the following error:
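
For illustration, this kind of join works as expected on a small hand-made dataframe whose 'Neighbourhood' column contains only strings (the data below is purely illustrative, not from the real table):

import pandas as pd

# Hypothetical data: every Neighbourhood value is a string, no NaN
df = pd.DataFrame({
    'Postcode': ['M1A', 'M1A', 'M2B'],
    'Borough': ['North York', 'North York', 'Downtown'],
    'Neighbourhood': ['Parkwoods', 'Victoria Village', 'Harbourfront']
})

# One comma-joined string per (Postcode, Borough) group
print(df.groupby(['Postcode', 'Borough'])['Neighbourhood'].agg(lambda x: ','.join(x)))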

TypeError: sequence item 0: expected str instance, float found

The error occurs on the following line of code; see the code block below for the complete code:

toronto_df['Neighbourhood'] = toronto_df.groupby(['Postcode','Borough'])['Neighbourhood'].agg(lambda x: ','.join(x))

It seems that in the groupby operation, the index corresponding to each row of the un-grouped dataframe is automatically added to the strings before they are joined, and this causes the TypeError. However, I have no idea how to fix the issue. I browsed a lot of threads but didn't find a solution. I would appreciate any guidance or assistance!

# Import Necessary Libraries

import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
import requests

# Use BeautifulSoup to scrape the table from the Wikipedia page, and set up a dataframe containing all the information in the table

wiki_html = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(wiki_html, 'lxml')
# print(soup.prettify())
table = soup.find('table', class_='wikitable sortable')
table_columns = []
for th_txt in table.tbody.findAll('th'):
    table_columns.append(th_txt.text.rstrip('\n'))

toronto_df = pd.DataFrame(columns=table_columns) 

for row in table.tbody.findAll('tr')[1:]:
    row_data = []
    for td_txt in row.findAll('td'):
        row_data.append(td_txt.text.rstrip('\n'))
    toronto_df = toronto_df.append({table_columns[0]: row_data[0],
                                    table_columns[1]: row_data[1],
                                    table_columns[2]: row_data[2]}, ignore_index=True)
toronto_df.head()

# Remove cells with a borough that is Not assigned
toronto_df.replace('Not assigned',np.nan, inplace=True)
toronto_df = toronto_df[toronto_df['Borough'].notnull()]
toronto_df.reset_index(drop=True, inplace=True)
toronto_df.head()

# If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough
toronto_df['Neighbourhood'] = toronto_df.groupby(['Postcode','Borough'])['Neighbourhood'].agg(lambda x: ','.join(x))
toronto_df.drop_duplicates(inplace=True)
toronto_df.head()

The expected result in the 'Neighbourhood' column should separate the cell values of each grouped cell with commas, showing something like this (I cannot post images yet, so I just provide the link):

https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/7JXaz3NNEeiMwApe4i-fLg_40e690ae0e927abda2d4bde7d94ed133_Screen-Shot-2018-06-18-at-7.17.57-PM.png?expiry=1557273600000&hmac=936wN3okNJ1UTDA6rOpQqwELESvqgScu08_Spai0aQQ

Answer

As mentioned in the comments, NaN is a float, so trying to do string operations on it doesn't work (and this is the reason for the error message).
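
A minimal sketch that reproduces the message (the values are made up, just to show the failure mode):

import numpy as np

# str.join only accepts string items; a float NaN in the sequence raises the error
','.join([np.nan, 'Parkwoods'])
# TypeError: sequence item 0: expected str instance, float found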

Replace the last part of your code with the following. The NaN values are filled first, using np.where with a boolean mask, according to the logic you specified in your comment:

# If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough
toronto_df.Neighbourhood = np.where(toronto_df.Neighbourhood.isnull(),toronto_df.Borough,toronto_df.Neighbourhood)
toronto_df['Neighbourhood'] = toronto_df.groupby(['Postcode','Borough'])['Neighbourhood'].agg(lambda x: ','.join(x))
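
An equivalent way to express the fill step, as a sketch, is pandas' fillna, which aligns on the index and avoids building the mask by hand:

# If a cell has a borough but a Not assigned neighbourhood, use the borough name
toronto_df['Neighbourhood'] = toronto_df['Neighbourhood'].fillna(toronto_df['Borough'])
toronto_df['Neighbourhood'] = toronto_df.groupby(['Postcode','Borough'])['Neighbourhood'].agg(lambda x: ','.join(x))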
