pandas Reindexing only valid with uniquely valued Index objects

Views: 125 Published: 2020/5/24 1:03:44 Tags: python pandas


Problem description

I installed the latest version of pandas, 0.9.0, in case the error was version-related. (I forgot to mention that this is Python 2.7.) I am trying to read an Excel file, and that part seems OK. Originally I was calling iteritems() for each row of the pandas DataFrame, because id_company has to be verified against a MySQL database (code not included). That produced the same or a similar error message as putting the rows into tuples (code is below). The error message follows.

Note that the code contains a .reindex() call, but it did not work before I added that, either. The reindex() was kind of a hail mary.

As a workaround, I will probably just import from my target SQL database and do a join. I am concerned about that approach because of the size of the datasets.
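The join-based workaround mentioned above might look roughly like the sketch below. Everything in it except the column name id_company is an assumption made for illustration: the two small frames stand in for the parsed Excel sheet and for the ids fetched from the MySQL table, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the parsed Excel sheet.
source = pd.DataFrame({
    "id_company": [1, 2, 3, 4],
    "company_name": ["Acme", "Beta", "Gamma", "Delta"],
})

# Hypothetical stand-in for the ids fetched from the MySQL table
# (in practice this frame would be loaded from the database).
known = pd.DataFrame({"id_company": [1, 3, 5]})

# Left join; the indicator column records whether each id was found.
checked = source.merge(known, on="id_company", how="left", indicator=True)

# Rows whose id exists in the database, and the ids that do not.
verified = checked[checked["_merge"] == "both"].drop(columns="_merge")
missing_ids = checked.loc[checked["_merge"] == "left_only", "id_company"].tolist()
```

Because the join runs once over whole columns, it avoids a database lookup per row, which may matter given the dataset sizes mentioned above.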

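import pandas as pd

def runNow():
    # identify sheet (raw string so the backslashes in the Windows path stay literal)
    source = r'C:\Users\jlalonde\Desktop\startup_geno\startupgenome_w_id_xl_20121109.xlsx'
    xls_file = pd.ExcelFile(source)
    sd = xls_file.parse('Sheet1')
    # keep the first row for each id_company (keyword names as used with pandas 0.9.0)
    source_u = sd.drop_duplicates(cols = 'id_company', take_last=False)
    # select the columns of interest; note that 'description' is listed twice
    source_r = source_u[['id_company','id_good','description', 'website','keyword', 'company_name','founded_month', 'founded_year', 'description']]
    source_i = source_r.reindex() #hail mary
    # convert each row to a tuple
    tup_r = [tuple(x) for x in source_i.values]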

Here is the error:

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    sg_sql_2.runNow()
  File "sg_sql_2.py", line 31, in runNow
    tup_r = [tuple(x) for x in source_r.values]
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1443, in as_matrix
    return self._data.as_matrix(columns).T
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 723, in as_matrix
    mat = self._interleave(self.items)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 743, in _interleave
    indexer = items.get_indexer(block.items)
  File "C:\Python27\lib\site-packages\pandas\core\index.py", line 748, in get_indexer
    raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects

So, after hammering my head against the wall on this for the better part of the day, can anyone tell me whether this is a bug or whether I am missing something really obvious?
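One plausible reading, offered as an assumption and not confirmed anywhere in the source: the traceback fails in get_indexer on the frame's items, which are its column labels, and the column list in the question names 'description' twice, so the selected frame has non-unique column labels. The sketch below uses a made-up three-row frame (only the column names come from the question) and current pandas keyword names to show how the duplicate label arises and one way to avoid it.

```python
import pandas as pd

# Hypothetical miniature of the spreadsheet; only the column names come from the question.
sd = pd.DataFrame({
    "id_company": [10, 10, 20],
    "company_name": ["Acme", "Acme", "Beta"],
    "description": ["tools", "tools", "apps"],
})

# Current pandas spells the question's cols=/take_last= keywords as subset=/keep=.
source_u = sd.drop_duplicates(subset="id_company", keep="first")

# Listing 'description' twice, as the question does, duplicates the column label.
wanted = ["id_company", "description", "company_name", "description"]
source_r = source_u[wanted]
has_unique_columns = source_r.columns.is_unique

# Drop the repeated name while preserving order, then select again.
unique_wanted = list(dict.fromkeys(wanted))
source_ok = source_u[unique_wanted]

# One plain tuple per row, without the index.
tup_r = list(source_ok.itertuples(index=False, name=None))
```

Checking columns.is_unique right after a column selection is a cheap way to catch this kind of repeated name before any later step trips over it.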
