Adding values to existing columns in pandas

Views: 76 Posted: 2020/5/24 3:07:03 python python-2.7 pandas


Problem description

I loop over the CSV files in a directory and read them with pandas. Each CSV file has a category and a marketplace. I then need to get the category id and the marketplace id from the database; these ids are valid for the whole CSV file.

finalDf is a dataframe containing all the products from all the CSV files, and I need to append the data from the current CSV to it.

The list of products in the current CSV is retrieved using:

df['PRODUCT']

I need to append them to finalDf, and I used:

finalDf['PRODUCT'] =  finalDf['PRODUCT'].append(df['PRODUCT'],ignore_index=True)

This seems to work fine, and I now have to insert catid and marketid into the corresponding columns of finalDf. Because catid and marketid are consistent across the current CSV file, I just need to add them as many times as there are rows in the df dataframe. This is what I'm trying to accomplish in the code below.

finalDf = pd.DataFrame(columns=['PRODUCT', 'CAT_ID', 'MARKET_ID'])
finalDf['PRODUCT'] = finalDf.PRODUCT.astype('category')

df = pd.read_csv(filename, header=None,
                             names=['PRODUCT', 'URL_PRODUCT', 'RANK', 'URL_IMAGE', 'STARS', 'PRICE', 'NAME', 'SNAPDATE',
                                    'CATEGORY', 'MARKETPLACE', 'PARENTCAT', 'LISTTYPE', 'VERSION', 'LEVEL'], sep='\t')

finalDf['PRODUCT'] = finalDf['PRODUCT'].append(df['PRODUCT'],ignore_index=True)
# Here I have a single value to add n times, n corresponding to the number of rows in the dataframe df
catid = 2113
marketid = 13
catids = pd.Series([catid]*len(df.index))
marketids = pd.Series([marketid]*len(df.index))
finalDf['CAT_ID'] = finalDf['CAT_ID'].append(catids, ignore_index=True)
finalDf['MARKET_ID'] = finalDf['MARKET_ID'].append(marketids, ignore_index=True)

print finalDf.head()

        PRODUCT  CAT_ID  MARKET_ID
    0    ABC       NaN    NaN
    1    ABB       NaN    NaN
    2    ABE       NaN    NaN
    3    DCB       NaN    NaN
    4    EFT       NaN    NaN

As you can see, I just have NaN values instead of the actual values. Expected output:

        PRODUCT  CAT_ID  MARKET_ID
    0    ABC       2113    13
    1    ABB       2113    13
    2    ABE       2113    13
    3    DCB       2113    13
    4    EFT       2113    13
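
A likely explanation for the NaN columns, offered as a reading of the code above rather than a confirmed diagnosis: assigning a Series to a DataFrame column aligns on the index. Once PRODUCT has been assigned, finalDf already has one row per product, with CAT_ID and MARKET_ID filled with NaN. Appending catids to that all-NaN column puts the real values after the NaN ones, and the assignment back keeps only the index labels that finalDf already has, which are the NaN rows. One way to sidestep this is to build the three columns for the current CSV in a single step. The sketch below assumes Python 3 and a current pandas; the small df is a stand-in for the frame returned by read_csv, and the name chunk is only illustrative.

```python
import pandas as pd

# Stand-in for the frame read from one CSV file
# (product codes taken from the example output above).
df = pd.DataFrame({'PRODUCT': ['ABC', 'ABB', 'ABE', 'DCB', 'EFT']})

catid = 2113
marketid = 13

# Double brackets keep a DataFrame; assign() broadcasts each scalar
# to every row, so no helper Series of repeated values is needed.
chunk = df[['PRODUCT']].assign(CAT_ID=catid, MARKET_ID=marketid)

print(chunk)
```

Because the scalars are broadcast against rows that already exist, there is no index alignment step that could drop them.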

A finalDf containing several CSV files would look like:

        PRODUCT  CAT_ID  MARKET_ID
    0    ABC       2113    13
    1    ABB       2113    13
    2    ABE       2113    13
    3    DCB       2113    13
    4    EFT       2113    13
    5    SDD       2114    13
    6    ERT       2114    13
    7    GHJ       2114    13
    8    MOD       2114    13
    9    GTR       2114    13
   10    WLY       2114    13
   11    WLO       2115    13
   12    KOP       2115    13

Any ideas?

Thanks
