Pandas error trying to convert string into integer

Views: 70  Published: 2020/9/29 23:11:10  Tags: python string pandas casting int


Problem description

Requirement:

One particular column in a DataFrame is of mixed type. It can hold values like "123456" or "ABC12345".

This DataFrame is written to an Excel file using xlsxwriter.

For values like "123456", pandas converts them to 123456.0 somewhere down the line, which makes them look like floats.

When a value is fully numeric, we need to write it to the xlsx file as 123456, i.e. as a positive integer.
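One plausible source of the trailing `.0`, stated as an assumption because the question does not show how the data is loaded: if the numeric entries are stored as floats (for example because the column had missing values when it was read), converting them to text keeps the decimal part. The sample values below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: numeric IDs stored as floats next to alphanumeric IDs.
alt_id = pd.Series([123456.0, "ABC12345", 5337.0], name="AltID")

# The column is held as generic Python objects, not as integers.
print(alt_id.dtype)            # object

# Converting each value to text keeps the float's decimal part.
as_text = alt_id.astype(str).tolist()
print(as_text)                 # ['123456.0', 'ABC12345', '5337.0']

# A numeric column with a missing value is stored as float64 by default.
with_gap = pd.Series([123456, None])
print(with_gap.dtype)          # float64

# str.isdigit() is False for "123456.0" because of the decimal point.
float_text_is_digit = str(123456.0).isdigit()
print(float_text_is_digit)     # False
```

The last check matters for any fix that relies on `str(x).isdigit()`: a value that is already a float will not pass it.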

Effort:

The code snippet is shown below.

import pandas as pd
import numpy as np
import xlsxwriter
import os
import datetime
import sys
excel_name = input("Please Enter Spreadsheet Name :\n").strip()

print("excel entered :   ", excel_name)
df_header = ['DisplayName','StoreLanguage','Territory','WorkType','EntryType','TitleInternalAlias',
         'TitleDisplayUnlimited','LocalizationType','LicenseType','LicenseRightsDescription',
         'FormatProfile','Start','End','PriceType','PriceValue','SRP','Description',
         'OtherTerms','OtherInstructions','ContentID','ProductID','EncodeID','AvailID',
         'Metadata', 'AltID', 'SuppressionLiftDate','SpecialPreOrderFulfillDate','ReleaseYear','ReleaseHistoryOriginal','ReleaseHistoryPhysicalHV',
          'ExceptionFlag','RatingSystem','RatingValue','RatingReason','RentalDuration','WatchDuration','CaptionIncluded','CaptionExemption','Any','ContractID',
          'ServiceProvider','TotalRunTime','HoldbackLanguage','HoldbackExclusionLanguage']
# NOTE: df_m_d is not defined in the snippet shown here.
first_pass_drop_duplicate = df_m_d.drop_duplicates(['StoreLanguage','Territory','TitleInternalAlias','LocalizationType','LicenseType',
                                   'LicenseRightsDescription','FormatProfile','Start','End','PriceType','PriceValue','ContentID','ProductID',
                                   'AltID','ReleaseHistoryPhysicalHV','RatingSystem','RatingValue','CaptionIncluded'], keep=False)
# We need to keep integer AltID  as is

first_pass_drop_duplicate.loc[first_pass_drop_duplicate['AltID']] =   first_pass_drop_duplicate['AltID'].apply(lambda x : str(int(x)) if str(x).isdigit() == True else x)
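The last line above passes the values of `AltID` to `.loc`, which treats them as row labels. A possible correction, offered as a sketch rather than a confirmed fix, is to assign to the `AltID` column by name and to treat whole-number floats as numeric too, since `str(102711.0).isdigit()` is `False`. The helper name `to_int_text` and the small sample frame are hypothetical stand-ins for the real `first_pass_drop_duplicate`.

```python
import pandas as pd

def to_int_text(value):
    """Return whole numbers as integer text; leave everything else unchanged."""
    # Whole-number floats such as 102711.0 become "102711".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    # Digit-only strings and plain ints are normalised the same way.
    if isinstance(value, int) or (isinstance(value, str) and value.isdigit()):
        return str(int(value))
    # Alphanumeric IDs, NaN and other values pass through untouched.
    return value

# Hypothetical stand-in for first_pass_drop_duplicate.
frame = pd.DataFrame({"AltID": [102711.0, "ABC12345", "5337", 2124.0]})

# Select the column by name; do not use its values as row labels.
frame["AltID"] = frame["AltID"].apply(to_int_text)

result = frame["AltID"].tolist()
print(result)  # ['102711', 'ABC12345', '5337', '2124']
```

Like the original lambda, `str(int(...))` drops leading zeros, so an ID such as "00123" would become "123". If the Excel cells should hold numbers rather than text, the helper could return `int(value)` instead of `str(int(value))`.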

I tried:

1. Using `dataframe.astype(int).astype(str)`: works as long as the value is not alphanumeric.
2. Importing `re` and using pure Python `re.compile()` and `replace()`: does not work.
3. Reading the DataFrame row by row in a for loop: this kills the machine, as the DataFrame can have 300k+ records.
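For a frame with 300k+ rows, a column-wise approach avoids the explicit Python loop, and unlike `astype(int)` it does not fail on alphanumeric values. The sketch below uses `pd.to_numeric` with `errors="coerce"` to find the entries that parse as numbers and rewrites only those. The sample data and variable names are illustrative assumptions, not the asker's real data.

```python
import pandas as pd

# Hypothetical sample; None stands for a missing AltID.
alt_id = pd.Series([102711.0, "ABC12345", "5337", None, 12.5], dtype=object)

# Values that cannot be parsed as numbers become NaN instead of raising.
numeric = pd.to_numeric(alt_id, errors="coerce")

# Keep only entries that are numeric and have no fractional part.
is_whole = numeric.notna() & (numeric % 1 == 0)

# Rewrite the whole-number entries as integer text; leave the rest alone.
cleaned = alt_id.copy()
cleaned.loc[is_whole] = numeric[is_whole].astype("int64").astype(str)

result = cleaned.tolist()
print(result)
```

Because the values pass through float64, IDs larger than 2**53 could lose precision; for such data a string-based check would be safer.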

Each time, the error I get is:

raise KeyError('%s not in index' % objarr[mask])
KeyError: '[ 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 102711. 102711. 102711. 102711. 102711. 102711. 102711. 102711.\n 5337. 5337. 5337. 5337. 5337. 5337. 5337. 5337.\n 5337. 5337. 5337. 5337. 5337. 5337. 5337. 5337.\n 5337. 5337. 5337. 5337. 5337. 5337. 5337. 5337.\n 5337. 5337. 5337. 5337. 5337. 5337. 5337. 5337.\n 5337. 5337. 5337. 5337. 5337. 5337. 5337. 5337.\n 5337. 5337. 2124. 2124. 2124. 2124. 2124. 2124.\n 2124. 2124. 6643. 6643. 6643. 6643. 6643. 6643.\n 6643. 6643. 6643. 6643. 6643. 6643. 6643. 6643.\n 6643. 6643. 6643. 6643. 6643. 6643. 6643. 6643.\n 6643. 6643. 6643. 6643. 6643. 6643. 6643. 6643.] not in index'
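The traceback is consistent with the `AltID` values themselves being used as row labels: an expression of the form `frame.loc[frame['AltID']]` asks pandas for rows whose index labels are 102711.0, 5337.0 and so on, and no such labels exist. A minimal reproduction with a made-up three-row frame follows; the exact wording of the message depends on the pandas version.

```python
import pandas as pd

# Hypothetical frame with the default 0..2 row index.
frame = pd.DataFrame({"AltID": [102711.0, 5337.0, 2124.0]})

try:
    # The column's values are interpreted as row labels, not as a column selector.
    frame.loc[frame["AltID"]]
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

# Selecting the column by name works.
column = frame.loc[:, "AltID"]
print(column.tolist())  # [102711.0, 5337.0, 2124.0]
```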

I am a newbie in Python/pandas; any help or solution is much appreciated.
