Convert a Pandas Series in Accounting Format to a Numeric Series?
Question
An accounting format for numeric values usually uses a currency character and often uses parentheses to represent negative values. Zero may also be represented as - or $-. When such a series is imported into a pandas DataFrame, it has the object dtype. I need to convert it to a numeric type and parse the negative values correctly.
Here is an example:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
df = pd.DataFrame({'A': ['123.4', '234.5', '345.5', '456.7'],
                   'B': ['$123.4', '$234.5', '$345.5', '$456.7'],
                   'C': ['($123.4)', '$234.5', '($345.5)', '$456.7'],
                   'D': ['$123.4', '($234.5)', '$-', '$456.7']})
Series A is easy to convert, e.g.
df['A'] = df['A'].astype(float)
Series B requires removing the $ sign, after which the conversion is straightforward.
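As a minimal sketch of that removal, assuming every value in the series is a string with at most a leading $ (the names `b` and `b_numeric` are only illustrative):

```python
import pandas as pd

# Same values as column B in the example above.
b = pd.Series(['$123.4', '$234.5', '$345.5', '$456.7'])

# regex=False treats '$' as a literal character, not as a regex end-of-string anchor.
b_numeric = b.str.replace('$', '', regex=False).astype(float)
```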
Then come series C and D. They contain values in parentheses (i.e. negative values), and D also contains $- for zero. How can I correctly parse these series into a numeric series / DataFrame?
Recommended answer
I'd use the pandas replace function to replace $ and ) with nothing, replace any value that is then just - (the zero placeholder) with 0, and finally replace ( with -. Then you can do df = df.astype(float) and it should work.
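The steps above could be sketched roughly as follows. This is one possible reading of the answer, not a definitive implementation: `accounting_to_float` is a hypothetical helper name, and the sketch assumes every cell is a string, the only currency symbol is $, and there are no thousands separators.

```python
import pandas as pd

df = pd.DataFrame({'A': ['123.4', '234.5', '345.5', '456.7'],
                   'B': ['$123.4', '$234.5', '$345.5', '$456.7'],
                   'C': ['($123.4)', '$234.5', '($345.5)', '$456.7'],
                   'D': ['$123.4', '($234.5)', '$-', '$456.7']})


def accounting_to_float(series):
    """Hypothetical helper: convert accounting-style strings to floats."""
    # Step 1: drop the currency sign and the closing parenthesis.
    cleaned = (series.str.replace('$', '', regex=False)
                     .str.replace(')', '', regex=False)
                     .str.strip())
    # Step 2: a value that is now exactly '-' stands for zero.
    # Series.replace matches whole values here, so negative signs are untouched.
    cleaned = cleaned.replace('-', '0')
    # Step 3: an opening parenthesis marks a negative number.
    cleaned = cleaned.str.replace('(', '-', regex=False)
    return cleaned.astype(float)


# Apply the helper column by column; every column becomes float64.
numeric = df.apply(accounting_to_float)
```

Using `regex=False` keeps `$`, `(` and `)` from being interpreted as regular-expression metacharacters, and replacing the lone dash with a whole-value match avoids turning the minus sign of a real negative number into a zero.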