Python pandas: Create a new column with values in English by translating values stored in a different column in Traditional Chinese


Question

I have a column "City_trad_chinese" in a pandas DataFrame "df" that contains values in Traditional Chinese. I need to create another column, "City_English", that contains the English translations of those values.

How can I do this with Python? I tried the following:

#importing required libraries
import pandas as pd 

from os import path

from googletrans import Translator

#setting path to data
path2data = 'C:/Users/data'

# data import
df = pd.read_excel(path.join(path2data, 'data.xlsx'), converters={'City_trad_chinese':str})


translator = Translator()

df['City_English'] = df['City_trad_chinese'].map(lambda x: translator.translate(x, src="zh-TW", dest="en").text)

But it gives me an error:

raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting value

Recommended answer

You can use the googletrans library:

import pandas as pd
from googletrans import Translator

d = {"City_trad_chinese":["香港特别行政区",
                          "澳门特别行政区",
                          "北京市",
                          "上海市"]}
df = pd.DataFrame(data=d)

translator = Translator()

df["City_English"] = df["City_trad_chinese"].map(lambda x: translator.translate(x, src="zh-TW", dest="en").text)

print(df["City_English"])

0    Hong Kong Special Administrative Region
1        Macao Special Administrative Region
2                               Beijing City
3                              Shanghai City
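
City names usually repeat across rows, so one way to reduce the number of requests is to translate each distinct value once and map the results back onto the column. The following is a minimal sketch under assumptions: `translate_column` and `fake_translate` are hypothetical names, and the stub stands in for a real call such as `translator.translate(x, src="zh-TW", dest="en").text` so the example runs without network access.

```python
import pandas as pd


def translate_column(series, translate_fn):
    """Translate each distinct value once, then map the results back."""
    unique_values = series.dropna().unique()
    lookup = {value: translate_fn(value) for value in unique_values}
    # Values missing from the lookup (e.g. NaN) stay NaN after map()
    return series.map(lookup)


# Stand-in translator (hypothetical data) so the sketch runs offline
_FAKE_TRANSLATIONS = {"北京市": "Beijing City", "上海市": "Shanghai City"}
calls = []


def fake_translate(text):
    calls.append(text)  # record every "request" that is made
    return _FAKE_TRANSLATIONS[text]


df = pd.DataFrame({"City_trad_chinese": ["北京市", "上海市", "北京市", "北京市"]})
df["City_English"] = translate_column(df["City_trad_chinese"], fake_translate)

print(df)
print("translation calls made:", len(calls))
```

With googletrans, you would pass `lambda x: translator.translate(x, src="zh-TW", dest="en").text` as `translate_fn` instead of the stub.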


Note: Google Translate limits a single request to 15k characters. You can work around this by translating each row individually:

df["City_English"] = ""

for index, row in df.iterrows():
    translator = Translator()
    eng_text = translator.translate(row["City_trad_chinese"], src="zh-TW", dest="en").text
    # Write through df.at: `row` is a copy, so assigning to it would not change df
    df.at[index, "City_English"] = eng_text
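
In a row-by-row loop, a single failed request (such as the JSONDecodeError in the question) aborts the whole run. Below is a minimal sketch, assuming you would rather keep going and collect the failed rows for a later retry. `translate_rows` and `flaky_translate` are hypothetical names, the stub replaces the real googletrans call so the example runs offline, and catching `Exception` is deliberately broad for illustration only.

```python
import json

import pandas as pd


def translate_rows(df, source_col, target_col, translate_fn):
    """Translate row by row; return the index labels of rows that failed."""
    df[target_col] = None
    failed = []
    for index, text in df[source_col].items():
        try:
            df.at[index, target_col] = translate_fn(text)
        except Exception:  # broad on purpose in this sketch
            failed.append(index)
    return failed


# Stand-in translator (hypothetical data) that fails for one input
def flaky_translate(text):
    if text == "上海市":
        raise json.JSONDecodeError("Expecting value", "", 0)
    return {"北京市": "Beijing City", "天津市": "Tianjin City"}[text]


df = pd.DataFrame({"City_trad_chinese": ["北京市", "上海市", "天津市"]})
failed = translate_rows(df, "City_trad_chinese", "City_English", flaky_translate)

print(df)
print("rows to retry:", failed)
```

The returned list can be used to retry only the failed rows, for example after a pause.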
