Remove non-ASCII characters from a CSV file using Python

Problem description

I am trying to remove non-ASCII characters from a file. What I am actually doing is converting a text file that contains such characters (e.g. hello§‚å½¢æˆ äº†å¯¹æ¯"。 花å) into a CSV file.

However, I am unable to iterate through these characters, so I want to remove them (i.e., chop them off or replace them with a space). Here is the code (researched and gathered from various sources).

The problem with the code is that, after running the script, the csv/txt file has not been updated, which means the characters are still there. I have absolutely no idea how to go about this anymore. I have been researching it for a day :(

Thanks a lot for your help!

import csv

txt_file = r"xxx.txt"
csv_file = r"xxx.csv"

in_txt = csv.reader(open(txt_file, "rb"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'wb'))
for row in in_txt:
    for i in row:
        i = "".join([a if ord(a)<128 else''for a in i])

out_csv.writerows(in_txt)

Recommended answer

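Variable assignment is not magically transferred to the original source: `i = ...` only rebinds the name `i` inside the loop and leaves the row untouched. On top of that, by the time `out_csv.writerows(in_txt)` runs, the loop has already consumed the reader, so there are no rows left to write. You have to build up a new list of your changed rows: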

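import csv

txt_file = r"xxx.txt"
csv_file = r"xxx.csv"

# Python 2 style: the csv module there expects files opened in binary mode.
in_txt = csv.reader(open(txt_file, "rb"), delimiter='\t')
out_csv = csv.writer(open(csv_file, 'wb'))

# Collect the cleaned rows in a new list instead of rebinding the loop variable.
out_txt = []
for row in in_txt:
    out_txt.append([
        # Keep only characters whose code point is below 128 (plain ASCII).
        "".join(a if ord(a) < 128 else '' for a in i)
        for i in row
    ])

out_csv.writerows(out_txt)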
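
The "rb"/"wb" file modes above are the Python 2 idiom for the csv module. On Python 3 the same idea could look roughly like the sketch below. The function names `strip_non_ascii` and `convert_tsv_to_ascii_csv` are made up for illustration, and the input is assumed to be UTF-8 encoded (undecodable bytes are replaced on read and then stripped with the rest of the non-ASCII characters), so adjust the `encoding` argument to match the real file.

```python
import csv


def strip_non_ascii(text):
    """Return text with every character above code point 127 removed."""
    return "".join(ch for ch in text if ord(ch) < 128)


def convert_tsv_to_ascii_csv(txt_path, csv_path, encoding="utf-8"):
    """Read a tab-delimited file and write an ASCII-only CSV copy.

    Hypothetical helper; the utf-8 default is an assumption about the input.
    """
    # The csv module expects text-mode files opened with newline="" on Python 3.
    # errors="replace" turns undecodable bytes into U+FFFD, which is then
    # dropped by strip_non_ascii like any other non-ASCII character.
    with open(txt_path, newline="", encoding=encoding, errors="replace") as src, \
            open(csv_path, "w", newline="", encoding="ascii") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst)
        for row in reader:
            # Write each cleaned row immediately instead of collecting a list.
            writer.writerow([strip_non_ascii(cell) for cell in row])


if __name__ == "__main__":
    convert_tsv_to_ascii_csv("xxx.txt", "xxx.csv")
```

Writing each row as it is cleaned avoids holding the whole file in memory, and the `with` block closes both files so the output is flushed to disk when the script ends.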
