如何删除非ASCII字符但保留句点和空格? [英] How can I remove non-ASCII characters but leave periods and spaces?

查看：89 发布时间：2021/5/7 19:19:11 python text unicode filter ascii

本文介绍了如何删除非ASCII字符但保留句点和空格?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在处理.txt文件.我想要文件中没有非ASCII字符的文本字符串.但是，我想留空格和句点.目前，我也正在剥离它们.这是代码:

I'm working with a .txt file. I want a string of the text from the file with no non-ASCII characters. However, I want to leave spaces and periods. At present, I'm stripping those too. Here's the code:

def onlyascii(char):
    if ord(char) < 48 or ord(char) > 127: return ''
    else: return char

def get_my_string(file_path):
    f=open(file_path,'r')
    data=f.read()
    f.close()
    filtered_data=filter(onlyascii, data)
    filtered_data = filtered_data.lower()
    return filtered_data

如何修改onlyascii()以保留空格和句点?我想这并不太复杂，但我想不出来.

How should I modify onlyascii() to leave spaces and periods? I imagine it's not too complicated but I can't figure it out.

推荐答案

您可以使用

You can filter all characters from the string that are not printable using string.printable, like this:

>>> s = "some\x00string. with\x15 funny characters"
>>> import string
>>> printable = set(string.printable)
>>> filter(lambda x: x in printable, s)
'somestring. with funny characters'

我机器上的

string.printable包含:

string.printable on my machine contains:

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c

在Python 3上，过滤器将返回可迭代的.返回字符串的正确方法是:

On Python 3, filter will return an iterable. The correct way to obtain a string back would be:

''.join(filter(lambda x: x in printable, s))

这篇关于如何删除非ASCII字符但保留句点和空格?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何删除非ASCII字符但保留句点和空格? [英] How can I remove non-ASCII characters but leave periods and spaces?

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

如何删除非ASCII字符但保留句点和空格? [英] How can I remove non-ASCII characters but leave periods and spaces?

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭