SyntaxError: Non-ASCII character. Python

Problem description

Could somebody tell me which character is a non-ASCII character in the following:

Columns(str) – comma-seperated list of values. Works only if format is tab or xls. For UnitprotKB, some possible columns are: id, entry name, length, organism. Some column names must be followed by a database name (i.e. ‘database(PDB)’). Again see uniprot website for more details. See also _valid_columns for the full list of column keyword.

Essentially I am defining a class and trying to give it a comment to define how it works:

def test(self,uniprot_id):
    '''
    Same as the UniProt.search() method arguments:
    search(query, frmt='tab', columns=None, include=False, sort='score', compress=False, limit=None, offset=None, maxTrials=10)


    query (str) -- query must be a valid uniprot query. See http://www.uniprot.org/help/text-search, http://www.uniprot.org/help/query-fields See also example below
    frmt (str) -- a valid format amongst html, tab, xls, asta, gff, txt, xml, rdf, list, rss. If tab or xls, you can also provide the columns argument. (default is tab)
    include (bool) -- include isoform sequences when the frmt parameter is fasta. Include description when frmt is rdf.
    sort (str) -- by score by default. Set to None to bypass this behaviour
    compress (bool) -- gzip the results
    limit (int) -- Maximum number of results to retrieve.
    offset (int) -- Offset of the first result, typically used together with the limit parameter.
    maxTrials (int) -- this request is unstable, so we may want to try several time.
    Columns(str) -- comma-seperated list of values. Works only if format is tab or xls. For UnitprotKB, some possible columns are: id, entry name, length, organism. Some column names must be followed by a database name (i.e. ‘database(PDB)’). Again see uniprot website for more details. See also _valid_columns for the full list of column keyword. '

    '''        
    u = UniProt()
    uniprot_entry = u.search(uniprot_id)
    return uniprot_entry

Without line 52, i.e. the one beginning with 'Columns' in the quoted-out comment block, this works as expected, but as soon as I describe what 'columns' is I get the following error:

SyntaxError: Non-ASCII character '\xe2' in file /home/cw00137/Documents/Python/Identify_gene.py on line 52, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Does anybody know what is going on?

Solution

You are using 'fancy' curly quotes in that line:

>>> u'‘database(PDB)’'
u'\u2018database(PDB)\u2019'

That's a U+2018 LEFT SINGLE QUOTATION MARK at the start and U+2019 RIGHT SINGLE QUOTATION MARK at the end.

Use ASCII quotes (U+0027 APOSTROPHE or U+0022 QUOTATION MARK) or declare an encoding other than ASCII for your source.
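If you would rather keep the typographic characters, a PEP 263 declaration on the first or second line of the file tells Python 2 how to decode the source. Here is a minimal sketch, assuming the file really is saved as UTF-8. The function name describe_columns is made up for illustration. Python 3 already reads source files as UTF-8 by default, so there the declaration is optional.

```python
# -*- coding: utf-8 -*-
# The line above is the PEP 263 encoding declaration. It only takes effect on
# the first or second line of the file, and the file must be saved as UTF-8.


def describe_columns():
    """Some column names must be followed by a database name (i.e. ‘database(PDB)’)."""
    # With the declaration in place, the curly quotes in the docstring above
    # no longer trigger the "Non-ASCII character" SyntaxError under Python 2.
    return describe_columns.__doc__
```

The declaration does not convert anything. It only tells the interpreter which encoding the bytes on disk already use, so it has to match what your editor writes.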

You are also using a U+2013 EN DASH:

>>> u'Columns(str) –'
u'Columns(str) \u2013'

Replace that with a U+002D HYPHEN-MINUS.
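If there are many occurrences, the replacement can be done programmatically instead of by hand. This is a rough Python 3 sketch, assuming you only care about the three characters discussed here. The names ASCII_REPLACEMENTS and to_ascii_punctuation are illustrative and do not come from any library.

```python
# Map the three typographic characters from this question to ASCII equivalents.
ASCII_REPLACEMENTS = {
    0x2018: "'",  # LEFT SINGLE QUOTATION MARK
    0x2019: "'",  # RIGHT SINGLE QUOTATION MARK
    0x2013: "-",  # EN DASH
}


def to_ascii_punctuation(text):
    """Return text with the mapped characters replaced by ASCII ones."""
    # str.translate accepts a dict keyed by code point (Python 3).
    return text.translate(ASCII_REPLACEMENTS)


sample = "Columns(str) \u2013 (i.e. \u2018database(PDB)\u2019)"
print(to_ascii_punctuation(sample))
# prints: Columns(str) - (i.e. 'database(PDB)')
```

Characters that are not in the table pass through unchanged, so extend the mapping if your text contains other typographic punctuation such as curly double quotes.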

All three characters encode to UTF-8 with a leading E2 byte:

>>> u'\u2013 \u2018 \u2019'.encode('utf8')
'\xe2\x80\x93 \xe2\x80\x98 \xe2\x80\x99'

which you then see reflected in the SyntaxError exception message.
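To answer the "which character is it?" question for any piece of text, you can scan for code points above 127 and look up their Unicode names. This is a small Python 3 sketch using the standard unicodedata module. The helper name find_non_ascii and the sample string are assumptions made for illustration.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, code point, name) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, char in enumerate(line, start=1):
            if ord(char) > 127:
                # Fall back to "UNKNOWN" for characters without a Unicode name.
                name = unicodedata.name(char, "UNKNOWN")
                hits.append((line_no, col_no, "U+%04X" % ord(char), name))
    return hits


sample = "Columns(str) \u2013 list of values\n(i.e. \u2018database(PDB)\u2019)"
for hit in find_non_ascii(sample):
    print(hit)
```

To check a real source file, read it with open(path, encoding="utf-8") and pass the contents to the function. The reported line numbers then match the ones in the SyntaxError message.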

You may want to avoid using these characters in the first place. It could be that your OS is replacing these as you type, or you are using a word processor instead of a plain text editor to write your code and it is replacing these for you. You probably want to switch that feature off.
