Does pybtex support accented/special characters in .bib files?

Question

from pybtex.database.input import bibtex
parser = bibtex.Parser()
bibdata = parser.parse_file("sample.bib")

The snippet above parses a .bib file well, but it does not seem to support accented characters written in LaTeX notation, such as {\"u} or \"{u}. I would like to confirm whether pybtex supports them.

For example, according to "LaTeX/Special Characters" and "How to write 'ä' and other umlauts and accented letters in bibliography?", \"{o} should convert to ö, and so should {\"o}.
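To make the expected mapping concrete, here is a minimal standard-library sketch of that conversion. It is not how pybtex or latexcodec implement it; the names `COMBINING`, `ACCENT` and `decode_accents` are hypothetical, and only a handful of accent commands on single ASCII letters are covered. An outer pair of braces, as in {\"o}, is deliberately left in place.

```python
import re
import unicodedata

# Combining marks for a few LaTeX accent commands (illustrative subset only).
COMBINING = {
    '"': "\u0308",  # diaeresis / umlaut
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

# Matches \"{o} and \"o: group 1 is the accent, group 2 or 3 is the letter.
ACCENT = re.compile(r"""\\(["'`^~])(?:\{([A-Za-z])\}|([A-Za-z]))""")


def decode_accents(text):
    """Replace simple LaTeX accent commands with precomposed Unicode letters."""
    def repl(match):
        letter = match.group(2) or match.group(3)
        # NFC composes letter + combining mark into a single code point.
        return unicodedata.normalize("NFC", letter + COMBINING[match.group(1)])
    return ACCENT.sub(repl, text)


print(decode_accents(r'\"{o}'))   # ö
print(decode_accents(r'{\"o}'))   # {ö}
```

A real .bib file uses many more commands (\ss, \c{c}, \aa and so on), which is why a dedicated codec is the better choice in practice.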

Recommended answer

Update: pybtex has supported this feature since version 0.20.

It does not at the moment. However, you can decode the .bib file with a LaTeX codec before you process it with pybtex, e.g. with latexcodec (https://pypi.python.org/pypi/latexcodec/). This codec converts a wide range of LaTeX commands to Unicode for you.

However, you will have to remove braces in a post-processing stage. Why? To handle BibTeX code gracefully, \"{U} has to be converted into {Ü} rather than into Ü, so that it is not lowercased in titles. The following example demonstrates this behaviour:

import pybtex.database.input.bibtex
import pybtex.plugin
import codecs
import latexcodec  # importing this registers the "latex" codec

style = pybtex.plugin.find_plugin('pybtex.style.formatting', 'plain')()
backend = pybtex.plugin.find_plugin('pybtex.backends', 'latex')()
parser = pybtex.database.input.bibtex.Parser()
with codecs.open("test.bib", encoding="latex") as stream:
    # this shows what the latexcodec does to the source
    print(stream.read())
with codecs.open("test.bib", encoding="latex") as stream:
    # parse the decoded (Unicode) text rather than the raw LaTeX source
    data = parser.parse_stream(stream)
for entry in style.format_entries(data.entries.values()):
    print(entry.text.render(backend))

where test.bib is:

@Article{test,
  author =       {John Doe},
  title =        {Testing \"UTEST \"{U}TEST},
  journal =      {Journal of Test},
  year =         {2000},
}

This prints how latexcodec converted test.bib into Unicode (edited for readability):

@Article{test,
   author = {John Doe}, title = {Testing ÜTEST {Ü}TEST},
   journal = {Journal of Test}, year = {2000},
}

followed by the entry as rendered by pybtex (in this case, producing LaTeX code):

John Doe.
\newblock Testing ütest {Ü}test.
\newblock \emph{Journal of Test}, 2000.

If the codec were to strip the braces, pybtex would have converted the case wrongly. Further, in (pathological) cases like journal = {\"u}, the braces clearly cannot be removed either.
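The effect of the braces on case conversion can be sketched in a few lines. This is a deliberately simplified stand-in for BibTeX-style title lowercasing, not pybtex's actual code, and `lower_title` is a hypothetical name: it lowercases everything except the first character and any text nested inside braces.

```python
def lower_title(title):
    """Lowercase a title except its first character and brace-protected text."""
    out = []
    depth = 0  # current brace nesting level
    for index, char in enumerate(title):
        if char == "{":
            depth += 1
        elif char == "}":
            depth = max(depth - 1, 0)
        if depth == 0 and index > 0:
            out.append(char.lower())
        else:
            out.append(char)  # first character or protected by braces
    return "".join(out)


print(lower_title("Testing ÜTEST {Ü}TEST"))  # Testing ütest {Ü}test
print(lower_title("Testing ÜTEST ÜTEST"))    # Testing ütest ütest
```

The first call reproduces the title in the rendered entry above; the second shows what would happen if the codec had dropped the braces: the protected Ü would be lowercased along with everything else.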

An obvious downside is that if you render to a non-LaTeX backend, you have to remove the braces in a post-processing stage. But you may want to do that anyway in order to process any special LaTeX commands (such as \url). It would be nice if pybtex could somehow do that for you, but it does not at the moment.
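A minimal sketch of such a post-processing step, assuming the rendered text contains only protective braces and no remaining LaTeX commands (the helper name `strip_braces` is hypothetical, and escaped braces such as \{ are kept):

```python
import re


def strip_braces(text):
    """Remove unescaped { and } from rendered text; keep \\{ and \\}."""
    # The lookbehind skips braces that are preceded by a backslash.
    return re.sub(r"(?<!\\)[{}]", "", text)


print(strip_braces("Testing ütest {Ü}test."))  # Testing ütest Ütest.
```

Commands such as \url{...} or \emph{...} need their own handling before this step, since removing only their braces would leave the command name fused to its argument.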
