Tokenizer with Pygments in Python

Question

I want to create a tokenizer in Python for source files (e.g. Java or C++). I came across Pygments and, in particular, its lexers. I could not find examples in the documentation or online showing how to use a lexer.

I am wondering whether it is possible to use Pygments from Python to get the tokens, and their positions, for a given source file.

I am struggling with the very basics here, so if someone could offer even a small chunk of code illustrating the above, it would be much appreciated.

Recommended answer

If you look at the source of Pygments' highlight function, essentially what it does is pass the source text to a lexer instance via the get_tokens method, which returns an iterable of (token type, value) pairs. Those tokens are then passed to the formatter. Since you want the tokens without the formatter, you only need to do the first part.

So, to use the C++ lexer (where src is a string containing your C++ source code):

from pygments.lexers.c_cpp import CppLexer

lexer = CppLexer()
tokens = lexer.get_tokens(src)
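
As a minimal sketch of what comes back, the snippet below lexes a short, made-up C++ string (the `src` value is an assumption for illustration only) and prints each token type next to its text. Since `get_tokens` produces its pairs lazily, the sketch wraps the call in `list()` so the result can be traversed more than once.

```python
from pygments.lexers.c_cpp import CppLexer

# Hypothetical sample input; substitute the contents of your own file.
src = "int main() { return 0; }\n"

lexer = CppLexer()
# Materialize the (token_type, value) pairs so they can be reused.
tokens = list(lexer.get_tokens(src))

for token_type, value in tokens:
    # repr() makes whitespace tokens visible in the output.
    print(token_type, repr(value))
```

Joining the `value` fields in order reproduces this input text, which is a quick sanity check that nothing was dropped.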

Of course, instead of importing the desired lexer directly, you could look up or guess the lexer using one of get_lexer_by_name, get_lexer_for_filename, get_lexer_for_mimetype, guess_lexer, or guess_lexer_for_filename. For example:

from pygments.lexers import get_lexer_by_name

lexer = get_lexer_by_name('c++')  # returns a lexer instance, not a class
tokens = lexer.get_tokens(src)
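
When the input is a file on disk, choosing the lexer from the file name may be more convenient. The following is a sketch rather than a definitive recipe: `lexer_name_for` is a hypothetical helper introduced here, and the file names are made up. It relies on `get_lexer_for_filename` returning a ready-to-use lexer instance and raising `ClassNotFound` when no lexer matches.

```python
from pygments.lexers import get_lexer_for_filename
from pygments.util import ClassNotFound


def lexer_name_for(path):
    """Return the name of the lexer Pygments picks for `path`, or None."""
    try:
        # Only the file name pattern is used here; the file is not opened.
        return get_lexer_for_filename(path).name
    except ClassNotFound:
        return None


print(lexer_name_for("Example.java"))
print(lexer_name_for("notes.no_such_extension"))
```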

Whether the returned tokens will give you what you want is another matter. You'll have to try it and see.
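
On the question of positions: the pairs from `get_tokens` carry no offsets, but lexers also expose `get_tokens_unprocessed`, which yields `(index, token_type, value)` tuples where `index` is the character offset of the token in the text you passed in. The sketch below converts that offset into a 1-based line and column. The sample `src` and the line/column arithmetic are assumptions added for illustration, not part of the original answer.

```python
from pygments.lexers import get_lexer_by_name

# Hypothetical sample input; substitute the contents of your own file.
src = "int x = 1;\nint y = 2;\n"

lexer = get_lexer_by_name("c++")

positions = []
for index, token_type, value in lexer.get_tokens_unprocessed(src):
    # Line number: newlines before the token, plus one.
    line = src.count("\n", 0, index) + 1
    # Column: distance from the start of the current line, 1-based.
    column = index - (src.rfind("\n", 0, index) + 1) + 1
    positions.append((line, column, token_type, value))

for line, column, token_type, value in positions:
    if value.strip():  # skip pure-whitespace tokens when printing
        print(f"{line}:{column} {token_type} {value!r}")
```

Unlike `get_tokens`, this method does not normalize the input first (for example by stripping leading and trailing newlines), so the offsets refer to the exact string you supplied.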
