Using libclang to parse C++ in Python


Question

After some research and a few questions, I ended up exploring the libclang library in order to parse C++ source files in Python.

Given a C++ source such as:

int fac(int n) {
    return (n>1) ? n*fac(n-1) : 1;
}

for (int i = 0; i < linecount; i++) {
   sum += array[i];
}

double mean = sum/linecount;

I am trying to identify the token fac as a function name and n, i, and mean as variable names, along with the position of each one. I am interested in eventually tokenizing them.

I have read some very useful articles (Eli's and Gaetan's) as well as Stack Overflow questions 35113197 and 13236500.

However, since I am new to Python and struggling to understand the basics of libclang, I would very much appreciate an example chunk of code that implements the above, so that I can pick it up and learn from it.

Recommended answer

It's not immediately obvious from the libclang API what an appropriate approach to extracting tokens is. However, it's rare that you would ever need (or want) to drop down to this level; the cursor layer is typically much more useful.
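To illustrate the cursor layer for the goal stated in the question, here is a minimal sketch rather than a definitive implementation. It walks the AST and reports each function, parameter, and variable declaration with its position. It assumes the `clang.cindex` bindings can locate a matching libclang shared library. The helper `find_declarations`, the `ROLES` table, and the `average` function wrapped around the question's loop (which is not valid at file scope) are hypothetical names introduced only for this example.

```python
import clang.cindex
from clang.cindex import CursorKind

# Assumption: the question's loop is wrapped in a function named `average`
# so that the snippet is valid C++.
SOURCE = '''
int fac(int n) {
    return (n>1) ? n*fac(n-1) : 1;
}

double average(const int *array, int linecount) {
    int sum = 0;
    for (int i = 0; i < linecount; i++) {
        sum += array[i];
    }
    double mean = (double)sum / linecount;
    return mean;
}
'''

# Cursor kinds that introduce a name, mapped to a human-readable role.
ROLES = {
    CursorKind.FUNCTION_DECL: 'function',
    CursorKind.PARM_DECL: 'parameter',
    CursorKind.VAR_DECL: 'variable',
}


def find_declarations(source, filename='tmp.cpp'):
    """Return a list of (role, name, line, column) for declarations in source."""
    index = clang.cindex.Index.create()
    tu = index.parse(filename, args=['-std=c++11'],
                     unsaved_files=[(filename, source)], options=0)
    found = []
    # walk_preorder() visits the translation unit cursor and all descendants.
    for cursor in tu.cursor.walk_preorder():
        role = ROLES.get(cursor.kind)
        if role is not None:
            found.append((role, cursor.spelling,
                          cursor.location.line, cursor.location.column))
    return found


for role, name, line, column in find_declarations(SOURCE):
    print(f'{role:9} {name:9} line {line}, column {column}')
```

Because the cursor kind already distinguishes a function declaration from a parameter or local variable, no extra logic is needed to tell fac apart from n, i, or mean.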

However, if token-level access is what you need, a minimal example might look something like:

import clang.cindex

s = '''
int fac(int n) {
    return (n>1) ? n*fac(n-1) : 1;
}
'''

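# Parse the in-memory string as if it were the file tmp.cpp.
idx = clang.cindex.Index.create()
tu = idx.parse('tmp.cpp', args=['-std=c++11'],
               unsaved_files=[('tmp.cpp', s)], options=0)

# Iterate over every token in the translation unit and print its kind.
for t in tu.get_tokens(extent=tu.cursor.extent):
    print(t.kind)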

Which (for my version of clang) produces:

TokenKind.KEYWORD
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.KEYWORD
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.PUNCTUATION
TokenKind.KEYWORD
TokenKind.PUNCTUATION
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.LITERAL
TokenKind.PUNCTUATION
TokenKind.PUNCTUATION
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.IDENTIFIER
TokenKind.PUNCTUATION
TokenKind.LITERAL
TokenKind.PUNCTUATION
TokenKind.PUNCTUATION
TokenKind.LITERAL
TokenKind.PUNCTUATION
TokenKind.PUNCTUATION
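The listing above shows only token kinds, while the question also asks for names and positions. The following sketch, under the same assumption that the `clang.cindex` bindings can find a libclang library, keeps only identifier tokens and reports each one's spelling, line, and column. `identifier_tokens` is a hypothetical helper name used only for this illustration.

```python
import clang.cindex
from clang.cindex import TokenKind

SOURCE = '''
int fac(int n) {
    return (n>1) ? n*fac(n-1) : 1;
}
'''


def identifier_tokens(source, filename='tmp.cpp'):
    """Return (spelling, line, column) for every identifier token in source."""
    index = clang.cindex.Index.create()
    tu = index.parse(filename, args=['-std=c++11'],
                     unsaved_files=[(filename, source)], options=0)
    return [(t.spelling, t.location.line, t.location.column)
            for t in tu.get_tokens(extent=tu.cursor.extent)
            if t.kind == TokenKind.IDENTIFIER]


for spelling, line, column in identifier_tokens(SOURCE):
    print(spelling, line, column)
```

Note that the token layer alone cannot tell that fac is a function and n is a variable: both are plain identifiers. That distinction comes from the cursor layer.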
