How can I find all member field read/writes using Clang?

Question

Given C++ source code, I want to find the class fields that each function reads and writes. What is the best way of doing this using the Clang frontend?

(I'm not asking for a detailed explanation of all the steps; however, a starting point for an efficient solution would be great.)

So far I have tried parsing statements using the RecursiveASTVisitor, but keeping track of node connections is difficult. Also, I cannot figure out how to keep track of something like the following:

int& x = m_int_field;
x++;

This clearly modifies m_int_field, but given a single Stmt it is impossible to know that, so AST traversal by itself seems insufficient.
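To make the missing step concrete: the traversal needs a small table that remembers which local references were bound to which fields, so that a later access through the reference can be attributed to the field. The following is only a toy sketch of that idea in plain C++, not Clang code. The `Step` struct and `resolveAccesses` function are hypothetical names standing in for a function body and the analysis pass. In a real tool the binding would presumably come from a reference-typed VarDecl whose initializer is a MemberExpr.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the statements of one function body.
struct Step {
    enum Kind { BindRef, Write, Read } kind;
    std::string name;    // local variable or field being bound or accessed
    std::string target;  // for BindRef only: what the reference refers to
};

// Resolves accesses made through local reference aliases back to their
// referents. Returns (resolved name, 'r' or 'w') pairs in statement order.
std::vector<std::pair<std::string, char>>
resolveAccesses(const std::vector<Step>& body) {
    std::map<std::string, std::string> aliasOf;  // reference name -> referent
    std::vector<std::pair<std::string, char>> out;

    // Names that are not known aliases resolve to themselves.
    auto resolve = [&](const std::string& n) -> std::string {
        auto it = aliasOf.find(n);
        return it == aliasOf.end() ? n : it->second;
    };

    for (const Step& s : body) {
        switch (s.kind) {
        case Step::BindRef:
            // Resolve the target first so chains of references collapse.
            aliasOf[s.name] = resolve(s.target);
            break;
        case Step::Write:
            out.emplace_back(resolve(s.name), 'w');
            break;
        case Step::Read:
            out.emplace_back(resolve(s.name), 'r');
            break;
        }
    }
    return out;
}
```

For the snippet above, the body would be modelled as a BindRef of x to m_int_field followed by a Read and a Write of x (since x++ does both), and both accesses resolve to m_int_field. This sketch ignores control flow, pointers, and references that escape into other functions, all of which a real analysis would have to handle.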

A bonus for me would be the ability to count fields and sub-fields separately (e.g. accessing three fields of a member struct).

Example:

#include <string>

typedef struct Y {
    int m_structfield1;
    float m_structfield2;
    Y () {
        m_structfield1 = 0;
        m_structfield2 = 1.0f;
    }
} Y;
class X {
    int m_field1;
    std::string m_field2;
    Y m_field3;
public:
    X () : m_field2("lel") {}
    virtual ~X() {}
    void func1 (std::string s) {
        m_field1 += 2;
        m_field2 = s;
    }
    int func2 () {
        return m_field1 + 5;
    }
    void func3 (Y& y) {
        int& i = m_field1;
        y.m_structfield2 = 1.2f + i++;
    }
    int func4 () {
        func3 (m_field3);
        return m_field3.m_structfield1;
    }
};

This should return:

X::X() -> m_field1 (w), m_field3.m_structfield1 (w), m_field3.m_structfield2 (w)
X::func1(std::string) -> m_field1 (r+w), m_field2 (w)
X::func2() -> m_field1 (r)
X::func3(Y&) -> m_field1 (r+w)
X::func4() -> m_field1 (r+w), m_field3.m_structfield2 (w), m_field3.m_structfield1 (r)
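Whatever mechanism finds the accesses, the results still have to be merged per function into lines like the ones above. Here is a minimal sketch of that bookkeeping in plain C++, independent of Clang. `AccessReport`, `record`, and `line` are hypothetical names chosen for illustration, and sub-fields are assumed to be keyed by a dotted path such as m_field3.m_structfield1.

```cpp
#include <map>
#include <string>

// Bit flags so that a read and a write of the same field combine into "r+w".
enum Access : unsigned { Read = 1u, Write = 2u };

class AccessReport {
public:
    // Merges one observed access into the per-function, per-field table.
    void record(const std::string& function, const std::string& field,
                unsigned access) {
        table_[function][field] |= access;
    }

    // Formats one function's entry, e.g. "X::func2() -> m_field1 (r)".
    // Fields appear in alphabetical order because std::map is sorted.
    std::string line(const std::string& function) const {
        std::string out = function + " ->";
        auto it = table_.find(function);
        if (it == table_.end()) return out;
        bool first = true;
        for (const auto& entry : it->second) {
            out += first ? " " : ", ";
            out += entry.first + " (" + label(entry.second) + ")";
            first = false;
        }
        return out;
    }

private:
    static const char* label(unsigned a) {
        if (a == (Read | Write)) return "r+w";
        return a == Write ? "w" : "r";
    }

    std::map<std::string, std::map<std::string, unsigned>> table_;
};
```

Recording a Read and a Write of m_field1 plus a Write of m_field2 under X::func1(std::string) produces the second line of the expected output. Because the table is keyed by strings, counting fields and sub-fields separately comes down to how the field path is spelled when it is recorded.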

For simplicity, we can assume that there is no inheritance.

Recommended answer

I've been collecting some examples of analyzing code with Clang's AST matchers. There is an example application there, StructFieldUser, that reports which fields of a struct get read or written, and the function in which each access happens. It's different from what you're looking for, but it might be a useful point of reference. It demonstrates extracting and recording this kind of information, and it illustrates how to put all the pieces together.

A good place to start with AST matchers in general is this post by Eli Bendersky.

To get a feel for the matchers that would solve your problem, you might practice with clang-query:

$ clang-query example.cpp --    # the two dashes mean no compilation db
clang-query> let m1 memberExpr()
clang-query> m m1

Match #1:

/path/example.cpp:9:9: note: "root" binds here
        m_structfield1 = 0;
        ^~~~~~~~~~~~~~

Match #2:

/path/example.cpp:10:9: note: "root" binds here
        m_structfield2 = 1.0f;
        ^~~~~~~~~~~~~~
...
11 matches.

Then you can start to connect to other nodes using traversal matchers. This lets you capture related context, such as the function or class method in which the reference is made. Adding bind expressions to the node matchers will help you see exactly what is getting matched. Binding nodes also gives you access to those nodes in callbacks.

clang-query> let m2 memberExpr(hasAncestor(functionDecl().bind("fdecl"))).bind("mexpr")
clang-query> m m2

Match #1:

/path/example.cpp:8:5: note: "fdecl" binds here
    Y () {
    ^~~~~~
/path/example.cpp:9:9: note: "mexpr" binds here
        m_structfield1 = 0;
        ^~~~~~~~~~~~~~
/path/example.cpp:9:9: note: "root" binds here
        m_structfield1 = 0;
        ^~~~~~~~~~~~~~

Match #2:

/path/example.cpp:8:5: note: "fdecl" binds here
    Y () {
    ^~~~~~
/path/example.cpp:10:9: note: "mexpr" binds here
        m_structfield2 = 1.0f;
        ^~~~~~~~~~~~~~
/path/example.cpp:10:9: note: "root" binds here
        m_structfield2 = 1.0f;
        ^~~~~~~~~~~~~~
...

It can take some work to learn how to pick up the exact nodes you need. Note that the matchers above don't pick up the initialization in X::X(). Looking at the AST from

clang-check -ast-dump example.cpp -- 

shows that those nodes are not MemberExpr nodes; they're CXXCtorInitializer nodes, so the cxxCtorInitializer matcher is needed to get those nodes. Multiple matchers are probably needed to find all the different nodes.
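Once a member reference and its surrounding expression have been matched, each match still has to be labelled as a read, a write, or both. The decision itself is small. Below is a sketch of it in plain C++, not tied to the Clang API. The `Context` enum and `classify` function are hypothetical. In a real callback the context would have to be derived from the matched node's parent, for example by checking whether the parent is an assignment, a compound assignment, or an increment/decrement operator.

```cpp
#include <string>

// Hypothetical summary of where a member reference sits in its parent
// expression. The comments show the corresponding lines from the question.
enum class Context {
    AssignLhs,          // m_field2 = s;
    CompoundAssignLhs,  // m_field1 += 2;
    IncDec,             // i++ or --i
    Other               // any other use, e.g. return m_field1 + 5;
};

// Maps a context to the labels used in the question: "r", "w", or "r+w".
std::string classify(Context c) {
    switch (c) {
    case Context::AssignLhs:         return "w";    // old value is not read
    case Context::CompoundAssignLhs: return "r+w";  // reads, then stores
    case Context::IncDec:            return "r+w";  // reads, then stores
    case Context::Other:             return "r";
    }
    return "r";  // not reached for valid enum values
}
```

This table is deliberately incomplete. A field passed by non-const reference, as in func3 (m_field3), would fall under Other here and be reported as a read, even though the callee may write through it. Handling that case needs information about the callee.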
