Clang: How to get the macro name used for size of a constant size array declaration

Question

TL;DR

How can I get the macro name used for the size of a constant-size array declaration, starting from a CallExpr -> arg_0 -> DeclRefExpr?

Detailed problem statement:

Recently I started working on a challenge that requires a source-to-source transformation tool for modifying specific function calls with an additional argument. Researching ways to achieve this introduced me to the amazing Clang toolset. I've been learning how to use the different tools provided in libtooling to achieve my goal, but now I'm stuck on a problem and am seeking your help here.

Consider the program below (a dummy version of my sources). My goal is to rewrite all calls to the strcpy function with the safe version, strcpy_s, and add an additional parameter to the new call, namely the maximum size of the destination buffer. So, for the program below, my refactored call would look like strcpy_s(inStr, STR_MAX, argv[1]);

I wrote a RecursiveASTVisitor class that inspects all function calls in its VisitCallExpr method. To get the maximum size of the destination argument, I get the VarDecl of the first argument and try to get its size (ConstantArrayType). Since the source file is already preprocessed, I see 2049 as the size, but what I need in this case is the macro STR_MAX. How can I get that? (I create replacements with this information and apply them afterwards using RefactoringTool.)

#include <stdio.h>
#include <string.h>
#include <stdlib.h> 

#define STR_MAX 2049

int main(int argc, char **argv){
  char inStr[STR_MAX];

  if(argc>1){
    // The Clang tool must transform the call below into strncpy_s(inStr, STR_MAX, argv[1], strlen(argv[1]));
    strcpy(inStr, argv[1]);
  } else {
    printf("\n not enough args");
    return -1;
  }

  printf("got [%s]", inStr);

  return 0;
}

Recommended answer

As you noticed correctly, the source code is already preprocessed and it has all the macros expanded. Thus, the AST will simply have an integer expression as the size of array.

NOTE: you can skip the background explanation and proceed straight to the solution below.

The information about expanded macros is contained in source locations of AST nodes and usually can be retrieved using Lexer (Clang's lexer and preprocessor are very tightly connected and can be even considered one entity). It's a bare minimum and not very obvious to work with, but it is what it is.

As you are looking for a way to get the original macro name for a replacement, you only need to get the spelling (i.e. the way it was written in the original source code), and you don't need to care much about macro definitions, function-style macros and their arguments, etc.

Clang describes source positions at two levels: tokens and characters. A SourceLocation (and the SourceRange built from a pair of them) can be found pretty much everywhere in the AST and refers to a position in terms of tokens. A character-level range is represented by CharSourceRange. The token-based view explains why begin and end positions can be somewhat counterintuitive:

// clang::DeclRefExpr
//
//  ┌─ begin location
foo(VeryLongButDescriptiveVariableName);
//  └─ end location

// clang::BinaryOperator
//
//           ┌─ begin location
int Result = LHS + RHS;
//                 └─ end location

As you can see, this type of source location points to the beginning of the corresponding token. A character range (CharSourceRange), on the other hand, delimits characters directly.

So, in order to get the original text of the expression, we need to convert its token-based SourceRange into a character-based CharSourceRange and get the corresponding text from the source.
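To make the token-versus-character distinction concrete, here is a minimal sketch in plain C++ that does not use Clang at all. The names TokenRange, CharRange, measureToken, toCharRange and getText are hypothetical stand-ins invented for this illustration, and the tokenizer is deliberately simplistic (a token is either a run of identifier/number characters or a single punctuation character). It only shows why the length of the last token has to be measured before the text can be sliced out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a token range: both offsets point at the FIRST
// character of a token, mirroring the begin/end locations in the diagrams.
struct TokenRange {
  std::size_t Begin; // first character of the first token
  std::size_t End;   // first character of the last token
};

// Hypothetical stand-in for a character range: half-open [Begin, End).
struct CharRange {
  std::size_t Begin;
  std::size_t End;
};

// Measure the token starting at Pos. Simplification: a token is a run of
// identifier/number characters, or else a single punctuation character.
std::size_t measureToken(const std::string &Source, std::size_t Pos) {
  auto IsWordChar = [](char C) {
    return std::isalnum(static_cast<unsigned char>(C)) || C == '_';
  };
  if (Pos >= Source.size())
    return 0;
  if (!IsWordChar(Source[Pos]))
    return 1;
  std::size_t Len = 0;
  while (Pos + Len < Source.size() && IsWordChar(Source[Pos + Len]))
    ++Len;
  return Len;
}

// Extend the end of a token range past the last token to get a char range.
CharRange toCharRange(const std::string &Source, TokenRange R) {
  return {R.Begin, R.End + measureToken(Source, R.End)};
}

// Slice the text covered by a half-open character range.
std::string getText(const std::string &Source, CharRange R) {
  return Source.substr(R.Begin, R.End - R.Begin);
}
```

For the source line `int Result = LHS + RHS;`, a token range that begins at `LHS` and ends at the start of `RHS` converts to a character range covering `LHS + RHS`. Slicing with the raw token offsets would stop short of the last token.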

I've modified your example to show other cases of macro expansions as well:

#define STR_MAX 2049
#define BAR(X) X

int main() {
  char inStrDef[STR_MAX];
  char inStrFunc[BAR(2049)];
  char inStrFuncNested[BAR(BAR(STR_MAX))];
}

The following code:

// Assumes these headers: clang/AST/ASTContext.h, clang/AST/Decl.h,
// clang/AST/TypeLoc.h, clang/Lex/Lexer.h, llvm/Support/raw_ostream.h
// clang::VarDecl *VD;
// clang::ASTContext *Context;
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
// The TypeLoc keeps source information for the type as it was written
auto DeclarationType = VD->getTypeSourceInfo()->getTypeLoc();

if (auto ArrayType = DeclarationType.getAs<clang::ConstantArrayTypeLoc>()) {
  auto *Size = ArrayType.getSizeExpr();

  auto CharRange = clang::Lexer::getAsCharRange(Size->getSourceRange(), SM, LO);
  // Lexer gets text for [start, end) and we want it to grab the end as well
  CharRange.setEnd(CharRange.getEnd().getLocWithOffset(1));

  auto StringRep = clang::Lexer::getSourceText(CharRange, SM, LO);
  llvm::errs() << StringRep << "\n";
}

produces this output for the snippet above:

STR_MAX
BAR(2049)
BAR(BAR(STR_MAX))
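Once the spelling of the size expression is available as text, the replacement call from the question can be assembled as a plain string. The sketch below is only an illustration of that last step: buildStrcpySCall is a hypothetical helper name, the argument spellings are assumed to have been extracted from the source already, and turning the result into an actual replacement for RefactoringTool is not shown.

```cpp
#include <string>

// Hypothetical helper: builds the text of the rewritten call from the
// original spellings of its pieces (destination, size expression, source).
std::string buildStrcpySCall(const std::string &Dest,
                             const std::string &SizeSpelling,
                             const std::string &Src) {
  return "strcpy_s(" + Dest + ", " + SizeSpelling + ", " + Src + ")";
}
```

With the spellings from the question's program, buildStrcpySCall("inStr", "STR_MAX", "argv[1]") yields the text strcpy_s(inStr, STR_MAX, argv[1]), which keeps the macro name rather than the expanded 2049.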

I hope this information is helpful. Happy hacking with Clang!
