How to convert unpreprocessed C/C++ source to a syntax tree (and back)?

Question

I want to have a conversion between this

#include <ololo.h>
#ifdef HAVE_QQQ
  #include <qqq.h>
#endif

char* ololize(char* s) {
   #ifdef HAVE_QQQ
      return qqq(s);
   #else
      return ololo(s);
   #endif
}

and something like this

(include_angular "ololo.h")
(p_ifdef "HAVE_QQQ"
 (include_angular "qqq.h"))
(define_function "ololize" [(ptr char) "s"] (ptr char)
 (p_ifdef "HAVE_QQQ"
  (return (qqq s))
  :else
  (return (ololo s))))

I.e. a representation of source code as an easily manageable tree, not from the compiler's point of view, but from the programmer's point of view.

I don't expect it to work 100% correctly, but it should work for most existing source files. Bonus points if I can "round-trip" the code to a tree and back.

Are there any existing tools or libraries for that?

Recommended answer

Our DMS Software Reengineering Toolkit and its C++ front end can do this. DMS provides language-precise parsing (including handling GCC and MS dialects as well as C++11), and builds ASTs. Depending on how it is configured, it can also build full symbol tables, and presently can do control flow analysis for C++ (but not quite yet for C++11).

From the internal AST, DMS can regenerate legal source that will produce the same compiled result, either prettyprinted or preserving spacing ("fidelity mode") almost exactly. We can also ask that the AST be exported as XML.

For OP's small program, here is the AST rendered as XML directly from our GCC4 dialect parser (there's a "PrintASTasXML" function in the DMS libraries). Note the AST contains the INCLUDE and preprocessor conditionals.

<?xml version="1.1" encoding="UTF-8"?>
<!-- Using DMS PrintASTasXML (v.1.00) -->
<!-- XML generated on 2013/04/13 15:24:44 -->
<DMSForest>
  <tree node="translation_unit" type="2" domain="1" id="1iity" parents="0" line="1" column="1" file="1">
<tree node="declaration_seq" type="994" domain="1" id="1iitt" line="1" column="1" file="1">
  <tree node="declaration_seq" type="994" domain="1" id="1iepb" line="1" column="1" file="1">
    <tree node="control_line" type="2133" domain="1" id="1ieos" line="1" column="1" file="1">
      <tree node="'#'" type="2908" domain="1" id="1ieoi" literal="0" line="1" column="1" file="1"/>
      <tree node="'include'" type="2759" domain="1" id="1ieok" literal="0" line="1" column="2" file="1"/>
      <tree node="ANGLED_HEADER_NAME" type="2951" domain="1" id="1ieom" line="1" column="10" file="1">
    <literal>ololo.h</literal>
      </tree>
      <tree node="new_line" type="2946" domain="1" id="1ieoo" literal="0" line="1" column="19" file="1"/>
    </tree>
    <tree node="pp_declaration_seq" type="997" domain="1" id="1ieph" line="2" column="1" file="1">
      <tree node="if_directive" type="2113" domain="1" id="1iep4" line="2" column="1" file="1">
    <tree node="'#'" type="2908" domain="1" id="1iep1" literal="0" line="2" column="1" file="1"/>
    <tree node="'ifdef'" type="2756" domain="1" id="1ieov" literal="0" line="2" column="2" file="1"/>
    <tree node="IDENTIFIER" type="2646" domain="1" id="1ieol" line="2" column="8" file="1">
      <literal>HAVE_QQQ</literal>
    </tree>
    <tree node="new_line" type="2946" domain="1" id="1ieoz" literal="0" line="2" column="16" file="1"/>
      </tree>
      <tree node="control_line" type="2133" domain="1" id="1iepi" line="3" column="3" file="1">
    <tree node="'#'" type="2908" domain="1" id="1iep6" literal="0" line="3" column="3" file="1"/>
    <tree node="'include'" type="2759" domain="1" id="1iep8" literal="0" line="3" column="4" file="1"/>
    <tree node="ANGLED_HEADER_NAME" type="2951" domain="1" id="1iepa" line="3" column="12" file="1">
      <literal>qqq.h</literal>
    </tree>
    <tree node="new_line" type="2946" domain="1" id="1iepe" literal="0" line="3" column="19" file="1"/>
      </tree>
      <tree node="endif_directive" type="2117" domain="1" id="1ieoy" line="4" column="1" file="1">
    <tree node="'#'" type="2908" domain="1" id="1iepl" literal="0" line="4" column="1" file="1"/>
    <tree node="'endif'" type="2743" domain="1" id="1iepk" literal="0" line="4" column="2" file="1"/>
    <tree node="new_line" type="2946" domain="1" id="1iepn" literal="0" line="4" column="7" file="1"/>
      </tree>
    </tree>
  </tree>
  <tree node="function_definition" type="1616" domain="1" id="1iito" line="6" column="1" file="1">
    <tree node="function_head" type="1628" domain="1" id="1iiow" line="6" column="1" file="1">
      <tree node="simple_type_specifier" type="1104" domain="1" id="1iep9" line="6" column="1" file="1">
    <tree node="'char'" type="2723" domain="1" id="1iepd" literal="0" line="6" column="1" file="1"/>
      </tree>
      <tree node="ptr_declarator" type="1398" domain="1" id="1iio3" line="6" column="5" file="1">
    <tree node="ptr_operator" type="1436" domain="1" id="1iepq" line="6" column="5" file="1">
      <tree node="'*'" type="2903" domain="1" id="1iep7" literal="0" line="6" column="5" file="1"/>
    </tree>
    <tree node="noptr_declarator" type="1402" domain="1" id="1iioc" line="6" column="7" file="1">
      <tree node="IDENTIFIER" type="2646" domain="1" id="1iepm" line="6" column="7" file="1">
        <literal>ololize</literal>
      </tree>
      <tree node="'('" type="2887" domain="1" id="1iepr" literal="0" line="6" column="14" file="1"/>
      <tree node="parameter_declaration" type="1591" domain="1" id="1iion" line="6" column="15" file="1">
        <tree node="simple_type_specifier" type="1104" domain="1" id="1iioe" line="6" column="15" file="1">
          <tree node="'char'" type="2723" domain="1" id="1iio0" literal="0" line="6" column="15" file="1"/>
        </tree>
        <tree node="ptr_declarator" type="1398" domain="1" id="1iiom" line="6" column="19" file="1">
          <tree node="ptr_operator" type="1436" domain="1" id="1iiof" line="6" column="19" file="1">
        <tree node="'*'" type="2903" domain="1" id="1iio1" literal="0" line="6" column="19" file="1"/>
          </tree>
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iio8" line="6" column="21" file="1">
        <literal>s</literal>
          </tree>
        </tree>
      </tree>
      <tree node="')'" type="2888" domain="1" id="1iiol" literal="0" line="6" column="22" file="1"/>
      <tree node="function_qualifiers" type="1418" domain="1" id="1iio2" line="6" column="24" file="1"/>
    </tree>
      </tree>
    </tree>
    <tree node="compound_statement" type="873" domain="1" id="1iitn" line="6" column="24" file="1">
      <tree node="'{'" type="2940" domain="1" id="1iiov" literal="0" line="6" column="24" file="1"/>
      <tree node="statement" type="853" domain="1" id="1iitw" line="7" column="4" file="1">
    <tree node="if_directive" type="2113" domain="1" id="1iipc" line="7" column="4" file="1">
      <tree node="'#'" type="2908" domain="1" id="1iip4" literal="0" line="7" column="4" file="1"/>
      <tree node="'ifdef'" type="2756" domain="1" id="1iip2" literal="0" line="7" column="5" file="1"/>
      <tree node="IDENTIFIER" type="2646" domain="1" id="1iip5" line="7" column="11" file="1">
        <literal>HAVE_QQQ</literal>
      </tree>
      <tree node="new_line" type="2946" domain="1" id="1iip0" literal="0" line="7" column="19" file="1"/>
    </tree>
    <tree node="jump_statement" type="984" domain="1" id="1iisg" line="8" column="7" file="1">
      <tree node="'return'" type="2780" domain="1" id="1iiox" literal="0" line="8" column="7" file="1"/>
      <tree node="$NONTERMINALAMBIGUITY" type="2999" nonterminalname="postfix_expression" nonterminaltype="402" domain="1" id="1iiou" children="2" line="8" column="14" file="1">
        <tree node="postfix_expression" type="380" domain="1" id="1iipi" line="8" column="14" file="1">
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iip8" parents="2" line="8" column="14" file="1">
        <literal>qqq</literal>
          </tree>
          <tree node="'('" type="2887" domain="1" id="1iip6" parents="2" literal="0" line="8" column="17" file="1"/>
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iipb" parents="2" line="8" column="18" file="1">
        <literal>s</literal>
          </tree>
          <tree node="')'" type="2888" domain="1" id="1iip1" parents="2" literal="0" line="8" column="19" file="1"/>
        </tree>
        <tree node="postfix_expression" type="368" domain="1" id="1iipk" line="8" column="14" file="1">
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iip8" parents="2" alreadyprinted="true"/>
          <tree node="'('" type="2887" domain="1" id="1iip6" parents="2" alreadyprinted="true"/>
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iipb" parents="2" alreadyprinted="true"/>
          <tree node="')'" type="2888" domain="1" id="1iip1" parents="2" alreadyprinted="true"/>
        </tree>
      </tree>
      <tree node="';'" type="2939" domain="1" id="1iisb" literal="0" line="8" column="20" file="1"/>
    </tree>
    <tree node="else_directive" type="2116" domain="1" id="1iisi" line="9" column="4" file="1">
      <tree node="'#'" type="2908" domain="1" id="1iism" literal="0" line="9" column="4" file="1"/>
      <tree node="'else'" type="2742" domain="1" id="1iisp" literal="0" line="9" column="5" file="1"/>
      <tree node="new_line" type="2946" domain="1" id="1iiso" literal="0" line="9" column="9" file="1"/>
    </tree>
    <tree node="jump_statement" type="984" domain="1" id="1iit5" line="10" column="7" file="1">
      <tree node="'return'" type="2780" domain="1" id="1iish" literal="0" line="10" column="7" file="1"/>
      <tree node="$NONTERMINALAMBIGUITY" type="2999" nonterminalname="postfix_expression" nonterminaltype="402" domain="1" id="1iio5" children="2" line="10" column="14" file="1">
        <tree node="postfix_expression" type="380" domain="1" id="1iit6" line="10" column="14" file="1">
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iisk" parents="2" line="10" column="14" file="1">
        <literal>ololo</literal>
          </tree>
          <tree node="'('" type="2887" domain="1" id="1iisu" parents="2" literal="0" line="10" column="19" file="1"/>
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iisv" parents="2" line="10" column="20" file="1">
        <literal>s</literal>
          </tree>
          <tree node="')'" type="2888" domain="1" id="1iit2" parents="2" literal="0" line="10" column="21" file="1"/>
        </tree>
        <tree node="postfix_expression" type="368" domain="1" id="1iiti" line="10" column="14" file="1">
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iisk" parents="2" alreadyprinted="true"/>
          <tree node="'('" type="2887" domain="1" id="1iisu" parents="2" alreadyprinted="true"/>
          <tree node="IDENTIFIER" type="2646" domain="1" id="1iisv" parents="2" alreadyprinted="true"/>
          <tree node="')'" type="2888" domain="1" id="1iit2" parents="2" alreadyprinted="true"/>
        </tree>
      </tree>
      <tree node="';'" type="2939" domain="1" id="1iit4" literal="0" line="10" column="22" file="1"/>
    </tree>
    <tree node="endif_directive" type="2117" domain="1" id="1iitp" line="11" column="4" file="1">
      <tree node="'#'" type="2908" domain="1" id="1iits" literal="0" line="11" column="4" file="1"/>
      <tree node="'endif'" type="2743" domain="1" id="1iitr" literal="0" line="11" column="5" file="1"/>
      <tree node="new_line" type="2946" domain="1" id="1iitq" literal="0" line="11" column="10" file="1"/>
    </tree>
      </tree>
      <tree node="'}'" type="2941" domain="1" id="1iitm" literal="0" line="12" column="1" file="1"/>
    </tree>
  </tree>
</tree>
  </tree>
  <FileIndex>
<File index="1">C:/temp/small.cpp</File>
  </FileIndex>
  <DomainIndex>
<Domain index="1">Cpp~GCC4</Domain>
  </DomainIndex>
</DMSForest>

It won't quite round-trip from the XML; there's no XML reader preconfigured to build an AST. However, DMS is highly customizable and has an XML parser as an option; it would be straightforward to read an XML tree, regenerate the C++ AST, and then invoke the prettyprinter.
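To give a feel for why the XML carries enough information to get back to source text, here is a minimal Python sketch. It is not DMS's reader or prettyprinter; it only assumes the export format shown above (`tree` elements with `node`, `line`, `column` attributes, a `literal` child for identifiers and header names, quoted `node` names for keywords and punctuation, and `alreadyprinted="true"` on shared nodes). The names `SAMPLE`, `token_text`, `tokens` and `render` are made up for this illustration, and `SAMPLE` is a hand-trimmed fragment, not real DMS output.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed fragment in the style of the export above (attributes reduced).
SAMPLE = """\
<DMSForest>
  <tree node="translation_unit" line="1" column="1">
    <tree node="control_line" line="1" column="1">
      <tree node="'#'" literal="0" line="1" column="1"/>
      <tree node="'include'" literal="0" line="1" column="2"/>
      <tree node="ANGLED_HEADER_NAME" line="1" column="10">
        <literal>ololo.h</literal>
      </tree>
      <tree node="new_line" literal="0" line="1" column="19"/>
    </tree>
    <tree node="jump_statement" line="3" column="4">
      <tree node="'return'" literal="0" line="3" column="4"/>
      <tree node="$NONTERMINALAMBIGUITY" children="2" line="3" column="11">
        <tree node="postfix_expression" line="3" column="11">
          <tree node="IDENTIFIER" id="a" line="3" column="11"><literal>qqq</literal></tree>
          <tree node="'('" id="b" literal="0" line="3" column="14"/>
          <tree node="IDENTIFIER" id="c" line="3" column="15"><literal>s</literal></tree>
          <tree node="')'" id="d" literal="0" line="3" column="16"/>
        </tree>
        <tree node="postfix_expression" line="3" column="11">
          <tree node="IDENTIFIER" id="a" alreadyprinted="true"/>
          <tree node="'('" id="b" alreadyprinted="true"/>
          <tree node="IDENTIFIER" id="c" alreadyprinted="true"/>
          <tree node="')'" id="d" alreadyprinted="true"/>
        </tree>
      </tree>
      <tree node="';'" literal="0" line="3" column="17"/>
    </tree>
  </tree>
</DMSForest>
"""


def token_text(elem):
    """Source text of a leaf node, or None if it contributes no text."""
    node = elem.get("node", "")
    lit = elem.find("literal")
    if lit is not None:
        text = lit.text or ""
        # The export stores the header name without its angle brackets.
        return f"<{text}>" if node == "ANGLED_HEADER_NAME" else text
    if len(node) >= 2 and node[0] == "'" and node[-1] == "'":
        return node[1:-1]          # keyword or punctuation such as '#' or 'return'
    return None                    # new_line and empty nonterminals


def tokens(elem):
    """Yield (line, column, text) for every leaf token, in document order."""
    if elem.get("alreadyprinted") == "true":
        return
    children = elem.findall("tree")
    if elem.get("node") == "$NONTERMINALAMBIGUITY":
        children = children[:1]    # both alternatives cover the same tokens
    if not children:
        text = token_text(elem)
        if text is not None:
            yield int(elem.get("line")), int(elem.get("column")), text
        return
    for child in children:
        yield from tokens(child)


def render(root):
    """Place each token at its recorded line and column."""
    lines = {}
    for line, col, text in tokens(root):
        lines[line] = lines.get(line, "").ljust(col - 1) + text
    last = max(lines, default=0)
    return "\n".join(lines.get(n, "") for n in range(1, last + 1))


print(render(ET.fromstring(SAMPLE)))
```

This only replays tokens at their recorded positions; comments and anything else not present as a token in the export would not come back, which is part of why a real round-trip goes through the AST and the prettyprinter instead.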

I'm not quite sure what you mean by "manageable from a programmer's point of view". This is a precise tree. If it contains too much detail, you are welcome to apply XSLT transforms as you see fit to simplify it, but you will likely lose semantic accuracy doing so. And you will likely lose the ability to round-trip, too.
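As a hedged illustration of that trade-off, the sketch below simplifies a fragment of the export into something closer to the S-expression in the question. It uses Python rather than XSLT purely for brevity, and the rules it applies (drop punctuation and `new_line`, keep the first alternative of a `$NONTERMINALAMBIGUITY`, collapse single-child chains) are arbitrary choices made for this example, not anything DMS prescribes. `SAMPLE`, `simplify` and `to_sexpr` are hypothetical names, and `SAMPLE` is a hand-trimmed fragment.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed fragment in the style of the export above.
SAMPLE = """\
<DMSForest>
  <tree node="statement" line="8" column="7">
    <tree node="jump_statement" line="8" column="7">
      <tree node="'return'" literal="0" line="8" column="7"/>
      <tree node="$NONTERMINALAMBIGUITY" children="2" line="8" column="14">
        <tree node="postfix_expression" line="8" column="14">
          <tree node="IDENTIFIER" line="8" column="14"><literal>qqq</literal></tree>
          <tree node="'('" literal="0" line="8" column="17"/>
          <tree node="IDENTIFIER" line="8" column="18"><literal>s</literal></tree>
          <tree node="')'" literal="0" line="8" column="19"/>
        </tree>
        <tree node="postfix_expression" line="8" column="14">
          <tree node="IDENTIFIER" alreadyprinted="true"/>
        </tree>
      </tree>
      <tree node="';'" literal="0" line="8" column="20"/>
    </tree>
  </tree>
</DMSForest>
"""


def simplify(elem):
    """Return a string (token), a nested list [node, child, ...], or None."""
    if elem.get("alreadyprinted") == "true":
        return None
    node = elem.get("node", "")
    lit = elem.find("literal")
    if lit is not None:
        return lit.text                      # identifier or header name
    kids = elem.findall("tree")
    if node == "$NONTERMINALAMBIGUITY":
        return simplify(kids[0])             # arbitrary: keep the first parse
    if not kids:
        name = node.strip("'")
        # Keep keywords such as 'return'; drop punctuation and new_line.
        return name if node.startswith("'") and name.isalpha() else None
    out = [s for s in map(simplify, kids) if s is not None]
    if len(out) == 1:
        return out[0]                        # collapse single-child chains
    return [node] + out


def to_sexpr(tree):
    """Render the simplified tree as an S-expression string."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(to_sexpr(t) for t in tree) + ")"


root = ET.fromstring(SAMPLE)
print(to_sexpr(simplify(root.find("tree"))))
# prints: (jump_statement return (postfix_expression qqq s))
```

The result is easier to read, but the positions, the punctuation and the second parse alternative are gone, so the original text can no longer be reproduced from it.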

We don't see much need for such XML exports; the DMS ecosystem by design provides a huge amount of infrastructure for analyzing/transforming programs (including C++ programs); we've done massive C++ source code parsing/transformation with DMS. So the need to do the XML export to do something useful isn't very high. We offer it anyway, because people always ask for it. To our surprise, we have some clients that actually use it.
