ARM Unified Assembler Language grammar and parser?


Question

Is there a publicly available grammar or parser for ARM's Unified Assembler Language, as described in section A4.2 of the ARM Architecture Reference Manual? The manual says:

This document uses the ARM Unified Assembler Language (UAL). This assembly language syntax provides a canonical form for all ARM and Thumb instructions.

UAL describes the syntax for the mnemonic and the operands of each instruction.

I'm only interested in code that parses the mnemonic and the operands of each instruction. For example, how could you define a grammar for these lines?

ADC{S}{<c>}{<q>} {<Rd>,} <Rn>, <Rm>, <type> <Rs>
IT{<x>{<y>{<z>}}}{<q>} <firstcond>
LDC{L}<c> <coproc>, <CRd>, [<Rn>, #+/-<imm>]{!}

Solution

If you need to create a simple parser from a grammar built up from examples, ANTLR is hard to beat:

http://www.antlr.org/

ANTLR translates a grammar specification into lexer and parser code. I find it much more intuitive to use than Lex and Yacc. The grammar below covers part of what you specified above, and it should be fairly easy to extend to do what you want:

grammar armasm;

/* Rules */
program: (statement | NEWLINE) +;

statement: (ADC (reg ',')? reg ',' reg ',' reg
    | IT firstcond
    | LDC coproc ',' cpreg (',' reg ','  imm )? ('!')? ) NEWLINE;

reg: 'r' INT;
coproc: 'p' INT;
cpreg: 'cr' INT;
imm: '#' ('+' | '-')? INT;
firstcond: '?';

/* Tokens */
ADC: 'ADC' ('S')? ; 
IT:   'IT';
LDC:  'LDC' ('L')?;

INT: [0-9]+;
NEWLINE: '\r'? '\n';
WS: [ \t]+ -> skip;

From the ANTLR site (OSX instructions):

$ cd /usr/local/lib
$ wget http://antlr4.org/download/antlr-4.0-complete.jar
$ export CLASSPATH=".:/usr/local/lib/antlr-4.0-complete.jar:$CLASSPATH"
$ alias antlr4='java -jar /usr/local/lib/antlr-4.0-complete.jar'
$ alias grun='java org.antlr.v4.runtime.misc.TestRig'

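Then generate and compile the parser from the grammar file, and type some sample input into the test rig (end the input with EOF, which is Ctrl-D on OS X):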

antlr4 armasm.g4
javac *.java
grun armasm program -tree

    ADCS r1, r2, r3
    IT ?
    LDC p3, cr2, r1, #3 
    <EOF>

This yields the parse tree broken down into tokens, rules, and data:

(program (statement ADCS (reg r 1) , (reg r 2) , (reg r 3) ) (statement IT (firstcond ?) ) (statement LDC (coproc p 3) (cpreg cr 2) (reg r 1) , (imm # - 3) ! ))

The grammar doesn't yet include the instruction condition codes or any of the details of the IT instruction (I'm pressed for time). ANTLR generates a lexer and parser, and the grun alias wraps them in a test rig so I can run text snippets through the generated code. The generated API is straightforward to use in your own applications.
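
As a rough illustration of what the missing condition-code handling has to recognise, here is a minimal Java sketch that splits an ADC{S}{<c>}{<q>} mnemonic into its parts with a regular expression. It is independent of ANTLR and is not taken from the grammar above. The class name UalMnemonic and the method parseAdc are hypothetical names chosen for this example, and only the ADC mnemonic is handled.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of ANTLR or of the grammar above. */
final class UalMnemonic {
    // UAL condition codes; HS and LO are synonyms for CS and CC.
    private static final String COND =
        "EQ|NE|CS|HS|CC|LO|MI|PL|VS|VC|HI|LS|GE|LT|GT|LE|AL";

    // ADC{S}{<c>}{<q>}: base, optional S, optional condition, optional .N or .W qualifier.
    private static final Pattern ADC =
        Pattern.compile("(ADC)(S)?(" + COND + ")?(?:\\.([NW]))?");

    final String base;
    final boolean setsFlags;
    final String condition; // null when no condition code is present
    final String qualifier; // "N", "W", or null when absent

    private UalMnemonic(String base, boolean setsFlags, String condition, String qualifier) {
        this.base = base;
        this.setsFlags = setsFlags;
        this.condition = condition;
        this.qualifier = qualifier;
    }

    /** Returns the decomposed mnemonic, or empty if the text is not an ADC variant. */
    static Optional<UalMnemonic> parseAdc(String text) {
        Matcher m = ADC.matcher(text.toUpperCase(Locale.ROOT));
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(
            new UalMnemonic(m.group(1), m.group(2) != null, m.group(3), m.group(4)));
    }

    @Override
    public String toString() {
        return base + " S=" + setsFlags + " cond=" + condition + " q=" + qualifier;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ADC", "ADCS", "ADCCS", "ADCSEQ.W", "ADCX"}) {
            System.out.println(
                s + " -> " + parseAdc(s).map(UalMnemonic::toString).orElse("no match"));
        }
    }
}
```

The point of the sketch is the ordering of the optional parts: the S suffix has to be tried before the condition code so that ADCS (set flags) and ADCCS (condition CS) are told apart. A lexer rule in the grammar would need to deal with the same ambiguity.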

For completeness, I looked online for an existing grammar and didn't find one. Your best bet there might be to take apart GNU as (gas) and extract its parser, but it won't be UAL syntax, and it will be GPL, if that matters to you. If you only need to handle a subset of the instructions, then the ANTLR approach above is a good way to go.
