Which is the best way to parse a file containing assembly language using Java?

Question

I have read about StringTokenizer, StreamTokenizer, Scanner, and the Pattern and Matcher classes from the java.util.regex package. I have also read opinions on them, and I am really confused: which one is the best to use?

What I need to do is write an assembler, that is, parse a file containing assembly language and translate it into machine code.

For example, if I have the assembly code:

MOV R15,R12

This should translate to the hexadecimal numbers corresponding to each instruction and register.

Let's just say the translation is as follows:

  • MOV becomes 10 F3
  • R15 becomes 11 F2
  • R12 becomes 20 1E

Thus, my output file should be:

10 F3 11 F2 20 1E
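
To make the intended mapping concrete, here is a minimal sketch of it as a lookup table in Java. The class name `EncodingTable` and the method `translate` are hypothetical, and the byte values are simply the made-up ones from the example above, not a real instruction set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingTable {
    // Hypothetical encodings copied from the example above; a real
    // instruction set would define its own opcodes and register codes.
    static final Map<String, String> CODES = new LinkedHashMap<>();
    static {
        CODES.put("MOV", "10 F3");
        CODES.put("R15", "11 F2");
        CODES.put("R12", "20 1E");
    }

    // Splits one line on whitespace and commas, then looks up each word.
    static String translate(String line) {
        StringBuilder out = new StringBuilder();
        for (String word : line.trim().split("[\\s,]+")) {
            String code = CODES.get(word.toUpperCase());
            if (code == null) {
                throw new IllegalArgumentException("Unknown symbol: " + word);
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(code);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("MOV R15,R12")); // prints 10 F3 11 F2 20 1E
    }
}
```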

Now I need to parse the assembly file to identify each instruction and what comes after it.

For those who know microcontrollers, there are many ways for an instruction to appear. My question is:

Using Java, what is the best way to turn each word in my file into a token (using any of the aforementioned classes), so that I can find the matching code and write it to a file?

ldi R13,0x31

I need to have ldi in one token, R13 in another, and 31 in another.

Recommended answer

Well, everything you mentioned works fine for simply tokenizing a string or file. StringTokenizer is a legacy class whose use is discouraged in new code, and more flexible alternatives such as Scanner and even String.split() exist. However, I don't think that is what you want. You seem to need a lexer, or rather a lexer plus a parser, because you want to make sense of the tokens, not just split them on some separator. So either you write your own (if you're on drugs) or you use one of the very good existing tools, such as ANTLR (http://www.antlr.org/). It's free, but may be a little hard to use. There's also JavaCC. Good luck!
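
To show what "making sense of the tokens" means for a line as small as the ones in the question, here is a minimal hand-written sketch built on Pattern and Matcher. It is not the answerer's code and not a substitute for ANTLR or JavaCC. The names `AsmLexer`, `Token` and `Type` are hypothetical, and the token rules (registers written as R followed by digits, numbers as decimal or 0x-prefixed hex) are assumptions based only on the two example lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AsmLexer {
    enum Type { MNEMONIC, REGISTER, NUMBER }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    // Alternatives are tried in order: register first, so that "R13" is
    // not read as a mnemonic. A bare comma is accepted as a separator.
    private static final Pattern TOKEN = Pattern.compile(
        "(?:(?<reg>[Rr]\\d+\\b)"
        + "|(?<num>0[xX][0-9a-fA-F]+\\b|\\d+\\b)"
        + "|(?<mn>[A-Za-z]+\\b)"
        + "|,)\\s*");

    static List<Token> tokenize(String line) {
        String s = line.trim();
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected input: " + s.substring(pos));
            }
            if (m.group("reg") != null) {
                tokens.add(new Token(Type.REGISTER, m.group("reg")));
            } else if (m.group("num") != null) {
                tokens.add(new Token(Type.NUMBER, m.group("num")));
            } else if (m.group("mn") != null) {
                tokens.add(new Token(Type.MNEMONIC, m.group("mn")));
            } // otherwise the match was a comma separator: nothing to record
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [MNEMONIC(ldi), REGISTER(R13), NUMBER(0x31)]
        System.out.println(tokenize("ldi R13,0x31"));
    }
}
```

Unlike a plain String.split(), each token carries a type, so a later stage can check that a mnemonic is followed by the operands it expects before looking up the byte values. Once the grammar grows (labels, comments, addressing modes), a generated lexer and parser is likely the easier route, as the answer suggests.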
