Tokenizing an infix string in Java

Question

I'm implementing the Shunting Yard Algorithm in Java, as a side project to my AP Computer Science class. I've implemented a simple one in Javascript, with only basic arithmetic expressions (addition, subtraction, multiplication, division, exponentiation). To split that into an array, what I did was find each of the operators (+-*/^), as well as numbers and parentheses, and I put a space around them, and then I split it into an array. For example, the infix string 4+(3+2) would be made into 4 + ( 3 + 2 ), and then split on whitespace.
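
For reference, the pad-and-split approach described above might look roughly like this in Java. This is only a sketch: the class name `PadAndSplit` and the method `tokenize` are made up for illustration, and it pads only the operators and parentheses, which is enough to separate the numbers as well.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the "pad with spaces, then split" idea.
class PadAndSplit {
    static String[] tokenize(String infix) {
        // Put a space on both sides of every operator and parenthesis.
        String padded = infix.replaceAll("([-+*/^()])", " $1 ");
        // Trim the ends, then split on runs of whitespace.
        return padded.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // prints [4, +, (, 3, +, 2, )]
        System.out.println(Arrays.toString(tokenize("4+(3+2)")));
    }
}
```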

However, I feel that this method is very slow, and it gets increasingly harder and inefficient to implement as you start to add mathematical functions, such as sine, cosine, tangent, absolute value, and others.

What would be the best way to split a string like sin(4+3)-8 into an array ["sin","(",4,"+",3,")","-",8]?

I could use regex for this, but I don't really understand them well, and I'm trying to learn them, so if regex is the best solution, could the answerer please explain what it does?

Recommended answer

Try splitting on the regex (?<=[^\.a-zA-Z\d])|(?=[^\.a-zA-Z\d]). It splits the string at every position that is either preceded or followed by a character that is neither alphanumeric nor a period.

  • (?<=[^\.a-zA-Z\d]) is a positive lookbehind. It matches the position between two characters if the text immediately before that position matches the sub-regex inside (?<=...).
    • [^\.a-zA-Z\d] is a negated character class. It matches a single character that is not one of those listed inside [^...].
      • \. matches a literal period (.).
      • a-z matches any lowercase letter from a to z.
      • A-Z is the same, but for uppercase.
      • \d is the equivalent of [0-9], so it matches any digit.
  • | means "or": the regex matches if either side of it matches.
  • (?=[^\.a-zA-Z\d]) is a positive lookahead. It works like the lookbehind, except that it checks the text immediately after the position instead of before it.
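
To see what each half of the regex contributes, it may help to split on the lookbehind and the lookahead separately. The sketch below does that; the class name `LookaroundDemo` and its method names are invented for this illustration.

```java
import java.util.Arrays;

// Hypothetical demo: splits with each lookaround on its own, then with both.
class LookaroundDemo {
    // A single character that is not a period, a letter, or a digit.
    static final String NON_OPERAND = "[^\\.a-zA-Z\\d]";

    // Lookbehind only: split right AFTER such a character.
    static String[] splitAfter(String s) {
        return s.split("(?<=" + NON_OPERAND + ")");
    }

    // Lookahead only: split right BEFORE such a character.
    static String[] splitBefore(String s) {
        return s.split("(?=" + NON_OPERAND + ")");
    }

    // Both, joined with "|": split before and after.
    static String[] splitAround(String s) {
        return s.split("(?<=" + NON_OPERAND + ")|(?=" + NON_OPERAND + ")");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAfter("12+ab")));  // prints [12+, ab]
        System.out.println(Arrays.toString(splitBefore("12+ab"))); // prints [12, +ab]
        System.out.println(Arrays.toString(splitAround("12+ab"))); // prints [12, +, ab]
    }
}
```

Because lookarounds match a position rather than a character, nothing is consumed by the split, which is why the operator itself survives as a token.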

You can implement this regex in Java like this:

        String str = "sin(4+3)-8";
        String[] parts = str.split("(?<=[^\\.a-zA-Z\\d])|(?=[^\\.a-zA-Z\\d])");


Result:

        ["sin","(","4","+","3",")","-","8"]
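
If the input may contain spaces, each space would come back as its own token, since a space is also a non-alphanumeric character. One possible way around that is to strip whitespace before splitting. The sketch below assumes that is acceptable for your expressions; the class name `InfixTokenizer` and the method `tokenize` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical wrapper around the split shown above.
class InfixTokenizer {
    // Split before and after any character that is not a period, letter, or digit.
    private static final String BOUNDARY =
            "(?<=[^\\.a-zA-Z\\d])|(?=[^\\.a-zA-Z\\d])";

    static String[] tokenize(String infix) {
        // Assumption: whitespace carries no meaning, so drop it first.
        String compact = infix.replaceAll("\\s+", "");
        return compact.split(BOUNDARY);
    }

    public static void main(String[] args) {
        // prints [sin, (, 4.5, +, 3, ), -, 8]
        System.out.println(Arrays.toString(tokenize("sin( 4.5 + 3 ) - 8")));
    }
}
```

Note that the period is excluded from the split characters, so a decimal such as 4.5 stays in one piece. Every token comes back as a String, and a letter directly next to a digit (for example 2x) is not separated.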
        
