Matching brackets in a string

Views: 88 Published: 2020/5/25 0:12:58 Tags: parsing, string, wolfram-mathematica


Question

What is the most efficient or elegant method for matching brackets in a string such as:

"f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] // z"

for the purpose of identifying and replacing [[ Part ]] brackets with the single character forms?

I want to get:

"f @ g[h〚i〚j[2], k〚1, m〚1, n[2]〛〛〛〛] // z"

with everything else, such as the prefix @ and postfix // forms, left intact.

An explanation of Mathematica syntax for those unfamiliar:

Functions use single square brackets for arguments: func[1, 2, 3]

Part indexing is done with double square brackets: list[[6]] or with single-character Unicode double brackets: list〚6〛

My intent is to identify the matching [[ ]] form in a string of ASCII text, and replace it with the Unicode characters 〚〛

Solution

When I wrote my first solution, I hadn't noticed that you just wanted to replace the [[ with 〚 in a string, and not in an expression. You can always wrap the expression in HoldForm or Defer, but I think you already knew that, and you want the result as a string, just like the input (applying ToString@ to the held expression doesn't work).

As all the answers so far focus on string manipulations, I'll take a numeric approach instead of wrestling with strings, which is more natural to me. The character code for [ is 91 and for ] is 93. Testing every character code in the string for divisibility by these two values (openBracket and closeBracket in the code listing at the end) gives the locations of the brackets as 0/1 vectors. I've negated the closing brackets, just to aid the thought process and for use later on.
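A minimal sketch of this step on a short input. The string "a[[b[1]]]" is made up for illustration and is not the one from the question; the variable names follow the full listing at the end.

```mathematica
(* Made-up short input, for illustration only *)
charCode = ToCharacterCode["a[[b[1]]]"];

(* Divisible and Boole are Listable, so they act on the whole list at once *)
openBracket = Boole[Divisible[charCode, 91]]     (* {0, 1, 1, 0, 1, 0, 0, 0, 0} *)
closeBracket = -Boole[Divisible[charCode, 93]]   (* {0, 0, 0, 0, 0, 0, -1, -1, -1} *)
```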

NOTE: I have only checked for divisibility by 91 and 93, as I certainly don't expect you to be entering characters whose codes are other multiples of 91 or 93 (for example ¶, code 182, or º, code 186). If for some reason you do, you can easily AND the result above with a Boolean list that tests for equality with 91 or 93.
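As a hedged illustration of that note, the sketch below compares the divisibility test with a strict equality test. The character codes are made up; 182 is a multiple of 91 but is not the code of [.

```mathematica
(* Made-up character codes: 182 is a multiple of 91 but is not "[" *)
charCode = {182, 91, 120, 93};

(* Divisibility flags 182 as if it were an opening bracket *)
byDivisibility = Boole[Divisible[charCode, 91]]   (* {1, 1, 0, 0} *)

(* Thread distributes == over the list, giving one True/False per code *)
byEquality = Boole[Thread[charCode == 91]]        (* {0, 1, 0, 0} *)

(* Elementwise product acts as AND on 0/1 vectors *)
openBracket = byDivisibility byEquality           (* {0, 1, 0, 0} *)
```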

From this, the positions of the first bracket of each Part [[ pair can be found by multiplying openBracket elementwise with a copy of itself shifted left by one position (doubleOpenBracket in the code listing at the end).

This calculation implicitly assumes that, in Mathematica, an expression never starts with [ and that more than two [ never appear consecutively, as in [[[...
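A small sketch of that shift-and-multiply step, using the 0/1 vector for the made-up input "a[[b[1]]]". It uses Rest in place of Differences@Accumulate from the listing, which gives the same shifted list.

```mathematica
(* openBracket for the made-up input "a[[b[1]]]" *)
openBracket = {0, 1, 1, 0, 1, 0, 0, 0, 0};

(* Shift left by one and pad with 0, then multiply elementwise:
   the product is 1 only where a "[" is immediately followed by another "[" *)
doubleOpenBracket = Append[Rest[openBracket], 0] openBracket
(* {0, 1, 0, 0, 0, 0, 0, 0, 0} *)
```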

Now the closing pair is trickier to implement, but simple to understand. The idea is the following:

For each non-zero position in closeBracket, say i, go to the corresponding position in openBracket and find the first non-zero position to the left of it (say j).
Set doubleCloseBracket[[i-1]] = closeBracket[[i]] + openBracket[[j]] + doubleOpenBracket[[j]], then zero out position j so that the same opening bracket is not matched again. The sum is 1 only when j is the first bracket of a [[ pair (-1 + 1 + 1), and 0 otherwise.
doubleCloseBracket is then the counterpart of doubleOpenBracket: it is non-zero at the position of the first bracket of each Part ]] pair.
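The matching loop described above can be sketched on the made-up input "a[[b[1]]]". This is a simplified variant of the loop in the full listing, not a replacement for it: it looks up positions with the patterns 1 | 2 and -1 rather than Except@0, and the name j is used in place of tmp.

```mathematica
(* Vectors for the made-up input "a[[b[1]]]" *)
openBracket       = {0, 1, 1, 0, 1, 0, 0, 0, 0};
closeBracket      = {0, 0, 0, 0, 0, 0, -1, -1, -1};
doubleOpenBracket = {0, 1, 0, 0, 0, 0, 0, 0, 0};

(* 2 marks the first "[" of a "[[" pair, 1 marks any other "[" *)
openBracketDupe = openBracket + doubleOpenBracket;
doubleCloseBracket = ConstantArray[0, Length[openBracket]];

Do[
  (* nearest still-unmatched "[" to the left of the "]" at position i *)
  j = Last[Flatten[Position[openBracketDupe[[1 ;; i]], 1 | 2]]];
  (* -1 + 2 = 1 when j starts a "[[" pair, -1 + 1 = 0 otherwise *)
  doubleCloseBracket[[i - 1]] = closeBracket[[i]] + openBracketDupe[[j]];
  (* consume the opening bracket so it is not matched twice *)
  openBracketDupe[[j]] = 0,
  {i, Flatten[Position[closeBracket, -1]]}];

doubleCloseBracket   (* {0, 0, 0, 0, 0, 0, 0, 1, 0} *)
```

Position 8 is the first ] of the ]] pair at positions 8 and 9. The sketch assumes balanced brackets; with an unmatched ] the Last call would fail on an empty list.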

So now we have a set of Boolean positions for the first open bracket. We simply have to replace the corresponding element in charCode with the equivalent of 〚 and similarly, with the Boolean positions for the first close bracket, we replace the corresponding element in charCode with the equivalent of 〛.

Finally, deleting the element immediately after each one that was changed gives the modified string, with [[ ]] replaced by 〚〛.
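A sketch of the replace-and-delete step, again on the made-up input "a[[b[1]]]" with its two marker vectors written out by hand. It assigns a single character code to each marked position, so no Flatten is needed afterwards.

```mathematica
charCode = ToCharacterCode["a[[b[1]]]"];
(* Marker vectors for this made-up input *)
doubleOpenBracket  = {0, 1, 0, 0, 0, 0, 0, 0, 0};
doubleCloseBracket = {0, 0, 0, 0, 0, 0, 0, 1, 0};

changeOpen   = Flatten[Position[doubleOpenBracket, 1]];    (* {2} *)
changeClosed = Flatten[Position[doubleCloseBracket, 1]];   (* {8} *)

(* Overwrite the first bracket of each pair with the double-bracket character *)
charCode[[changeOpen]]   = First[ToCharacterCode["\[LeftDoubleBracket]"]];
charCode[[changeClosed]] = First[ToCharacterCode["\[RightDoubleBracket]"]];

(* Delete the second bracket of each pair, which sits one position to the right *)
result = FromCharacterCode[
  Delete[charCode, List /@ Join[changeOpen + 1, changeClosed + 1]]]
(* "a〚b[1]〛" *)
```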

NOTE 2:

A lot of my MATLAB habits have crept into this code, so it is not entirely idiomatic Mathematica. However, I think the logic is correct, and it works. I'll leave it to you to optimize it (I think you can do away with Do[]) and make it a module, as it would take me a lot longer to do it.
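One possible way to package the idea as a module is sketched below. It is not the code from this answer: toDoubleBrackets is a hypothetical name, and the sketch swaps in an explicit stack to pair each ] with its [ instead of the numeric vectors. It assumes balanced brackets and does not treat brackets inside string literals specially.

```mathematica
(* Hypothetical helper, not part of the original answer *)
toDoubleBrackets[str_String] := Module[{chars, stack = {}, match, outer},
  chars = Characters[str];
  (* match[[o]] will hold the position of the "]" that closes the "[" at o *)
  match = ConstantArray[0, Length[chars]];
  Do[
    Switch[chars[[i]],
      "[", AppendTo[stack, i],
      "]", If[stack =!= {}, match[[Last[stack]]] = i; stack = Most[stack]],
      _, Null],
    {i, Length[chars]}];
  (* o starts a "[[" pair when the brackets at o and o + 1 close at
     adjacent positions c and c - 1 *)
  outer = Select[Range[Length[chars] - 1],
    match[[#]] > 0 && match[[# + 1]] == match[[#]] - 1 &];
  chars[[outer]] = "\[LeftDoubleBracket]";
  chars[[match[[outer]] - 1]] = "\[RightDoubleBracket]";
  (* drop the second bracket of every opening and closing pair *)
  StringJoin[Delete[chars, List /@ Join[outer + 1, match[[outer]]]]]]

toDoubleBrackets["a[b[[1]]]"]   (* "a[b〚1〛]" *)
```

Pairing brackets first means the inner ]] of a run such as ]]] is identified by its matching [[, rather than by its position in the run.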

Code as text

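(* Start from a clean session; note that this clears every Global` symbol *)
Clear["Global`*"]
str = "f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]";
charCode = ToCharacterCode@str;

(* 0/1 vector marking each "[" and 0/-1 vector marking each "]" *)
openBracket = Boole@Divisible[charCode, First@ToCharacterCode["["]];
closeBracket = -Boole@
    Divisible[charCode, First@ToCharacterCode["]"]];

(* 1 where a "[" is immediately followed by another "[" *)
doubleOpenBracket = 
  Append[Differences@Accumulate[openBracket], 0] openBracket;

(* Positions of every "]"; Drop[..., 1] discards the leading {0} entry
   that Position reports for the head List *)
posClose = Flatten@Drop[Position[closeBracket, Except@0, {1}], 1];

doubleCloseBracket = ConstantArray[0, Dimensions@doubleOpenBracket];
(* 2 marks the first "[" of a "[[" pair, 1 marks any other "[" *)
openBracketDupe = openBracket + doubleOpenBracket;

(* Match each "]" with the nearest unmatched "[" to its left;
   -1 + 2 = 1 marks the first "]" of a "]]" pair at position i - 1 *)
Do[
  tmp = Last@
    Flatten@Position[openBracketDupe[[1 ;; i]], Except@0, {1}];
  doubleCloseBracket[[i - 1]] = 
   closeBracket[[i]] + openBracketDupe[[tmp]];
  openBracketDupe[[tmp]] = 0;,
  {i, posClose}];

(* Positions of the first bracket of every "[[" and "]]" pair *)
changeOpen = 
  Cases[Range[First@Dimensions@charCode]  doubleOpenBracket, Except@0];
changeClosed = 
  Cases[Range[First@Dimensions@charCode]  doubleCloseBracket, 
   Except@0];

(* Overwrite those positions with the double-bracket characters, then
   delete the second bracket of each pair (one position to the right) *)
charCode[[changeOpen]] = ToCharacterCode["\[LeftDoubleBracket]"];
charCode[[changeClosed]] = ToCharacterCode["\[RightDoubleBracket]"];
FromCharacterCode@
 Delete[Flatten@charCode, 
  List /@ (Riffle[changeOpen, changeClosed] + 1)]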
