How to construct a regular expression to split this text?

Problem description

Hello everyone. I am writing a script, and the main idea is that I have text with a fixed structure, as follows:

"RBD|X|RBD|C|92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX4.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX6.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX8.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGXA.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGXC.NAYE" "SAMBORNSiPOSSSTHRa"

I want to process that text by splitting it on the following symbols: the pipe (|), the double quote ("), and the tilde (~). I want to store the resulting values in an array, as follows:

splitWords = [RBD,X,RBD,C,92173,GJHGWO.NAYE,SAMBORNSiPOSSSTHRa]

In order to achieve it I tried:

var splitWords = document.getElementById("texto").value.split("|");
document.write(splitWords.toString());

and I get:

"RBD,X,RBD,C,92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa" "RBD,X,RBD,C,92173~GJHGX4.NAYE" "SAMBORNSiPOSSSTHRa" "RBD,X,RBD,C,92173~GJHGX6.NAYE" "SAMBORNSiPOSSSTHRa" "RBD,X,RBD,C,92173~GJHGX8.NAYE" "SAMBORNSiPOSSSTHRa" "RBD,X,RBD,C,92173~GJHGXA.NAYE" "SAMBORNSiPOSSSTHRa" "RBD,X,RBD,C,92173~GJHGXC.NAYE" "SAMBORNSiPOSSSTHRa"

The problem is that this only splits the text on the pipe. I would like to split it on the other symbols too, in order to get my desired output. The complete code looks as follows:

<!DOCTYPE html>
<html>

<body>
<p id="demo"></p>

<textarea cols=150 rows=15 id="texto">
"RBD|X|RBD|C|92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX4.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX6.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGX8.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGXA.NAYE" "SAMBORNSiPOSSSTHRa"
"RBD|X|RBD|C|92173~GJHGXC.NAYE" "SAMBORNSiPOSSSTHRa"
</textarea>

<script>
var splitWords = document.getElementById("texto").value.split("|");
document.write(splitWords.toString());
</script>

</body>
</html>

I would appreciate any suggestions for a regular expression to achieve this.

Solution

OK, let's begin. Get the textarea value and trim it:

var splitWords = document.getElementById("texto").value.trim();

First of all, you need to remove every " symbol:

splitWords = splitWords.replace(/"/g, '');
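
To see why the g flag matters in this step, here is a small sketch using one row from the question. The variable names (row, firstOnly, allRemoved) are illustrative and not part of the original answer.

```javascript
// One sample row copied from the question's textarea.
const row = '"RBD|X|RBD|C|92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa"';

// Without the g flag, replace() stops after the first match,
// so only the leading double quote is removed.
const firstOnly = row.replace(/"/, '');

// With the g flag, every double quote in the string is removed.
const allRemoved = row.replace(/"/g, '');

console.log(firstOnly);
console.log(allRemoved);
```

Without the flag, three of the four quotes would survive and end up inside the split values.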

Then split the lines because it's like table rows...

splitWords = splitWords.split('\n');

Then split each row on the possible delimiters: |, ~, and the space character.

splitWords.forEach(function(rowValue,rowIndex) {
    splitWords[rowIndex] = rowValue.split(/[|~ ]/);
    console.log(rowIndex, splitWords[rowIndex]);
});
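
If a single regular expression is preferred, one possible alternative (not the method used in this answer) is to put the double quote into the character class as well, split on runs of delimiters, and then drop the empty strings produced by the leading and trailing quotes. A minimal sketch, assuming each row looks like the ones in the question; row and fields are illustrative names:

```javascript
// One sample row copied from the question's textarea.
const row = '"RBD|X|RBD|C|92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa"';

// [|~" ]+ matches one or more consecutive delimiter characters,
// so the sequence `" "` between the two quoted parts counts as one separator.
// filter(Boolean) removes the empty strings that appear because the row
// starts and ends with a delimiter (the double quote).
const fields = row.split(/[|~" ]+/).filter(Boolean);

console.log(fields);
```

This avoids the separate replace step, at the cost of also discarding any field that is genuinely empty.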

The console.log output will be:

0 ["RBD", "X", "RBD", "C", "92173", "GJHGWO.NAYE", "SAMBORNSiPOSSSTHRa"]
1 ["RBD", "X", "RBD", "C", "92173", "GJHGX4.NAYE", "SAMBORNSiPOSSSTHRa"]
2 ["RBD", "X", "RBD", "C", "92173", "GJHGX6.NAYE", "SAMBORNSiPOSSSTHRa"]
3 ["RBD", "X", "RBD", "C", "92173", "GJHGX8.NAYE", "SAMBORNSiPOSSSTHRa"]
4 ["RBD", "X", "RBD", "C", "92173", "GJHGXA.NAYE", "SAMBORNSiPOSSSTHRa"]
5 ["RBD", "X", "RBD", "C", "92173", "GJHGXC.NAYE", "SAMBORNSiPOSSSTHRa"]

Then do whatever you want with the 2-dimensional array splitWords.
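
As a rough illustration of working with the result, the steps above can be collected into one function and run outside the browser. This is only a sketch: parseRows is a hypothetical helper name, the input is hard-coded instead of being read from the textarea, and map is used in place of the forEach loop.

```javascript
// Sample input: the first two rows from the question's textarea.
const text = [
  '"RBD|X|RBD|C|92173~GJHGWO.NAYE" "SAMBORNSiPOSSSTHRa"',
  '"RBD|X|RBD|C|92173~GJHGX4.NAYE" "SAMBORNSiPOSSSTHRa"'
].join('\n');

// parseRows is a hypothetical helper, not part of the original answer.
// It applies the same steps: trim, remove quotes, split into lines,
// then split each line on |, ~ or a space.
function parseRows(input) {
  return input
    .trim()
    .replace(/"/g, '')
    .split('\n')
    .map(function (line) {
      return line.split(/[|~ ]/);
    });
}

const rows = parseRows(text);

// Example of using the 2-dimensional array: collect the sixth column of every row.
const codes = rows.map(function (fields) {
  return fields[5];
});

console.log(rows);
console.log(codes);
```

In the page from the question, the same function could be called with document.getElementById("texto").value as its argument.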
