Regex for incomplete lines within known start and end strings


Question

I want to insert the following into a database:


(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),
(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),
(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);

but sometimes I will not have nine textfields that I can place into my database; e.g.


(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),
(#text1#,#text2#,#text3#,#text4#,#),  <<<--- String breaks and messes up my insert
(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);

What regex will delete lines with fields that don't have both start and end tags? Edit: the lines themselves will always have the start tag "(#" and the closing tag "#)".

I tried /^\(#.*?#,#.*?#,#.*?#,#.*?#,#.*?#,#.*?#,#.*?#,#.*?#,#.*?#\)$/ig but it didn't work.

I created a page where you can insert a regex to see whether your solution works.

Recommended answer

/^\((?:#.+#,\s*){8}(?:#.+#\s*)\)[,;]$/gm

That is a non-capturing group repeated 8 times, each repetition being a text field of one or more characters followed by a comma and optional whitespace, then one more text field with no comma, all inside literal parentheses and followed by a comma or semicolon. If you have multiple lines in one text string, make sure to use the "/m" switch so that "^" and "$" match at the start and end of each line rather than only at the start and end of the whole string.

You should be able to use this to extract all the valid lines. Deleting other lines is going to be harder...
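As a rough illustration of the extraction approach, the sketch below runs the pattern above over the sample rows from the question. It assumes a JavaScript runtime (the question does not name one), and the names `rows`, `validRow` and `validRows` are made up for this example.

```javascript
// Sample input modelled on the question: the second row is truncated.
const rows = [
  "(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),",
  "(#text1#,#text2#,#text3#,#text4#,#),",
  "(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);",
].join("\n");

// The pattern from the answer: eight "#...#," fields, then a ninth "#...#",
// all inside literal parentheses and followed by "," or ";".
// g = find every match, m = let ^ and $ anchor at each line.
const validRow = /^\((?:#.+#,\s*){8}(?:#.+#\s*)\)[,;]$/gm;

// With the g flag, String.prototype.match returns all matches, or null if none.
const validRows = rows.match(validRow) || [];

console.log(validRows.length); // 2
console.log(validRows.join("\n"));
```

Only the two complete rows are returned. The truncated row is skipped because it never contains eight "#," sequences.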

Update:

Got it. Here's one that matches lines with 8 or fewer pairs of "#" characters, or with an odd number of them:

^\((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}(?:[^#]*#[^#]*)?\)[,;]\s*$

For example:

(#text1#,#text2#,#text3#,#text4#),

or lines like:

(#text1#,#text2#,#text3#,#text4#,#),

edit: the comma needs to be optional...
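One way to use the "incomplete line" pattern for deletion is to test each line and keep only the ones that do not match. The sketch below assumes JavaScript and that the input is already split into lines. `input`, `incompleteRow` and `kept` are hypothetical names.

```javascript
// Two complete rows plus the two kinds of incomplete row mentioned above.
const input = [
  "(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#),",
  "(#text1#,#text2#,#text3#,#text4#),",
  "(#text1#,#text2#,#text3#,#text4#,#),",
  "(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#);",
];

// Pattern from the update. It is used without the g flag here because
// RegExp.prototype.test on a global regex carries state in lastIndex
// between calls, which can give surprising results in a loop.
const incompleteRow =
  /^\((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}(?:[^#]*#[^#]*)?\)[,;]\s*$/;

// Keep only the rows that the "incomplete" pattern does NOT match.
const kept = input.filter((row) => !incompleteRow.test(row));

console.log(kept.length); // 2
```

A complete row contains 18 "#" characters, while the pattern can consume at most 17 (eight pairs plus one stray), so complete rows never match and are kept.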

It looks like your new examples no longer are one-per-line, and you no longer have the "single #" case, so it can be simplified to:

\((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}\)[,;]\s*
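A minimal sketch of the simplified pattern applied to a single-line input, again assuming JavaScript. `fullTuple`, `shortTuple`, `oneLine` and `cleaned` are illustrative names, and the short tuple is the four-field example from earlier in the answer.

```javascript
const fullTuple =
  "(#text1#,#text2#,#text3#,#text4#,#text5#,#text6#, #text7#, #text8#, #text9#)";
const shortTuple = "(#text1#,#text2#,#text3#,#text4#)";

// Everything on one line, with the short tuple in the middle.
const oneLine = `${fullTuple}, ${shortTuple}, ${fullTuple};`;

// Simplified pattern from the answer: no ^ or $ anchors, so it can match
// anywhere in the string. The g flag removes every short tuple it finds.
const shortTuplePattern = /\((?:[^#\n]*?#[^#\n]*?#[,\s]?){0,8}\)[,;]\s*/g;

const cleaned = oneLine.replace(shortTuplePattern, "");

console.log(cleaned === `${fullTuple}, ${fullTuple};`); // true
```

Note that the match includes the separator after the short tuple. If the short tuple were the last one in the statement, removing it would also remove the closing ";" and leave a trailing comma that needs separate handling.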
