Raw IRC output nick and message parsing via Regex


Problem description

I am trying to parse the Twitch IRC chat into a more readable form. I have never used Regex and am not sure how to go about this (even after reading tons of tutorials).

This is the raw output:

:nick!nick@nick.tmi.twitch.tv PRIVMSG channel :

I would like two regexes to parse the nick and the message so they can be used individually. Thanks!

Recommended answer

Regex is not the right solution for this problem. If you really want to go down this road (but don't; keep reading!), you can use something like this for the entire message:

:(?<nick>[^ ]+?)\!(?<user>[^ ]+?)@(?<host>[^ ]+?) PRIVMSG (?<target>[^ ]+?) :(?<message>.*)

There are capture groups defined on the nick, username, hostname, channel, and message. I've not tested it, and it will fail miserably on pretty much every other IRC event. There will also be ways to break it or get around the matching, because regex is the wrong sort of grammar tool for IRC. It's like hammering in nails with a screwdriver: it works some of the time, it's harder than it needs to be, and it can only be made to work better with a lot of time, effort, and pain. Why would you not use a hammer?
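For anyone who still wants to try the pattern, here is a minimal sketch in Python (the question does not name a language, so that choice is an assumption). Python's `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, so the pattern is adapted accordingly. The names `PRIVMSG_RE` and `match_privmsg` are hypothetical, and it only handles plain PRIVMSG lines of the shape shown in the question.

```python
import re

# Same pattern as above, rewritten with Python's (?P<name>...) group syntax.
PRIVMSG_RE = re.compile(
    r":(?P<nick>[^ ]+?)!(?P<user>[^ ]+?)@(?P<host>[^ ]+?) "
    r"PRIVMSG (?P<target>[^ ]+?) :(?P<message>.*)"
)


def match_privmsg(line):
    """Return a dict of captured fields, or None if the line is not a PRIVMSG."""
    # Strip the trailing CR/LF first, since "." would otherwise capture "\r".
    m = PRIVMSG_RE.match(line.rstrip("\r\n"))
    return m.groupdict() if m else None
```

Called on a made-up line such as `":nick!nick@nick.tmi.twitch.tv PRIVMSG #channel :hello world"`, `match_privmsg` returns a dict whose `"nick"` is `"nick"` and whose `"message"` is `"hello world"`. Any other kind of IRC line, such as `"PING :tmi.twitch.tv"`, yields `None`, which illustrates the limitation described above.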

A much better solution is to simply parse the message. The IRC specs in RFC1459 and RFC2812 give some pretty useful hints here. My advice from experience is to split on the first " :" (space then colon): everything after it is the last parameter of the message. Then split the first half by spaces. If the first entry in your list starts with a colon, it is the message prefix; split it again by ! and @ to get the parts of the nickname/username/hostname tuple. Follow this method, and you'll have the base for a much more robust and extensible parser than one you could ever build using regular expressions.
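The splitting steps described above could be sketched roughly as follows, again in Python (an assumption, since the question does not name a language). The function name `parse_irc_line` and the returned dict layout are hypothetical. The sketch does not handle the optional leading `@` tags that some servers prepend to a line.

```python
def parse_irc_line(line):
    """Split one raw IRC line into prefix parts, command, and parameters."""
    line = line.rstrip("\r\n")

    # Everything after the first " :" is the trailing (last) parameter.
    head, sep, trailing = line.partition(" :")

    # The first half is a space-separated list: [prefix] command [params...]
    parts = head.split()

    nick = user = host = None
    if parts and parts[0].startswith(":"):
        # A leading colon marks the prefix: nick!user@host (or a server name).
        prefix = parts.pop(0)[1:]
        who, _, host_part = prefix.partition("@")
        nick, _, user_part = who.partition("!")
        user = user_part or None
        host = host_part or None

    command = parts.pop(0) if parts else None
    params = parts
    if sep:
        params.append(trailing)

    return {"nick": nick, "user": user, "host": host,
            "command": command, "params": params}
```

For a chat line, the message text is the last element of `params` and the channel is the first. Because nothing here is specific to PRIVMSG, the same function also copes with lines such as `PING :tmi.twitch.tv`, where the command is `PING` and there is no prefix.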

If you're doing this as a learning exercise, great! If not, you probably want to consider using a pre-built library to handle all the IRC communication for you.
