How can I identify an emoji in Scala?


Question

I am processing tweets from the Twitter API, and many of them contain emojis. I'm trying to keep track of the most used emojis, but I'm having trouble actually identifying them.

I'm using https://github.com/iamcal/emoji-data to identify emojis.

I have no idea how to figure out whether a string contains an emoji. I have tried using a regex built from the emoji-data 'unified' field, and I have tried simply checking whether the string contains that field. I'm really just not sure how to check for emojis. Any help would be appreciated.

val pattern = new Regex("(${a.unified})")
(pattern findAllIn text).mkString(",")

This is what I have tried using regex. It doesn't find any emojis. I have also tried adding a \u before the 'unified' fields from emoji-data, but that doesn't help.

Recommended answer

You can use the following regex to find emoji characters (and any other characters outside the Unicode Basic Multilingual Plane):

[^\u0000-\uFFFF]
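As a minimal sketch of how this pattern could be used from Scala to detect and tally matches, assuming the goal is the per-emoji count described in the question: the names EmojiFinder, nonBmp, findAll and counts are hypothetical and not part of any library. Note two limitations of the approach: emojis that live inside the Basic Multilingual Plane (for example U+2764) are not matched, and emojis made of several code points (flags, ZWJ sequences) come out as separate matches.

```scala
import scala.util.matching.Regex

object EmojiFinder {
  // Matches any single code point above U+FFFF (outside the Basic Multilingual Plane).
  // The doubled backslashes leave the \u escapes for the regex engine to interpret.
  val nonBmp: Regex = "[^\\u0000-\\uFFFF]".r

  // Returns every non-BMP character found in the text, in order of appearance.
  def findAll(text: String): List[String] = nonBmp.findAllIn(text).toList

  // Hypothetical helper: counts how often each matched character occurs.
  def counts(text: String): Map[String, Int] =
    findAll(text).groupBy(identity).map { case (emoji, hits) => emoji -> hits.size }
}
```

Calling EmojiFinder.counts on each tweet and merging the resulting maps would give a running total per character, under the limitations noted above.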

For example, we use the following code to filter out emojis from strings:

"some string".replaceAll("[^\u0000-\uFFFF]", "");

Hope that helps.
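On the attempt in the question: the 'unified' field in emoji-data holds hexadecimal code points separated by hyphens (for example 1F604 or 1F1FA-1F1F8), while a tweet contains the actual characters, so searching for the hex text finds nothing. Prefixing \u does not help either, because a \u escape takes exactly four hex digits. A minimal sketch of one way to convert the field first, with UnifiedEmoji, toEmoji and count as hypothetical names chosen for illustration:

```scala
import scala.util.matching.Regex

object UnifiedEmoji {
  // Converts a `unified` value such as "1F604" or "1F1FA-1F1F8" into the
  // string it denotes: each hyphen-separated part is one hex code point.
  def toEmoji(unified: String): String =
    unified.split("-")
      .map(hex => Integer.parseInt(hex, 16))
      .map(codePoint => new String(Character.toChars(codePoint)))
      .mkString

  // Hypothetical helper: counts non-overlapping occurrences of the emoji in `text`.
  // Regex.quote makes the emoji match literally rather than as a pattern.
  def count(text: String, unified: String): Int =
    Regex.quote(toEmoji(unified)).r.findAllIn(text).size
}
```

Because this looks up each known sequence as a whole, it also covers emojis inside the Basic Multilingual Plane and multi-code-point emojis such as flags. If one emoji's sequence is a prefix of another's, the counts overlap, so checking longer sequences first may be needed.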
