How can I normalize / asciify Unicode characters in Google Sheets?


Question


I'm trying to write a formula for Google Sheets which will convert Unicode characters with diacritics to their plain ASCII equivalents.

I see that Google uses RE2 in its "REGEXREPLACE" function. And I see that RE2 offers Unicode character classes.

I tried to write a formula (similar to this one):

REGEXREPLACE("público","(\pL)\pM*","$1")

But Sheets produces the following error:

Function REGEXREPLACE parameter 2 value "\pL" is not a valid regular expression.

I suppose I could write a formula consisting of a long chain of nested SUBSTITUTE functions (like this one), but that seems pretty awful.

Can anyone suggest a better way to normalize Unicode letters with diacritical/accent marks in a Google Sheets formula?

Solution

[[:^alpha:]] (the negated ASCII character class) works fine in a REGEXEXTRACT formula.

But =REGEXREPLACE("público","([[:alpha:]])[[:^alpha:]]","$1") gives "pblic" as a result. The regex can only remove the non-ASCII character; it has no way of knowing which ASCII character should replace "ú".


Workaround

Let's take the word públicē; we need to replace two characters in it. Put this word in cell A1, and this formula in cell B1:

=JOIN("",ArrayFormula(IFERROR(VLOOKUP(SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"),D:E,2,0),SPLIT(REGEXREPLACE(A1,"(.)","$1-"),"-"))))

Then create a lookup table of replacements in the range D:E:

    D    E  
1   ú   u
2   ē   e
3  ...  ...

This formula is still ugly, but it is more useful because you can extend the replacement table in D:E with more characters. Note that it uses "-" as a temporary delimiter, so any hyphens in the input text are dropped.
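To make the logic of the formula easier to follow, here is a rough JavaScript sketch of the same idea: split the text into characters, look each one up in a table, and keep the original character when there is no entry (the role IFERROR plays in the formula). The names `replacements` and `replaceFromTable` are made up for illustration, and the table only mirrors the two example rows from D:E.

```javascript
// Hypothetical replacement table, mirroring columns D:E in the sheet.
const replacements = new Map([
  ["ú", "u"],
  ["ē", "e"],
]);

// Split the text into characters, look each one up, and fall back to the
// original character when the table has no entry (the IFERROR branch).
function replaceFromTable(text, table) {
  return Array.from(text)
    .map((ch) => (table.has(ch) ? table.get(ch) : ch))
    .join("");
}

console.log(replaceFromTable("públicē", replacements));
```

Unlike the formula, this version does not need a temporary delimiter, so hyphens in the input are preserved.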


Or use JavaScript

I also found a good solution that works in Google Sheets.
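The linked solution is not reproduced in this text, so the following is only one possible approach and not necessarily the one referred to above. It is a minimal sketch of an Apps Script custom function, assuming the script project runs on the V8 runtime so that `String.prototype.normalize` is available. The function name `ASCIIFY` is hypothetical. It decomposes each character into a base letter plus combining marks (NFD), then strips the combining marks in the range U+0300 to U+036F.

```javascript
/**
 * Hypothetical custom function, used in a cell as =ASCIIFY(A1) or =ASCIIFY(A1:A10).
 * Strips combining diacritical marks after Unicode NFD decomposition.
 */
function ASCIIFY(input) {
  // Custom functions receive a 2D array when called on a range.
  if (Array.isArray(input)) {
    return input.map((row) => row.map((cell) => ASCIIFY(cell)));
  }
  return String(input)
    .normalize("NFD")                 // "ú" becomes "u" + U+0301
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining marks
}
```

Letters that have no canonical decomposition, such as "ø", "ß" or "æ", pass through unchanged, so a small replacement table may still be needed for those. Numeric cells are returned as text.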
