Capitalize the first letter of each word in a string except `in`, `the`, `of`


Question

How can I capitalize the first letter of each word, except for certain words?

x <- c('I like the pizza', 'The water in the pool')

I would like the output to be

c('I Like the Pizza', 'The Water in the Pool')

Currently I am using

gsub('(^|[[:space:]])([[:alpha:]])', '\\1\\U\\2', x, perl=TRUE)

which capitalizes the first letter of every word.

Recommended answer

You can apply a blacklisting approach with a PCRE regex:

(?<!^)\b(?:the|an?|[io]n|at|with|from)\b(*SKIP)(*FAIL)|\b(\pL)

This is a demo of what this regex matches.
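To see which characters the pattern actually selects, the matches can also be listed directly in R. This is a minimal sketch using base R's `gregexpr()` and `regmatches()`; the variable names `pattern` and `matched` are illustrative only.

```r
x <- c("I like the pizza", "The water in the pool")

# Same pattern as above, with backslashes doubled for an R string literal
pattern <- "(?<!^)\\b(?:the|an?|[io]n|at|with|from)\\b(*SKIP)(*FAIL)|\\b(\\pL)"

# gregexpr() locates every match; regmatches() extracts the matched text
matched <- regmatches(x, gregexpr(pattern, x, perl = TRUE))
print(matched)
## [[1]] "I" "l" "p"
## [[2]] "T" "w" "p"
```

Only the first letters of the words to be capitalized are matched; "the" and "in" in the middle of a string produce no match at all.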

In R:

x <- c('I like the pizza', 'The water in the pool', 'the water in the pool')
gsub("(?<!^)\\b(?:the|an?|[io]n|at|with(?:out)?|from|for|and|but|n?or|yet|[st]o|around|by|after|along|of)\\b(*SKIP)(*FAIL)|\\b(\\pL)", "\\U\\1", x, perl=TRUE)
## => [1] "I Like the Pizza"      "The Water in the Pool" "The Water in the Pool"

See the IDEONE demo.

Here is an article, Words Which Should Not Be Capitalized in a Title, with some hints on which words to include in the first alternation group.

Regex explanation:

  • (?<!^) - match the following alternative only when not at the start of the string (I added this restriction because, per the comments, the first letter of the string should always be capitalized)
  • \b - a leading word boundary
  • (?:the|an?|[io]n|at|with(?:out)?|from|for|and|but|n?or|yet|[st]o|around|by|after|along|of) - the list of function words to leave untouched (it can and should be extended!)
  • \b - a trailing word boundary
  • (*SKIP)(*FAIL) - once a function word has matched, fail the match and skip past that word, so it is left unchanged
  • | - or...
  • \b(\pL) - capture group 1, which matches the first letter of a word
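Because the word list is meant to be extended, it may be easier to build the pattern from a character vector than to edit the regex by hand. The sketch below is one way to do that. The helper name `title_case_except` and its default `skip` list are hypothetical and not part of the original answer. It assumes the skip words are plain lowercase words with no regex metacharacters.

```r
# Hypothetical helper, not part of base R or of the answer above.
# Assumes every entry in `skip` is a plain word without regex metacharacters.
title_case_except <- function(x, skip = c("the", "a", "an", "in", "on", "of", "at")) {
  # Join the words into a single alternation, e.g. the|a|an|in|on|of|at
  alternation <- paste(skip, collapse = "|")
  pattern <- paste0(
    "(?<!^)\\b(?:", alternation, ")\\b(*SKIP)(*FAIL)",  # skip listed words unless they start the string
    "|\\b(\\pL)"                                        # otherwise capture the first letter of a word
  )
  # \\U upper-cases the captured letter (requires perl = TRUE)
  gsub(pattern, "\\U\\1", x, perl = TRUE)
}

result <- title_case_except(c("I like the pizza", "the water in the pool"))
print(result)
## [1] "I Like the Pizza"      "The Water in the Pool"
```

Passing a longer `skip` vector extends the exclusion list without touching the rest of the pattern.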
