Is there a way to count the number of characters per word for a string, returning values separated by a comma?

Question

I have a list of strings in cells (thousands of them) and I need to work out the number of characters per word, separated by word, preferably in one swift formula.

For example:

  1. "Black Cup With Handle" > Formula I need > 5,3,4,6
  2. "Large Bear Statue" > Formula I need > 5,4,6

I need this for a recurring task which has been macro'd in a very inefficient way to count words into columns (of which we need up to 20, just in case), but this needs to be tackled.

Usually, we count the spaces and layer nested SEARCH() formulas that piggyback onto one another to break down the structure, then count the characters of the individual words.

Alternatively, I could use the macro to substitute the spaces with commas and then use Text to Columns, but that still leaves me with a prolonged counting process for what I'm looking for.

We use =LEN(A1)-LEN(SUBSTITUTE(A1," ","")) to count the spaces in the string.
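As a rough illustration outside Excel, the same space-counting arithmetic can be mirrored in Python. This is only a sketch of the idea, not part of the workbook; the function name `count_spaces` is made up for the example.

```python
def count_spaces(text: str) -> int:
    # Same idea as =LEN(A1)-LEN(SUBSTITUTE(A1," ","")):
    # total length minus the length with every space removed.
    return len(text) - len(text.replace(" ", ""))


if __name__ == "__main__":
    # Assuming single spaces between words, word count = spaces + 1.
    print(count_spaces("Black Cup With Handle") + 1)
```

The "spaces + 1" step only holds when words are separated by exactly one space and there are no leading or trailing spaces.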

We then use the =SEARCH() function combined with =MID() functions (and some bizarre numbers) to pull each word into its own individual cell,

then =LEN once again, but on all the individual words. Very long-winded.

I'm hoping to find a shorter way to do this, but I have a feeling there may not be a dynamic enough way to do it with a formula alone. Hoping someone can prove me wrong!

Recommended answer

You'll have different options depending on your Excel version.

OPTION 1: TEXTJOIN

I think you are looking for the TEXTJOIN function. Just bear in mind that you can only use it in later versions of Excel (see the documentation), and it could work like this:

Formula in B1:

=TEXTJOIN(",",TRUE,LEN(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s")))

NOTE: It's an array formula and you need to enter it using Ctrl+Shift+Enter.

So that you won't need the above key combination, we can include an INDEX:

=TEXTJOIN(",",TRUE,INDEX(LEN(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s")),))


Additional information:

FILTERXML

This function takes (as per documentation) two required arguments:

  • A string in valid XML
  • A string in valid XPath

Because we want to return an array of elements (words) from the cell, we SUBSTITUTE each space with an end-tag followed by a start-tag (</s><s>), then concatenate the start-tags (<t><s>) at the start of the string and the end-tags (</s></t>) at the end.

I'll have to rely on an XML explanation of the tags as to why <?><?> works and what it means, because as far as my testing goes I could swap the letters around or replace them with other letters with the same results, as long as the final XPath refers to the same character. It would be great if someone could complement this answer with a better explanation on this matter.
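The tag names appear to be arbitrary because XML element names carry no built-in meaning: any valid name works, provided the XPath asks for the same name. A minimal Python sketch of the same construction, using the standard library's ElementTree rather than Excel's FILTERXML, is shown here. The function name `word_lengths_via_xml` and its `root`/`item` parameters are hypothetical, and the `escape` call is an addition not present in the Excel formula.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def word_lengths_via_xml(text: str, root: str = "t", item: str = "s") -> str:
    # Mirrors "<t><s>" & SUBSTITUTE(A1," ","</s><s>") & "</s></t>".
    # escape() guards against &, < and > in the text (an assumption added
    # here; the Excel formula does no escaping).
    body = escape(text).replace(" ", f"</{item}><{item}>")
    xml = f"<{root}><{item}>{body}</{item}></{root}>"
    # ".//item" plays the role of the "//s" XPath: select every item element.
    nodes = ET.fromstring(xml).findall(f".//{item}")
    return ",".join(str(len(node.text or "")) for node in nodes)


if __name__ == "__main__":
    print(word_lengths_via_xml("Black Cup With Handle"))
    # Different tag names, same result, as long as the path matches them.
    print(word_lengths_via_xml("Black Cup With Handle", root="x", item="y"))
```

Both calls return the same string, which matches the observation that the letters can be swapped as long as the XPath names the same element.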

For more FILTERXML "tricks", have a look here

TEXTJOIN

If you are an Office 365 subscriber or own Excel 2019, you can make use of this function. There are (as per documentation) at least 3 required arguments:

  • A delimiter, which must be a text string: either empty, or one or more characters enclosed in double quotes, or a reference to a valid text string. If a number is supplied, it will be treated as text.
  • The second argument can hold either TRUE or FALSE and determines whether empty values are ignored (TRUE) or included (FALSE).
  • The third argument is the text item to be joined: a text string, or an array of strings, such as a range of cells.
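The three arguments described above can be sketched in Python as follows. This is an approximation of the described behaviour for illustration only, not Excel's implementation; the function name `textjoin` and its list-based `items` parameter are assumptions made for the example.

```python
from typing import Iterable


def textjoin(delimiter: str, ignore_empty: bool, items: Iterable[object]) -> str:
    # Numbers are treated as text, as in the description of the delimiter.
    texts = [str(item) for item in items]
    if ignore_empty:
        # TRUE as the second argument: skip empty values.
        texts = [t for t in texts if t != ""]
    return str(delimiter).join(texts)


if __name__ == "__main__":
    print(textjoin(",", True, [5, 3, "", 4, 6]))
    print(textjoin(",", False, [5, 3, "", 4, 6]))
```

With the second argument set to False, the empty item leaves two adjacent delimiters in the result.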

Now this is where we can join the two functions together: FILTERXML returns an array which we can use in TEXTJOIN.

LEN and INDEX

I'll explain the use of these functions together. I don't think LEN and INDEX need much of an introduction on their own, but together they work quite nicely. Natively, a mechanism called implicit intersection prevents LEN from returning an array of values when you pass an array of values to the function, in this case through our FILTERXML.

Normally you would disable this mechanism using the key combination Ctrl+Shift+Enter, better known as CSE.

What INDEX does is disable this implicit intersection, making LEN able to return an array and removing the need to CSE the formula. INDEX is one of the functions that has this "power". A more in-depth explanation of implicit intersection can be found here

OPTION 2: UDF

Without access to TEXTJOIN, I think you'll need to look at using a UDF, possibly like the one below:

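Function TEXTJOIN(rng As Range) As String
    ' Build the formula LEN({"word1","word2",...}) from the cell text,
    ' evaluate it, and join the resulting lengths with commas.
    TEXTJOIN = Join(Application.Evaluate("LEN({""" & Join(Split(rng.Value, " "), """,""") & """})"), ",")
End Function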

You can call this in B1 like so: =TEXTJOIN(A1)

Additional information:

The UDF consists of three main mechanisms that work together:

JOIN

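This function takes two parameters, of which the first is required:

  • The first parameter is a one-dimensional array containing the substrings to be joined.
  • The second (optional) parameter is the string character used to separate the substrings in the returned string. If omitted, the space character (" ") is used. If the delimiter is a zero-length string (""), all items in the list are concatenated with no delimiters.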

The function returns a string value.

SPLIT

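This function takes a string and splits it on a specified character/substring. It takes the following arguments:

  • 1st: Required string expression containing substrings and delimiters. If the expression is a zero-length string (""), Split returns an empty array, that is, an array with no elements and no data.
  • 2nd: Optional delimiter, a string character used to identify substring limits. If omitted, the space character (" ") is assumed to be the delimiter. If the delimiter is a zero-length string, a single-element array containing the entire expression string is returned.
  • 3rd: Optional limit, the number of substrings to be returned; -1 indicates that all substrings are returned.
  • 4th: Compare (also optional), a numeric value indicating the kind of comparison to use when evaluating substrings. See the Settings section of the documentation for values.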

In this case we only need the first two arguments.

Application.Evaluate

This is IMO one of the handiest mechanisms you can use to pull off returning an array of values without having to loop through items/cells. It may get slow when you feed the function a large array formula, but in this case it will be fine. The function converts a Microsoft Excel name into an object or a value, and when we pass it a formula, it returns the result. In this particular case it returns an array.
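To make the three mechanisms easier to follow, here is a small Python sketch of what the UDF assembles. It only reproduces the string building and the final result; it does not call Excel, and the function names `build_len_formula` and `word_lengths` are hypothetical.

```python
def build_len_formula(text: str) -> str:
    # VBA: "LEN({""" & Join(Split(rng, " "), """,""") & """})"
    # Split on spaces, then join the words with "," (quote, comma, quote).
    words = text.split(" ")
    return 'LEN({"' + '","'.join(words) + '"})'


def word_lengths(text: str) -> str:
    # Stand-in for Application.Evaluate followed by Join(..., ","):
    # take the length of each word and join the lengths with commas.
    return ",".join(str(len(word)) for word in text.split(" "))


if __name__ == "__main__":
    print(build_len_formula("Black Cup With Handle"))
    print(word_lengths("Black Cup With Handle"))
```

Printing the intermediate formula string shows one limitation of this approach: a word that itself contains a double quote would break the array constant handed to Evaluate.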
