Removing duplicate strings from a comma-separated list, in a cell

Question

I'm using Google Sheets, and this is way beyond my simple scripting skills.

I have numerous cells containing comma-separated values:

AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB

I would like to return:

AA, BB, CC, BBB, CCC, CCCCC etc.

BB, ZZ, AA, CC, BBB, CCC, CCCCC etc.

... that is, with the duplicates removed, per cell.

I can't get my head around a solution. I've tried every online tool that removes duplicates, but they all remove duplicates across my whole document rather than within each cell.

Part of the problem is that I can't put the cell contents in alphabetical order (which would make things simple); the values have to be kept in the order in which they originally appear.

I also have OpenRefine at my disposal (though it is beyond my skill), which I believe is a clever tool.

Recommended answer

Here is how to do that in OpenRefine.

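The GREL formula I used is:

value.split(/\s*,\s*/).uniques().join(', ')

It means: split the value in the cell on commas (together with any spaces around them, so that " AA" and "AA" are not treated as different strings), remove the duplicates, then join the values again using a comma and a space.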
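To illustrate why the separator matters when the values are written as "AA, BB" rather than "AA,BB", here is a small sketch in plain Python (not GREL, and not run inside OpenRefine). The sample string and variable names are made up for the illustration. It compares splitting on a bare comma with splitting and then trimming each value.

```python
# Plain-Python illustration (not GREL) of splitting on "," with and without trimming.
cell = "AA, BB, CC, AA"

naive_tokens = cell.split(",")                               # leading spaces are kept
clean_tokens = [token.strip() for token in cell.split(",")]  # spaces trimmed

# dict.fromkeys keeps the first occurrence of each key, in insertion order (Python 3.7+).
naive_unique = list(dict.fromkeys(naive_tokens))
clean_unique = list(dict.fromkeys(clean_tokens))

print(naive_unique)  # ['AA', ' BB', ' CC', ' AA']  ('AA' and ' AA' both survive)
print(clean_unique)  # ['AA', 'BB', 'CC']
```

Without trimming, the second "AA" carries a leading space and is therefore not recognised as a duplicate of the first.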

Another solution in OpenRefine uses Python instead of GREL. This one does a better job of preserving the original order.

Python/Jython script:

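from collections import OrderedDict
# Remove spaces, split on commas, and keep only the first occurrence of each value, in order.
dedup = list(OrderedDict.fromkeys(value.replace(' ', '').split(',')))
# Join with ", " so the result matches the spacing of the original cell.
return ", ".join(dedup)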
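For trying the same idea outside OpenRefine, for example on values copied out of the sheet, the following is a minimal standalone sketch in ordinary Python. The function name dedupe_cell and the cells list are hypothetical, introduced only for this example; the two input strings are the sample cells from the question.

```python
def dedupe_cell(cell, separator=","):
    """Return the cell's values with duplicates removed, keeping first-seen order."""
    seen = set()
    unique_values = []
    for raw in cell.split(separator):
        token = raw.strip()  # ignore spaces around each value
        if token and token not in seen:
            seen.add(token)
            unique_values.append(token)
    return ", ".join(unique_values)


# Sample cells taken from the question; each cell is processed on its own.
cells = [
    "AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB",
    "BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB",
]
for cell in cells:
    print(dedupe_cell(cell))
# AA, BB, CC, BBB, CCC, CCCCC
# BB, ZZ, AA, CC, BBB, CCC, CCCCC
```

Unlike the Jython script, this version trims each value instead of deleting every space, so a value that itself contains a space would be left intact.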
