Removing duplicates on a variable without sorting


Question

I have a variable that contains the following space-separated entries:

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"

How do I remove the duplicates without sorting?

#Something like this.
new_variable="apple lemon papaya avocado grapes mango banana"

I found a script somewhere that removes the duplicates from a variable, but it also sorts the contents.

#Not something like this.
new_variable=$(echo "$variable"|tr " " "\n"|sort|uniq|tr "\n" " ")
echo $new_variable
apple avocado banana grapes lemon mango papaya

Recommended answer

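# printf '%s' avoids the trailing newline that a here-string (<<<) would append to the last entry
new_variable=$(printf '%s' "$variable" | awk 'BEGIN{RS=ORS=" "} !a[$0]++')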

Here is how it works:

RS (the input record separator) is set to a single space so that awk treats each fruit in $variable as a record rather than a field. The order-preserving de-duplication happens in !a[$0]++. awk supports associative arrays, so the current record ($0) serves as the key into the array a. If that key has not been seen before, a[$0] evaluates to 0 (awk's default for an unset element), and negating it yields true. When a pattern is true and no { action } is given, awk defaults to print $0. The post-increment ++ raises a[$0] as part of evaluating that same expression, so the key evaluates to false from then on and repeated values are never printed. ORS (the output record separator) is also set to a space to mimic the input format.
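To make the explanation above concrete, here is a small trace of the same idiom. It is only an illustrative sketch: the shortened sample input and the variable name trace are made up for this example, and the awk program spells out the test and the increment as separate steps instead of using the one-line pattern.

```shell
#!/bin/bash
# Illustrative trace of the !a[$0]++ idiom; the sample input is invented.
variable="apple lemon apple papaya lemon"

# For each record, show the count stored before the increment and the decision.
trace=$(printf '%s' "$variable" | awk 'BEGIN { RS = " " }
{
    seen = a[$0] + 0      # 0 the first time a record appears (unset element)
    printf "%s seen=%d %s\n", $0, seen, (seen == 0 ? "print" : "skip")
    a[$0]++               # later occurrences now see a non-zero count
}')

echo "$trace"
```

The first occurrence of each word reports seen=0 and would be printed; the second apple and the second lemon report seen=1 and would be skipped.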

A less terse version of this command that produces the same output:

awk 'BEGIN{RS=ORS=" "}{ if (a[$0] == 0){ a[$0] += 1; print $0}}'
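Because ORS is written after every record, the output of these commands ends with a space. If that matters, one possible workaround is to put one word per line, let awk use its default record separator, and join the lines again afterwards. The sketch below assumes that approach; dedupe_words is a hypothetical helper name, not something from the original answer.

```shell
#!/bin/bash
# dedupe_words is a hypothetical helper, shown only as a sketch.
dedupe_words() {
    # tr turns runs of spaces/tabs into single newlines (one word per line),
    # awk keeps the first occurrence of each non-empty line,
    # paste joins the surviving lines back together with single spaces.
    printf '%s\n' "$1" | tr -s ' \t' '\n\n' | awk 'NF && !a[$0]++' | paste -sd ' ' -
}

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
new_variable=$(dedupe_words "$variable")
echo "$new_variable"
```

The NF test drops the empty line that a leading space would otherwise produce, so stray whitespace around the input does not end up in the result.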

Gotta love awk =)

Edit

If you need to do this in pure Bash 2.1+, I would suggest this:

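#!/bin/bash

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
temp="$variable"

# Start the result with the first word.
new_variable="${temp%% *}"

# Stop once temp has shrunk to the word that was appended last.
while [[ "$temp" != "${new_variable##* }" ]]; do
   # Delete every "word " occurrence of the current first word from temp...
   temp=${temp//"${temp%% *} "/}
   # ...then append the new first word of temp to the result.
   new_variable="$new_variable ${temp%% *}"
done

echo "$new_variable"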
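One caveat with the loop above: it deletes text by substring, so it works for the sample data but not for every input. For example, "apple pineapple mango" comes out as "apple pinemango", and a duplicate in the last position ("a b a") is not removed because it has no trailing space. A possible alternative, sketched below under the assumption that entries are separated by whitespace, compares whole words instead; dedupe_pure is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# Sketch of a whole-word variant; dedupe_pure is a hypothetical name.
dedupe_pure() {
    local result="" word
    set -f                                  # no glob expansion while splitting $1
    for word in $1; do                      # unquoted on purpose: split on whitespace
        case " $result " in
            *" $word "*) ;;                 # already kept, skip it
            *) result="${result:+$result }$word" ;;
        esac
    done
    set +f
    printf '%s\n' "$result"
}

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
new_variable=$(dedupe_pure "$variable")
echo "$new_variable"
```

Padding both the result and the candidate word with spaces is what keeps "apple" from matching inside "pineapple". The function is meant to be called inside $( ), so the set -f / set +f toggling stays in the subshell.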
