Removing duplicates on a variable without sorting
Question
I have a variable that contains the following space-separated entries.
variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
How do I remove the duplicates without sorting?
#Something like this.
new_variable="apple lemon papaya avocado grapes mango banana"
I found a script somewhere that removes the duplicates from a variable, but it also sorts the contents.
#Not something like this.
new_variable=$(echo "$variable"|tr " " "\n"|sort|uniq|tr "\n" " ")
echo $new_variable
apple avocado banana grapes lemon mango papaya
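Recommended answer
# printf '%s ' ends the input with a space; a here-string (<<<) would append a newline to the last word, so a duplicate in last position would not match
new_variable=$( printf '%s ' "$variable" | awk 'BEGIN{RS=ORS=" "}!a[$0]++' )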
Here is how it works:
RS (Input Record Separator) is set to a white space so that it treats each fruit in $variable as a record instead of a field. The non-sorting unique magic happens with !a[$0]++. Since awk supports associative arrays, it uses the current record ($0) as the key to the array a[]. If that key has not been seen before, a[$0] evaluates to '0' (awk's default value for unset indices) which is then negated to return TRUE. I then exploit the fact that awk will default to 'print $0' if an expression returns TRUE and no '{ commands }' are given. Finally, a[$0] is then incremented such that this key can no longer return TRUE and thus repeat values are never printed. ORS (Output Record Separator) is set to a space as well to mimic the input format.
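To try the idiom in isolation, here is a minimal sketch that wraps it in a function. The name `dedupe_words` is an assumption made for illustration and is not part of the original answer. It pipes the string through `printf '%s '` so the last word ends with a space rather than a newline (a here-string would append a newline, and the final record would then not compare equal to an earlier copy of the same word), and it trims the single trailing space that ORS leaves behind.

```shell
#!/bin/bash
# dedupe_words is a hypothetical helper name used only in this sketch.
dedupe_words() {
    local out
    # Every word, including the last, is followed by a space, so each one
    # becomes a clean record when RS is a single space.
    out=$(printf '%s ' "$1" | awk 'BEGIN { RS = ORS = " " } !a[$0]++')
    # awk writes ORS after every printed record; drop the one trailing space.
    printf '%s\n' "${out% }"
}

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
new_variable=$(dedupe_words "$variable")
echo "$new_variable"
# → apple lemon papaya avocado grapes mango banana
```

The sketch assumes the words are separated by exactly one space; runs of several spaces would produce empty records.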
A less terse version of this command which produces the same output would be the following:
awk 'BEGIN{RS=ORS=" "}{ if (a[$0] == 0){ a[$0] += 1; print $0}}'
Gotta love awk =)
EDIT
If you needed to do this in pure Bash 2.1+, I would suggest this:
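#!/bin/bash
variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
temp="$variable"
# Start the result with the first word.
new_variable="${temp%% *}"
# Loop until temp has shrunk to the single word that was appended last.
while [[ "$temp" != "${new_variable##* }" ]]; do
    # Delete every occurrence of the first word plus its trailing space.
    # Caveats: this is a substring match ("apple " is also cut out of "pineapple "),
    # and a duplicate in last position has no trailing space, so it is kept.
    temp=${temp//${temp%% *} /}
    # Append the new first word of temp to the result.
    new_variable="$new_variable ${temp%% *}"
done
echo "$new_variable"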
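The loop above handles the sample input, but it deletes substrings rather than whole words and it keeps a duplicate that sits in last position (for example, "a b a" comes back unchanged). As a hedged alternative that still uses only shell builtins, the sketch below walks the words once and appends each one only if it is not already present as a whole word. The function name `dedupe_builtin` is an assumption made for illustration, and this is a different technique from the one in the answer.

```shell
#!/bin/bash
# dedupe_builtin is a hypothetical name chosen for this sketch.
dedupe_builtin() {
    local word result=""
    # $1 is left unquoted on purpose so the shell splits it on whitespace.
    # Assumption: the words contain no glob characters such as * or ?.
    for word in $1; do
        case " $result " in
            *" $word "*) ;;                            # whole word already kept: skip it
            *) result="${result:+$result }$word" ;;    # first sighting: append it
        esac
    done
    printf '%s\n' "$result"
}

variable="apple lemon papaya avocado lemon grapes papaya apple avocado mango banana"
new_variable=$(dedupe_builtin "$variable")
echo "$new_variable"
# → apple lemon papaya avocado grapes mango banana
```

Padding both the result and the candidate with spaces is what turns the `case` pattern into a whole-word test, so "apple" no longer matches inside "pineapple". The membership check rescans the result for every word, which is fine for short lists like this one.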