Reformatting a complex factor vector with comma separation after the thousands


Problem description

I would like to reformat a factor vector so that the figures it contains have a thousands separator. The vector contains integers and real numbers, with no particular rule regarding their values or order.

In particular, I'm working with a vector vec similar to the one generated below:

content <- c("0 - 100", "0 - 100", "0 - 100", "0 - 100",
             "150.22 - 170.33",
             "1000 - 2000","1000 - 2000", "1000 - 2000", "1000 - 2000", 
             "7000 - 10000", "7000 - 10000", "7000 - 10000", "7000 - 10000",
             "7000 - 10000", "1000000 - 22000000", "1000000 - 22000000", 
             "1000000 - 22000000",
             "44000000 - 66000000.8989898989")

vec <- factor(x = content, levels = unique(content))



Desired results



My ambition is to reformat this vector so that the figures contain an Excel-like 1,000 separator, as in the examples below:

100.00
1,000.00
1,000,000.00
1,000,000.56
24,564,000,000.56

Tried approach

I was thinking of making use of gsubfn and a proto object that would pass the digits, then perhaps creating another proto object with three digits and doing the replacement, as in the code below:

library(gsubfn)

# Attempt: append a comma after every run of three digits
gsubfn(pattern = "[0-9][0-9][0-9]", replacement = ~paste0(x, ','), 
       x = as.character(vec))

This works only partially, as the comma is inserted in the wrong place:


"150,.22 - 170,.33"

which is obviously wrong. I also had to convert the factor to a character vector, so the result would need converting back to a factor. Consequently, my question boils down to two elements:


  • How can I work around the comma issue?
  • How can I maintain the original structure of the factor? I need a factor vector ordered in the same manner as the original one, but with commas in the right places.

Recommended answer

Operating only on the levels seems to preserve your precision, avoids converting your vector to a character vector, and is much more efficient, since it restricts the work to the unique values rather than the whole vector:

levels(vec) <- sapply(strsplit(levels(vec), " - "), 
                       function(x) paste(prettyNum(x, 
                                            big.mark = ",", 
                                            preserve.width = "none"), 
                                   collapse = " - "))
vec
#  [1] 0 - 100                            0 - 100                            0 - 100                            0 - 100                            150.22 - 170.33                   
#  [6] 1,000 - 2,000                      1,000 - 2,000                      1,000 - 2,000                      1,000 - 2,000                      7,000 - 10,000                    
# [11] 7,000 - 10,000                     7,000 - 10,000                     7,000 - 10,000                     7,000 - 10,000                     1,000,000 - 22,000,000            
# [16] 1,000,000 - 22,000,000             1,000,000 - 22,000,000             44,000,000 - 66,000,000.8989898989
# Levels: 0 - 100 150.22 - 170.33 1,000 - 2,000 7,000 - 10,000 1,000,000 - 22,000,000 44,000,000 - 66,000,000.8989898989 
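
To address the second part of the question (keeping the original structure of the factor), a quick check along the following lines may help. This is only a sketch: the shortened content vector and the helper name add_big_mark are illustrative assumptions, not part of the original answer, and vapply is used in place of sapply purely to make the return type explicit.

```r
# Shortened version of the example data (an assumption for brevity)
content <- c("0 - 100", "0 - 100", "150.22 - 170.33",
             "1000 - 2000", "7000 - 10000", "7000 - 10000",
             "1000000 - 22000000", "44000000 - 66000000.8989898989")
vec <- factor(x = content, levels = unique(content))

# Remember the integer codes before relabelling
codes_before <- as.integer(vec)

# Hypothetical helper: split each level on " - ", add thousands
# separators to both ends of the range, and glue the range back together
add_big_mark <- function(lv) {
  vapply(strsplit(lv, " - ", fixed = TRUE),
         function(x) paste(prettyNum(x, big.mark = ",",
                                     preserve.width = "none"),
                           collapse = " - "),
         character(1))
}

levels(vec) <- add_big_mark(levels(vec))

# The codes are untouched, so element order and level order are preserved
stopifnot(is.factor(vec))
stopifnot(identical(as.integer(vec), codes_before))
levels(vec)
```

Because only the level labels are replaced, the underlying integer codes stay the same, which is why the ordering of the factor carries over unchanged.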
