Reformatting a complex factor vector with comma separation after thousands


    I would like to reformat a factor vector so that the figures it contains have a thousands separator. The vector contains integers and real numbers, with no particular rule governing the values or their order.

    Data

    In particular, I'm working with a vector vec similar to the one generated below:

    content <- c("0 - 100", "0 - 100", "0 - 100", "0 - 100",
                 "150.22 - 170.33",
                 "1000 - 2000","1000 - 2000", "1000 - 2000", "1000 - 2000", 
                 "7000 - 10000", "7000 - 10000", "7000 - 10000", "7000 - 10000",
                 "7000 - 10000", "1000000 - 22000000", "1000000 - 22000000", 
                 "1000000 - 22000000",
                 "44000000 - 66000000.8989898989")
    
    vec <- factor(x = content, levels = unique(content))
    

    Desired results

    My ambition is to reformat this vector so the figures contain the Excel-like 1,000 separator, as in the example below:

    100.00
    1,000.00
    1,000,000.00
    1,000,000.56
    24,564,000,000.56


    Tried approach

    I was thinking of using gsubfn with a proto object that would pass the digits, then perhaps creating another proto object with 3 digits and replacing, as sketched in the code below:

    gsubfn(pattern = "[0-9][0-9][0-9]", replacement = ~paste0(x, ','), 
           x = as.character(vec))
    

    This works only partially, because the comma is inserted like this:

    "150,.22 - 170,.33"

    which is obviously wrong. I also had to convert the resulting character vector back to a factor. Consequently, my question boils down to two elements:

    • How can I work around the comma issue?
    • How can I maintain the original structure of the factor? I need a factor vector ordered in the same manner as the original one, but with commas in the right places.
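
The comma issue described above can be reproduced without gsubfn. The following is a minimal sketch, assuming that base R's `gsub` with a back-reference is an acceptable stand-in for the formula-style replacement in the attempted call; the variable names `x` and `naive` are illustrative only.

```r
# Base-R stand-in for the attempted gsubfn call (an assumption, not the
# original code): append a comma after every run of three digits.
x <- c("150.22 - 170.33", "1000 - 2000")

# The pattern has no notion of where a number starts or ends, nor of the
# decimal point, so the comma lands after the first three digits it finds.
naive <- gsub("([0-9][0-9][0-9])", "\\1,", x)

print(naive)
# [1] "150,.22 - 170,.33" "100,0 - 200,0"
```

The second element shows that the pattern also fails on plain integers: digits are grouped from the left, whereas thousands separators have to be counted from the right of the integer part.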

    Solution

    Operating only on the levels seems to preserve your precision, avoids converting the vector to a character vector, and is much more efficient, because it reduces the data you operate on to the unique values only (rather than the whole vector).

    levels(vec) <- sapply(strsplit(levels(vec), " - "), 
                           function(x) paste(prettyNum(x, 
                                                big.mark = ",", 
                                                preserve.width = "none"), 
                                       collapse = " - "))
    vec
    #  [1] 0 - 100                            0 - 100                            0 - 100                            0 - 100                            150.22 - 170.33                   
    #  [6] 1,000 - 2,000                      1,000 - 2,000                      1,000 - 2,000                      1,000 - 2,000                      7,000 - 10,000                    
    # [11] 7,000 - 10,000                     7,000 - 10,000                     7,000 - 10,000                     7,000 - 10,000                     1,000,000 - 22,000,000            
    # [16] 1,000,000 - 22,000,000             1,000,000 - 22,000,000             44,000,000 - 66,000,000.8989898989
    # Levels: 0 - 100 150.22 - 170.33 1,000 - 2,000 7,000 - 10,000 1,000,000 - 22,000,000 44,000,000 - 66,000,000.8989898989 
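
To check the claim that relabelling the levels keeps the factor's structure intact, one can compare the underlying integer codes before and after. This is a small sketch on a shortened version of the example data; `add_commas` and `codes_before` are hypothetical names introduced here for illustration, not part of the original answer.

```r
# Shortened version of the example data from the question.
content <- c("0 - 100", "0 - 100", "150.22 - 170.33", "1000 - 2000",
             "7000 - 10000", "1000000 - 22000000",
             "44000000 - 66000000.8989898989")
vec <- factor(content, levels = unique(content))

# Remember the integer codes so we can confirm they do not change.
codes_before <- as.integer(vec)

# Hypothetical helper: split one level on " - ", add thousands separators
# to each side with prettyNum(), and glue the two sides back together.
add_commas <- function(level) {
  parts <- strsplit(level, " - ", fixed = TRUE)[[1]]
  paste(prettyNum(parts, big.mark = ",", preserve.width = "none"),
        collapse = " - ")
}

# Only the unique labels are rewritten; the data vector itself is untouched.
levels(vec) <- vapply(levels(vec), add_commas, character(1), USE.NAMES = FALSE)

# Still a factor, with the same codes and therefore the same level order.
stopifnot(is.factor(vec), identical(as.integer(vec), codes_before))
print(levels(vec))
```

Because `levels<-` only swaps the labels, the work scales with the number of distinct levels rather than with the length of the vector.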
    
