SI prefixes in ggplot2 axis labels
Problem description
I often plot graphs in GNU R / ggplot for measurements related to bytes. The built-in axis labels are either plain numbers or scientific notation, i.e. 1 megabyte = 1e6. I would like SI prefixes (kilo = 1e3, mega = 1e6, giga = 1e9, etc.) instead, i.e. the axis should be labelled 1.5K, 5K, 1M, 150M, 4G, etc.
I currently use the following code:
    si_num <- function (x) {
      if (!is.na(x)) {
        if (x > 1e6) {
          chrs <- strsplit(format(x, scientific=12), split="")[[1]];
          rem <- chrs[seq(1,length(chrs)-6)];
          rem <- append(rem, "M");
        } else if (x > 1e3) {
          chrs <- strsplit(format(x, scientific=12), split="")[[1]];
          rem <- chrs[seq(1,length(chrs)-3)];
          rem <- append(rem, "K");
        } else {
          return(x);
        }
        return(paste(rem, sep="", collapse=""));
      } else return(NA);
    }
    si_vec <- function(x) {
      sapply(x, FUN=si_num);
    }
    library("ggplot2");
    bytes=2^seq(0,20) + rnorm(21, 4, 2);
    time=bytes/(1e4 + rnorm(21, 100, 3)) + 8;
    my_data = data.frame(time, bytes);
    p <- ggplot(data=my_data, aes(x=bytes, y=time)) +
      geom_point() +
      geom_line() +
      scale_x_log10("Message Size [Byte]", labels=si_vec) +
      scale_y_continuous("Round-Trip-Time [us]");
    p;
I would like to know if this solution can be improved, as mine requires a lot of boilerplate code in every graph.
Solution
I used
    library("sos"); findFn("{SI prefix}")
to find the sitools package.
Construct data:
    bytes <- 2^seq(0,20) + rnorm(21, 4, 2)
    time <- bytes/(1e4 + rnorm(21, 100, 3)) + 8
    my_data <- data.frame(time, bytes)
Load packages:
    library("sitools")
    library("ggplot2")
Create the plot:
    (p <- ggplot(data=my_data, aes(x=bytes, y=time)) +
       geom_point() +
       geom_line() +
       scale_x_log10("Message Size [Byte]", labels=f2si) +
       scale_y_continuous("Round-Trip-Time [us]"))
I'm not sure how this compares to your function, but at least someone else went to the trouble of writing it ...
I modified your code style a little bit -- semicolons at the ends of lines are harmless but are generally the sign of a MATLAB or C coder ...
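The plot above passes f2si itself as the labels argument. A scale's labels argument takes a function that maps the vector of break values to one label per break, so a bare function and a function returned by a factory should be interchangeable there. A minimal base-R sketch of that idea; upper_k and upper_k_format are hypothetical names used only for illustration, not part of sitools or ggplot2:

```r
# upper_k is a hypothetical stand-in for a formatter such as f2si:
# it takes the break values and returns one character label per break.
upper_k <- function(x, suffix = "K") {
  ifelse(x >= 1e3, paste0(x / 1e3, suffix), as.character(x))
}

# A factory fixes extra arguments once and returns a one-argument function.
upper_k_format <- function(...) {
  function(x) upper_k(x, ...)
}

breaks <- c(10, 1000, 2500)
upper_k(breaks)                        # bare function
upper_k_format(suffix = "k")(breaks)   # function built by the factory
```

Either form could be handed to labels; the factory only earns its keep when extra arguments need to be fixed ahead of time.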
Edit: I initially defined a generic formatting function
    si_format <- function(...) {
      function(x) f2si(x, ...)
    }
following the format of (e.g.) scales::comma_format, but that seems unnecessary in this case; it is just part of the deeper ggplot2 magic that I don't fully understand.
The OP's code gives what seems to me to be not quite the right answer: the rightmost axis tick is "1000K" rather than "1M". This can be fixed by changing the >1e6 test to >=1e6. On the other hand, f2si uses lower-case k; I don't know whether K is wanted (wrapping the results in toupper() could fix this).
[Figure: OP results (si_vec)]
[Figure: my results (f2si)]
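For anyone who would rather not depend on a package, the two points above (the closed >= boundary and the upper-case K) can be folded into one vectorised base-R function. This is only a sketch under assumptions: si_label and its digits argument are hypothetical names, and it handles the prefixes K through T only.

```r
# Sketch of a vectorised SI labeller; si_label is a hypothetical name,
# not part of sitools or ggplot2.
si_label <- function(x, digits = 3) {
  prefixes <- c("", "K", "M", "G", "T")
  # Round first so that values such as 999999 are classified as 1M.
  xr <- signif(x, digits)
  # findInterval() uses closed lower bounds, so exactly 1e6 maps to "M".
  idx <- findInterval(abs(xr), c(0, 1e3, 1e6, 1e9, 1e12))
  out <- paste0(xr / 1000^(idx - 1), prefixes[idx])
  out[is.na(x)] <- NA_character_
  out
}

si_label(c(512, 1500, 1e6, 1.5e8, 4e9, NA))
# "512" "1.5K" "1M" "150M" "4G" NA
```

Because it takes a numeric vector and returns a character vector of the same length, it could be passed directly as labels = si_label in scale_x_log10(), without the sapply wrapper.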