Graph plotting: only keeping most relevant data
Question
In order to save bandwidth, and so as not to have to generate pictures/graphs ourselves, I plan on using Google's charting API:
http://code.google.com/apis/chart/
which works by simply issuing a (potentially long) GET (or a POST), after which Google generates and serves the graph itself.
As of now I've got graphs made of about two thousand entries, and I'd like to trim this down to some arbitrary number of entries (e.g. by keeping only 50% of the original entries, or 10% of the original entries).
How can I decide which entries I should keep so as to have my new graph the closest to the original graph?
Is this some kind of curve-fitting problem?
Note that I know I can POST up to 16K of data to Google's chart API, which may be enough for my needs, but I'm still curious.
Recommended answer
What you are looking to do is known as downsampling or decimation. Essentially you filter the data and then drop N - 1 out of every N samples (decimation, or downsampling, by a factor of N). A crude filter is just a local moving average. E.g. if you want to decimate by a factor of N = 10, replace every 10 points with the average of those 10 points.
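As a rough illustration of the block-average idea above, here is a minimal Python sketch. The function name `decimate_mean` and the choice to average a shorter trailing block on its own are assumptions made for this example, not part of the original answer.

```python
from statistics import fmean


def decimate_mean(values, n):
    """Replace each run of n consecutive samples with their average.

    Assumption for this sketch: if len(values) is not a multiple of n,
    the shorter trailing block is averaged on its own.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Step through the data n samples at a time and average each block.
    return [fmean(values[i:i + n]) for i in range(0, len(values), n)]


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6]
    print(decimate_mean(samples, 3))  # two points instead of six
```

With about two thousand entries, calling `decimate_mean(data, 10)` would leave roughly 10% of the original number of points.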
Note that with the above scheme you may lose some high-frequency detail from your plot, since you are effectively low-pass filtering the data. If it's important to see short-term variability, an alternative approach is to plot every N points as a single vertical bar which represents the range (i.e. min..max) of those N points.
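The range-based alternative can be sketched in the same way. This is only an illustration under stated assumptions: `decimate_range` is a hypothetical helper name, and how the resulting (min, max) pairs are encoded for the charting API is left out.

```python
def decimate_range(values, n):
    """Summarise each run of n consecutive samples as a (min, max) pair.

    Each pair could be drawn as one vertical bar, so short-term
    variability stays visible even though fewer points are sent.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ranges = []
    for i in range(0, len(values), n):
        block = values[i:i + n]  # the trailing block may be shorter than n
        ranges.append((min(block), max(block)))
    return ranges


if __name__ == "__main__":
    samples = [3, 1, 4, 1, 5, 9, 2, 6]
    print(decimate_range(samples, 4))
```

Note that this sends two values per block rather than one, so for the same N it saves half as much data as plain averaging.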