How to load a csv file without knowing the number of columns beforehand
Here is what I am trying to do: I want to use bar charts to visualize all combinations of diseases that people have. I have a frequency count of people for every combination of diseases. If there are 3 diseases, I have the count for 7 groups of people. For 4 diseases, there are 15 possible groups, and for n diseases there are 2^n-1 combinations.
A suitable csv file structure for this data is:
frequency,disease1,disease2,disease3,disease4
40,1,0,0,0
36,1,0,1,0
25,0,1,0,0
37,0,0,0,1
20,0,0,1,1
5,1,1,1,1
Row 2 means that 40 people have just disease1 and no other disease. Row 3 means that 36 people have both disease1 and disease3.
After looking through the examples of how to read from a csv file, I didn't find an answer for this file structure, where I don't know in advance how many columns the file contains.
My initial bar chart for this example should show 4 bars, one for each disease, with the height being the sum of all counts where this disease occurred (has a value of 1). After a bar is selected, I plan to update the remaining bars for this subset (not implemented yet, but the data structure should support it efficiently).
Can someone give me a hint on how to load the initial data structure from an unknown number of attributes?
I have added my current version below. On line 89,
.attr("x", function(d) { return x("disease1"); })
I realized that I cannot access the column names dynamically. My current thought is that it would be best to create an array of arrays for the values and a separate array for the attribute names. But I haven't figured out how, since filling arrays from object properties collides with the arbitrary order in which the 'for...in' loop iterates. An alternative, where each value is represented by
{key: 'column name 1', value: value}
seems prohibitive due to the amount of redundancy.
My current version:
<!DOCTYPE html>
<meta charset="utf-8">
<style>
.bar {
  fill: steelblue;
}
.bar:hover {
  fill: brown;
}
.axis {
  font: 10px sans-serif;
}
.axis path,
.axis line {
  fill: none;
  stroke: #000;
  shape-rendering: crispEdges;
}
.x.axis path {
  display: none;
}
</style>
<body>
<script src="http://d3js.org/d3.v3.min.js"></script>
<script>
var margin = {top: 20, right: 20, bottom: 30, left: 40},
    width = 960 - margin.left - margin.right,
    height = 500 - margin.top - margin.bottom;
var permutations;
var x = d3.scale.ordinal()
    .rangeRoundBands([0, width], .1);
var y = d3.scale.linear()
    .range([height, 0]);
var xAxis = d3.svg.axis()
    .scale(x)
    .orient("bottom");
var yAxis = d3.svg.axis()
    .scale(y)
    .orient("left");
    // .ticks(10, "%");
var svg = d3.select("body").append("svg")
    .attr("width", width + margin.left + margin.right)
    .attr("height", height + margin.top + margin.bottom)
  .append("g")
    .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
d3.csv("diseases.csv", type, function(error, data) {
  permutations = data;
  var products = d3.keys(permutations[0]).filter(function(key) {
    return key != "frequency";
  });
  // debugger;
  x.domain(products);
  y.domain([0, d3.max(data, function(d) { return d.frequency; })]);
  svg.append("g")
      .attr("class", "x axis")
      .attr("transform", "translate(0," + height + ")")
      .call(xAxis);
  svg.append("g")
      .attr("class", "y axis")
      .call(yAxis)
    .append("text")
      .attr("transform", "rotate(-90)")
      .attr("y", 6)
      .attr("dy", ".71em")
      .style("text-anchor", "end")
      .text("Frequency");
  svg.selectAll(".bar")
      .data(data)
    .enter().append("rect")
      .attr("class", "bar")
      .attr("x", function(d) { return x("disease1"); })
      .attr("width", x.rangeBand())
      .attr("y", function(d) { return y(d.frequency); })
      .attr("height", function(d) { return height - y(d.frequency); });
});
function type(d) {
  for (var perm in d) {
    if (Object.prototype.hasOwnProperty.call(d, perm)) {
      console.log("before: " + d[perm]);
      d[perm] = +d[perm];
      console.log("after: " + d[perm]);
    }
  }
  // d.frequency = +d.frequency;
  return d;
}
</script>
My understanding is that you want to sum the frequencies (first column) for each disease, and create a bar chart using those frequencies. You can change how you process the data loaded from the CSV file:
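d3.csv("diseases.csv", type, function(error, permutations) {
  // Every column except "frequency" is treated as a disease name.
  var diseases = d3.keys(permutations[0]).filter(function(key) { return key != "frequency"; }),
      data = diseases.map(function(d) { return {disease: d, frequency: 0}; });
  permutations.forEach(function(row) {
    diseases.forEach(function(d, i) {
      // The type() function has already converted every cell to a number.
      if (row[d] === 1) {
        data[i].frequency += row["frequency"];
      }
    });
  });
  // ... the axis and bar drawing code from the question continues here ...
});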
This stores your data in an array that looks like this:
[{"disease":"disease1","frequency":81},{"disease":"disease2","frequency":30},
{"disease":"disease3","frequency":61},{"disease":"disease4","frequency":62}]
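The same summation can be tried without D3 or a file on disk. The following is only a minimal sketch under some assumptions: the CSV text is inlined from the question, the file has no quoted fields, and `parseSimpleCsv` and `sumByDisease` are hypothetical helper names that are not part of D3. The column order is taken from the header row rather than from a 'for...in' loop, so it does not depend on property iteration order.

```javascript
// Sample data copied from the question; a real page would load this from a file.
var csvText = [
  "frequency,disease1,disease2,disease3,disease4",
  "40,1,0,0,0",
  "36,1,0,1,0",
  "25,0,1,0,0",
  "37,0,0,0,1",
  "20,0,0,1,1",
  "5,1,1,1,1"
].join("\n");

// Hypothetical helper: parses simple CSV text (no quoted fields) into the
// header names and an array of row objects with numeric values.
function parseSimpleCsv(text) {
  var lines = text.trim().split(/\r?\n/);
  var columns = lines[0].split(",").map(function (s) { return s.trim(); });
  var rows = lines.slice(1).map(function (line) {
    var cells = line.split(",");
    var row = {};
    columns.forEach(function (name, i) { row[name] = Number(cells[i]); });
    return row;
  });
  return { columns: columns, rows: rows };
}

// Hypothetical helper: sums the frequency of every row in which a disease is present.
function sumByDisease(columns, rows) {
  var diseases = columns.filter(function (name) { return name !== "frequency"; });
  return diseases.map(function (disease) {
    var total = rows.reduce(function (sum, row) {
      return sum + (row[disease] === 1 ? row.frequency : 0);
    }, 0);
    return { disease: disease, frequency: total };
  });
}

var parsed = parseSimpleCsv(csvText);
var totals = sumByDisease(parsed.columns, parsed.rows);
console.log(JSON.stringify(totals));
```

Because the disease names come from the header, adding a fifth disease column to the CSV text would need no change to the code.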
Then just modify the x domain:
x.domain(diseases);
and the x attribute when you draw your <rect>s to use the particular disease:
.attr("x", function(d) { return x(d.disease); })
Making these changes gives me the following bar chart:
[Figure: bar chart with one bar per disease]
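For the follow-up step mentioned in the question (updating the remaining bars once a bar is selected), one possible approach is to filter the rows to those where the selected disease is present and then repeat the same summation for the other diseases. This is only a sketch, assuming the rows are already parsed into objects with numeric values; `subsetFor` and `sumByDisease` are hypothetical helper names, and the chart redraw itself is not shown.

```javascript
// Rows as they would look after parsing the CSV from the question.
var rows = [
  { frequency: 40, disease1: 1, disease2: 0, disease3: 0, disease4: 0 },
  { frequency: 36, disease1: 1, disease2: 0, disease3: 1, disease4: 0 },
  { frequency: 25, disease1: 0, disease2: 1, disease3: 0, disease4: 0 },
  { frequency: 37, disease1: 0, disease2: 0, disease3: 0, disease4: 1 },
  { frequency: 20, disease1: 0, disease2: 0, disease3: 1, disease4: 1 },
  { frequency: 5,  disease1: 1, disease2: 1, disease3: 1, disease4: 1 }
];

// Disease names taken from the first row, excluding the count column.
var diseases = Object.keys(rows[0]).filter(function (key) {
  return key !== "frequency";
});

// Hypothetical helper: keeps only the rows in which the selected disease is present.
function subsetFor(selected, allRows) {
  return allRows.filter(function (row) { return row[selected] === 1; });
}

// Hypothetical helper: sums frequencies per disease over the given rows.
function sumByDisease(names, someRows) {
  return names.map(function (name) {
    var total = someRows.reduce(function (sum, row) {
      return sum + (row[name] === 1 ? row.frequency : 0);
    }, 0);
    return { disease: name, frequency: total };
  });
}

// Example: the user selects the disease3 bar.
var selected = "disease3";
var remaining = diseases.filter(function (name) { return name !== selected; });
var subsetTotals = sumByDisease(remaining, subsetFor(selected, rows));
console.log(JSON.stringify(subsetTotals));
```

The resulting array has the same shape as the one used for the initial chart, so it could be bound to the bars with the same accessors.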