XML transformation in R, final bit
Problem description
I am trying to transform an XML file into a dataframe.
Example xml file:
<games id="32134">
  <game id="3962920" xsid="0">
    <time>2016-11-26T15:30:00+00:00</time>
    <group id="33765">Roses</group>
    <hteam id="2228">BlackSavers</hteam>
    <ateam id="226150">Regeton</ateam>
    <results>
    </results>
    <server sid="126" name="reg">
      <offer id="548331136">
        <states i="0" time="2016-11-26T10:03:56+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>2.750</s1>
          <s2>3.600</s2>
          <s3>2.100</s3>
        </states>
        <states i="1" time="2016-11-25T17:05:07+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>3.000</s1>
          <s2>3.600</s2>
          <s3>2.000</s3>
        </states>
      </offer>
    </server>
    <server bid="221" name="razor">
      <offer id="548415893">
        <states i="0" time="2016-11-26T10:11:26+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>653.000</s1>
          <s2>873.600</s2>
          <s3>225.100</s3>
        </states>
        <states i="1" time="2016-11-26T10:07:39+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>323.000</s1>
          <s2>321.750</s2>
          <s3>211.050</s3>
        </states>
        <states i="2" time="2016-11-25T19:54:20+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>223.100</s1>
          <s2>322.600</s2>
          <s3>232.050</s3>
        </states>
      </offer>
    </server>
    <server bid="291" name="nagie">
      <offer id="548454059">
        <states i="0" time="2016-11-26T13:21:08+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>323.000</s1>
          <s2>123.400</s2>
          <s3>342.100</s3>
        </states>
        <states i="1" time="2016-11-26T10:07:02+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>123.000</s1>
          <s2>323.500</s2>
          <s3>342.050</s3>
        </states>
        <states i="2" time="2016-11-25T21:35:50+00:00" starting_time="2016-11-26T15:30:00+00:00">
          <s1>374.000</s1>
          <s2>349.600</s2>
          <s3>200.000</s3>
        </states>
      </offer>
    </server>
  </game>
</games>
Current code:
df <- do.call("rbind", xpathApply(doc, "//game", function(m) {
  data.frame(
    game_id = xmlAttrs(m)["id"],
    t(xpathSApply(m, "group", function(g) {
      c(
        group_id = xmlAttrs(g)["id"],
        group = xmlValue(g[["group"]])
      )
    })),
    t(xpathSApply(m, "server", function(b) {
      sid <- xmlAttrs(b)[["sid"]]
      name <- xmlAttrs(b)[["name"]]
      xpathSApply(b, "offer", function(of) {
        c(
          sid = sid,
          name = name,
          id = xmlAttrs(of)[["id"]],
          do.call(cbind, xpathApply(of, "states", function(o) {
            c(s1 <- xmlValue(o[["s1"]]),
              s2 <- xmlValue(o[["s2"]]),
              s3 <- xmlValue(o[["s3"]])
            )
          }))
        )})
    })))
}))
Desired dataframe output:
My problem is, I can't figure out how to place states in the dataframe as well. The other levels are already in, and they do work. I would only need help for the last piece.
These posts helped me a lot: "xml with nested siblings to data frame in R" and "Transforming data from xml into R dataframe".
Thank you!
You can follow (2) in the answer here: Transforming data from xml into R dataframe. The idea is to search for the deepest node, here states, and then compute the ancestors using xmlParent. From that point on it is routine. For example, using just a few of the fields (you can add the rest):
library(XML)

# Parse into an internal (C-level) tree so XPath queries and xmlParent() work
doc <- xmlTreeParse("games.xml", useInternalNodes = TRUE)

# One row per <states> node; walk upwards to collect the ancestor fields
do.call("rbind", xpathApply(doc, "//states", function(states) {
  offer  <- xmlParent(states)
  server <- xmlParent(offer)
  game   <- xmlParent(server)
  games  <- xmlParent(game)
  data.frame(
    gamesId   = xmlAttrs(games)[["id"]],
    gameId    = xmlAttrs(game)[["id"]],
    groupid   = xmlAttrs(game[["group"]])[["id"]],
    groupname = xmlValue(game[["group"]]),
    offerId   = xmlAttrs(offer)[["id"]],
    states_i  = as.numeric(xmlAttrs(states)[["i"]]),
    s1        = as.numeric(xmlValue(states[["s1"]])),
    s2        = as.numeric(xmlValue(states[["s2"]])),
    stringsAsFactors = FALSE)
}))
giving:
  gamesId  gameId groupid groupname   offerId states_i     s1     s2
1   32134 3962920   33765     Roses 548331136        0   2.75   3.60
2   32134 3962920   33765     Roses 548331136        1   3.00   3.60
3   32134 3962920   33765     Roses 548415893        0 653.00 873.60
4   32134 3962920   33765     Roses 548415893        1 323.00 321.75
5   32134 3962920   33765     Roses 548415893        2 223.10 322.60
6   32134 3962920   33765     Roses 548454059        0 323.00 123.40
7   32134 3962920   33765     Roses 548454059        1 123.00 323.50
8   32134 3962920   33765     Roses 548454059        2 374.00 349.60
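
One way to "add the rest" is sketched below, under some assumptions. The XML is a trimmed copy of the sample, inlined as text so the snippet runs on its own. The column names serverSid, serverName and time are illustrative choices, not part of the original answer. Note that in the sample only the first server has a sid attribute (the other two have bid), so indexing with xmlAttrs(server)[["sid"]] would fail on those nodes; the sketch uses xmlGetAttr() with a default instead.

```r
library(XML)

# Trimmed copy of the sample document (assumption: inlined for a runnable sketch)
xml_text <- '<games id="32134">
  <game id="3962920" xsid="0">
    <group id="33765">Roses</group>
    <server sid="126" name="reg">
      <offer id="548331136">
        <states i="0" time="2016-11-26T10:03:56+00:00">
          <s1>2.750</s1><s2>3.600</s2><s3>2.100</s3>
        </states>
      </offer>
    </server>
    <server bid="221" name="razor">
      <offer id="548415893">
        <states i="0" time="2016-11-26T10:11:26+00:00">
          <s1>653.000</s1><s2>873.600</s2><s3>225.100</s3>
        </states>
        <states i="1" time="2016-11-26T10:07:39+00:00">
          <s1>323.000</s1><s2>321.750</s2><s3>211.050</s3>
        </states>
      </offer>
    </server>
  </game>
</games>'

doc <- xmlParse(xml_text, asText = TRUE)

# One data frame row per <states> node, with fields taken from its ancestors
rows <- xpathApply(doc, "//states", function(states) {
  offer  <- xmlParent(states)
  server <- xmlParent(offer)
  game   <- xmlParent(server)
  data.frame(
    gameId     = xmlGetAttr(game, "id"),
    # xmlGetAttr() returns the default when the attribute is missing,
    # which covers the servers that carry "bid" instead of "sid"
    serverSid  = xmlGetAttr(server, "sid", default = NA_character_),
    serverName = xmlGetAttr(server, "name"),
    offerId    = xmlGetAttr(offer, "id"),
    states_i   = as.numeric(xmlGetAttr(states, "i")),
    time       = xmlGetAttr(states, "time"),
    s1         = as.numeric(xmlValue(states[["s1"]])),
    s2         = as.numeric(xmlValue(states[["s2"]])),
    s3         = as.numeric(xmlValue(states[["s3"]])),
    stringsAsFactors = FALSE)
})

df <- do.call("rbind", rows)
print(df)
```

If the missing sid values matter for your analysis, you may prefer to read bid into its own column the same way rather than leaving NA.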