XML transformation in R, final bit


Problem description





I am trying to transform an XML file into a dataframe.

Example xml file:

<games id="32134">
    <game id="3962920" xsid="0">
    <time>2016-11-26T15:30:00+00:00</time>
    <group id="33765">Roses</group>
    <hteam id="2228">BlackSavers</hteam>
    <ateam id="226150">Regeton</ateam>
    <results>
    </results>
    <server sid="126" name="reg">
        <offer id="548331136">
            <states i="0" time="2016-11-26T10:03:56+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>2.750</s1>
                <s2>3.600</s2>
                <s3>2.100</s3>
            </states>
            <states i="1" time="2016-11-25T17:05:07+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>3.000</s1>
                <s2>3.600</s2>
                <s3>2.000</s3>
            </states>
        </offer>
    </server>
    <server bid="221" name="razor">
        <offer id="548415893">
            <states i="0" time="2016-11-26T10:11:26+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>653.000</s1>
                <s2>873.600</s2>
                <s3>225.100</s3>
            </states>
            <states i="1" time="2016-11-26T10:07:39+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>323.000</s1>
                <s2>321.750</s2>
                <s3>211.050</s3>
            </states>
            <states i="2" time="2016-11-25T19:54:20+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>223.100</s1>
                <s2>322.600</s2>
                <s3>232.050</s3>
            </states>
        </offer>
    </server>
    <server bid="291" name="nagie">
        <offer id="548454059">
            <states i="0" time="2016-11-26T13:21:08+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>323.000</s1>
                <s2>123.400</s2>
                <s3>342.100</s3>
            </states>
            <states i="1" time="2016-11-26T10:07:02+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>123.000</s1>
                <s2>323.500</s2>
                <s3>342.050</s3>
            </states>
            <states i="2" time="2016-11-25T21:35:50+00:00" starting_time="2016-11-26T15:30:00+00:00">
                <s1>374.000</s1>
                <s2>349.600</s2>
                <s3>200.000</s3>
            </states>
        </offer>
    </server>
</game>
</games>

Current code:

df <- do.call("rbind", xpathApply(doc, "//game", function(m) {
data.frame(
game_id = xmlAttrs(m)["id"],
t(xpathSApply(m, "group", function(g) {
  c(
    group_id = xmlAttrs(g)["id"],
    group = xmlValue(g[["group"]])
  )
})),
t(xpathSApply(m, "server",function(b){
  sid <- xmlAttrs(b)[["sid"]]
  name <- xmlAttrs(b)[["name"]]
  xpathSApply(b, "offer",function(of){
    c(
      sid = sid,
      name = name,
      id = xmlAttrs(of)[["id"]],
      do.call(cbind, xpathApply(of, "states",function(o){
        c(s1 <- xmlValue(o[["s1"]]),
          s2 <- xmlValue(o[["s2"]]),
          s3 <- xmlValue(o[["s3"]])
        )
      }))
      )})

  })))

}))

Desired dataframe output:

My problem is, I can't figure out how to place states in the dataframe as well. The other levels are already in, and they do work. I would only need help for the last piece.

These posts helped me a lot: "xml with nested siblings to data frame in R" and "Transforming data from xml into R dataframe".

Thank you!

Solution

You can follow approach (2) in the answer to "Transforming data from xml into R dataframe". The idea is to search for the deepest nodes, here the states nodes, and then walk up to each ancestor with xmlParent. From that point on it is routine. For example, using just a few of the fields (you can add the rest):

library(XML)
doc <- xmlTreeParse("games.xml", useInternalNodes = TRUE)

do.call("rbind", xpathApply(doc, "//states", function(states) {
   offer <- xmlParent(states)
   server <- xmlParent(offer)
   game <- xmlParent(server)
   games <- xmlParent(game)
   data.frame(
     gamesId = xmlAttrs(games)[["id"]],
     gameId = xmlAttrs(game)[["id"]],
     groupid = xmlAttrs(game[["group"]])[["id"]],
     groupname = xmlValue(game[["group"]]),
     offerId = xmlAttrs(offer)[["id"]],
     states_i = as.numeric(xmlAttrs(states)[["i"]]),
     s1 = as.numeric(xmlValue(states[["s1"]])),
     s2 = as.numeric(xmlValue(states[["s2"]])),
     stringsAsFactors = FALSE)
}))

giving:

  gamesId  gameId groupid groupname   offerId states_i     s1     s2
1   32134 3962920   33765     Roses 548331136        0   2.75   3.60
2   32134 3962920   33765     Roses 548331136        1   3.00   3.60
3   32134 3962920   33765     Roses 548415893        0 653.00 873.60
4   32134 3962920   33765     Roses 548415893        1 323.00 321.75
5   32134 3962920   33765     Roses 548415893        2 223.10 322.60
6   32134 3962920   33765     Roses 548454059        0 323.00 123.40
7   32134 3962920   33765     Roses 548454059        1 123.00 323.50
8   32134 3962920   33765     Roses 548454059        2 374.00 349.60
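
As a rough sketch of what "adding the rest" might look like, the version below also pulls s3, the states time attribute, and the server attributes. It assumes the XML package and inlines a trimmed copy of the sample document so it can run on its own. The helper name state_row and the output column names are made up for illustration. Because the sample uses sid on the first server and bid on the others, the sketch reads both with xmlGetAttr and an NA default, since indexing xmlAttrs with [["sid"]] fails on a server that lacks that attribute.

```r
library(XML)

# Trimmed copy of the sample document, inlined so the sketch is self-contained.
xml_text <- '<games id="32134">
  <game id="3962920" xsid="0">
    <group id="33765">Roses</group>
    <server sid="126" name="reg">
      <offer id="548331136">
        <states i="0" time="2016-11-26T10:03:56+00:00">
          <s1>2.750</s1><s2>3.600</s2><s3>2.100</s3>
        </states>
      </offer>
    </server>
    <server bid="221" name="razor">
      <offer id="548415893">
        <states i="0" time="2016-11-26T10:11:26+00:00">
          <s1>653.000</s1><s2>873.600</s2><s3>225.100</s3>
        </states>
        <states i="1" time="2016-11-26T10:07:39+00:00">
          <s1>323.000</s1><s2>321.750</s2><s3>211.050</s3>
        </states>
      </offer>
    </server>
  </game>
</games>'
doc <- xmlParse(xml_text, asText = TRUE)

# Hypothetical helper: builds one data-frame row for a single <states> node.
state_row <- function(states) {
  # Walk up from the deepest node to its ancestors.
  offer  <- xmlParent(states)
  server <- xmlParent(offer)
  game   <- xmlParent(server)
  data.frame(
    gameId     = xmlGetAttr(game, "id"),
    # xmlGetAttr() returns the supplied default when the attribute is absent,
    # so servers carrying "bid" instead of "sid" yield NA rather than an error.
    sid        = xmlGetAttr(server, "sid", NA_character_),
    bid        = xmlGetAttr(server, "bid", NA_character_),
    serverName = xmlGetAttr(server, "name", NA_character_),
    offerId    = xmlGetAttr(offer, "id"),
    states_i   = as.numeric(xmlGetAttr(states, "i")),
    time       = xmlGetAttr(states, "time"),
    s1         = as.numeric(xmlValue(states[["s1"]])),
    s2         = as.numeric(xmlValue(states[["s2"]])),
    s3         = as.numeric(xmlValue(states[["s3"]])),
    stringsAsFactors = FALSE)
}

# One row per <states> node, stacked into a single data frame.
df <- do.call("rbind", xpathApply(doc, "//states", state_row))
```

Keeping sid and bid as separate columns preserves the distinction present in the sample. If the two attributes are meant to be the same identifier, they could be merged afterwards, for example with ifelse(is.na(df$sid), df$bid, df$sid).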
