Parse XML files with subnodes of the same name


Problem description

I have an XML file; a shortened version is below:

<resultset>
  <row>
    <column name="indexpatient">2</column>
    <column name="height" null="true"></column>
    <column name="ParameterMeasure">Cardiac/MM/Dimension/LVIDd</column>
    <column name="ParameterId">MM/LVIDd</column>
    <column name="ResultIdentifier">Average</column>
    <column name="ResultValue">0.05617021151</column>
  </row>
  <row>
    <column name="indexpatient">2</column>
    <column name="height" null="true"></column>
    <column name="ParameterMeasure">Cardiac/MM/Dimension/LVIDd</column>
    <column name="ParameterId">MM/LVIDs</column>
    <column name="ResultIdentifier">Measurement No. 1</column>
    <column name="ResultValue">0.05341702</column>
  </row>
</resultset>

Ideally, each value of the name attribute (e.g. indexpatient) becomes a column in a data frame, and each row element contributes one row of values.

Can anybody help me do this in R?

I am stuck because every subnode has the same element name, 'column', and differs only in its 'name' attribute.

Recommended answer

Here is a solution based on this question/answer: R XML - combining parent and child nodes (w same name) into data frame

library(xml2)
library(dplyr)
page<-read_xml('<resultset>
  <row>
         <column name="indexpatient">2</column>
         <column name="height" null="true"></column>
         <column name="ParameterMeasure">Cardiac/MM/Dimension/LVIDd</column>
         <column name="ParameterId">MM/LVIDd</column>
         <column name="ResultIdentifier">Average</column>
         <column name="ResultValue">0.05617021151</column>
         </row>
         <row>
         <column name="indexpatient">2</column>
         <column name="height" null="true"></column>
         <column name="ParameterMeasure">Cardiac/MM/Dimension/LVIDd</column>
         <column name="ParameterId">MM/LVIDs</column>
         <column name="ResultIdentifier">Measurement No. 1</column>
         <column name="ResultValue">0.05341702</column>
         </row>
         </resultset>')


rows<- page %>% xml_find_all('//row') 

dfs<-lapply(rows, function(node){
   #find the attr value from all child nodes
   names<-node %>% xml_children() %>% xml_attr("name")  
   #find all values
   values<-node %>% xml_children() %>% xml_text()

   #create data frame and properly label the columns
   df<-data.frame(t(values), stringsAsFactors = FALSE)
   names(df)<-names
   df
})

#bind the per-row data frames together into the final data frame
answer<-bind_rows(dfs)
answer

# indexpatient height           ParameterMeasure ParameterId  ResultIdentifier   ResultValue
# 1            2        Cardiac/MM/Dimension/LVIDd    MM/LVIDd           Average 0.05617021151
# 2            2        Cardiac/MM/Dimension/LVIDd    MM/LVIDs Measurement No. 1    0.05341702
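
The answer above leaves every column as character, and the height cells that carry null="true" come through as empty strings. As a possible variation, here is a minimal sketch that turns those cells into NA and lets R guess the column types. The helper name rows_to_df is hypothetical (it is not part of xml2 or of the original answer), and reading null="true" as "missing value" is an assumption about the data. The sketch also assumes every row has the same set of column names, since it combines rows with rbind.

```r
library(xml2)

# Hypothetical helper (not from the original answer): one data-frame row per
# <row>, with headers taken from the "name" attribute of each <column>.
rows_to_df <- function(doc) {
  rows <- xml_find_all(doc, "//row")
  records <- lapply(rows, function(row) {
    cols <- xml_find_all(row, "./column")
    values <- xml_text(cols)
    # Assumption: null="true" marks a missing value, so store NA instead of "".
    values[xml_attr(cols, "null") %in% "true"] <- NA_character_
    # A named list converts to a one-row data frame.
    as.data.frame(as.list(setNames(values, xml_attr(cols, "name"))),
                  stringsAsFactors = FALSE)
  })
  # Assumption: all rows share the same column names, so rbind lines them up.
  df <- do.call(rbind, records)
  # Let R infer integer/numeric/logical types; keep text as character.
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# Usage on a cut-down version of the XML from the question.
doc <- read_xml('<resultset>
  <row>
    <column name="indexpatient">2</column>
    <column name="height" null="true"></column>
    <column name="ResultValue">0.05617021151</column>
  </row>
  <row>
    <column name="indexpatient">2</column>
    <column name="height" null="true"></column>
    <column name="ResultValue">0.05341702</column>
  </row>
</resultset>')
result <- rows_to_df(doc)
str(result)
```

Since the question mentions an XML file, note that read_xml() also accepts a file path in place of the inline string. If rows can have differing column sets, dplyr::bind_rows from the original answer is the safer way to combine them, because it fills missing columns with NA.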
