Import TCX into R using XML package
Question
I am trying to import GPS running data into R from a TCX file using the XML package. Here is a small sample of my data (only 3 track points instead of ~900):
<?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
<Activities>
<Activity Sport="Running">
<Id>2011-10-30T16:05:48Z</Id>
<Lap StartTime="2011-10-30T16:05:48Z">
<TotalTimeSeconds>3855.99</TotalTimeSeconds>
<DistanceMeters>12498.8115</DistanceMeters>
<MaximumSpeed>4.45662498</MaximumSpeed>
<Calories>1011</Calories>
<Intensity>Active</Intensity>
<TriggerMethod>Manual</TriggerMethod>
<Track>
<Trackpoint>
<Time>2011-10-30T16:05:48Z</Time>
<Position>
<LatitudeDegrees>52.33613318</LatitudeDegrees>
<LongitudeDegrees>-1.58814317</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>0.00000000</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Time>2011-10-30T16:05:49Z</Time>
<Position>
<LatitudeDegrees>52.33614810</LatitudeDegrees>
<LongitudeDegrees>-1.58814283</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>1.77584004</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Time>2011-10-30T16:05:54Z</Time>
<Position>
<LatitudeDegrees>52.33627098</LatitudeDegrees>
<LongitudeDegrees>-1.58818323</LongitudeDegrees>
</Position>
<AltitudeMeters>76.0814209</AltitudeMeters>
<DistanceMeters>15.7694969</DistanceMeters>
</Trackpoint>
</Track>
</Lap>
</Activity>
</Activities>
</TrainingCenterDatabase>
I am trying to use:
doc = xmlParse("filetest.tcx")
xmlToDataFrame(nodes = getNodeSet(doc, "//Trackpoint"))
However, this fails: the result is an empty data frame. I have found that if I remove
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
from the TrainingCenterDatabase tag at the start of the file, the import runs as expected, i.e. using the following data:
<?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase>
<Activities>
<Activity Sport="Running">
<Id>2011-10-30T16:05:48Z</Id>
<Lap StartTime="2011-10-30T16:05:48Z">
<TotalTimeSeconds>3855.99</TotalTimeSeconds>
<DistanceMeters>12498.8115</DistanceMeters>
<MaximumSpeed>4.45662498</MaximumSpeed>
<Calories>1011</Calories>
<Intensity>Active</Intensity>
<TriggerMethod>Manual</TriggerMethod>
<Track>
<Trackpoint>
<Time>2011-10-30T16:05:48Z</Time>
<Position>
<LatitudeDegrees>52.33613318</LatitudeDegrees>
<LongitudeDegrees>-1.58814317</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>0.00000000</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Time>2011-10-30T16:05:49Z</Time>
<Position>
<LatitudeDegrees>52.33614810</LatitudeDegrees>
<LongitudeDegrees>-1.58814283</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>1.77584004</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Time>2011-10-30T16:05:54Z</Time>
<Position>
<LatitudeDegrees>52.33627098</LatitudeDegrees>
<LongitudeDegrees>-1.58818323</LongitudeDegrees>
</Position>
<AltitudeMeters>76.0814209</AltitudeMeters>
<DistanceMeters>15.7694969</DistanceMeters>
</Trackpoint>
</Track>
</Lap>
</Activity>
</Activities>
</TrainingCenterDatabase>
And I get the data frame I want (apart from Position not being split into latitude and longitude, but I expect I can deal with that, unless someone can suggest a simpler way to do it directly with XPath):
> xmlToDataFrame(nodes = getNodeSet(doc, "//Trackpoint"))
Time Position AltitudeMeters DistanceMeters
1 2011-10-30T16:05:48Z 52.33613318-1.58814317 77.5234375 0.00000000
2 2011-10-30T16:05:49Z 52.33614810-1.58814283 77.5234375 1.77584004
3 2011-10-30T16:05:54Z 52.33627098-1.58818323 76.0814209 15.7694969
Obviously I don't want to remove this manually from every file I import. Am I doing something wrong (with XPath, perhaps) that prevents this from working, or is there a workaround to strip that section from the XML data?
Many thanks
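Recommended answer
It is a namespace issue. The root element declares a default namespace (the xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2" attribute), so every element in the file, Trackpoint included, belongs to it, and the unprefixed //Trackpoint matches nothing. Use a namespace prefix in the XPath instead: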
xmlToDataFrame(nodes = getNodeSet(doc, "//ns:Trackpoint", "ns"))
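The call above passes only the prefix "ns" and relies on the XML package mapping a bare prefix to the document's default namespace. As a minimal sketch (the inline tcx string and the variable names are illustrative, not from the original post), the prefix can also be bound to the namespace URI explicitly, or the namespace can be sidestepped with local-name():

```r
library(XML)

# Illustrative one-point TCX fragment; a real file would be read with xmlParse("filetest.tcx")
tcx <- '<?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
<Activities><Activity Sport="Running"><Lap><Track>
<Trackpoint>
<Time>2011-10-30T16:05:48Z</Time>
<AltitudeMeters>77.5234375</AltitudeMeters>
</Trackpoint>
</Track></Lap></Activity></Activities>
</TrainingCenterDatabase>'
doc <- xmlParse(tcx, asText = TRUE)

# No prefix: XPath looks for Trackpoint in no namespace and finds nothing
plain <- getNodeSet(doc, "//Trackpoint")

# Prefix bound explicitly to the TCX namespace URI (the prefix name is arbitrary)
tcx_ns <- c(ns = "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2")
bound <- getNodeSet(doc, "//ns:Trackpoint", namespaces = tcx_ns)

# Namespace-agnostic alternative: match on the local element name only
anyns <- getNodeSet(doc, "//*[local-name() = 'Trackpoint']")

length(plain)
length(bound)
length(anyns)
```

The explicit URI form is more verbose but does not depend on how the prefix shortcut is resolved; the local-name() form is handy when files may use different schema versions.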
To get the position split directly into latitude and longitude, you could do the following:
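library(XML)
# "ns" is used as the prefix for the document's default namespace
nodes <- getNodeSet(doc, "//ns:Trackpoint", "ns")
# flatten each Trackpoint (including the nested Position) into a one-row data frame
mydf <- plyr::ldply(nodes, function(node) as.data.frame(xmlToList(node)))
setNames(mydf, c('time', 'lat', 'long', 'alt', 'distance'))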
which gives
time lat long alt distance
1 2011-10-30T16:05:48Z 52.33613318 -1.58814317 77.5234375 0.00000000
2 2011-10-30T16:05:49Z 52.33614810 -1.58814283 77.5234375 1.77584004
3 2011-10-30T16:05:54Z 52.33627098 -1.58818323 76.0814209 15.7694969
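Since the question also asks whether XPath alone can split the position, here is a hedged alternative sketch that pulls each field with xpathSApply and converts the types along the way (all values otherwise come back as character). It assumes every Trackpoint carries all five fields; a point without a Position (for example during GPS dropout) would leave the columns with unequal lengths. The helper name tcx_field and the inline sample are hypothetical.

```r
library(XML)

# Illustrative two-point sample; a real file would be read with xmlParse("filetest.tcx")
tcx <- '<?xml version="1.0" encoding="UTF-8"?>
<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
<Activities><Activity Sport="Running"><Lap><Track>
<Trackpoint>
<Time>2011-10-30T16:05:48Z</Time>
<Position>
<LatitudeDegrees>52.33613318</LatitudeDegrees>
<LongitudeDegrees>-1.58814317</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>0.00000000</DistanceMeters>
</Trackpoint>
<Trackpoint>
<Time>2011-10-30T16:05:49Z</Time>
<Position>
<LatitudeDegrees>52.33614810</LatitudeDegrees>
<LongitudeDegrees>-1.58814283</LongitudeDegrees>
</Position>
<AltitudeMeters>77.5234375</AltitudeMeters>
<DistanceMeters>1.77584004</DistanceMeters>
</Trackpoint>
</Track></Lap></Activity></Activities>
</TrainingCenterDatabase>'
doc <- xmlParse(tcx, asText = TRUE)
ns <- c(ns = "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2")

# Hypothetical helper: text content of every node matching an XPath
tcx_field <- function(path) xpathSApply(doc, path, xmlValue, namespaces = ns)

# Paths are anchored at Trackpoint so the Lap-level DistanceMeters is not picked up
track <- data.frame(
  time     = as.POSIXct(tcx_field("//ns:Trackpoint/ns:Time"),
                        format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
  lat      = as.numeric(tcx_field("//ns:Trackpoint/ns:Position/ns:LatitudeDegrees")),
  long     = as.numeric(tcx_field("//ns:Trackpoint/ns:Position/ns:LongitudeDegrees")),
  alt      = as.numeric(tcx_field("//ns:Trackpoint/ns:AltitudeMeters")),
  distance = as.numeric(tcx_field("//ns:Trackpoint/ns:DistanceMeters"))
)
str(track)
```

This avoids the plyr dependency and yields numeric and date-time columns directly, at the cost of one XPath query per column.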