How to parse a GPX file using Haskell's xml-conduit?
Problem description
I'd like to use xml-conduit to parse GPX files. So far I've got the following:
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Control.Monad ((>=>))
import Data.Text as T
import Text.XML
import Text.XML.Cursor

data Trkpt = Trkpt {
    trkptLat :: Text,
    trkptLon :: Text,
    trkptEle :: Text,
    trkptTime :: Text
} deriving (Show)

trkptsFromFile path =
    gpxTrkpts . fromDocument <$> Text.XML.readFile def path

gpxTrkpts =
    child >=> element "{http://www.topografix.com/GPX/1/0}trk" >=>
    child >=> element "{http://www.topografix.com/GPX/1/0}trkseg" >=>
    child >=> element "{http://www.topografix.com/GPX/1/0}trkpt" >=>
    child >=> \e -> do
        let ele = T.concat $ element "{http://www.topografix.com/GPX/1/0}ele" e >>= descendant >>= content
        let time = T.concat $ element "{http://www.topografix.com/GPX/1/0}time" e >>= descendant >>= content
        let lat = T.concat $ attribute "lat" e
        let lon = T.concat $ attribute "lon" e
        return $ Trkpt lat lon ele time
A sample GPX file is here.
I'm getting strange results where the parsed text is mostly empty, with some sporadic actual values, although the original GPX file data is all valid. When there is an actual value, it is only in one of the fields of the record.
I'm quite certain I'm not using the xml-conduit API properly. What am I doing wrong?
Two issues. First, the namespace is wrong: it should be http://www.topografix.com/GPX/1/1, not http://www.topografix.com/GPX/1/0. Second, because of the extra child in front of it, your final Kleisli arrow (\e -> do ...) acts on the children of the trkpt elements rather than on the trkpt elements themselves. Here is a gpxTrkpts that should do what you want:
gpxTrkpts =
    child >=> element "{http://www.topografix.com/GPX/1/1}trk" >=>
    child >=> element "{http://www.topografix.com/GPX/1/1}trkseg" >=>
    child >=> element "{http://www.topografix.com/GPX/1/1}trkpt" >=>
    \e -> do
        let cs = child e
            ele = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}ele" >>= descendant >>= content
            time = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}time" >>= descendant >>= content
            lat = T.concat $ attribute "lat" e
            lon = T.concat $ attribute "lon" e
        return $ Trkpt lat lon ele time
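
To try the corrected traversal without a file on disk, here is a minimal, self-contained sketch that runs the same logic over a small inline document. The sample GPX data is made up for illustration, and the names gpxNs, gpxName, sampleGpx and sampleTrkpts are hypothetical helpers that do not appear in the original post. It assumes the xml-conduit and text packages are installed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Monad ((>=>))
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Text.XML (Name (..), def, parseText_)
import Text.XML.Cursor
    (Cursor, attribute, child, content, descendant, element, fromDocument)

data Trkpt = Trkpt
    { trkptLat  :: Text
    , trkptLon  :: Text
    , trkptEle  :: Text
    , trkptTime :: Text
    } deriving (Eq, Show)

-- Namespace used by GPX 1.1 documents (hypothetical helper name).
gpxNs :: Text
gpxNs = "http://www.topografix.com/GPX/1/1"

-- Build a namespaced element name, so the URI is written only once.
gpxName :: Text -> Name
gpxName local = Name local (Just gpxNs) Nothing

-- Same traversal as the answer: gpx -> trk -> trkseg -> trkpt.
-- The final arrow receives each trkpt cursor itself.
gpxTrkpts :: Cursor -> [Trkpt]
gpxTrkpts =
    child >=> element (gpxName "trk") >=>
    child >=> element (gpxName "trkseg") >=>
    child >=> element (gpxName "trkpt") >=>
    \e ->
        let -- Text of the named child element of this trkpt.
            textOf name =
                T.concat (child e >>= element (gpxName name) >>= descendant >>= content)
            -- Attributes live on the trkpt element and have no namespace.
            attrOf name = T.concat (attribute name e)
        in [Trkpt (attrOf "lat") (attrOf "lon") (textOf "ele") (textOf "time")]

-- Made-up sample document; single-quoted attributes avoid escaping.
sampleGpx :: TL.Text
sampleGpx = TL.concat
    [ "<gpx xmlns='http://www.topografix.com/GPX/1/1' version='1.1' creator='example'>"
    , "<trk><trkseg>"
    , "<trkpt lat='47.644548' lon='-122.326897'>"
    , "<ele>4.46</ele><time>2009-10-17T18:37:26Z</time>"
    , "</trkpt>"
    , "<trkpt lat='47.644549' lon='-122.326898'>"
    , "<ele>4.94</ele><time>2009-10-17T18:37:31Z</time>"
    , "</trkpt>"
    , "</trkseg></trk>"
    , "</gpx>"
    ]

sampleTrkpts :: [Trkpt]
sampleTrkpts = gpxTrkpts (fromDocument (parseText_ def sampleGpx))

main :: IO ()
main = mapM_ print sampleTrkpts
```

Importing Data.Text qualified, rather than unqualified as in the question, is a choice made here to avoid clashes with Prelude names. If a file declares the GPX 1.0 namespace instead, only gpxNs needs to change.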