Parsing XML in Clojure
Problem description
I am new to Clojure, so please bear with me. I have an XML file that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<XVar Id="cdx9" Type="Dictionary">
  <XVar Id="Base.AccruedPremium" Type="Multi" Value="" Rows="1" Columns="1">
    <Row Id="0">
      <Col Id="0" Type="Num" Value="0"/>
    </Row>
  </XVar>
  <XVar Id="TrancheAnalysis.IndexDuration" Type="Multi" Value="" Rows="1" Columns="1">
    <Row Id="0">
      <Col Id="0" Type="Num" Value="3.4380728252313069"/>
    </Row>
  </XVar>
  <XVar Id="TrancheAnalysis.IndexLevel01" Type="Multi" Value="" Rows="1" Columns="1">
    <Row Id="0">
      <Col Id="0" Type="Num" Value="30693.926279941188"/>
    </Row>
  </XVar>
  <XVar Id="TrancheAnalysis.TrancheDelta" Type="Multi" Value="" Rows="1" Columns="1">
    <Row Id="0">
      <Col Id="0" Type="Num" Value="8.9304387917502073"/>
    </Row>
  </XVar>
  <XVar Id="TrancheAnalysis.TrancheDuration" Type="Multi" Value="" Rows="1" Columns="1">
    <Row Id="0">
      <Col Id="0" Type="Num" Value="3.0775955481964035"/>
    </Row>
  </XVar>
</XVar>
This structure repeats. From it I want to produce a CSV file with these columns:
IndexName,TrancheAnalysis.IndexDuration,TrancheAnalysis.TrancheDuration
cdx9,3.4380728252313069,3.0775955481964035
...
...
I am able to parse a simple XML file like this one:
<?xml version="1.0" encoding="UTF-8"?>
<CalibrationData>
  <IndexList>
    <Index>
      <Calibrate>Y</Calibrate>
      <UseClientIndexQuotes>Y</UseClientIndexQuotes>
      <IndexName>HYCDX10</IndexName>
      <Tenor>06/20/2013</Tenor>
      <TenorName>3Y</TenorName>
      <IndexLevels>219.6</IndexLevels>
      <Tranche>Equity0To0.15</Tranche>
      <TrancheStart>0</TrancheStart>
      <TrancheEnd>0.15</TrancheEnd>
      <UseBreakEvenSpread>1</UseBreakEvenSpread>
      <UseTlet>0</UseTlet>
      <IsTlet>0</IsTlet>
      <PctExpectedLoss>0</PctExpectedLoss>
      <UpfrontFee>52.125</UpfrontFee>
      <RunningFee>0</RunningFee>
      <DeltaFee>5.3</DeltaFee>
      <CentralCorrelation>0.1</CentralCorrelation>
      <Currency>USD</Currency>
      <RescalingMethod>PTIndexRescaling</RescalingMethod>
      <EffectiveDate>06/17/2011</EffectiveDate>
    </Index>
  </IndexList>
</CalibrationData>
using this code:
(ns DynamicProgramming
  (:require [clojure.xml :as xml]))

;; Get the input files
(def calibrationFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/CalibrationQuotes.xml")
(def mktdataFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/MarketData.xml")
(def sample "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/Sample.xml")

;; Parse the calibration input file: walk every node and keep the
;; first content item of each element whose tag is one of those listed
(def CalibOp
  (for [x (xml-seq (xml/parse (java.io.File. calibrationFile)))
        :when (or (= :IndexName (:tag x))
                  (= :Tenor (:tag x))
                  (= :UpfrontFee (:tag x))
                  (= :RunningFee (:tag x))
                  (= :DeltaFee (:tag x))
                  (= :IndexLevels (:tag x))
                  (= :TrancheStart (:tag x))
                  (= :TrancheEnd (:tag x)))]
    (first (:content x))))

(println CalibOp)
But the first XML is a little more complicated, and I don't really know how to iterate through the nested structure and pull out the information.
Any help would be great.
Solution
I would use data.zip (formerly clojure.contrib.zip-filter). It provides a lot of XML-parsing power and can easily evaluate XPath-like expressions. The README describes it as a "System for filtering trees, and XML trees in particular."
Below I have some sample code for creating a "row" for the CSV file. The row is a map of the column name to the attribute value.
(ns work
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            ;; data.zip namespace; the old name was clojure.contrib.zip-filter.xml
            [clojure.data.zip.xml :as zf]))

;; create a zipper from the XML file
(def zip (zip/xml-zip (xml/parse "data.xml")))

;; pulls out a list of all of the root "Id" attribute values
(zf/xml-> zip (zf/attr :Id))

(defn value
  "Finds the id and value for a particular element"
  [xvar-zip]
  (let [id    (-> xvar-zip zip/node :attrs :Id) ; manual access
        value (zf/xml1-> xvar-zip               ; XPath-like expression to pull the value out
                         :Row                   ; need the row element
                         :Col                   ; then the column element
                         (zf/attr :Value))]     ; and finally pull the Value out
    {id value}))

;; gets the "column-value" pair for a single column
(zf/xml1-> zip
           (zf/attr= :Id "cdx9")                          ; filter on id "cdx9"
           :XVar                                          ; filter on XVars under it
           (zf/attr= :Id "TrancheAnalysis.IndexDuration") ; filter on id
           value)                                         ; apply the value function to the result

;; creates a map of every column key to its corresponding value
(apply merge (zf/xml-> zip (zf/attr= :Id "cdx9") :XVar value))
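To get from such a row map to the CSV line the question asks for, something like the following might work. This is only a sketch: the names `columns`, `csv-header`, `csv-line` and `example-row` are made up for illustration, the example map is typed in by hand from the values in the question rather than parsed, and no quoting or escaping of commas is attempted.

```clojure
(ns csv-sketch
  (:require [clojure.string :as str]))

;; Columns requested in the question, in output order (assumed fixed).
(def columns
  ["TrancheAnalysis.IndexDuration" "TrancheAnalysis.TrancheDuration"])

(defn csv-header
  "Header line: IndexName followed by the selected column ids."
  []
  (str/join "," (cons "IndexName" columns)))

(defn csv-line
  "Builds one CSV line from an index name and a row map of column id -> value."
  [index-name row]
  ;; A Clojure map is a function of its keys, so (map row columns)
  ;; looks up each selected column in order.
  (str/join "," (cons index-name (map row columns))))

;; Hand-written example shaped like the map built with (apply merge ...).
(def example-row
  {"Base.AccruedPremium"             "0"
   "TrancheAnalysis.IndexDuration"   "3.4380728252313069"
   "TrancheAnalysis.TrancheDuration" "3.0775955481964035"})

(println (csv-header))
(println (csv-line "cdx9" example-row))
```

Keeping the column list separate from the row map means the same row can be written with a different column selection without touching the parsing code.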
I'm not sure how the XML would work with multiple Dictionary XVars, as it is a root element. If you need to, one of the other functions which is useful for this type of work is mapcat, which concatenates all of the values returned from the mapping function. There are some more examples in the test source as well.
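As a rough illustration of mapcat, suppose several Dictionary XVars were wrapped in a single root element. The `<Results>` wrapper, the second dictionary `cdx10` and its value `1.5` are invented for this sketch and do not come from the question; the helper names are hypothetical too. The sketch uses only clojure.xml and core functions rather than data.zip, so it can be run without extra dependencies.

```clojure
(ns mapcat-sketch
  (:require [clojure.xml :as xml])
  (:import (java.io ByteArrayInputStream)))

;; Hypothetical input: two Dictionary XVars under an assumed <Results> root.
(def sample-xml
  (str "<Results>"
       "<XVar Id=\"cdx9\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\" Value=\"\" Rows=\"1\" Columns=\"1\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.4380728252313069\"/></Row>"
       "</XVar>"
       "</XVar>"
       "<XVar Id=\"cdx10\" Type=\"Dictionary\">"
       "<XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\" Value=\"\" Rows=\"1\" Columns=\"1\">"
       "<Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"1.5\"/></Row>"
       "</XVar>"
       "</XVar>"
       "</Results>"))

(defn parse-str
  "Parses an XML string with clojure.xml."
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn children-tagged
  "Direct children of node that carry the given tag."
  [tag node]
  (filter #(= tag (:tag %)) (:content node)))

(defn entry-value
  "Value attribute of the first Row/Col under an inner XVar."
  [xvar]
  (let [row (first (children-tagged :Row xvar))
        col (first (children-tagged :Col row))]
    (get-in col [:attrs :Value])))

(defn dictionary-entries
  "One [dictionary-id column-id value] triple per inner XVar."
  [dict]
  (for [xvar (children-tagged :XVar dict)]
    [(get-in dict [:attrs :Id]) (get-in xvar [:attrs :Id]) (entry-value xvar)]))

;; Each dictionary yields a sequence of triples; mapcat joins them into one flat sequence.
(def all-entries
  (mapcat dictionary-entries (children-tagged :XVar (parse-str sample-xml))))

(println all-entries)
```

With plain map the result would be one nested sequence per dictionary; mapcat flattens that one level, which is convenient when every entry should become its own output record.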
One other big recommendation I have is to make sure you use a lot of small functions. You'll find things much easier to debug, test, and work with.