Extract and save information from an XML file with Python and parsing


Question

Python novice here. I have an XML structure that looks like this:

<!-- ====================================================================== -->

    <person id="10007071">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >31</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >all</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3676634</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >1156680400001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-0.525" selected="yes">
            <activity type="outside" link="398700" facility="outside_15" x="653054.0233505964" y="6857528.792600333" end_time="06:58:53" >
            </activity>
            <leg mode="car" dep_time="06:58:53" trav_time="00:11:55">
                <route type="links" start_link="398700" end_link="255203" trav_time="00:11:55" distance="12314.30498323443" vehicleRefId="10007071">398700 398731 506155 506168 398730 517874 279323 284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873 63870 63871 350067 350066 85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 584893 205956 205957 205958 536023 150529 150530 392831 392832 392833 37140 476291 107074 107075 74149 74150 74151 74152 646460 646461 646462 190088 190089 190090 276937 276938 276939 276940 276941 276942 477763 270067 132825 277662 277663 181902 181923 132840 132838 132836 132834 245291 245289 245287 245285 666635 666637 666638 666639 666640 666641 344713 344711 344709 344707 344705 142088 149714 149716 251612 251610 251608 251606 428868 223363 223365 149718 283093 259788 428828 81196 260062 614779 614781 614783 614785 255201 255202 255203</route>
            </leg>
            <activity type="work" link="255203" facility="43250" x="652768.9" y="6863857.8" start_time="07:02:50" end_time="19:22:50" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="car" dep_time="19:22:50" trav_time="00:13:13">
                <route type="links" start_link="255203" end_link="398730" trav_time="00:13:13" distance="9083.291323242862" vehicleRefId="10007071">255203 640528 343439 24347 674168 531169 531167 531165 531163 531161 531159 531157 531155 414406 414407 268416 490715 233459 283092 149717 223364 264530 241890 196912 196910 391392 260834 409045 409046 598185 145996 368783 368785 368787 368789 525236 497882 538200 538202 385480 385487 164061 144907 443455 385499 385500 385501 76440 85934 85935 171962 85949 66249 66250 294493 203666 626505 626506 626507 620017 202848 610048 594253 594254 294494 484736 165207 675329 255383 293919 494873 215203 494882 494884 250728 134511 134509 537157 376845 376843 376841 376839 592779 178715 412036 412037 369862 581948 204682 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435 46436 46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="19:31:40" >
            </activity>
        </plan>

    </person>

The entire file is huge (2.5 GB, with many more person ids), which is why I need to parse it incrementally; so far I have been using iterparse. What I want is a data frame that shows each person id together with the trav_time of all of that person's legs (summed up, if possible). I'm struggling to access this information for each person id.

I've tried multiple ways; as far as I understand, the following two come closest to a possible solution.

First:

tree = ET.iterparse(gzip.open('V0_1pm/output_plans.xml.gz', 'r'))
traveltimes = defaultdict(list)
for xml_event, elem in tree:
        for person in elem:
            for plan in person:
                for leg in plan:
                    if leg.tag == "trav_time":
                        traveltimes[elem.attrib["trav_time"]]
                    elem.clear()
traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                          
traveltimes

Second:

tree = ET.iterparse(gzip.open('V0_1pm/output_plans.xml.gz', 'r'))
traveltimes = defaultdict(list)
for xml_event, elem in tree:
    attributes = elem.attrib
    if elem.tag == "trav_time":
            traveltimes[attributes["trav_time"]]
    elem.clear()
traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                        
traveltimes

Thank you very much for your help and tips!
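A note on why both attempts come back empty: in the sample data, trav_time is an attribute of the leg element, not a tag name, so a test such as `leg.tag == "trav_time"` never matches. In addition, a bare lookup on a `defaultdict(list)` only creates a key with an empty list; nothing is stored unless a value is appended. The following minimal sketch illustrates both points using one leg element copied from the sample (the variable names are only for illustration):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A single <leg> element copied from the sample data (route child omitted).
leg = ET.fromstring('<leg mode="car" dep_time="06:58:53" trav_time="00:11:55"></leg>')

print(leg.tag)                  # the element name: leg
print(leg.attrib["trav_time"])  # the attribute value: 00:11:55
print(leg.tag == "trav_time")   # the comparison used in both attempts: False

traveltimes = defaultdict(list)
traveltimes[leg.attrib["trav_time"]]   # lookup only: creates the key with an empty list
print(dict(traveltimes))               # {'00:11:55': []}

# Appending under a person id (taken from the sample) is what actually stores a value.
traveltimes.clear()
traveltimes["10007071"].append(leg.attrib["trav_time"])
print(dict(traveltimes))               # {'10007071': ['00:11:55']}
```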

Update

Expanded sample to replicate the data structure:

    <person id="10002042">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >86</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3674945</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2601700100002</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >f</attribute>
        </attributes>
        <plan score="-0.13749999999999998" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="09:49:38" >
            </activity>
            <leg mode="car_passenger" dep_time="09:49:38" trav_time="00:02:36">
                <route type="links" start_link="284251" end_link="63873" trav_time="00:02:36" distance="3117.285137236383" vehicleRefId="null">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873</route>
            </leg>
            <activity type="outside" link="63873" facility="outside_2" x="656055.3097541996" y="6859009.979613776" end_time="09:52:18" >
            </activity>
            <leg mode="outside" dep_time="09:52:18" trav_time="00:00:00">
                <route type="generic" start_link="63873" end_link="85890" trav_time="00:00:00" distance="746.7439307235369"></route>
            </leg>
            <activity type="outside" link="85890" facility="outside_3" x="656635.5166858744" y="6859480.071535116" end_time="09:53:00" >
            </activity>
            <leg mode="car_passenger" dep_time="09:53:00" trav_time="00:01:21">
                <route type="links" start_link="85890" end_link="47652" trav_time="00:01:21" distance="1499.4956773327315" vehicleRefId="null">85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 408601 47652</route>
            </leg>
            <activity type="outside" link="47652" facility="outside_4" x="657143.7893766644" y="6860882.64702696" end_time="10:41:02" >
            </activity>
            <leg mode="outside" dep_time="10:41:02" trav_time="00:00:00">
                <route type="generic" start_link="47652" end_link="466140" trav_time="00:00:00" distance="16.659217552989976"></route>
            </leg>
            <activity type="outside" link="466140" facility="outside_5" x="657155.3197720037" y="6860894.671149082" end_time="10:43:55" >
            </activity>
            <leg mode="car_passenger" dep_time="10:43:55" trav_time="00:01:32">
                <route type="links" start_link="466140" end_link="85887" trav_time="00:01:32" distance="1841.175613889593" vehicleRefId="null">466140 666788 205956 205957 205958 315381 584891 7193 150557 535291 535555 569763 569764 569744 202426 202425 202424 535572 85887</route>
            </leg>
            <activity type="outside" link="85887" facility="outside_6" x="656620.921626125" y="6859492.595666251" end_time="10:45:38" >
            </activity>
            <leg mode="outside" dep_time="10:45:38" trav_time="00:00:00">
                <route type="generic" start_link="85887" end_link="63872" trav_time="00:00:00" distance="744.9330931635377"></route>
            </leg>
            <activity type="outside" link="63872" facility="outside_7" x="656043.6710628852" y="6859021.737831518" end_time="10:46:13" >
            </activity>
            <leg mode="car_passenger" dep_time="10:46:13" trav_time="00:02:37">
                <route type="links" start_link="63872" end_link="46435" trav_time="00:02:37" distance="3138.4720080186116" vehicleRefId="null">63872 63869 332997 332998 85873 525752 525750 525764 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="10:48:56" >
            </activity>
            <leg mode="outside" dep_time="10:48:56" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="10:49:17" >
            </activity>
            <leg mode="car_passenger" dep_time="10:49:17" trav_time="00:00:04">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:04" distance="131.48553148334906" vehicleRefId="null">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="10:49:27" >
            </activity>
        </plan>

    </person>

<!-- ====================================================================== -->

    <person id="10002043">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >90</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3674946</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2400810100001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-0.2636111111111111" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="08:29:24" >
            </activity>
            <leg mode="car" dep_time="08:29:24" trav_time="00:02:54">
                <route type="links" start_link="284251" end_link="63873" trav_time="00:02:54" distance="3117.285137236383" vehicleRefId="10002043">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873</route>
            </leg>
            <activity type="outside" link="63873" facility="outside_2" x="656055.3097541996" y="6859009.979613776" end_time="08:32:04" >
            </activity>
            <leg mode="outside" dep_time="08:32:04" trav_time="00:00:00">
                <route type="generic" start_link="63873" end_link="85890" trav_time="00:00:00" distance="746.7439307235369"></route>
            </leg>
            <activity type="outside" link="85890" facility="outside_3" x="656635.5166858744" y="6859480.071535116" end_time="08:32:46" >
            </activity>
            <leg mode="car" dep_time="08:32:46" trav_time="00:01:40">
                <route type="links" start_link="85890" end_link="47652" trav_time="00:01:40" distance="1499.4956773327315" vehicleRefId="10002043">85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 408601 47652</route>
            </leg>
            <activity type="outside" link="47652" facility="outside_4" x="657143.7893766644" y="6860882.64702696" end_time="09:35:48" >
            </activity>
            <leg mode="outside" dep_time="09:35:48" trav_time="00:00:00">
                <route type="generic" start_link="47652" end_link="466140" trav_time="00:00:00" distance="16.659217552989976"></route>
            </leg>
            <activity type="outside" link="466140" facility="outside_5" x="657155.3197720037" y="6860894.671149082" end_time="09:42:26" >
            </activity>
            <leg mode="car" dep_time="09:42:26" trav_time="00:02:00">
                <route type="links" start_link="466140" end_link="85887" trav_time="00:02:00" distance="1841.175613889593" vehicleRefId="10002043">466140 666788 205956 205957 205958 315381 584891 7193 150557 535291 535555 569763 569764 569744 202426 202425 202424 535572 85887</route>
            </leg>
            <activity type="outside" link="85887" facility="outside_6" x="656620.921626125" y="6859492.595666251" end_time="09:44:09" >
            </activity>
            <leg mode="outside" dep_time="09:44:09" trav_time="00:00:00">
                <route type="generic" start_link="85887" end_link="63872" trav_time="00:00:00" distance="744.9330931635377"></route>
            </leg>
            <activity type="outside" link="63872" facility="outside_7" x="656043.6710628852" y="6859021.737831518" end_time="09:44:44" >
            </activity>
            <leg mode="car" dep_time="09:44:44" trav_time="00:03:00">
                <route type="links" start_link="63872" end_link="46435" trav_time="00:03:00" distance="3138.4720080186116" vehicleRefId="10002043">63872 63869 332997 332998 85873 525752 525750 525764 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="09:47:28" >
            </activity>
            <leg mode="outside" dep_time="09:47:28" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="09:47:49" >
            </activity>
            <leg mode="car" dep_time="09:47:49" trav_time="00:00:14">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:14" distance="131.48553148334906" vehicleRefId="10002043">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="09:55:48" >
            </activity>
            <leg mode="outside" dep_time="09:55:48" trav_time="00:00:00">
                <route type="generic" start_link="398730" end_link="284251" trav_time="00:00:00" distance="204.84459212547162"></route>
            </leg>
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="09:59:24" >
            </activity>
            <leg mode="car" dep_time="09:59:24" trav_time="00:02:07">
                <route type="links" start_link="284251" end_link="525753" trav_time="00:02:07" distance="2349.4934769631172" vehicleRefId="10002043">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 362748 643661 525753</route>
            </leg>
            <activity type="outside" link="525753" facility="outside_11" x="655306.9611509901" y="6858641.834279304" end_time="10:35:48" >
            </activity>
            <leg mode="outside" dep_time="10:35:48" trav_time="00:00:00">
                <route type="generic" start_link="525753" end_link="133164" trav_time="00:00:00" distance="70.96782044637413"></route>
            </leg>
            <activity type="outside" link="133164" facility="outside_12" x="655356.203591104" y="6858692.93822857" end_time="10:44:25" >
            </activity>
            <leg mode="car" dep_time="10:44:25" trav_time="00:02:45">
                <route type="links" start_link="133164" end_link="46435" trav_time="00:02:45" distance="2594.925451303471" vehicleRefId="10002043">133164 133165 525784 525781 159395 582076 84099 84100 525760 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="10:46:48" >
            </activity>
            <leg mode="outside" dep_time="10:46:48" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="10:47:09" >
            </activity>
            <leg mode="car" dep_time="10:47:09" trav_time="00:00:14">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:14" distance="131.48553148334906" vehicleRefId="10002043">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="10:47:19" >
            </activity>
        </plan>

    </person>

<!-- ====================================================================== -->

    <person id="10004136">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >41</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3675631</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2403610200001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >f</attribute>
        </attributes>
        <plan score="-0.0375" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="19:22:27" >
            </activity>
            <leg mode="car" dep_time="19:22:27" trav_time="00:02:07">
                <route type="links" start_link="284251" end_link="525753" trav_time="00:02:07" distance="2349.4934769631172" vehicleRefId="10004136">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 362748 643661 525753</route>
            </leg>
            <activity type="outside" link="525753" facility="outside_11" x="655306.9611509901" y="6858641.834279304" end_time="19:24:31" >
            </activity>
        </plan>

    </person>
<!-- ====================================================================== -->

    <person id="10004137">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >53</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3675632</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >1157470400001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >true</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-1.518611111111111" selected="yes">
            <activity type="outside" link="31240" facility="outside_13" x="652838.038196341" y="6858295.183610428" end_time="07:34:00" >
            </activity>
            <leg mode="access_walk" dep_time="07:34:00" trav_time="00:00:39">
                <route type="generic" start_link="31240" end_link="pt_StopPoint:59298" trav_time="00:00:39" distance="46.250835788845635"></route>
            </leg>
            <activity type="pt interaction" link="31240" x="652838.038196341" y="6858295.183610428" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="07:34:39" trav_time="00:02:21">
                <route type="enriched_pt" start_link="pt_StopPoint:59298" end_link="pt_StopPoint:59666" trav_time="00:02:21" distance="515.6409073075592">{"inVehicleTime":120.0,"transferTime":21.0,"accessStopIndex":26,"egressStopindex":27,"transitRouteId":"93517783-1_287780","transitLineId":"100110007:7","departureId":"93517632-1_287842_06:58:00"}</route>
            </leg>
            <activity type="pt interaction" link="31240" x="652838.038196341" y="6858295.183610428" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="07:37:00" trav_time="00:08:29">
                <route type="generic" start_link="pt_StopPoint:59666" end_link="508756" trav_time="00:08:29" distance="610.543587585534"></route>
            </leg>
            <activity type="outside" link="508756" facility="outside_14" x="652601.8490830011" y="6857663.731302492" end_time="07:53:26" >
            </activity>
            <leg mode="access_walk" dep_time="07:53:26" trav_time="00:08:29">
                <route type="generic" start_link="508756" end_link="pt_StopPoint:59666" trav_time="00:08:29" distance="610.543587585534"></route>
            </leg>
            <activity type="pt interaction" link="508756" x="652601.8490830011" y="6857663.731302492" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:01:55" trav_time="00:24:05">
                <route type="enriched_pt" start_link="pt_StopPoint:59666" end_link="pt_StopPoint:59209" trav_time="00:24:05" distance="7410.255050348954">{"inVehicleTime":1260.0,"transferTime":185.0,"accessStopIndex":3,"egressStopindex":17,"transitRouteId":"93517741-1_288723","transitLineId":"100110007:7","departureId":"93517701-1_288827_08:01:00"}</route>
            </leg>
            <activity type="pt interaction" link="508756" x="652601.8490830011" y="6857663.731302492" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="08:26:00" trav_time="00:01:05">
                <route type="generic" start_link="pt_StopPoint:59209" end_link="pt_StopPoint:59212" trav_time="00:01:05" distance="78.60144797794317"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59209" x="651042.0886563308" y="6863599.716479325" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:27:05" trav_time="00:08:54">
                <route type="enriched_pt" start_link="pt_StopPoint:59212" end_link="pt_StopPoint:59470" trav_time="00:08:54" distance="2841.5271228126094">{"inVehicleTime":420.0,"transferTime":114.498793351715,"accessStopIndex":17,"egressStopindex":22,"transitRouteId":"95331274-1_267292","transitLineId":"100110008:8","departureId":"95331302-1_267323_08:07:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59209" x="651042.0886563308" y="6863599.716479325" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="08:36:00" trav_time="00:03:05">
                <route type="generic" start_link="pt_StopPoint:59470" end_link="269385" trav_time="00:03:05" distance="221.08599197383575"></route>
            </leg>
            <activity type="work" link="269385" facility="22974" x="649200.4" y="6861852.6" start_time="07:38:40" end_time="16:38:40" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="access_walk" dep_time="16:38:40" trav_time="00:03:05">
                <route type="generic" start_link="269385" end_link="pt_StopPoint:59470" trav_time="00:03:05" distance="221.08599197383575"></route>
            </leg>
            <activity type="pt interaction" link="269385" x="649200.4" y="6861852.6" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="16:41:45" trav_time="00:09:15">
                <route type="enriched_pt" start_link="pt_StopPoint:59470" end_link="pt_StopPoint:59212" trav_time="00:09:15" distance="2841.5271228126094">{"inVehicleTime":420.0,"transferTime":135.0,"accessStopIndex":6,"egressStopindex":11,"transitRouteId":"95305985-1_264552","transitLineId":"100110008:8","departureId":"95305925-1_264577_16:36:00"}</route>
            </leg>
            <activity type="pt interaction" link="269385" x="649200.4" y="6861852.6" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="16:51:00" trav_time="00:01:05">
                <route type="generic" start_link="pt_StopPoint:59212" end_link="pt_StopPoint:59209" trav_time="00:01:05" distance="78.60144797794317"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59212" x="650982.2282691017" y="6863608.229197035" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="16:52:05" trav_time="00:19:54">
                <route type="enriched_pt" start_link="pt_StopPoint:59209" end_link="pt_StopPoint:59298" trav_time="00:19:54" distance="6894.614143041396">{"inVehicleTime":1140.0,"transferTime":54.498793351711356,"accessStopIndex":13,"egressStopindex":26,"transitRouteId":"93518107-1_287714","transitLineId":"100110007:7","departureId":"93518059-1_287550_16:35:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59212" x="650982.2282691017" y="6863608.229197035" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="17:12:00" trav_time="00:00:39">
                <route type="generic" start_link="pt_StopPoint:59298" end_link="31240" trav_time="00:00:39" distance="46.250835788845635"></route>
            </leg>
            <activity type="outside" link="31240" facility="outside_13" x="652838.038196341" y="6858295.183610428" end_time="17:14:00" >
            </activity>
        </plan>

    </person>

Recommended answer

Try changing your for loop to the following, and see if it works:

for xml_event, elem in tree:
    if elem.tag == 'person':
        items = list(elem)
        target = items[1]  # second child of <person>: the <plan> element in the sample
        if target.attrib['selected'] == 'yes':
            traveltimes[elem.attrib["id"]]
            legs = list(items[1])
            for leg in legs:
                if leg.tag == 'leg':
                    traveltimes[leg.attrib["trav_time"]]
        elem.clear()  # release the finished <person> subtree


traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                        
traveltimes

My output, from your XML above:

10007071
10007071
00:11:55
00:13:13
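To get from there to one summed travel time per person, the trav_time strings can be converted to seconds and accumulated under the person id while streaming. The sketch below is one possible way to do this, not the answerer's code. It makes several assumptions: the `<population>` root element, the trimmed in-memory sample, and the helper names `to_seconds` and `total_travel_times` are all hypothetical, and only plans with selected="yes" are counted.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-in for the real file: a trimmed copy of the first person,
# wrapped in an assumed <population> root element.
SAMPLE = b"""<population>
  <person id="10007071">
    <attributes/>
    <plan score="-0.525" selected="yes">
      <activity type="outside"/>
      <leg mode="car" dep_time="06:58:53" trav_time="00:11:55"/>
      <activity type="work"/>
      <leg mode="car" dep_time="19:22:50" trav_time="00:13:13"/>
      <activity type="outside"/>
    </plan>
  </person>
</population>"""


def to_seconds(hms):
    """Convert an "HH:MM:SS" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def total_travel_times(source):
    """Return {person id: summed trav_time in seconds} over selected plans."""
    totals = defaultdict(int)
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "person":
            continue
        for plan in elem.iterfind("plan"):
            if plan.get("selected") != "yes":
                continue
            for leg in plan.iterfind("leg"):
                totals[elem.get("id")] += to_seconds(leg.get("trav_time"))
        elem.clear()  # drop the finished person's children to limit memory use
    return dict(totals)


totals = total_travel_times(io.BytesIO(SAMPLE))
print(totals)  # {'10007071': 1508}
```

For the real data, the source would presumably be `gzip.open('V0_1pm/output_plans.xml.gz', 'rb')` instead of the in-memory sample, and the resulting dict could be turned into a pandas object with something like `pd.Series(totals, name="trav_time_s")`. Looking plans up by tag name rather than by position (`items[1]`) avoids depending on the order of the children of each person.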
