Use BeautifulSoup to Iterate over XML to pull specific tags and store in variable

Views: 154  Posted: 2020/5/4 5:01:20  Tags: python, xml, variables, loops, beautifulsoup


Problem description

I'm fairly new to programming and have been looking for a solution to this, but all I can find are bits and pieces, and I've had no luck putting them together.

I'm trying to use BeautifulSoup4 in Python to scrape some XML and store the text between specific tags in variables. The data comes from a med student training program, and right now everything needed has to be found manually, so I'm trying to make that more efficient with a scraping program.

For example, say I'm experimenting with test data like this:

<AllergyList>
<Allergy>
    <Deleted>n</Deleted>
    <Status>
        <Active/>
    </Status>
    <ExternalID/>
    <Patient>
        <ExternalID/>
        <FirstName>Testcase</FirstName>
        <LastName>casetest</LastName>
    </Patient>
    <Allergen>
        <Name>Flagyl (metronidazole)</Name>
        <Drug>
            <NDCID>00025182151,00025182131,00025182150</NDCID>
        </Drug>
    </Allergen>
    <Reaction>difficulty breathing</Reaction>
    <OnsetDate>02/02/2013</OnsetDate>
</Allergy>
<Allergy>
    <Deleted>n</Deleted>
    <Status>
        <Active/>
    </Status>
    <ExternalID/>
    <Patient>
        <ExternalID/>
        <FirstName>Testcase</FirstName>
        <LastName>casetest</LastName>
    </Patient>
    <Allergen>
        <Name>Bactrim (sulfamethoxazole-trimethoprim)</Name>
        <Drug>
            <NDCID>13310014501,49999023220</NDCID>
        </Drug>
    </Allergen>
    <Reaction>swelling</Reaction>
    <OnsetDate>05/03/2002</OnsetDate>
</Allergy>
<Number>2</Number>
</AllergyList>

I've been trying to pull the <Name> tag from inside each of the <Allergen> tags, along with the corresponding data from the <OnsetDate> and <Reaction> tags, and store the results in their own variables.

So, for example, I would want to pull Flagyl (metronidazole), difficulty breathing, 02/02/2013, then Bactrim (sulfamethoxazole-trimethoprim), swelling, 05/03/2002, and so on, placing them in separate variables that I can use later.

Pulling the first set from the <Allergen> tag is easy, but I'm having trouble figuring out how to iterate over the XML and store the pulled data in variables. I've been trying to use a for loop that stores the data in an array or list, but the way I've written it, I pull the same data over and over, once for each iteration I calculate with the len() function, and I still haven't managed to store any of it in an array.

I've been racking my brain over this for a while now, and I think I may just not be that smart, so any help, or even a pointer in the right direction, would be immensely appreciated.
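
One way the iteration described above might be approached is sketched here. It is not a posted answer, only a minimal illustration under some assumptions: the XML is already available as a string, the lxml package is installed (BeautifulSoup's "xml" parser depends on it), and the names xml_data, text_of, and allergies are made up for this example. The point it tries to show is that searching inside each <Allergy> tag, instead of searching the whole document on every pass, is what keeps the loop from returning the same first record repeatedly.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the test data from the question (two <Allergy> records).
xml_data = """
<AllergyList>
  <Allergy>
    <Allergen><Name>Flagyl (metronidazole)</Name></Allergen>
    <Reaction>difficulty breathing</Reaction>
    <OnsetDate>02/02/2013</OnsetDate>
  </Allergy>
  <Allergy>
    <Allergen><Name>Bactrim (sulfamethoxazole-trimethoprim)</Name></Allergen>
    <Reaction>swelling</Reaction>
    <OnsetDate>05/03/2002</OnsetDate>
  </Allergy>
  <Number>2</Number>
</AllergyList>
""".strip()

# Assumption: lxml is installed. The "xml" parser keeps tag names
# case-sensitive, so "OnsetDate" must be spelled exactly as in the data.
soup = BeautifulSoup(xml_data, "xml")


def text_of(parent, tag_name):
    """Hypothetical helper: stripped text of the first matching descendant, or None."""
    if parent is None:
        return None
    tag = parent.find(tag_name)
    return tag.get_text(strip=True) if tag is not None else None


allergies = []
# find_all returns every <Allergy> tag, so no len()-based counter is needed.
for allergy in soup.find_all("Allergy"):
    # Search relative to this one record, not the whole document.
    allergies.append({
        "name": text_of(allergy.find("Allergen"), "Name"),
        "reaction": text_of(allergy, "Reaction"),
        "onset_date": text_of(allergy, "OnsetDate"),
    })

for record in allergies:
    print(record["name"], record["reaction"], record["onset_date"], sep=" | ")
```

Each record ends up as a dictionary in the allergies list, so a later step can read, say, allergies[1]["reaction"] instead of juggling a separate variable per value. A list of dictionaries is only one possible layout; tuples or a small class would serve equally well.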
