使用PL/SQL解析大型XML文件 [英] Parsing large XML file with PL/SQL

查看:106
本文介绍了使用PL/SQL解析大型XML文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述


我正在尝试解析非常大的XML文件.文件足够大,无法使用VARCHAR2,因此我决定使用CLOB.代码看起来不错,但仍然出现错误.
这是我正在解析的xml文件的示例:


I'm trying to parse quite large XML file. FIle is large enough to make impossible to use VARCHAR2, so I've decided to use CLOB. Code looks fine, but I still get an error.
Here is sample of xml file I'm parsing:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<SeznamOvmIndex>
    <Subjekt>
        <Zkratka>CUZK</Zkratka>
        <ICO>00025712</ICO>
        <Nazev>Český úřad zeměměřický a katastrální</Nazev>
        <AdresaUradu>
            <AdresniBod>25133616</AdresniBod>
            <UliceNazev>Pod sídlištěm</UliceNazev>
            <CisloDomovni>1800</CisloDomovni>
            <CisloOrientacni>9</CisloOrientacni>
            <ObecNazev>Praha</ObecNazev>
            <ObecKod>554782</ObecKod>
            <PSC>18200</PSC>
            <KrajNazev>Hlavní město Praha</KrajNazev>
        </AdresaUradu>
        <Email>
            <Polozka>
                <Typ text="podatelna">2</Typ>
                <Email>cuzk@cuzk.cz</Email>
                <Poznamka>Elektronická podatelna ČÚZK</Poznamka>
            </Polozka>
        </Email>
        <TypSubjektu id="11">Orgán státní správy</TypSubjektu>
        <PravniForma type="325">Organizační složka státu</PravniForma>
        <PrimarniOvm>Ano</PrimarniOvm>
        <IdDS>uuaaatg</IdDS>
        <TypDS>OVM</TypDS>
        <StavDS>1</StavDS>
        <StavSubjektu>1</StavSubjektu>
            <DetailSubjektu>http://seznam.gov.cz/ovm/datafile.do?format=xml&amp;service=seznamovm&amp;id=CUZK</DetailSubjektu>
    </Subjekt>
</SeznamOvmIndex>

这是代码:
开启SERVEROUTPUT

And this is code:
SET SERVEROUTPUT ON

DECLARE
  xmlClob CLOB;
  xmlFile UTL_FILE.FILE_TYPE;
  x XMLType;
 BEGIN
   xmlFile := UTL_FILE.FOPEN('XMLPARSERADRESYCUZK', 'pokus.xml','R');
  LOOP
    BEGIN
      UTL_FILE.GET_LINE(xmlFile,xmlClob,NULL);
    EXCEPTION WHEN No_Data_Found THEN EXIT; END;
  END LOOP;
  UTL_FILE.FCLOSE(xmlFIle);
  x := XMLType.createXML(xmlClob);
  FOR r IN (
   SELECT ExtractValue(Value(p),'/Subjekt/Zkratka/text()') as kod
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/UliceNazev/text()') as ulice
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/CisloDomovni/text()') as  cislo_domovni
         ,ExtractValue(Value(p),'/Subjekt/AdresaUradu/CisloOrientacni/text()') as cislo_orientacni
    FROM   TABLE(XMLSequence(Extract(x,'/SeznamOvmIndex/Subjekt'))) p
   WHERE ExtractValue(Value(p),'/Subjekt/Zkratka/text()') = 'CUZK'
   ) LOOP
      dbms_output.put_line(r.kod||' '||r.ulice||' '||r.cislo_domovni||'/'||r.cislo_orientacni);
   END LOOP;
 END;

我认为它应该运行正确,但是当我获得SQL Developer时,我得到了:

I think that it shoud run right, but when I got SQL Developer I got:

第1行出现错误 ORA-06512:不存在"SYS.XMLTYPE",第5行 ORA-06512:第13行 31011. 00000-"XML解析失败" *原因:尝试解析文档时,XML解析器返回错误. *操作:检查要解析的文档是否有效.

Error at line 1 ORA-06512: na "SYS.XMLTYPE", line 5 ORA-06512: na line 13 31011. 00000 - "XML parsing failed" *Cause: XML parser returned an error while trying to parse the document. *Action: Check if the document to be parsed is valid.

推荐答案

您正在逐行读取文件,但是每行都覆盖了xmlClob,而不是追加.您可以通过读取varchar2缓冲区并追加来建立CLOB,但也可以使用DBMS_LOB内置过程来为您完成此操作:

You are reading the file line by line, but overwritting your xmlClob with each line, not appending. You could build up the CLOB by reading into a varchar2 buffer and appending, but you can also use the DBMS_LOB built-in procedures to do it for you:

DECLARE
  xmlClob CLOB;
  xmlFile BFILE;
  x XMLType;

  src_offset number := 1 ;
  dest_offset number := 1 ;
  lang_ctx number := DBMS_LOB.DEFAULT_LANG_CTX;
  warning integer;
BEGIN
  xmlFile := BFILENAME('XMLPARSERADRESYCUZK', 'pokus.xml');
  DBMS_LOB.CREATETEMPORARY(xmlClob, true);
  DBMS_LOB.FILEOPEN(xmlFile, DBMS_LOB.FILE_READONLY);
  DBMS_LOB.LOADCLOBFROMFILE(xmlClob, xmlFile, DBMS_LOB.LOBMAXSIZE, src_offset,
    dest_offset, DBMS_LOB.DEFAULT_CSID, lang_ctx, warning);
  x := XMLType.createXML(xmlClob);
  DBMS_LOB.FILECLOSEALL();
  DBMS_LOB.FREETEMPORARY(xmlClob);
  FOR r IN (
...

当我使用它并加载文件时,我得到输出:

When I use that and load your file I get the output:

CUZK Pod smdli.t.m 1800/9

您可能希望围绕DBMS_LOB调用进行一些错误检查,这只是一个简单的演示.

You probably want some error checkign around the DBMS_LOB calls, this is just a simple demo.

这篇关于使用PL/SQL解析大型XML文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆