将sqlplus结果假脱机到xml文件时不需要的新行 [英] Unwanted new lines when spooling an sqlplus result to xml file

查看:110
本文介绍了将sqlplus结果假脱机到xml文件时不需要的新行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从数据库中提取一些数据到XML文件中. 为此,请使用bash脚本来调用sqlplus命令,并将结果假脱机到新文件中.

I'm trying to extract some data from my database into a XML file. To do so, in use a bash script that calls sqlplus command and spool the result into a new file.

一旦提取结果,就会发生问题.我的xml文件不再有效,因为添加了一些不需要的新行...

The problem happens once the result extracted. My xml file isn't valid anymore because there are some unwanted new lines added...

这是我想要的例子:

<xml>
 <element>John</element>
 <element>some data</element>
 <element>a longer data line</element>
</xml>

这就是我得到的:

<xml>
 <element>John</element>
 <element>some data</eleme
 nt>
 <element>a longer data 
 line</element>
</xml>

似乎最长的行被剪掉了,但是我在Sqlplus中将行大小设置为32767,这些行并不那么长...

It seems that the longest lines are cut but I set linesize to 32767 in Sqlplus and these lines aren't so long...

这是我的sqlplus命令的样子:

This is what my sqlplus command looks like:

sqlplus -s {connection} << EOF
set serveroutput on size unlimited
set feedback off
set termout off
set linesize 32767

spool file.xml;

DECLARE
l_xmltype XMLTYPE;
l_ctx dbms_xmlgen.ctxhandle;
v_clob CLOB;
v_clob_length INTEGER;
pos INTEGER;
buffer VARCHAR2(32767);
amount BINARY_INTEGER := 32767;

BEGIN

l_ctx := dbms_xmlgen.newcontext('SELECT a.rowid, a.* FROM mytable a');
l_xmltype := dbms_xmlgen.getXmlType(l_ctx);
dbms_xmlgen.closeContext(l_ctx);

v_clob := l_xmltype.getClobVal;
v_clob_length := length(v_clob);

WHILE pos < clob_length LOOP
 dbms_lob.read(v_clob, amount, pos, buffer);
 dbms_output.put_line(buffer);
 pos := pos + amount;
END LOOP;

END;
/
EOF
Spool off;

您有什么帮我解决这个问题的线索吗?

Do you have any clue to help me solve this problem?

谢谢!

推荐答案

正如@kfinity所建议的,这与CLOB处理有关,而且与dbms_output的工作方式有关.您正在以32k的块为单位读取CLOB,并用put_line()将这些块中的每一个都写出,这会在每个32k块之后附加一个换行符.这些不与XML文档中现有的换行符对齐,因此您可以得到原始的换行符,然后得到其他的换行符-看起来有些随机和中间文本,但实际上位于可预测的位置.

As @kfinity suggested, this is related to CLOB handling, but also to how dbms_output works. You're reading the CLOB in chunks of 32k, and writing each of those chunks out with put_line(), which appends a newline character after each 32k chunk. Those are not aligned with any existing line breaks within your XML document, so you get the original breaks, then extra ones - which appear somewhat random and mid-text, but are actually in predictable places.

一个明显的解决方案是从put_line()切换到put(),但这会破坏最大缓冲区大小并抛出类似"ORU-10028:行长溢出,每行限制32767字节"的信息.

An obvious solution is to switch from put_line() to put(), but that will break the maximum buffer size and throw something like "ORU-10028: line length overflow, limit of 32767 bytes per line".

您可以一次只读取一行,而不必读取固定的32k块. CLOB并不真正理解这样的行,但是您可以查找换行符,例如:

Rather than reading in fixed 32k chunks, you could read one line at a time; the CLOB doesn't really understand lines as such, but you can look for line breaks, something like:

WHILE pos < v_clob_length LOOP
  -- read to next newline if there is one, rest of CLOB if not
  if dbms_lob.instr(v_clob, chr(10), pos) > 0 then
    amount := dbms_lob.instr(v_clob, chr(10), pos) - pos;
    dbms_lob.read(v_clob, amount, pos, buffer);
    pos := pos + amount + 1; -- skip newline character
  else
    amount := 32767;
    dbms_lob.read(v_clob, amount, pos, buffer);
    pos := pos + amount;
  end if;

  dbms_output.put_line(buffer);
END LOOP;

if在当前位置之后寻找换行符.如果找到一个字符,则将其数量计算为从当前位置到该新行的字符数(或更确切地说,减去一个-因为您不希望换行符本身),它将读取那么多字符,然后调整位置加上读取的数量加一(跳过换行符-您不需要/不需要,因为put_line()会再加一).

The if looks for a newline character, after the current position. If it finds one then the amount is calculated as the number of characters from the current position to that new line (or rather, minus one - as you don't want the newline itself), it reads that many characters, and then adjusts position by the amount read plus one (to skip the newline - which you don't want/need as put_line() adds one still).

如果找不到它,那么它最多读取32k-希望只能读取一次;如果还有更多可能留下的字符而没有换行符,那么它将进行第二次读取,但仍会添加该流氓额外的新行并中断该行.使用dbms_output对此无能为力,您需要切换到utl_file写入服务器,而不是后台处理到客户端.

If it doesn't find one then it reads up to 32k - hopefully only once; if there are more than that may chars left with no line break then it'll do a second read but still add that rogue extra new line and break that line. Not much you can do about that using dbms_output though, you would need to switch to utl_file writing to the server instead of spooling to the client.

这篇关于将sqlplus结果假脱机到xml文件时不需要的新行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆