Batch DOS: copying the last lines of a file, limited by 65 536 characters


Problem description

I have a large XML file (about 1 GB) with the following structure:

 <?xml version='1.0' encoding='windows-1252'?>
 <ext:BookingExtraction>
     <Booking><Code>2016Z00258</Code><Advertiser><Code>00123</Code><Name>LOUIS VUITTON</Name></Advertiser></Booking>
     <Booking><Code>2016Z00259</Code><Advertiser><Code>00124</Code><Name>Adidas</Name></Advertiser></Booking>
 </ext:BookingExtraction>

Since the structure is really simple, my goal is to take the last 150 lines of the XML file, copy them into a new file, and add the opening tags at the top so that the result is well-formed XML.

The algorithm works fine, but some lines longer than 65 536 characters are split into several lines. I read that DOS limits the number of characters per line to 65 536, which would explain why a carriage return is added after those 65 536 characters.

As a result, the final XML is not well formed, because a carriage return lands in the middle of a line. For instance:

 <ext:BookingExtraction>
     <Booking><Code>2016Z00258</Code><Advertiser><Code>00123</Code><Name>LOUIS VUIT
TON</Name></Advertiser></Booking>
</ext:BookingExtraction>

I tried to remove the carriage return characters, but that does not work. Do you have any idea how I could fix this?

@echo off
setLocal EnableDelayedExpansion

::Get XML file
for /r %%a in (extractedBookings_BookingWithoutUnitsContent_PRD_*.xml) do (
    ::echo "%%~dpa" and full path is "%%~nxa"
    set fileName="%%~nxa"
)


::Get the 150 last line of the file
    echo File path: "%fileName%"
    for /f %%i in ('find /v /c "" ^< "%fileName%"') do set /a lines=%%i
    echo nb lines: "%lines%"
    set /a startLine=%lines% - 150
    echo Start line "%startLine%"
    more /e +%startLine% "%fileName%" > extractedBookings_BookingWithoutUnitsContent_PRD.xml



::adding opening tag to the new file
    echo ^<?xml version='1.0' encoding='windows-1252'?^> > newFile.xml
    echo ^<ext:BookingExtraction^> >> newFile.xml

::Get the final file
   type extractedBookings_BookingWithoutUnitsContent_PRD.xml >> newFile.xml
   type newFile.xml > extractedBookings_BookingWithoutUnitsContent_PRD.xml

Thanks in advance.

Recommended answer

Your question is confusing; the phrase "DOS limit the number of line at 65 536 characters" is imprecise. When the output of the more command is redirected to a disk file, it waits for a character after 65536 lines, and that character is inserted in the output. Also, the maximum line length in the FIND command is 1070 characters (according to this site), so I guess that the lines in your file are shorter than that. You just need a method that can cleanly output more than 64K lines.

The solution below is basically your own code, but instead of your more +%startLine% command it uses a set /P command in a loop to skip the first lines and a findstr command to output the rest.

@echo off
setLocal EnableDelayedExpansion

::Get XML file
for /r %%a in (extractedBookings_BookingWithoutUnitsContent_PRD_*.xml) do (
    rem echo "%%~dpa" and full path is "%%~nxa"
    rem Keep the quotes out of the value; they are added where the variable is used
    set "fileName=%%~nxa"
)


::Get the 150 last line of the file 
    echo File path: "%fileName%"    
    for /f %%i in ('find /v /c "" ^< "%fileName%"') do set /a lines=%%i
    echo nb lines: "%lines%"
    set /a startLine=%lines% - 150
    echo Start line "%startLine%"

    REM Use a code block to read from redirected input file (and write to output file)
    < "%fileName%" (

       rem adding opening tag to the new file
       echo ^<?xml version='1.0' encoding='windows-1252'?^>
       echo ^<ext:BookingExtraction^>

       REM Skip the first total-150 lines
       for /L %%i in (1,1,%startLine%) do set /P "="

       REM Copy the rest
       findstr "^"

    ) > extractedBookings_BookingWithoutUnitsContent_PRD.xml

This method may still fail if an input line is longer than 1023 characters, because that is the limit of the set /P command.
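If the line-length limits of the batch commands turn out to be a blocker, the same "skip everything but the last 150 lines, then prepend the opening tags" idea can be expressed outside batch. The following is only a minimal sketch in Python, not part of the original answer; the function name tail_xml and its parameters are hypothetical and chosen for illustration. It assumes Python 3 is available on the machine and that lines end with a line feed.

```python
from collections import deque


def tail_xml(src_path, dst_path, header_lines, keep=150):
    """Write header_lines, then the last `keep` lines of src_path, to dst_path.

    Hypothetical helper for illustration only.
    """
    # Binary mode copies the bytes untouched, so the windows-1252 text and
    # lines of any length come out exactly as they are in the source file.
    with open(src_path, "rb") as src:
        # A bounded deque retains only the most recent `keep` lines while
        # the file is read sequentially, one line at a time.
        last_lines = deque(src, maxlen=keep)

    with open(dst_path, "wb") as dst:
        for line in header_lines:
            dst.write(line + b"\r\n")
        dst.writelines(last_lines)
```

A call could look like tail_xml("input.xml", "extractedBookings_BookingWithoutUnitsContent_PRD.xml", [b"<?xml version='1.0' encoding='windows-1252'?>", b"<ext:BookingExtraction>"]), where "input.xml" stands in for the file found by the for /r loop. The whole file is still read once from start to end, as in the batch version, but only the last 150 lines are held in memory.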
