Select block of text and merge into new document

Problem description

Hi, I've looked online and just can't find the right answer. I have files that contain <!--#start#--> and <!--#stop#--> markers throughout them.
I only want the contents between those two strings. The code I have still prints out all the data, including the start/stop lines.

Dim Prefix As String
Dim newMasterFilePath As String
Dim masterFileName As String
Dim newMasterFileName As String
Dim startMark As String = "<!--#start#-->"
Dim stopMark As String = "<!--#stop#-->"
'values from GUI form
searchDir = txtDirectory.Text
Prefix = txtBxUnique.Text


For Each singleFile In allFiles
    If File.Exists(singleFile.FullName) Then
        Dim fileName = singleFile.FullName
        Debug.Print("file name : " & fileName)
        ' A backup first    
        Dim backup As String = fileName & ".bak"
        File.Copy(fileName, backup, True)

        ' Load lines from the source file in memory
        Dim lines() As String = File.ReadAllLines(backup)

        ' Now re-create the source file and start writing lines inside a block
        ' Evaluate all the lines in the file.
        ' Set insideBlock to false
        Dim insideBlock As Boolean = False
        Using sw As StreamWriter = File.CreateText(backup)
            For Each line As String In lines
                If line = startMark Then
                    ' start writing at the line below
                    insideBlock = True
                    ' Evaluate if the next line is <!Stop>
                ElseIf line = stopMark Then
                    ' Stop writing
                    insideBlock = False
                ElseIf insideBlock = True Then
                    ' Write the current line in the block
                    sw.WriteLine(line)
                End If
            Next
        End Using
    End If
Next

In another part of my code, I grab the entity name from the main document and replace it with the text between start and stop:

Dim strMasterDoc = File.ReadAllText(existingMasterFilePath)
Dim newMasterFileBuilder As New StringBuilder(strMasterDoc)

'Create a regex with a named capture group.
Dim rx = New Regex("&" & Prefix & "_Ch(?<EntityNumber>\d+(?:-\d+)*)[;]")
Dim reg1 As String
reg1 = rx.ToString
Debug.Write("Chapter Entity: " & reg1)
Dim rxMatches = rx.Matches(strMasterDoc)

For Each match As Match In rxMatches
    Dim entity = match.ToString
    'Build the file name using the captured digits from the entity in the master file
    Dim entityFileName = Prefix & $"_Ch{match.Groups("EntityNumber")}.sgm"
    Dim entityFilePath = Path.Combine(searchDir, entityFileName)
    'Check if the entity file exists and use its contents
    'to replace the entity in the copy of the master file
    'contained in the StringBuilder
    If File.Exists(entityFilePath) Then
        Dim entityFileContents As String =   File.ReadAllText(entityFilePath)
        newMasterFileBuilder.Replace(entity, entityFileContents)
    End If
Next

'write the processed contents of the master file to a different file
File.WriteAllText(newMasterFilePath, newMasterFileBuilder.ToString)

Recommended answer

As mentioned in my comment, I think the problem might be that the lines in lines() are not exactly equal to the markers, for example because of leading or trailing whitespace or other text on the same line. (File.ReadAllLines strips the carriage return and line feed characters, so those alone should not cause the mismatch.) Have you tried using line.Contains(startMark) instead of testing for equality?
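
To illustrate the difference between the two tests, here is a minimal sketch. The sample line is a made-up assumption (a marker with indentation and trailing spaces), not taken from the asker's files:

```vbnet
Imports System

Module MarkerComparisonDemo
    Sub Main()
        Dim startMark As String = "<!--#start#-->"
        ' Hypothetical line as it might appear in a file: indented, with trailing spaces.
        Dim line As String = "    <!--#start#-->  "

        Console.WriteLine(line = startMark)          ' prints False: exact comparison fails
        Console.WriteLine(line.Trim() = startMark)   ' prints True: equal once whitespace is trimmed
        Console.WriteLine(line.Contains(startMark))  ' prints True: substring test
    End Sub
End Module
```

Contains also matches when the marker shares a line with other text. If the markers always sit on their own line, comparing against line.Trim() is the stricter option.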

Also, is there a specific reason that you read all the lines and store them first, before looping over them to check them? I think it would be more efficient to read, check, and write in one go:

Dim line As String
Dim insideBlock As Boolean = False

Using sr As New StreamReader(YourFilePath)
    Using sw As New StreamWriter(OtherFilePath)
        Do Until sr.EndOfStream
            line = sr.ReadLine()
            If line.Contains(startMark) Then
                ' Start marker found: start writing at the line below
                insideBlock = True
            ElseIf line.Contains(stopMark) Then
                ' Stop marker found: stop writing
                insideBlock = False
            ElseIf insideBlock Then
                ' Write the current line in the block
                sw.WriteLine(line)
            End If
        Loop
    End Using
End Using
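
One more thing that may be worth checking: the first loop in the question writes the filtered lines to the .bak copy, while the merge step reads the entity file itself with File.ReadAllText. If those are the same files, the merge would still see the unfiltered content. A possible way around that is to extract the block at merge time. The sketch below is only an illustration; ExtractBlock and its parameter names are hypothetical and not part of the original code:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module BlockExtractor
    ' Hypothetical helper: returns only the lines found between the two markers,
    ' joined with the platform's line separator. The marker lines are not included.
    Function ExtractBlock(filePath As String, startMark As String, stopMark As String) As String
        Dim kept As New List(Of String)
        Dim insideBlock As Boolean = False

        ' File.ReadLines streams the file one line at a time.
        For Each line As String In File.ReadLines(filePath)
            If line.Contains(startMark) Then
                insideBlock = True
            ElseIf line.Contains(stopMark) Then
                insideBlock = False
            ElseIf insideBlock Then
                kept.Add(line)
            End If
        Next

        Return String.Join(Environment.NewLine, kept)
    End Function
End Module
```

In the merge loop from the question, the call File.ReadAllText(entityFilePath) could then be swapped for ExtractBlock(entityFilePath, startMark, stopMark), so that only the text between the markers is passed to newMasterFileBuilder.Replace.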
