Select block of text and merge into new document
Question
Hi, I've looked online and just can't find the right answer. I have files that contain <!--#start#--> and <!--#stop#--> markers throughout them.
I only want the contents between those two strings. The code I have still prints out all the data, including the start/stop lines.
Dim Prefix As String
Dim newMasterFilePath As String
Dim masterFileName As String
Dim newMasterFileName As String
Dim startMark As String = "<!--#start#-->"
Dim stopMark As String = "<!--#stop#-->"
'values from GUI form
searchDir = txtDirectory.Text
Prefix = txtBxUnique.Text
For Each singleFile In allFiles
    If File.Exists(singleFile.FullName) Then
        Dim fileName = singleFile.FullName
        Debug.Print("file name : " & fileName)
        ' A backup first
        Dim backup As String = fileName & ".bak"
        File.Copy(fileName, backup, True)
        ' Load lines from the source file in memory
        Dim lines() As String = File.ReadAllLines(backup)
        ' Now re-create the source file and start writing lines inside a block
        ' Evaluate all the lines in the file.
        ' Set insideBlock to false
        Dim insideBlock As Boolean = False
        Using sw As StreamWriter = File.CreateText(backup)
            For Each line As String In lines
                If line = startMark Then
                    ' start writing at the line below
                    insideBlock = True
                    ' Evaluate if the next line is <!Stop>
                ElseIf line = stopMark Then
                    ' Stop writing
                    insideBlock = False
                ElseIf insideBlock = True Then
                    ' Write the current line in the block
                    sw.WriteLine(line)
                End If
            Next
        End Using
    End If
Next
In another part of my code, I grab the entity name from the main document and replace it with the text between start and stop:
Dim strMasterDoc = File.ReadAllText(existingMasterFilePath)
Dim newMasterFileBuilder As New StringBuilder(strMasterDoc)
'Create a regex with a named capture group.
Dim rx = New Regex("&" & Prefix & "_Ch(?<EntityNumber>\d+(?:-\d+)*)[;]")
Dim reg1 As String
reg1 = rx.ToString
Debug.Write("Chapter Entity: " & reg1)
Dim rxMatches = rx.Matches(strMasterDoc)
For Each match As Match In rxMatches
    Dim entity = match.ToString
    'Build the file name using the captured digits from the entity in the master file
    Dim entityFileName = Prefix & $"_Ch{match.Groups("EntityNumber")}.sgm"
    Dim entityFilePath = Path.Combine(searchDir, entityFileName)
    'Check if the entity file exists and use its contents
    'to replace the entity in the copy of the master file
    'contained in the StringBuilder
    If File.Exists(entityFilePath) Then
        Dim entityFileContents As String = File.ReadAllText(entityFilePath)
        newMasterFileBuilder.Replace(entity, entityFileContents)
    End If
Next
'write the processed contents of the master file to a different file
File.WriteAllText(newMasterFilePath, newMasterFileBuilder.ToString)
Recommended answer
As mentioned in my comment, I think the problem might be that the marker lines in lines() contain extra characters, such as leading or trailing whitespace, so the equality test never succeeds. (File.ReadAllLines already strips the carriage return and line feed characters, so those alone are unlikely to be the cause.) Have you tried using line.Contains(startMark) instead of testing for equality?
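To illustrate the difference between the two comparisons, here is a minimal sketch. It assumes the marker line carries extra whitespace; the sample line is made up for illustration and is not taken from the asker's files.

```vbnet
Imports System

Module MarkerComparisonDemo
    Sub Main()
        Dim startMark As String = "<!--#start#-->"
        ' Hypothetical line: indented and followed by a trailing space (an assumption).
        Dim line As String = "    <!--#start#--> "

        Console.WriteLine(line = startMark)          ' False: exact equality fails
        Console.WriteLine(line.Contains(startMark))  ' True: substring match succeeds
        Console.WriteLine(line.Trim() = startMark)   ' True: equality after trimming whitespace
    End Sub
End Module
```

If the markers always sit alone on their line, comparing line.Trim() against the marker is a stricter alternative to Contains, since it will not match a marker embedded in other text.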
Also, is there a specific reason you read all the lines and store them first, before looping over them to check them? I think it would be more efficient to read, check, and write in one pass:
' YourFilePath and OtherFilePath are placeholders for the input and output paths
Dim insideBlock As Boolean = False
Using SR As New StreamReader(YourFilePath)
    Using sw As New StreamWriter(OtherFilePath)
        Do Until SR.EndOfStream
            Dim line As String = SR.ReadLine()
            If line.Contains(startMark) Then
                ' Skip the marker line itself; start writing at the line below
                insideBlock = True
            ElseIf line.Contains(stopMark) Then
                ' Stop writing
                insideBlock = False
            ElseIf insideBlock Then
                ' Write the current line in the block
                sw.WriteLine(line)
            End If
        Loop
    End Using
End Using
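The same loop can also be wrapped in a function that returns the block text as a string. This may help if the merge step reads the original entity files with File.ReadAllText rather than the filtered output. The following is only a sketch under that assumption: ExtractBlocks and the sample file contents are hypothetical names and data, not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module BlockExtractor
    ' Hypothetical helper: returns only the lines found between the two markers.
    Function ExtractBlocks(filePath As String, startMark As String, stopMark As String) As String
        Dim kept As New List(Of String)
        Dim insideBlock As Boolean = False
        For Each line As String In File.ReadLines(filePath)
            If line.Contains(startMark) Then
                insideBlock = True          ' marker line itself is not kept
            ElseIf line.Contains(stopMark) Then
                insideBlock = False
            ElseIf insideBlock Then
                kept.Add(line)
            End If
        Next
        Return String.Join(Environment.NewLine, kept)
    End Function

    Sub Main()
        ' Made-up sample file used only to exercise the helper.
        Dim tempPath As String = Path.GetTempFileName()
        File.WriteAllLines(tempPath, {"header", "<!--#start#-->", "kept line", "<!--#stop#-->", "footer"})
        Console.WriteLine(ExtractBlocks(tempPath, "<!--#start#-->", "<!--#stop#-->"))  ' prints "kept line"
        File.Delete(tempPath)
    End Sub
End Module
```

With a helper like this, the merge step could pass ExtractBlocks(entityFilePath, startMark, stopMark) to newMasterFileBuilder.Replace instead of the full file contents, so the markers and surrounding text never reach the new master file.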