streamWriter.Write(" ") is overriding the last 3 characters of my existing UTF8 encoded file while writing data in it


Problem description

I have a sample.text file (UTF8 encoded) with a maximum size of 200 characters. The file currently contains 2 lines of data (character count = 38, total byte count = 40), as shown below:
---------------------------------------------
019aIŞLET NO : 044
019aIŞLET NO : 045
---------------------------------------------
I need to write new data at the end of the existing data (assume the new data is string newData = "Hello C#.NET"). Here is the code snippet for that:
------------------------------------------------

string newData = "Hello C#.NET";
var file = new FileInfo("sample.text");
var contentUTF8 = File.ReadAllText(file.FullName, Encoding.UTF8);
FileStream fileStream = new FileStream(file.FullName, FileMode.Open, FileAccess.ReadWrite, FileShare.Read, 8192, FileOptions.WriteThrough);
fileStream.SetLength(200); // set the maximum length of the file; the remaining 160 bytes are filled with nulls
fileStream.Position = Encoding.UTF8.GetByteCount(contentUTF8); // position for writing newData; expected to be 40, so writing starts at the 41st byte
StreamWriter streamWriter = new StreamWriter(fileStream, Encoding.UTF8);
streamWriter.Write(newData);
streamWriter.Flush();


------------------------------------------------
Since the total byte length of the data in sample.text is 40, the code above should write newData starting at the 41st position. Instead, it overwrites the last 3 characters of the second line of sample.text.
The new sample.text file looks like this:
-------------------------------------
019aIŞLET NO : 044
019aIŞLET NO : Hello C#.NET
-------------------------------------
But sample.text should look like this (the expected result):
-------------------------------------
019aIŞLET NO : 044
019aIŞLET NO : 045Hello C#.NET
-------------------------------------

To summarize: regardless of the input file length, writing newData to my file always overwrites the last 3 characters. This happens only with UTF8 encoding.
With an ANSI encoded file it works fine without any issue.
Can anybody tell me why my sample code (streamWriter.Write(newData)) is overwriting the last 3 characters of my existing UTF8 encoded file?
I hope I have explained the problem properly; please let me know if more information is required.

Solution

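The problem is the preamble. The preamble (byte order mark) is a few bytes at the start of the file that identify the encoding. For UTF-8 it is exactly 3 bytes long. File.ReadAllText does not include the preamble in the string it returns, so the byte count of that string is 3 less than the offset where the existing data actually ends in the file. So when you set the position you should take that into account: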

fileStream.Position = Encoding.UTF8.GetByteCount(contentUTF8) + Encoding.UTF8.GetPreamble().Length;
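The line above assumes the file really starts with a preamble. A UTF-8 file saved without one would then leave a 3-byte gap instead. The following is a minimal sketch of a more defensive variant that checks the first bytes of the file before computing the position. The class name Utf8Append and the methods GetUtf8PreambleLength and AppendInsidePaddedFile are hypothetical names chosen for illustration, not part of the original question or of .NET. It also assumes the file has not been padded with nulls yet.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not from the original post.
public static class Utf8Append
{
    // Returns 3 if the file starts with the UTF-8 preamble (EF BB BF), otherwise 0.
    public static int GetUtf8PreambleLength(string path)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();
        byte[] head = new byte[bom.Length];
        using (var fs = File.OpenRead(path))
        {
            int read = fs.Read(head, 0, head.Length);
            if (read < bom.Length) return 0; // file too short to hold a preamble
        }
        for (int i = 0; i < bom.Length; i++)
        {
            if (head[i] != bom[i]) return 0;
        }
        return bom.Length;
    }

    // Pads the file to maxLength bytes and writes newData right after the existing text.
    // Assumption: the file has not been padded by an earlier call.
    public static void AppendInsidePaddedFile(string path, string newData, int maxLength)
    {
        int preamble = GetUtf8PreambleLength(path);

        // ReadAllText drops the preamble, so this counts the content bytes only.
        string content = File.ReadAllText(path, Encoding.UTF8);
        long writePosition = preamble + Encoding.UTF8.GetByteCount(content);

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.Read))
        {
            fs.SetLength(maxLength);
            fs.Position = writePosition;

            // UTF8Encoding(false) guarantees no second preamble is written mid-file.
            using (var writer = new StreamWriter(fs, new UTF8Encoding(false)))
            {
                writer.Write(newData);
            }
        }
    }
}
```

With the sample file from the question, a call such as Utf8Append.AppendInsidePaddedFile("sample.text", "Hello C#.NET", 200) would place the new text at byte offset 43 (3 preamble bytes plus 40 content bytes), directly after "045".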

