Is it possible to validate the content of a .txt file?

Question

Friends,

I want to allow only .txt files in my file uploader.
But when I rename fileName.exe to fileName.txt and upload it, the upload goes through.
That is clearly wrong!

Is it possible to check the file content in C# for .txt files?

I have come across ways to validate file content for .pdf, .xls, and .doc.

Thanks in advance,
~~Karthik.J~~

Recommended answer

A text file can contain, well, any text. 'Text' means any printable character, plus the usual whitespace controls (tab, line feed, carriage return). So you could validate it by checking that it's valid UTF-8 and that all the code points are greater than 31 (apart from those whitespace characters), or, if it's not valid UTF-8, assume that it's ANSI and check that all the byte values are greater than 31. Simply checking the byte values may be okay in both cases: UTF-8 encodes every non-ASCII character using only bytes of 128 and above, so a byte below 32 is always a control character.
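
The check described above could be sketched in C# roughly as follows. This is a minimal illustration, not a definitive implementation: the class and method names (`TextFileCheck`, `LooksLikeText`) are made up for this example, and it assumes the uploaded content is already available as a byte array.

```csharp
using System;
using System.Text;

public static class TextFileCheck
{
    // Control characters that routinely appear in plain text:
    // tab (9), line feed (10), carriage return (13).
    private static bool IsAllowedControl(int value) =>
        value == '\t' || value == '\n' || value == '\r';

    // Hypothetical helper: returns true when the bytes look like plain text.
    public static bool LooksLikeText(byte[] data)
    {
        try
        {
            // Second argument (throwOnInvalidBytes) = true makes the decoder
            // throw on invalid UTF-8 instead of substituting U+FFFD.
            var strictUtf8 = new UTF8Encoding(false, true);
            string text = strictUtf8.GetString(data);

            foreach (char c in text)
            {
                if (c < 32 && !IsAllowedControl(c))
                {
                    return false; // unexpected control character
                }
            }
            return true;
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: assume a single-byte ANSI code page
            // and apply the same rule to the raw byte values.
            foreach (byte b in data)
            {
                if (b < 32 && !IsAllowedControl(b))
                {
                    return false;
                }
            }
            return true;
        }
    }
}
```

One possible use is `TextFileCheck.LooksLikeText(File.ReadAllBytes(path))` on the saved upload. A renamed EXE normally fails this check because its header contains zero bytes, but the check is only a heuristic and says nothing about UTF-16 files.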

If you expect to read UTF-16 text files then you're pretty much screwed, because the byte values in there can be anything (every ASCII character, for example, is stored with a zero byte next to it).

However, as long as you don't set the file to be executable (Linux) or give it an executable name (Windows), does it actually matter if it's an EXE in disguise? It can't be run anyway.


You can try to read the file and see if it contains only what APPEARS to be valid text, but a .txt file is raw data; there are no headers you can check.

Why do you think people are going to rename .exe files and upload them?


There is no foolproof method for what you are asking.

One thing you can do is compare the length of the content's byte array with the length of the same content decoded as a text string.
If they are the same, every character took a single byte, which suggests plain text, although this alone does not rule out control characters.

The best way is to use a threshold: for example, if more than 90% of the content is readable, mark it as a text file.

This method needs some testing.
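
As a rough sketch of the threshold idea, the C# below counts the share of bytes that are printable or common whitespace and compares it with a cutoff. The names `TextRatio`, `ReadableRatio`, and `IsProbablyText` are hypothetical, and "readable" is assumed here to mean a byte value of 32 or more, or tab, line feed, or carriage return; a different definition may suit your data better.

```csharp
using System;

public static class TextRatio
{
    // Fraction of bytes that count as "readable" under the assumption above.
    public static double ReadableRatio(byte[] data)
    {
        if (data.Length == 0)
        {
            return 1.0; // treat an empty file as text
        }

        int readable = 0;
        foreach (byte b in data)
        {
            bool isWhitespaceControl = b == 9 || b == 10 || b == 13;
            if (b >= 32 || isWhitespaceControl)
            {
                readable++;
            }
        }
        return (double)readable / data.Length;
    }

    // The 0.9 default mirrors the "more than 90%" figure from the answer.
    public static bool IsProbablyText(byte[] data, double threshold = 0.9) =>
        ReadableRatio(data) > threshold;
}
```

Binary files also contain many byte values of 32 and above, so the threshold should be tuned against real samples before relying on it.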

Hope this helps.
cheers

