CSV file validation with Java


Question

I'm reading a file line by line, like this:

    // `file` is the uploaded java.io.File
    FileReader myFile = new FileReader(file);
    BufferedReader InputFile = new BufferedReader(myFile);
    // Read the first line
    String currentRecord = InputFile.readLine();

    while (currentRecord != null) {
        // Process currentRecord here, then read the next line
        currentRecord = InputFile.readLine();
    }


But if another type of file is uploaded, the code will still read its contents. For instance, if the uploaded file is an image, reading it produces junk characters. So my question is: how can I be sure the file is a CSV before reading it?

Checking the file extension is kind of lame, since someone can upload a file that is not CSV but has a .csv extension. Thanks in advance.

Solution

Determining the MIME type of a file is not easy, especially when ASCII sections can be mixed with binary ones.

Actually, when you look at how a Java mail system determines the MIME type of an email, it involves reading all of its bytes and applying some "rules". Check out MimeUtility.java, whose documentation lists these rules for choosing a transfer encoding:

• If the primary type of this datasource is "text" and if all the bytes in its input stream are US-ASCII, then the encoding is "7bit".
• If more than half of the bytes are non-US-ASCII, then the encoding is "base64".
• If less than half of the bytes are non-US-ASCII, then the encoding is "quoted-printable".
• If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is "7bit".
• If there is even one non-US-ASCII character, the encoding is "base64".

Returns: "7bit", "quoted-printable" or "base64".
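
The rules above can be sketched in a few lines of Java. This is only an illustration, not the MimeUtility implementation: the class and method names are hypothetical, the data is assumed to be in memory already, a byte counts as non-US-ASCII only when its high bit is set, and the "exactly half" case (which the rules leave open) is treated as quoted-printable.

```java
import java.nio.charset.StandardCharsets;

class EncodingRules {
    // Hypothetical helper: applies the quoted rules to bytes already in memory.
    static String chooseEncoding(byte[] data, boolean primaryTypeIsText) {
        int nonAscii = 0;
        for (byte b : data) {
            // Java bytes are signed, so 0x80..0xFF appear as negative values.
            if (b < 0) {
                nonAscii++;
            }
        }
        if (nonAscii == 0) {
            return "7bit";
        }
        if (!primaryTypeIsText) {
            return "base64";
        }
        // Assumption: exactly half counts as "not more than half".
        return 2L * nonAscii > data.length ? "base64" : "quoted-printable";
    }

    public static void main(String[] args) {
        byte[] csv = "id,name\n1,Ann\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(chooseEncoding(csv, true)); // prints 7bit
    }
}
```

The point for CSV validation is the counting step: a file whose bytes are mostly outside the ASCII range is unlikely to be plain text at all.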

As mentioned by mmyers in a deleted comment, JavaMimeType is supposed to do the same thing, but:

• it has been dead since 2006
• it does involve reading all of the content, as in this example:

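    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    // Magic and MagicMatch are assumed to come from the jMimeMagic library
    // (package net.sf.jmimemagic); the answer does not show their imports.

    File file = new File("/home/bibi/monfichieratester");
    ByteArrayOutputStream byteArrayStream = new ByteArrayOutputStream();
    // Read the whole file into memory, one byte at a time
    try (InputStream inputStream = new FileInputStream(file)) {
        int readByte;
        while ((readByte = inputStream.read()) != -1) {
            byteArrayStream.write(readByte);
        }
    }
    byte[] bytes = byteArrayStream.toByteArray();

    // Guess the MIME type from the content itself
    MagicMatch m = Magic.getMagicMatch(bytes);
    String mimetype = m.getMimeType();
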

So... since you are reading all of the content of the file anyway, you could take advantage of that to determine the type based on that content and your own rules.
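
As one possible reading of "your own rules", here is a minimal sketch of a content check that could run while the file is read line by line. Everything in it is an assumption for illustration: the class and method names are hypothetical, and the rules (no control characters, no undecodable bytes, balanced quotes on each line, the same number of columns on every non-blank line) are just one reasonable choice. It does not handle quoted fields that span several lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class CsvSniffer {
    // Hypothetical check: true if the first maxLines non-blank lines look like CSV.
    static boolean looksLikeCsv(Reader source, char delimiter, int maxLines) {
        try {
            BufferedReader reader = new BufferedReader(source);
            int expectedColumns = -1;
            int linesChecked = 0;
            String line;
            while (linesChecked < maxLines && (line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                int columns = 1;
                boolean inQuotes = false;
                for (int i = 0; i < line.length(); i++) {
                    char c = line.charAt(i);
                    // Control characters (other than tab) or the U+FFFD
                    // replacement character suggest binary content.
                    if ((c < 0x20 && c != '\t') || c == '\uFFFD') {
                        return false;
                    }
                    if (c == '"') {
                        inQuotes = !inQuotes; // a doubled quote toggles twice
                    } else if (c == delimiter && !inQuotes) {
                        columns++;
                    }
                }
                if (inQuotes) {
                    return false; // unbalanced quote on this line
                }
                if (expectedColumns == -1) {
                    expectedColumns = columns;
                } else if (columns != expectedColumns) {
                    return false; // inconsistent column count
                }
                linesChecked++;
            }
            return linesChecked > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as `CsvSniffer.looksLikeCsv(new FileReader(file), ',', 50)` would inspect only the first 50 non-blank lines. Passing an explicit charset to the reader makes the U+FFFD check more meaningful, since undecodable bytes are then replaced predictably.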
