Remove a single row from a csv without copying files

Question

There are multiple SO questions addressing some form of this topic, but they all seem terribly inefficient for removing only a single row from a csv file (usually they involve copying the entire file). If I have a csv formatted like so:

fname,lname,age,sex
John,Doe,28,m
Sarah,Smith,27,f
Xavier,Moore,19,m

What is the most efficient way to remove Sarah's row? If possible, I would like to avoid copying the entire file.

Recommended answer

You have a fundamental problem here. No current filesystem (that I am aware of) provides a facility to remove a bunch of bytes from the middle of a file. You can overwrite existing bytes, or write a new file. So, your options are:

  • Create a copy of the file without the offending line, delete the old one, and rename the new file into place. (This is the option you want to avoid.)
  • Overwrite the bytes of the line with something that will be ignored. Depending on exactly what is going to read the file, a comment character might work, or spaces might work (or possibly even \0). If you want to be completely generic, though, this is not an option with CSV files, because there is no defined comment character.
  • As a last desperate measure, you could:
    • read up to the line you want to remove,
    • read the rest of the file into memory,
    • overwrite the line and all subsequent lines with the data you want to keep, and
    • truncate the file at the final position (filesystems usually allow this).
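
The last option above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `remove_line_in_place` and the `should_remove` predicate are hypothetical, every row is assumed to occupy exactly one physical line (no quoted fields containing newlines), and the remainder of the file is assumed to fit in memory.

```python
def remove_line_in_place(path, should_remove):
    """Remove the first line for which should_remove(line) is true.

    `line` is passed as raw bytes, including its line terminator.
    Returns True if a line was removed, False if nothing matched.
    """
    with open(path, "r+b") as f:
        while True:
            start = f.tell()       # byte offset where the current line begins
            line = f.readline()
            if not line:           # reached end of file without a match
                return False
            if should_remove(line):
                break
        rest = f.read()            # everything after the line being removed
        f.seek(start)              # jump back to the start of that line
        f.write(rest)              # shift the remaining data down over it
        f.truncate()               # cut the file off at the current position
        return True
```

With the example file from the question saved under a hypothetical name such as `people.csv`, a call like `remove_line_in_place("people.csv", lambda line: line.startswith(b"Sarah,"))` would drop Sarah's row. Only the bytes from the removed line onward are rewritten, so the cost depends on how close the row is to the end of the file.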

The last option obviously doesn't help much if you are trying to remove the first line (but it is handy if you want to remove a line near the end). It is also horribly vulnerable to crashing in the middle of the process.
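
For comparison, the first option (write a copy, then rename it into place) avoids that crash window, since the original file stays untouched until the final rename. A rough Python sketch, with several assumptions that are not from the answer: the name `remove_rows_by_copy` and the `should_remove` predicate are hypothetical, the file is assumed to be UTF-8, and rows are rewritten with `\n` line endings and the `csv` module's default quoting, which may differ from the original file's formatting.

```python
import csv
import os
import tempfile


def remove_rows_by_copy(path, should_remove):
    """Rewrite `path` without the rows for which should_remove(row) is true.

    `row` is a list of field strings as produced by csv.reader.
    """
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as dst, \
                open(path, newline="", encoding="utf-8") as src:
            # Assumption: "\n" line endings are wanted in the output.
            writer = csv.writer(dst, lineterminator="\n")
            for row in csv.reader(src):
                if not should_remove(row):
                    writer.writerow(row)
        # Swap the new file in over the old one in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial copy on any failure
        raise
```

Something like `remove_rows_by_copy("people.csv", lambda row: row[0] == "Sarah")` would then remove Sarah's row (the file name is again hypothetical). Note that the replacement file takes on the temporary file's permissions, so a real implementation would probably want to copy the original's mode as well.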
