Remove very last character in file


Question

After looking all over the Internet, I've come to this.

Let's say I have already made a text file that reads: Hello World

Well, I want to remove the very last character (in this case d) from this text file.

So now the text file should look like this: Hello Worl

But I have no idea how to do this.

All I want, more or less, is a single backspace function for text files on my HDD.

This needs to work on Linux as that's what I'm using.

Recommended answer

Open the file in binary read/write mode ('rb+'), use seek() to move to 1 byte before the end, then use truncate() to remove the remainder of the file:

import os

with open(filename, 'rb+') as filehandle:
    filehandle.seek(-1, os.SEEK_END)
    filehandle.truncate()
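
As a minimal sketch, the same seek-and-truncate steps can be wrapped in a function. The name remove_last_byte and the empty-file guard are illustrative assumptions, not part of the original answer; the guard is there because seeking to -1 from the end of an empty file raises an error.

```python
import os


def remove_last_byte(path):
    """Drop the final byte of the file at `path` (hypothetical helper)."""
    with open(path, 'rb+') as filehandle:
        # seek() returns the new absolute position, i.e. the file size here.
        if filehandle.seek(0, os.SEEK_END) == 0:
            return  # nothing to remove from an empty file
        # Step back one byte from the end and cut the file at that point.
        filehandle.seek(-1, os.SEEK_END)
        filehandle.truncate()
```

Note that many editors end a text file with a newline. In that case the byte removed is the newline, not the last visible character.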

This works fine for single-byte encodings. If you have a multi-byte encoding (such as UTF-16 or UTF-32) you need to seek back enough bytes from the end to account for a single codepoint.
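
A rough sketch of what "enough bytes" could mean for these two encodings follows. It assumes UTF-32 (every code point is 4 bytes) and little-endian UTF-16 (2 bytes, or 4 when the code point is stored as a surrogate pair). The function names are hypothetical, and a byte order mark at the start of the file gets no special handling.

```python
import os


def remove_last_utf32_char(path):
    """Drop the last code point of a UTF-32 file (always 4 bytes wide)."""
    with open(path, 'rb+') as filehandle:
        if filehandle.seek(0, os.SEEK_END) < 4:
            return  # too short to hold a code point
        filehandle.seek(-4, os.SEEK_END)
        filehandle.truncate()


def remove_last_utf16le_char(path):
    """Drop the last code point of a little-endian UTF-16 file."""
    with open(path, 'rb+') as filehandle:
        size = filehandle.seek(0, os.SEEK_END)
        if size < 2:
            return  # too short to hold a code unit
        filehandle.seek(-2, os.SEEK_END)
        last_unit = int.from_bytes(filehandle.read(2), 'little')
        # A low surrogate (0xDC00-0xDFFF) is the second half of a 4-byte pair.
        is_low_surrogate = 0xDC00 <= last_unit <= 0xDFFF
        width = 4 if is_low_surrogate and size >= 4 else 2
        filehandle.seek(-width, os.SEEK_END)
        filehandle.truncate()
```

For big-endian UTF-16 the same idea applies with 'big' passed to int.from_bytes.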

For variable-width encodings, whether you can use this technique at all depends on the codec. For UTF-8, you need to find the first byte (from the end) where bytevalue & 0xC0 != 0x80 is true, and truncate from that point on. That ensures you don't truncate in the middle of a multi-byte UTF-8 codepoint:

import os

with open(filename, 'rb+') as filehandle:
    # move to the last byte, then scan backward until a non-continuation byte is found
    filehandle.seek(-1, os.SEEK_END)
    # read(1) returns a 1-byte string, so convert it to an integer before masking
    while ord(filehandle.read(1)) & 0xC0 == 0x80:
        # we just read 1 byte, which moved the file position forward,
        # skip back 2 bytes to move to the byte before the current.
        filehandle.seek(-2, os.SEEK_CUR)

    # last read byte is our truncation point, move back to it.
    filehandle.seek(-1, os.SEEK_CUR)
    filehandle.truncate()
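
The backward scan can also be written as a self-contained function that tracks the position explicitly, which avoids errors on empty files. This is only a sketch: the name remove_last_utf8_char is an assumption, and it removes one code point, which is not always one user-perceived character (a base letter followed by a combining accent is two code points).

```python
import os


def remove_last_utf8_char(path):
    """Drop the last UTF-8 encoded code point of the file at `path`."""
    with open(path, 'rb+') as filehandle:
        position = filehandle.seek(0, os.SEEK_END)
        # Walk backward one byte at a time until a byte that is not a
        # continuation byte (binary 10xxxxxx) is found.
        while position > 0:
            position -= 1
            filehandle.seek(position)
            if ord(filehandle.read(1)) & 0xC0 != 0x80:
                break
        # Cut at the start of the last code point. For an empty file this
        # is 0; a malformed file made only of continuation bytes is emptied.
        filehandle.truncate(position)
```

Calling it on a file containing "naïve €" leaves "naïve " behind, because the three bytes of the euro sign are removed together.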

Note that UTF-8 is a superset of ASCII, so the above works for ASCII-encoded files too.
