How to delete duplicate lines in a file in Python

Views: 992 | Published: 2017/11/3 20:01:01 | Tags: python, file, line


Problem description

I have a file with duplicate lines. I want to remove the duplicates so that the file contains only unique lines, but I get this error:

output.writelines(uniquelines(filelines))
TypeError: writelines() argument must be a sequence of strings

I have searched for similar issues, but I still don't understand what is wrong. My code:

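import codecs

def uniquelines(lineslist):
    # Keep the first occurrence of each line, comparing lines with whitespace stripped
    unique = {}
    result = []
    for item in lineslist:
        if item.strip() in unique: continue
        unique[item.strip()] = 1
        result.append(item)
    return result

# Read the input as cp1251-encoded text
file1 = codecs.open('organizations.txt','r+','cp1251')
filelines = file1.readlines()
file1.close()
# Write the unique lines; the TypeError above is raised by writelines()
output = open("wordlist_unique.txt","w")
output.writelines(uniquelines(filelines))
output.close()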

Solution

The code uses two different open functions: codecs.open for reading and the built-in open for writing.

readlines on a file object created with codecs.open returns a list of unicode strings, whereas writelines on a file object created with open expects a sequence of (byte) strings.
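The same kind of mismatch can be sketched in Python 3 terms, where it shows up as text mode versus binary mode. This is only an illustration under that assumption, not the asker's original setup; the file name demo.txt and the sample line are hypothetical.

```python
import os
import tempfile

# Hypothetical sample data: one line of Cyrillic text.
text_lines = ["организация\n"]

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# A binary-mode file accepts only bytes, so passing text raises TypeError.
try:
    with open(path, "wb") as out:
        out.writelines(text_lines)
    mismatch_error = None
except TypeError as exc:
    mismatch_error = exc

# A text-mode file opened with an explicit encoding accepts the same lines.
with open(path, "w", encoding="cp1251", newline="") as out:
    out.writelines(text_lines)

# On disk, the text is stored as cp1251-encoded bytes.
with open(path, "rb") as raw:
    stored = raw.read()
```

The point is that the reader and the writer have to agree on whether they exchange text or bytes; here the writer is switched to text with an encoding, mirroring the advice to use the same kind of open on both sides.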

Replace the following lines:

output = open("wordlist_unique.txt","w")
output.writelines(uniquelines(filelines))
output.close()

with:

output = codecs.open("wordlist_unique.txt", "w", "cp1251")
output.writelines(uniquelines(filelines))
output.close()

or, preferably, using a with statement:

with codecs.open("wordlist_unique.txt", "w", "cp1251") as output:
    output.writelines(uniquelines(filelines))
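Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes Python 3, where the built-in open takes an encoding argument and is used here in place of codecs.open. It also swaps the dictionary for a set, which is a stylistic choice and not part of the original answer. The dedupe_file helper, the throwaway directory, and the sample lines are hypothetical.

```python
import os
import tempfile


def uniquelines(lineslist):
    """Return lines in original order, skipping repeats (compared after strip())."""
    seen = set()
    result = []
    for item in lineslist:
        key = item.strip()
        if key in seen:
            continue
        seen.add(key)
        result.append(item)
    return result


def dedupe_file(src, dst, encoding="cp1251"):
    """Hypothetical helper: read src and write its unique lines to dst."""
    # Read and write with the same encoding so both sides handle text strings.
    with open(src, "r", encoding=encoding) as infile:
        filelines = infile.readlines()
    with open(dst, "w", encoding=encoding) as output:
        output.writelines(uniquelines(filelines))


# Usage with throwaway files so the sketch does not touch real data.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "organizations.txt")
dst = os.path.join(workdir, "wordlist_unique.txt")
with open(src, "w", encoding="cp1251") as f:
    f.write("альфа\nбета\nальфа\n  бета  \nгамма\n")

dedupe_file(src, dst)

with open(dst, "r", encoding="cp1251") as f:
    deduped = f.read()
```

Because lines are compared after strip(), a line that differs only in surrounding whitespace counts as a duplicate, and the first occurrence is the one that is kept.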
