Shuffle two lists at once with the same order


Question

I'm using the nltk library's movie_reviews corpus, which contains a large number of documents. My task is to measure the predictive performance of these reviews both with and without pre-processing of the data. The problem is that the lists documents and documents2 contain the same documents, and I need to shuffle them so that both lists end up in the same order. I cannot shuffle them separately, because each shuffle produces a different order. I need to shuffle both at once with the same order because I compare them at the end, and the comparison depends on order. I'm using Python 2.7.

Example (in reality the strings are tokenized, but that is not relevant here):

documents = [(['plot : two teen couples go to a church party , '], 'neg'),
             (['drink and then drive . '], 'pos'),
             (['they get into an accident . '], 'neg'),
             (['one of the guys dies'], 'neg')]

documents2 = [(['plot two teen couples church party'], 'neg'),
              (['drink then drive . '], 'pos'),
              (['they get accident . '], 'neg'),
              (['one guys dies'], 'neg')]

And I need to get this result after shuffling both lists:

documents = [(['one of the guys dies'], 'neg'),
             (['they get into an accident . '], 'neg'),
             (['drink and then drive . '], 'pos'),
             (['plot : two teen couples go to a church party , '], 'neg')]

documents2 = [(['one guys dies'], 'neg'),
              (['they get accident . '], 'neg'),
              (['drink then drive . '], 'pos'),
              (['plot two teen couples church party'], 'neg')]

I have this code:

import random

import nltk
from nltk.corpus import movie_reviews, stopwords

def cleanDoc(doc):
    stopset = set(stopwords.words('english'))
    stemmer = nltk.PorterStemmer()
    clean = [token.lower() for token in doc if token.lower() not in stopset and len(token) > 2]
    final = [stemmer.stem(word) for word in clean]
    return final

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]

documents2 = [(list(cleanDoc(movie_reviews.words(fileid))), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]

# random.shuffle(...)  # how can documents and documents2 be shuffled here in the same order?

Recommended answer

You can do it like this:

import random

a = ['a', 'b', 'c']
b = [1, 2, 3]

c = list(zip(a, b))

random.shuffle(c)

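a, b = zip(*c)  # unzip the pairs; note that a and b are now tuples, not lists

print(a)
print(b)

[OUTPUT] (one possible result; the order is random)
('a', 'c', 'b')
(1, 3, 2)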

Of course, this example uses simpler lists, but the same approach applies to your case.
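As a rough sketch of that adaptation, the snippet below applies the zip/shuffle/unzip idea to small stand-ins for documents and documents2. The helper name shuffle_together is hypothetical (it is not part of random or nltk), and the sketch assumes both lists have the same length and are already aligned index by index. It also converts the results back to lists, since zip(*pairs) yields tuples.

```python
import random

# Small stand-ins for the two parallel lists from the question.
documents = [(['plot : two teen couples go to a church party , '], 'neg'),
             (['drink and then drive . '], 'pos'),
             (['they get into an accident . '], 'neg'),
             (['one of the guys dies'], 'neg')]

documents2 = [(['plot two teen couples church party'], 'neg'),
              (['drink then drive . '], 'pos'),
              (['they get accident . '], 'neg'),
              (['one guys dies'], 'neg')]


def shuffle_together(first, second):
    """Hypothetical helper: return shuffled copies of two equal-length
    lists, keeping the element pairs aligned."""
    if len(first) != len(second):
        raise ValueError("both lists must have the same length")
    if not first:
        return [], []
    pairs = list(zip(first, second))   # pair up elements by index
    random.shuffle(pairs)              # shuffle the pairs in place
    shuffled_first, shuffled_second = zip(*pairs)  # unzip into two tuples
    return list(shuffled_first), list(shuffled_second)


# Remember the original pairing so it can be checked after shuffling.
original_pairs = list(zip(documents, documents2))
documents, documents2 = shuffle_together(documents, documents2)
```

After this runs, documents[i] and documents2[i] still refer to the same review for every i, only at a new random position. If a reproducible order is needed, calling random.seed with a fixed value before shuffling is one option.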

Hope it helps. Good luck.
