Data loading using arrays in Python

Views: 76 Published: 2020/5/18 20:21:04 Tags: python numpy pandas

Question

I have data in the following format in a .txt file:

UserId   WordID
  1       20
  1       30
  1       40
  2       25
  2       16
  3       56
  3       44
  3       12

What I'm looking for is a function that groups the rows by UserId and returns one list of WordID values per user:

[[20, 30, 40], [25, 16], [56, 44, 12]]

What I'm trying to do is:

def loadSet(path='/data/file.txt'):
  datset={}
  for line in open(path+'/file.txt'):
    (userid,wordid)=line.split('\t')
    dataset.setdefault(user,{})
    dataset[userid][wordid]=float(wordid)
    return dataset

But I can't get it to work. Could you please advise on the right approach?

Recommended answer

I think you can use groupby on UserId, apply tolist to each group of WordID values, and then take values:

print(df.groupby('UserId')['WordID'].apply(lambda x: x.tolist()).values)
[[20, 30, 40] [25, 16] [56, 44, 12]]
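
The answer assumes a DataFrame named df already exists. As a minimal sketch of how it might be built, the example below reads the sample rows from the question with pd.read_csv. The whitespace separator is an assumption based on how the data is displayed, and io.StringIO stands in for the real file, so a path would be passed instead in practice.

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the real .txt file.
raw = io.StringIO(
    "UserId   WordID\n"
    "  1       20\n"
    "  1       30\n"
    "  1       40\n"
    "  2       25\n"
    "  2       16\n"
    "  3       56\n"
    "  3       44\n"
    "  3       12\n"
)

# Assumption: columns are separated by runs of spaces or tabs.
df = pd.read_csv(raw, sep=r"\s+")

# One list of WordID values per UserId; groupby sorts the UserId keys by default.
grouped = df.groupby("UserId")["WordID"].apply(lambda x: x.tolist())

# Convert the resulting Series to a plain list of lists.
result = grouped.tolist()
print(result)
```

Calling tolist on the grouped Series, rather than values, yields a regular Python list of lists instead of a NumPy object array.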

Or apply list (thank you, BM):

print(df.groupby('UserId')['WordID'].apply(list).values)
[[20, 30, 40] [25, 16] [56, 44, 12]]

Timings:

df = pd.concat([df]*1000).reset_index(drop=True)

In [358]: %timeit df.groupby('UserId')['WordID'].apply(list).values
1000 loops, best of 3: 1.22 ms per loop

In [359]: %timeit df.groupby('UserId')['WordID'].apply(lambda x: x.tolist()).values
1000 loops, best of 3: 1.23 ms per loop
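
For comparison, the loop from the question can probably be repaired without pandas. The sketch below is one possible reading of what the asker intended, not part of the original answer. The function name load_set, the header check, and the use of io.StringIO in place of the real file are all assumptions made for illustration.

```python
import io


def load_set(lines):
    """Group WordID values by UserId, keeping users in first-seen order."""
    groups = {}
    for line in lines:
        parts = line.split()  # split on any whitespace, not only tabs
        # Skip blank lines and the header row (assumed to be non-numeric).
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        user_id, word_id = parts
        # setdefault creates the list the first time a user is seen.
        groups.setdefault(int(user_id), []).append(int(word_id))
    # The return sits outside the loop so every line is processed first.
    return list(groups.values())


# Sample rows copied from the question; pass an open file object in practice.
sample = io.StringIO(
    "UserId   WordID\n"
    "  1       20\n"
    "  1       30\n"
    "  1       40\n"
    "  2       25\n"
    "  2       16\n"
    "  3       56\n"
    "  3       44\n"
    "  3       12\n"
)

result = load_set(sample)
print(result)
```

Compared with the attempt in the question, this version uses one consistent dictionary name, appends to a list per user instead of building a nested dict, and returns only after the loop has finished.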
