Converting a dictionary to binary in Python

Views: 976 | Published: 2020/5/24 3:55:08 | Tags: python pandas feature-engineering

Question

I have a dictionary whose keys are customer IDs and whose values are the movie IDs each customer has watched. Even if a customer has watched the same movie many times, I want it counted once. I need to convert the dictionary to binary data: one row per customer ID and one column per movie ID, with 1 if the customer has watched the movie and 0 otherwise.

d = {'121212121': [111, 222, 333, 333, 444, 444], '212121212': [222, 555, 555, 666], '212123322': [555, 666, 666, 666, 777]}

Desired output:

customer ID 111 222 333 444 555 666 777
121212121   1   1   1   1   0   0   0
212121212   0   1   0   0   1   1   0
121323231   0   0   0   0   1   1   1

I have tried using CountVectorizer().

Code:

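import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer()
movies = cv.fit_transform(cust['movies_list'])
cols = cv.vocabulary_  # dict of token -> column index, so key order may differ from column order
movies_ = pd.DataFrame(movies.toarray(), columns=cols, index=cust['customer_id'])
movies_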

Output:

customer ID 111 222 333 444 555 666 777
212121212   1   1   2   2   0   0   0
121212121   0   1   0   0   2   1   0
121323231   0   0   0   0   1   3   1

The customer IDs didn't match, and I got a count of how many times each customer watched each movie instead of a 0/1 flag.

Recommended answer

It looks like you can just use clip(upper=1) (clip_upper in pandas versions before 1.0) to clip positive values to 1.

movies_.clip(upper=1)

           111  222  333  444  555  666  777
121212121    1    1    1    1    0    0    0
212121212    0    1    0    0    1    1    0
212123322    0    0    0    0    1    1    1
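
As a minimal sketch of what the clipping step does, assuming a small hand-built count table (the `counts` frame below is made up for illustration and stands in for `movies_`), capping each cell at 1 with `DataFrame.clip(upper=1)` turns view counts into a watched/not-watched flag:

```python
import pandas as pd

# Hypothetical count table standing in for `movies_`:
# rows are customers, columns are movie IDs, cells are view counts.
counts = pd.DataFrame(
    [[1, 1, 2, 2, 0, 0, 0],
     [0, 1, 0, 0, 2, 1, 0],
     [0, 0, 0, 0, 1, 3, 1]],
    index=["121212121", "212121212", "212123322"],
    columns=["111", "222", "333", "444", "555", "666", "777"],
)

# Cap every value at 1 so repeat views collapse to a single flag.
binary = counts.clip(upper=1)
print(binary)
```

Zeros are left untouched because clipping only applies an upper bound.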

Here's an alternative solution starting from d. You can use pd.get_dummies, then sum the dummies per customer and clip the sums to 1.

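import pandas as pd

# One column per customer; movie IDs are cast to str so they become column labels.
df = pd.concat(
    [pd.Series(v, name=k).astype(str) for k, v in d.items()],  # `d` is your dict
    axis=1,
)
# groupby(level=1).sum() replaces sum(level=1), which was removed in pandas 2.0;
# clip(upper=1) replaces clip_upper(1), which was removed in pandas 1.0.
pd.get_dummies(df.stack()).groupby(level=1).sum().clip(upper=1)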

           111  222  333  444  555  666  777
121212121    1    1    1    1    0    0    0
212121212    0    1    0    0    1    1    0
212123322    0    0    0    0    1    1    1
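
For comparison, here is a sketch that builds the 0/1 table straight from the dictionary by deduplicating each customer's movies with a set. It is not the answer's method, and it assumes `d` maps each customer ID to a list of movie IDs (the list form is an assumption about what the question intended); the names `movie_ids`, `rows`, and `binary` are illustrative.

```python
import pandas as pd

# Assumed shape of the data: customer ID -> list of watched movie IDs.
d = {
    "121212121": [111, 222, 333, 333, 444, 444],
    "212121212": [222, 555, 555, 666],
    "212123322": [555, 666, 666, 666, 777],
}

# All distinct movie IDs, sorted, become the columns.
movie_ids = sorted({m for movies in d.values() for m in movies})

# One row per customer: 1 if the movie is in that customer's set, else 0.
rows = {}
for customer, movies in d.items():
    watched = set(movies)  # repeat views collapse here
    rows[customer] = [int(m in watched) for m in movie_ids]

binary = pd.DataFrame.from_dict(rows, orient="index", columns=movie_ids)
binary.index.name = "customer ID"
print(binary)
```

Because duplicates are dropped before the table is built, no clipping step is needed afterwards.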
