Plotting a dictionary of lists (topic-word embeddings) in Python 3.x
Question
I have a dictionary called "topic_word":
topic_word = {0: [[-0.669712, 0.6868, 0.9821409999999999], [-0.925967, 0.6138399999999999, 1.247525], [-1.09941, 1.0252620000000001, 1.327866]],
1: [[-0.862131, 0.890915, 1.07759], [-0.437658, 0.279271, 0.627497], [-0.437658, 0.279271, 0.627497]],
2: [[-0.671647, 0.670583, 0.937155], [-0.675347, 0.466983, 0.8505440000000001], [-0.706244, 0.612532, 0.762877]],
3: [[-0.8414590000000001, 0.797826, 1.124295], [-0.567535, 0.40820300000000004, 0.811368], [-0.800963, 0.699767, 0.9237989999999999]],
4: [[-0.8560549999999999, 1.0617020000000001, 1.579302], [-0.576105, 0.5029239999999999, 0.9392], [-0.743683, 0.69884, 0.9794930000000001]]
}
Each key is a topic (0 to 4 here, so 5 topics), and each value holds the embeddings of the words under that topic (3 words per topic here).
I want to visualize this data with a 2-D scatter plot.
If normalization is needed, how should I normalize the "topic_word" data so that it is represented correctly in Python 3.x?
How can I visualize it with a scatter plot that shows the cluster of words (dots) under each topic?
Something like the following:
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
for key, value in topic_word.items():
    ax.scatter(value[0], value[1], label=key)
plt.legend()
Recommended answer
I gather from your post that you want normalized values for each list corresponding to a key, with each normalized list plotted as a set of scatter data points. Here's one way to do it:
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
topic_word = {0: [[-0.669712, 0.6868, 0.9821409999999999], [-0.925967, 0.6138399999999999, 1.247525], [-1.09941, 1.0252620000000001, 1.327866]],
1: [[-0.862131, 0.890915, 1.07759], [-0.437658, 0.279271, 0.627497], [-0.437658, 0.279271, 0.627497]],
2: [[-0.671647, 0.670583, 0.937155], [-0.675347, 0.466983, 0.8505440000000001], [-0.706244, 0.612532, 0.762877]],
3: [[-0.8414590000000001, 0.797826, 1.124295], [-0.567535, 0.40820300000000004, 0.811368], [-0.800963, 0.699767, 0.9237989999999999]],
4: [[-0.8560549999999999, 1.0617020000000001, 1.579302], [-0.576105, 0.5029239999999999, 0.9392], [-0.743683, 0.69884, 0.9794930000000001]]
}
colorkey = {0: 'red', 1: 'blue', 2: 'green', 3: 'black', 4: 'magenta'}  # one color per topic key
for key, value in topic_word.items():
    for valno, val in enumerate(value):  # valno counts the lists under each topic (key)
        val = np.asarray(val)
        val = (val - np.mean(val)) / np.std(val)  # z-score normalize this list
        # Label only the first list of each topic so the legend has no duplicate entries
        ax.scatter(key * np.ones(len(val)), val, color=colorkey[key],
                   label="Topic " + str(key) if valno == 0 else "")
ax.legend(loc='best')  # draw the legend once, after all points are plotted
plt.show()
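The code above places each topic on the x-axis and the normalized components on the y-axis. If the goal is instead to see each word as a single dot in a 2-D plane, one possible alternative (not part of the original answer) is to standardize each embedding dimension across all words and project the 3-D vectors onto their first two principal components. The sketch below assumes that reading of the question; the helper name project_to_2d is hypothetical, and PCA is done with NumPy's SVD rather than a dedicated library.

```python
import numpy as np
import matplotlib.pyplot as plt

topic_word = {
    0: [[-0.669712, 0.6868, 0.9821409999999999], [-0.925967, 0.6138399999999999, 1.247525], [-1.09941, 1.0252620000000001, 1.327866]],
    1: [[-0.862131, 0.890915, 1.07759], [-0.437658, 0.279271, 0.627497], [-0.437658, 0.279271, 0.627497]],
    2: [[-0.671647, 0.670583, 0.937155], [-0.675347, 0.466983, 0.8505440000000001], [-0.706244, 0.612532, 0.762877]],
    3: [[-0.8414590000000001, 0.797826, 1.124295], [-0.567535, 0.40820300000000004, 0.811368], [-0.800963, 0.699767, 0.9237989999999999]],
    4: [[-0.8560549999999999, 1.0617020000000001, 1.579302], [-0.576105, 0.5029239999999999, 0.9392], [-0.743683, 0.69884, 0.9794930000000001]],
}


def project_to_2d(topic_word):
    """Hypothetical helper: return (topic id per word, 2-D coordinates per word)."""
    # One topic id per word, in the same order as the stacked vectors
    topics = np.array([t for t, words in topic_word.items() for _ in words])
    # Stack every word vector into one (n_words, n_dims) array
    X = np.array([w for words in topic_word.values() for w in words], dtype=float)
    # Standardize each dimension across all words (zero mean, unit variance)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal directions; keep the first two
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return topics, X @ vt[:2].T


topics, coords = project_to_2d(topic_word)

fig, ax = plt.subplots()
for t in np.unique(topics):
    mask = topics == t  # select the words that belong to topic t
    ax.scatter(coords[mask, 0], coords[mask, 1], label=f"Topic {t}")
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.legend()
plt.show()
```

Each call to ax.scatter handles one topic, so matplotlib assigns a distinct color per topic and the legend gets exactly one entry each. Note that topic 1 contains two identical vectors, so two of its dots coincide.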