Add UUIDs to a pandas DataFrame
Question
Say I have a pandas DataFrame like so:
import pandas as pd
df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'John Doe', 'Jane Smith', 'Jack Dawson', 'John Doe']})
df:
          Name
0     John Doe
1   Jane Smith
2     John Doe
3   Jane Smith
4  Jack Dawson
5     John Doe
I want to add a column of UUIDs in which rows with the same name share the same UUID. For example, the DataFrame above should become:
df:
          Name                                  UUID
0     John Doe  6d07cb5f-7faa-4893-9bad-d85d3c192f52
1   Jane Smith  a709bd1a-5f98-4d29-81a8-09de6e675b56
2     John Doe  6d07cb5f-7faa-4893-9bad-d85d3c192f52
3   Jane Smith  a709bd1a-5f98-4d29-81a8-09de6e675b56
4  Jack Dawson  6a495c95-dd68-4a7c-8109-43c2e32d5d42
5     John Doe  6d07cb5f-7faa-4893-9bad-d85d3c192f52
The UUIDs should be generated with the uuid.uuid4() function.
My current idea is to use groupby("Name").cumcount() to identify which rows have the same name and which are different. Then I'd create a dictionary with the cumcount as the key and a UUID as the value, and use that to add the UUIDs to the DataFrame.
While that would work, I'm wondering whether there is a more efficient way to do this.
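A note on the idea described above: cumcount() numbers the rows within each group (0, 1, 2, ...), so on its own it does not say which rows share a name. groupby("Name").ngroup() returns one integer label per group instead. The following is a minimal sketch of the dictionary approach with that substitution; the names group_ids and id_to_uuid are illustrative, and storing the UUIDs as strings is an assumption.

```python
import uuid

import pandas as pd

df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'John Doe',
                            'Jane Smith', 'Jack Dawson', 'John Doe']})

# ngroup() labels every row with the integer id of its Name group
group_ids = df.groupby('Name').ngroup()

# One random UUID per group id (stored as a string here, by assumption)
id_to_uuid = {gid: str(uuid.uuid4()) for gid in group_ids.unique()}

# Translate each row's group id into that group's UUID
df['UUID'] = group_ids.map(id_to_uuid)
```

Because uuid.uuid4() is random, the actual values differ on every run; only the grouping is stable.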
Recommended answer
How about this:
import uuid
# Assign one freshly generated UUID to all rows that share a name
names = df['Name'].unique()
for name in names:
    df.loc[df['Name'] == name, 'UUID'] = uuid.uuid4()
which can be shortened to
for name in df['Name'].unique():
    df.loc[df['Name'] == name, 'UUID'] = uuid.uuid4()
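The loop above compares the whole Name column against each unique name in turn. As a possible alternative (not part of the original answer), one could build a dictionary from each unique name to a fresh uuid.uuid4() and apply it with Series.map in a single pass. This is a sketch that assumes string UUIDs are acceptable; name_to_uuid is an illustrative name.

```python
import uuid

import pandas as pd

df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'John Doe',
                            'Jane Smith', 'Jack Dawson', 'John Doe']})

# One random UUID per distinct name (converted to str, by assumption)
name_to_uuid = {name: str(uuid.uuid4()) for name in df['Name'].unique()}

# Look up each row's name in the dictionary to fill the new column
df['UUID'] = df['Name'].map(name_to_uuid)

print(df)
```

Rows with the same name end up with the same UUID, and the values change on every run because uuid4() is random.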