Inserting random values based on condition

Question

I have a DataFrame containing various pieces of information about a product. Input3 is filled from a list of sentences, as shown below:

import numpy as np

sentence_list = ['Køb online her', 'Sammenlign priser her', 'Tjek priser fra 4 butikker',
                 'Se produkter fra 4 butikker', 'Stort udvalg fra 4 butikker', 'Sammenlign og køb']
# draw one random sentence per row
df["Input3"] = np.random.choice(sentence_list, size=len(df))

Full_Input is a string created by joining several columns. Its content looks something like "ProductName from Brand - Buy online here - Sitename". It is created like this:

df["Full_Input"] = (df['TitleTag'].astype(str) + " " + df['Input2'].astype(str) + " "
                    + df['Input3'].astype(str) + " " + df['Input4'].astype(str) + " "
                    + df['Input5'].astype(str))

The problem is that the length of Full_Input should be under 55 characters. I am therefore trying to figure out how to apply a condition while randomly generating Input3, so that when it is combined with the strings from the other columns, the full input length does not exceed 55.
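To make the constraint concrete, here is a minimal sketch of how the length check could be expressed. The sample strings and the column names `Full_Input_Length` and `Too_Long` are assumptions for illustration; only the limit of 55 comes from the question.

```python
import pandas as pd

MAX_LENGTH = 55  # limit stated in the question

# Hypothetical rows standing in for the real Full_Input column.
df = pd.DataFrame({"Full_Input": [
    "Sko fra Nike - Køb online her - Sitename",
    "En meget lang produkttitel fra et kendt brand - Sammenlign priser her - Sitename",
]})

# Character count of each full string.
df["Full_Input_Length"] = df["Full_Input"].str.len()

# "Under 55" means a row is too long once it reaches 55 characters.
df["Too_Long"] = df["Full_Input_Length"] >= MAX_LENGTH
```

Rows flagged in `Too_Long` are the ones for which a shorter Input3 sentence (or none at all) would be needed.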

This is what I have tried:

for col in range(len(df)):
    condlist = [df["Full_Input"].apply(len) < 55]
    choicelist = [sentence_list]
    df['Input3_OK'][col] = np.random.choice.select(condlist, choicelist)

As expected, it doesn't work like that. np.random.choice.select does not exist, so I get an AttributeError.

How can I go about this?

Answer

If you are guaranteed to have at least one item in sentence_list that satisfies this condition, you may want to restrict the random selection to ONLY those values in sentence_list that are of an acceptable length:

# keep only the sentences shorter than MAX_LENGTH (the length still available for Input3):
my_sentences = [s for s in sentence_list if len(s) < MAX_LENGTH]

# randomly select from this filtered list:
np.random.choice(my_sentences)

In other words, filter the list of strings BEFORE you call np.random.choice.

You can run this for each row in a DataFrame like so:

def choose_string(full_input):
    return np.random.choice([
        s 
        for s in sentence_list 
        if len(s) + len(full_input) < 55
    ])

df["Input3_OK"] = df.Full_Input.map(choose_string)
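Two details may be worth handling. Full_Input as built in the question already contains a previously drawn Input3, so measuring it counts that sentence too, and np.random.choice raises a ValueError when no sentence fits. The following is a sketch under stated assumptions, not a definitive implementation: the sample rows are hypothetical, the length is computed from the other four columns plus the four joining spaces, and an empty string is used as the fallback when nothing fits.

```python
import numpy as np
import pandas as pd

MAX_LENGTH = 55  # limit stated in the question
OTHER_COLS = ["TitleTag", "Input2", "Input4", "Input5"]

sentence_list = ['Køb online her', 'Sammenlign priser her', 'Tjek priser fra 4 butikker',
                 'Se produkter fra 4 butikker', 'Stort udvalg fra 4 butikker', 'Sammenlign og køb']

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "TitleTag": ["Sko fra Nike", "Løbesko fra Adidas DK",
                 "En meget lang produkttitel fra et kendt brand"],
    "Input2": ["-", "-", "-"],
    "Input4": ["-", "-", "-"],
    "Input5": ["Sitename", "Sitename", "Sitename"],
})

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

def choose_sentence(row):
    # Length of everything except Input3, plus the four joining spaces.
    fixed = sum(len(str(row[c])) for c in OTHER_COLS) + 4
    candidates = [s for s in sentence_list if fixed + len(s) < MAX_LENGTH]
    # Assumed fallback: an empty string when no sentence fits.
    return str(rng.choice(candidates)) if candidates else ""

df["Input3_OK"] = df.apply(choose_sentence, axis=1)
df["Full_Input"] = (df["TitleTag"].astype(str) + " " + df["Input2"].astype(str) + " "
                    + df["Input3_OK"] + " " + df["Input4"].astype(str) + " "
                    + df["Input5"].astype(str))
```

Rows whose other columns already reach the limit receive an empty Input3_OK here. They still exceed 55 characters and would need separate handling, such as shortening the title.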
