Prediction with cpdist using "probabilities" as evidence


Question

I have a quick question, with a small reproducible example, related to my work on prediction with bnlearn.

    library(bnlearn)
    # Build the data frame directly; Cause must be a factor for bn.fit()
    # (since R 4.0, strings are no longer converted to factors by default).
    Learning.set4=data.frame(
      Cause=factor(c("Yes","Yes","Yes","No","No","No")),
      Cons=c(9,10,8,3,2,1))
    b.network=empty.graph(colnames(Learning.set4))
    struct.mat=matrix(0,2,2)
    colnames(struct.mat)=colnames(Learning.set4)
    rownames(struct.mat)=colnames(struct.mat)
    struct.mat[1,2]=1
    bnlearn::amat(b.network)=struct.mat
    haha=bn.fit(b.network,Learning.set4)


    # Some predictions with the "lw" (likelihood weighting) method

    # The approach I know: hard evidence on one specific value.
    # (Cause is known with certainty; here, for example, Cause is "Yes".)
    classic_prediction=cpdist(haha,nodes="Cons",evidence=list("Cause"="Yes"),method="lw")
    print(mean(classic_prediction[,c(1)]))


    # What if I want to predict Cons when Cause has a 60% chance of being "Yes" and a 40% chance of being "No"?
    # Following the help page, I tried passing several values as evidence.
    # I could also write a function that generates "Yes" or "No" with the proper probabilities.
    prediction_idea=cpdist(haha,nodes="Cons",evidence=list("Cause"=c("Yes","Yes","Yes","No","No")),method="lw")
    print(mean(prediction_idea[,c(1)]))

Here is what the help page says:

"In the case of a discrete or ordinal node, two or more values can also be provided. In that case, the value for that node will be sampled with uniform probability from the set of specified values"

When I predict the value of a variable from categorical variables, so far I have only fixed one specific value (modality) of the categorical variable, as in the first prediction in the example. (Setting the evidence to "Yes" makes Cons take a high value.)

But if I want to predict Cons without knowing the exact modality of Cause with certainty, knowing only its probabilities, can I use what I did in the second prediction? Is this an elegant way to do it, or are there better built-in approaches I don't know of?

Recommended answer

I got in touch with the creator of the package, and I am pasting his answer to the question here:

The call to cpdist() is wrong:

    prediction_idea=cpdist(haha,nodes="Cons",evidence=list("Cause"=c("Yes","Yes","Yes","No","No")),method="lw")
    print(mean(prediction_idea[,c(1)]))

A query with the 40%-60% soft evidence requires you to place these new probabilities in the network first

    haha$Cause = c(0.40, 0.60)

and then run the query without an evidence argument. (Because you do not really have any hard evidence, just a different probability distribution for Cause.)
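One way to read this advice, as a sketch in my own notation rather than the package author's: write q for the new distribution of Cause, with q(Yes) = 0.6 and q(No) = 0.4. Querying the modified network without evidence then amounts to averaging the conditional distribution of Cons over q, assuming the conditional distribution of Cons given Cause is left unchanged:

```latex
P_q(\mathrm{Cons}) = \sum_{c \in \{\mathrm{Yes},\,\mathrm{No}\}} q(c)\, P(\mathrm{Cons} \mid \mathrm{Cause} = c),
\qquad
\mathbb{E}_q[\mathrm{Cons}] = 0.6\, \mathbb{E}[\mathrm{Cons} \mid \mathrm{Yes}] + 0.4\, \mathbb{E}[\mathrm{Cons} \mid \mathrm{No}].
```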

Here is the code that lets me do what I wanted, starting from the fitted network in the example.

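    # Copy the fitted probability table of Cause. Its entries follow the factor
    # levels, assumed here to be in the default alphabetical order: "No", "Yes".
    change=haha$Cause$prob
    change[1]=0.4  # P(Cause = "No")
    change[2]=0.6  # P(Cause = "Yes")
    haha$Cause=change
    # evidence=TRUE means no conditioning: Cons is sampled from the updated network.
    new_prediction=cpdist(haha,nodes="Cons",evidence=TRUE,method="lw")
    print(mean(new_prediction[,c(1)]))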
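As a rough sanity check on the simulated mean, the same quantity can be computed by hand in base R, without bnlearn. This sketch assumes the fitted network gives Cons one mean per level of Cause, equal to the group mean in the data (my understanding of how a continuous node with a single discrete parent is fitted). The names `cond_means`, `soft` and `expected_cons` are illustrative only.

```r
# Toy data from the question.
Cause = factor(c("Yes", "Yes", "Yes", "No", "No", "No"))
Cons = c(9, 10, 8, 3, 2, 1)

# Mean of Cons within each level of Cause (levels sort as "No", "Yes").
cond_means = tapply(Cons, Cause, mean)

# Soft evidence: the assumed new marginal distribution of Cause.
soft = c(No = 0.4, Yes = 0.6)

# Law of total expectation: weight each conditional mean by its probability.
expected_cons = sum(soft[names(cond_means)] * cond_means)
print(expected_cons)
```

If the simulated mean from `cpdist()` is far from this value, the most likely culprit is the order of the entries in the probability table, so it is worth printing `haha$Cause$prob` after the update.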
