SQL Server和R,数据挖掘 [英] Sql server and R, data mining

查看:91
本文介绍了SQL Server和R,数据挖掘的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用Microsoft SQL Management Studio 2016,该功能使我可以将R脚本添加到SQL代码中。
我的目标是实现aPriori算法过程,该过程将数据以我喜欢的方式放置,即具有x,第一个对象,y,第二个对象的表。



我被困在这里,因为我认为数据存在一些问题。错误是这样。


在执行
'sp_execute_external_script'且HRESULT为0x80004004的过程中发生了'R'脚本错误。 p>

发生外部脚本错误:eval(expr,envir,enclos)中的错误
:分配错误调用:源-> withVisible-> eval-> eval->呼叫


这是我的代码。
源数据是两列的表,如下所示:

  AB 
af
fa
bc
...
yz

这里是代码:

  GO 
创建过程dbo.apriorialgorithm as

-删除旧表
如果OBJECT_ID('Data')不为空
DROP TABLE数据

-创建一个存储查询结果的表。
CREATE TABLE数据(art1 nvarchar(100),art2 nvarchar(100));

-存储查询
插入数据(art1,art2)
选择
firstfield作为art1,
secondfield作为art2
从allthefields
;

如果OBJECT_ID('输出')不为空
DROP TABLE输出
-创建分析结果表。

CREATE TABLE输出(x nvarchar(100),y nvarchar(100));

插入到输出(x,y)
-R脚本。
EXECUTE sp_execute_external_script
@语言= N'R'
,@script = N'

现在是R脚本。我从查询中获得的数据是数字的,但是对于先验而言,我需要因子,因此首先将数据弯曲为因子。

  df< -data.frame(x = as.factor( art1),y = as.factor( art2 ))

然后,我可以应用先验值:

 库( arules); 
library( arulesViz);

规则=先验(df,parameter = list(minlen = 2,support = 0.05,confidence = 0.05));

我需要没有规则格式的数据,而只是对象:

  Ruledf<-data.frame(
lhs<-标签(lhs(rules)),
rhs< -标签(rhs(rules)),
rules @ quality)

a <-substr(ruledf $ lhs,7,nchar(as.character(ruledf $ lhs))-1)
b< -substr(ruledf $ rhs,7,nchar(as.character(ruledf $ rhs))-1)

Ruledf2< -data.frame(a,b)
'

最后一部分:

 ,@ input_data_1 = N'SELECT * from Data'
,@ output_data_1_name = N'ruledf2'
,@ input_data_1_name = N'ruledf2';

GO

我不知道我在哪里失败,因为这样做在R中使用RODBC捕获数据库数据的相同操作,一切正常。
您能帮我吗?

解决方案

问题出在这里,R脚本这样更好:

  EXECUTE sp_execute_external_script 
@language = N'R'
,@ script = N'

库(规则);

规则= apriori(df [,c( art1, art2)],参数=列表(minlen = 2,support = 0.0005,置信度= 0.0005));

Ruledf<-data.frame(
lhs<-标签(lhs(rules)),
rhs<-标签(rhs(rules)),
rules @ quality)

ruledf2< -data.frame(
lhs2< -substr(ruledf $ lhs,7,nchar(as.character(ruledf $ lhs))-1) ,
rhs2< -substr(ruledf $ rhs,7,nchar(as.character(Ruledf $ rhs))-1)


colnames(ruledf2)<- c( a, b)'

然后它需要正确的输入和输出:

 ,@ input_data_1 = N'SELECT * from数据'
,@ input_data_1_name = N'df'
,@ output_data_1_name = N'ruledf2'

所以结果将是一个名为output的表像这样

  xy 
ar $ artB
artB $ a
...
artY artZ


I'm working on Microsoft SQL Management Studio 2016, using the feature that make me to add an R script into the SQL code. My goals is to achieve an aPriori algorithm procedure, that puts the data in a manner that I like, i.e. a table with x, first object, y, second object.

I am stuck here, because in my opinion I have some problem in data. The error is this.

A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.

An external script error occurred: Error in eval(expr, envir, enclos) : bad allocation Calls: source -> withVisible -> eval -> eval -> .Call

Here my code. The source data are a table of two column like this:

A   B
a   f
f   a
b   c
...
y   z

And here the code:

GO
    create procedure dbo.apriorialgorithm as

-- delete old table
IF OBJECT_ID('Data') IS NOT NULL
    DROP TABLE Data

-- create a table that store the query result.
CREATE TABLE Data ( art1 nvarchar(100),  art2 nvarchar(100)); 

-- store the query
INSERT INTO Data ( art1,  art2)
select
  firstfield as art1,
  secondfield as art2
  from allthefields
  ; 

IF OBJECT_ID('output') IS NOT NULL
    DROP TABLE output
-- create table of the results of the analysis.

CREATE TABLE output (x nvarchar(100), y nvarchar(100)); 

INSERT INTO output (x, y)
-- R script.
 EXECUTE  sp_execute_external_script
                @language = N'R'
              , @script = N'

Now the R script. The data that I get from the query are numeric, but for the apriori, I need factors, so first I bend the data to factor;

                     df<-data.frame(x=as.factor("art1"),y=as.factor("art2"))

Then, I can apply the apriori:

                        library("arules");
                        library("arulesViz");

                        rules = apriori(df,parameter=list(minlen=2,support=0.05, confidence=0.05));

I need the data without the format of the rules, but simply the objects:

                        ruledf <- data.frame(
                              lhs <- labels(lhs(rules)),
                              rhs <- labels(rhs(rules)), 
                              rules@quality)

                        a<-substr(ruledf$lhs,7,nchar(as.character( ruledf$lhs))-1)
                        b<-substr(ruledf$rhs,7,nchar(as.character( ruledf$rhs))-1)

                        ruledf2<-data.frame(a,b)
                        '

And the last part:

              , @input_data_1 = N'SELECT * from Data'
              , @output_data_1_name = N'ruledf2'
              , @input_data_1_name = N'ruledf2';

GO

I do not know where I am failing, because doing the same things in R using RODBC to catch the db data, everything is ok. Could you help me? Thanks in advance!

解决方案

The problem was here, the R script is better this way:

EXECUTE  sp_execute_external_script
                @language = N'R'
              , @script = N'                

                        library("arules");

                        rules = apriori(df[, c("art1", "art2")], parameter=list(minlen=2,support=0.0005, confidence=0.0005));

                        ruledf <- data.frame(
                              lhs <- labels(lhs(rules)),
                              rhs <- labels(rhs(rules)), 
                              rules@quality)

                        ruledf2<-data.frame(
                              lhs2<-substr(ruledf$lhs,7,nchar(as.character( ruledf$lhs))-1),
                              rhs2<-substr(ruledf$rhs,7,nchar(as.character( ruledf$rhs))-1)
                                           )

                        colnames(ruledf2)<-c("a","b")  '

Then it needs to have the right input and output:

      , @input_data_1 = N'SELECT * from Data'
      , @input_data_1_name = N'df'
      , @output_data_1_name = N'ruledf2'

So the result is going to be a table named output like this

x    y
artA artB
artB artA
...
artY artZ

Very helpful this.

这篇关于SQL Server和R,数据挖掘的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆