SSIS: Execute IronPython or IronRuby scripts through SSIS

Question

I have a small Python script that crawls a web page over HTTP. The page is hosted on the intranet and uses NTLM authentication to control access.

I found this task (retrieving HTTP content) easy to program in Python, and I would rather keep it that way than rewrite the whole script in C# and run it through a "Script Task" in SSIS.

I've looked closely at the SSIS tools and found a Control Flow task named "Execute Process Task", which lets you run Win32 executables.

The problem is how to call my Python script, since it is not an executable and has to be run by the Python interpreter. I could simply build a ".bat" file that calls the interpreter with the script, and then execute that file through the SSIS "Execute Process Task".

Is there any other, neater way to implement this?

The script will store the information it retrieves in a database table, so that another SSIS process can access it through that table.

I'm retrieving information from different sources (flat files, database tables, HTTP requests, ...) in order to archive it in a database that can be exposed through a web service and then accessed from an Excel project.

Thanks in advance!

Recommended answer

The easiest mechanism, at least to my brain, for using IronPython from within the confines of SSIS would be to invoke the external process, dump its output to a file, and then use that file as the source for a data flow.
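A minimal sketch of that dump-to-file route, under stated assumptions: the script runs on Python 3, `fetch_rows` is a hypothetical placeholder for the real crawl, and the output path and column names are made up for illustration. The Execute Process Task can point its Executable at the interpreter and pass the script path in Arguments, so no ".bat" wrapper is strictly needed. By default the task fails when the process exit code differs from its SuccessValue (0), which is why `main` returns 0 or 1.

```python
import csv
import sys


def fetch_rows():
    """Hypothetical placeholder for the real crawl; returns (id, content) tuples."""
    return [(i, "item %d" % i) for i in range(3)]


def dump_rows(path, rows):
    """Write rows to a CSV file that a Flat File Source could read."""
    # newline="" stops the csv module from emitting blank lines on Windows
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Id", "Content"])  # assumed column names
        writer.writerows(rows)
    return len(rows)


def main(argv):
    """Return 0 on success and 1 on failure, for use as the process exit code."""
    # Hypothetical default path; pass the real one as the first argument.
    path = argv[1] if len(argv) > 1 else r"C:\ssisdata\crawl_output.csv"
    try:
        dump_rows(path, fetch_rows())
    except Exception as exc:
        sys.stderr.write("crawl failed: %s\n" % exc)
        return 1
    return 0
```

When run as a script, the file would end with `sys.exit(main(sys.argv))` so that the exit code reaches SSIS. A Flat File Source in the data flow can then read the CSV.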

That said, I was able to host an IronPython app from C#, use the returned data to populate the output buffers, and interact with that data in the pipeline. I've only had one machine to try this on, so I'm listing everything I recall doing until the package went green.

This article set me down the path of how to make this work: "Hosting IronPython in a C# 4.0 program" (http://blogs.msdn.com/b/charlie/archive/2009/10/25/hosting-ironpython-in-a-c-4-0-program.aspx). I would strongly urge you to create a C#/VB.NET console app and get your IronPython integration working there first, as SSIS is going to add an additional layer to everything.

It may be possible to host older versions of IronPython within C# without requiring the 4.0 framework, but that's far beyond the realm of my competency. What I can say is that to use the 4.0 framework, you are looking at SQL Server 2012. A 2008 package can target up to the 3.5 framework (the default is 2.0).

Global Assembly Cache, GAC for short. It is a special place in Windows where signed assemblies can live. SSIS may be able to use assemblies that aren't in the GAC, but I've not had luck doing so. This case was no different. My console app worked fine, but when I copied that code into SSIS, it would tank with "Could not load file or assembly 'Microsoft.Scripting..." error messages. Blessedly, the IronPython-2.7.2.1 DLLs (and probably those of previous versions) are strongly signed. That means you can, and must, add them to the GAC.

In your Visual Studio directory, look for the Visual Studio Command Prompt (2010). Assuming your IronPython installation folder is C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1, you would type cd C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1. Then I registered the following 3 assemblies:

C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if Microsoft.Dynamic.dll
Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Assembly successfully added to the cache

C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if IronPython.dll
Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Assembly successfully added to the cache

C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1>gacutil -if Microsoft.Scripting.dll
Microsoft (R) .NET Global Assembly Cache Utility.  Version 4.0.30319.1
Copyright (c) Microsoft Corporation.  All rights reserved.

Assembly successfully added to the cache

In my SSIS project, I had set Run64BitRuntime to False, but on retesting it does not matter. The default is True, and that seems to work fine.

Python script - I don't have enough of a background to make the integration between C# and the .NET DLR languages more graceful. It would have been nice to supply a string or something containing the script I wanted to execute, and perhaps that's what a script block is about, but I don't have time to investigate. So this solution requires a script file sitting somewhere on disk. I had trouble getting imports to work from a hosted script ("no module named X" exceptions). Undoubtedly there's some magic with class paths and all that stuff that needs to be provided to the host to make it work well. That's probably a different SO question, btw.
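One possible cause of the "no module named X" errors, which I have not verified against this setup, is that a hosted runtime does not automatically put IronPython's standard library folder on its module search path. A minimal sketch of a workaround inside the script itself follows. The `Lib` subfolder location is an assumption based on the install folder used above, and `ensure_on_path` is a hypothetical helper name.

```python
import sys  # sys is a built-in module, so it needs no search path entry

# Assumption: the standard library sits in the Lib folder of the install above.
IRONPYTHON_LIB = r"C:\tmp\IronPython-2.7.2.1\IronPython-2.7.2.1\Lib"


def ensure_on_path(folder):
    """Append folder to the module search path once; return True if it was added."""
    if folder in sys.path:
        return False
    sys.path.append(folder)
    return True


ensure_on_path(IRONPYTHON_LIB)

# With Lib on the search path, a standard-library import such as
# `import os` can be attempted from this point on.


def GetIPData():
    return range(0, 100)
```

The hosting API also exposes `ScriptEngine.SetSearchPaths`, so the same folder could instead be supplied from the C# side. Which of the two is cleaner here is a judgment call.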

I have a file sitting at C:\ssisdata\simplePy.py

# could not get a simple import to work from hosted
# works fine from "not hosted"
#import os

def GetIPData():
    #os.listdir(r'C:\\')
    return range(0,100)

After adding a Script Component (acting as a source) to the Data Flow, I configured it to have a single column, Content, on the output buffer (wstr 1000). I then used this as my source code.

using System;
using System.Collections.Generic;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using IronPython.Hosting;
using Microsoft.Scripting.Hosting;

/// <summary>
/// Attempt to use IP script as a source
/// http://blogs.msdn.com/b/charlie/archive/2009/10/25/hosting-ironpython-in-a-c-4-0-program.aspx
/// </summary>
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

    /// <summary>
    /// Create data rows and fill those buckets
    /// </summary>
    public override void CreateNewOutputRows()
    {
        foreach (var item in this.GetData())
        {
            Output0Buffer.AddRow();
            Output0Buffer.Content = item;
        }

    }

    /// <summary>
    /// I've written plenty of code, but I'm quite certain this is some of the ugliest.
    /// There certainly must be more graceful means of 
    /// * feeding your source code to the ironpython run-time than a file
    /// * processing the output of the code the method call
    /// * sucking less at life
    /// </summary>
    /// <returns>A list of strings</returns>
    public List<string> GetData()
    {
        List<string> output = null;
        var ipy = Python.CreateRuntime();
        dynamic test = ipy.UseFile(@"C:\ssisdata\simplePy.py");
        output = new List<string>();
        var pythonData = test.GetIPData();
        foreach (var item in pythonData)
        {
            output.Add(item.ToString());
        }

        return output;
    }
}

[Image: my references]

Click the run button and great success.
