How to load HTML data into SQL Server (non-table format)?

Question

I'm posting here because I couldn't find any such scenario on the web so far. I have a web page that contains a set of reports in both XLS and PDF formats. I need to download the Excel files from the page and load them into my database. I wish I could use the URL of the XLS file directly, but the naming convention may change each time (Sales_Quarter1.xlsx may become Sales_Q1.xlsx next year). The only thing that stays constant in the following example is "Sales for Calendar Year". I need to look up the file that corresponds to this text and download it before loading it into a database table.

I would like to know from the experts whether this is possible.

<li>
   <sub>Sales for Calendar Year 2015--All Countries&#160;</sub> 
   <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.xlsx"> 
   <sub>[XLS]</sub></a><sub>&#160;, <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.pdf"><sub>[PDF]</sub></a><sub>​</sub></sub>
</li>

PS: I am using SQL Server 2014.

Thanks!

Recommended answer

Have a look at Integration Services (SSIS). Create a package that uses a Script Task to pull the web page, along with variables that hold the local filenames of the downloaded HTML file and Excel file (you will also have to parse the link out of the HTML file). Then use an Excel Source as the next step in your package.
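The link-parsing step can be sketched as follows. This is only an illustration in Python using the standard library, not the code of an SSIS Script Task (which would normally be written in C# or VB.NET). The names `ReportLinkFinder` and `find_excel_url` and the base URL `https://example.com/reports/` are hypothetical; the sketch also assumes each report sits in its own non-nested `<li>`, as in the markup from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReportLinkFinder(HTMLParser):
    """Collect, for each <li>, its visible text and the hrefs it contains."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes &#160; and similar
        self.items = []     # (text, [hrefs]) for every finished <li>
        self._text = None   # text pieces of the <li> being read, else None
        self._hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._text, self._hrefs = [], []
        elif tag == "a" and self._text is not None:
            href = dict(attrs).get("href")
            if href:
                self._hrefs.append(href)

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._text is not None:
            self.items.append(("".join(self._text), self._hrefs))
            self._text = None


def find_excel_url(html, label, base_url):
    """Return the absolute URL of the first .xls/.xlsx link in a list item
    whose text contains `label` (case-insensitive), or None if none exists."""
    finder = ReportLinkFinder()
    finder.feed(html)
    finder.close()
    for text, hrefs in finder.items:
        if label.lower() in text.lower():
            for href in hrefs:
                # Ignore any query string when checking the extension.
                if href.lower().split("?")[0].endswith((".xls", ".xlsx")):
                    return urljoin(base_url, href)
    return None


# Markup taken from the question; the base URL is a placeholder assumption.
sample = """<li>
   <sub>Sales for Calendar Year 2015--All Countries&#160;</sub>
   <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.xlsx">
   <sub>[XLS]</sub></a><sub>&#160;, <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.pdf"><sub>[PDF]</sub></a><sub></sub></sub>
</li>"""

url = find_excel_url(sample, "Sales for Calendar Year", "https://example.com/reports/")
print(url)
```

Because the match is on the constant label text rather than on the file name, a rename from Sales_Quarter1.xlsx to Sales_Q1.xlsx would not affect the lookup. The returned URL could then be downloaded to a local file (for instance with `urllib.request.urlretrieve`) before the Excel Source step reads it.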

The variable for the Excel file used in the Script Task will also need to be set to ReadWrite.

If you plan to run this on a recurring basis, you can also schedule the resulting package through a SQL Server Agent job, placing any additional logic into the script or the execution paths.
