Read an Excel table in an Outlook mail in VBA

Question

I'm trying to read a table in my mail body and save it in a csv file (I attached an html sample below representing the content of my mail). I followed a related post: How to read table pasted in outlook message body using vba? The problem is that it delimits the table by its cells and not by its lines, so instead of getting 17 rows and one header I get 18 x 5 = 90 elements of an array. I tried to change the delimiter in the split option from "vbCrLf" to "vbLf", "vbCr", Char(10), but none of them worked. Does anyone have an idea why the split is not distinguishing a space delimiter from a new line? This seems to be the simplest way to read a table in a mail body, but if you have any other suggestion I will definitely consider it as well!

Here is the link to the sample: Sample

Thanks

Recommended answer

My routine worked well on Sample.htm as this image shows. I will first explain this image and then give two health warnings. The code exceeds the Stack Overflow limit of 30,000 characters so I cannot include it. If this is interesting, look at my profile where you will find an email address. Email me and I will send you the code.

Explanation of the image and the ragged array

The routine behind this image is designed to extract data from tables within a web page and write it to a single ragged array. The worksheet in the image was created by copying the ragged array to it. I have used a background of ivory to show which cells were in the ragged array. (Note: a ragged array is one in which the rows have varying numbers of columns.)
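For readers who have not met the term, a ragged array can be held in VBA as an array of arrays. The sketch below is only an illustration and not the routine described in this answer: the sub name, the sample values and the use of the active sheet are all assumptions. It copies each inner array to one worksheet row and colours the cells that exist in the array ivory.

```vba
Option Explicit

' Illustrative only: names and sample data are assumptions, not the author's code.
Sub CopyRaggedArrayToSheet()
    Dim ragged As Variant
    Dim rowData As Variant
    Dim r As Long, c As Long

    ' An array of arrays: each inner array is one row and may have its own length.
    ragged = Array( _
        Array(1, 20, 5, 0, 0, 2), _
        Array("Title spanning all columns"), _
        Array("", "", "", "", ""))

    For r = LBound(ragged) To UBound(ragged)
        rowData = ragged(r)
        For c = LBound(rowData) To UBound(rowData)
            ' Offsets make the code independent of Option Base.
            With ActiveSheet.Cells(r - LBound(ragged) + 1, c - LBound(rowData) + 1)
                .Value = rowData(c)
                .Interior.Color = RGB(255, 255, 240)   ' ivory: this cell exists in the array
            End With
        Next c
    Next r
End Sub
```

Run from an Excel workbook with a worksheet active; rows shorter than the widest row simply leave the remaining cells uncoloured.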

Sample.htm contains a single table which is about as simple as they get. I would not have guessed that from the complexity of the Html.

My routine ignores anything outside <table> to </table>. Within <table> to </table> it recognises the elements of a table. Any whitespace within a cell (<td> to </td>) becomes a single space in line with the rules of Html. A <p> is replaced by two linefeeds and a <br> by one line feed. Any tags other than table elements are discarded so: Normal <b>bold</b> <i>italic</i> becomes Normal bold italic. Attributes are ignored. Having finished one <table> to </table>, the routine looks for another. The routine handles nested tables. Character entities (such as "&amp;") are converted to the equivalent unicode character (such as "&"). The routine does NOT handle errors in the table definition; everything must be properly nested with no end tags omitted.
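The cell-level rules above can be sketched in a few lines. This is a simplified illustration, not the author's routine: the function name is hypothetical, it uses the late-bound VBScript.RegExp object, it handles only a handful of character entities, and it assumes it is given the inner HTML of a single cell with no nested table.

```vba
Option Explicit

' Simplified sketch of the cell rules; CleanCellHtml is a hypothetical name.
Function CleanCellHtml(ByVal cellHtml As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True

    ' 1. Any run of whitespace becomes a single space (done first so that
    '    the linefeeds introduced below survive).
    re.Pattern = "\s+"
    cellHtml = re.Replace(cellHtml, " ")

    ' 2. <p ...> becomes two linefeeds, <br ...> becomes one.
    re.Pattern = "<p\b[^>]*>"
    cellHtml = re.Replace(cellHtml, vbLf & vbLf)
    re.Pattern = "<br\b[^>]*>"
    cellHtml = re.Replace(cellHtml, vbLf)

    ' 3. Every remaining tag is discarded, attributes included.
    re.Pattern = "<[^>]+>"
    cellHtml = re.Replace(cellHtml, "")

    ' 4. A few common character entities; "&amp;" goes last so that
    '    text such as "&amp;lt;" is not decoded twice.
    cellHtml = Replace(cellHtml, "&nbsp;", ChrW(160))
    cellHtml = Replace(cellHtml, "&lt;", "<")
    cellHtml = Replace(cellHtml, "&gt;", ">")
    cellHtml = Replace(cellHtml, "&quot;", """")
    cellHtml = Replace(cellHtml, "&amp;", "&")

    CleanCellHtml = cellHtml
End Function
```

For example, CleanCellHtml("Normal <b>bold</b> <i>italic</i>") returns "Normal bold italic". A full implementation would also need a complete entity table and a proper parser for nested tables, which a regular expression cannot provide.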

Row 1 of the ragged array is the header for the first (and in this case only) table within Sample.htm. Its contents are:

1  20  5  0  0  2

The 1 says this is a level one table. A table nested within a cell of this table would be level two. A table nested within the level two table would be level three and so on.

There are 20 rows each with a maximum of 5 columns. The first zero means there is no header section. The second zero means there is no footer section. The two means the first (and in this case only) body section starts at row 2 of the ragged array. There would have been other row numbers after the 2 if there were multiple body sections.
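As a reading aid, the header row can be decoded field by field. The sketch below assumes the header row is available as a one-dimensional array in the order just described; the sub name is hypothetical and the values are the ones from Sample.htm.

```vba
Option Explicit

' Hypothetical helper: prints the meaning of each field of a table header row.
Sub DescribeTableHeader()
    Dim hdr As Variant
    Dim first As Long, i As Long

    hdr = Array(1, 20, 5, 0, 0, 2)      ' header row for Sample.htm
    first = LBound(hdr)                 ' works with Option Base 0 or 1

    Debug.Print "Nesting level:   " & hdr(first)
    Debug.Print "Data rows:       " & hdr(first + 1)
    Debug.Print "Maximum columns: " & hdr(first + 2)
    Debug.Print "Header section:  " & IIf(hdr(first + 3) = 0, "none", hdr(first + 3))
    Debug.Print "Footer section:  " & IIf(hdr(first + 4) = 0, "none", hdr(first + 4))

    ' Everything after the footer field is the start row of a body section.
    For i = first + 5 To UBound(hdr)
        Debug.Print "Body section starts at ragged-array row " & hdr(i)
    Next i
End Sub
```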

The next 20 rows of the ragged array are the data rows for the table each containing up to 5 columns.

In row 2, only column A is ivory. That row of the Html table only contained one cell. That cell has a colspan attribute so the cell extends across all five columns of the Html table. The existence and value of the colspan attribute is not included in the ragged array although the existence of either a colspan or a rowspan attribute can be deduced from the lack of cells. WARNING: this routine conceals the complexity of the Html from the calling routine. It does not conceal the complexity of the table. Fortunately, your table is simple with only a single colspan attribute.

Row 2 of the Html table - row 3 of the ragged array and the worksheet - has five empty cells.

The remaining cells are almost exactly as they appear in the ragged array. Because of the complexity of the cell definitions within the Html, there are linefeeds which have become a space within the data. The cell data is within <p> to </p> which has become LineFeed LineFeed within the data. I have added bespoke code to the calling routine to discard the spaces and line feeds.
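A clean-up of this kind might look like the sketch below. It is not the bespoke code mentioned above: the function names are hypothetical, and it assumes that "discard" means removing every linefeed and trimming leading and trailing spaces while leaving spaces between words alone. Since the question asks for a csv file, a second helper joins one row of the ragged array into a CSV line.

```vba
Option Explicit

' Hypothetical helper: drop linefeeds and surrounding spaces from one cell.
Function TidyCell(ByVal s As String) As String
    s = Replace(s, vbCr, "")
    s = Replace(s, vbLf, "")
    TidyCell = Trim$(s)
End Function

' Hypothetical helper: turn one row (a one-dimensional array) into a CSV line.
Function CsvRow(ByVal rowCells As Variant) As String
    Dim parts() As String
    Dim fld As String
    Dim i As Long

    ReDim parts(LBound(rowCells) To UBound(rowCells))
    For i = LBound(rowCells) To UBound(rowCells)
        fld = TidyCell(CStr(rowCells(i)))
        ' Quote fields containing a comma or a double quote, doubling any quotes.
        If InStr(fld, ",") > 0 Or InStr(fld, """") > 0 Then
            fld = """" & Replace(fld, """", """""") & """"
        End If
        parts(i) = fld
    Next i
    CsvRow = Join(parts, ",")
End Function
```

For example, CsvRow(Array(vbLf & vbLf & "Smith, J", " 42 ")) returns "Smith, J",42 with the first field quoted because it contains a comma.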

Health warning 1

The routine below is not quite what you ask for. The code was tested within an Excel workbook with Sample.htm being in the same folder as the workbook. You could either create a routine to save the Html body of the required message as an Html file or you could move this code to Outlook and adapt it to write to Excel from Outlook. There are questions, with coded answers, about both these options. I can recommend other answers for you to study but I think this answer is big enough.
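For the first option, a minimal Outlook-side sketch is shown here. It is an assumption-laden illustration rather than one of the answers referred to above: it must run inside Outlook's VBA editor, it saves the first selected item only, and the target path is a placeholder you would need to change.

```vba
Option Explicit

' Run from Outlook VBA. The path below is a placeholder, not a real location.
Sub SaveSelectedMailAsHtml()
    Const TARGET_PATH As String = "C:\Temp\Sample.htm"
    Dim itm As Object

    If Application.ActiveExplorer.Selection.Count = 0 Then
        MsgBox "Select a message first."
        Exit Sub
    End If

    Set itm = Application.ActiveExplorer.Selection.Item(1)
    If TypeOf itm Is Outlook.MailItem Then
        ' SaveAs with olHTML writes the message, including its HTML body, as an .htm file.
        itm.SaveAs TARGET_PATH, olHTML
    Else
        MsgBox "The selected item is not a mail message."
    End If
End Sub
```

The saved file can then be placed next to the Excel workbook in the same way as Sample.htm.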

Health warning 2

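The code below includes:

  • A tiny bespoke macro written by me for you to demonstrate my routine.
  • A routine written by Dick Kusleika and posted on Stack Overflow, which I am fond of.
  • Many routines written by me for me.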

The health warning is for the routines written by me for me.

I do not use these routines very often so the comments are to remind ME how to use them. They are not designed to help someone else understand what they do.

I place Debug.Assert False ' Not tested at the top of every path through the code and then comment these statements out when I test that path. If you request the code you will see that I have not tested all paths. With one exception, these routines work with the web pages I wish to decode. The exception is a site where the authors are showing off by nesting tables to a depth of five. Unfortunately, they then get their <td>s and </td>s muddled up and my code does not handle invalid Html. I correct the web page source before running my routine because that is the easiest for me. As I become interested in more web pages, I will test more of the code, but because the code is for me I do not look for test cases. If the code falls over for you, email me the html file and I will see what I can do.
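The marker technique is easy to reproduce. The function below is a made-up example, not taken from the author's code: the first branch has not yet been exercised, so execution stops there in the debugger the first time it is reached, while the second branch has been tested and its marker is commented out.

```vba
Option Explicit

' Made-up example of the "Debug.Assert False ' Not tested" convention.
Function DescribeSign(ByVal n As Long) As String
    If n < 0 Then
        Debug.Assert False      ' Not tested
        DescribeSign = "negative"
    Else
        ' Debug.Assert False    ' Not tested
        DescribeSign = "zero or positive"
    End If
End Function
```

Debug.Assert only breaks when the code runs inside the VBA editor, so the markers cost nothing once a path has been confirmed and commented out.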

I wrote these routines because they handle complex Html that Excel cannot. I suggest you try Excel on Sample.htm. The real Html is quite simple so if Excel can ignore the formatting it might be able to import this file.
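Trying Excel's own import takes only a few lines. The sketch assumes, as in the test setup described earlier, that Sample.htm sits in the same folder as the workbook holding the macro; the sub name is hypothetical and whether the result is usable depends on how Excel copes with the formatting.

```vba
Option Explicit

' Hypothetical test: let Excel open the Html file directly and report what it found.
Sub TryOpeningSampleInExcel()
    Dim wb As Workbook

    Set wb = Workbooks.Open(Filename:=ThisWorkbook.Path & "\Sample.htm")
    Debug.Print "Used range after import: " & wb.Worksheets(1).UsedRange.Address
End Sub
```

If the import looks right, the opened workbook can be saved as csv from Excel without any Html parsing at all.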
