使用Npgsql将Outlook电子邮件移动到Postgresql数据库的最快方法 [英] Fastest way to move outlook emails to postgresql db using Npgsql

查看:85
本文介绍了使用Npgsql将Outlook电子邮件移动到Postgresql数据库的最快方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在Outlook公用文件夹中大约有20万封电子邮件.

I have around 200 000 emails in outlooks public folders.

导出到pst有点快,但是我不知道psts是否可靠.同样完美地解码(带有附件)也变得很头疼. python lib不保存附件. Java库未为所有电子邮件导入html正文.

Exporting to pst is a little bit fast but I don't know if psts are reliable. Also decoding it perfectly(with attachments) became a big headache. python lib doesn’t save attachments. Java lib not importing html body for all emails.

所以,我想到了保存到postgresql db.

So, I thought of saving to postgresql db.

我正在使用Npgsql,但是速度很慢(与我使用它的方式一样).

I am using Npgsql but it is slow(with the way I am using it).

我当前的慢速方式:

sql = "insert into tablename (a,b,c) values (:aa, :bb, :cc)";  
cmd = (sql, con) #I am skipping full commands as I am reading from mind, dont think I made mistake in my code.  

NpgsqlParameter pa = new NpgsqlParameter(":aa", sometype)
#same for other 3

pa.value = somevalue

cmd.Parameters.Add(pa)
#add others also

如果我评论数据库保存部分,它确实非常快(例如10秒钟内发送1000封邮件),但是对于数据库部分,它保存10封邮件约2秒钟.

If I comment the db save part, it is really fast(like 1000 mails in 10 seconds), but with db part it is around 10 mails for 2 seconds.

我认为原因是,我有10列(htmlbody,body,headerstring等),并且正在为这10列创建新的NpgsqlParameter,为每封邮件设置它的值,这让我感觉很慢.

I think reason is, I have 10 columns (htmlbody, body, headerstring etc) and I am creating new NpgsqlParameter for these 10 columns, setting value for it for every mail which is making it slow I feel.

我无法简单地喜欢

"...values('"+htmlbody+"', '"+header_string+"'..

由于错误

syntax error at or near..so I started using npgsqlparameter.  

此外,我仅在以下情况下保存以下字段:如果header_string解析失败(不幸的是,某些电子邮件失败,尤其是在headerstring开头具有"Microsoft ..version 2"的电子邮件).它们足以代表完整的电子邮件,以防万一标题字符串无用?

Also, I am saving below fields only in case parsing the header_string fails(its failing for some emails sadly particularly the ones having 'microsoft ..version 2' at the beginning of the headerstring). Are they enough to represent full emails just in case headerstring is useless?

    sender_entry_id 
    foldername text,
    subject text,

    headerstring text,
    num_of_attachments int,

    body text,
    body_html text,

    bcc text,
    tooo text,
    frm text,
    cc text,
    conversation_id text,
    conversation_index text,
    conversation_topic text,
    message_class text,
    senton timestamp

另外,我收到此错误

System.Runtime.InteropServices.COMException (0x80040304): The operation failed.

在Microsoft.Office.Interop.Outlook.PropertyAccessorClass.GetProperty(String SchemaName)

at Microsoft.Office.Interop.Outlook.PropertyAccessorClass.GetProperty(String SchemaName)

,同时将某些附件另存为byte []变量.这是代码:

while saving some attachments as byte[] variable. Here is the code:

 const string PR_ATTACH_DATA_BIN =
"http://schemas.microsoft.com/mapi/proptag/0x37010102";

 byte[] attachmentData =
 (byte[])attachment.PropertyAccessor.GetProperty(
     PR_ATTACH_DATA_BIN);  #attachment is Outlook.Attachment

推荐答案

您可以先将附件保存到临时文件(Attachment.SaveAsFile),然后读取其数据.

You can save the attachment to a temporary file first (Attachment.SaveAsFile), then read its data.

正如Eugene所说,您可以使用Extended MAPI以IStream的形式访问附件数据,但是在C#中,只有在C ++或Delphi中,它才可以访问.

As Eugene mentioned, you can access the attachment data as IStream using Extended MAPI, but it is not accessible in C#, only in C++ or Delphi.

如果使用兑换是一种选择,则可以使用 RDOAttachment

If using Redemption is an option, you can use the AsText and AsArray properties exposed both by the RDOAttachment and Attachment object (returns by the Safe*Item objects).

这篇关于使用Npgsql将Outlook电子邮件移动到Postgresql数据库的最快方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆