IMAP "SEARCH HEADER" command fails when the search text contains an exclamation mark (!), ampersand (&), etc.


Question

I'm accessing Gmail's IMAP interface through Python. I run a command like this:

UID SEARCH HEADER Message-ID "abcdef@abc.com"

That succeeds (it returns one UID for the matching message, or none if no such message exists). However, if the search text contains certain characters (like & or !), the search text is truncated at that point. This means:

UID SEARCH HEADER Message-ID "!abcdef@abc.com"

is treated as:

UID SEARCH HEADER Message-ID ""

Also:

UID SEARCH HEADER Message-ID "abc!def@abc.com"

is treated as:

UID SEARCH HEADER Message-ID "abc"

I've gone through the IMAP spec, and from its ABNF grammar it seems those characters should be valid. Why is Gmail truncating these search phrases at the "!" and "&" characters? Is there a way to escape them? (I've tried \!, which fails as a badly encoded string.) Is there an RFC or document that shows what should really be accepted? Is this a bug in Gmail's IMAP implementation?

I've also tried the literal format, with the same result:

UID SEARCH HEADER Message-ID {15}
abc!def@abc.com

is still treated as:

UID SEARCH HEADER Message-ID {3}
abc

Thanks!

IMAP RFC 3501 SEARCH command: http://tools.ietf.org/html/rfc3501#section-6.4.4
Formal syntax: http://tools.ietf.org/html/rfc3501#section-9
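
For reference, here is a minimal Python sketch of the two string forms used above, as the RFC 3501 formal syntax defines them. In a quoted string only the double quote and the backslash (the "quoted-specials") need a backslash escape, so "!" and "&" pass through unchanged. A literal is prefixed with its length in octets. The helper names `imap_quoted` and `imap_literal` are made up for illustration and are not part of any library.

```python
def imap_quoted(value: str) -> str:
    """Format value as an RFC 3501 quoted string.

    Only '"' and '\\' are quoted-specials; everything else, including
    '!' and '&', is sent as-is. CR and LF cannot appear in a quoted string.
    """
    if "\r" in value or "\n" in value:
        raise ValueError("CR/LF not allowed in a quoted string; use a literal")
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def imap_literal(value: str) -> bytes:
    """Format value as an RFC 3501 literal: {octet count} CRLF data."""
    data = value.encode("utf-8")
    return b"{%d}\r\n" % len(data) + data


print(imap_quoted("abc!def@abc.com"))
print(imap_literal("abc!def@abc.com"))
```

Both forms carry the full 15-character string to the server, which suggests the truncation happens on the server side rather than in the command encoding.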

Recommended answer

I'm largely basing my answer on the discovery (by Max, in the comments on the original question) that Gmail's SEARCH implementation uses a backing database that has already split textual content into word tokens, rather than storing the full text and doing a substring search.

So here's a possible workaround that you could use with Gmail in C# using my MailKit library (it's a fairly low-level IMAP library, so this should translate easily into basic pseudocode):

// given: text = "abc!abcdef@abc.com"

// split the search text on '!'
var words = text.Split (new char[] { '!' }, StringSplitOptions.RemoveEmptyEntries);

// build a search query...
var query = SearchQuery.HeaderContains ("Message-ID", words[0]);
for (int i = 1; i < words.Length; i++)
    query = query.And (SearchQuery.HeaderContains ("Message-ID", words[i]));

// this will result in a query like this:
// HEADER "Message-ID" "abc" HEADER "Message-ID" "abcdef@abc.com"

// Do the UID SEARCH with the constructed query:
// A001 UID SEARCH HEADER "Message-Id" "abc" HEADER "Message-Id" "abcdef@abc.com"
var uids = mailbox.Search (query);

// Now UID FETCH the ENVELOPE (and UID) for each of the potential matches:
// A002 UID FETCH <uids> (UID ENVELOPE)
var messages = mailbox.Fetch (uids, MessageSummaryItems.UniqueId |
    MessageSummaryItems.Envelope);

// Now perform a manual comparison of the Message-IDs to get only exact matches...
var matches = new UniqueIdSet (SortOrder.Ascending);
foreach (var message in messages) {
    if (message.Envelope.MessageId.Contains (text))
        matches.Add (message.UniqueId);
}

// 'matches' now contains only the set of UIDs that exactly match your search query
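
Since the question uses Python, the same idea can be sketched there as two pure helpers: one builds the search criteria from the fragments of the search text, the other filters the fetched candidates down to real matches. This is a rough sketch, not a tested Gmail client. `SPLIT_CHARS` is an assumption based only on the two characters reported in the question, and the helper names are made up for illustration.

```python
# Assumption: characters at which Gmail appears to cut the search text.
# Only '!' and '&' are reported in the question; extend this if you see others.
SPLIT_CHARS = "!&"


def _quote(value: str) -> str:
    """IMAP quoted string: escape backslash and double quote only."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_header_search(header: str, text: str) -> str:
    """Build one HEADER criterion per fragment of text.

    IMAP ANDs adjacent search keys, so the result narrows the candidates
    to messages whose header contains every fragment.
    """
    fragments = [text]
    for ch in SPLIT_CHARS:
        fragments = [part for frag in fragments for part in frag.split(ch)]
    fragments = [frag for frag in fragments if frag]
    if not fragments:
        raise ValueError("search text contains no searchable fragment")
    return " ".join(f"HEADER {_quote(header)} {_quote(frag)}" for frag in fragments)


def exact_matches(candidates: dict, text: str) -> list:
    """candidates maps UID -> fetched Message-ID; keep UIDs containing text."""
    return sorted(uid for uid, message_id in candidates.items() if text in message_id)


print(build_header_search("Message-ID", "abc!abcdef@abc.com"))
```

With the standard `imaplib` module, the criteria string would be passed to `uid('SEARCH', ...)` on the connection, the Message-ID header of each returned UID would then be fetched, and the resulting UID-to-Message-ID mapping handed to `exact_matches`.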
