Google Apps Script - Label Email Based On Email Body [Optimize Code]


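Problem description

I intend to do the following:

  • An email that meets certain criteria arrives and is placed in the Label A folder.
  • A new email arrives.
  • The body of this new email is a duplicate of the first email's body.
  • This new email skips the inbox and goes to the Label B folder.

This is how I implemented it:

  • All new emails are labeled "Original".
  • When the script runs, it compares each email body against all previous bodies.
  • If the body is a duplicate, the email is labeled "Duplicate", moved to "Archive", and the "Original" label is removed.

The code is as follows: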

function emailLabeling() {
  var DUPLICATE = _getLabel();
  var labels = GmailApp.getUserLabelByName("Original");
  if(labels != null){
    var threads = labels.getThreads();
    for (var i = 0; i < threads.length; i++){
      var messages = threads[i].getMessages();
      for (var j = 0; j < messages.length; j++){
        var message = messages[j];
        for (var k = i; k < threads.length; k++){
          var messages_check = threads[k].getMessages();
          for (var l = j; l < messages_check.length; l++){
            var message_check = messages_check[l];
            if(message_check.getPlainBody() == message.getPlainBody()){
              if(i !=  k || j != l){
                Logger.log(i +""+ j +""+ k +""+ l);
                DUPLICATE.addToThread(threads[i]);
                labels.removeFromThread(threads[i]);
                GmailApp.moveThreadToArchive(threads[i]);
              }
            }
          }
        }
      }
    }
  }
  else{
    Logger.log("Label Not Found!");
  }
}

function _getLabel() {
  var label_text = "Duplicates";
  var label = GmailApp.getUserLabelByName(label_text);
  if (label == null) {
    // Reuse the variable declared above instead of redeclaring it.
    label = GmailApp.createLabel(label_text);
  }
  return label;
}

The code works fine. The problem lies in the 4 nested loops: every message is compared against every other message, so the runtime grows roughly quadratically as the number of "Original" emails increases.

Is it possible to optimize this code? Is there a smarter logic for implementing this idea?

Any help would be appreciated.

Recommended answer

One way to improve performance in nested-loop situations, especially duplicate identification, is to store a record of the content already traversed rather than comparing repeatedly. For example, you could hash each message body (given a suitable hash function) and store the hashes as object properties. Note that there is no formal limit on the length of an object property name, so you may be able to skip hashing it yourself (to obtain a fixed-length property) and just let Google Apps Script do it for you by using the body itself as the property name. Naturally, it's probably wise to test how large a message can be before relying on this assumption in production.
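Before involving Gmail at all, the "record what has been seen" idea can be sketched on plain strings. This is only an illustration: `findDuplicateIndices` and the `bodies` array are hypothetical names, not part of the question or of any Apps Script API. The lookup object is created with `Object.create(null)` so that a body such as "constructor" cannot collide with an inherited property.

```javascript
// Minimal sketch: find duplicate bodies in a single pass.
// `bodies` is a hypothetical array of plain-text message bodies.
function findDuplicateIndices(bodies) {
  // No prototype, so every key in `seen` is one we stored ourselves.
  var seen = Object.create(null);
  var duplicates = [];
  bodies.forEach(function (body, index) {
    if (seen[body] === undefined) {
      seen[body] = index;      // First occurrence: remember where it was.
    } else {
      duplicates.push(index);  // Later occurrence: record it as a duplicate.
    }
  });
  return duplicates;
}

var sample = ["hello", "world", "hello", "constructor", "world"];
console.log(findDuplicateIndices(sample)); // indices 2 and 4
```

Each body is looked up once instead of being compared against every other body, so the work grows roughly linearly with the number of messages.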

function updateEmailLabels() {
  // Use an Object to associate a message's plaintext body with the
  // associated thread/message IDs (or other data as desired).
  var seenBodies = {}, // When a message is read, its plaintext body is stored.
      DUPLICATE = _getLabel("SO_Duplicates"),
      ORIGINAL = _getLabel("SO_Original");

  // getThreads() returns newest first. Start with the oldest by reversing it.
  ORIGINAL.getThreads().reverse().forEach(function (thread) {
    thread.getMessages().forEach(function (message, messageIndex) {
      // Use this message's body for fast lookups.
      // Assumption: Apps Script has no reachable limit on Object property length.
      var body = message.getPlainBody();

      // Compare this message to all previously seen messages:
      if (!seenBodies[body]) {
        seenBodies[body] = {
          count: 1,
          msgIndices: [ messageIndex ],
          threads: [ thread ],
          threadIds: [ thread.getId() ]
        };
      } else {
        // This exact message body has been observed previously.
        // Update information about where the body has been seen (or perform
        // more intricate checks, i.e. compare threadIds and message indices,
        // before treating this thread and message as a duplicate).
        seenBodies[body].count += 1;
        seenBodies[body].msgIndices.push(messageIndex);
        seenBodies[body].threads.push(thread);
        seenBodies[body].threadIds.push(thread.getId());
      }
    }); // End for-each-message. 
  }); // End for-each-thread.

  // All messages in all threads have now been read and checked against each other.
  // Determine the unique threads to be modified.
  var threadsToChange = {};
  for (var body in seenBodies) {
    if (seenBodies[body].count === 1)
      continue;
    var data = seenBodies[body];
    for (var threadIndex = 1; threadIndex < data.threads.length; ++threadIndex)
      threadsToChange[data.threadIds[threadIndex]] = data.threads[threadIndex];
  }
  // Update their labels and archive status.
  for (var id in threadsToChange) {
    var thread = threadsToChange[id];
    DUPLICATE.addToThread(thread);
    ORIGINAL.removeFromThread(thread);
    GmailApp.moveThreadToArchive(thread);
  }
}

function _getLabel(labelText) {
  var label = GmailApp.getUserLabelByName(labelText);
  return label ? label : GmailApp.createLabel(labelText);
}
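If using whole message bodies as property names turns out to be a problem, a fixed-length key can be derived first. The sketch below is an assumption-laden illustration rather than part of the answer: `fnv1a` is a small FNV-1a style hash computed over UTF-16 code units, and `isDuplicateBody` and `buckets` are hypothetical names. Because a 32-bit hash can collide, each bucket still keeps the bodies so that a match is confirmed by a full comparison. Inside Apps Script, `Utilities.computeDigest` could probably supply a stronger digest instead, but that variant is not shown or tested here.

```javascript
// 32-bit FNV-1a style hash over UTF-16 code units, as an 8-character hex key.
function fnv1a(text) {
  var hash = 0x811c9dc5; // FNV offset basis
  for (var i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, truncated to 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) +
            (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Returns true if `body` was already recorded in `buckets`; otherwise
// records it and returns false. `buckets` maps hash key -> array of bodies.
function isDuplicateBody(buckets, body) {
  var key = fnv1a(body);
  var bucket = buckets[key] || (buckets[key] = []);
  if (bucket.indexOf(body) !== -1) {
    return true;   // Same hash and same text: a real duplicate.
  }
  bucket.push(body); // New text (or a hash collision): remember it.
  return false;
}

var buckets = Object.create(null);
var flags = ["a", "b", "a"].map(function (body) {
  return isDuplicateBody(buckets, body);
});
console.log(flags); // false, false, true
```

The trade-off is that short fixed-length keys keep the lookup object small, at the cost of an extra comparison inside each bucket.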

You'll definitely want to tweak the duplicate detection bits, since I don't exactly have qualifying emails just lying around ;) I suspect what I've written will classify a thread as a duplicate if at least 2 of its messages are the same, even if that thread is the only thread with that particular message body.
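One possible tweak for that caveat is to remember which thread first contained each body and to flag a thread only when the body was first seen in a different thread. The sketch below works on plain data so the logic can be checked without a mailbox; `findDuplicateThreadIds` and the `{ id, bodies }` shape are hypothetical stand-ins for the thread objects and `getPlainBody()` results used above, and the input is assumed to be ordered oldest first.

```javascript
// threads: hypothetical plain data, oldest first, e.g.
//   [{ id: "t1", bodies: ["text of message 1", "text of message 2"] }, ...]
function findDuplicateThreadIds(threads) {
  var firstThreadForBody = Object.create(null); // body -> id of first thread
  var duplicateIds = Object.create(null);       // set of flagged thread ids
  threads.forEach(function (thread) {
    thread.bodies.forEach(function (body) {
      var firstId = firstThreadForBody[body];
      if (firstId === undefined) {
        firstThreadForBody[body] = thread.id;   // First time this body appears.
      } else if (firstId !== thread.id) {
        duplicateIds[thread.id] = true;         // Seen earlier in another thread.
      }
      // Same body repeated within the same thread: not treated as a duplicate.
    });
  });
  return Object.keys(duplicateIds);
}

var sampleThreads = [
  { id: "t1", bodies: ["x", "x"] }, // repeats its own body only
  { id: "t2", bodies: ["y"] },
  { id: "t3", bodies: ["x"] }       // repeats a body first seen in t1
];
console.log(findDuplicateThreadIds(sampleThreads)); // only "t3"
```

In the real script, the returned ids would then select the threads to relabel and archive, in the same way `threadsToChange` is used above.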
