Google Apps Script - Label Email Based On Email Body [Optimize Code]

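Question

My goal is:

  • Receive an email that meets the criteria for being filed under Label A.
  • Receive a new email.
  • The body of this new email duplicates that of the first email.
  • The new email skips the inbox and goes to Label B.

Here is how I implemented it:

  • All new emails are labeled "Original".
  • When the script runs, it compares each email body against all previous bodies.
  • If a body is a duplicate, the email is labeled "Duplicate", moved to the archive, and the "Original" label is removed.

The code is as follows: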

function emailLabeling() {
  var DUPLICATE = _getLabel();
  var labels = GmailApp.getUserLabelByName("Original");
  if(labels != null){
    var threads = labels.getThreads();
    for (var i = 0; i < threads.length; i++){
      var messages = threads[i].getMessages();
      for (var j = 0; j < messages.length; j++){
        var message = messages[j];
        for (var k = i; k < threads.length; k++){
          var messages_check = threads[k].getMessages();
          for (var l = j; l < messages_check.length; l++){
            var message_check = messages_check[l];
            if(message_check.getPlainBody() == message.getPlainBody()){
              if(i !=  k || j != l){
                Logger.log(i +""+ j +""+ k +""+ l);
                DUPLICATE.addToThread(threads[i]);
                labels.removeFromThread(threads[i]);
                GmailApp.moveThreadToArchive(threads[i]);
              }
            }
          }
        }
      }
    }
  }
  else{
    Logger.log("Label Not Found!");
  }
}

function _getLabel() {
  var label_text = "Duplicates";
  var label = GmailApp.getUserLabelByName(label_text);
  if (label == null) {
    label = GmailApp.createLabel(label_text);
  }
  return label;
}

The code works fine. The problem lies in the 4 nested loops, which make the runtime grow rapidly as the number of "Original" emails increases.

Is there a way to optimize this code? Is there smarter logic to implement this idea?

Any help would be appreciated.

Recommended answer

One way to improve performance in nested-loop situations, especially duplicate identification, is to store a record of traversed content rather than repeatedly comparing. For example, you could hash the message body (given the right hash function) and store the hashes as object properties. Note that there is no formal limit on the length of an object property, so you may be able to skip hashing it yourself (to obtain a fixed-length property) and just let Google Apps Script do it for you. Naturally, it's probably wise to test how large a message can be before relying on such an assumption in production.
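To make the "record of traversed content" idea concrete without any Gmail calls, here is a minimal sketch in plain JavaScript. The function name `findDuplicateIndices` and the sample array are hypothetical, and the use of `Map` assumes a runtime that supports it (for example the Apps Script V8 runtime). Each body is looked up once, so the work grows roughly linearly with the number of messages instead of comparing every pair.

```javascript
// Hypothetical helper (not part of the original answer): returns the
// indices of bodies that repeat a body seen earlier in the array.
function findDuplicateIndices(bodies) {
  // Map from body text to the index where it was first seen.
  // A Map is used here instead of a plain object so that bodies such as
  // "constructor" cannot collide with inherited object properties.
  var seen = new Map();
  var duplicates = [];
  bodies.forEach(function (body, index) {
    if (seen.has(body)) {
      duplicates.push(index); // Same text appeared earlier.
    } else {
      seen.set(body, index);  // First occurrence: remember it.
    }
  });
  return duplicates;
}

// Sample data standing in for message.getPlainBody() results.
var sample = ["hello", "invoice #1", "hello", "invoice #1", "bye"];
var result = findDuplicateIndices(sample); // indices 2 and 3 repeat earlier bodies
```

In a real script the array would be replaced by bodies read from the labeled threads, as in the function below.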

function updateEmailLabels() {
  // Use an Object to associate a message's plaintext body with the
  // associated thread/message IDs (or other data as desired).
  var seenBodies = {}, // When a message is read, its plaintext body is stored.
      DUPLICATE = _getLabel("SO_Duplicates"),
      ORIGINAL = _getLabel("SO_Original");

  // getThreads() returns newest first. Start with the oldest by reversing it.
  ORIGINAL.getThreads().reverse().forEach(function (thread) {
    thread.getMessages().forEach(function (message, messageIndex) {
      // Use this message's body for fast lookups.
      // Assumption: Apps Script has no reachable limit on Object property length.
      var body = message.getPlainBody();

      // Compare this message to all previously seen messages:
      if (!seenBodies[body]) {
        seenBodies[body] = {
          count: 1,
          msgIndices: [ messageIndex ],
          threads: [ thread ],
          threadIds: [ thread.getId() ]
        };
      } else {
        // This exact message body has been observed previously.
        // Update information about where the body has been seen (or perform
        // more intricate checks, i.e. compare threadIds and message indices,
        // before treating this thread and message as a duplicate).
        seenBodies[body].count += 1;
        seenBodies[body].msgIndices.push(messageIndex);
        seenBodies[body].threads.push(thread);
        seenBodies[body].threadIds.push(thread.getId());
      }
    }); // End for-each-message. 
  }); // End for-each-thread.

  // All messages in all threads have now been read and checked against each other.
  // Determine the unique threads to be modified.
  var threadsToChange = {};
  for (var body in seenBodies) {
    if (seenBodies[body].count === 1)
      continue;
    var data = seenBodies[body];
    for (var threadIndex = 1; threadIndex < data.threads.length; ++threadIndex)
      threadsToChange[data.threadIds[threadIndex]] = data.threads[threadIndex];
  }
  // Update their labels and archive status.
  for (var id in threadsToChange) {
    var thread = threadsToChange[id];
    DUPLICATE.addToThread(thread);
    ORIGINAL.removeFromThread(thread);
    GmailApp.moveThreadToArchive(thread);
  }
}

function _getLabel(labelText) {
  var label = GmailApp.getUserLabelByName(labelText);
  return label ? label : GmailApp.createLabel(labelText);
}

You'll definitely want to tweak the duplicate detection bits, since I don't exactly have qualifying emails just lying around ;) I suspect what I've written will classify a thread as a duplicate if at least 2 of its messages are the same, even if that thread is the only thread with that particular message body.
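One possible way to tighten that check, sketched here in plain JavaScript under stated assumptions: only treat a thread as a duplicate when the body was first seen in a different thread. The function name `findDuplicateThreadIds` and the input shape (`id` plus a `bodies` array, ordered oldest first) are hypothetical stand-ins for the thread IDs and plaintext bodies the script already reads. `Map` and `Set` again assume a runtime that supports them.

```javascript
// Hypothetical input shape: [{ id: "t1", bodies: ["...", "..."] }, ...],
// ordered oldest thread first.
function findDuplicateThreadIds(threads) {
  var firstSeenIn = new Map();  // body text -> id of the first thread containing it
  var duplicateIds = new Set(); // ids of threads that repeat another thread's body
  threads.forEach(function (thread) {
    thread.bodies.forEach(function (body) {
      if (!firstSeenIn.has(body)) {
        firstSeenIn.set(body, thread.id);
      } else if (firstSeenIn.get(body) !== thread.id) {
        // The body first appeared in a different, older thread.
        duplicateIds.add(thread.id);
      }
      // A body repeated only within its own thread is ignored.
    });
  });
  return Array.from(duplicateIds);
}

var exampleThreads = [
  { id: "t1", bodies: ["a", "a"] }, // repeats itself only: not a duplicate
  { id: "t2", bodies: ["b"] },
  { id: "t3", bodies: ["a", "c"] }  // "a" was first seen in t1: duplicate
];
var duplicateThreadIds = findDuplicateThreadIds(exampleThreads);
```

Whether a thread that repeats its own message should count as a duplicate depends on the mail being filtered, so treat this rule as one option rather than the required behavior.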
