将值从一张纸粘贴到另一张纸并删除重复的纸 [英] Paste values from one sheet to another and remove duplicates

查看:75
本文介绍了将值从一张纸粘贴到另一张纸并删除重复的纸的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的Google电子表格中有两个工作表:

I have two worksheets in my google spreadsheet:

输入数据通过importxml函数进入Get Data工作表.

Input data is coming into the Get Data worksheet via the importxml function.

但是,我想将Get Data工作表的所有值复制到Final Data工作表,如果有重复项(就行而言),请追加唯一行.

However, I would like to copy all values of the Get Data sheet to the Final Data sheet and if there are duplicates(in terms of rows) append the unique row.

这是我尝试过的:

function onEdit() {
   //get the data from old Spreadsheet
 var ss = SpreadsheetApp.openById("1bm2ia--F2b0495iTJotp4Kv1QAW-wGUGDUROwM9B-D0");
 var dataRange = ss.getSheetByName("Get Data").getRange(1, 1, ss.getLastRow(), ss.getLastColumn());
 var dataRangeFinalData = ss.getSheetByName("Final Data").getRange(1, 1, ss.getLastRow(), ss.getLastColumn());
 var myData = dataRange.getValues();
 //Open new Spreadsheet & paste the data
newSS = SpreadsheetApp.openById("1bm2ia--F2b0495iTJotp4Kv1QAW-wGUGDUROwM9B-D0");
Logger.log(newSS.getLastRow());

newSS.getSheetByName("Final Data").getRange(newSS.getLastRow()+1, 1, ss.getLastRow(), ss.getLastColumn()).setValues(myData);
//remove duplicates in the new sheet
removeDups(dataRangeFinalData)
}

function getId() {
  Browser.msgBox('Spreadsheet key: ' + SpreadsheetApp.getActiveSpreadsheet().getId());
}

function removeDups(array) {
  var outArray = [];
  array.sort(lowerCase);
  function lowerCase(a,b){
    return a.toLowerCase()>b.toLowerCase() ? 1 : -1;// sort function that does not "see" letter case
  }
  outArray.push(array[0]);
  for(var n in array){
    Logger.log(outArray[outArray.length-1]+'  =  '+array[n]+' ?');
    if(outArray[outArray.length-1].toLowerCase()!=array[n].toLowerCase()){
      outArray.push(array[n]);
    }
  }
  return outArray;
}

您可以在下面找到指向示例电子表格的链接:

Below you can find the link to a sample spreadsheet:

示例表

我的问题是数据没有粘贴.

My problem is that the data does not get pasted.

感谢您的答复!

推荐答案

tl; dr:参见底部的脚本.

onEdit()函数不适合您的用例,因为通过电子表格功能修改的单元格内容不视为编辑"事件.您可以在此答案中了解更多信息.如果您希望将其自动化,那么定时触发功能将是适当的.或者,您可以通过菜单项手动调用该功能.我将由您决定,因为问题的实质是如何确保最终数据集中的行级唯一性.

An onEdit() function is inappropriate for your use case, as cell contents modified by spreadsheet functions are not considered "edit" events. You can read more about that in this answer. If you want this to be automated, then a timed trigger function would be appropriate. Alternatively, you could manually invoke the function by a menu item, say. I'll leave that to you to decide, as the real meat of your problem is how to ensure row-level uniqueness in your final data set.

尽管您的原始代码不完整,但您似乎打算首先使用不区分大小写的字符串比较从源数据中删除重复项.我建议改用其他JavaScript魔术在这里有所帮助.

Although your original code is incomplete, it appears you were intending to first remove duplicates from the source data, utilizing case-insensitive string comparisons. I'll suggest instead that some other JavaScript magic would help here.

我们对目标数据的唯一性感兴趣,因此我们需要一种将新行与已有行进行比较的方法.如果我们有字符串或数字数组,那么我们可以使用

We're interested in uniqueness in our destination data, so we need to have a way to compare new rows to what we already have. If we had arrays of strings or numbers, then we could just use the techniques in How to merge two arrays in Javascript and de-duplicate items. However, there's a complication here, because we have an array of arrays, and arrays cannot be directly compared.

精细-我们仍然可以逐个元素比较行,这将需要一个简单的循环遍历正在比较的行中的所有列.简单但缓慢,我们称之为 O(n 2 )解决方案(顺序为n平方).随着要比较的行数增加,唯一比较操作数将成倍增加.所以,我们不要那样做.

Fine - we could still compare rows element-by-element, which would require a simple loop over all columns in the rows we were comparing. Simple, but slow, what we would call an O(n2) solution (Order n-squared). As the number of rows to compare increased, the number of unique comparison operations would increase exponentially. So, let's not do that.

相反,我们将创建一个单独的数据结构,该数据结构镜像我们的目标数据,但是对于比较而言非常有效,它是一个

Instead, we'll create a separate data structure that mirrors our destination data but is very efficient for comparisons, a hash.

在JavaScript中,我们可以通过名称或键快速访问对象的属性.此外,该键可以是任何字符串.然后,我们可以创建一个简单的哈希表,其中包含一个对象,该对象的属性使用从目标数据行生成的字符串命名.例如,这将创建一个哈希对象,然后向其添加数组row:

In JavaScript we can quickly access the properties of an object by their name, or key. Further, that key can be any string. We can create a simple hash table then, with an object whose properties are named using strings generated from the rows of our destination data. For example, this would create a hash object, then add the array row to it:

var destHash = {};
destHash[row.join('')] = true; // could be anything

要创建我们的密钥,我们要在join数组中的所有值row中不带分隔符.现在,为了测试一行的唯一性,我们只需要检查具有相同格式键的对象属性是否存在.像这样:

To create our key, we're joining all the values in the row array with no separator. Now, to test for uniqueness of a row, we just check for existence of an object property with an identically-formed key. Like this:

var alreadyExists = destHash.hasOwnProperty(row.join(''));

另一个注意事项:由于可以想象到源数据可能包含目标数据中还没有的重复行,因此我们需要在标识出唯一行时不断扩展哈希表.

One additional consideration: since the source data can conceivably contain duplicate rows that aren't yet in the destination data, we need to continuously expand the hash table as unique rows are identified.

JavaScript提供了两个内置数组方法,我们将使用它们过滤过滤出已知行,然后仅连接唯一行到我们的目标数据.

JavaScript provides two built-in array methods that we'll use to filter out known rows, and concatenate only unique rows to our destination data.

以简单的形式,看起来像这样:

In its simple form, that would look like this:

// Concatentate source rows to dest rows if they satisfy a uniqueness filter
var mergedData = destData.concat(sourceData.filter(function (row) {
  // Return true if given row is unique
}));

您可以将其理解为创建一个名为mergedData的数组,该数组由名为destData的数组的当前内容组成,并与sourceData数组的过滤后的行串联在一起."

You can read that as "create an array named mergedData that consists of the current contents of the array named destData, with filtered rows of the sourceData array concatenated to it."

由于已经提到的其他注意事项,您将在最终函数中发现它有点复杂.

You'll find in the final function that it's a little more complex due to the other considerations already mentioned.

一旦有了我们的mergedData数组,只需将其写入目标工作表即可.

Once we have our mergedData array, it just needs to be written into the destination Sheet.

填充行:源数据包含宽度不一致的行,这在调用setValues()时会出现问题,因为该行希望所有行都被整齐.这将要求我们检查并填充行,以避免此类错误:

Padding rows: The source data contains rows of inconsistent width, which will be a problem when calling setValues(), which expects all rows to be squared off. This will require that we examine and pad rows to avoid this sort of error:

范围宽度不正确,为6,但应为5(第?行,代码"文件)

Incorrect range width, was 6 but should be 5 (line ?, file "Code")

填充行是通过push在行数组的末尾空白单元格"直到达到预期的长度来完成的.

Padding rows is done by pushing blank "cells" at the end of the row array until it reaches the intended length.

for (var col=mergedData[row].length; col<mergedWidth; col++)
  mergedData[row].push('');

在处理完每一行之后,我们终于可以写出结果了.

With that taken care of for each row, we're finally ready to write out the result.

function appendUniqueRows() {
  var ss = SpreadsheetApp.getActive();
  var sourceSheet = ss.getSheetByName('Get Data');
  var destSheet = ss.getSheetByName('Final Data');

  var sourceData = sourceSheet.getDataRange().getValues();
  var destData = destSheet.getDataRange().getValues();

  // Check whether destination sheet is empty
  if (destData.length === 1 && "" === destData[0].join('')) {
    // Empty, so ignore the phantom row
    destData = [];
  }

  // Generate hash for comparisons
  var destHash = {};
  destData.forEach(function(row) {
    destHash[row.join('')] = true; // could be anything
  });

  // Concatentate source rows to dest rows if they satisfy a uniqueness filter
  var mergedData = destData.concat(sourceData.filter(function (row) {
    var hashedRow = row.join('');
    if (!destHash.hasOwnProperty(hashedRow)) {
      // This row is unique
      destHash[hashedRow] = true;   // Add to hash for future comparisons
      return true;                  // filter -> true
    }
    return false;                   // not unique, filter -> false
  }));

  // Check whether two data sets were the same width
  var sourceWidth = (sourceData.length > 0) ? sourceData[0].length : 0;
  var destWidth = (destData.length > 0) ? destData[0].length : 0;
  if (sourceWidth !== destWidth) {
    // Pad out all columns for the new row
    var mergedWidth = Math.max(sourceWidth,destWidth);
    for (var row=0; row<mergedData.length; row++) {
      for (var col=mergedData[row].length; col<mergedWidth; col++)
        mergedData[row].push('');
    }
  }

  // Write merged data to destination sheet
  destSheet.getRange(1, 1, mergedData.length, mergedData[0].length)
           .setValues(mergedData);
}

这篇关于将值从一张纸粘贴到另一张纸并删除重复的纸的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆