Get all nested Text Elements in a Google Doc using Selection's RangeElements

Question

In a document similar to the above, I can get all the paragraphs with the following code:

var paras = body.getParagraphs();

Notice that the code above not only returns the top level paragraphs but also returns all the sub-level paragraphs inside ListItems, Tables etc.

How can I do the same thing within a selected range? The following code only returns top-level elements.

const selection = DocumentApp.getActiveDocument().getSelection();
var rangeElements = selection.getRangeElements();

For example, the table above contains 9 non-empty paragraphs and I'd like to process them one by one if they are in selection.

What I'm trying to achieve is similar to translating the text in a selection while preserving the formatting, tables, list items etc. as much as possible.

Recommended answer

.getRangeElements() returns an array of RangeElements. A RangeElement is a wrapper object that helps us deal with partial selections. We can call .getElement() on each item in this array to get the Element object, which is a very generic object that can represent almost any piece of a Google Doc. Elements have a .getType() method that returns an ElementType enum, and there are a lot of them!

Let's use what we know so far to see which types appear in a Google Doc (I've created one similar to yours as an example):

function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();

  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement(); //Unwrap the RangeElement to get the underlying Element.

    Logger.log(elem.getType());
  });
}

//Logger OUTPUT:
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
LIST_ITEM
LIST_ITEM
LIST_ITEM
PARAGRAPH
PARAGRAPH
PARAGRAPH
TABLE
PARAGRAPH

Ah ha! It looks like we only have to deal with PARAGRAPH, LIST_ITEM, and TABLE ElementTypes for now, but let's keep their children in mind too (we will find out that these are 3 of the 5 types in this selection that can have children). This sounds like a job for a recursive function that keeps digging down into child elements until we've found and dealt with them all.

So let's try that. This next part may look confusing but essentially it is finding an element, checking if it has children, then looking at those to see if they have children, and so on. We also want to check if we are getting new ElementTypes to deal with as well...

function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();

  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement(); //Unwrap the RangeElement to get the underlying Element.

    elemsHaveWhatChildElems(elem, elem.getType());

  });
}

function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH"){ //Let's see if the element is one of our basic 3. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
       Logger.log(typeChain); //Let's log the chain of Parent to Child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new elemTypeChains we have not seen.
  }
}

//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW
*TABLE.TABLE_ROW
PARAGRAPH

Alright, so each line of the log is a chain of Elements and their children. We have some new ElementTypes (HORIZONTAL_RULE, TABLE_ROW, and TEXT). If a chain is only a Paragraph with no children, logged as just 'PARAGRAPH', we can ignore it because it is a blank line. We can also ignore HORIZONTAL_RULE, since a horizontal rule has no text.

If we have reached a TEXT Element, we can perform our operation on it (i.e. for OP it would be a translation), as happens under the LIST_ITEMs and PARAGRAPHs. However, we still have to deal with TableRow objects (which log like this: TABLE.TABLE_ROW). A TableRow is a container like our main 3 elements, so we can add it to the condition: if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH") becomes if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW").

This gives us another new Element in our chain: TableCell (logs like: TABLE.TABLE_ROW.TABLE_CELL), which we can again add to our if statement, making it: if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL")
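As the condition grows, the list of container types could also be kept in one place. The following is only a sketch of that idea: CONTAINER_TYPES and isContainer are names made up for illustration, and it assumes that the value returned by getType() converts to the same strings that appear in the Logger output.

```javascript
// Hypothetical helper names; these are not part of the Apps Script API.
// The five element types found in this selection that can hold children.
var CONTAINER_TYPES = ["PARAGRAPH", "LIST_ITEM", "TABLE", "TABLE_ROW", "TABLE_CELL"];

// Returns true when the element's type is one of the container types.
// String() is used because getType() returns an ElementType enum value,
// which is assumed here to stringify to the names seen in the log.
function isContainer(elem) {
  return CONTAINER_TYPES.indexOf(String(elem.getType())) !== -1;
}
```

With a helper like this, the long if condition could read if(isContainer(elem)), and supporting another container type would mean adding one string to the array.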

Time to get to the bottom of the table's element types.

function selectionHasWhichtypeChains() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();

  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement(); //Unwrap the RangeElement to get the underlying Element.

    elemsHaveWhatChildElems(elem, elem.getType());

  });
}

function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
       Logger.log(typeChain); //Let's log the chain of Parent to Child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new elemTypeChains we have not seen.
  }
}

//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.HORIZONTAL_RULE
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
PARAGRAPH


This is great! We've reached into the depths of every parent element and reached either a Text Element or a blank paragraph! From here we can slightly modify our code to add the functions that we want to perform while maintaining the structure of the document:

function myFunction() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  if(!selection){ return; } //getSelection() returns null when nothing is selected.
  var rangeElems = selection.getRangeElements(); //Get main Elements of selection

  rangeElems.forEach(function(rangeElem){ //Let's run through each to find ALL of their children.
    var elem = rangeElem.getElement(); //We have a RangeElement. Let's get the full element.
    getNestedTextElements(elem, elem.getType()); //Time to go down the rabbit hole.
  });
}

function getNestedTextElements(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        getNestedTextElements(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }
  }else if(elemType == "TEXT"){
    //THIS IS WHERE WE CAN PERFORM OUR OPERATIONS ON THE TEXT ELEMENT
    var text = elem.getText();
  }else{
    Logger.log("*" + typeChain); //Let's log the new elem we don't deal with now - for future proofing.
  }
}
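One way to fill in the TEXT branch is to collect the Text elements first and process them afterwards. This is a minimal sketch rather than a definitive implementation: collectTextElements, applyToTexts and CONTAINERS are hypothetical names, it assumes the same five container types found above, and transform stands for whatever operation is wanted.

```javascript
// Hypothetical helpers; these are not part of the Apps Script API.
var CONTAINERS = ["PARAGRAPH", "LIST_ITEM", "TABLE", "TABLE_ROW", "TABLE_CELL"];

// Depth-first walk that appends every TEXT element found under elem to out.
// Types that are neither TEXT nor a known container are skipped.
function collectTextElements(elem, out) {
  var type = String(elem.getType());
  if (type === "TEXT") {
    out.push(elem);
  } else if (CONTAINERS.indexOf(type) !== -1) {
    for (var i = 0; i < elem.getNumChildren(); i++) {
      collectTextElements(elem.getChild(i), out);
    }
  }
  return out;
}

// Replaces the content of each collected Text element with transform(oldText).
function applyToTexts(texts, transform) {
  texts.forEach(function (t) {
    t.setText(transform(t.getText()));
  });
}
```

For a translation, transform might wrap LanguageApp.translate(text, sourceLanguage, targetLanguage). Replacing the whole text of an element this way may not preserve formatting that varies inside a single Text element, so that part would need extra care.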

BOOM! Done. I know this is a really long post, but I've broken down each section of the solution into parts to help new Apps Script coders understand the structure of a Selection (and Document Body, I guess) and how to modify it when the structure is very complicated (many nested Elements). I really hope this was helpful. If anybody sees a piece that can be improved, let me know.

As a note to OP: Be warned that this doesn't necessarily deal with partial selections of an Element, but that can easily be dealt with by modifying the first function a little to check for isPartial() on the RangeElement.
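A partial-selection check could look roughly like the sketch below. getSelectedText is a hypothetical helper name; it relies on the RangeElement methods isPartial(), getStartOffset() and getEndOffsetInclusive(), and it assumes the wrapped element has a getText() method (as Text and Paragraph elements do).

```javascript
// Hypothetical helper; not part of the Apps Script API.
// Returns the selected slice when only part of the element is selected,
// otherwise the element's whole text.
function getSelectedText(rangeElem) {
  var text = rangeElem.getElement().getText();
  if (rangeElem.isPartial()) {
    // The end offset is inclusive, while substring's end index is exclusive.
    return text.substring(rangeElem.getStartOffset(), rangeElem.getEndOffsetInclusive() + 1);
  }
  return text;
}
```

The same offsets could be used to limit an edit to the selected characters instead of the whole element.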
