Retrieve text in namedRange with Google Docs API


Question

Using the Google Docs/Drive APIs with Node, I've successfully made a service which produces 'template' style documents which feature namedRanges for other users to write into. I'd like to use the Google Docs API to read the text that gets entered inside of these ranges, but can't see a clean way of doing so. Given that I have the start and end indices of each range, I thought this would be very simple! Unfortunately I can't see any built-in way of doing it?

Currently it looks like I will have to request the whole google doc, and for each range that I'm watching, compare each node's start/end index and recursively traverse down the tree until they match. Is there not a better way of doing this?

Cheers

Tanaike's solution below is cleaner, but I had already got a version working in my Firebase Function, so I thought I might as well share it. This code retrieves a Google Doc with the given ID and stores the contents of the namedRanges as strings within a Firebase Realtime Database, keeping images and tables intact through "BBCode" style tags. Relevant code below (note that I know that each namedRange is inside of a table cell, which makes finding them easier):

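// Assumed setup (not shown in the original post): the googleapis client
// and a Firebase Realtime Database handle named `db`.
const { google } = require('googleapis');
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.database();

async function StoreResponses(oauth2Client, numSections, documentId, meetingId, revisionId, roomId) 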
{
    var gdocsApi = google.docs({version: 'v1', auth: oauth2Client});

    return gdocsApi.documents.get({ "documentId": documentId })
    .then((document) => {
        
        var ranges = document.data.namedRanges;
        var docContent = document.data.body.content;

        var toStore = [];

        for(var i = 0; i < numSections; i++)
        {
            var range = ranges[`zoomsense_section_${i}`].namedRanges[0].ranges[0]
            
            // loop through document contents until we hit the right index
            for(var j = 0; j < docContent.length; j++)
            {
                if(docContent[j].startIndex <= range.startIndex && docContent[j].endIndex >= range.endIndex)
                {
                    // we know that the ranges are inside single table cells
                    var sectionContents = docContent[j].table.tableRows[0].tableCells[0].content;

                    toStore.push(readStructuralElementsRecursively(document, sectionContents));
                }
            }
        }

        return db.ref(`/data/gdocs/${meetingId}/${roomId}/${documentId}/revisions/${revisionId}/responses`).set(toStore);
    })
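    .catch((exception) => {
        // no HTTP response object is in scope in this function,
        // so log the error and rethrow it for the caller to handle
        console.error(exception);
        throw exception;
    })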
}

// uses https://developers.google.com/docs/api/samples/extract-text
function readStructuralElementsRecursively(document, elements)
{
    var text = "";
    elements.forEach(element => {
        if(element.paragraph)
        {
            element.paragraph.elements.forEach(elem => {
                text += readParagraphElement(document, elem);
            });
        }
        else if(element.table)
        {
            // The text in table cells are in nested Structural Elements, so this is recursive
            text += "[table]"
            element.table.tableRows.forEach(row => {
                text += "[row]"
                row.tableCells.forEach(cell => {
                    text += `[cell]${readStructuralElementsRecursively(document, cell.content)}[/cell]`;
                })
                text += "[/row]"
            })
            text+= "[/table]"
        }
    });

    return text;
}

// handle text and inline content
function readParagraphElement(document, element)
{
    if(element.textRun)
    {
        // standard text
        return element.textRun.content;
    }
    if(element.inlineObjectElement)
    {
        var objId = element.inlineObjectElement.inlineObjectId;
        var imgTag = "\n[img]404[/img]"

        try
        {
            var embeddedObj = document.data.inlineObjects[objId].inlineObjectProperties.embeddedObject;
            if(embeddedObj.imageProperties)
            {
                // this is an image
                imgTag = `[img]${embeddedObj.imageProperties.contentUri}[/img]`
            }
            else if(embeddedObj.embeddedDrawingProperties)
            {
                // this is a shape/drawing
                // can't find any way to meaningfully reference them externally,
                // so storing the ID in case we can do it later
                imgTag = `[drawing]${objId}[/drawing]`
            }
        }
        catch(exception)
        {
            console.log(exception)
        }
         
        return imgTag;
    }
}

Recommended answer

I believe your goal is as follows.

  • You want to retrieve the values from the named ranges in a Google Document.
  • In your Google Document, the named ranges have already been set.
  • You want to achieve this using Node.js.
    • Unfortunately, from your question, I couldn't confirm which library you are using to access the Docs API.

In order to achieve the above, I would like to propose the following workarounds.

Unfortunately, at the current stage, there is no method for directly retrieving the values of a named range in the Google Docs API. I believe that such a method might be added in the future, because the Docs API is still growing. So, as the current workaround using the Docs API, the following flow is required.

1. Retrieve the Google Document object using the documents.get method of the Docs API.
2. Retrieve startIndex and endIndex using the name of the named range.
3. Retrieve the values using startIndex and endIndex.
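The three steps above can be sketched roughly as follows. This is a minimal illustration, not part of the original answer: the helper names `collectText` and `getNamedRangeTexts` and the `sampleDocument` object are hypothetical, and the function expects the Document resource itself (for the googleapis client, that would be the `data` property of the documents.get response). It assumes the range lies in the document body and only gathers plain text from text runs, including those nested in tables.

```javascript
// Hypothetical helper: concatenates the text of every textRun that overlaps
// the half-open index range [startIndex, endIndex).
function collectText(elements, startIndex, endIndex) {
  let text = "";
  for (const element of elements || []) {
    // The API omits startIndex when it is 0, so default it.
    const elStart = element.startIndex || 0;
    if (element.endIndex <= startIndex || elStart >= endIndex) continue;
    if (element.paragraph) {
      for (const pe of element.paragraph.elements || []) {
        if (!pe.textRun) continue;
        const peStart = pe.startIndex || 0;
        // Convert document indices into offsets within this text run.
        const from = Math.max(startIndex, peStart) - peStart;
        const to = Math.min(endIndex, pe.endIndex) - peStart;
        if (to > from) text += pe.textRun.content.slice(from, to);
      }
    } else if (element.table) {
      // Table cells hold nested structural elements, so recurse.
      for (const row of element.table.tableRows || []) {
        for (const cell of row.tableCells || []) {
          text += collectText(cell.content, startIndex, endIndex);
        }
      }
    }
  }
  return text;
}

// Hypothetical helper: returns one string per named range with the given name.
function getNamedRangeTexts(document, name) {
  const entry = (document.namedRanges || {})[name];
  if (!entry) return [];
  return entry.namedRanges.map((namedRange) =>
    namedRange.ranges
      .map((r) => collectText(document.body.content, r.startIndex || 0, r.endIndex))
      .join("")
  );
}

// Made-up document for illustration, shaped like a documents.get response body.
const sampleDocument = {
  body: {
    content: [
      { endIndex: 1, sectionBreak: {} },
      {
        startIndex: 1,
        endIndex: 13,
        paragraph: {
          elements: [{ startIndex: 1, endIndex: 13, textRun: { content: "Hello world\n" } }],
        },
      },
    ],
  },
  namedRanges: {
    greeting: {
      name: "greeting",
      namedRanges: [
        { namedRangeId: "kix.abc", name: "greeting", ranges: [{ startIndex: 7, endIndex: 12 }] },
      ],
    },
  },
};

console.log(getNamedRangeTexts(sampleDocument, "greeting")); // logs [ 'world' ]
```

Slicing by index works here because the Docs API measures indices in UTF-16 code units, which is also how JavaScript strings are indexed.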

This has already been mentioned in your question. When the Google Docs API is used, this method is currently required. But when the Google Document service (Google Apps Script) is used, the values of a named range can be retrieved directly by the name and/or the ID of the named range. In this answer, I would like to propose this method as another workaround.

Please do the following flow.

1. Create a new Google Apps Script project.

The sample script for Web Apps is a Google Apps Script, so please create a Google Apps Script project. In order to use the Document service, Web Apps is used as the wrapper in this case.

If you want to create it directly, please access https://script.new/. If you are not logged in to Google, the login screen is opened, so please log in. After that, the script editor of Google Apps Script is opened.

2. Prepare the script.

Please copy and paste the following script (Google Apps Script) into the script editor. This script is for the Web Apps.

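      // Returns the text of each element covered by a named range.
      function getNamedRangeText(namedRange) {
        return namedRange.getRange().getRangeElements().map(el => {
          const text = el.getElement().asText().getText();
          // The offsets are -1 unless the element is only partially inside the range.
          return el.isPartial() ? text.slice(el.getStartOffset(), el.getEndOffsetInclusive() + 1) : text;
        });
      }

      function doGet(e) {
        const doc = DocumentApp.openById(e.parameter.id);
        let res = [];
        if (e.parameter.name) {
          // Look up by name; several ranges can share a name, so use the first.
          const ranges = doc.getNamedRanges(e.parameter.name);
          if (ranges.length > 0) res = getNamedRangeText(ranges[0]);
        } else if (e.parameter.rangeId) {
          // Look up by ID, dropping the "kix." prefix.
          const range = doc.getNamedRangeById(e.parameter.rangeId.split(".")[1]);
          if (range) res = getNamedRangeText(range);
        }
        return ContentService.createTextOutput(JSON.stringify(res));
      }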

3. Deploy Web Apps.

1. On the script editor, open a dialog box via "Publish" -> "Deploy as web app".
2. Select "Me" for "Execute the app as:".
  • By this, the script is run as the owner.
3. Select "Anyone, even anonymous" for "Who has access to the app:".
  • In this case, no access token is required for the request. I recommend this setting for testing your goal.
  • Of course, you can also use an access token. In that case, please set this to "Only myself" or "Anyone", and include the scope https://www.googleapis.com/auth/drive.readonly or https://www.googleapis.com/auth/drive in the access token. These scopes are required to access Web Apps.
4. When asked to authorize the script:
  1. Click "Review Permissions".
  2. Select your own account.
  3. Click "Advanced" at "This app isn't verified".
  4. Click "Go to ### project name ### (unsafe)".
  5. Click the "Allow" button.

• Click "OK".
• Copy the URL of Web Apps. It looks like https://script.google.com/macros/s/###/exec.
  • When you modify the Google Apps Script, please redeploy it as a new version. By this, the modified script is reflected in Web Apps. Please be careful about this.
4. Run the function using Web Apps.

You can retrieve the values of the named range from the Google Document using the following Node.js script.

          const request = require("request");
          const url = "https://script.google.com/macros/s/###/exec";  // Please set the URL of Web Apps.
          let qs = {
            id: "###",  // Please set the Document ID.
            name: "###",  // Please set the name of named range.
            // rangeId: "kix.###",  // Please set the ID of named range.
          };
          let options = {
            url: url,
            qs: qs,
            method: "get",
          };
          request(options, (err, res, result) => {
            if (err) {
              console.log(err);
              return;
            }
            console.log(result);
          });
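
As a possible alternative to the request package (which has been deprecated), the same call can be sketched with the `fetch` function built into Node.js 18 and later. The helper names `buildWebAppUrl` and `fetchNamedRangeText` are hypothetical, the `accessToken` argument is optional and only relevant when the Web Apps is not deployed for anonymous access, and the URL and IDs in the usage comment are placeholders.

```javascript
// Hypothetical helper: appends the query parameters expected by doGet(e)
// (id plus either name or rangeId) to the Web Apps URL.
function buildWebAppUrl(webAppUrl, params) {
  const url = new URL(webAppUrl);
  for (const [key, value] of Object.entries(params)) {
    // Skip unset parameters so that only one of name/rangeId is sent.
    if (value !== undefined && value !== null) url.searchParams.set(key, value);
  }
  return url.toString();
}

// Hypothetical helper: calls the Web Apps and parses the JSON array it returns.
// Assumes Node.js 18+ where fetch is available globally.
async function fetchNamedRangeText(webAppUrl, params, accessToken) {
  const headers = accessToken ? { Authorization: `Bearer ${accessToken}` } : {};
  const response = await fetch(buildWebAppUrl(webAppUrl, params), { headers });
  if (!response.ok) {
    throw new Error(`Web Apps request failed with status ${response.status}`);
  }
  return JSON.parse(await response.text());
}

// Usage (placeholders, not real IDs):
// fetchNamedRangeText("https://script.google.com/macros/s/###/exec", { id: "###", name: "###" })
//   .then((values) => console.log(values))
//   .catch((err) => console.error(err));
```

Keeping the URL construction in its own function makes it easy to check the query string without sending a request.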
          

• In this case, the result is returned as an array containing the values.
• With the above Web Apps, the values can be retrieved with the name and/or the ID of the named range. When you want to use the name of the named range, please use let qs = {id: "###", name: "###"};. When you want to use the ID of the named range, please use let qs = {id: "###", rangeId: "kix.###"};.
• When you modify the script of Web Apps, please redeploy Web Apps as a new version. By this, the latest script is reflected in Web Apps. Please be careful about this.
References:
• Document Service
• Web Apps
• Taking advantage of Web Apps with Google Apps Script
