How to parse an XML file stored in my Google Drive that comes back as HTML?


Problem description

How can I parse an XML file stored in my Google Drive when it comes back as HTML?

I saved on my Google Drive a copy of the XML from this source: http://api.allocine.fr/rest/v3/movie?media=mp4-lc&partner=YW5kcm9pZC12Mg&profile=large&version=2&code=265621 I can parse the source, but I cannot parse the copy as XML, because it comes back looking like HTML. I get parsing errors such as: The element type "meta" must be terminated by the matching end-tag "</meta>" or Element type "a.length" must be followed by either attribute specifications, ">" or "/>". I shared the file at https://drive.google.com/file/d/16kJ5Nko-waVb8s2T12LaTEKaFY01603n/view?usp=sharing so that you can access it and test my script. I know that I could use CacheService, and that works, but to have more control over the buffering I would like to try this approach.

function xmlParsingXmlStoreOnGoogleDrive(){
  // This is the original XML, which parses correctly.
  var fetched=UrlFetchApp.fetch("http://api.allocine.fr/rest/v3/movie?media=mp4-lc&partner=YW5kcm9pZC12Mg&profile=large&version=2&code=265621");
  var blob=fetched.getBlob();
  var getAs=blob.getAs("text/xml");
  var data=getAs.getDataAsString("UTF-8");
  Logger.log(data.substring(1,350)); // substring keeps the log short; this logs the expected XML:
  /*
    ?xml version="1.0" encoding="utf-8"?>
    <!-- Copyright © 2019 AlloCiné -->
    <movie code="265621" xmlns="http://www.allocine.net/v6/ns/">
    <movieType code="4002">Long-métrage</movieType>
    <originalTitle>Mise à jour sur Google play</originalTitle>
    <title>Mise à jour sur Google play</title>
    <keywords>Portrait of a Lady on Fire </keywords>
  */
  var xmlDocument=XmlService.parse(data);
  var root=xmlDocument.getRootElement();
  var keywords=root.getChild("keywords",root.getNamespace()).getText();
  Logger.log(keywords); // Logs the expected result: "Portrait of a Lady on Fire "

  // This is my copy of the original XML, which I cannot parse.
  var fetched=UrlFetchApp.fetch("https://drive.google.com/file/d/1K3-9dHy-h0UoOOY5jYfiSoYPezSi55h1/view?usp=sharing");
  var blob=fetched.getBlob();
  var getAs=blob.getAs("text/xml");
  var data=getAs.getDataAsString("UTF-8");
  Logger.log(data.substring(1,350)); // substring keeps the log short; this logs unexpected HTML:
  /*
    !DOCTYPE html><html><head><meta name="google" content="notranslate"><meta http-equiv="X-UA-Compatible" content="IE=edge;">
    <style>@font-face{font-family:'Roboto';font-style:italic;font-weight:400;src:local('Roboto Italic'),local('Roboto-Italic'),
    url(//fonts.gstatic.com/s/roboto/v18/KFOkCnqEu92Fr1Mu51xIIzc.ttf)format('truetype');}@font-face{font-fam......
  */
  var xmlDocument=XmlService.parse(data); // ABORTS WITH THE ERROR: Element type "a.length" must be followed by either attribute specifications, ">" or "/>"
  var root=xmlDocument.getRootElement();
  var keywords=root.getChild("keywords",root.getNamespace()).getText();
  Logger.log(keywords);
}

I read in this similar question, "Parse XML file (which is stored on GoogleDrive) with Google app script", that "Unfortunately we can't directly get xml files in the google drive". Is that right, and does it simply mean that I cannot make my script work?
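The log output above shows that the /view URL returns Drive's viewer page, not the file content. A minimal sketch of a guard that catches this before XmlService.parse is called follows. The names looksLikeHtml and parseXmlOrThrow are hypothetical helpers, not part of Apps Script, and the check is only a heuristic on the first characters of the body.

```javascript
/**
 * Returns true when a fetched body looks like an HTML page rather than XML.
 * Heuristic only: it inspects the first non-whitespace characters.
 */
function looksLikeHtml(text) {
  // Drop leading whitespace, then compare case-insensitively.
  var head = String(text).replace(/^\s+/, '').slice(0, 200).toLowerCase();
  return head.indexOf('<!doctype html') === 0 || head.indexOf('<html') === 0;
}

/**
 * Hypothetical wrapper: refuse to parse when the body is an HTML page.
 * XmlService is only available inside Google Apps Script.
 */
function parseXmlOrThrow(text) {
  if (looksLikeHtml(text)) {
    throw new Error('Expected XML but received an HTML page; check the URL used to fetch the file.');
  }
  return XmlService.parse(text);
}
```

Calling parseXmlOrThrow(data) in place of XmlService.parse(data) would replace the confusing "meta" or "a.length" parser errors with a message that points at the fetch URL.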

Recommended answer

Wonderful! You are right. Both of your suggestions work. I just made a mistake elsewhere in my code, so solution 1 no longer works. That is why I am posting a new script to test it. This is only for my own learning, because my project is safe thanks to you :)

function storeXmlOnGoogleDriveThenParsIt(url){
  url=url||"http://api.allocine.fr/rest/v3/movie?media=mp4-lc&partner=YW5kcm9pZC12Mg&profile=large&version=2&code=265621"; // for testing
  // Save a copy of the fetched URL on my Google Drive (to spare the server from too many requests).
  var bufferFolder=DriveApp.getRootFolder().searchFolders('title = "BufferFiles"').next();
  var bufferedXml=bufferFolder.createFile("xmlBuffered.xml", UrlFetchApp.fetch(url).getContentText(), MimeType.PLAIN_TEXT);
  var urlBufferedXml=bufferedXml.getUrl(); // The URL of the buffered file
  var fileId=urlBufferedXml.match(/https:\/\/drive\.google\.com\/file\/d\/(.*)\/view.*/)[1];


  // Now I want to parse the buffered XML file.
  // [ Your second way of getting the data works perfectly. THANK YOU VERY MUCH!
  var data = DriveApp.getFileById(fileId).getBlob().getDataAsString();
  var xmlDocument=XmlService.parse(data);
  var root=xmlDocument.getRootElement();
  var mynamespace=root.getNamespace();
  var keywords=root.getChild("keywords",mynamespace).getText();
  Logger.log("keywords:"+keywords); // and parsing succeeds ]


  // [ The first way of getting the data used to work, but it now aborts since I modified the line that creates the XML file, and I cannot recover the working code.
  var downloadUrlBufferedXml="https://drive.google.com/uc?id="+fileId+"&export=download";
  var data = UrlFetchApp.fetch(downloadUrlBufferedXml).getContentText(); // used to work, but data is now HTML text again :(
  Logger.log("data"+data.substring(1,350)); // this shows that data is HTML and not XML :(
  var xmlDocument=XmlService.parse(data); // So I get an error like: The element type "meta" must be terminated by the matching end-tag "</meta>" ]
  var root=xmlDocument.getRootElement();
  var mynamespace=root.getNamespace();
  var keywords=root.getChild("keywords",mynamespace).getText();
  Logger.log("keywords:"+keywords);
}
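The working path in the script above (file ID, then DriveApp.getFileById, then XmlService.parse) can be isolated into two small helpers. This is a sketch, not the original answerer's code: extractDriveFileId and readDriveXml are hypothetical names, and the ID extraction assumes the URL has either the /file/d/<ID>/ form or an id=<ID> query parameter, as the URLs in this post do.

```javascript
/**
 * Extracts the file ID from a Google Drive URL such as
 * https://drive.google.com/file/d/<ID>/view?usp=sharing
 * or https://drive.google.com/uc?id=<ID>&export=download.
 * Returns null when no ID is found.
 */
function extractDriveFileId(url) {
  var match = /\/file\/d\/([^\/?#]+)/.exec(url) || /[?&]id=([^&#]+)/.exec(url);
  return match ? match[1] : null;
}

/**
 * Reads the raw file content through DriveApp and parses it as XML.
 * DriveApp and XmlService are only available inside Google Apps Script.
 */
function readDriveXml(url) {
  var fileId = extractDriveFileId(url);
  if (fileId === null) {
    throw new Error('No Drive file ID found in: ' + url);
  }
  // getBlob() returns the stored bytes, not the HTML viewer page.
  var data = DriveApp.getFileById(fileId).getBlob().getDataAsString();
  return XmlService.parse(data);
}
```

When the File object is already at hand, as with bufferedXml in the script above, bufferedXml.getId() returns the ID directly and the URL matching can be skipped.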
