Google Sheets ImportXml() imported content can not be parsed
I'm trying to import data from an XML file online, though it's not working correctly.
The formula I'm using is this:
=IMPORTXML("https://mobileapp01.graincorp.com.au/prices?site=8440&commodity=1002000", "//*[@id='collapsible86']/div[1]/div[2]/div[1]/span[2]")
And it is returning an error saying
Imported XML content can not be parsed.
Have I done something wrong, or will it simply not work with that XML document?
Thanks
So here is what is happening: although that endpoint looks like it is formatted as XML, it turns out it's actually JSON, which is why IMPORTXML can't parse it.
In your sheet, go to the menu and choose Tools, then Script editor, and paste the code I added down below into a blank script. (Delete the couple of lines it comes with by default before pasting.)
After you save the script, just type in =IMPORTJSON("https://mobileapp01.graincorp.com.au/prices?site=8440&commodity=1002000")
or =IMPORTJSON(A1) if the URL is in cell A1; a cell reference works too. And voilà, problem solved :)
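A quick way to tell which importer an endpoint needs is to look at what the response body actually starts with, rather than how the browser renders it. The following is only a minimal sketch in plain JavaScript: `looksLikeJson` is a hypothetical helper name (it is not part of the script below), and the sample strings are made up rather than taken from the real endpoint.

```javascript
// Hypothetical helper: guess whether a response body is JSON rather than XML.
function looksLikeJson(text) {
  var trimmed = String(text).trim();
  if (trimmed.charAt(0) === '<') {
    return false; // XML and HTML documents start with a tag
  }
  try {
    JSON.parse(trimmed); // throws on anything that is not valid JSON
    return true;
  } catch (e) {
    return false;
  }
}

// Made-up sample bodies, not the real response from the endpoint.
console.log(looksLikeJson('{"price": 250}'));
console.log(looksLikeJson('<prices><price>250</price></prices>'));
```

If a check like this reports JSON, IMPORTXML will not be able to parse the content, and a JSON importer such as the script below is the better fit.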
/*====================================================================================================================================*
ImportJSON by Trevor Lohrbeer (@FastFedora)
====================================================================================================================================
Version: 1.1
Project Page: http://blog.fastfedora.com/projects/import-json
Copyright: (c) 2012 by Trevor Lohrbeer
License: GNU General Public License, version 3 (GPL-3.0)
http://www.opensource.org/licenses/gpl-3.0.html
------------------------------------------------------------------------------------------------------------------------------------
A library for importing JSON feeds into Google spreadsheets. Functions include:
ImportJSON For use by end users to import a JSON feed from a URL
ImportJSONAdvanced For use by script developers to easily extend the functionality of this library
Future enhancements may include:
- Support for a real XPath like syntax similar to ImportXML for the query parameter
- Support for OAuth authenticated APIs
Or feel free to write these and add on to the library yourself!
------------------------------------------------------------------------------------------------------------------------------------
Changelog:
1.1 Added support for the noHeaders option
1.0 Initial release
*====================================================================================================================================*/
/**
* Imports a JSON feed and returns the results to be inserted into a Google Spreadsheet. The JSON feed is flattened to create
* a two-dimensional array. The first row contains the headers, with each column header indicating the path to that data in
* the JSON feed. The remaining rows contain the data.
*
* By default, data gets transformed so it looks more like a normal data import. Specifically:
*
* - Data from parent JSON elements gets inherited to their child elements, so rows representing child elements contain the values
* of the rows representing their parent elements.
* - Values longer than 256 characters get truncated.
* - Headers have slashes converted to spaces, common prefixes removed and the resulting text converted to title case.
*
* To change this behavior, pass in one of these values in the options parameter:
*
* noInherit: Don't inherit values from parent elements
* noTruncate: Don't truncate values
* rawHeaders: Don't prettify headers
* noHeaders: Don't include headers, only the data
* debugLocation: Prepend each value with the row & column it belongs in
*
* For example:
*
* =ImportJSON("http://gdata.youtube.com/feeds/api/standardfeeds/most_popular?v=2&alt=json", "/feed/entry/title,/feed/entry/content",
* "noInherit,noTruncate,rawHeaders")
*
* @param {url} the URL to a public JSON feed
* @param {query} a comma-separated list of paths to import. Any path starting with one of these paths gets imported.
* @param {options} a comma-separated list of options that alter processing of the data
*
* @return a two-dimensional array containing the data, with the first row containing headers
**/
function ImportJSON(url, query, options) {
  return ImportJSONAdvanced(url, query, options, includeXPath_, defaultTransform_);
}
/**
* An advanced version of ImportJSON designed to be easily extended by a script. This version cannot be called from within a
* spreadsheet.
*
* Imports a JSON feed and returns the results to be inserted into a Google Spreadsheet. The JSON feed is flattened to create
* a two-dimensional array. The first row contains the headers, with each column header indicating the path to that data in
* the JSON feed. The remaining rows contain the data.
*
* Use the include and transformation functions to determine what to include in the import and how to transform the data after it is
* imported.
*
* For example:
*
* =ImportJSON("http://gdata.youtube.com/feeds/api/standardfeeds/most_popular?v=2&alt=json",
* "/feed/entry",
* function (query, path) { return path.indexOf(query) == 0; },
* function (data, row, column) { data[row][column] = data[row][column].toString().substr(0, 100); } )
*
* In this example, the import function checks to see if the path to the data being imported starts with the query. The transform
* function takes the data and truncates it. For more robust versions of these functions, see the internal code of this library.
*
* @param {url} the URL to a public JSON feed
* @param {query} the query passed to the include function
* @param {options} a comma-separated list of options that may alter processing of the data
* @param {includeFunc} a function with the signature func(query, path, options) that returns true if the data element at the given path
* should be included or false otherwise.
* @param {transformFunc} a function with the signature func(data, row, column, options) where data is a 2-dimensional array of the data
* and row & column are the current row and column being processed. Any return value is ignored. Note that row 0
* contains the headers for the data, so test for row==0 to process headers only.
*
* @return a two-dimensional array containing the data, with the first row containing headers
**/
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
  var jsondata = UrlFetchApp.fetch(url);
  var object = JSON.parse(jsondata.getContentText());
  return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
/**
* Encodes the given value to use within a URL.
*
* @param {value} the value to be encoded
*
* @return the value encoded using URL percent-encoding
*/
function URLEncode(value) {
  return encodeURIComponent(value.toString());
}
/**
* Parses a JSON object and returns a two-dimensional array containing the data of that object.
*/
function parseJSONObject_(object, query, options, includeFunc, transformFunc) {
  var headers = new Array();
  var data = new Array();

  if (query && !Array.isArray(query) && query.toString().indexOf(",") != -1) {
    query = query.toString().split(",");
  }
  if (options) {
    options = options.toString().split(",");
  }

  parseData_(headers, data, "", 1, object, query, options, includeFunc);
  parseHeaders_(headers, data);
  transformData_(data, options, transformFunc);

  return hasOption_(options, "noHeaders") ? (data.length > 1 ? data.slice(1) : new Array()) : data;
}
/**
* Parses the data contained within the given value and inserts it into the data two-dimensional array starting at the rowIndex.
* If the data is to be inserted into a new column, a new header is added to the headers array. The value can be an object,
* array or scalar value.
*
* If the value is an object, its properties are iterated through and passed back into this function with the name of each
* property extending the path. For instance, if the object contains the property "entry" and the path passed in was "/feed",
* this function is called with the value of the entry property and the path "/feed/entry".
*
* If the value is an array containing other arrays or objects, each element in the array is passed into this function with
* the rowIndex incremented for each element.
*
* If the value is an array containing only scalar values, those values are joined together and inserted into the data array as
* a single value.
*
* If the value is a scalar, the value is inserted directly into the data array.
*/
function parseData_(headers, data, path, rowIndex, value, query, options, includeFunc) {
  var dataInserted = false;

  if (isObject_(value)) {
    // Declare key locally so the loop does not create an implicit global
    for (var key in value) {
      if (parseData_(headers, data, path + "/" + key, rowIndex, value[key], query, options, includeFunc)) {
        dataInserted = true;
      }
    }
  } else if (Array.isArray(value) && isObjectArray_(value)) {
    for (var i = 0; i < value.length; i++) {
      if (parseData_(headers, data, path, rowIndex, value[i], query, options, includeFunc)) {
        dataInserted = true;
        rowIndex++;
      }
    }
  } else if (!includeFunc || includeFunc(query, path, options)) {
    // Handle arrays containing only scalar values
    if (Array.isArray(value)) {
      value = value.join();
    }

    // Insert new row if one doesn't already exist
    if (!data[rowIndex]) {
      data[rowIndex] = new Array();
    }

    // Add a new header if one doesn't exist
    if (!headers[path] && headers[path] != 0) {
      headers[path] = Object.keys(headers).length;
    }

    // Insert the data
    data[rowIndex][headers[path]] = value;
    dataInserted = true;
  }

  return dataInserted;
}
/**
* Parses the headers array and inserts it into the first row of the data array.
*/
function parseHeaders_(headers, data) {
  data[0] = new Array();
  // Declare key locally so the loop does not create an implicit global
  for (var key in headers) {
    data[0][headers[key]] = key;
  }
}
/**
* Applies the transform function for each element in the data array, going through each column of each row.
*/
function transformData_(data, options, transformFunc) {
  for (var i = 0; i < data.length; i++) {
    for (var j = 0; j < data[i].length; j++) {
      transformFunc(data, i, j, options);
    }
  }
}
/**
* Returns true if the given test value is an object; false otherwise.
*/
function isObject_(test) {
  return Object.prototype.toString.call(test) === '[object Object]';
}
/**
* Returns true if the given test value is an array containing at least one object; false otherwise.
*/
function isObjectArray_(test) {
  for (var i = 0; i < test.length; i++) {
    if (isObject_(test[i])) {
      return true;
    }
  }
  return false;
}
/**
* Returns true if the given query applies to the given path.
*/
function includeXPath_(query, path, options) {
  if (!query) {
    return true;
  } else if (Array.isArray(query)) {
    for (var i = 0; i < query.length; i++) {
      if (applyXPathRule_(query[i], path, options)) {
        return true;
      }
    }
  } else {
    return applyXPathRule_(query, path, options);
  }
  return false;
}
/**
* Returns true if the rule applies to the given path.
*/
function applyXPathRule_(rule, path, options) {
  return path.indexOf(rule) == 0;
}
/**
* By default, this function transforms the value at the given row & column so it looks more like a normal data import. Specifically:
*
* - Data from parent JSON elements gets inherited to their child elements, so rows representing child elements contain the values
* of the rows representing their parent elements.
* - Values longer than 256 characters get truncated.
* - Values in row 0 (headers) have slashes converted to spaces, common prefixes removed and the resulting text converted to title
* case.
*
* To change this behavior, pass in one of these values in the options parameter:
*
* noInherit: Don't inherit values from parent elements
* noTruncate: Don't truncate values
* rawHeaders: Don't prettify headers
* debugLocation: Prepend each value with the row & column it belongs in
*/
function defaultTransform_(data, row, column, options) {
  if (!data[row][column]) {
    if (row < 2 || hasOption_(options, "noInherit")) {
      data[row][column] = "";
    } else {
      data[row][column] = data[row-1][column];
    }
  }

  if (!hasOption_(options, "rawHeaders") && row == 0) {
    if (column == 0 && data[row].length > 1) {
      removeCommonPrefixes_(data, row);
    }
    data[row][column] = toTitleCase_(data[row][column].toString().replace(/[\/\_]/g, " "));
  }

  if (!hasOption_(options, "noTruncate") && data[row][column]) {
    data[row][column] = data[row][column].toString().substr(0, 256);
  }

  if (hasOption_(options, "debugLocation")) {
    data[row][column] = "[" + row + "," + column + "]" + data[row][column];
  }
}
/**
* If all the values in the given row share the same prefix, remove that prefix.
*/
function removeCommonPrefixes_(data, row) {
  var matchIndex = data[row][0].length;

  for (var i = 1; i < data[row].length; i++) {
    matchIndex = findEqualityEndpoint_(data[row][i-1], data[row][i], matchIndex);
    if (matchIndex == 0) {
      return;
    }
  }

  for (var i = 0; i < data[row].length; i++) {
    data[row][i] = data[row][i].substring(matchIndex, data[row][i].length);
  }
}
/**
* Locates the index where the two strings values stop being equal, stopping automatically at the stopAt index.
*/
function findEqualityEndpoint_(string1, string2, stopAt) {
  if (!string1 || !string2) {
    return -1;
  }

  var maxEndpoint = Math.min(stopAt, string1.length, string2.length);

  for (var i = 0; i < maxEndpoint; i++) {
    if (string1.charAt(i) != string2.charAt(i)) {
      return i;
    }
  }

  return maxEndpoint;
}
/**
* Converts the text to title case.
*/
function toTitleCase_(text) {
  if (text == null) {
    return null;
  }
  return text.replace(/\w\S*/g, function(word) { return word.charAt(0).toUpperCase() + word.substr(1).toLowerCase(); });
}
/**
* Returns true if the given set of options contains the given option.
*/
function hasOption_(options, option) {
  return options && options.indexOf(option) >= 0;
}
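To make the flattening described in the comments above more concrete, here is a much-simplified sketch in plain JavaScript that runs outside of Sheets. It is not the library's code: `flattenToRows` and `isPlainObject` are hypothetical names, the sketch ignores the query and options parameters and the parent-value inheritance, and the sample object is invented for illustration.

```javascript
// Simplified illustration: flatten nested JSON into a header row plus data rows.
function isPlainObject(value) {
  return Object.prototype.toString.call(value) === '[object Object]';
}

function flattenToRows(json) {
  var headers = []; // column paths, in the order they are first seen
  var rows = [[]];  // rows[0] is reserved for the header row

  // Write one scalar into the cell for (rowIndex, path), adding a column if needed.
  function put(path, rowIndex, value) {
    var col = headers.indexOf(path);
    if (col === -1) {
      col = headers.push(path) - 1;
    }
    if (!rows[rowIndex]) {
      rows[rowIndex] = [];
    }
    rows[rowIndex][col] = value;
  }

  // Returns true if anything was written for this value.
  function walk(value, path, rowIndex) {
    var inserted = false;
    if (Array.isArray(value) && value.some(isPlainObject)) {
      // Arrays of objects: each element that produced data gets its own row.
      var row = rowIndex;
      value.forEach(function (item) {
        if (walk(item, path, row)) {
          inserted = true;
          row++;
        }
      });
    } else if (isPlainObject(value)) {
      // Objects: each property name extends the path.
      Object.keys(value).forEach(function (key) {
        if (walk(value[key], path + '/' + key, rowIndex)) {
          inserted = true;
        }
      });
    } else {
      // Scalars, and arrays of scalars joined into one value.
      put(path, rowIndex, Array.isArray(value) ? value.join() : value);
      inserted = true;
    }
    return inserted;
  }

  walk(json, '', 1);
  rows[0] = headers;
  return rows;
}

// Invented sample data, not the real feed.
var sample = { site: 8440, prices: [{ grade: 'A', price: 250 }, { grade: 'B', price: 240 }] };
console.log(JSON.stringify(flattenToRows(sample)));
// [["/site","/prices/grade","/prices/price"],[8440,"A",250],[null,"B",240]]
```

The empty first cell in the last row is the gap that the script's default transform fills by copying the value from the row above, unless the noInherit option is passed.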