Google Sheets IMPORTXML Returns Empty Value
Problem description
I'm trying to scrape this website (https://kamadan.gwtoolbox.com/) with Google Sheets to get material costs for a game that I play. There are two tables, "Common Materials" and "Rare Materials", selectable from a drop-down in the top right corner. I am trying to pull the values for both as the prices update. I copied the full XPath and used the function below in an empty cell on a sheet.
=importxml("https://kamadan.gwtoolbox.com/","/html/body/div[2]/div[1]/div/div[2]/table/tbody")
This returns a #N/A error saying that the imported content is empty.
I also tried it with the regular XPath:
=importxml("https://kamadan.gwtoolbox.com/","//*[@id='trader-overlay-items']")
This just returns a blank cell. I have also tried both methods on the ancestor and child elements, using Chrome's inspect function to copy their paths, and they return one of the two results above.
Sorry if this is a really easy one. I am not at all familiar with XPath or HTML. I mostly dabble in VBA in Excel.
Recommended answer
Answer:
IMPORTXML cannot retrieve data that is populated by a script, so this formula cannot be used to retrieve data from this table.
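To make the limitation concrete, here is a minimal Python sketch of what a client that does not run JavaScript sees. The page source below is a hypothetical stand-in, not the real markup of the site, and the item name and price in it are made up. The assumption is only that the table arrives as an empty shell and a script fills it in the browser.

```python
from html.parser import HTMLParser

# Hypothetical page source (NOT the real site markup): the table is an empty
# shell, and the script would add a row only when run by a browser.
PAGE_SOURCE = """
<html><body>
  <table id="trader-overlay-items"><tbody></tbody></table>
  <script>
    var row = document.createElement("tr");
    row.innerHTML = "<td>Iron Ingot</td><td>150</td>";
    document.querySelector("#trader-overlay-items tbody").appendChild(row);
  </script>
</body></html>
"""


class TableCellCollector(HTMLParser):
    """Collect the text of td/th cells inside the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Script bodies are passed here as plain text; they are never executed.
        if self.in_table and self.in_cell and data.strip():
            self.cells.append(data.strip())


def static_table_cells(html, table_id):
    """Return the cell texts present in the static HTML, without running scripts."""
    parser = TableCellCollector(table_id)
    parser.feed(html)
    parser.close()
    return parser.cells


print(static_table_cells(PAGE_SOURCE, "trader-overlay-items"))  # prints []
```

The parser finds the table but no cells, because the row only exists after a browser executes the script. A formula that fetches and parses the static source is in the same position.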
As you've already mentioned, you can attempt to get the data directly from the table using:
=IMPORTXML("https://kamadan.gwtoolbox.com/","//table[@id='trader-overlay-items']")
This just returns a blank cell.
I went a step further and tried to reverse-engineer this by calling IMPORTXML on the HTML elements on the page in steps:
=IMPORTXML("https://kamadan.gwtoolbox.com/","html")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body/div[1]")
=IMPORTXML("https://kamadan.gwtoolbox.com/","html/body/div[1]/div[0]")
...
html/body/div[1]/div[0] is the first path which gives no imported content. Note that XPath positions start at 1, so div[0] can never match an element and this empty result is not conclusive on its own. Importing html/body, however, shows that the full body does not contain the information, only a template of it. In cell B1 of the imported result we have references to 'Common materials' and 'Rare materials'.
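XPath counts element positions from 1, which is worth keeping in mind when walking down a page step by step as above. The following Python sketch illustrates this with the standard library's ElementTree, which supports a subset of XPath. The small document is a hypothetical stand-in with made-up ids, not the structure of the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a page (not the real site).
DOC = ET.fromstring(
    "<html><body>"
    "<div id='first'><div id='first-child'/></div>"
    "<div id='second'/>"
    "</body></html>"
)


def ids(path):
    """Return the id attribute of every element matched by the path."""
    return [element.get("id") for element in DOC.findall(path)]


# Positions are 1-based: [1] is the first matching sibling, [2] the second.
print(ids("body/div[1]"))         # ['first']
print(ids("body/div[2]"))         # ['second']
print(ids("body/div[1]/div[1]"))  # ['first-child']
```

Under this numbering, the first nested div would be reached in the sheet with a path ending in div[1]/div[1] rather than div[1]/div[0].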
And in D1 we start to see JavaScript and JSON objects. IMPORTXML does not execute these, so their results cannot be retrieved.
If you disable JavaScript on the site, you can see that almost nothing is actually rendered, which is why the content cannot be obtained using IMPORTXML.
I know this is generally bad news, but I hope this is helpful to you!