<div> tags inside <div> using importXML XPath query, in Google Spreadsheet
I'm using XPath in Google Docs to get the text inside a <div>.
I want to save the text inside <div id="job_description"> in one cell of a Google Docs spreadsheet, but it shows each <div> in a separate cell.
<div id="job_description">
<div>
<strong>
Basic Purpose:
</strong>
<br></br>
</div>
<div>
Work closely with developers, product owners and Q…
<br></br>
</div>
<div>
The Test Analyst is accountable for the developmen…
<br></br>
</div>
<div>
<strong>
Duties and Responsibilities:
</strong>
</div>
<ul>
<li></li>
<li></li>
</ul>
<div>
<strong>
Requirements:
</strong>
<br></br>
</div>
<ul>
<li></li>
<li></li>
</ul>
</div>
Image: http://i.stack.imgur.com/K0mAY.png
This is the formula I wrote:
=IMPORTXML(E4,"//div[@id='job_description']")
Could you help me put all of the text inside <div id="job_description"> (including the nested <div>, <ul>, ...) in only one cell?
Using JOIN is a good start, but you can make it a single operation.
You did not show the URL of the page you're importing, so I can only give you an example with another page. For instance, if you are importing www.w3.org and looking for a div where @class='event closed expand_block', use
=JOIN(CHAR(10),IMPORTXML("http://www.w3.org/","//div[@class='event closed expand_block']//text()"))
Notice that I also modified the XPath expression: //text() makes sure only descendant text nodes are retrieved, that is, all the text.
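To see what "descendant text nodes, joined into one value" amounts to, here is a minimal Python sketch. It is not what Google Sheets runs internally. The sample markup is a trimmed-down version of the question's HTML with hypothetical list items filled in, and it assumes the markup parses as strict XML. The standard library's ElementTree does not support text() in its XPath subset, so itertext() is used as a stand-in for //text(). The function name text_in_one_cell is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the markup from the question. The <li> contents
# are hypothetical placeholders; real pages are HTML and may not parse as XML.
SAMPLE = """
<html><body>
<div id="job_description">
  <div><strong>Basic Purpose:</strong><br></br></div>
  <div>Work closely with developers<br></br></div>
  <ul><li>First duty</li><li>Second duty</li></ul>
</div>
</body></html>
"""


def text_in_one_cell(markup: str, div_id: str) -> str:
    """Collect all descendant text of the div with the given id into one string."""
    root = ET.fromstring(markup.strip())
    # ElementTree's limited XPath can select the element by attribute ...
    target = root.find(f".//div[@id='{div_id}']")
    if target is None:
        return ""
    # ... and itertext() walks every descendant text node in document order,
    # which stands in for the //text() step of the XPath expression.
    pieces = [t.strip() for t in target.itertext()]
    # Drop whitespace-only nodes, then join with a line feed (code point 10).
    return chr(10).join(p for p in pieces if p)


cell = text_in_one_cell(SAMPLE, "job_description")
print(cell)
```

Adapted to the formula in the question, the same idea would presumably read =JOIN(CHAR(10),IMPORTXML(E4,"//div[@id='job_description']//text()")), assuming E4 holds the page URL as in the original formula.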
EDIT: Responding to your comment:
May I know what is CHAR(10) referring to?
Yes, of course. CHAR returns a character and takes a number as input. In the case of CHAR(10), a newline (line feed) character is returned (I assume because of &#10;).
In the formula, CHAR(10) is used as the first argument of JOIN, which is the delimiter placed between the values that are to be joined.
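The delimiter role of CHAR(10) can be illustrated outside the spreadsheet as well. The short Python sketch below uses chr(10) as the counterpart of CHAR(10) and str.join as a rough analogue of JOIN. The list of pieces is hypothetical and only stands in for the values IMPORTXML would return.

```python
# chr(10) is Python's counterpart to CHAR(10): code point 10, the line feed.
separator = chr(10)

# Hypothetical pieces standing in for the values IMPORTXML would return.
pieces = ["Basic Purpose:", "Duties and Responsibilities:", "Requirements:"]

# The separator goes between the pieces, not before the first or after the last.
joined = separator.join(pieces)
print(repr(joined))
```

Because the separator is a line feed, the joined value shows up as several lines inside a single cell rather than being spread across several cells.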