Extract HTML content using YQL?

Views: 133 | Published: 2019/11/26 19:03:21 | Tags: javascript, json, yahoo, yql


Problem description

Let's say I want to extract data from a web page with the following markup:

<table>
  <tr>
    <td><a href="Link 1">Column 1 Text</a></td>
    <td>Column 2 Text</td>
    <td>Column 3 Text</td>
  </tr>
  <tr>
    <td><a href="Link 2">Column 1 Text</a></td>
    <td>Column 2 Text</td>
    <td>Column 3 Text</td>
  </tr>
  ...
</table>

and turn it into JSON like this:

[
  {
    link: 'Link 1',
    text: 'Column 1 Text',
    data: 'Column 3 Text'
  },
  {
    link: 'Link 2',
    text: 'Column 1 Text',
    data: 'Column 3 Text'
  }
]

Can this be done with YQL? If so, please give me an example query.

Any help would be appreciated!

Recommended answer

A good starting point is a query that uses YQL's html table along with an XPath expression (see Extracting HTML Content With XPath for more details on this technique).
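As a rough sketch of what such a query might look like, the JavaScript below builds a YQL statement against the html table and wraps it in a request URL. The page URL, the XPath `//table/tr`, and the REST endpoint are all assumptions made for illustration, not necessarily the exact query behind the results shown in this answer.

```javascript
// Hypothetical page URL; substitute the page that actually holds the table.
const pageUrl = "http://example.com/table.html";

// YQL statement: the html table fetches the page, and the XPath
// expression keeps only the table rows (assumed: //table/tr).
const yql = `select * from html where url="${pageUrl}" and xpath="//table/tr"`;

// Assumed public YQL REST endpoint, asking for JSON output.
const requestUrl =
  "https://query.yahooapis.com/v1/public/yql?q=" +
  encodeURIComponent(yql) +
  "&format=json";

console.log(requestUrl);
```

Keeping the statement and the request URL separate makes it easy to adjust the XPath (for example, to target one specific table) without touching the URL encoding.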

The query produces JSON results like this:

{
 "query": {
  "count": 2,
  "created": "2012-01-06T20:16:46Z",
  "lang": "en-US",
  "results": {
   "tr": [
    {
     "td": [
      {
       "a": {
        "href": "Link%201",
        "content": "Column 1 Text"
       }
      },
      {
       "p": "Column 2 Text"
      },
      {
       "p": "Column 3 Text"
      }
     ]
    },
    {
     "td": [
      {
       "a": {
        "href": "Link%202",
        "content": "Column 1 Text"
       }
      },
      {
       "p": "Column 2 Text"
      },
      {
       "p": "Column 3 Text"
      }
     ]
    }
   ]
  }
 }
}
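The result above is not yet in the link/text/data shape asked for in the question, so a small post-processing step is still needed on the client. Here is a minimal sketch in JavaScript, assuming the response has exactly the structure shown above. The function name `toRows` is hypothetical, and the handling of an empty or single-row result is an assumption about how the service serializes those cases.

```javascript
// Hypothetical helper: reshape a response structured like the sample above
// into [{ link, text, data }, ...].
function toRows(response) {
  const results = response.query.results;
  // Assumption: no matching rows yields a null "results" value.
  if (!results) return [];
  // Assumption: a single matching row may arrive as an object, not an array.
  const rows = Array.isArray(results.tr) ? results.tr : [results.tr];
  return rows.map(function (tr) {
    const cells = tr.td;
    return {
      link: decodeURIComponent(cells[0].a.href), // "Link%201" becomes "Link 1"
      text: cells[0].a.content,
      data: cells[2].p // third column; the second column is skipped
    };
  });
}

// Usage with a one-row response in the same shape as the sample.
const sampleResponse = {
  query: {
    count: 1,
    results: {
      tr: {
        td: [
          { a: { href: "Link%201", content: "Column 1 Text" } },
          { p: "Column 2 Text" },
          { p: "Column 3 Text" }
        ]
      }
    }
  }
};
console.log(JSON.stringify(toRows(sampleResponse), null, 2));
```

The helper reads cells by position, so it only works while the link stays in the first column and the wanted data in the third.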
