Fetch excerpt from Wikipedia article?

Views: 138 Published: 2016/5/22 19:23:16 Tags: api parsing wikipedia wikipedia-api


Question

I've been up and down the Wikipedia API, but I can't figure out if there's a nice way to fetch the excerpt of an article (usually the first paragraph). It would be nice to get the HTML formatting of that paragraph, too.

The only way I currently see of getting something that resembles a snippet is by performing a fulltext search (example: http://en.wikipedia.org/w/api.php?format=xmlfm&action=query&list=search&srsearch=Fight+Club&srlimit=1), but that's not really what I want (too short).

Is there any other way to fetch the first paragraph of a Wikipedia article than barbarically parsing HTML/WikiText?

Recommended answer

I found no way of doing this through the API, so I resorted to parsing HTML, using PHP's DOM functions. This was pretty easy, something along the lines of:

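// $wikiPage is assumed to hold the HTML of the article page, fetched beforehand.
libxml_use_internal_errors(true); // keep parser warnings on real-world HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($wikiPage);
$xpath = new DOMXPath($doc);
// Select the paragraphs that are direct children of the body content container.
$nlPNodes = $xpath->query('//div[@id="bodyContent"]/p');
$nFirstP = $nlPNodes->item(0);
if ($nFirstP !== null) {
    $sFirstP = $doc->saveXML($nFirstP);
    echo $sFirstP; // echo the first paragraph of the wiki article, including <p></p>
}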
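
To try the DOM approach above without downloading a page, it can be wrapped in a small function and run against an inline string. This is only a sketch: the function name firstParagraph and the sample markup are made up for illustration, and the XPath assumes the page layout from the answer, where the paragraphs are direct children of the div with id "bodyContent". A live page may be structured differently.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first <p>
// directly under div#bodyContent, serialized as markup, or null if none exists.
function firstParagraph(string $html): ?string
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the previous setting

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//div[@id="bodyContent"]/p')->item(0);

    return $node === null ? null : $doc->saveXML($node);
}

// Made-up sample markup standing in for a downloaded article page.
$sample = '<html><body><div id="bodyContent">'
        . '<p>First <b>paragraph</b>.</p><p>Second paragraph.</p>'
        . '</div></body></html>';

echo firstParagraph($sample), PHP_EOL;
```

Returning null when no paragraph matches lets the caller decide how to handle a page whose layout does not fit the XPath, instead of failing inside saveXML.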
