Link terms on page to Wikipedia articles in pure JavaScript


Question

While browsing I came across this blog post about using the Wikipedia API from JavaScript to link a single search term to its definition. At the end of the blog post the author mentions possible extensions, including:



A plugin which auto links terms to Wikipedia articles.


This fits the bill perfectly for a project requirement I'm working on, but sadly I lack the programming skills to extend the original source code. What I'd like is to have a pure JavaScript snippet I can add to a webpage, that links all the terms on that webpage that have an article on an internal wiki to that wiki.


I know this might be asking for much, but the code looks like it's nearly there, and I'd be willing to add a bounty if anyone will do the remaining work for that virtual credit.. ;) I also suspect this might be of value to a few others, as I've seen similar requests but no working implementation (that's a mere JavaScript (and therefore portable) library/snippet include).

Here's a sample of the original source code. I hope someone can add to it, or point me to what I'd need to add if I were to implement this myself (in which case I'll share the code if I manage to put something together).

<script type="text/javascript"><!--
var spellcheck = function (data) {
    var found = false; var url=''; var text = data [0];
    if (text != document.getElementById ('spellcheckinput').value)
        return;
    for (var i = 0; i < data [1].length; i++) { // declare i locally instead of leaking a global
        if (text.toLowerCase () == data [1] [i].toLowerCase ()) {
            found = true;
            url ='http://en.wikipedia.org/wiki/' + text;
            document.getElementById ('spellcheckresult').innerHTML = '<b style="color:green">Correct</b> - <a target="_top" href="' + url + '">link</a>';
        }
    }
    if (! found)
        document.getElementById ('spellcheckresult').innerHTML = '<b style="color:red">Incorrect</b>';
};

var getjs = function (value) {
    if (! value)
        return;
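    // URL-encode the search term so spaces and '&' cannot corrupt the query string
    var url = 'http://en.wikipedia.org/w/api.php?action=opensearch&search=' + encodeURIComponent (value) + '&format=json&callback=spellcheck';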
    document.getElementById ('spellcheckresult').innerHTML = 'Checking ...';
    var elem = document.createElement ('script');
    elem.setAttribute ('src', url);
    elem.setAttribute ('type','text/javascript');
    document.getElementsByTagName ('head') [0].appendChild (elem);
};--></script>
<form action="#" method="get" onsubmit="return false"> 
<p>Enter a word - <input id="spellcheckinput" onkeyup="getjs (this.value);" type="text"> <span id="spellcheckresult"></span></p></form>








Update
As pointed out in the comments, both the time it would take to link all words and how to handle article names that span multiple words are concerns of mine as well.


I'd think starting with single word articles would already cover a large percentage of the use cases, with maybe some performance benefits gained when skipping the 500 most common words in the English language, but still I'm uncertain how feasible this approach will be..


On the upside however this would all be client side, and some delay in linking terms is fully acceptable.


Alternatively searching for terms the mouse is hovering over / selected might be acceptable as well, but I'm unsure if this would decrease or increase complexity..

Update 2

'Pointy' explained below that this functionality could be achieved by altering some fairly standard highlighting scripts, after having obtained a list of article topics from api.php?action=query&list=allpages.
To reiterate: we're using an internal wiki, so the list of articles is likely limited, unambiguous and domain-specific enough to overcome some of the expected problems in matching words.


Since we've had some good suggestions so far, and a few workable ideas, I'm starting a bounty to see if I can get a few answers on this..

Recommended answer


Perhaps something like this might help:


Assuming very simple HTML/Text like so:

<div id="theText">Testing the auto link system here...</div>


And two very small scripts.

dictionary.js sets up the list of your terms. My thought was that this could be generated in PHP by querying the articles database if you wanted. It can also be loaded cross-domain (as it sets window.termsRE). If you don't need to generate the list from the database, you could also put it in termlinker.js by hand.

The code that generates the RegExp assumes that your terms array contains strings that are already properly formatted for use in a regular expression, so be sure to use \\ to escape any of the characters []\.?*+|(){}^$

// dictionary.js - define some terms
var terms = ['testing', 'auto link'];
window.termsRE = new RegExp("\\b("+terms.join("|")+")\\b",'gi');
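If escaping each term by hand is inconvenient, the escaping can be done in code. The following is only a sketch: the helper names escapeRegExp and buildTermsRegExp are made up for this illustration and are not part of the original answer. It also sorts the terms longest first, on the assumption that a multi-word term such as "auto link" should win over a shorter term such as "auto" that starts at the same position.

```javascript
// Hypothetical helpers, not part of the original dictionary.js.

// Backslash-escape every character that is special in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Build one case-insensitive, global RegExp that matches any of the terms.
// Longer terms are listed first so that "auto link" is tried before "auto".
function buildTermsRegExp(terms) {
  var sorted = terms.slice().sort(function (a, b) {
    return b.length - a.length;
  });
  return new RegExp('\\b(' + sorted.map(escapeRegExp).join('|') + ')\\b', 'gi');
}

var termsRE = buildTermsRegExp(['auto', 'auto link', 'Node.js']);
var matches = 'Testing the auto link system'.match(termsRE);
console.log(matches); // the single match is "auto link", not "auto"
```

Note that \b only works as a boundary next to letters, digits and underscores, so terms that begin or end with punctuation would need a different boundary test.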

termlinker.js is just a simple regexp search-and-replace on the defined terms. It could be an inline <script> too. It requires that dictionary.js has been loaded before it runs.

// termlinker.js - add some tags
var element = document.getElementById("theText");

element.innerHTML = element.innerHTML.replace(termsRE, function(term) {
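  // encodeURIComponent replaces the deprecated escape(); the href is double-quoted so an apostrophe in a term cannot end the attribute
  return '<a href="http://en.wikipedia.org/wiki/' + encodeURIComponent(term) + '">' + term + '</a>';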
}); 


This simply searches for any words in the terms array and replaces them with a link to the term. Of course, it will also match properties and values inside HTML tags, which could break your markup a little.
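One way to avoid touching tag names and attribute values is to run the replacement on text nodes only instead of on innerHTML. The sketch below is an assumption about how that could look, not the approach used in the answer's code; splitByTerms and linkTextNodes are hypothetical names, and the regular expression must carry the g flag.

```javascript
// Hypothetical helper: split a plain string into pieces and mark the pieces
// that matched a term. Pure function, no DOM needed.
function splitByTerms(text, termsRE) {
  if (!termsRE.global) throw new Error('termsRE needs the g flag');
  var pieces = [];
  var last = 0;
  var m;
  termsRE.lastIndex = 0;
  while ((m = termsRE.exec(text)) !== null) {
    if (m[0] === '') { termsRE.lastIndex++; continue; } // skip empty matches
    if (m.index > last) {
      pieces.push({ text: text.slice(last, m.index), isTerm: false });
    }
    pieces.push({ text: m[0], isTerm: true });
    last = m.index + m[0].length;
  }
  if (last < text.length) pieces.push({ text: text.slice(last), isTerm: false });
  return pieces;
}

// Hypothetical helper: wrap matched terms in <a> elements, visiting text nodes only.
function linkTextNodes(root, termsRE, baseUrl) {
  var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, null);
  var nodes = [];
  var node;
  while ((node = walker.nextNode())) {
    // Skip text whose direct parent is already a link, a script or a style block.
    var tag = node.parentNode.nodeName;
    if (tag !== 'A' && tag !== 'SCRIPT' && tag !== 'STYLE') nodes.push(node);
  }
  nodes.forEach(function (textNode) {
    var pieces = splitByTerms(textNode.nodeValue, termsRE);
    if (!pieces.some(function (p) { return p.isTerm; })) return;
    var fragment = document.createDocumentFragment();
    pieces.forEach(function (piece) {
      var child = document.createTextNode(piece.text);
      if (piece.isTerm) {
        var a = document.createElement('a');
        a.href = baseUrl + encodeURIComponent(piece.text);
        a.appendChild(child);
        child = a;
      }
      fragment.appendChild(child);
    });
    textNode.parentNode.replaceChild(fragment, textNode);
  });
}

// Browser-only usage, assuming the <div id="theText"> from above exists.
if (typeof document !== 'undefined' && document.getElementById('theText')) {
  linkTextNodes(document.getElementById('theText'),
                /\b(testing|auto link)\b/gi,
                'http://en.wikipedia.org/wiki/');
}
```

Because the links are built with DOM methods rather than string concatenation, the matched text never has to be parsed as HTML again.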

Everything thrown together: (jsbin preview)

Based on the "minimum case" from before, here is a code sample that uses the API to receive the list of terms directly, along with the jsbin preview:

// Utility Function
RegExp.escape = function(text) {
  // Backslash-escape every character that has a special meaning in a
  // regular expression (avoids arguments.callee, which fails in strict mode).
  return text.replace(/[\/.*+?|()\[\]{}\\^$]/g, '\\$&');
};

// JSONP Callback for receiving the API
function receiveAPI(data) {
  var terms = [];
  if (!data || !data['query'] || !data['query']['allpages']) return false;  
  var pages = data.query.allpages;
  // allpages is an array, so use an indexed loop rather than for...in
  for (var x = 0; x < pages.length; x++) {
    terms.push(RegExp.escape(pages[x].title));
  }
  window.termsRE = new RegExp("\\b("+terms.reverse().join("|")+")\\b",'gi');
  linkterms();
}  

function linkterms() {
  var element = document.getElementById("theText");

  element.innerHTML = element.innerHTML.replace(termsRE, function(term) {
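    // encodeURIComponent replaces the deprecated escape(); the href is double-quoted so an apostrophe in a title cannot end the attribute
    return '<a href="http://en.wikipedia.org/wiki/' + encodeURIComponent(term) + '">' + term + '</a>';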
  });
}


// the apfrom=testing can be removed, it is only there so that
// we can get some useful terms near "testing" to work with.
// we are limited to 500 terms for the purpose of this demo:
var url = 'http://en.wikipedia.org/w/api.php?action=query&list=allpages&aplimit=500&format=json&callback=receiveAPI' + '&apfrom=testing';
var elem = document.createElement('script');
elem.setAttribute('src', url);
elem.setAttribute('type','text/javascript');
document.getElementsByTagName('head')[0].appendChild (elem);
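The demo above stops at 500 titles. For a wiki with more pages than that, the list has to be fetched in several requests. The sketch below assumes a reasonably recent MediaWiki that returns a `continue` object whose fields are sent back with the next request; older installations may use a different continuation format. buildAllPagesUrl, collectTitles and the example host are hypothetical, and `origin=*` only helps if the wiki allows anonymous cross-origin requests.

```javascript
// Hypothetical helper: build the allpages query URL, merging in the
// continuation parameters returned by the previous response (if any).
function buildAllPagesUrl(apiBase, cont) {
  var params = {
    action: 'query', list: 'allpages', aplimit: '500',
    format: 'json', origin: '*'
  };
  Object.keys(cont).forEach(function (key) { params[key] = cont[key]; });
  var query = Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
  return apiBase + '?' + query;
}

// Hypothetical helper: collect titles across several responses.
// fetchPage(cont) must return a Promise of the parsed JSON response;
// it is injected so the loop can be tried without a network connection.
async function collectTitles(fetchPage, maxRequests) {
  var titles = [];
  var cont = {};
  for (var i = 0; i < maxRequests; i++) {
    var data = await fetchPage(cont);
    var pages = (data && data.query && data.query.allpages) || [];
    pages.forEach(function (page) { titles.push(page.title); });
    if (!data || !data['continue']) break; // no more pages to fetch
    cont = data['continue'];
  }
  return titles;
}

// Possible browser usage (the wiki host is a placeholder, not a real address):
// collectTitles(function (cont) {
//   return fetch(buildAllPagesUrl('https://wiki.example/w/api.php', cont))
//     .then(function (response) { return response.json(); });
// }, 20).then(function (titles) { /* build termsRE from titles */ });
```

The maxRequests cap is a deliberate safety limit, so a very large wiki does not trigger an unbounded number of requests from every page view.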
