Wikimedia API: getting relevant data from a JSON string

Question

This is the question I asked yesterday. I was able to get the required data. The final data is like this. Please follow this link.

I tried the following code to get all the infobox data:

content = content.split("}}\n");
for(k in content)
{
    if(content[k].search("Infobox")==2)
    {
        var infobox = content[k];
        alert(infobox);
        infobox = infobox.replace("{{","");
        alert(infobox);
        infobox = infobox.split("\n|");
        //alert(infobox[0]);
        var infohtml="";
        for(l in infobox)
        {
            if(infobox[l].search("=")>0)
            {
                var line = infobox[l].split("=");

                infohtml = infohtml+"<tr><td>"+line[0]+"</td><td>"+line[1]+"</td></tr>";

            }
        }
        infohtml="<table>"+infohtml+"</table>";
        $('#con').html(infohtml);
        break;
    }
}

I initially thought each element was enclosed in {{ }}, so I wrote this code. But I was not able to get the entire infobox data with it. There is this element

{{Sfn|National Informatics Centre|2005}}

occurring, which ends my infobox data.

It seems to be far simpler without using JSON. Please help me.

Recommended answer

Have you tried DBpedia? AFAIK, they provide template usage information. There is also a toolserver tool named Templatetiger, which extracts templates from the static dumps (not live).

However, I once wrote a tiny snippet to extract templates from wikitext in JavaScript:

var title; // name of the template to extract
var wikitext; // wikitext of the page
// Matches {{title ...}}, allowing at most one level of nested {{...}} or [[...]].
var templateRegexp = new RegExp("{{\\s*"+(title.indexOf(":")>-1?"(?:Vorlage:|Template:)?"+title:title)+"([^[\\]{}]*(?:{{[^{}]*}}|\\[?\\[[^[\\]]*\\]?\\])?[^[\\]{}]*)+}}", "g");
// Matches one "|parameter" chunk, again allowing one nested template or link.
var paramRegexp = /\s*\|[^{}|]*?((?:{{[^{}]*}}|\[?\[[^[\]]*\]?\])?[^[\]{}|]*)*/g;
// replace() is used only to iterate over the matches; its return value is ignored.
wikitext.replace(templateRegexp, function(template){
    var parameters = template.match(paramRegexp);
    if (!parameters) {
        console.log("Template without parameters:\n" + template);
        parameters = [];
    }
    var unnamed = 1; // counter for positional (unnamed) parameters
    var p = parameters.reduce(function(map, line) {
        line = line.replace(/^\s*\|/,"");
        var i = line.indexOf("=");
        // Without "=", i is -1 and substr(0, -1) is "", so the numeric key is used.
        map[line.substr(0,i).trim() || unnamed++] = line.substr(i+1).trim();
        return map;
    }, {});
    // you have an object "p" in here containing the template parameters
});

It handles one level of nested templates, but is still very error-prone. Parsing wikitext with regular expressions is as evil as trying to do it with HTML :-)
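For comparison, here is a minimal sketch of a different technique from the regexp above: counting brace depth while scanning the text, so that a nested template such as {{Sfn|...}} no longer ends the infobox early. The function name `extractTemplate` and its arguments are hypothetical, not part of the original answer, and the sketch assumes plain `{{ }}` pairs only (it does not handle `{{{parameter}}}` triple braces, comments, or `<nowiki>` sections).

```javascript
// Hypothetical helper: returns the full text of the first template whose name
// starts with `name`, including any nested templates, or null if none is found.
function extractTemplate(wikitext, name) {
    var start = -1;
    var search = 0;
    // Find the first "{{" that is followed (after optional whitespace) by `name`.
    while ((start = wikitext.indexOf("{{", search)) !== -1) {
        var after = wikitext.slice(start + 2).replace(/^\s+/, "");
        if (after.toLowerCase().indexOf(name.toLowerCase()) === 0) break;
        search = start + 2;
    }
    if (start === -1) return null;

    // Walk forward, tracking how many "{{" are currently open.
    var depth = 0;
    for (var i = start; i < wikitext.length - 1; i++) {
        var pair = wikitext.substr(i, 2);
        if (pair === "{{") {
            depth++;
            i++; // skip the second brace of the pair
        } else if (pair === "}}") {
            depth--;
            i++;
            if (depth === 0) return wikitext.slice(start, i + 1);
        }
    }
    return null; // braces never balanced
}
```

Calling `extractTemplate(content, "Infobox")` on the page wikitext would return the whole infobox as one string, which could then be split into parameters.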

It may be easier to query the parse tree from the API: api.php?action=query&prop=revisions&rvprop=content&rvgeneratexml=1&titles=.... From that parse tree you will be able to extract the templates easily.
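As a rough sketch, the query above could be assembled like this. The function name `buildParseTreeUrl`, the `apiBase` argument, and the added `format=json` parameter are assumptions for illustration; only the other parameters come from the answer. As far as I know, newer MediaWiki versions no longer support `rvgeneratexml` and offer `action=parse` with `prop=parsetree` instead, so check the API help of the wiki you are querying.

```javascript
// Hypothetical helper: builds the parse-tree query URL for one page title.
// `apiBase` is an assumption, e.g. "https://en.wikipedia.org/w/api.php".
function buildParseTreeUrl(apiBase, pageTitle) {
    var params = new URLSearchParams({
        action: "query",
        prop: "revisions",
        rvprop: "content",
        rvgeneratexml: "1",   // ask for the parse tree XML alongside the content
        titles: pageTitle,
        format: "json"        // assumption: not in the original query string
    });
    // URLSearchParams takes care of escaping the title.
    return apiBase + "?" + params.toString();
}
```

The resulting URL can be passed to whatever request function you already use to fetch the page content.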
