Given a list of company names, how to fetch the website URL, year established, number of employees, etc.

Question

I have a list of company names, such as Microsoft Corp and Kimberly Clark Corporation, and for each company I would like to retrieve fields such as:


  1. Its company logo
  2. A geographic identifier for Google Maps
  3. Website URL
  4. Year established
  5. Stock exchange and ticker symbol
  6. A way to get the stock prices over the last few days
  7. The "About" text or abstract from Wikipedia
  8. A list of subsidiaries and parent companies. For Boeing, for instance, that would include Jeppesen and Aviall, Inc.

I have looked into SPARQL and DBpedia. Any suggestions on how to write a SPARQL query that retrieves some of this information? (I don't need all of the fields; a couple would be enough to get me started.)

Thanks!

Recommended answer

You can start with a query like this:

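# The public DBpedia endpoint predefines these prefixes; they are declared
# here so the query also runs on endpoints that do not.
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>

select * where {
  # Apple_Inc. ends in a dot, so it is written as a full IRI rather than a prefixed name.
  values ?company { dbpedia:Microsoft
                    <http://dbpedia.org/resource/Apple_Inc.>
                    dbpedia:Kimberly-Clark
                  }
  # Logo: accept either property, but only when the value is an IRI.
  OPTIONAL { { ?company dbpprop:logo ?logo  FILTER(isIRI(?logo)) }
             UNION
             { ?company foaf:depiction ?logo FILTER(isIRI(?logo)) } }
  # English abstract only.
  OPTIONAL { ?company dbpedia-owl:abstract ?abstract
             FILTER(langMatches(lang(?abstract),"EN")) }
  OPTIONAL { ?company geo:lat ?latitude ;
                      geo:long ?longitude }
  OPTIONAL { ?company dbpedia-owl:foundingDate ?foundingDate }
  OPTIONAL { ?company dbpedia-owl:wikiPageExternalLink ?externalLink }
  OPTIONAL { ?company dbpprop:symbol ?stockSymbol }
  OPTIONAL { ?company dbpedia-owl:subsidiary ?subsidiaryPage }
}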
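For a longer list of companies, the values block can be generated rather than typed by hand. Here is a minimal Python sketch of that idea. It assumes you already know each company's DBpedia resource name (mapping a raw name such as "Microsoft Corp" to its resource is a separate problem that this does not solve). The helper names build_values_block and build_query, and the trimmed-down set of properties in the template, are illustrative choices and not part of the query above.

```python
from typing import Iterable

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

# Shortened query template (assumption: only two of the optional fields are kept
# to keep the sketch small). Doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """\
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

SELECT * WHERE {{
  VALUES ?company {{
{values}
  }}
  OPTIONAL {{ ?company dbpedia-owl:foundingDate ?foundingDate }}
  OPTIONAL {{ ?company dbpprop:symbol ?stockSymbol }}
}}
"""

# Characters that may not appear inside an IRI written as <...> in SPARQL.
FORBIDDEN = set('<>" {}|\\^`')


def build_values_block(resource_names: Iterable[str]) -> str:
    """Render each resource name as a full IRI, one per line."""
    lines = []
    for name in resource_names:
        if FORBIDDEN & set(name):
            raise ValueError(f"not usable inside an IRI: {name!r}")
        # Full IRIs avoid prefixed-name pitfalls such as the trailing dot in Apple_Inc.
        lines.append(f"    <{DBPEDIA_RESOURCE}{name}>")
    return "\n".join(lines)


def build_query(resource_names: Iterable[str]) -> str:
    """Fill the template's VALUES block with the given resources."""
    return QUERY_TEMPLATE.format(values=build_values_block(resource_names))


if __name__ == "__main__":
    print(build_query(["Microsoft", "Apple_Inc.", "Kimberly-Clark"]))
```

The resulting string can be sent to any SPARQL endpoint. Writing every resource as a full IRI means names with awkward characters need no special casing.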

SPARQL results

I based this on the properties I saw on the DBpedia pages for Microsoft, Kimberly-Clark, and Apple Inc. The data isn't particularly clean, so I added a few filters to the query. Some caveats:


  • Not all of these companies list subsidiaries, and the subsidiary property for Microsoft doesn't point to a subsidiary but to a page that presumably enumerates some of them.

Some of the companies have bad information for the logos (hence the FILTERs with isIRI). For instance, Apple's dbpprop:logo is the integer 150. I think that comes from the Wikipedia infobox line | logo = [[File:{{#property:p154}}|150px]], where 150 is being extracted instead of a more meaningful value. Filtering with isIRI helps a little.

Some of the companies have multiple founding dates. I'm not sure how you would decide which one to use.
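Because each OPTIONAL can match more than once, the result set holds one row per combination of values, so a company with two founding dates and five external links can come back as ten rows. One way to deal with that on the client side is to collapse the rows per company first and then apply whatever selection rule suits you. The Python sketch below assumes the results arrive in the standard SPARQL JSON results format. The function names, the sample data (a made-up company with made-up dates), and the "earliest date wins" rule are all hypothetical; nothing in the data says the earliest date is the right one.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


def group_by_company(results: dict) -> Dict[str, Dict[str, Set[str]]]:
    """Collapse SPARQL JSON result rows into distinct values per company."""
    grouped: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for row in results["results"]["bindings"]:
        company = row["company"]["value"]
        for var, binding in row.items():
            if var != "company":
                # Unbound OPTIONAL variables are simply absent from the row.
                grouped[company][var].add(binding["value"])
    return grouped


def earliest_founding_date(values: Set[str]) -> Optional[str]:
    """Pick the smallest date string.

    Assumes every value has the same ISO 8601 shape (YYYY-MM-DD), in which
    case string order equals chronological order.
    """
    return min(values) if values else None


# Made-up sample in SPARQL JSON results format, for illustration only.
SAMPLE = {
    "head": {"vars": ["company", "foundingDate"]},
    "results": {"bindings": [
        {"company": {"type": "uri", "value": "http://example.org/ExampleCo"},
         "foundingDate": {"type": "literal", "value": "2001-05-01"}},
        {"company": {"type": "uri", "value": "http://example.org/ExampleCo"},
         "foundingDate": {"type": "literal", "value": "1999-01-01"}},
    ]},
}

if __name__ == "__main__":
    for company, fields in group_by_company(SAMPLE).items():
        print(company, earliest_founding_date(fields["foundingDate"]))
```

Keeping the full set of values per field, rather than discarding all but one in the query, lets you change the selection rule later without re-querying.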

While the company's website is usually listed as an external link, not all of the external links associated with a page are the company's website. I'm not sure how you could reliably pick out the one that is.
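One rough heuristic, offered only as a sketch, is to keep the links whose host name contains the first word of the company name and, among those, prefer the one with the shortest path. The function guess_homepage and the example links are hypothetical. The rule will miss companies whose domain does not resemble their name, and it can pick the wrong link when several hosts match.

```python
from typing import Iterable, Optional
from urllib.parse import urlparse


def _normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def guess_homepage(company_label: str, links: Iterable[str]) -> Optional[str]:
    """Return the matching link with the shortest path, or None if no host matches."""
    words = company_label.split()
    key = _normalize(words[0]) if words else ""
    if not key:
        return None
    # Compare against the normalized host so "kimberly-clark.com" matches "Kimberly-Clark".
    candidates = [
        link for link in links
        if key in _normalize(urlparse(link).hostname or "")
    ]
    return min(candidates, key=lambda link: len(urlparse(link).path), default=None)


if __name__ == "__main__":
    example_links = [
        "http://www.example.org/microsoft-history",
        "https://news.microsoft.com/some/long/path",
        "http://www.microsoft.com/",
    ]
    print(guess_homepage("Microsoft Corp", example_links))
```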

All that said, it looks like you can get a lot of this information from DBpedia.
