Statistics about "Microformat vs HTML+RDFa" adoption


Problem description

Are there any recent and reliable statistics about the "Web use" of these standards (web pages using one standard or the other)?

Or any specific statistics about the scope of use of vCard (person and/or organization)?

Statistics only: this question is not about "what is the best idea?" or "how to use it?". I am looking for numbers that compare the adoption of Microformats with the adoption of (any kind of) RDFa in HTML.

For "counting pages" statistics, we can consider Microdata a kind of RDFa-HTML.


NOTES

Explain context

RDFa Lite is the only W3C Recommendation when we talk about "Microdata vs Microformat", and Microdata maps more closely to RDFa Lite. HTML5 became a W3C Recommendation on 2014-10-28, and neither one was blessed by the W3C. I understand that schema.org is the best way to adopt RDFa (by reusing community schemas).

On the other hand, Microformats are older and simpler; so perhaps they are the most used on the Web (!? are they?).

About "vCard data statistics"

If we need some scope for the statistics, let's use vCard as scope:

  • Microformats' hCard and h-Card are standards for displaying vCards in (any) HTML, and are used for people and organizations.

  • schema.org's Person and Organization encode vCard information with (standard) RDFa Lite or Microdata.
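To make the two bullets above concrete, here is a minimal sketch of how a counter could tell the syntaxes apart on a single page. It uses only Python's standard `html.parser`. The three sample snippets, the `MarkupSniffer` class and the `sniff` function are hypothetical illustrations written for this post, not taken from any report, and the detection rules are deliberately crude (a real extractor such as Any23 does far more).

```python
from html.parser import HTMLParser

# Hypothetical sample snippets: the same person encoded three ways.
HCARD = '<div class="vcard"><span class="fn">Jane Doe</span></div>'
RDFA = ('<div vocab="http://schema.org/" typeof="Person">'
        '<span property="name">Jane Doe</span></div>')
MICRODATA = ('<div itemscope itemtype="http://schema.org/Person">'
             '<span itemprop="name">Jane Doe</span></div>')


class MarkupSniffer(HTMLParser):
    """Collects which structured-data syntaxes appear in one HTML document."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        # Classic hCard uses class="vcard"; microformats2 uses class="h-card".
        if "vcard" in classes or "h-card" in classes:
            self.found.add("microformat")
        # Microdata marks items with itemscope / itemtype.
        if "itemscope" in attr or "itemtype" in attr:
            self.found.add("microdata")
        # RDFa (including RDFa Lite) declares items with vocab / typeof.
        # Simplification: a bare "property" attribute is ignored here.
        if "vocab" in attr or "typeof" in attr:
            self.found.add("rdfa")


def sniff(html):
    """Return the set of syntaxes detected in the given HTML string."""
    parser = MarkupSniffer()
    parser.feed(html)
    parser.close()
    return parser.found


for name, snippet in [("hCard", HCARD), ("RDFa Lite", RDFA), ("Microdata", MICRODATA)]:
    print(name, sorted(sniff(snippet)))
```

Counting "pages using one standard or another" then amounts to running something like `sniff` over a crawl and tallying the resulting sets, per URL or per domain.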

Other notes

Wikipedia makes an old (2012) and unverifiable assertion (no source!), "Microformats such as hCard, however, continue to be published more than schema and others on the web", and Webdatacommons is a mess, with no statistical report.

(edit) The citation error on Wikipedia is now fixed.


(edit after @sashoalm's comment) A note for those who disagree that this question is valid.

This question is a software problem, not a "request for off-site resource"...

PROBLEM: to decide which library, framework, data model, etc. to use in a project, we need tools that are in use today and will be in the next few years... To make project decisions in software development, we need statistics about user trends, framework adoption, etc.

PS: here on Stackoverflow there are a lot of discussions about language statistics, which are the same "set of problems". Examples: 1, 2, 3, 4, 5, 6. See also the questions tagged with [usage-statistics].

Solution

Now I see that there are some statistics (!!); the Wikipedia link was lost... I corrected it. The data is not up to date, it is from "Winter 2013" (collected ~1.5 or 2 years ago), but it shows the reality and the trends.

http://webdatacommons.org/structureddata/index.html#toc2

This is the chart from the report (showing RDFa+HTML dominance!):

Interpreting:

  • Section 5, "Extraction Process", says that "on each page, we run our RDF extractor based on the Anything To Triples (Any23) library", so everything (RDF and Microformat) ended up as "triples" (not only RDF).

  • The idea behind "per domain" statistics is that a domain uses a uniform policy for all its pages... But I think this uniformity is false: only a few pages per domain adopt "semantic markup"... Per-domain counting is no less biased than per-URL counting, it is just another picture. Anyway, the outcome was close to a dead heat, ~57% vs 43%.

  • Only 21% of the "semantic markup URLs" of 2013 were Microformat; all the others were RDFa-HTML (Microdata is also a kind of RDFa).

  • Using the average of the percentages for Domains (Ds) and URLs (Us), (Ds+Us)/2, the outcome is ~60% for RDF and ~40% for Microformats.

  • Before 2013 Microformats were dominant, so the big growth of "RDFa-HTML" since 2011 is evident... The trend is clear.

  • If we adopt the arithmetic mean of the "per domain" and "per URL" counts, Microformats and RDFa-HTML are near each other, with a little less Microformat (and a strong trend for RDFa-HTML to grow in 2014).
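The (Ds+Us)/2 average from the list above can be checked with a few lines of arithmetic. This is only a sketch: the 21% per-URL figure and the 57%/43% per-domain split are the ones quoted above, the 79% is simply 100% minus 21%, and assigning the 57% side to Microformats is my assumption (it is the reading that reproduces the ~60%/~40% result). The names `per_url`, `per_domain` and `mean_share` are made up for the example.

```python
# Per-URL shares in percent: 21% Microformat is quoted above; 79% is 100 - 21.
per_url = {"microformats": 21.0, "rdfa_html": 79.0}

# Per-domain shares in percent.
# ASSUMPTION: the 57% side of the "57% vs 43%" split is Microformats.
per_domain = {"microformats": 57.0, "rdfa_html": 43.0}


def mean_share(ds, us):
    """Arithmetic mean (Ds + Us) / 2 of the per-domain and per-URL shares."""
    return {fmt: (ds[fmt] + us[fmt]) / 2 for fmt in ds}


combined = mean_share(per_domain, per_url)
print(combined)  # {'microformats': 39.0, 'rdfa_html': 61.0}
```

Under that assumption the mean is 39% vs 61%, which matches the rounded ~40% vs ~60% given above.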

Here is a table for the discussion with @sashoalm, showing the percentages and totals


NOTE1: HTML5 was released only on 2014-10-28, so only around 2015-10 will we be able to check the real (definitive) impact of the new standard on the Web. An important expected impact is that Microdata was not blessed by HTML5, so the only standard is HTML+RDFa (which recommends RDFa Lite)... In the future there will perhaps be less Microdata and more schema.org.

NOTE2: there is a methodological problem in counting web pages when boilerplate text carries some massively cloned "semantic markup". I think the "next generation" of statistics could use some "per domain analysis" to build URL substatistics (sampling) of the diversity of semantically marked pages. The ideal is to weight the boilerplate (e.g. count the non-clones once and use 1+SQRT(count) for the clones).

Conclusion

Today some people perhaps still use Microformats, but there are more pages on the Web using RDFa-HTML (Microdata, RDFa, RDFa Lite, etc.), and the trend is for that to grow.

If your project is meant for the next few years, the statistics say to use RDFa.


NOTE

Another interesting count for RDFa is not its use but the reuse of vocabularies (!). See Linked Open Vocabularies (LOV).
