Looking for a unique ID pattern that is easily indexed by search engines


Question

Examples of the kind of ID I mean: Microsoft's "KB2756872", the National Vulnerability Database's "CVE-2010-1428", Red Hat's "RHSA-2010:0376", OIDs such as "1.3.6.1.4.1.311", and UUIDs/GUIDs such as "550e8400-e29b-41d4-a716-446655440000".

I want UIDs to do several jobs for me, described below.

I develop blog software and have the idea of putting a unique ID in the body of each post, so that I can easily tell that a copy in local storage corresponds to a remotely published copy.

I also want to post to many different blogging services, so that if one is down the articles are still accessible from another. A link can die, but if I add a UID, anyone can try a web search to find the post on another service!

This would also let me gather some statistics on how articles spread. Many sites just replicate content (copywriting and rewriting, by bots and by people) to game search engines. With a UID I can easily identify such sites...

So my question is how to form UIDs (in which format) so that they are easily indexed by search engines (web ones like Google/Yahoo, and corporate ones like Lucene/Solr/Sphinx/Xapian/etc.).

I know about some limitations of search engines, such as:

  • Each searchable part must be at least 3 characters long
  • Junk such as gfh6wytrh6wu56he5gahj763 is not indexed

So this task is not easy...
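To make the first limitation concrete, here is a minimal sketch of how a candidate ID might be split into searchable parts. It assumes a deliberately simplistic tokenizer that splits on every non-alphanumeric character; real engines tokenize differently, and the 3-character minimum is the figure quoted above, not a documented setting of any particular engine. The function names are hypothetical.

```python
import re

MIN_TOKEN_LEN = 3  # assumption: the ">= 3 chars" limit mentioned in the question


def tokens(candidate_id: str) -> list[str]:
    """Split an ID into alphanumeric runs, as a naive tokenizer might."""
    return re.findall(r"[A-Za-z0-9]+", candidate_id)


def short_tokens(candidate_id: str) -> list[str]:
    """Return the parts that fall below the assumed minimum length."""
    return [t for t in tokens(candidate_id) if len(t) < MIN_TOKEN_LEN]


if __name__ == "__main__":
    for uid in ("KB2756872", "CVE-2010-1428", "RHSA-2010:0376", "1.3.6.1.4.1.311"):
        print(uid, tokens(uid), "too short:", short_tokens(uid))
```

Under these assumptions, "CVE-2010-1428" splits into three parts that all meet the minimum, while most parts of the OID "1.3.6.1.4.1.311" are single digits and would be dropped.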

Any advice is appreciated (books, blog articles, etc.).

Recommended answer

You could use Tag URIs, as defined by RFC 4151.

They are globally unique, and anyone who has owned a domain name or an email address for at least a day can mint them.

Note that these URIs only identify; they don't locate. So a Tag URI doesn't say anything about where something is published.

Let's say your site's domain is "example.com". If you create a blog post, you could create the following Tag URI:

tag:example.com,2012-12:cute-cat

Note that the date in this URI is not a publication date! It must be a (past) date on which you owned the domain (or email address). If you registered your domain in 2003, you could always use Tag URIs starting with tag:example.com,2004: (not "2003", because "2003" would mean "2003-01-01", which might be a date on which you didn't own the domain yet), followed by a (unique) string under your control. You could of course use the publication date if you like, as long as you owned the domain on that date. But don't use future dates.
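The rules above can be sketched as a small helper that builds a Tag URI and rejects future dates. This is only an illustration under stated assumptions: the function name `mint_tag_uri` and its parameters are hypothetical, the character check on the specific part is a simplification of the RFC 4151 grammar, and the code cannot verify that you actually owned the domain on the given date.

```python
import re
from datetime import date
from typing import Optional

# RFC 4151 dates are YYYY, YYYY-MM or YYYY-MM-DD.
_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")
# Simplified approximation of the characters allowed in the "specific" part.
_SPECIFIC_RE = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:@/?%]*")


def mint_tag_uri(authority: str, owned_on: str, specific: str,
                 today: Optional[date] = None) -> str:
    """Build "tag:<authority>,<date>:<specific>" (hypothetical helper)."""
    today = today or date.today()
    match = _DATE_RE.fullmatch(owned_on)
    if match is None:
        raise ValueError(f"date must be YYYY, YYYY-MM or YYYY-MM-DD: {owned_on!r}")
    # A missing month or day stands for 01, so "2003" means 2003-01-01.
    year = int(match.group(1))
    month = int(match.group(2) or 1)
    day = int(match.group(3) or 1)
    if date(year, month, day) > today:
        raise ValueError("tag URI dates must not be in the future")
    if _SPECIFIC_RE.fullmatch(specific) is None:
        raise ValueError(f"unsupported characters in specific part: {specific!r}")
    return f"tag:{authority},{owned_on}:{specific}"


if __name__ == "__main__":
    print(mint_tag_uri("example.com", "2012-12", "cute-cat"))
```

Embedding the resulting string in the body of each post gives every copy of the post the same identifier, regardless of which service hosts it.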
