.htaccess for SEO bots crawling single page applications without hashbangs


Problem description

Using a pushState enabled page, normally you redirect SEO bots using the _escaped_fragment_ convention. You can read more about that here: https://developers.google.com/webmasters/ajax-crawling/docs/specification

The convention assumes that you will be using a (#!) hashbang prefix before all of the URIs on a single page application. SEO bots will escape these fragments by replacing the hashbang with their own recognizable convention, _escaped_fragment_, when making a page request.

//Your page
http://example.com/#!home

//Requested by bots as
http://example.com/?_escaped_fragment_=home

This allows the site administrator to detect bots, and redirect them to a cached prerendered page.

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
# %1 holds the fragment captured by the condition above (e.g. "home")
RewriteRule ^(.*)$  https://s3.amazonaws.com/mybucket/%1 [P,QSA,L]

The problem is that the hashbang is getting phased out quickly with the widely adopted pushState support. It's also really ugly and isn't very intuitive to a user.

So what if we used HTML5 mode where pushState guides the entire user application?

//Your index is using pushState
http://example.com/

//Your category is using pushState (not a folder)
http://example.com/category

//Your category/subcategory is using pushState
http://example.com/category/subcategory
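
For this to work, regular visitors hitting these deep links also need to receive the app shell. A minimal fallback sketch, assuming the shell lives at index.html (the filename is an assumption, not from the original setup):

RewriteEngine On
# If the requested path is not a real file or directory,
# serve the single page app shell and let pushState routing take over
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.html [L]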

Can rewrite rules guide bots to your cached version using this newer convention? Related: http://stackoverflow.com/questions/17108931/how-to-do-a-specific-condition-for-escaped-fragment-with-rewrite-rule-in-htacce (but that only accounts for the index edge case). Google also has an article that suggests using an opt-in method for this single edge case, placing <meta name="fragment" content="!"> in the <head> of the page. Again, this is for a single edge case. Here we are talking about handling every page as an opt-in scenario.

http://example.com/?_escaped_fragment_=
http://example.com/category?_escaped_fragment_=
http://example.com/category/subcategory?_escaped_fragment_=

I'm thinking that _escaped_fragment_ could still be used as an identifier for SEO bots, and that I could extract everything in between the domain and this identifier to append to my bucket location, like:

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# (high level example I have no idea how to do this)
# extract "category/subcategory" == $2
# from http://example.com/category/subcategory?_escaped_fragment_=
RewriteRule ^(.*)$  https://s3.amazonaws.com/mybucket/$2 [P,QSA,L]
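
A minimal sketch of how that extraction could work (an untested assumption: in a per-directory .htaccess the pattern already sees the path without a leading slash, so $1 is exactly the part between the domain and the query string):

RewriteEngine On
# Match any request whose query string is exactly "_escaped_fragment_="
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
# $1 holds the path after the domain, e.g. "category/subcategory";
# the trailing "?" drops the query string before proxying to the bucket
RewriteRule ^(.*)$ https://s3.amazonaws.com/mybucket/$1? [P,L]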

What's the best way to handle this?

Recommended answer

Had a similar problem on a single page web app.

The only solution I found to this problem was effectively creating static versions of pages for the purpose of making something navigable by the Google (and other) bots.

You could do this yourself, but there are also services that do exactly this and create your static cache for you (and serve up the snapshots to the bots over their CDN).
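
If you wanted to wire this up yourself, a minimal sketch (the user agent list is illustrative, and snapshots.example.com is a hypothetical host standing in for wherever your static pages live):

RewriteEngine On
# Send known crawlers (illustrative list, extend as needed) to a
# hypothetical static snapshot mirror of the same path
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|yandex|baiduspider) [NC]
RewriteRule ^(.*)$ https://snapshots.example.com/$1 [P,L]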

I ended up using SEO4Ajax, although other similar services are available!
