如何使用Node.js服务从Backbone.js的应用内容搜索爬虫的搜索引擎优化 [英] Using node.js to serve content from a Backbone.js app to search crawlers for SEO

查看:138
本文介绍了如何使用Node.js服务从Backbone.js的应用内容搜索爬虫的搜索引擎优化的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

无论是我的谷歌福失败我还是真的没有太多的人这样做呢。如你所知,Backbone.js的有一个致命的弱点 - 它不能起到它呈现给网页爬虫的HTML如Googlebot的,因为他们不运行JavaScript(虽然考虑到其与谷歌自己的资源,V8发动机,以及发人深省的事实: JavaScript应用程序都在上升,我希望这一天发生)。我知道,谷歌有一个解决办法hashbang政策,但它只是一个坏主意。另外,我使用pushState的。这是一个非常重要的问题,我和我希望它成为其他人。 SEO是什么,是不容忽视的,因此不能被视为对许多应用在那里,要求或取决于就可以了。

Either my google-fu has failed me or there really aren't too many people doing this yet. As you know, Backbone.js has an achilles heel--it cannot serve the html it renders to page crawlers such as googlebot because they do not run JavaScript (although given that its Google with their resources, V8 engine, and the sobering fact that JavaScript applications are on the rise, I expect this to someday happen). I'm aware that Google has a hashbang workaround policy but it's simply a bad idea. Plus, I'm using PushState. This is an extremely important issue for me and I would expect it to be for others as well. SEO is something that cannot be ignored and thus cannot be considered for many applications out there that require or depend on it.

输入node.js中我只是刚刚开始进入这个热潮,但它似乎可能有客户端上存在相同的Backbone.js的应用是服务器手拉手node.js中上Node.js的话将能够满足来自Backbone.js的应用呈现给网页爬虫HTML。这似乎是可行的,但我要找的人谁是更有经验与node.js的,甚至更好,别人谁实际上已经这样做了,劝我在此。

Enter node.js. I'm only just starting to get into this craze but it seems possible to have the same Backbone.js app that exists on the client be on the server holding hands with node.js. node.js would then be able to serve html rendered from the Backbone.js app to page crawlers. It seems feasible but I'm looking for someone who is more experienced with node.js or even better, someone who has actually done this, to advise me on this.

哪些步骤,我需要采取允许我使用Node.js的为我的Backbone.js的应用网络爬虫?另外,我的骨干应用程序消耗上所写的,我认为将使得这个少了头痛的Rails的API。

What steps do I need to take to allow me to use node.js to serve my Backbone.js app to web crawlers? Also, my Backbone app consumes an API that is written in Rails which I think would make this less of a headache.

编辑:我没有提到我已经写在Backbone.js的生产应用。我期待这一技术应用到应用程序。

I failed to mention that I already have a production app written in Backbone.js. I'm looking to apply this technique to that app.

推荐答案

首先,让我补充一个声明,我认为这种使用的node.js是一个坏主意。二免责声明:我已经做过类似的黑客,但只是进行自动化测试,不是爬虫的目的

First of all, let me add a disclaimer that I think this use of node.js is a bad idea. Second disclaimer: I've done similar hacks, but just for the purpose of automated testing, not crawlers.

有了这样的方式,让我们去。如果你打算运行在服务器上的客户端应用程序,你需要重新创建服务器上的浏览器环境:

With that out of the way, let's go. If you intend to run your client-side app on server, you'll need to recreate the browser environment on your server:


  1. 最明显的是,你错过的DOM(文档对象模型) - 基本上您解析HTML文档顶部的AST。这Node.js的解决方案是 jsdom

这不过是不够的。您的浏览器也暴露了BOM(浏览器对象模型) - 获取浏览器的功能,比如,例如, history.pushState 。这是它得到棘手。有两种选择:你可以​​试着弯曲 phantomjs 或的 casperjs 来运行你的应用程序,然后刮HTML关闭它。因为你正在运行与锯掉UI部件一个巨大的充满WebKit浏览器它是脆弱的。

That however will not suffice. Your browser also exposes BOM (Browser Object Model) - access to browser features like, for example, history.pushState. This is where it gets tricky. There are two options: you can try to bend phantomjs or casperjs to run your app and then scrape the HTML off it. It's fragile since you're running a huge full WebKit browser with the UI parts sawed off.

另外一种选择是僵尸 - 这是轻量级重新实现浏览器的Javascript功能。根据页面支持pushState的,但我的经验是,浏览器仿真还远远没有完成 - 但试一试,看看你走多远

The other option is Zombie - which is lightweight re-implementation of browser features in Javascript. According to the page it supports pushState, but my experience is that the browser emulation is far from complete - however give it a try and see how far you get.

这篇关于如何使用Node.js服务从Backbone.js的应用内容搜索爬虫的搜索引擎优化的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆