How do search engines index multilingual content?


Question

I am building a website with identical content in four different
languages. On a first visit, the search engine determines the language
of the content by the IP address of the visitor. What the user sees is
content in only one language at a time. He or she can then switch to
another language and set this as the preferred language, but again he
or she sees content in only this one other language.

The question now is: How do I get search engines to index ALL of the
content, in all languages?

Should I include the non-displayed content in DIVs with display set to
"none" (like we used to include complete websites in the noframes tag)?
Or do search engines ignore invisible DIVs?

Or can I somehow detect that a search engine is visiting and deliver a
page with the complete content of all four languages in it? Or would
that get me banned?

Or do I have to rely on the search engine following the local links to
the pages in the other languages? This might be a problem, because the
varying content is always displayed on the same page, so the URI stays
the same, and only one parameter changes, thus:
"content.php?language=oneoffourlanguages". In fact it might even be
impossible, because I do not want to transfer the language information
through the URL via GET, but want to send it through a form via POST.
So the URI is exactly the same for all languages (at least in the
version I am aiming at).

If you have solved this problem on your website or know how to go about
it, I'd be grateful for some help.

Recommended answer

Manfred Kooistra wrote:
> I am building a website with identical content in four different
> languages.

Fine. You have explicitly linked the different versions to each other,
right? With reasonable link texts like the name of the page in the other
language, or, as tolerable option, the name of the language in the
language itself, right? No flags, no dropdowns, mm'kay?

> On a first visit, the search engine determines the language
> of the content by the IP address of the visitor.

No it does not. The idea is absurd. There is no visitor (beyond the
search engine itself) when a search engine indexes your content.
Besides, using the IP address to determine the language of a person is
absurd, too.

> What the user sees is
> content in only one language at a time.

Fine, but he should have simple access (links) to the other versions, too.

> He or she can then switch to
> another language and set this as the preferred language,

Setting a preferred language would take place via a URL string or via
cookies. That would be extra bonus to some (many) users, but the solid
basis needs to be built first.

> The question now is: How do I get search engines to index ALL of the
> content, in all languages?

The way you make them find all of your pages in general: using links.

> Should I include the non-displayed content in DIVs with display set to
> "none" (like we used to include complete websites in the noframes tag)?

No, that would be absurd and destructive (especially when your style
sheet is not used).

> Or do search engines ignore invisible DIVs?

They may, or they may not, or they may punish the page for suspected
keyword spamming or cloaking.

> Or can I somehow detect that a search engine is visiting and deliver a
> page with the complete content of all four languages in it?

Some indexing robots can be detected heuristically. But don't do it.

> Or would that get me banned?

Hopefully yes.

> Or do I have to rely on the search engine following the local links to
> the pages in the other languages?

That's the general idea.

> This might be a problem, because the
> varying content is always displayed on the same page, so the URI stays
> the same, and only one parameter changes, thus:
> "content.php?language=oneoffourlanguages".

If you have reasons to suspect that this is a problem, then don't do
that. But I wouldn't be worried about search engines that ignore pages
with a simple query part of the form ?foo=bar - they probably exist, but
they are losers in the search engine competition.

> In fact it might even be
> impossible, because I do not want to transfer the language information
> through the URL via GET, but want to send it through a form via POST.

Just don't do that. Simple, eh?

> So the URI is exactly the same for all languages (at least in the
> version I am aiming at).




That is a completely wrong idea. However, you can use an _additional_
generic URL that resolves to one of the specific URLs via language
negotiation at the HTTP level. See
http://www.cs.tut.fi/~jkorpela/multi/
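
A minimal sketch of how a generic URL could resolve to one of the
language-specific URLs, assuming the server looks at the browser's
Accept-Language request header. The language codes, the default language
and the function names are assumptions made up for illustration; only
the "content.php?language=..." pattern comes from this thread.

```python
from typing import List, Tuple

SUPPORTED = ["en", "de", "fr", "nl"]  # hypothetical: the site's four languages
DEFAULT = "en"                        # hypothetical fallback language


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept-Language header into (tag, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # Stable sort: tags with equal q keep the order the browser sent.
    prefs.sort(key=lambda pair: pair[1], reverse=True)
    return prefs


def negotiate(header: str) -> str:
    """Pick the best supported language, falling back to DEFAULT."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = tag.split("-")[0]  # "de-CH" is served the "de" version
        if primary in SUPPORTED:
            return primary
    return DEFAULT


def specific_url(header: str) -> str:
    """Language-specific URL that the generic URL would resolve to."""
    return "content.php?language=" + negotiate(header)
```

The language-specific URLs stay reachable on their own, so a crawler
that follows plain links still finds every version; the negotiated
generic URL is only an extra entry point.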


Manfred Kooistra wrote:
> I am building a website with identical content in four different
> languages. On a first visit, the search engine determines the language
> of the content by the IP address of the visitor.

I don't understand what you mean by that. The best way of specifying the
language of an HTML page is with a lang attribute in the HTML tag (e.g.,
<HTML lang="en">)

> What the user sees is
> content in only one language at a time. He or she can then switch to
> another language and set this as the preferred language, but again he
> or she sees content in only this one other language.
>
> The question now is: How do I get search engines to index ALL of the
> content, in all languages?

Use <LINK> elements to indicate where the other translations can be found.

<http://www.w3.org/TR/REC-html40/struct/links.html#edef-LINK>

> Should I include the non-displayed content in DIVs with display set to
> "none" (like we used to include complete websites in the noframes tag)?
> Or do search engines ignore invisible DIVs?

They generally penalize you for doing that. This is a frequently abused
technique for stuffing keywords into web pages in order to rank higher in the
result listings.

> Or can I somehow detect that a search engine is visiting and deliver a
> page with the complete content of all four languages in it? Or would
> that get me banned?

That's called "cloaking", and is also frowned upon by search engines.

> Or do I have to rely on the search engine following the local links to
> the pages in the other languages? This might be a problem, because the
> varying content is always displayed on the same page, so the URI stays
> the same, and only one parameter changes, thus:
> "content.php?language=oneoffourlanguages".




So the URI is actually different. But I'm not sure this is the best way of
doing things. Apache servers have some very useful built-in features for this
sort of thing.

<http://www.google.com/search?q=apache%20content%20negotiation>
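
To illustrate why the URI differs per language when the parameter
travels in a GET query string (and would not with POST), here is a small
sketch. The language codes and helper names are assumptions for
illustration; "content.php" and the "language" parameter are taken from
the question.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

LANGUAGES = ("en", "de", "fr", "nl")  # hypothetical: the site's four languages


def url_for(lang: str, base: str = "content.php") -> str:
    """Build the GET URL of one language version."""
    return base + "?" + urlencode({"language": lang})


def language_from_url(url: str, default: str = "en") -> str:
    """Read the language back from a URL, falling back to a default."""
    values = parse_qs(urlsplit(url).query).get("language", [])
    if values and values[0] in LANGUAGES:
        return values[0]
    return default


# Each language version gets its own URL that can be linked and indexed.
ALL_URLS = [url_for(lang) for lang in LANGUAGES]
```

With a POST form every version would share the bare "content.php" URL,
so there would be nothing distinct for a link or an index entry to
point at.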

--
philronan [@] blueyonder [dot] co [dot] uk


Philip Ronan wrote:
> The best way of specifying the
> language of an HTML page is with a lang attribute in the HTML tag (e.g.,
> <HTML lang="en">)




The lang attribute is recommendable, but it has _no_ verified effect on
search engines.

>> The question now is: How do I get search engines to index ALL of the
>> content, in all languages?
>
> Use <LINK> elements to indicate where the other translations can be found.




There is no evidence that shows that search engines utilize such <LINK>
elements. Surely normal links (<A> elements) are better, since they are
much more widely recognized by browsers _and_ search engines.
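
A rough sketch of what the linking advice in this thread could look like
when the pages are generated: ordinary <A> links labelled with each
language's own name, plus the optional <LINK> elements. The language
codes, the names and the helper functions are assumptions for
illustration, and nothing here shows that search engines act on the
<LINK> form.

```python
from html import escape

# Hypothetical: language code -> name of the language in that language.
LANGUAGES = {
    "en": "English",
    "de": "Deutsch",
    "fr": "Français",
    "nl": "Nederlands",
}


def language_url(code: str) -> str:
    """URL of one language version, following the pattern in the question."""
    return "content.php?language=" + code


def anchor_links(current: str) -> list:
    """Visible <a> links to every version except the current one."""
    return [
        '<a href="{0}" hreflang="{1}" lang="{1}">{2}</a>'.format(
            escape(language_url(code)), code, escape(name)
        )
        for code, name in LANGUAGES.items()
        if code != current
    ]


def head_links(current: str) -> list:
    """Optional <link rel="alternate"> elements for the document head."""
    return [
        '<link rel="alternate" hreflang="{0}" href="{1}">'.format(
            code, escape(language_url(code))
        )
        for code in LANGUAGES
        if code != current
    ]
```

The anchors carry the real weight here, since both browsers and
crawlers follow them; the head links are at most a supplement.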

