Google causing excessive bandwidth usage

This article describes how to deal with Google causing excessive bandwidth usage. Readers facing the same problem may find the answers below a useful reference.

Problem Description

I know that this thread is inappropriate for this list, but c.i.w.a.servers
is dead at my address.

Google has been around to my site twice this month and downloaded almost a
GB, putting me over my bandwidth limit both times. I imagine that if I
wasn't paying a flat fee, that would be costing me money.

Is there a way of limiting this while at the same time allowing Google
reasonable indexing? I imagine that they are downloading the whole site
for their cache, but most of it (21 MB) is a program in PHP, and
unnecessary for archiving purposes. I have noticed from Google searches
that there are quite a few PHP reports from my site listed. These are
generated on-the-fly. Would I be better off asking this on the program site
at Sourceforge?

Doug.
--
Registered Linux User No. 277548. My true email address has hotkey for
myaccess.
The difference between 'involvement' and 'commitment' is like an
eggs-and-ham breakfast: the chicken was 'involved' - the pig was
'committed'.
- Unknown.

Solution

"Doug Laidlaw" wrote:

I know that this thread is inappropriate for this list, but c.i.w.a.servers
is dead at my address.
alt.internet.search-engines might have been a better place to ask.
Google has been around to my site twice this month and downloaded almost a
GB, putting me over my bandwidth limit both times I imagine that if I
wasn''t paying a flat fee, that would be costing me money.
Then I don''t really see what the problem is. You''ve got all this content on
your website, and presumably you want it indexed by Google. So you can''t
complain when the googlebot comes along and looks at the stuff.
Is there a way of limiting this while at the same time allowing Google
reasonable indexing? I imagine that they are downloading the whole site
for their cache, but most of it (21 MB) is a program in PHP, and
unnecessary for archiving purposes.
All you can do is switch the indexing on or off. Either with a robots.txt
file or with robots meta tags in individual pages. If you don''t want google
to crawl parts of your site, then tell it not to. It really is that simple.
I have noticed from Google searches
that there are quite a few PHP reports from my site listed. These are
generated on-the-fly. Would I be better off asking this on the program site
at Sourceforge?



I think you would be better off reading this:
<http://www.google.com/intl/en/webmasters/bot.html>

--
phil [dot] ronan @ virgin [dot] net
http://vzone.virgin.net/phil.ronan/
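
As a sketch of the robots.txt approach Philip describes: assuming the 21 MB
PHP program lives under a directory such as /program/ and the on-the-fly
reports under /reports/ (both hypothetical paths; the thread never names the
real ones), a file served from the document root as /robots.txt could read:

    # robots.txt - must sit in the document root, fetched as /robots.txt
    # /program/ and /reports/ are hypothetical; substitute the real paths.
    User-agent: *
    Disallow: /program/
    Disallow: /reports/

The per-page alternative Philip mentions is a robots meta tag such as
<meta name="robots" content="noindex,nofollow"> in each page's head. Note
that robots.txt stops compliant crawlers from fetching the pages at all,
which is what saves bandwidth; the meta tag only keeps a page out of the
index after it has already been downloaded.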


Philip Ronan wrote:

> "Doug Laidlaw" wrote:
>
>> I know that this thread is inappropriate for this list, but
>> c.i.w.a.servers is dead at my address.
>
> alt.internet.search-engines might have been a better place to ask.
>
>> Google has been around to my site twice this month and downloaded almost
>> a GB, putting me over my bandwidth limit both times. I imagine that if I
>> wasn't paying a flat fee, that would be costing me money.
>
> Then I don't really see what the problem is. You've got all this content
> on your website, and presumably you want it indexed by Google. So you
> can't complain when the googlebot comes along and looks at the stuff.
>
>> Is there a way of limiting this while at the same time allowing Google
>> reasonable indexing? I imagine that they are downloading the whole site
>> for their cache, but most of it (21 MB) is a program in PHP, and
>> unnecessary for archiving purposes.
>
> All you can do is switch the indexing on or off. Either with a robots.txt
> file or with robots meta tags in individual pages. If you don't want
> Google to crawl parts of your site, then tell it not to. It really is
> that simple.
>
>> I have noticed from Google searches
>> that there are quite a few PHP reports from my site listed. These are
>> generated on-the-fly. Would I be better off asking this on the program
>> site at Sourceforge?
>
> I think you would be better off reading this:
> <http://www.google.com/intl/en/webmasters/bot.html>


Thanks Phil. I am entirely self-taught, and suddenly finding myself with my
own domain, and having to do a lot more administration. There is a robots.txt
file in the root directory of the program. I will follow up on that.

Doug.
--
Registered Linux User No. 277548. My true email address has hotkey for
myaccess.
The only sure thing about luck is that it will change.
- Bret Harte.


Doug Laidlaw wrote:

> Thanks Phil. I am entirely self-taught, and suddenly finding myself with my
> own domain, and having to do a lot more administration. There is a robots.txt
> file in the root directory of the program. I will follow up on that.

Your robots.txt must be incorrect - so look into it.

Perhaps more importantly, you need to consider the cacheability of your
content. Probably simplest is to ensure the server sends Last-Modified
headers, so that Google and others only have to download a page
when it's changed, not every time they check for a possible change.
That'll save you bandwidth all round, not just from Google et al.

--
Nick Kew
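
As a sketch of the Last-Modified and conditional-GET behaviour Nick
describes, for a dynamically generated PHP page (the file and function names
here are hypothetical; the thread shows none of Doug's actual code):

    <?php
    // Emit Last-Modified and honour If-Modified-Since (conditional GET).
    // 'data/report-source.dat' stands in for whatever the report is built from.
    $source = 'data/report-source.dat';
    $mtime  = filemtime($source);           // when the underlying data last changed
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

    // If the crawler's copy is still current, answer 304 and send no body.
    if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
        strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
        header($_SERVER['SERVER_PROTOCOL'] . ' 304 Not Modified');
        exit;
    }

    // Otherwise generate the full report as usual.
    echo render_report($source);            // render_report() is hypothetical
    ?>

For static files Apache and most other servers send Last-Modified
automatically; it is dynamically generated PHP output that needs the header
added by hand. A crawler that has seen the header sends If-Modified-Since on
its next visit and, if nothing has changed, receives the bodiless 304
response instead of the full page.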


This concludes this article on Google causing excessive bandwidth usage. We hope the answers above are helpful, and thank you for supporting IT屋!
