How do I send a query to a website and parse the results?

Question

I want to do some development in Java. I'd like to be able to access a website, say for example

www.chipotle.com

On the top right, they have a place where you can enter your zip code, and it gives you all of the nearest locations. My program would just have an empty box where the user enters their zip code, and it would query the actual Chipotle server to retrieve the nearest locations. How do I do that, and how is the data I receive stored?

This will probably be a followup question as to what methods I should use to parse the data.

Thanks!

Recommended answer


This will probably be a followup question as to what methods I should use to parse the data.

It very much depends on what the website actually returns.


  • If it returns static HTML, use a regular (strict) or permissive HTML parser.

  • If it returns dynamic HTML (i.e. HTML with embedded JavaScript), you may need to use something that evaluates the JavaScript as part of the content extraction process.
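
For the static HTML case, a minimal sketch in Java might look like the following. It uses the lenient HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator) only to keep the example self-contained; that parser was written for old HTML 3.2, and a third-party library such as jsoup is a common choice for real pages. The class name LocationHtmlParser and the assumption that each location sits in an <li> element are made up for illustration and are not taken from any real site.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of every <li> element in a page.
class LocationHtmlParser {

    static List<String> listItems(String html) throws IOException {
        List<String> items = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private StringBuilder current; // non-null only while inside an <li>

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.LI) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (current != null) {
                    current.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.LI && current != null) {
                    items.add(current.toString().trim());
                    current = null;
                }
            }
        };
        // The third argument tells the parser to ignore charset declarations,
        // since the input is already a decoded String.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return items;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample markup; there is no <html>/<body> wrapper on purpose.
        String html = "<UL><LI>123 Main St</LI><LI>456 Oak Ave</LI></UL>";
        System.out.println(listItems(html));
    }
}
```

A parser like this only sees the markup as delivered. If the locations are filled in by JavaScript after the page loads, they will not be in the HTML it receives, which is the dynamic case described above.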

There may also be a web API designed for programs (like yours) to use. Such an API would typically return the results as XML or JSON so that you don't have to scrape the results out of an HTML document.
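
If such an API exists, sending the query and reading the result could be sketched roughly as below, using only the JDK (java.net.http needs Java 11 or later). Everything specific here is an assumption for illustration: the endpoint https://api.example.com/locations, the zip query parameter, and the XML shape with <location> and <address> elements are placeholders, not the real interface of any site. The real URL, parameters, and response format have to come from the site's own API documentation.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class LocationApiClient {

    // Hypothetical endpoint: a placeholder, not a real service.
    static final String BASE_URL = "https://api.example.com/locations";

    /** Builds the query URI, URL-encoding whatever the user typed. */
    static URI buildUri(String zip) {
        String encoded = URLEncoder.encode(zip, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "?zip=" + encoded);
    }

    /** Sends a GET request and returns the response body as text. */
    static String fetch(String zip) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildUri(zip))
                .header("Accept", "application/xml")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }

    /** Extracts the text of every <address> element (assumed response format). */
    static List<String> parseAddresses(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Ask the parser to apply its secure-processing limits to untrusted input.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("address");
        List<String> addresses = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            addresses.add(nodes.item(i).getTextContent().trim());
        }
        return addresses;
    }

    public static void main(String[] args) throws Exception {
        // Canned response so the sketch runs without network access.
        String sample = "<locations>"
                + "<location><address>123 Main St</address></location>"
                + "<location><address>456 Oak Ave</address></location>"
                + "</locations>";
        System.out.println(buildUri("80202"));
        System.out.println(parseAddresses(sample));
    }
}
```

In this sketch the received data ends up as an ordinary List<String> in memory; in a real program you would call parseAddresses(fetch(zip)) and probably map each location to a small class of your own. The JDK has no built-in JSON parser, so a JSON response would need a third-party library.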

Before you go any further you should check the Terms of Service for the site. Do they say anything about what you are proposing to do?

A lot of sites DO NOT WANT people to scrape their content or provide wrappers for their services. For instance, if they get income from ads shown on their site, what you are proposing could divert visitors away from their site, resulting in a loss of potential or actual income.

If you don't respect a website's ToS, you could be on the receiving end of lawyers' letters ... or worse. In addition, they could already be using technical means to make life difficult for people who try to scrape their service.
