Saving a web page to a file in Java


Question

I am trying to read an HTML site using the code below, but the program hangs. Any hints, please?

package com.test;

import java.io.BufferedWriter;   
import java.io.FileWriter;   
import java.net.Socket;  
import javax.net.SocketFactory;  
import java.net.InetAddress;

public class writingFile {

    public static void main(String a[]) throws Exception {

        SocketFactory factory=SocketFactory.getDefault();
        Socket socket=new Socket(InetAddress.getByName("java.sun.com"), 80);
        BufferedWriter out=new BufferedWriter(new FileWriter("C://test.html"));
        int data;

        while((data=socket.getInputStream().read()) != -1) {
            out.write((char)data);
            out.flush();
        }
    }
}

Regards, Raj

Recommended answer

This is HTTP. You can't just open a socket and start reading: an HTTP server sends nothing until it has received a request, so read() blocks forever and the program appears to hang. You have to be polite to the server and send a request first:

// HTTP lines end with CRLF; the empty line marks the end of the request headers.
socket.getOutputStream().write("GET /index.html HTTP/1.0\r\n\r\n".getBytes("US-ASCII"));
socket.getOutputStream().flush();

Then read the HTTP response, parse it (status line, headers, an empty line, then the body), and extract your HTML page from the body.
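To make the request/response steps concrete, here is a minimal sketch of the socket approach. The class name `SocketPageSaver`, the helper `copyBody`, the default host `example.com`, and the output file `page.html` are illustrative assumptions, not part of the original post. It sends an HTTP/1.0 request, skips the status line and headers, and writes only the body to the file. It does not check the status code or follow redirects.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SocketPageSaver {

    /** Skips the status line and headers, then copies the response body to out. */
    static void copyBody(InputStream in, OutputStream out) throws IOException {
        // Headers end at the first empty line, i.e. two consecutive
        // line feeds once carriage returns are ignored.
        boolean lastWasLf = false;
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                continue;
            }
            if (b == '\n') {
                if (lastWasLf) {
                    break; // empty line reached: the body starts here
                }
                lastWasLf = true;
            } else {
                lastWasLf = false;
            }
        }
        // Everything that follows is the body; copy it as raw bytes.
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values, assumed for illustration only.
        String host = args.length > 0 ? args[0] : "example.com";
        String target = args.length > 1 ? args[1] : "page.html";

        try (Socket socket = new Socket(host, 80);
             OutputStream file = new BufferedOutputStream(new FileOutputStream(target))) {
            // Send the request first; the server stays silent until it gets one.
            String request = "GET / HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
            OutputStream toServer = socket.getOutputStream();
            toServer.write(request.getBytes(StandardCharsets.US_ASCII));
            toServer.flush();

            // With HTTP/1.0 the server closes the connection after the
            // response, so the copy loop ends when read() returns -1.
            copyBody(new BufferedInputStream(socket.getInputStream()), file);
        }
    }
}
```

Copying bytes rather than casting each one to `char` keeps the page's original encoding intact, and try-with-resources closes both the socket and the file even if the transfer fails.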

EDIT: I described what to do with sockets only because that was the OP's immediate problem. Using URLConnection is the correct way, as answered by @Mike Deck.
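For comparison, a minimal sketch of the URLConnection route might look like the following. This is not @Mike Deck's actual answer, which is not reproduced here; the class name `UrlPageSaver`, the method `save`, the URL `http://example.com/`, the file name `page.html`, and the 10-second timeouts are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class UrlPageSaver {

    /** Fetches whatever url points at and stores the raw bytes in target. */
    static long save(URL url, Path target) throws IOException {
        URLConnection connection = url.openConnection();
        // Timeouts (assumed values) make a stalled server fail fast
        // instead of hanging the program.
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);
        try (InputStream in = connection.getInputStream()) {
            // URLConnection handles the request and the response headers,
            // so the stream contains only the body.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values, not taken from the original post.
        URL url = URI.create(args.length > 0 ? args[0] : "http://example.com/").toURL();
        Path target = Paths.get(args.length > 1 ? args[1] : "page.html");
        long bytes = save(url, target);
        System.out.println("Saved " + bytes + " bytes to " + target);
    }
}
```

Compared with the raw socket version, there is no request string to build and no header parsing to get wrong, which is the main reason this approach is usually preferred.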
