How to download all pages from a website address? Like digging down a few levels and downloading all the sub-pages of the site?


Problem description


I know how to download one page from a web address, but I want to download all the pages of this website:

 

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.IO;

namespace Download_Wbsite_Content
{
    public partial class Form1 : Form
    {
        string content_comparison2;
        string read_all_stream;
        string web_content_file;
        string content_comparison;
        // int i;
        string site;
        WebClient wc = new WebClient();
        StreamWriter sw;
        StreamReader sr;
        StreamWriter fc;
        StreamWriter fc2;

        public Form1()
        {
            InitializeComponent();

            web_content_file = @"d:\web_content_file.txt";
            content_comparison = @"d:\content_comparison.txt";
            content_comparison2 = @"d:\content_comparison2.txt";

            textBox1.Enabled = false;
            if (!File.Exists(content_comparison))
            {
                fc = new StreamWriter(content_comparison);
                fc.Close();
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
        }

        private void get_web_content()
        {
            site = wc.DownloadString("http://www.vgames.co.il");
            sw.Write(site);
        }

        private void button1_Click(object sender, EventArgs e)
        {
            textBox1.Enabled = true;
            sw = new StreamWriter(web_content_file);
            get_web_content();
            sw.Close();
        }

        private void textBox1_TextChanged(object sender, EventArgs e)
        {
            sw.Close();
            fc = new StreamWriter(content_comparison);
            sr = new StreamReader(web_content_file);
            read_all_stream = sr.ReadToEnd();
            if (read_all_stream.Contains(textBox1.Text))
            {
                fc.WriteLine(textBox1.Text + Environment.NewLine);
            }
            fc.Close();
            File.Delete(content_comparison2);
            File.Move(content_comparison, content_comparison2);
        }
    }
}


danieli

Solution

You would have to read the page you are downloading and look for all the tags in it (anchor tags to other links, <a href=' , and image tags, <img ), and download what they point to as well. No magic here!
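The idea above can be sketched as a small depth-limited crawler. This is a minimal sketch, not the answerer's implementation: it reuses the WebClient approach from the question, the names (Crawler, Crawl, visited) and the depth of 2 are illustrative, and the regex-based tag scan is deliberately crude. For real pages an HTML parser is a much safer choice than a regex.

using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

class Crawler
{
    static readonly WebClient wc = new WebClient();
    static readonly HashSet<string> visited = new HashSet<string>();

    static void Crawl(string url, int depth)
    {
        // Stop at the maximum depth and never visit the same URL twice.
        if (depth < 0 || !visited.Add(url)) return;

        string html;
        try { html = wc.DownloadString(url); }
        catch (WebException) { return; }   // skip pages that fail to load

        // Save html to disk here, e.g. with File.WriteAllText(...).

        // Crude scan for href/src attributes of <a> and <img> tags.
        foreach (Match m in Regex.Matches(html,
                 @"<(?:a\s[^>]*href|img\s[^>]*src)\s*=\s*[""']([^""']+)[""']",
                 RegexOptions.IgnoreCase))
        {
            // Resolve relative links against the current page, then recurse.
            Uri link;
            if (Uri.TryCreate(new Uri(url), m.Groups[1].Value, out link))
                Crawl(link.AbsoluteUri, depth - 1);
        }
    }

    static void Main()
    {
        Crawl("http://www.vgames.co.il", 2);   // "dig up to some levels"
    }
}

In practice you would also want to restrict the recursion to links on the same host, or the crawler will start downloading the rest of the internet.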

