How to pull the image and title of the product from Amazon?

Question

I am trying to make a list of products based on the unique product codes of Amazon.

For example: https://www.amazon.in/gp/product/B00F2GPN36

Where B00F2GPN36 is the unique code.

I want to fetch the image and the title of the product into an Excel list under the columns product image and product name.

I tried html.getElementsById("productTitle") and html.getElementsByTagName.

I am also unsure what type of variable to declare for storing this information; I have tried declaring it as Object and as HTMLHtmlElement.

I tried to pull the html doc and use it for the data search.

Code:

Enum READYSTATE
     READYSTATE_UNINITIALIZED = 0
     READYSTATE_LOADING = 1
     READYSTATE_LOADED = 2
     READYSTATE_INTERACTIVE = 3
     READYSTATE_COMPLETE = 4
End Enum

Sub parsehtml()

     Dim ie As InternetExplorer
     Dim topics As Object
     Dim html As HTMLDocument

     Set ie = New InternetExplorer
     ie.Visible = False
     ie.navigate "https://www.amazon.in/gp/product/B00F2GPN36"

     Do While ie.READYSTATE <> READYSTATE_COMPLETE
       Application.StatusBar = "Trying to go to Amazon.in...."
       DoEvents    
     Loop

     Application.StatusBar = ""
     Set html = ie.document
     Set topics = html.getElementsById("productTitle")
     Sheets(1).Cells(1, 1).Value = topics.innerText
     Set ie = Nothing

End Sub

I expect cell A1 to contain "Milton Thermosteel Carafe Flask, 2 litres, Silver" (without the quotation marks), and I would like to pull the image in the same way.

But I always get an error such as:
1. Run-time error '13': Type mismatch (when I used "Dim topics As HTMLHtmlElement")
2. Run-time error '438': Object doesn't support this property or method

Note: I have added the required libraries via Tools > References.

Recommended answer
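
The two errors in the question are consistent with two separate problems. Error 438 is what you would expect from calling a member that does not exist, such as getElementsById (the DOM method is getElementById, singular), or from reading innerText on the collection that getElementsByTagName returns. Error 13 probably arises because HTMLHtmlElement is the type of the root html element, so a different element (or a collection) cannot be assigned to it. Below is a minimal sketch of the browser-based approach with those points changed. It assumes references to Microsoft Internet Controls and Microsoft HTML Object Library; the procedure name ParseHtmlFixed and the constant PAGE_COMPLETE are made up for illustration.

```vba
Option Explicit

' Hypothetical constant: 4 is the "complete" value of the ready state
Private Const PAGE_COMPLETE As Long = 4

Public Sub ParseHtmlFixed()
    Dim ie As InternetExplorer
    Dim html As HTMLDocument
    Dim titleElement As Object   ' holds a single element; Object avoids the type mismatch

    Set ie = New InternetExplorer
    ie.Visible = False
    ie.navigate "https://www.amazon.in/gp/product/B00F2GPN36"

    ' Wait until the page has finished loading
    Do While ie.Busy Or ie.readyState <> PAGE_COMPLETE
        DoEvents
    Loop

    Set html = ie.document

    ' getElementById (singular) returns one element, or Nothing if the id is absent
    Set titleElement = html.getElementById("productTitle")

    If Not titleElement Is Nothing Then
        ThisWorkbook.Worksheets(1).Cells(1, 1).Value = titleElement.innerText
    End If

    ' Close the hidden browser instance before releasing the reference
    ie.Quit
    Set ie = Nothing
End Sub
```

The Nothing check matters because the page layout may differ between products, in which case the id is not found and reading innerText would raise an error.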

It would be faster to use XHR, avoiding the browser, and to write the results out from an array to the sheet.

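Option Explicit
' Requires a reference to Microsoft HTML Object Library (Tools > References).
' DownloadFile is a helper that is not included in this listing.
Public Sub GetInfo()
    Dim html As HTMLDocument, results()
    Set html = New HTMLDocument
    ' Request the product page directly instead of automating a browser
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.in/gp/product/B00F2GPN36", False
        .send
        html.body.innerHTML = .responseText
        With html
            ' results(0) = product title, results(1) = value of the image's data-old-hires attribute
            results = Array(.querySelector("#productTitle").innerText, .querySelector("#landingImage").getAttribute("data-old-hires"))
        End With
    End With
    With ThisWorkbook.Worksheets("Sheet1")
        .Cells(1, 1) = results(0)
        Dim file As String
        file = DownloadFile("C:\Users\User\Desktop\", results(1))  'your path to download file
        ' Position the picture at the top-left corner of cell B1
        With .Pictures.Insert(file)
            .Left = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Left
            .Top = ThisWorkbook.Worksheets("Sheet1").Cells(1, 2).Top
            .Width = 75
            .Height = 100
            .Placement = 1   ' xlMoveAndSize
        End With
    End With
    Kill file   ' delete the downloaded file once the picture has been inserted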
End Sub 
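
GetInfo calls DownloadFile, which is not defined anywhere on this page. The following is one possible sketch of such a helper, not the answer author's own code. The parameter names, the use of ADODB.Stream, and the choice to take the file name from the last URL segment are all assumptions; only the function name and argument order are inferred from the call in GetInfo.

```vba
' Hypothetical helper: saves the resource at downloadURL into downloadFolder
' and returns the full path of the saved file, or vbNullString on failure.
' downloadFolder is assumed to end with a path separator ("\").
Public Function DownloadFile(ByVal downloadFolder As String, ByVal downloadURL As String) As String
    Dim http As Object
    Dim urlParts() As String
    Dim targetPath As String

    On Error GoTo Failed

    ' Fetch the image as raw bytes
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", downloadURL, False
    http.send
    If http.Status <> 200 Then GoTo Failed

    ' Use the last URL segment as the file name
    urlParts = Split(downloadURL, "/")
    targetPath = downloadFolder & urlParts(UBound(urlParts))

    ' Write the response bytes to disk
    With CreateObject("ADODB.Stream")
        .Open
        .Type = 1                    ' adTypeBinary
        .Write http.responseBody
        .SaveToFile targetPath, 2    ' adSaveCreateOverWrite
        .Close
    End With

    DownloadFile = targetPath
    Exit Function

Failed:
    Debug.Print "Download failed:"; Err.Number; Err.Description
    DownloadFile = vbNullString
End Function
```

With this sketch, a failed download returns an empty string, so a caller such as GetInfo would need to check the result before passing it to Pictures.Insert or Kill.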
