Amazon Sales Data (with Excel VBA)


Problem description

I'm trying to use Excel VBA to obtain the result number (from the page's HTML) of a product for each keyword I search. I'm narrowing down the match by className, id, and data-asin, but that last one is proving tricky, since VBA doesn't seem to support it through the reference library yet.

What this code is structured to do:

  1. Go to amazon.com and find the search bar.

  2. Start the loop at column C: pull a search term from the SearchTerm1 column and search for it.

  3. Once the results page has loaded, try to find the specified product by className, ID (both found in the HTML code), and ASIN (pulled from column B so it can be matched against the data-asin value on the search results page). Without all three criteria, Excel can't find the product even when it's listed on the first results page.

When the code matches on the className and ID criteria only, it pulls the last product result from the page, which is not what I need for analyzing how products are doing.

The code included here only pulls the product rank if the product is the first result on the search page. That means something is finally working, but it is missing a step or two to grab all of the product positions from the page.

Any help or push in the right direction would be highly appreciated. I wish VBA were more versatile for this kind of sales research. It's done wonders so far, but I may be reaching its limit. The code is below.

Sub AmazonSearchRank()

    Dim MyHTML_Element As IHTMLElement
    Dim MyURL As String

    Dim AASearchRank As Workbook
    Dim AAws As Worksheet
    Dim InputSearchOrder As HTMLInputElement
    Dim elems As IHTMLElementCollection
    Dim TDelement As HTMLTableCell

    Dim InputSearchButton As HTMLInputButtonElement
    Dim IE As InternetExplorer
    Dim HTMLDoc As HTMLDocument
    Dim AASearchTerms As Workbook
    Dim SearchTermsSheet As Worksheet

    Dim x As Integer
    Dim i As Long

    MyURL = "https://www.amazon.com"
    Set IE = New InternetExplorer
    With IE
        .Silent = True
        .Navigate MyURL
        .Visible = True
        Do
            DoEvents
        Loop Until .ReadyState = READYSTATE_COMPLETE
    End With
    Set HTMLDoc = IE.Document

    Set AASearchRank = Application.Workbooks.Open("C:\Users\CompanyName\Desktop\Automation Anywhere\Sample_Items_For_SearchRank.xls")
    Set AAws = AASearchRank.Worksheets("Sheet1")

    Set InputSearchButton = HTMLDoc.getElementById("nav-search-submit-text")
    Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchbox")
    If Not InputSearchOrder Is Nothing Then
        InputSearchButton.Click
        Do
            DoEvents
        Loop Until IE.ReadyState = READYSTATE_COMPLETE
    End If

    x = 2
    If AAws.Range("D" & x).Value = "" Then
        Do Until AAws.Range("B" & x) = ""
            Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchtextbox")
            InputSearchOrder.Value = AAws.Range("C" & x)

            Set InputSearchButton = HTMLDoc.getElementsByClassName("nav-input")(0)
            InputSearchButton.Click
            Do
                DoEvents
            Loop Until IE.ReadyState = READYSTATE_COMPLETE
            Application.Wait (Now + TimeValue("0:00:05"))

            Set elems = HTMLDoc.getElementsByClassName("s-result-item celwidget")
            i = 2
            For Each TDelement In elems
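                ' InStr returns a position, so compare each result with 0; a bare And between positions is bitwise
                If TDelement.className = "s-result-item celwidget" And InStr(TDelement.ID, "result") > 0 And InStr(TDelement.innerHTML, AAws.Range("B" & x).Value) > 0 Then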
                    AAws.Range("D" & x).Value = TDelement.ID
                    i = i + 1
                End If
            Next
        x = x + 1
        Loop
    End If

End Sub

Recommended answer

Here is an example that downloads products from Amazon for each search query listed on the sheet Terms and populates the sheet Products with ASINs and descriptions. It uses XHR, so IE isn't needed. The code is as follows:

Sub Test()
    lngRow = 1
    ' search each term
    For Each strTerm In Sheets("Terms").UsedRange
        lngPage = 1
        Do
            ' HTTP GET request of the search result page
            strUrl = "https://www.amazon.com/s/ref=nb_sb_noss_2?page=" & lngPage & "&keywords=" & EncodeUriComponent(strTerm)
            Set objXHR = CreateObject("MSXML2.XMLHttp")
            objXHR.Open "GET", strUrl, False
            objXHR.Send
            strResp = objXHR.ResponseText
            ' split response to array by items
            arrResp = Split(strResp, "<li id=""result_")
            ' process each item on the page
            For i = 1 To UBound(arrResp)
                strItem = arrResp(i)
                ' extract ASIN
                strTmp = Split(strItem, "data-asin=""")(1)
                strTmp = Split(strTmp, """")(0)
                Sheets("Products").Cells(lngRow, 1).NumberFormat = "@"
                Sheets("Products").Cells(lngRow, 1).Value = strTmp
                ' extract the product description
                strTmp = Split("<li id=""result_" & strItem, "</li>")(0) & "</li>"
                Sheets("Products").Cells(lngRow, 2).Value = GetInnerText(strTmp)
                ' show current item
                Sheets("Products").Cells(lngRow, 1).Select
                ' next row
                lngRow = lngRow + 1
            Next
            ' adjust sheet
            Sheets("Products").Columns.AutoFit
            Sheets("Products").Rows.AutoFit
            ' next page
            lngPage = lngPage + 1
        Loop Until UBound(arrResp) = 0 ' empty search result
    Next
End Sub

Function EncodeUriComponent(strText)
    Static objHtmlfile As Object
    If objHtmlfile Is Nothing Then
        Set objHtmlfile = CreateObject("htmlfile")
        objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
    End If
    EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
End Function

Function GetInnerText(strHtmlContent)
    Dim objHtmlFile, objBody
    Set objHtmlFile = CreateObject("htmlfile")
    objHtmlFile.write strHtmlContent
    Set objBody = objHtmlFile.getElementsByTagName("body")(0)
    GetInnerText = Trim(objBody.innerText)
End Function
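
To connect this to the rank lookup asked about in the question, the same Split idea can be narrowed to a single ASIN. The following is only a minimal sketch: FindAsinRank and DemoFindAsinRank are hypothetical names that are not part of the code above, the ASINs in the demo are made up, and the sketch assumes each result item still begins with <li id="result_N" and carries a data-asin attribute, as the Split calls in Test do. It returns the position within one page of results, not across pages.

```vba
' Hypothetical helper: returns the 1-based position of strAsin among the
' result items found in strResp, or 0 if the ASIN is not on the page.
' Assumes the <li id="result_N" ... data-asin="..."> markup used by Test.
Function FindAsinRank(ByVal strResp As String, ByVal strAsin As String) As Long
    Const ATTR As String = "data-asin="""
    Dim arrResp() As String
    Dim lngStart As Long, lngEnd As Long
    Dim i As Long

    ' element 0 is everything before the first result item, so start at 1
    arrResp = Split(strResp, "<li id=""result_")
    For i = 1 To UBound(arrResp)
        lngStart = InStr(arrResp(i), ATTR)
        If lngStart > 0 Then
            ' move to the first character of the attribute value
            lngStart = lngStart + Len(ATTR)
            lngEnd = InStr(lngStart, arrResp(i), """")
            If lngEnd > lngStart Then
                If StrComp(Mid$(arrResp(i), lngStart, lngEnd - lngStart), strAsin, vbTextCompare) = 0 Then
                    FindAsinRank = i
                    Exit Function
                End If
            End If
        End If
    Next
    FindAsinRank = 0
End Function

' Hypothetical demo with made-up markup and ASINs
Sub DemoFindAsinRank()
    Dim strHtml As String
    strHtml = "<ul>" & _
        "<li id=""result_0"" data-asin=""B000000001"" class=""s-result-item celwidget"">First</li>" & _
        "<li id=""result_1"" data-asin=""B000000002"" class=""s-result-item celwidget"">Second</li>" & _
        "</ul>"
    Debug.Print FindAsinRank(strHtml, "B000000002") ' prints 2
    Debug.Print FindAsinRank(strHtml, "B000000009") ' prints 0
End Sub
```

Inside the page loop of Test, the call would take strResp and the ASIN from the worksheet. Reading the attribute value directly, rather than searching the whole item's HTML for the ASIN, avoids false matches where the ASIN only appears in a link to another product.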

I placed the search terms on the Terms sheet (screenshot not shown).

The results on the Products sheet contain 571 items (screenshot not shown).

It's not a complete answer, but I hope it helps you.
