CSS selector QuerySelector alternative


Problem description

I have searched extensively for material on how to get meta data using XMLHTTP, and I think it is impossible with the early-binding approach. The only approach that works is late binding via CreateObject("HTMLFile") and working with that late-bound HTML document. The disadvantage of this approach is that it doesn't support QuerySelector or QuerySelectorAll. Now I am trying to find an alternative to the following CSS selector that doesn't use QuerySelector:

Set post = .querySelector("table div span[itemprop='lowPrice']")

This raises an error, and I can't find an easier way to locate the element. Here's the HTML content:

<table class="p">
    <tbody><tr>
        <td class="foto">
            <div class="foto">
                        <a href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/#gallery-open" target="_blank" class="gallery-link product-detail__gallery-link" onclick="dataLayer.push({'event':'sendEvent','event_category':'Product Detail - Desktop','event_action':'Gallery','event_label':'Otev\u0159en\u00ed galerie','event_value':0});">
                           <img src="https://im9.cz/iR/importprodukt-orig/4c2/4c2b1733c8b233edd5052d3063ac46d9--mmf250x250.jpg" alt="Brit Premium by Nature Adult L 15 kg" width="250" height="250" id="picture-main">
                            <span class="image-hover">
                                <span class="image-overlay"></span>
                                <span class="js-test-image-count-info image-count-info">Galerie <span class="picture-count">(2)</span></span>
                            </span>
                            <span class="product-detail__gallery-link__image__count-info">Galerie
                                <span class="product-detail__gallery-link__image__count-info__count">(2)</span>
                            </span>
                        </a>
                        <a href="https://krmivo-psy.heureka.cz/top-produkty/" class="top-ico gtm-header-link" data-gtm-link-description="Pořadí v TOP produktech"><span>Top</span><strong>1.</strong></a>
                    <div class="poty-ico">
                        <a href="http://www.produktroku.cz/" target="_blank"><img src="https://im9.cz/iR/recenze-externi/107.png" alt="Produkt Roku 2019" class="product-of-year-badge"></a></div>



            </div>

        </td>
        <td>
<div class="main-info">
    <div class="text-cover">
        <div id="n649054946" data-id="649054946" class="item js-public-product-id">
                <h2 itemprop="name">Brit Premium by Nature Adult L 15 kg</h2>
        </div>
        <div class="rating-box" itemprop="aggregateRating" itemscope="" itemtype="http://schema.org/AggregateRating">

            <p class="eval">
                <strong itemprop="ratingValue">95%</strong>
                <a href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/pridat-uzivatelskou-recenzi/#section">
                    <span class="rating"><span class="hidden">Hodnocení produktu: 95%</span><span class="over" title="Hodnocení produktu: 95%"><span style="width: 75px;"></span></span></span>
                </a>
            </p>

            <span class="hidden-microdata" itemprop="ratingCount">
                456
            </span>

            <p class="review-count delimiter-blank">
                <a href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/recenze/#section" class="gtm-header-link" data-gtm-link-description="Počet recenzí">
                    <span itemprop="reviewCount">344</span>
                    recenzí
                </a>
            </p>
            <div class="cleaner"></div>
            <p class="rating-box__item rating-box__favourite">
                <a href="https://ucet.heureka.cz/prihlaseni?callbackUrl=https%3A%2F%2Fkrmivo-psy.heureka.cz%2Fbrit-premium-by-nature-adult-l-15-kg%2F" title="Chci to" class="gtm-header-link" data-gtm-link-description="Akce - oblíbené">Přidat do oblíbených</a>
            </p>

            <p id="cli649054946" class="rating-box__item rating-box__compare delimiter-blank cl-add">
                <a class="checkbox gtm-header-link" data-gtm-link-description="Akce - porovnání" href="#" title="Porovnat">Přidat do porovnání</a>
            </p>
            
            <p class="delimiter-blank rating-box__item rating-box__price-watch js-price-watch-button">
                <a href="#" title="Hlídat cenu" class="gtm-header-link" data-gtm-link-description="Akce - hlídat cenu">
                        Hlídat cenu
                </a>
            </p>

            <p class="add-review rating-box__item rating-box__add-review delimiter-blank">
                <a href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/pridat-uzivatelskou-recenzi/#section" class="gtm-header-link" data-gtm-link-description="Akce - přidat recenzi">
                    Přidat recenzi
                </a>
            </p>
        </div>

        <div id="top-shop-info" class="top-shop-info">
            <div class="inner">
            <div class="guar">
                <div>
                    <img class="guar-badge" src="https://im9.cz/css-v2/images/guaranty-seal.png?1" alt="Garance nákupu - SpokojenyPes.cz" width="27" height="34">
                </div>
            </div>

        <div class="shop-claim bold">
            <strong>Produkt vám dodá:</strong>
        </div>
        <div class="shop-logo">
            <a href="https://www.heureka.cz/exit/spokojenypes-cz/3180319922/?z=41" target="_blank" rel="nofollow noopener" class="gtm-header-link" data-gtm-link-description="Exit - produkt vám dodá">
                    <img src="https://im9.cz/iR/importobchod-orig/1983_logo--mmf130x40.png" alt="SpokojenyPes.cz" width="130" height="40">
            </a>
        </div>

        <div class="recommendation">
            <a href="https://obchody.heureka.cz/spokojenypes-cz/recenze/" class="gtm-header-link" data-gtm-link-description="Hodnocení - Produkt vám dodá">
                99% zákazníků doporučuje obchod
            </a>
        </div>

            <div class="delivery-info bold price-delivery-free">
                Doprava zdarma
            </div>
                <div class="availability-info bold in-stock">
            skladem
        </div>


    </div>
    <a data-gtm-link-description="Další nabídky" id="top-shop-count-info" href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/porovnat-ceny/#section" class="top-shop-count-info box-active gtm-header-link">Dalších 134 nabídek od 728 Kč</a>
        </div>

        <p class="desc">
            <span id="product-short-description">
                    Kompletní krmivo Brit Premium pro dospělé psy. Kuřecí receptura pro dospělé psy velkých plemen (25 - 45 kg). 
                <a id="product-short-description-button" href="https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/specifikace/#section" title="celá specifikace Brit Premium by Nature Adult L 15 kg">celá specifikace</a>
            </span>
        </p>
    </div>

    <div itemprop="offers" itemscope="" itemtype="http://schema.org/AggregateOffer" style="display:none">
        <span itemprop="lowPrice">728.00</span>
        <span itemprop="highPrice">1579.00</span>
        <span itemprop="offerCount">135</span>
            <link itemprop="availability" href="http://schema.org/InStock">
    </div>

    <div itemprop="offers" itemscope="" itemtype="http://schema.org/Offer" class="price-from shopping-cart">
        <link itemprop="itemCondition" href="http://schema.org/OfferItemCondition" content="http://schema.org/NewCondition">
            <link itemprop="availability" href="http://schema.org/InStock">
        <link itemprop="category" href="http://schema.org/category" content="Hobby / Chovatelství / Pro psy / Krmivo pro psy">
        <link itemprop="image" href="http://schema.org/image" content="https://im9.cz/iR/importprodukt-orig/4c2/4c2b1733c8b233edd5052d3063ac46d9.jpg">
                        <div class="top-left">
                <div id="top-button" class="buy-click-observed">
<p class="buy">
    <a href="#" class="flat-button flat-button--top-position flat-button--orange buy-btn hb hb-3180319922 js-top-pos-btn" data-cart-position="0">
        <i class="ico basket"></i>
        <i class="ico check"></i>
        <span class="in">Koupit na Heurece</span>
        <span class="in replace">Přidáno do košíku</span>
    </a>
</p>
                </div>

                <div class="n" id="top-offer-price">
<p class="buy-price">
    <span itemprop="price" class="js-top-price" content="839.00">839 Kč</span>
    <span class="price-vat-title small">s DPH</span>
    <span itemprop="priceCurrency" content="CZK"></span>
</p>
                </div>


                <div class="clear"></div>
                <div class="js-top-gifts-info top-shop-gifts-info-box">
                </div>

            </div>
            <div class="clear"></div>
        <div class="clear"></div>
    </div>
    <span id="new-pd"></span>
    <script>
        (function() {
            loadScript("https:\/\/im9.cz\/js\/cache\/7e39f733-1-42bd9e7837b830d87e1af94da6d0e4a82055c56f.hash.js", function () {
                var productHeadObserver = new ProductHeadObserver({ 'topShortDescElm': $('product-short-description'), 'topShopBox': $('top-shop-info'), 'maxOfferNameLength': 90 });
                productHeadObserver.oneOfferInit();
            });

                H.Awards._reviewClick($$('#awards-list span.pa'));
                var notSelectedCallback = function() {
                    if ('undefined' != typeof H.ShoppingCartHelper.BuyMoreOptions &&
                        typeof H.ShoppingCartHelper.BuyMoreOptions.buyClickNotSelectedCallback == 'function') {
                        H.ShoppingCartHelper.BuyMoreOptions.buyClickNotSelectedCallback();
                    }
                };
                H.ShoppingCartHelper.observeBuyClick($('top-button'), new H.ShoppingCart(), notSelectedCallback, 'js-top-pos-btn');
        })();
    </script>

    <div class="clear"></div>


</div>
        </td>
    </tr>
</tbody></table>

Here's the whole HTML: https://pastebin.com/Dgu1wk2b

Here's the code so far:

Sub MyTest()
Dim source      As Object
Dim obj         As Object
Dim resp        As String
Dim post As Object
Dim a, i As Long

With CreateObject("MSXML2.xmlHttp")
    .Open "GET", "https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/specifikace/#section", False
    .send
    resp = .responseText
End With

With CreateObject("HTMLFile")
    .write resp
    Set post = .getElementsByTagName("meta")

    For i = 0 To post.Length - 1
        On Error Resume Next
        Debug.Print post.item(i).getAttribute("name")
        If post.item(i).getAttribute("name") = "gtm:product_id" Then
            Cells(2, 1).Value = post.item(i).Value
        End If
        If post.item(i).getAttribute("name") = "gtm:product_name" Then
            Cells(2, 3).Value = post.item(i).Value
        End If
        If post.item(i).getAttribute("name") = "gtm:product_brand" Then
            Cells(2, 4).Value = post.item(i).Value
        End If
        On Error GoTo 0
    Next i

    Set post = Nothing

    Set post = .getElementsByTagName("link")
    For i = 0 To post.Length - 1
        On Error Resume Next
        If post.item(i).getAttribute("rel") = "canonical" Then
            Cells(2, 2).Value = post.item(i).href
        End If
        On Error GoTo 0
    Next i

    'I am stuck here
    'Set post = .querySelector("table div span[itemprop='lowPrice']")
    'Debug.Print .getElementsByTagName("table")(0).innerHTML
End With

End Sub

Recommended answer

As you have discovered, HEAD tag info (where the meta elements live) is stripped out when you use document.body.innerHTML = .responseText with an early-bound MSHTML.HTMLDocument. That is what you would expect, considering what you are populating (document.body), and it is why you are unable to select the meta info. With your late-bound HTMLFile (where you can't use querySelector) you are using the .write method, which writes to the document (HTMLFile) itself and thereby retains the HEAD info.

If you wish to use early binding, you need to ensure that the HEAD info ends up within BODY tags: either as part of the response body, or as the extracted HEAD concatenated with new BODY tags and written to an HTMLDocument.

E.g., for clarity, I am writing only the HEAD info between BODY tags (without the rest of the existing response):

Option Explicit

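' Requires references (Tools > References): Microsoft HTML Object Library,
' Microsoft XML, v6.0, and Microsoft VBScript Regular Expressions 5.5.
Public Sub MetaInfoEarlyBound()
    Dim html As MSHTML.HTMLDocument, htmlHead As MSHTML.HTMLDocument, xhr As MSXML2.XMLHTTP60
    Dim re As VBScript_RegExp_55.RegExp

    Set htmlHead = New MSHTML.HTMLDocument
    Set html = New MSHTML.HTMLDocument
    Set xhr = New MSXML2.XMLHTTP60
    Set re = New VBScript_RegExp_55.RegExp

    ' Match the whole <head>...</head> block of the response
    re.Pattern = "<head>([\s\S]+)<\/head>"

    With xhr
        .Open "GET", "https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/specifikace/#section", False
        .send
        ' Swap the HEAD tags for BODY tags so the meta elements survive in htmlHead
        htmlHead.body.innerHTML = Replace$(Replace$(re.Execute(.responseText)(0), "<head>", "<body>"), "</head>", "</body>")
        ' Full response: the HEAD info is stripped here, the BODY content remains
        html.body.innerHTML = .responseText
    End With

    ' Meta info comes from htmlHead, page content from html
    Debug.Print htmlHead.querySelector("[name='gtm:product_price']").Value
    Debug.Print html.querySelector("[itemprop=lowPrice]").innerText

End Sub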


As an aside, here are two shorter methods (shorter than the other current answer) to achieve your goal with late binding. Note that I have commented one of them out.

Public Sub MetaInfoLateBound()
    Dim resp As String

    With CreateObject("MSXML2.xmlHttp")
        .Open "GET", "https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/specifikace/#section", False
        .send
        resp = .responseText
    End With

    With CreateObject("HTMLFile")

        .write resp

'        Dim post As Object
'
'        Set post = .getElementById("new-pd")
'        Debug.Print post.PreviousSibling.PreviousSibling.getElementsByTagName("span")(0).innertext
'
        Dim metas As Object, i As Long

        Set metas = .getElementsByTagName("meta")

        For i = 0 To metas.Length - 1
            If metas.Item(i).Name = "gtm:product_price" Then
                Debug.Print metas.Item(i).Value
                Exit For
            End If
        Next
    End With
End Sub
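
As a further late-bound alternative to the selector from the question, the span can also be found by attribute, using the same getElementsByTagName and getAttribute loop the question already applies to the meta tags. The following is only a minimal sketch, not part of the original answer: FindByAttribute and LowPriceLateBound are hypothetical names introduced for illustration, and the sketch assumes getAttribute returns Null when the attribute is missing. Unlike table div span[itemprop='lowPrice'], it does not check that the span sits inside a table or div.

```vba
' Hypothetical helper (not from the original answer): returns the first
' element with the given tag whose attribute equals the expected value,
' or Nothing when no element matches.
Public Function FindByAttribute(ByVal doc As Object, ByVal tagName As String, _
                                ByVal attrName As String, ByVal expected As String) As Object
    Dim nodes As Object, i As Long, attrValue As Variant

    Set nodes = doc.getElementsByTagName(tagName)
    For i = 0 To nodes.Length - 1
        attrValue = nodes.Item(i).getAttribute(attrName)
        ' Assumption: a missing attribute comes back as Null, so guard before comparing
        If Not IsNull(attrValue) Then
            If CStr(attrValue) = expected Then
                Set FindByAttribute = nodes.Item(i)
                Exit Function
            End If
        End If
    Next i
End Function

Public Sub LowPriceLateBound()
    Dim resp As String, doc As Object, span As Object

    With CreateObject("MSXML2.xmlHttp")
        .Open "GET", "https://krmivo-psy.heureka.cz/brit-premium-by-nature-adult-l-15-kg/specifikace/#section", False
        .send
        resp = .responseText
    End With

    ' Late-bound document, populated with .write as in the code above
    Set doc = CreateObject("HTMLFile")
    doc.write resp

    Set span = FindByAttribute(doc, "span", "itemprop", "lowPrice")
    If span Is Nothing Then
        Debug.Print "lowPrice span not found"
    Else
        Debug.Print span.innerText
    End If
End Sub
```

The same helper could be called with "meta", "name" and "gtm:product_price" to replace the explicit loop over the meta tags.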
