Parsing HTML does not output the desired data (tracking info for FedEx)


Question

I'm trying to make a script that grabs tracking information from the FedEx website.

I figured that if I just go to the URL 'https://www.fedex.com/fedextrack/?tracknumbers=' and paste the tracking number at the end of it, it brings me to the tracking page, which has the information I need.

I tried to feed the URL the tracking number and parse the HTML from the response.

This is what I tried:

import urllib

url_prefix= 'https://www.fedex.com/fedextrack/?tracknumbers='
tracking_number = '570573906561'
url = url_prefix + tracking_number
sock = urllib.urlopen(url)
htmlSource = sock.read()
sock.close()
print htmlSource

This code outputs: http://freetexthost.com/iy1ma2q1fm

I thought I would just be able to search the text of the output and find the delivery status/date, but it is not in this output.

If I go to the tracking page in Chrome and inspect the element, I see that the delivery date information has an id of destinationDateTime, so if I run this in the Chrome console:

document.getElementById('destinationDateTime')

it returns the output I want (the delivery date).

How come my Python script doesn't print the actual tracking data, or that element, in the HTML output?

I tried searching for this question and tried parsing the page in several different ways (Mechanize, Beautiful Soup, html2text), but all of them gave me the same output, which does not contain any actual data about the shipment.

Recommended answer

The website, like many others, won't work without JavaScript. The HTML you downloaded is only a shell: the page's JavaScript sends an HTTP POST request to a certain URL, which then returns the tracking data as a JSON-encoded object. That is why the data shows up in Chrome but not in the output of your script.

You'll need to simulate that with Python:

import requests
import json

tracking_number = '570573906561'

data = requests.post('https://www.fedex.com/trackingCal/track', data={
    'data': json.dumps({
        'TrackPackagesRequest': {
            'appType': 'wtrk',
            'uniqueKey': '',
            'processingParameters': {
                'anonymousTransaction': True,
                'clientId': 'WTRK',
                'returnDetailedErrors': True,
                'returnLocalizedDateTime': False
            },
            'trackingInfoList': [{
                'trackNumberInfo': {
                    'trackingNumber': tracking_number,
                    'trackingQualifier': '',
                    'trackingCarrier': ''
                }
            }]
        }
    }),
    'action': 'trackpackages',
    'locale': 'en_US',
    'format': 'json',
    'version': 99
}).json()

Then use the resulting object:

{
    "TrackPackagesResponse": {
        "successful": true,
        "passedLoggedInCheck": false,
        "errorList": [{
            "code": "0",
            "message": "Request was successfully processed.",
            "source": null
        }],
        "packageList": [{
            "trackingNbr": "570573906561",
            "trackingQualifier": "2456536000\u007e570573906561\u007eFX",
            "trackingCarrierCd": "FDXE",
            "trackingCarrierDesc": "FedEx Express",
            "displayTrackingNbr": "570573906561",
            "shipperCmpnyName": "",
            "shipperName": "",
            "shipperAddr1": "",
            "shipperAddr2": "",
            "shipperCity": "SEOUL",
            "shipperStateCD": "",
            "shipperZip": "",
            "shipperCntryCD": "KR",
            "shipperPhoneNbr": "",
            "shippedBy": "",
            "recipientCmpnyName": "",
            "recipientName": "",
            "recipientAddr1": "",
            "recipientAddr2": "",
            "recipientCity": "CHEK LAP KOK",
            "recipientStateCD": "",
            "recipientZip": "",
            "recipientCntryCD": "HK",
            "recipientPhoneNbr": "",
            "shippedTo": "",
            "keyStatus": "Delivered",
            "keyStatusCD": "DL",
            "lastScanStatus": "",
            "lastScanDateTime": "",
            "receivedByNm": ".CHOP",
            "subStatus": "Signed for by\u003a .CHOP",
            "mainStatus": "",
            "statusBarCD": "DL",
            "shortStatus": "",
            "shortStatusCD": "",
            "statusLocationAddr1": "",
            "statusLocationAddr2": "",
            "statusLocationCity": "CHEK LAP KOK",
            "statusLocationStateCD": "",
            "statusLocationZip": "",
            "statusLocationCntryCD": "HK",
            "statusWithDetails": "Delivered\u003a 9\u002f02\u002f2013 11\u003a58 am Signed for by\u003a.CHOP\u003b CHEK LAP KOK, HK",
            "shipDt": "2013\u002d08\u002d31T15\u003a00\u003a00\u002b09\u003a00",
            "displayShipDt": "8\u002f31\u002f2013",
            "displayShipTm": "3\u003a00 pm",
            "displayShipDateTime": "8\u002f31\u002f2013 3\u003a00 pm",
            "pickupDt": "2013\u002d08\u002d31T15\u003a00\u003a00\u002b09\u003a00",
            "displayPickupDt": "8\u002f31\u002f2013",
            "displayPickupTm": "3\u003a00 pm",
            "displayPickupDateTime": "8\u002f31\u002f2013 3\u003a00 pm",
            "estDeliveryDt": "",
            "estDeliveryTm": "",
            "displayEstDeliveryDt": "",
            "displayEstDeliveryTm": "",
            "displayEstDeliveryDateTime": "",
            "actDeliveryDt": "2013\u002d09\u002d02T11\u003a58\u003a00\u002b08\u003a00",
            "displayActDeliveryDt": "9\u002f02\u002f2013",
            "displayActDeliveryTm": "11\u003a58 am",
            "displayActDeliveryDateTime": "9\u002f02\u002f2013 11\u003a58 am",
            "nickName": "",
            "note": "",
            "matchedAccountList": [""],
            "fxfAdvanceETA": "",
            "fxfAdvanceReason": "",
            "fxfAdvanceStatusCode": "",
            "fxfAdvanceStatusDesc": "",
            "destLink": "",
            "originLink": "",
            "hasBillOfLadingImage": false,
            "hasBillPresentment": false,
            "signatureRequired": 0,
            "totalKgsWgt": "3.5",
            "displayTotalKgsWgt": "3.5 kgs",
            "totalLbsWgt": "7.8",
            "displayTotalLbsWgt": "7.8 lbs",
            "displayTotalWgt": "7.8 lbs \u002f 3.5 kgs",
            "pkgKgsWgt": "3.5",
            "displayPkgKgsWgt": "3.5 kgs",
            "pkgLbsWgt": "7.8",
            "displayPkgLbsWgt": "7.8 lbs",
            "displayPkgWgt": "7.8 lbs \u002f 3.5 kgs",
            "dimensions": "20x14x14 in.",
            "masterTrackingNbr": "",
            "masterQualifier": "",
            "masterCarrierCD": "",
            "originalOutboundTrackingNbr": null,
            "originalOutboundQualifier": "",
            "originalOutboundCarrierCD": "",
            "invoiceNbrList": [""],
            "referenceList": [""],
            "doorTagNbrList": [""],
            "referenceDescList": [""],
            "purchaseOrderNbrList": [""],
            "billofLadingNbrList": [""],
            "shipperRefList": ["PO\u00232612  Proton housing\u005fPlastics"],
            "rmaList": [""],
            "deptNbrList": [""],
            "shipmentIdList": [""],
            "tcnList": [""],
            "partnerCarrierNbrList": [""],
            "hasAssociatedShipments": false,
            "hasAssociatedReturnShipments": false,
            "assocShpGrp": 0,
            "drTgGrp": ["0"],
            "associationInfoList": [{
                "trackingNumberInfo": {
                    "trackingNumber": "",
                    "trackingQualifier": "",
                    "trackingCarrier": "",
                    "processingParameters": null
                },
                "associatedType": ""
            }],
            "returnReason": "",
            "returnRelationship": null,
            "skuItemUpcCdList": [""],
            "receiveQtyList": [""],
            "itemDescList": [""],
            "partNbrList": [""],
            "serviceCD": "INTERNATIONAL\u005fPRIORITY",
            "serviceDesc": "FedEx International Priority",
            "serviceShortDesc": "IP",
            "packageType": "YOUR\u005fPACKAGING",
            "packaging": "Your Packaging",
            "clearanceDetailLink": "",
            "showClearanceDetailLink": false,
            "manufactureCountryCDList": [""],
            "commodityCDList": [""],
            "commodityDescList": [""],
            "cerNbrList": [""],
            "cerComplaintCDList": [""],
            "cerComplaintDescList": [""],
            "cerEventDateList": [""],
            "displayCerEventDateList": [""],
            "totalPieces": "1",
            "specialHandlingServicesList": ["Deliver Weekday", "Weekend Pick\u002dUp"],
            "shipmentType": "",
            "pkgContentDesc1": "",
            "pkgContentDesc2": "",
            "docAWBNbr": "",
            "originalCharges": "",
            "transportationCD": "",
            "transportationDesc": "",
            "dutiesAndTaxesCD": "",
            "dutiesAndTaxesDesc": "",
            "origPieceCount": "",
            "destPieceCount": "",
            "goodsClassificationCD": "",
            "receipientAddrQty": "0",
            "deliveryAttempt": "0",
            "codReturnTrackNbr": "",
            "scanEventList": [{
                "date": "2013\u002d09\u002d02",
                "time": "11\u003a58\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "Delivered",
                "statusCD": "DL",
                "scanLocation": "CHEK LAP KOK HK",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": true
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "09\u003a36\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "On FedEx vehicle for delivery",
                "statusCD": "OD",
                "scanLocation": "LANTAU ISLAND HK",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "08\u003a55\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "At local FedEx facility",
                "statusCD": "AR",
                "scanLocation": "LANTAU ISLAND HK",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "07\u003a12\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "International shipment release \u002d Import",
                "statusCD": "CC",
                "scanLocation": "LANTAU ISLAND HK",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "04\u003a40\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "Shipment exception",
                "statusCD": "SE",
                "scanLocation": "GUANGZHOU CN",
                "scanDetails": "Delay beyond our control",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "03\u003a45\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "Departed FedEx location",
                "statusCD": "DP",
                "scanLocation": "GUANGZHOU CN",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d02",
                "time": "01\u003a17\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "Arrived at FedEx location",
                "statusCD": "AR",
                "scanLocation": "GUANGZHOU CN",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d01",
                "time": "23\u003a10\u003a00",
                "gmtOffset": "\u002b08\u003a00",
                "status": "In transit",
                "statusCD": "IT",
                "scanLocation": "SHANGHAI CN",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d09\u002d01",
                "time": "17\u003a13\u003a00",
                "gmtOffset": "\u002b09\u003a00",
                "status": "In transit",
                "statusCD": "IT",
                "scanLocation": "INCHEON KR",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d08\u002d31",
                "time": "19\u003a44\u003a00",
                "gmtOffset": "\u002b09\u003a00",
                "status": "In transit",
                "statusCD": "IT",
                "scanLocation": "INCHEON KR",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d08\u002d31",
                "time": "16\u003a27\u003a00",
                "gmtOffset": "\u002b09\u003a00",
                "status": "Left FedEx origin facility",
                "statusCD": "DP",
                "scanLocation": "SEOUL KR",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d08\u002d31",
                "time": "15\u003a00\u003a00",
                "gmtOffset": "\u002b09\u003a00",
                "status": "Picked up",
                "statusCD": "PU",
                "scanLocation": "SEOUL KR",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }, {
                "date": "2013\u002d08\u002d30",
                "time": "23\u003a58\u003a11",
                "gmtOffset": "\u002d05\u003a00",
                "status": "Shipment information sent to FedEx",
                "statusCD": "OC",
                "scanLocation": "",
                "scanDetails": "",
                "scanDetailsHtml": "",
                "rtrnShprTrkNbr": "",
                "isDelException": false,
                "isClearanceDelay": false,
                "isException": false,
                "isDelivered": false
            }],
            "originAddr1": "",
            "originAddr2": "",
            "originCity": "SEOUL",
            "originStateCD": "",
            "originZip": "",
            "originCntryCD": "KR",
            "originLocationID": "",
            "originTermCity": "SEOUL",
            "originTermStateCD": "",
            "destLocationAddr1": "",
            "destLocationAddr2": "",
            "destLocationCity": "LANTAU ISLAND",
            "destLocationStateCD": "",
            "destLocationZip": "",
            "destLocationCntryCD": "HK",
            "destLocationID": "",
            "destLocationTermCity": "LANTAU ISLAND",
            "destLocationTermStateCD": "",
            "destAddr1": "",
            "destAddr2": "",
            "destCity": "CHEK LAP KOK",
            "destStateCD": "",
            "destZip": "",
            "destCntryCD": "HK",
            "halAddr1": "",
            "halAddr2": "",
            "halCity": "",
            "halStateCD": "",
            "halZipCD": "",
            "halCntryCD": "",
            "actualDelAddrCity": "CHEK LAP KOK",
            "actualDelAddrStateCD": "",
            "actualDelAddrZipCD": "",
            "actualDelAddrCntryCD": "HK",
            "totalTransitMiles": "",
            "excepReasonList": [""],
            "excepActionList": [""],
            "exceptionReason": "",
            "exceptionAction": "",
            "statusDetailsList": [""],
            "trackErrCD": "",
            "destTZ": "\u002b08\u003a00",
            "originTZ": "\u002b09\u003a00",
            "isMultiStat": "0",
            "multiStatList": [{
                "multiPiec": "",
                "multiTm": "",
                "multiDispTm": "",
                "multiSta": ""
            }],
            "maskMessage": "",
            "deliveryService": "",
            "milestoDestination": "",
            "terms": "",
            "originUbanizationCode": "",
            "originCountryName": "",
            "isOriginResidential": false,
            "halUrbanizationCD": "",
            "halCountryName": "",
            "actualDelAddrUrbanizationCD": "",
            "actualDelAddrCountryName": "",
            "destUrbanizationCD": "",
            "destCountryName": "",
            "delToDesc": "Shipping\u002fReceiving",
            "recpShareID": "",
            "shprShareID": "9mbo6hrq0tqxo1i4pr7kp2yp",
            "defaultCDOType": "CDO",
            "mpstype": "",
            "fxfAdvanceNotice": true,
            "rthavailableCD": "",
            "excepReasonListNoInit": [""],
            "excepActionListNoInit": [""],
            "statusDetailsListNoInit": [""],
            "matched": false,
            "isSuccessful": true,
            "errorList": [{
                "code": "",
                "message": "",
                "source": null
            }],
            "isCanceled": false,
            "isPrePickup": false,
            "isPickup": false,
            "isInTransit": false,
            "isInProgress": true,
            "isDelException": false,
            "isClearanceDelay": false,
            "isException": false,
            "isDelivered": true,
            "isHAL": false,
            "isOnSchedule": false,
            "isDeliveryToday": false,
            "isSave": false,
            "isWatch": false,
            "isHistorical": false,
            "isTenderedNotification": false,
            "isDeliveredNotification": true,
            "isExceptionNotification": false,
            "isCurrentStatusNotification": false,
            "isAnticipatedShipDtLabel": false,
            "isShipPickupDtLabel": true,
            "isActualPickupLabel": false,
            "isOrderReceivedLabel": false,
            "isEstimatedDeliveryDtLabel": true,
            "isDeliveryDtLabel": false,
            "isActualDeliveryDtLabel": true,
            "isOrderCompleteLabel": false,
            "isOutboundDirection": false,
            "isInboundDirection": false,
            "isThirdpartyDirection": false,
            "isUnknownDirection": false,
            "isFSM": false,
            "isReturn": false,
            "isOriginalOutBound": false,
            "isChildPackage": false,
            "isParentPackage": false,
            "isReclassifiedAsSingleShipment": false,
            "isDuplicate": false,
            "isMaskShipper": false,
            "isHalEligible": false,
            "isFedexOfficeOnlineOrders": false,
            "isFedexOfficeInStoreOrders": false,
            "isMultipleStop": false,
            "isCustomCritical": false,
            "isInvalid": false,
            "isNotFound": false,
            "isFreight": false,
            "isSpod": true,
            "isSignatureAvailable": false,
            "isMPS": false,
            "isGMPS": false,
            "isResidential": false,
            "isDestResidential": true,
            "isHALResidential": false,
            "isActualDelAddrResidential": false,
            "isReqEstDelDt": false,
            "isCDOEligible": false,
            "CDOInfoList": [{
                "spclInstructDesc": "",
                "delivOptn": "",
                "delivOptnStatus": "",
                "reqApptWdw": "",
                "reqApptDesc": "",
                "rerouteTRKNbr": "",
                "beginTm": "",
                "endTm": ""
            }],
            "CDOExists": false,
            "isMtchdByRecShrID": false,
            "isMtchdByShiprShrID": false
        }]
    }
}
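
Pulling the delivery status and date out of that object could look roughly like the sketch below. It assumes Python 3.7+ and that the response keeps the shape shown above. The `summarize` helper and the trimmed `sample` dict are illustrative names only, and the field names are copied from the response above, so they may change if FedEx changes its site.

```python
from datetime import datetime

# Trimmed copy of the response shown above; only the fields used here are kept.
sample = {
    "TrackPackagesResponse": {
        "successful": True,
        "packageList": [{
            "trackingNbr": "570573906561",
            "keyStatus": "Delivered",
            "actDeliveryDt": "2013-09-02T11:58:00+08:00",
            "scanEventList": [
                {"date": "2013-09-02", "time": "11:58:00",
                 "status": "Delivered", "scanLocation": "CHEK LAP KOK HK"},
                {"date": "2013-08-31", "time": "15:00:00",
                 "status": "Picked up", "scanLocation": "SEOUL KR"},
            ],
        }],
    }
}


def summarize(response):
    """Hypothetical helper: summarize the first package, or return None."""
    body = response.get("TrackPackagesResponse", {})
    packages = body.get("packageList") or []
    if not body.get("successful") or not packages:
        return None

    package = packages[0]

    # actDeliveryDt is an empty string until the package is delivered.
    delivered_at = None
    if package.get("actDeliveryDt"):
        delivered_at = datetime.fromisoformat(package["actDeliveryDt"])

    # Scan events arrive newest first in the response above.
    events = [
        (e["date"], e["time"], e["status"], e["scanLocation"])
        for e in package.get("scanEventList", [])
    ]
    return {
        "status": package.get("keyStatus"),
        "delivered_at": delivered_at,
        "events": events,
    }


summary = summarize(sample)
print(summary["status"], summary["delivered_at"])
```

With the real request, `summarize(data)` would take the place of `summarize(sample)`. Returning None for an unsuccessful or empty response keeps the caller from indexing into a missing `packageList`.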
