如何从Yahoo Finance中下载数据(限制为100行) [英] How to download data from Yahoo finance limited to 100 rows

查看:663
本文介绍了如何从Yahoo Finance中下载数据(限制为100行)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

所以我正在做这个项目,我必须从Yahoo Finance下载历史股票数据.得到了这段代码.它工作正常,但最多只能下载100行.我试图在网上扫描答案或其他代码(这只是excel录制的宏),但是我在YouTube上看到了一些使用他的解决方案的教程,就很好了.

So I'm doing this project where I have to download historical stock data from yahoo finance. Got this code. It's working fine, BUT it only downloads max 100 rows. I tried to scan the web for answers or a different code (this one is just recorded macro from excel) but I saw a few tutorials on YouTube that use his solution and it's just fine.

..那我不明白

Sub Makro6()

    ' Dowload stock prices from Yahoo Finance based on input

    Dim ws As Worksheet

    Set ws = Sheets("Data")

    'clear previous queries
    For Each qr In ThisWorkbook.Queries
        qr.Delete
    Next qr

    'clear Data sheet
    ws.Select
    Cells.Clear

    'clear graphs
    'ws.ChartObjects.Delete

    'stock = Sheets("Main").Range("A2")

    StartDate = toUnix(Sheets("Main").Range("A4"))
    EndDate = toUnix(Sheets("Main").Range("A6"))

    Application.CutCopyMode = False
    ActiveWorkbook.Queries.Add Name:="Table 2 (3)", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Zdroj = Web.Page(Web.Contents(""https://finance.yahoo.com/quote/" & stock & "/history?period1=" & StartDate & "&period2=" & EndDate & "&interval=1d&filter=history&frequency=1d""))," & Chr(13) & "" & Chr(10) & "    Data2 = Zdroj{2}[Data]," & Chr(13) & "" & Chr(10) & "    #""Změněný typ"" = Table.TransformColumnTypes(Data2,{{""Date"", type date}, {""Open"", type text}, {""High"", type text}, {""Low"", type text}, {""Close*"", type tex" & _
        "t}, {""Adj Close**"", type text}, {""Volume"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Změněný typ"""
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 2 (3)"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 2 (3)]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_2_3"
        .Refresh BackgroundQuery:=False
    End With
    Sheets("Data").Select

    '' Sort data by date from oldest to newest

    ws.ListObjects("Table_2_3").Sort.SortFields. _
        Clear
    ws.ListObjects("Table_2_3").Sort.SortFields. _
        Add2 Key:=Range("A1:A99"), SortOn:=xlSortOnValues, Order:=xlAscending, _
        DataOption:=xlSortNormal
    With ws.ListObjects("Table_2_3").Sort
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With

    Call DeleteDividends

    Call Stochastics


End Sub

该代码适用于其他网站.我尝试下载总数为120的Wikipedia页面列表,并且它加载数据没有问题.

The code works for other websites. I tried to download Wikipedia page list of total 120 and it loaded data no problem.

问题是来自Yahoo金融网站的数据是项目要求

The problem is the data from Yahoo finance website is a project requirement

推荐答案

如果对页面进行检查,您会发现HTMLTable行(准确地说是tbody)中最初仅存在100个结果.

If you check against the page you will discover only 100 results are initially present within the HTMLTable rows (tbody to be precise).

在浏览器元素选项卡搜索框中输入css选择器[data-test="historical-prices"] tbody tr( F12 打开开发工具),您将看到以下内容:

Enter the css selector [data-test="historical-prices"] tbody tr in the browser elements tab search box (F12 to open dev tools)and you will see this:

当您向下滚动页面时,其余行将从数据存储中动态馈送.当然,您当前的方法无法解决这些问题.实际上,您可以发出xhr请求,对容纳所有行的适当javascript对象进行正则表达式,然后使用json解析器进行解析.

The rest of the rows are fed dynamically from a data store as you scroll down the page. Of course, your current method doesn't pick up on these. You can in fact issue an xhr request, regex out the appropriate javascript object housing all the rows, and parse with a json parser.

以下是您目前应该在响应中看到的大致内容:

Here is roughly what you should currently see in response:

我使用jsonconverter.bas作为json解析器.从此处下载原始代码添加到名为jsonConverter的标准模块.然后,您需要转到VBE>工具>引用>添加对Microsoft脚本运行时的引用.

I use jsonconverter.bas as my json parser. Download raw code from here and add to standard module called jsonConverter . You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime.

@TimWilliams在此处编写了一个更好的unix转换函数,但是我认为我会在编写一些不同的东西方面发挥作用.我建议您坚持使用他,因为它更安全,更快捷.

@TimWilliams wrote a better unix conversion function here but I thought I would have a play at writing something different. I would advise you to stick with his as it is safer and faster.

VBA:

Option Explicit  
Public Sub GetYahooData()
    '<  VBE > Tools > References > Microsoft Scripting Runtime
    Dim json As Object, re As Object, s As String, xhr As Object, ws As Worksheet
    Dim startDate As String, endDate As String, stock As String

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set re = CreateObject("VBScript.RegExp")
    Set xhr = CreateObject("MSXML2.XMLHTTP")

    stock = "AAPL"
    startDate = "1534809600" 
    endDate = "1566345600"     

    With xhr
        .Open "GET", "https://finance.yahoo.com/quote/" & stock & "/history?period1=" & startDate & "&period2=" & endDate & "&interval=1d&filter=history&frequency=1d&_guc_consent_skip=" & GetCurrentUnix(Now()), False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        s = .responseText
    End With

    s = GetJsonString(re, s)
    If s = "No match" Then Exit Sub

    Set json = JsonConverter.ParseJson(s)
    WriteOutResults ws, json
End Sub

Public Sub WriteOutResults(ByVal ws As Worksheet, ByVal json As Object)
    Dim item As Object, key As Variant, headers(), results(), r As Long, c As Long

    headers = json.item(1).keys
    ReDim results(1 To json.Count, 1 To UBound(headers) + 1)

    For Each item In json
        r = r + 1: c = 1
        For Each key In item.keys
            results(r, c) = item(key)
            c = c + 1
        Next
    Next
    With ws
        .Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
        .Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
    End With      
End Sub

Public Function GetJsonString(ByVal re As Object, ByVal responseText As String) As String
    With re
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "HistoricalPriceStore"":{""prices"":(.*?\])" 'regex pattern to get json string
        If .test(responseText) Then
            GetJsonString = .Execute(responseText)(0).SubMatches(0)
        Else
            GetJsonString = "No match"
        End If
    End With
End Function

Public Function GetCurrentUnix(ByVal t As Double) As String
    With CreateObject("htmlfile")
        .parentWindow.execScript "function GetTimeStamp(t){return new Date(t).getTime() / 1000}", "jscript"
        GetCurrentUnix = .parentWindow.GetTimeStamp(Now)
    End With
End Function


正则表达式:

Python:

如果感兴趣的话,我最初是用python编写的:

I initially wrote as python if of interest:

import requests, re, json
from bs4 import BeautifulSoup as bs

p = re.compile('HistoricalPriceStore":{"prices":(.*?\])')
r = requests.get('https://finance.yahoo.com/quote/AAPL/history?period1=1534809600&period2=1566345600&interval=1d&filter=history&frequency=1d&_guc_consent_skip=1566859607')
data = json.loads(p.findall(r.text)[0])

这篇关于如何从Yahoo Finance中下载数据(限制为100行)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆