Get financial data using Python


Problem description

I have written some Python code with Selenium that navigates to a webpage containing financial data in several tables.

I want to extract that data and put it into Excel.

The tables appear to be plain HTML tables; a sample of the markup is below:

    <tr>
        <td class="bc2T bc2gt">Last update</td>
        <td class="bc2V bc2D">03/15/2018</td>
        <td class="bc2V bc2D">03/14/2019</td>
        <td class="bc2V bc2D">03/12/2020</td>
        <td class="bc2V bc2D" style="background-color:#DEFEFE;">05/22/2020</td>
        <td class="bc2V bc2D" style="background-color:#DEFEFE;">05/20/2020</td>
        <td class="bc2V bc2D" style="background-color:#DEFEFE;">05/18/2020</td>
    </tr>
    </table>

The table has the following class name: <table class='BordCollapseYear2' style="margin-right:20px; font-size:12px; width:100%;" cellspacing=0>

Is there a way I can extract this data? Ideally I want this to be dynamic so that it can extract information for different companies.

I have never used it, but I have seen the BeautifulSoup library mentioned a few times.

Take Microsoft as an example: https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/

From that page I would want to extract the income statement data, the balance sheet, and so on.

Solution

This script scrapes every table with the class BordCollapseYear2 on the page and pretty-prints each one:

import requests
from bs4 import BeautifulSoup

url = 'https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/'

soup = BeautifulSoup(requests.get(url).content, 'html.parser')

all_data = {}
# For every financial table found on the page...
for table in soup.select('table.BordCollapseYear2'):
    # The table's title is the closest preceding <b> element.
    table_name = table.find_previous('b').text
    all_data[table_name] = []
    # ...collect the text of every cell, row by row.
    for tr in table.select('tr'):
        row = [td.get_text(strip=True, separator=' ') for td in tr.select('td')]
        # Keep only full rows: one label plus six period columns.
        if len(row) == 7:
            all_data[table_name].append(row)

# Pretty-print all data:
for k, v in all_data.items():
    print('Table name: {}'.format(k))
    print('-' * 160)
    for row in v:
        print(('{:<25}'*7).format(*row))
    print()

Prints:

Table name: Valuation
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Capitalization 1         532 175                  757 640                  1 026 511                1 391 637                -                        -                        
Entreprise Value (EV) 1  485 388                  700 112                  964 870                  1 315 823                1 299 246                1 276 659                
P/E ratio                25,4x                    46,3x                    26,5x                    32,3x                    29,7x                    25,8x                    
Yield                    2,26%                    1,70%                    1,37%                    1,10%                    1,18%                    1,31%                    
Capitalization / Revenue 5,51x                    6,87x                    8,16x                    9,81x                    8,89x                    7,95x                    
EV / Revenue             5,02x                    6,34x                    7,67x                    9,28x                    8,30x                    7,30x                    
EV / EBITDA              12,7x                    15,4x                    17,7x                    20,2x                    18,3x                    15,9x                    
Cours sur Actif net      7,46x                    9,15x                    10,0x                    12,1x                    10,1x                    8,49x                    
Nbr of stocks (in thousands)7 720 510                7 683 198                7 662 818                7 583 440                -                        -                        
Reference price (USD)    68,9                     98,6                     134                      184                      184                      184                      
Last update              07/20/2017               07/19/2018               07/18/2019               05/08/2020               04/30/2020               04/30/2020               

Table name: Annual Income Statement Data
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Net sales 1              96 657                   110 360                  125 843                  141 818                  156 534                  174 945                  
EBITDA 1                 38 117                   45 319                   54 641                   65 074                   70 966                   80 445                   
Operating profit (EBIT) 129 339                   35 058                   42 959                   52 544                   57 045                   65 289                   
Operating Margin         30,4%                    31,8%                    34,1%                    37,1%                    36,4%                    37,3%                    
Pre-Tax Profit (EBT) 1   23 149                   36 474                   43 688                   52 521                   57 042                   65 225                   
Net income 1             21 204                   16 571                   39 240                   43 693                   47 223                   53 905                   
Net margin               21,9%                    15,0%                    31,2%                    30,8%                    30,2%                    30,8%                    
EPS 2                    2,71                     2,13                     5,06                     5,68                     6,18                     7,11                     
Dividend per Share 2     1,56                     1,68                     1,84                     2,02                     2,16                     2,41                     
Last update              07/20/2017               07/19/2018               07/18/2019               05/22/2020               05/22/2020               05/22/2020               

Table name: Balance Sheet Analysis
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Net Debt 1               -                        -                        -                        -                        -                        -                        
Net Cash position 1      46 787                   57 528                   61 641                   75 814                   92 392                   114 978                  
Leverage (Debt / EBITDA) -1,23x                   -1,27x                   -1,13x                   -1,17x                   -1,30x                   -1,43x                   
Free Cash Flow 1         31 378                   32 252                   38 260                   41 953                   46 887                   53 155                   
ROE (Net Profit / Equities)29,4%                    19,4%                    42,4%                    36,6%                    34,5%                    36,1%                    
Shareholders' equity 1   72 195                   85 215                   92 524                   119 417                  136 690                  149 484                  
ROA (Net Profit / Asset) 9,76%                    6,51%                    14,4%                    18,5%                    14,6%                    14,7%                    
Assets 1                 217 276                  254 580                  272 703                  235 800                  323 445                  366 702                  
Book Value Per Share 2   9,24                     10,8                     13,4                     15,2                     18,2                     21,6                     
Cash Flow per Share 2    5,04                     5,63                     6,73                     7,03                     8,02                     9,79                     
Capex 1                  8 129                    11 632                   13 925                   15 698                   17 922                   19 507                   
Capex / Sales            8,41%                    10,5%                    11,1%                    11,1%                    11,4%                    11,2%                    
Last update              07/20/2017               07/19/2018               07/18/2019               05/22/2020               05/22/2020               05/04/2020               
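
In the output above, labels longer than 25 characters (for example "Nbr of stocks (in thousands)") run into the first value, because the script pads every column to a fixed width. One possible way around this is to size each column to its widest cell. The following is a minimal sketch, assuming every row in a table has the same number of cells, as the rows collected by the script do; format_table is a hypothetical helper name, not part of any library.

```python
def format_table(rows, gap=2):
    """Return rows as aligned text, sizing each column to its widest cell.

    Assumes all rows have the same number of cells.
    """
    if not rows:
        return ''
    # Widest cell in each column, computed across all rows.
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width + gap) for cell, width in zip(row, widths)]
        # Trailing padding on the last column is not needed.
        lines.append(''.join(cells).rstrip())
    return '\n'.join(lines)


# Sample cells copied from the output above (first three columns only).
sample = [
    ['Fiscal Period: June', '2017', '2018'],
    ['Nbr of stocks (in thousands)', '7 720 510', '7 683 198'],
]
print(format_table(sample))
```

With the script's data, this could replace the inner printing loop with print(format_table(v)) for each table.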

EDIT (to save all_data as a CSV file):

import csv

with open('data.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for k, v in all_data.items():
        spamwriter.writerow([k])
        for row in v:
            spamwriter.writerow(row)
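
The cells are written exactly as the site formats them: spaces as thousands separators, commas as decimal marks, and suffixes such as x or %. A spreadsheet may therefore import them as text rather than numbers. The following is a minimal sketch of a converter, assuming these are the only formats present (this is inferred from the printed output, not from any documentation of the site); to_number is a hypothetical helper name.

```python
def to_number(text):
    """Convert a scraped cell such as '1 026 511', '25,4x' or '2,26%'.

    Returns None for an empty cell or the '-' placeholder, a float when
    the cleaned text parses as a number, and the original string otherwise
    (labels, dates). Percentages keep their scale: '2,26%' becomes 2.26.
    Plain years such as '2017' also become floats.
    """
    cleaned = text.strip()
    if cleaned in ('', '-'):
        return None
    # Drop the ratio/percent suffix and the thousands separators
    # (regular or non-breaking spaces), then use '.' as the decimal mark.
    cleaned = cleaned.rstrip('x%').replace(' ', '').replace('\xa0', '')
    cleaned = cleaned.replace(',', '.')
    try:
        return float(cleaned)
    except ValueError:
        return text


# Example: convert the value cells of one row, keeping the label as text.
row = ['P/E ratio', '25,4x', '46,3x', '-']
print([row[0]] + [to_number(cell) for cell in row[1:]])
```

Applying the conversion to each row before calling writerow would let the spreadsheet treat the values as numbers; whether ratios and percentages should keep their suffix is a design choice left open here.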

Screenshot from LibreOffice: [image not included]
