Programmatically look up a ticker symbol in R


Question

I have a field of data containing company names, such as

company <- data.frame(Company = c("Microsoft", "Apple", "Cloudera", "Ford"))
> company

  Company
1 Microsoft
2 Apple
3 Cloudera
4 Ford

and so on.

The package tm.plugin.webmining allows you to query data from Yahoo! Finance based on ticker symbols:

library(tm.plugin.webmining)
## build a corpus of Yahoo! Finance items for a known ticker symbol
results <- WebCorpus(YahooFinanceSource("MSFT"))

I'm missing the in-between step. How can I query ticker symbols programmatically based on company names?

Recommended answer

I couldn't manage to do this with the tm.plugin.webmining package, but I came up with a rough solution: pulling and parsing the data from this file: ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt. My first attempt, which used httr::content(httr::GET(...)), didn't work every time. I suspect that has to do with the type of address (ftp://), but I don't do much web scraping, so I can't really explain it. It seemed to work better on my Linux machine than on my Mac, but that could be irrelevant. Thanks to @thelatemail's comment, reading the file directly with read.csv() seems to work much more smoothly:

library(quantmod) ## optional
symbolData <- read.csv(
  "ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt",
  sep="|")
##
> head(symbolData,10)
   Symbol                                                   Security.Name Market.Category Test.Issue Financial.Status Round.Lot.Size
1    AAIT iShares MSCI All Country Asia Information Technology Index Fund               G          N                N            100
2     AAL                    American Airlines Group, Inc. - Common Stock               Q          N                N            100
3    AAME                    Atlantic American Corporation - Common Stock               G          N                N            100
4    AAOI                    Applied Optoelectronics, Inc. - Common Stock               G          N                N            100
5    AAON                                       AAON, Inc. - Common Stock               Q          N                N            100
6    AAPL                                       Apple Inc. - Common Stock               Q          N                N            100
7    AAVL                  Avalanche Biotechnologies, Inc. - Common Stock               G          N                N            100
8    AAWW                     Atlas Air Worldwide Holdings - Common Stock               Q          N                N            100
9    AAXJ               iShares MSCI All Country Asia ex Japan Index Fund               G          N                N            100
10   ABAC                        Aoxin Tianli Group, Inc. - Common Shares               S          N                N            100
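With a table like this in hand, the name-to-ticker step the question asks about can be sketched as a fuzzy match on the Security.Name column. The following is only a minimal sketch: lookupTicker() is a hypothetical helper (not part of any package), the three-row table is an illustrative stand-in for symbolData so the example runs offline, and the max.distance value is an assumption you would probably need to tune.

```r
## Illustrative stand-in for symbolData (same two columns, only a few rows)
symbols <- data.frame(
  Symbol = c("AAL", "AAPL", "MSFT"),
  Security.Name = c(
    "American Airlines Group, Inc. - Common Stock",
    "Apple Inc. - Common Stock",
    "Microsoft Corporation - Common Stock"
  ),
  stringsAsFactors = FALSE
)

## Hypothetical helper: fuzzy-match each company name against the name column
## and return every candidate symbol (NA when nothing matches)
lookupTicker <- function(names, table,
                         nameCol = "Security.Name",
                         symbolCol = "Symbol",
                         max.distance = 0.1) {
  rows <- lapply(names, function(nm) {
    hits <- agrep(nm, table[[nameCol]], ignore.case = TRUE,
                  max.distance = max.distance)
    if (length(hits) == 0) {
      return(data.frame(Company = nm,
                        Symbol = NA_character_,
                        Matched = NA_character_,
                        stringsAsFactors = FALSE))
    }
    data.frame(Company = nm,
               Symbol = as.character(table[[symbolCol]][hits]),
               Matched = as.character(table[[nameCol]][hits]),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

result <- lookupTicker(c("Microsoft", "Apple", "Cloudera"), symbols)
print(result)
```

On the full table a short name can match several securities, so the helper returns all candidates together with the matched security name and leaves the final choice to you. Names with no match, such as a company that is not listed on the exchange, come back with NA.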

As per @GSee's suggestion, a (presumably) more robust way to obtain the source data is with the stockSymbols() function in the package TTR:

> symbolData2 <- stockSymbols(exchange="NASDAQ")
Fetching NASDAQ symbols...
> ##
> head(symbolData2)
  Symbol                                                           Name LastSale    MarketCap IPOyear         Sector
1   AAIT iShares MSCI All Country Asia Information Technology Index Fun   34.556      6911200      NA           <NA>
2    AAL                                  American Airlines Group, Inc.   40.500  29164164453      NA Transportation
3   AAME                                  Atlantic American Corporation    4.020     83238028      NA        Finance
4   AAOI                                  Applied Optoelectronics, Inc.   20.510    303653114    2013     Technology
5   AAON                                                     AAON, Inc.   18.420   1013324613      NA  Capital Goods
6   AAPL                                                     Apple Inc.  103.300 618546661100    1980     Technology
                         Industry Exchange
1                            <NA>   NASDAQ
2   Air Freight/Delivery Services   NASDAQ
3                  Life Insurance   NASDAQ
4                  Semiconductors   NASDAQ
5 Industrial Machinery/Components   NASDAQ
6          Computer Manufacturing   NASDAQ

I don't know if you just wanted to get ticker symbols from names, but if you are also looking for actual share price information you could do something like this:

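namedStock <- function(name="Microsoft",
                       start=Sys.Date()-365,
                       end=Sys.Date()-1){
  ## fuzzy-match the name against column 2 (Security.Name);
  ## more than one ticker may come back
  ticker <- as.character(symbolData[agrep(name,symbolData[,2]),1])
  if (length(ticker) == 0) stop("no symbol found for: ", name)
  getSymbols(
    Symbols=ticker,
    src="yahoo",
    env=.GlobalEnv,
    from=start,to=end)
}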
##
## an xts object named MSFT will be added to
## the global environment, no need to assign
## to an object
namedStock()
##
> str(MSFT)
An ‘xts’ object on 2013-09-03/2014-08-29 containing:
  Data: num [1:251, 1:6] 31.8 31.4 31.1 31.3 31.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:6] "MSFT.Open" "MSFT.High" "MSFT.Low" "MSFT.Close" ...
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
List of 2
 $ src    : chr "yahoo"
 $ updated: POSIXct[1:1], format: "2014-09-02 21:51:22.792"
> chartSeries(MSFT)

So like I said, this isn't the cleanest solution, but hopefully it helps you out. Also note that my data source only covers companies listed on NASDAQ. That includes many major companies but not all of them (Ford, for example, trades on the NYSE), so you may want to combine it with other sources.
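Combining symbol tables from several sources could look roughly like the sketch below. It assumes each source has already been read into a data frame that has at least a symbol column and a name column under the same names. combineSymbolTables() is a hypothetical helper, and the two small tables are illustrative stand-ins rather than real downloads, so check the actual column names of whatever source you use. According to its documentation, stockSymbols() also accepts several exchanges at once, which may make this step unnecessary.

```r
## Illustrative stand-ins for tables obtained from two different sources
nasdaq <- data.frame(
  Symbol = c("AAPL", "MSFT"),
  Name = c("Apple Inc.", "Microsoft Corporation"),
  Exchange = "NASDAQ",
  stringsAsFactors = FALSE
)
nyse <- data.frame(
  Symbol = "F",
  Name = "Ford Motor Company",
  Exchange = "NYSE",
  IPOyear = NA,
  stringsAsFactors = FALSE
)

## Hypothetical helper: keep only the columns every table shares, then stack
combineSymbolTables <- function(...) {
  tables <- list(...)
  shared <- Reduce(intersect, lapply(tables, names))
  do.call(rbind, lapply(tables, function(tb) tb[, shared, drop = FALSE]))
}

allSymbols <- combineSymbolTables(nasdaq, nyse)

## The same fuzzy lookup now also finds companies listed outside NASDAQ
fordTicker <- allSymbols$Symbol[agrep("Ford", allSymbols$Name,
                                      ignore.case = TRUE)]
print(fordTicker)
```

Restricting the result to the shared columns keeps rbind() from failing when one source carries extra fields, at the cost of dropping those fields.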
