Programmatically look up a ticker symbol in R
Question
I have a field of data containing company names, such as:
company <- data.frame(Company = c("Microsoft", "Apple", "Cloudera", "Ford"))
> company
Company
1 Microsoft
2 Apple
3 Cloudera
4 Ford
and so on.
The package tm.plugin.webmining allows you to query data from Yahoo! Finance based on ticker symbols:
require(tm.plugin.webmining)
results <- WebCorpus(YahooFinanceSource("MSFT"))
I'm missing the in-between step. How can I query ticker symbols programmatically based on company names?
Recommended answer
I couldn't manage to do this with the tm.plugin.webmining package, but I came up with a rough solution: pulling and parsing data from this web file: ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt. I say rough because, for some reason, my calls with httr::content(httr::GET(...)) didn't work every time. I think it has to do with the type of web address (ftp://), but I don't do much web scraping, so I can't really explain this. It seemed to work better on my Linux machine than on my Mac, but that could be irrelevant. Thanks to @thelatemail's comment, the approach below seems to work much more smoothly. Regardless, here's what I got:
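library(quantmod) ## only needed later, for getSymbols() and chartSeries()
## the file is pipe-delimited; keep text columns as character, not factor
symbolData <- read.csv(
  "ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt",
  sep = "|", stringsAsFactors = FALSE)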
##
> head(symbolData,10)
Symbol Security.Name Market.Category Test.Issue Financial.Status Round.Lot.Size
1 AAIT iShares MSCI All Country Asia Information Technology Index Fund G N N 100
2 AAL American Airlines Group, Inc. - Common Stock Q N N 100
3 AAME Atlantic American Corporation - Common Stock G N N 100
4 AAOI Applied Optoelectronics, Inc. - Common Stock G N N 100
5 AAON AAON, Inc. - Common Stock Q N N 100
6 AAPL Apple Inc. - Common Stock Q N N 100
7 AAVL Avalanche Biotechnologies, Inc. - Common Stock G N N 100
8 AAWW Atlas Air Worldwide Holdings - Common Stock Q N N 100
9 AAXJ iShares MSCI All Country Asia ex Japan Index Fund G N N 100
10 ABAC Aoxin Tianli Group, Inc. - Common Shares S N N 100
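Given a table like the one above, the name-to-ticker step the question asks about can be sketched as follows. This is only an illustration, not part of the original answer: `symbolSample` is a small hand-made stand-in for `symbolData` (so the code runs without a network connection), and `lookupTickers()` is a hypothetical helper name. It uses `agrep()` for approximate matching against the `Security.Name` column.

```r
## Hand-made stand-in for symbolData (illustrative rows only).
symbolSample <- data.frame(
  Symbol = c("AAL", "AAOI", "AAPL", "MSFT"),
  Security.Name = c(
    "American Airlines Group, Inc. - Common Stock",
    "Applied Optoelectronics, Inc. - Common Stock",
    "Apple Inc. - Common Stock",
    "Microsoft Corporation - Common Stock"),
  stringsAsFactors = FALSE)

## Hypothetical helper: returns one row per candidate match, or a
## single NA row when a company name matches nothing in the table.
lookupTickers <- function(names, symbolTable, maxDistance = 0.1) {
  hits <- lapply(names, function(nm) {
    idx <- agrep(nm, symbolTable$Security.Name,
                 max.distance = maxDistance, ignore.case = TRUE)
    if (length(idx) == 0) {
      data.frame(Company = nm, Symbol = NA_character_,
                 Security.Name = NA_character_, stringsAsFactors = FALSE)
    } else {
      data.frame(Company = nm,
                 Symbol = as.character(symbolTable$Symbol[idx]),
                 Security.Name = as.character(symbolTable$Security.Name[idx]),
                 stringsAsFactors = FALSE)
    }
  })
  do.call(rbind, hits)
}

res <- lookupTickers(c("Microsoft", "Apple", "Cloudera"), symbolSample)
print(res)
```

Because the matching is fuzzy, a short name may return more than one candidate (here "Apple" can also hit "Applied Optoelectronics"), so the result is best treated as a list of candidates to review rather than a definitive mapping.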
As per @GSee's suggestion, a (presumably) more robust way to obtain the source data is with the stockSymbols() function in the TTR package:
> symbolData2 <- stockSymbols(exchange="NASDAQ")
Fetching NASDAQ symbols...
> ##
> head(symbolData2)
Symbol Name LastSale MarketCap IPOyear Sector
1 AAIT iShares MSCI All Country Asia Information Technology Index Fun 34.556 6911200 NA <NA>
2 AAL American Airlines Group, Inc. 40.500 29164164453 NA Transportation
3 AAME Atlantic American Corporation 4.020 83238028 NA Finance
4 AAOI Applied Optoelectronics, Inc. 20.510 303653114 2013 Technology
5 AAON AAON, Inc. 18.420 1013324613 NA Capital Goods
6 AAPL Apple Inc. 103.300 618546661100 1980 Technology
Industry Exchange
1 <NA> NASDAQ
2 Air Freight/Delivery Services NASDAQ
3 Life Insurance NASDAQ
4 Semiconductors NASDAQ
5 Industrial Machinery/Components NASDAQ
6 Computer Manufacturing NASDAQ
I don't know if you just wanted to get ticker symbols from names, but if you are also looking for actual share price information, you could do something like this:
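## requires library(quantmod) for getSymbols()
namedStock <- function(name = "Microsoft",
                       start = Sys.Date() - 365,
                       end = Sys.Date() - 1) {
  ## fuzzy-match the name against Security.Name (column 2) and
  ## take the matching ticker(s) from Symbol (column 1)
  ticker <- as.character(symbolData[agrep(name, symbolData[, 2]), 1])
  getSymbols(
    Symbols = ticker,
    src = "yahoo",
    env = .GlobalEnv,
    from = start, to = end)
}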
##
## an xts object named MSFT will be added to
## the global environment, no need to assign
## to an object
namedStock()
##
> str(MSFT)
An ‘xts’ object on 2013-09-03/2014-08-29 containing:
Data: num [1:251, 1:6] 31.8 31.4 31.1 31.3 31.2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "MSFT.Open" "MSFT.High" "MSFT.Low" "MSFT.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2014-09-02 21:51:22.792"
> chartSeries(MSFT)
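One caveat with the agrep() lookup inside namedStock(): fuzzy matching can return several tickers for one name, and all of them would then be passed to getSymbols(). A possible refinement, which is an assumption on my part and not something the original answer does, is to rank the candidates by partial edit distance with `adist()` and keep the closest one. `bestTicker()` and the `candidates` table are hypothetical names used only for this sketch.

```r
## Illustrative candidates, using names from the table shown earlier.
candidates <- data.frame(
  Symbol = c("AAOI", "AAPL"),
  Security.Name = c("Applied Optoelectronics, Inc. - Common Stock",
                    "Apple Inc. - Common Stock"),
  stringsAsFactors = FALSE)

## Hypothetical helper: fuzzy-match with agrep(), then pick the
## candidate whose name contains the closest substring to `name`.
bestTicker <- function(name, symbolTable) {
  secNames <- as.character(symbolTable$Security.Name)
  idx <- agrep(name, secNames, ignore.case = TRUE)
  if (length(idx) == 0) return(NA_character_)
  ## partial = TRUE measures distance to the best-matching substring,
  ## the same notion of distance that agrep() uses by default
  d <- adist(name, secNames[idx], partial = TRUE, ignore.case = TRUE)
  as.character(symbolTable$Symbol[idx][which.min(d)])
}

best <- bestTicker("Apple", candidates)
none <- bestTicker("Cloudera", candidates)
print(best)
```

Ties are resolved by table order here, which is arbitrary; for anything important, the full candidate list is probably still worth inspecting by hand.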
So, like I said, this isn't the cleanest solution, but hopefully it helps you out. Also note that my data source only lists companies traded on NASDAQ (which covers many major companies, but not those listed only on other exchanges such as the NYSE); you could easily combine this with other sources.
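Combining sources could look something like the sketch below. It assumes each exchange's table has been reduced to the same columns (Symbol, Name, Exchange, as in the stockSymbols() output above); the two small data frames are hand-made placeholders, not downloaded data, and in practice the second one might come from another stockSymbols() call or a different symbol file.

```r
## Placeholder per-exchange tables with matching columns.
nasdaqSample <- data.frame(
  Symbol = c("AAL", "AAPL"),
  Name = c("American Airlines Group, Inc.", "Apple Inc."),
  Exchange = "NASDAQ",
  stringsAsFactors = FALSE)

nyseSample <- data.frame(
  Symbol = "F",
  Name = "Ford Motor Company",
  Exchange = "NYSE",
  stringsAsFactors = FALSE)

## Stack the tables, then run the same fuzzy lookup across both.
allSymbols <- rbind(nasdaqSample, nyseSample)
fordHits <- allSymbols[agrep("Ford", allSymbols$Name), ]
print(fordHits)
```

Keeping the Exchange column makes it easy to see where each match came from if the same company name turns up in more than one listing.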