Parsing YouTube URLs
Question
I've written a Ruby YouTube URL parser. It takes a YouTube URL in one of the following structures (these are the ones I could find so far; there may be more):
http://youtu.be/sGE4HMvDe-Q
http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1
The aim is to save just the id of the clip or playlist so that it can be embedded: 'sGE4HMvDe-Q' for a clip, or 'p/A0C3C1D163BE880A' for a playlist.
The parser I wrote works fine for these URLs, but it seems a bit brittle and long-winded. Could someone suggest a nicer Ruby approach to this problem?
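def parse_youtube(url)
  # Drop the scheme, then split the remainder into host and path segments.
  a = url.split('//').last.split('/')
  # Take the last segment and strip 'watch?v=' and any query parameters.
  b = a.last.split('watch?v=').last.split('?').first.split('&').first
  if a[1] == 'p'
    url = "p/#{b}"
  else
    url = b
  end
end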
Recommended answer
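def parse_youtube(url)
  regex = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/
  # String#[] with a capture index returns nil when nothing matches,
  # whereas url.match(regex)[1] would raise NoMethodError on nil.
  url[regex, 1]
end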
urls = %w[http://youtu.be/sGE4HMvDe-Q
          http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
          http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1]
urls.each { |url| puts parse_youtube(url) }
# sGE4HMvDe-Q
# Lp7E973zozc
# p/A0C3C1D163BE880A
Depending on how you use this, you might want stricter validation that the URL really is a YouTube URL.
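One way to add that check is to parse the URL with Ruby's standard `URI` library and compare the host against an allow-list before applying the regex. The following is only a sketch under assumptions: the method name `parse_youtube_checked` and the list of accepted hosts are hypothetical and not part of the original answer.

```ruby
require 'uri'

# Hypothetical helper: returns the clip or playlist id, or nil when the URL
# is malformed, is not on an accepted host, or does not match the regex.
def parse_youtube_checked(url)
  # Assumed allow-list; extend it if other YouTube hosts should be accepted.
  youtube_hosts = %w[youtu.be youtube.com www.youtube.com]
  regex = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/

  uri = URI.parse(url)
  return nil unless youtube_hosts.include?(uri.host.to_s.downcase)

  url[regex, 1]
rescue URI::InvalidURIError
  nil
end

puts parse_youtube_checked('http://youtu.be/sGE4HMvDe-Q')
puts parse_youtube_checked('http://example.com/watch?v=Lp7E973zozc').inspect
```

Returning nil for rejected input keeps the caller's handling in one place; raising an error instead would be an equally reasonable choice.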
Update:
Coming back to this a few years later: I've always been annoyed by how sloppy the original answer was. Since the YouTube domain wasn't being validated anyway, I've removed some of the slop.
NODE                     EXPLANATION
--------------------------------------------------------------------------------
(?:                      group, but do not capture:
--------------------------------------------------------------------------------
  .                        any character except \n
--------------------------------------------------------------------------------
  be                       'be'
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  watch                    'watch'
--------------------------------------------------------------------------------
  \?                       '?'
--------------------------------------------------------------------------------
  v=                       'v='
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  \/                       '/'
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    p                        'p'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
)                        end of grouping
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
  [\w\/\-]+                any character of: word characters (a-z,
                           A-Z, 0-9, _), '\/', '\-' (1 or more
                           times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
)                        end of \1
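
The look-ahead in the breakdown above is what keeps the 'p/' prefix in the playlist id: it checks that 'p/' follows the slash without consuming it, so the capture group starts at 'p/'. A small sketch contrasting it with a plain match; the two helper method names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper: 'p/' is checked by the look-ahead but not consumed,
# so the capture group begins at 'p/'.
def capture_with_lookahead(url)
  url[/\/(?=p\/)([\w\/\-]+)/, 1]
end

# Hypothetical helper: 'p/' is consumed before the capture group begins,
# so only the bare id is captured.
def capture_without_lookahead(url)
  url[/\/p\/([\w\/\-]+)/, 1]
end

playlist = 'http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1'
puts capture_with_lookahead(playlist)     # p/A0C3C1D163BE880A
puts capture_without_lookahead(playlist)  # A0C3C1D163BE880A
```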