Parsing YouTube URLs

Question

I've written a Ruby YouTube URL parser. It takes a YouTube URL in one of the following structures (these are the ones I could find so far; there may be more):

http://youtu.be/sGE4HMvDe-Q
http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1

The aim is to save just the id of the clip or playlist so that it can be embedded. For a clip that is 'sGE4HMvDe-Q'; for a playlist it is 'p/A0C3C1D163BE880A'.

The parser I wrote works fine for these URLs, but it seems a bit brittle and long-winded. Could someone suggest a nicer Ruby approach to this problem?

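def parse_youtube(url)
    # Drop the scheme, then split the host and path into segments.
    a = url.split('//').last.split('/')
    # Take the last segment and strip 'watch?v=' and any query parameters.
    b = a.last.split('watch?v=').last.split('?').first.split('&').first
    # Playlist URLs have 'p' as the first path segment.
    if a[1] == 'p'
        "p/#{b}"
    else
        b
    end
end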

Recommended answer

def parse_youtube url
   regex = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/
   url.match(regex)[1]
end

urls = %w[http://youtu.be/sGE4HMvDe-Q 
          http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
          http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1]

urls.each {|url| puts parse_youtube url }
# sGE4HMvDe-Q
# Lp7E973zozc
# p/A0C3C1D163BE880A

Depending on how you use this, you might want better validation that the URL is indeed from YouTube.
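One way to add that validation is to check the host before applying the regex. The following is only a minimal sketch: the host list in `YOUTUBE_HOSTS` and the method name `parse_youtube_checked` are assumptions made for illustration, not part of the original answer, and the list is not meant to be exhaustive.

```ruby
require 'uri'

# Hosts treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = %w[youtu.be youtube.com www.youtube.com].freeze

# Same pattern as in the answer above.
YOUTUBE_ID_REGEX = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/

# Returns the clip or playlist id, or nil when the URL is malformed,
# is not on a listed host, or does not match the pattern.
def parse_youtube_checked(url)
  host = URI.parse(url).host
  return nil unless YOUTUBE_HOSTS.include?(host)

  match = url.match(YOUTUBE_ID_REGEX)
  match && match[1]
rescue URI::InvalidURIError
  nil
end

puts parse_youtube_checked('http://youtu.be/sGE4HMvDe-Q')
puts parse_youtube_checked('http://example.com/watch?v=abc').inspect
```

Returning nil instead of calling `[1]` directly on the match also avoids a NoMethodError when the URL does not match at all.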

Update:

Coming back to this a few years later. I've always been annoyed by how sloppy the original answer was. Since the YouTube domain wasn't being validated anyway, I've removed some of the slop.

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    .                        any character except \n
--------------------------------------------------------------------------------
    be                       'be'
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    watch                    'watch'
--------------------------------------------------------------------------------
    \?                       '?'
--------------------------------------------------------------------------------
    v=                       'v='
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    \/                       '/'
--------------------------------------------------------------------------------
    (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
      p                        'p'
--------------------------------------------------------------------------------
      \/                       '/'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\w\/\-]+                any character of: word characters (a-z,
                             A-Z, 0-9, _), '\/', '\-' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
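As the breakdown notes, the leading `.` matches any character, so the first alternative accepts any character followed by 'be/', not only a literal '.be/'. The sketch below compares the answer's pattern with a hypothetical stricter variant that escapes the dot. The constant names `LOOSE` and `STRICT` and the example.com URL are made up for illustration.

```ruby
# Pattern from the answer: the leading dot is unescaped and matches any character.
LOOSE = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/

# Hypothetical stricter variant: the dot is escaped, so only a literal '.be/' matches.
STRICT = /(?:\.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/

other = 'http://example.com/tribe/abc123'
short = 'http://youtu.be/sGE4HMvDe-Q'

# String#[] with a regex and a group index returns that capture, or nil on no match.
p other[LOOSE, 1]   # "abc123" ('ibe/' satisfies '.be/')
p other[STRICT, 1]  # nil
p short[STRICT, 1]  # "sGE4HMvDe-Q"
```

Neither pattern checks the host, so escaping the dot narrows the match but is not a substitute for domain validation.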
