How to parse a URL and extract the required substring


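Problem description

Say I have a string like this: "http://something.example.com/directory/"

What I want to do is parse this string and extract the "something" from it.

The first step, obviously, is to check that the string contains "http://"; otherwise, the string should be ignored.

But how do I then extract just the "something" from that string? Assume that all the strings being evaluated have a similar structure (i.e. I am trying to extract the subdomain of the URL, provided the string being examined is indeed a valid URL, where "valid" means it starts with "http://").

Thanks.

P.S. I know how to check the first part, i.e. I can simply split the string at the "http://", but that doesn't solve the full problem because that will produce "http://something.example.com/directory/". All I want is the "something", nothing else.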

Solution

I'd do it this way:

require 'uri'

# Parse the URL, then take the first dot-separated label of the host.
uri = URI.parse('http://something.example.com/directory/')
uri.host.split('.').first
# => "something"

URI is built into Ruby. It isn't the most full-featured library, but it's more than capable of handling this task for most URLs. If you have IRIs, look at Addressable::URI.
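
To also cover the "ignore anything that isn't http://" requirement from the question, the same idea could be wrapped in a small helper. This is only a sketch: the method name first_host_label is made up for illustration and is not part of the original answer, and it assumes that "valid" simply means the string starts with "http://" and that URI can parse it.

```ruby
require 'uri'

# Hypothetical helper (not from the original answer): returns the first
# dot-separated label of the host for an http:// URL, or nil when the
# string should be ignored.
def first_host_label(string)
  # The question treats only strings starting with "http://" as valid.
  return nil unless string.start_with?('http://')

  host = URI.parse(string).host
  return nil if host.nil? || host.empty?

  host.split('.').first
rescue URI::InvalidURIError
  # URI.parse raises on strings it cannot parse, e.g. a host containing spaces.
  nil
end

first_host_label('http://something.example.com/directory/') # => "something"
first_host_label('ftp://something.example.com/')            # => nil
first_host_label('http://bad host/')                        # => nil
```

Note that this returns the first label of the host, which is only the subdomain when there is one: for "http://example.com/" it would return "example". If that matters, an extra check on the number of labels would be needed.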
