How to parse a URL and extract the required substring
Problem description
Say I have a string like this: "http://something.example.com/directory/"
What I want to do is parse this string and extract the "something" from it.
The first step, obviously, is to check that the string contains "http://"; otherwise, the string should be ignored.
But how do I then extract just the "something" from that string? Assume that all the strings being evaluated have a similar structure (i.e. I am trying to extract the subdomain of the URL, if the string being examined is indeed a valid URL, where valid means it starts with "http://").
Thanks.
P.S. I know how to check the first part, i.e. I can simply split the string at the "http://", but that doesn't solve the full problem because that will produce "http://something.example.com/directory/". All I want is the "something", nothing else.
I'd do it this way:
require 'uri'
uri = URI.parse('http://something.example.com/directory/')
uri.host.split('.').first
# => "something"
URI is part of Ruby's standard library. It's not the most full-featured parser, but it's more than capable of handling this task for most URLs. If you have IRIs, look at Addressable::URI.
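The snippet above assumes the input is always a well-formed URL. The question also asks that strings not starting with "http://" be ignored, so here is a minimal sketch of how the two steps might be combined. The helper name `first_host_label` is hypothetical (it is not part of the original answer or of the URI library), and returning `nil` for ignored strings is an assumption about what "ignore" should mean.

```ruby
require 'uri'

# Hypothetical helper: returns the first label of the host for an
# "http://" URL, or nil when the string should be ignored.
def first_host_label(str)
  # The question only accepts strings that start with "http://".
  return nil unless str.is_a?(String) && str.start_with?('http://')

  host = URI.parse(str).host
  # A URL without a host has nothing to extract.
  return nil if host.nil? || host.empty?

  host.split('.').first
rescue URI::InvalidURIError
  # Strings that URI cannot parse are ignored as well.
  nil
end

first_host_label('http://something.example.com/directory/') # => "something"
first_host_label('ftp://something.example.com/')            # => nil
first_host_label('not a url')                               # => nil
```

Note that this returns the first label of the host, whatever it is: for "http://example.com/" it would return "example", which is not a subdomain. If that distinction matters, the host needs further checking.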