How do I strip a URL from a string and place it in an array?

Problem description

I'm working on building a small script that searches for the 5 most recent pictures tweeted by a service, isolates the URL and puts that URL into an array.

def grabTweets(linkArray) #brings in empty array
  tweets = Twitter.search("[pic] "+" url.com/r/", :rpp => 2, :result_type => "recent").map do |status|
  tweets = "#{status.text}" #class = string

  url_regexp = /http:\/\/\w/ #isolates link
  url = tweets.split.grep(url_regexp).to_s #chops off link, turns link to string from an array

  #add link to url array
  #print linkArray #prints []

  linkArray.push(url)
  print linkArray

  end
end

x = []
timelineTweets = grabTweets(x)

The function is returning things like this: ["[\"http://t.co/6789\"]"]["[\"http://t.co/12345\"]"]

I'm trying to get it to return ["http://t.co/6789", "http://t.co/12345"] but it's not managing that.

Any help here would be appreciated. I'm not sure what I'm doing wrong.

Recommended answer

The easiest way to grab URLs in Ruby is to use the URI::extract method. It's a pre-existing wheel that works:

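require 'uri'
require 'open-uri'

# Fetch the page body. Use URI.open (Ruby 2.5+); bare open stopped accepting URLs in Ruby 3.0.
body = URI.open('http://www.example.com').read

# URI::extract returns an array of every URI-like string found in the text.
urls = URI::extract(body)
puts urls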

This will return:

http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
http://www.w3.org/1999/xhtml
http://www.icann.org/
mailto:iana@iana.org?subject=General%20website%20feedback

Once you have the array you can filter for what you want, or you can give it a list of schemes to extract.
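Applied to the question, the scheme list can do the filtering directly. The following is a minimal sketch, not the asker's actual code: the `tweets` array is hypothetical sample text standing in for each `status.text`, and the `mailto:` address is invented to show the filter at work. It also avoids calling `to_s` on the result, since `Array#to_s` is what produces strings like "[\"http://t.co/6789\"]" in the question's output.

```ruby
require 'uri'

# Hypothetical tweet texts standing in for status.text (assumed sample data)
tweets = [
  "[pic] new photo http://t.co/6789",
  "[pic] another one http://t.co/12345 via mailto:someone@example.com"
]

# Pass a list of schemes so only http/https links are extracted.
# flat_map merges the per-tweet arrays into one flat array of strings,
# so no to_s call is needed.
links = tweets.flat_map { |text| URI.extract(text, %w[http https]) }

p links
# => ["http://t.co/6789", "http://t.co/12345"]
```

Note that Ruby's documentation marks URI.extract as obsolete, so it may print a warning when Ruby runs in verbose mode.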
