Trouble extracting individual JSON values in Ruby

Question

I'm trying to scrape reddit without using the API, and I've run into a brick wall. Every page on reddit has a JSON representation that you can see simply by appending .json to the URL, e.g. https://www.reddit.com/r/AskReddit.json.

I installed NeatJSON (the neatjson gem) and wrote a small chunk of code to clean the JSON up and print it:

require "rubygems"
require "json"
require "net/http"
require "uri"
require 'open-uri'
require 'neatjson'

url = ("https://www.reddit.com/r/AskReddit.json")

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

puts neatJS

It works fine:

(There is a lot more output; it goes on for several pages. The full JSON is here: http://pastebin.com/HDzFXqyU )

However, when I changed it to extract only the values I want:

url = ("https://www.reddit.com/r/AskReddit.json")

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

neatJS.each do |data|
  puts data["title"]
  puts data["url"]
  puts data["id"]
end

It gives me an error:

  002----extractallaskredditthreads.rb:17:in `<main>': undefined method `each' for #<String:0x0055f948da9ae8> (NoMethodError)

I've been trying different variations of the extractor for about two days and none of them have worked. I feel like I'm missing something incredibly obvious. If anyone could point out what I'm doing wrong, I'd appreciate it.

Edit

It turns out I had the wrong variable name:

 neatSJ =/= neatJS

However, correcting this only changes the error I get:

 002----extractallaskredditthreads.rb:17:in `<main>': undefined method `each' for #<String:0x0055f948da9ae8> (NoMethodError)

As I said, I have been trying multiple ways of extracting the values, which is probably how the typo crept in.

Recommended answer

In this code:

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

...result is a Ruby Hash, produced by JSON.parse when it parses the JSON text into Ruby objects. neatJS, on the other hand, is a String: the formatted text that JSON.neat_generate returns for the result Hash. String has no each method, which is why the call raises NoMethodError. To access the values inside the JSON structure, use the result object, not the neatJS string:

children = result["data"]["children"]

children.each do |child|
  puts child["data"]["title"]
  puts child["data"]["url"]
  puts child["data"]["id"]
end
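
To make the Hash-versus-String distinction concrete, here is a minimal sketch that runs without a network connection. It makes several assumptions that are not in the original post: the inline sample is a hypothetical, heavily trimmed stand-in for the real reddit response, extract_posts is an illustrative helper name, and the standard library's JSON.pretty_generate stands in for JSON.neat_generate so that no extra gem is needed. For a live fetch on Ruby 3.0 or later, open(url) no longer opens URLs, so the download line would likely need to be URI.open(url).read (with open-uri required).

```ruby
require "json"

# Hypothetical, heavily trimmed sample shaped like a reddit listing.
# The real response carries many more fields per post.
SAMPLE_JSON = <<~SAMPLE
  {
    "kind": "Listing",
    "data": {
      "children": [
        { "kind": "t3", "data": { "id": "abc123", "title": "First post",  "url": "https://example.com/1" } },
        { "kind": "t3", "data": { "id": "def456", "title": "Second post", "url": "https://example.com/2" } }
      ]
    }
  }
SAMPLE

# Illustrative helper (not part of any library): returns an array of
# [title, url, id] triples from a parsed listing. Hash#dig returns nil
# instead of raising when the "data" or "children" key is missing.
def extract_posts(listing)
  children = listing.dig("data", "children") || []
  children.map do |child|
    data = child["data"] || {}
    [data["title"], data["url"], data["id"]]
  end
end

result = JSON.parse(SAMPLE_JSON)       # a Hash: this is what you navigate
pretty = JSON.pretty_generate(result)  # a String: only useful for printing

puts result.class                # Hash
puts pretty.class                # String
puts pretty.respond_to?(:each)   # false, hence the NoMethodError

posts = extract_posts(result)
posts.each { |title, url, id| puts "#{id}: #{title} (#{url})" }
```

The formatted string is still handy for eyeballing the structure and working out which keys to follow; the extraction itself always goes through the parsed Hash.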
