Efficient way to render ton of JSON on Heroku


Problem Description


I built a simple API with one endpoint. It scrapes files and currently has around 30,000 records. I would ideally like to be able to fetch all those records in JSON with one http call.

Here is my Sinatra view code:

require 'sinatra'
require 'json'
require 'mongoid'

Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  Book.all
end

I've tried the following: using multi_json with

require './require.rb'
require 'sinatra'
require 'multi_json'
MultiJson.engine = :yajl

Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  MultiJson.encode(Book.all)
end

The problem with this approach is I get Error R14 (Memory quota exceeded). I get the same error when I try to use the 'oj' gem.

I would just concatenate everything into one long Redis string, but Heroku's Redis service is $30 per month for the instance size I would need (> 10 MB).

My current solution is to use a background task that creates objects and stuffs them full of JSON-ified records up to just under the Mongoid document size limit (16 MB). The problems with this approach: it still takes nearly 30 seconds to render, and I have to run post-processing on the receiving app to properly extract the JSON from the objects.
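
Roughly, the background task does something like this (a simplified sketch; JsonChunk and its payload field are illustrative names, not code from my app):

# Rough sketch of the chunking workaround; JsonChunk is a placeholder
# Mongoid model with one string field used as a JSON container.
class JsonChunk
  include Mongoid::Document
  field :payload, type: String
end

LIMIT = 15 * 1024 * 1024 # stay safely under the 16 MB document cap

buffer, size = [], 0
Book.all.each do |book|
  json = book.attributes.to_json
  if size + json.bytesize > LIMIT
    JsonChunk.create!(payload: "[#{buffer.join(',')}]")
    buffer, size = [], 0
  end
  buffer << json
  size += json.bytesize
end
JsonChunk.create!(payload: "[#{buffer.join(',')}]") unless buffer.empty?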

Does anyone have a better idea for how I can render JSON for 30k records in one call without switching away from Heroku?

Solution

Sounds like you want to stream the JSON directly to the client instead of building it all up in memory. That's probably the best way to cut down memory usage. You could, for example, use yajl to encode JSON directly to a stream.
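
In isolation, that looks something like this (a minimal sketch; the output file and the sample hash are just placeholders, the point is that Yajl's encoder can write into any IO):

require 'yajl'

# minimal sketch: the encoder writes into the IO as it goes, so the
# complete JSON never has to exist as one big Ruby string in memory
File.open('books.json', 'w') do |io|
  encoder = Yajl::Encoder.new
  encoder.encode({ 'title' => 'Example', 'pages' => 123 }, io)
end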

Edit: I rewrote the entire code to use yajl, because its API is much more compelling and allows for much cleaner code. I also included an example for reading the response in chunks. Here's the streamed JSON array helper I wrote:

require 'yajl'

module JsonArray
  class StreamWriter
    def initialize(out)
      super()
      @out = out
      @encoder = Yajl::Encoder.new
      @first = true
    end

    # appends one object to the streamed array, emitting the separating
    # comma before every element after the first
    def <<(object)
      @out << ',' unless @first
      @out << @encoder.encode(object)
      @out << "\n"
      @first = false
    end
  end

  # opens the array, hands a StreamWriter to the block and closes the
  # array again, all through Sinatra's #stream helper
  def self.write_stream(app, &block)
    app.stream do |out|
      out << '['
      block.call StreamWriter.new(out)
      out << ']'
    end
  end
end

Usage:

require 'sinatra'
require 'mongoid'

Mongoid.identity_map_enabled = false

# use a server that supports streaming
set :server, :thin

get '/' do
  content_type :json
  JsonArray.write_stream(self) do |json|
    Book.all.each do |book|
      json << book.attributes
    end
  end
end

To decode on the client side you can read and parse the response in chunks, for example with em-http. Note that this solution still requires the client's memory to be large enough to store the entire array of objects. Here's the corresponding streamed parser helper:

require 'yajl'

module JsonArray
  class StreamParser
    def initialize(&callback)
      @parser = Yajl::Parser.new
      # the callback fires once the top-level value (the whole array)
      # has been parsed
      @parser.on_parse_complete = callback
    end

    def <<(str)
      @parser << str
    end
  end

  def self.parse_stream(&callback)
    StreamParser.new(&callback)
  end
end

Usage:

require 'em-http'

parser = JsonArray.parse_stream do |object|
  # block is called when we are done parsing the
  # entire array; now we can handle the data
  p object
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end
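
If EventMachine is not an option on the client side, the same chunked parsing can be done with plain Net::HTTP (a sketch that reuses the JsonArray helper above and assumes the streaming server is running on localhost:4567):

require 'net/http'
require 'yajl'

parser = JsonArray.parse_stream do |books|
  # called once the entire array has been parsed
  p books.size
end

uri = URI('http://localhost:4567/')
Net::HTTP.start(uri.host, uri.port) do |http|
  http.request(Net::HTTP::Get.new(uri)) do |response|
    # read_body with a block yields the body chunk by chunk as it arrives
    response.read_body { |chunk| parser << chunk }
  end
end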

Alternative solution

You could actually simplify the whole thing a lot if you give up the need to generate a "proper" JSON array. What the above solution generates is JSON in this form:

[{ ... book_1 ... }
,{ ... book_2 ... }
,{ ... book_3 ... }
...
,{ ... book_n ... }
]

We could, however, stream each book as a separate JSON document and thus reduce the format to the following:

{ ... book_1 ... }
{ ... book_2 ... }
{ ... book_3 ... }
...
{ ... book_n ... }

The code on the server would then be much simpler:

require 'sinatra'
require 'mongoid'
require 'yajl'

Mongoid.identity_map_enabled = false
set :server, :thin

get '/' do
  content_type :json
  encoder = Yajl::Encoder.new
  stream do |out|
    Book.all.each do |book|
      out << encoder.encode(book.attributes) << "\n"
    end
  end
end

As well as the client:

require 'em-http'
require 'yajl'

parser = Yajl::Parser.new
parser.on_parse_complete = Proc.new do |book|
  # this will now be called separately for every book
  p book
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end

The great thing is that now the client does not have to wait for the entire response, but instead parses every book separately. However, this will not work if one of your clients expects one single big JSON array.
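
If you do control such a client, one possible compromise (a sketch based on the em-http client above) is to keep the line-delimited stream on the wire and simply rebuild the array on the client:

require 'em-http'
require 'yajl'

# collect each separately streamed book back into a single array
books = []
parser = Yajl::Parser.new
parser.on_parse_complete = Proc.new { |book| books << book }

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
    # the complete array is available once the response has finished
    p books.length
  end
end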
