ruby 2.1.2 超时仍然不是线程安全的吗? [英] Is ruby 2.1.2 timeout still not thread safe?

查看：56 发布时间：2021/6/4 20:24:23 ruby-on-rails ruby multithreading timeout sidekiq

本文介绍了ruby 2.1.2 超时仍然不是线程安全的吗?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有 50 个 sidekiq 线程在网络上爬行，几周前这些线程在运行约 20 分钟后开始挂起.当我进行回溯转储时，大多数线程都卡在 net/http 初始化上:

I have 50 sidekiq threads crawling the web, and a few weeks ago the threads started hanging after about 20 minutes of running. When I do a backtrace dump, most of the threads are stuck on net/http initialize:

/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `initialize'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `open'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `block in connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:76:in `timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:878:in `connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:858:in `start'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:700:in `start'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:631:in `connection_for'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:994:in `request'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:257:in `fetch'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:974:in `response_redirect'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:298:in `fetch'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize.rb:432:in `get'
/app/app/workers/crawl_page.rb:24:in `block in perform'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:91:in `block in timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `block in catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:106:in `timeout'

我不认为 sidekiq 会卡在 net/http 上，因为我已经将整个调用包装在超时中: Timeout::timeout(APP_CONFIG['crawl_page_timeout']) { @page = agent.get(url) }

I didn't think sidekiq would get stuck on net/http because I've wrapped the entire call in a timeout: Timeout::timeout(APP_CONFIG['crawl_page_timeout']) { @page = agent.get(url) }

...但后来我开始阅读一些关于 ruby 的超时如何不是线程安全的旧帖子:http://blog.headius.com/2008/02/rubys-threadraise-threadkill-timeoutrb.html

...but then I started reading some old posts about how ruby's Timeout is not thread safe: http://blog.headius.com/2008/02/rubys-threadraise-threadkill-timeoutrb.html

ruby 的超时是否仍然不是线程安全的?

Is ruby's Timeout still not thread safe?

我知道很多人用 Ruby 编写爬虫.如果 Timeout 不是线程安全的，那么人们如何编写爬虫来处理 net/http 卡住的问题?

I know a lot of people write crawlers in Ruby. If Timeout isn't thread-safe, how are people writing crawlers handling the issue of net/http getting stuck?

更新:

我已经切换到 HTTPClient(特别说明它的线程安全)来替换 mechanize.我们似乎仍然被困在初始化线程上.同样，这可能是由于 ruby 的超时无法正常工作，也可能是 sidekiq 问题.这是最近挂起的 sidekiq 线程的堆栈跟踪:

I've switched to HTTPClient (which specifically says its thread safe) to replace mechanize. We appear to still be getting stuck on initializing a thread. Again, this could be due to ruby'ss Timeout not working properly, or it could be a sidekiq issue. Here's the stacktrace from the most recent hung sidekiq threads:

/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `initialize'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `new'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `create_socket'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:752:in `block in connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:91:in `block in timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:101:in `call'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:101:in `timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:127:in `timeout'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:751:in `connect'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:609:in `query'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:164:in `query'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:1087:in `do_get_block'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/instrumentation/httpclient.rb:34:in `block in do_get_block_with_newrelic'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/cross_app_tracing.rb:43:in `tl_trace_http_request'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/instrumentation/httpclient.rb:33:in `do_get_block_with_newrelic'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:891:in `block in do_request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:985:in `protect_keep_alive_disconnected'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:890:in `do_request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:963:in `follow_redirect'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:776:in `request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:677:in `get'
/app/app/ohm_models/queued_page.rb:20:in `run_crawl'

ruby 2.1.2 超时仍然不是线程安全的吗? [英] Is ruby 2.1.2 timeout still not thread safe?

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

ruby 2.1.2 超时仍然不是线程安全的吗? [英] Is ruby 2.1.2 timeout still not thread safe?

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭