Curl slow multithreading dns

Question

The program is written in C++ and indexes web pages, so the domains it resolves are arbitrary domain names from the web. The strange part is that the share of DNS failures/not-found results is small (>5%).

Here is the pmp stack trace:

   3886 __GI___poll,send_dg,buf=0xADDRESS,__libc_res_nquery,__libc_res_nquerydomain,__libc_res_nsearch,_nss_dns_gethostbyname3_r,gaih_inet,__GI_getaddrinfo,Curl_getaddrinfo_ex
    601 __GI___poll,Curl_socket_check,waitconnect,singleipconnect,Curl_connecthost,ConnectPlease,protocol_done=protocol_done@entry=0xADDRESS),Curl_connect,connect_host,at
    534 __GI___poll,Curl_socket_check,Transfer,at,getweb,athread,start_thread,clone,??
    498 nanosleep,__sleep,athread,start_thread,clone,??
     50 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,athread,start_thread,clone,??
     15 __GI___poll,Curl_socket_check,Transfer,at,getweb,getweb,getweb,athread,start_thread,clone
      7 nanosleep,usleep,main

Why are there so many threads at _nss_dns_gethostbyname3_r? What can I do to speed it up?

Could it be because I'm using curl's default synchronous DNS resolver with CURLOPT_NOSIGNAL?

The program is running on an Intel i7 (8 cores with HT), 16 GB RAM, Ubuntu 12.10.

The bandwidth varies between 6 MB/s (the ISP limit) and 2 MB/s at irregular intervals, and it sometimes even drops to a few hundred KB/s.

Solution

I found that the solution was to switch curl's default DNS resolver to c-ares and to explicitly request IPv4, since my network does not support IPv6 yet.

Switching to c-ares also let me configure additional DNS servers and cycle through them, which increases the number of DNS queries per second.

The outcome:

// Resolve IPv4 addresses only
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);

// Cycle through the DNS servers: read and advance the shared index
// while holding the mutex, so no two threads race on DNS_SERVER_I
pthread_mutex_lock(&running_mutex);
dns_index = DNS_SERVER_I;
DNS_SERVER_I = (DNS_SERVER_I + 1) % DNS_SERVERS.size();
pthread_mutex_unlock(&running_mutex);

// Build a comma-separated list of three consecutive servers, wrapping around
string dns_servers_string = DNS_SERVERS.at(dns_index % DNS_SERVERS.size()) + ","
    + DNS_SERVERS.at((dns_index + 1) % DNS_SERVERS.size()) + ","
    + DNS_SERVERS.at((dns_index + 2) % DNS_SERVERS.size());

// Set curl DNS servers (option available only when curl is built with c-ares)
curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, dns_servers_string.c_str());
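The rotation logic can also be separated from the curl calls so that it is easier to reason about and to test. The following is a minimal sketch under stated assumptions, not the code from the answer: the class name DnsServerRotator and its next() method are hypothetical, and std::mutex stands in for the pthread mutex used above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of libcurl): hands out comma-separated
// windows of DNS servers, advancing the start position by one per call.
class DnsServerRotator {
public:
    explicit DnsServerRotator(std::vector<std::string> servers)
        : servers_(std::move(servers)) {
        assert(!servers_.empty());  // rotating over an empty list is meaningless
    }

    // Returns up to `count` consecutive servers, starting at the current
    // position and wrapping around the list, joined with commas.
    std::string next(std::size_t count = 3) {
        std::size_t start;
        {
            // Only the shared index needs protection; the server list
            // itself is never modified after construction.
            std::lock_guard<std::mutex> lock(mutex_);
            start = index_;
            index_ = (index_ + 1) % servers_.size();
        }
        if (count > servers_.size()) {
            count = servers_.size();  // avoid listing a server twice
        }
        std::string joined;
        for (std::size_t i = 0; i < count; ++i) {
            if (i > 0) joined += ',';
            joined += servers_[(start + i) % servers_.size()];
        }
        return joined;
    }

private:
    std::vector<std::string> servers_;
    std::mutex mutex_;
    std::size_t index_ = 0;
};
```

Each worker thread would then call something like rotator.next() once per transfer and pass the result's c_str() to curl_easy_setopt(curl, CURLOPT_DNS_SERVERS, ...). As far as I know libcurl copies string options, so the temporary string does not need to outlive the call.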
