Google 为收到的每个请求打开多少个套接字? [英] How many sockets do Google open for every request it receives?

查看:18
本文介绍了Google 为收到的每个请求打开多少个套接字?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

以下是我最近在一家知名网络软件公司的面试经历.有人问我有关互连 TCP 级别和 Web 请求的问题,这让我很困惑.我真的很想知道专家对答案的看法.这不仅是关于面试,而且是关于网络如何工作(或者应用层和传输层如何相互影响,如果有的话)的基本理解.

The following is my recent interview experience with a reputed network software company. I was asked questions about interconnecting TCP level and web requests and that confused me a lot. I really would like to know expert opinions on the answers. It is not just about the interview but also about a fundamental understanding of how networking work (or how application layer and transport layer cross-talk, if at all they do).

  1. 采访者:告诉我幕后发生的过程我打开浏览器并在其中输入 google.com.

  1. Interviewer: Tell me the process that happens behind the scenes when I open a browser and type google.com in it.

Me:发生的第一件事是创建了一个套接字,它是由 {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} 标识.这SRC-PORT number 是浏​​览器给定的随机数.通常是 TCP/IP连接协议(建立三向握手).现在客户端(我的浏览器)和服务器(谷歌)都准备好处理要求.(TCP 连接建立).

Me: The first thing that happens is a socket is created which is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL}. The SRC-PORT number is a random number given by the browser. Usually the TCP/IP connection protocol (three-way handshake is established). Now both the client (my browser) and the server (Google) are ready to handle requests. (TCP connection is established).

Interviewer:等等,名字解析什么时候发生?

Interviewer: Wait, when does the name resolution happen?

:是的,对不起.它甚至应该在创建套接字之前发生.DNS 名称解析首先发生以获取 Google 的 IP 地址到达.

Me: Yep, I am sorry. It should have happened before even the socket is created. DNS name resolve happens first to get the IP address of Google to reach at.

Interviewer:是否为 DNS 名称解析创建了套接字?

Interviewer: Is a socket created for DNS name resolution?

:嗯,我其实不知道.但我所知道的 DNS 名称解析是无连接.也就是说,它不是TCP而是UDP.只有一个请求-响应循环发生.(所以有一个为 DNS 创建的新套接字名称解析).

Me: hmm, I actually do not know. But all I know DNS name resolution is connectionless. That is, it's not TCP but UDP. Only a single request-response cycle happens. (So there is a new socket created for DNS name resolution).

采访者:google.com 对来自其他人的其他请求开放客户.与 Google 阻止建立连接也是如此其他用户?

Interviewer: google.com is open for other requests from other clients. So is establishing your connection with Google blocking other users?

Me:我不确定 Google 是如何处理的.但是在典型的套接字中通信,它在最小程度上阻塞.

Me: I am not sure how Google handles this. But in a typical socket communication, it is blocking to a minimal extent.

Interviewer:您认为这可以如何处理?

Interviewer: How do you think this can be handled?

Me:我猜这个进程会派生一个新线程并创建一个套接字来处理我的要求.从现在开始,我的套接字端点与Google 就是这个子套接字.

Me: I guess the process forks a new thread and creates a socket to handle my request. From now on, my socket endpoint of communication with Google is this child socket.

Interviewer:如果是这样,这个子套接字的端口号与父母不同?

Interviewer: If that is the case, is this child socket’s port number different than the parent one?

Me:父套接字在 80 监听来自客户.孩子必须在不同的端口号上侦听.

Me: The parent socket is listening at 80 for new requests from clients. The child must be listening at a different port number.

采访者:自从目标端口号改变后,你的 TCP 连接是如何维护的.(即 Google 的数据包上发送的 src-port 号)?

Interviewer: How is your TCP connection maintained since your destination port number has changed. (That is the src-port number sent on Google's packet) ?

我:我看到的作为客户端的目标端口总是 80.当响应被发回,它也来自端口 80.我猜操作系统/父进程在发送回之前将源端口设置回 80发布.

Me: The dest-port that I see as a client is always 80. When a response is sent back, it also comes from port 80. I guess the OS/the parent process sets the source port back to 80 before it sends back the post.

Interviewer:您的套接字连接与谷歌?

Interviewer: How long is your socket connection established with Google?

:如果我在一段时间内没有提出任何要求,主线程关闭其子套接字和来自的任何后续请求我会像一个新客户一样.

Me: If I don’t make any requests for a period of time, the main thread closes its child socket and any subsequent requests from me will be like I am a new client.

Interviewer:不,Google 不会为你.它处理您的请求并正确丢弃/回收套接字

Interviewer: No, Google will not keep a dedicated child socket for you. It handles your request and discards/recycles the sockets right away.

采访者:虽然谷歌可能有很多服务器要服务请求,每个服务器只能在端口 80 上打开一个父套接字.访问 Google 网页的客户端数量必须大于它们拥有的服务器数量.这通常是如何处理的?

Interviewer: Although Google may have many servers to serve requests, each server can have only one parent socket opened at port 80. The number of clients to access Google's webpage must be larger than the number of servers they have. How is this usually handled?

Me:我不确定这是如何处理的.我看到唯一的办法work 为它收到的每个请求生成一个线程.

Me: I am not sure how this is handled. I see the only way it could work is spawn a thread for each request it receives.

Interviewer:您认为 Google 处理这个问题的方式与有银行网站吗?

Interviewer: Do you think the way Google handles this is different from any bank website?

Me:在 TCP-IP 套接字级别,应该是相似的.在请求层面,略有不同,因为一个会话维护以保持对银行网站的请求之间的状态.

Me: At the TCP-IP socket level, it should be similar. At the request level, slightly different because a session is maintained to keep state between requests for banks' websites.

如果有人能对每一点进行解释,对很多网络初学者会很有帮助.

If someone can give an explanation of each of the points, it will be very helpful for many beginners in networking.

推荐答案

Google 为收到的每个请求打开多少个套接字?

How many sockets do Google open for every request it receives?

这个问题实际上并没有出现在面试中,但它在你的标题中,所以我会回答它.Google 根本不打开任何套接字.您的浏览器就是这样做的.Google 以新套接字的形式接受连接,但我不会将其描述为打开"它们.

This question doesn't actually appear in the interview, but it's in your title so I'll answer it. Google doesn't open any sockets at all. Your browser does that. Google accepts connections, in the form of new sockets, but I wouldn't describe that as 'opening' them.

采访者:请告诉我当我打开浏览器并在其中输入 google.com 时幕后发生的过程.

Interviewer : Tell me the process that happens behind the scene when I open a browser and type google.com in it.

我:发生的第一件事是创建一个由 {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} 标识的套接字.

Me : The first thing that happens is a socket is created which is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL}.

没有.连接由元组标识.套接字是连接的端点.

No. The connection is identified by the tuple. The socket is an endpoint to the connection.

SRC-PORT 编号是浏览器给出的随机数.

The SRC-PORT number is a random number given by browser.

没有.通过操作系统.

通常是 TCP/IP 连接协议(建立三向握手).现在客户端(我的浏览器)和服务器(谷歌)都准备好处理请求了.(TCP连接建立)

Usually TCP/IP connection protocol (three way handshake is established). Now the both client (my browser) and server (google) are ready to handle requests. (TCP connection is established)

面试官:等等,名字解析什么时候发生?

Interviewer: wait, when does the name resolution happens?

我:是的,对不起.它甚至应该在创建套接字之前发生.DNS 名称解析首先发生以获取要访问的 google 的 IP 地址.

Me: yep, I am sorry. It should have happened before even the socket is created. DNS name resolve happens first to get the IP address of google to reach at.

采访者:是否为 DNS 名称解析创建了套接字?

Interviewer : Is a socket created for DNS name resolution?

我:嗯,其实不知道.但我所知道的 DNS 名称解析是无连接的.那不是TCP而是UDP.只有一个请求-响应循环发生.(为 DNS 名称解析创建的新套接字也是如此).

Me : hmm, Iactually do not know. But all I know DNS name resolution is a connection-less. That is it not TCP but UDP. only a single request-response cycle happens. (so is a new socket created for DNS name resolution).

任何合理实现的浏览器都会将整个事情委托给操作系统的 Sockets 库,其内部功能取决于操作系统.在访问 DNS 服务器之前,它可能会查看内存缓存、文件、数据库、LDAP 服务器等多项内容,它可以通过 TCP 或 UDP 执行这些操作.这不是一个好问题.

Any rationally implemented browser would delegate the entire thing to the operating system's Sockets library, whose internal functioning depends on the OS. It might look at an in-memory cache, a file, a database, an LDAP server, several things, before going out to a DNS server, which it can do via either TCP or UDP. It's not a great question.

采访者:google.com 对来自其他客户的其他请求开放.那么,与 Google 建立连接是否会阻止其他用户?

Interviewer: google.com is open for other requests from other clients. so is establishing you connection with google is blocking other users?

我:我不确定谷歌如何处理这个问题.但在典型的套接字通信中,它的阻塞程度最低.

Me: I am not sure how google handles this. But in a typical socket communication, it is blocking to a minimal extent.

又错了.它与谷歌没有什么特别的关系.TCP 服务器对每个接受的连接都有一个单独的套接字,任何合理构造的 TCP 服务器都可以通过多线程、多路复用/非阻塞 I/O 或异步 I/O 完全独立地处理它们.他们不会互相阻挡.

Wrong again. It has very little to do with Google specifically. A TCP server has a separate socket per accepted connection, and any rationally constructed TCP server handles them completely independently, either via multithreading, muliplexed/non-blocking I/O, or asynchronous I/O. They don't block each other.

面试官:你认为这可以如何处理?

Interviewer : How do you think this can be handled?

我:我猜这个进程会派生一个新线程并创建一个套接字来处理我的请求.从现在开始,我与谷歌通信的套接字端点就是这个子套接字.

Me : I guess the process forks a new thread and creates a socket to handle my request. From now on, my socket endpoint of communication with google is this this child socket.

线程是创建的,而不是分叉的".分叉一个进程会创建另一个进程,而不是另一个线程.套接字并没有像接受的那样创建",这通常在线程创建之前.它不是子"套接字.

Threads are created, not 'forked'. Forking a process creates another process, not another thread. The socket isn't 'created' so much as accepted, and this would normally precede thread creation. It isn't a 'child' socket.

Interviewer:如果是这样的话,这个子socket的端口号和父socket的端口号不一样吗?

Interviewer: If that is the case, is this child socket’s port number different than the parent one.?

我:父套接字正在 80 监听来自客户端的新请求.孩子必须在不同的端口号上侦听.

Me: The parent socket is listening at 80 for new requests from clients. The child must be listening at a different port number.

又错了.接受的套接字使用与监听套接字相同的端口号,并且接受的套接字根本不是监听",而是接收发送.

Wrong again. The accepted socket uses the same port number as the listening socket, and the accepted socket isn't 'listening' at all, it is receiving and sending.

面试官:自从你的目标端口号改变后,你的 TCP 连接是如何维护的.(那是google的数据包上发送的src-port号)?

Interviewer: how is your TCP connection maintained since your Dest-port number has changed. (That is the src-port number sent on google's packet) ?

我:作为客户端的目标端口始终是 80.当请求被发回时,它也来自端口 80.我猜操作系统/父进程在发送之前将 src 端口设置回 80回帖.

Me: The dest-port as what I see as client is always 80. when request is sent back, it also comes from port 80. I guess the OS/the parent process sets the src port back to 80 before it sends back the post.

此问题旨在探索您之前的错误答案.你继续你的错误答案仍然是错误的.

This question was designed to explore your previous wrong answer. Your continuation of your wrong answer is still wrong.

面试官:你和google建立的socket连接多久了?

Interviewer : how long is your socket connection established with google?

我:如果我在一段时间内没有发出任何请求,主线程会关闭它的子套接字,任何来自我的后续请求都会像一个新客户端一样.

Me : If I don’t make any requests for a period of time, the main thread closes its child socket and any subsequent requests from me will be like am a new client.

又错了.您对 Google 的线程一无所知,更不用说哪个线程关闭了套接字.任何一端都可以随时关闭连接.很可能服务器端会打败你,但它不是一成不变的,如果有任何线程会这样做,也不是一成不变的.

Wrong again. You don't know anything about threads at Google, let alone which thread closes the socket. Either end can close the connection at any time. Most probably the server end will beat you to it, but it isn't set in stone, and neither is which if any thread will do it.

采访者:不,谷歌不会为你保留一个专用的子插座.它会处理您的请求并立即丢弃/回收套接字.

Interviewer : No, google will not keep a dedicated child socket for you. It handles your request and discards/recycles the sockets right away.

这里面试官是错误的.他似乎没有听说过 HTTP keep-alive,也没有听说过它是 HTTP 1.1 中的默认设置.

Here the interviewer is wrong. He doesn't seem to have heard of HTTP keep-alive, or the fact that it is the default in HTTP 1.1.

Interviewer:虽然 google 可能有很多服务器来处理请求,但每个服务器只能在 80 端口打开一个父 socket.访问 google 网页的客户端数量必须超过他们拥有的服务器数量.这通常是如何处理的?

Interviewer: Although google may have many servers to serve requests, each server can have only one parent socket opened at port 80. The number of clients to access google webpage must be exceeding larger than the number of servers they have. How is this usually handled?

我:我不确定这是如何处理的.我认为它可以工作的唯一方法是为它收到的每个请求生成一个线程.

Me : I am not sure how this is handled. I see the only way it could work is spawn a thread for each request it receives.

在这里你根本没有回答这个问题.他正在寻找有关负载平衡器或循环 DNS 或所有这些服务器前面的东西的答案.然而,他的一句话访问谷歌网页的客户端数量必须超过他们拥有的服务器数量".已经通过你们都错误地称之为子套接字"的存在得到了回答.同样,这不是一个很好的问题,除非你报告的不准确.

Here you haven't answered the question at all. He is fishing for an answer about load-balancers or round-robin DNS or something in front of all those servers. However his sentence "the number of clients to access google webpage must be exceeding larger than the number of servers they have" has already been answered by the existence of what you are both incorrectly calling 'child sockets'. Again, not a great question, unless you've reported it inaccurately.

采访者:你认为谷歌处理的方式与任何银行网站不同吗?

Interviewer : Do you think the way Google handles is different from any bank website?

我:在 TCP-IP 套接字级别,应该是类似的.在请求级别,略有不同,因为维护一个会话以保持 Banks 网站中请求之间的状态.

Me: At the TCP-IP socket level, it should be similar. At the request level, slightly different because a session is maintained to keep state between requests in Banks websites.

你差点就猜对了.存在与 Google 以及银行网站的 HTTP 会话.这不是什么大问题.他应该询问事实,而不是您的意见.

You almost got this one right. HTTP sessions to Google exist, as well as to bank websites. It isn't much of a question. He should be asking for facts, not your opinion.

总的来说,(a) 你完全没有通过面试,并且 (b) 你沉迷于 太多的猜测.你应该简单地说我不知道",然后让面试继续你知道的事情.

Overall, (a) you failed the interview completely, and (b) you indulged in far too much guesswork. You should have simply stated 'I don't know' and let the interview proceed to things that you do know about.

这篇关于Google 为收到的每个请求打开多少个套接字?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆