YK

From Browser to Server: OS Abstraction and TCP Communication Mechanisms

2026-04-16

Network communication is, at its core, an exchange of data between processes running on two different hosts. An application such as a browser or a web server cannot, however, drive the physical Network Interface Card (NIC) directly to send packets. The operating system separates user space from kernel space and reserves direct hardware access for the kernel, both to protect hardware resources and to manage a multi-process environment reliably.

Complex protocol implementations and their state machines, such as the TCP/IP stack, live in kernel space. For an application in user space to send or receive data over a network, it must therefore delegate the work to the kernel by invoking a system call.

At this point, a standardized interface is required to bridge the application and the internal network stack of the kernel. Under the philosophy that "Everything is a file," UNIX-like operating systems provide a powerful abstraction that allows even network communication to be handled in the same manner as standard file I/O. The endpoint and interface of this abstracted connection is the socket.
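As a small illustration of the "Everything is a file" idea, the sketch below uses Python's socket module, which exposes the BSD socket calls under the same names. It assumes a UNIX-like system. The generic os.write() and os.read() calls, normally used for files, are applied to the integer descriptors behind a pair of connected sockets.

```python
import os
import socket

# socketpair() returns two already-connected sockets; no network is involved.
left, right = socket.socketpair()

# fileno() exposes the integer file descriptor behind each socket object.
left_fd = left.fileno()
right_fd = right.fileno()

# os.write() / os.read() are the generic file I/O calls (write(2) / read(2)).
# They accept a socket's descriptor just as they would a regular file's.
os.write(left_fd, b"hello via write()")
received = os.read(right_fd, 1024)
print(received)

left.close()
right.close()
```

The application never names a socket-specific send or receive function here; the kernel dispatches the ordinary read and write calls to its socket implementation.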

Based on this understanding of the fundamental system architecture, we will now examine the actual communication process from the client side to the server, utilizing the BSD Socket API as a reference.

Suppose a user enters google.com into a browser's address bar. Establishing a connection with the server requires the target's IP address, but google.com is a human-readable domain name, so it must first be resolved into an IP address.

First, the OS checks its local cache for an IP address already associated with the domain. On a cache miss, it sends a query to an external DNS server. The system that answers these queries is the Domain Name System (DNS). Depending on the perspective, DNS can be viewed as a single, massive distributed database or, at any one layer, as a local cache of previously resolved names.
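From the application's point of view, this whole lookup is typically a single library call. The following is a minimal sketch using Python's socket.getaddrinfo(), which wraps the C function of the same name; resolve_tcp is a hypothetical helper name introduced only for illustration, and the lookup is restricted to IPv4 to keep the result shape simple.

```python
import socket

def resolve_tcp(host, port):
    """Return the (ip, port) pairs that getaddrinfo() reports for a TCP connection."""
    infos = socket.getaddrinfo(
        host, port,
        family=socket.AF_INET,    # IPv4 only, an assumption to keep the sketch simple
        type=socket.SOCK_STREAM,  # TCP
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (ip, port) pair.
    return [info[4] for info in infos]
```

Calling resolve_tcp("google.com", 443) on a machine with working DNS would return one or more address pairs; the exact addresses depend on the resolver and are not shown here.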

However, before a computer can communicate with an external network, its own network configuration must be in place. When a computer first joins a network, it broadcasts a discovery packet. A DHCP server, which on a home network is usually the router, receives it and assigns the computer an IP address, a subnet mask, a default gateway, and the IP address of a local DNS server. The protocol that governs this dynamic assignment is the Dynamic Host Configuration Protocol (DHCP).

The roles of each element are as follows:

Local DNS Server: The resolver, often operated by the ISP, that maps domain names to IP addresses on the computer's behalf.

Subnet Mask: The value used to determine whether a destination IP belongs to the internal or an external network. The computer extracts a Network ID by performing a bitwise AND on the destination IP address and the subnet mask; if that Network ID differs from the computer's own, the destination is on an external network.

Default Gateway: The IP address of the router through which traffic must pass to reach external networks.
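The subnet mask check can be worked through in a few lines. The sketch below uses Python's standard ipaddress module; the addresses and the helper names network_id and is_external are illustrative assumptions, not values from any particular network.

```python
import ipaddress

def network_id(ip, mask):
    """Bitwise AND of an IPv4 address and a subnet mask."""
    ip_bits = int(ipaddress.IPv4Address(ip))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return ipaddress.IPv4Address(ip_bits & mask_bits)

def is_external(own_ip, dest_ip, mask):
    """True when the destination's Network ID differs from our own."""
    return network_id(own_ip, mask) != network_id(dest_ip, mask)

own_ip = "192.168.1.10"
mask = "255.255.255.0"

print(network_id(own_ip, mask))                    # 192.168.1.0
print(is_external(own_ip, "192.168.1.77", mask))   # False: same network
print(is_external(own_ip, "8.8.8.8", mask))        # True: goes via the default gateway
```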

With this network configuration in place, the function that ultimately retrieves the target server's IP address is getaddrinfo().

Now that the destination is known, a socket must be created for communication. Because the term "socket" is used in more than one sense, it is worth defining clearly here.

API Perspective: BSD Socket. The interface of C language functions invoked by an application in user space to utilize network capabilities.

Implementation Perspective: Stream Socket (TCP) / Datagram Socket (UDP). The actual data structures and state machines allocated within the kernel space.

The socket() function discussed from here on is therefore part of the BSD Socket API, and calling it creates the actual socket object within the kernel. This text focuses on TCP connections.

Let us examine the socket() function, which is called after getaddrinfo().

Because a browser has many other tasks to process, such as screen rendering, it leaves the work of receiving data to the socket rather than managing the network communication itself. Incoming data accumulates in the socket's receive buffer, so nothing is lost while the browser is busy with other operations. Because TCP provides reliable, ordered delivery, the kernel also handles recovery on its own: lost segments are retransmitted by the sender, and segments that arrive out of order are put back into sequence before the application reads them.
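The buffering behaviour can be observed on the loopback interface. In this sketch, written with Python's socket module, the listener setup is only scaffolding to obtain a connected TCP pair; the point is that the bytes sent by the peer wait in the kernel until the client gets around to calling recv(). The port is chosen by the kernel and the message contents are arbitrary.

```python
import socket

# Scaffolding: a listener on loopback; port 0 lets the kernel pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()

# The peer sends two messages and closes while the client is "busy" elsewhere.
conn.sendall(b"first chunk;")
conn.sendall(b"second chunk;")
conn.close()

# ... the client could be rendering a page here; the bytes sit in its receive buffer ...

# Later, the client drains the buffer. recv() returns b"" once the peer has
# closed and everything buffered has been read.
chunks = []
while True:
    data = client.recv(4096)
    if not data:
        break
    chunks.append(data)
received = b"".join(chunks)
print(received)

client.close()
listener.close()
```

How the bytes are split across individual recv() calls is not guaranteed, which is why the loop concatenates until end of stream.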

However, the socket and its buffers are resources managed in kernel space. Letting a user application such as a browser access kernel memory directly would be dangerous and could lead to security breaches or kernel panics. The OS therefore provides a layer of indirection known as a file descriptor. A file descriptor is a small non-negative integer that serves as an index into a per-process table maintained by the kernel; each entry in that table refers to an I/O object the process has opened. When socket() is invoked, the OS creates the socket object in kernel space, stores a reference to it in this table, and returns only the index to the application. The application then communicates indirectly through this integer.
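A short sketch makes the indirection visible. Python's socket.socket() calls the underlying socket() system call, and fileno() reveals the integer the kernel handed back. The exact descriptor number and buffer size vary by system, so the code prints them rather than assuming values.

```python
import socket

# socket() asks the kernel to create a TCP socket object and returns a handle.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The only thing user space really holds is this small integer.
fd = sock.fileno()

# The receive buffer lives in the kernel; its size can be queried, not touched directly.
rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(f"file descriptor: {fd}")
print(f"kernel receive buffer size: {rcvbuf} bytes")

# Closing releases the table entry; Python then reports -1 for the descriptor.
sock.close()
closed_fd = sock.fileno()
```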

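Once a file descriptor is obtained via socket(), the connect() function is called. It serves two purposes: if the socket has not been bound explicitly, the kernel assigns it an ephemeral local port, and it then sends a TCP connection request (SYN) to the destination server. On a blocking socket, connect() returns once the 3-way handshake has completed or failed.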
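The implicit port assignment is easy to observe. The following sketch, again using Python's socket module on loopback, connects a client that never calls bind() and then asks the kernel which local address it ended up with. The listener exists only so that connect() has something to reach.

```python
import socket

# Scaffolding: something to connect to, on a kernel-chosen loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_addr = listener.getsockname()

# The client creates a socket and connects without ever calling bind().
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)  # returns after the handshake completes

# The kernel picked the local (ephemeral) port during connect().
local_ip, local_port = client.getsockname()
remote_addr = client.getpeername()

print(f"local  endpoint: {local_ip}:{local_port}")
print(f"remote endpoint: {remote_addr[0]}:{remote_addr[1]}")

client.close()
listener.close()
```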

The flow on the server side resembles the client's but differs in a few key steps. The server also prepares address information using getaddrinfo() and calls socket() to create the kernel object through which it will receive connections.

Next, it explicitly calls bind(), associating the socket with a specific local address and port. Incoming TCP segments addressed to that port are then delivered to this socket.

It then calls listen(), which puts the socket into a state where it waits for incoming connections. From this point on, the kernel conceptually maintains two connection queues for the socket:

SYN Queue: A space where initial connection requests (SYN packets) from clients temporarily wait; in other words, connections for which the TCP 3-way handshake has not yet completed.

Accept Queue: A space where fully established, immediately communicable connections wait, following the server sending a SYN-ACK and ultimately receiving an ACK from the client, concluding the 3-way handshake.

The server typically calls accept() inside its main loop. accept() blocks until the Accept Queue holds a completed connection, then dequeues it and returns a new file descriptor for a socket dedicated to that connection.
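The server-side sequence can be sketched end to end on loopback. The code below uses Python's socket module; make_listener is a hypothetical helper, and the backlog value of 16 and the loopback address are arbitrary assumptions. Because the kernel completes the handshake on the server's behalf, the client can connect before the server calls accept(), which keeps the example single-threaded.

```python
import socket

def make_listener(host="127.0.0.1", port=0, backlog=16):
    """getaddrinfo() -> socket() -> bind() -> listen(), as in the text."""
    infos = socket.getaddrinfo(
        host, port,
        family=socket.AF_INET,
        type=socket.SOCK_STREAM,
        flags=socket.AI_PASSIVE,  # address is intended for bind(), not connect()
    )
    family, socktype, proto, _, sockaddr = infos[0]
    listener = socket.socket(family, socktype, proto)
    listener.bind(sockaddr)
    listener.listen(backlog)  # the kernel now queues incoming connections
    return listener

listener = make_listener()
listen_addr = listener.getsockname()

# The handshake finishes inside the kernel; the connection waits to be accepted.
client = socket.create_connection(listen_addr)
client_addr = client.getsockname()

# accept() dequeues the completed connection and returns a NEW socket.
conn, peer = listener.accept()
listener_fd, conn_fd = listener.fileno(), conn.fileno()
conn_local = conn.getsockname()

conn.sendall(b"hello from server")
reply = client.recv(1024)
print(reply)

conn.close()
client.close()
listener.close()
```

The accepted socket shares the listener's local address and port but has its own descriptor, so the listener stays free to accept further connections.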

A new socket is created, rather than reusing the listening socket, to guarantee an independent, 1:1 exchange (session) with each client while the listening socket remains free to accept further connections. These newly created sockets are registered in a hash table within the kernel. Later, when many packets arrive at the server's network interface, the kernel uses the 4-tuple (source IP, source port, destination IP, destination port) in each packet's headers as a hash key to route the packet into the correct socket's buffer.
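The role of the 4-tuple can be imitated in user space. In the sketch below, two clients connect to the same server port, and a Python dict keyed by the 4-tuple stands in for the kernel's hash table. This is only a toy model under that assumption: the real lookup happens inside the kernel on packet headers, and the dict here merely shows that the tuple is enough to tell the connections apart.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)
server_addr = listener.getsockname()

# Two clients connect to the SAME destination IP and port.
clients = [socket.create_connection(server_addr) for _ in range(2)]

# Toy stand-in for the kernel's table:
# (source IP, source port, destination IP, destination port) -> connection socket
connections = {}
for _ in clients:
    conn, (src_ip, src_port) = listener.accept()
    dst_ip, dst_port = conn.getsockname()
    connections[(src_ip, src_port, dst_ip, dst_port)] = conn

for index, client in enumerate(clients):
    client.sendall(f"client {index}".encode())

# Look up each client's connection by its 4-tuple and read from that socket only.
results = {}
for index, client in enumerate(clients):
    src_ip, src_port = client.getsockname()
    dst_ip, dst_port = client.getpeername()
    conn = connections[(src_ip, src_port, dst_ip, dst_port)]
    results[index] = conn.recv(1024)

print(results)

for sock in [*connections.values(), *clients, listener]:
    sock.close()
```

Both keys share the same destination IP and port; only the source port differs, and that difference alone separates the two sessions.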