Demystifying the Journey of a Data Packet: From Application to Network Transmission: A Must-Know for Every Software Engineer

Ganesh Sahu
4 min read · Aug 24, 2024

Introduction

Ever wondered what happens under the hood when you hit “Send” on an email or “Enter” after typing a URL? Behind the scenes, your data undergoes a complex yet fascinating journey through various layers of the Linux kernel before it reaches its destination. Understanding this journey not only deepens your appreciation for networking but also empowers you to debug and optimize network applications more effectively.

For software engineers, especially those working on networked applications, understanding how data flows through the OS is not just a technical curiosity — it’s a crucial part of ensuring that your applications perform efficiently, scale effectively, and handle errors gracefully.

In this article, we’ll take a deep dive into the life cycle of a data packet as it moves from an application, through the kernel, and out onto the network. We’ll explore each layer involved, from the Application layer down to the Physical layer, and highlight the key functions in the Linux kernel that make it all possible. If you’re aiming to build robust, high-performance networked applications, this is knowledge you can’t afford to skip.

1. Application Layer: Kicking Off the Journey

Everything starts at the Application layer (Layer 7), where a user application decides to send data over the network. Whether it’s a web browser requesting a webpage or a chat app sending a message, the data enters the kernel through a socket system call. This article follows sendmsg(); send(), sendto(), and write() on a socket converge on the same kernel path.

System Call (sendmsg):

  • Function: The application calls sendmsg() to send data.
  • File: The sys_sendmsg() function, defined in net/socket.c, copies the message header from user space into kernel memory, looks up the socket behind the file descriptor, and passes the message on to the socket layer.
  • GitHub Link: net/socket
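SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned, flags)
{
    /* Abridged sketch of an older kernel's version; most declarations and error handling are omitted. */
    struct socket *sock;
    int err;

    sock = sockfd_lookup(fd, &err);           // Look up the socket by file descriptor
    ...
    err = sock_sendmsg(sock, &msg_sys, len);  // Pass the kernel-side copy of msg to sock_sendmsg()
    ...
    return err;
}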
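
To see the user-space side of this call, here is a minimal sketch of an application invoking sendmsg(). The helper name send_two_parts() is hypothetical and exists only for illustration; the msghdr and iovec structures and the sendmsg() call itself are the standard POSIX socket API. It gathers two buffers into one message, which is the main thing sendmsg() offers over send().

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: sends a header and a body with one sendmsg() call.
 * Returns the number of bytes accepted by the kernel, or -1 on error. */
ssize_t send_two_parts(int fd, const char *header, const char *body)
{
    struct iovec iov[2];
    struct msghdr msg;

    /* sendmsg() only reads from these buffers, so casting away const is safe. */
    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    memset(&msg, 0, sizeof(msg));   /* no destination address, no control data */
    msg.msg_iov    = iov;
    msg.msg_iovlen = 2;

    return sendmsg(fd, &msg, 0);    /* enters the kernel via the sendmsg system call */
}
```

On a connected stream socket, send_two_parts(fd, "HEAD:", "body") would normally return 9, and the receiver sees the two buffers as one contiguous byte sequence.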

2. Transport Layer: Segmenting and Preparing Data

At the Transport layer (Layer 4), the TCP implementation takes over. The data is split into chunks no larger than the maximum segment size (MSS), each chunk is given a TCP header, and the resulting segments are held in sk_buff structures. This is where TCP provides reliable delivery through sequence numbers, acknowledgements, and retransmission, along with flow control.

sock_sendmsg() Function:

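  • Invocation: The sys_sendmsg() function calls sock_sendmsg(), which belongs to the protocol-independent socket layer. It dispatches through the socket’s operations table: inet_sendmsg() for IPv4 sockets, which in turn calls tcp_sendmsg() for TCP sockets.
  • File: sock_sendmsg() is implemented in net/socket.c.
  • GitHub Link: net/socket.c
int sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
{
    ...
    /* Simplified: the real function takes a few more steps and its exact
     * arguments vary by kernel version. For AF_INET sockets, ops->sendmsg
     * is inet_sendmsg(), which calls sk->sk_prot->sendmsg(), i.e.
     * tcp_sendmsg() for TCP sockets. */
    return sock->ops->sendmsg(sock, msg, len);
}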

tcp_sendmsg() Function:

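  • Encapsulation in TCP Segments: tcp_sendmsg() copies the user data into sk_buff structures on the socket’s send queue, sized according to the current MSS. The TCP header is added when a segment is actually transmitted, in tcp_transmit_skb() (net/ipv4/tcp_output.c).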
  • File: tcp_sendmsg() is implemented in net/ipv4/tcp.c.
  • GitHub Link: net/ipv4/tcp.c
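
The segmentation step can be illustrated with a small user-space sketch. This is not how the kernel does it (the kernel works on sk_buff structures and also accounts for window and congestion state); the struct segment type and the segment_payload() function are hypothetical names used only to show how a payload maps onto MSS-sized segments with increasing sequence numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one segment; the kernel uses sk_buff instead. */
struct segment {
    uint32_t seq;     /* sequence number of the first payload byte */
    size_t   offset;  /* where this segment's data starts in the buffer */
    size_t   len;     /* payload bytes carried by this segment */
};

/* Splits `total` bytes into segments of at most `mss` bytes each.
 * Writes up to `max` descriptors into `out` and returns how many were written. */
size_t segment_payload(size_t total, size_t mss, uint32_t initial_seq,
                       struct segment *out, size_t max)
{
    size_t n = 0;
    size_t offset = 0;

    if (mss == 0)
        return 0;

    while (offset < total && n < max) {
        size_t remaining = total - offset;

        out[n].seq    = initial_seq + (uint32_t)offset;  /* wraps modulo 2^32, as TCP does */
        out[n].offset = offset;
        out[n].len    = remaining < mss ? remaining : mss;
        offset += out[n].len;
        n++;
    }
    return n;
}
```

With a 3000-byte payload, an MSS of 1460, and an initial sequence number of 1000, this yields three segments of 1460, 1460, and 80 bytes, starting at sequence numbers 1000, 2460, and 3920.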

3. Network Layer: Routing the Packet

Next, the TCP segments are handed off to the Network layer (Layer 3). Here, the ip_queue_xmit() function takes charge, adding an IP header to create a complete IP packet. The IP layer also handles routing decisions, determining the next hop for the packet.

ip_queue_xmit() Function:

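  • Processing: Looks up the route for the destination (reusing the one cached on the socket when it is still valid), prepends an IP header to the TCP segment, and passes the packet on toward the output path.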
  • File: This function is implemented in net/ipv4/ip_output.c.
  • GitHub Link: net/ipv4/ip_output.c
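
One concrete part of building the IP header is its checksum. The sketch below is a portable version of the Internet checksum described in RFC 1071, applied to an IPv4 header. The function name ipv4_checksum() is an assumption for this example; the kernel uses its own optimized, architecture-specific routines rather than this code.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over `len` bytes of an IPv4 header.
 * `len` is expected to be even, which always holds for an IPv4 header.
 * The header's checksum field must be zero when computing a new checksum. */
uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    /* Add the header as a sequence of 16-bit big-endian words. */
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);

    /* Fold any carries out of the low 16 bits back in. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

A receiver can run the same function over the header with the checksum field filled in; a result of 0 means the header passed verification.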

4. Data Link Layer: Encapsulation in Ethernet Frames

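The Data Link layer (Layer 2) receives the IP packet and encapsulates it within an Ethernet frame. This means prepending a 14-byte header that carries the destination MAC address, the source MAC address, and an EtherType identifying the payload (0x0800 for IPv4). The destination MAC address is that of the next hop, which the kernel resolves through its neighbour subsystem (ARP for IPv4). The frame is then ready for transmission over the physical network.

Ethernet Processing:

  • File: Ethernet header helpers such as eth_header() and eth_type_trans() are implemented in net/ethernet/eth.c.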
  • GitHub Link: net/ethernet/eth.c
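
The layout of the Ethernet header can be shown with a short user-space sketch. The function build_eth_frame() and the DEMO_ constants are hypothetical names for this example, not kernel APIs. The sketch leaves out the minimum-size padding and the trailing frame check sequence, which are normally handled further down by the driver and NIC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ADDR_LEN 6    /* bytes in a MAC address */
#define DEMO_ETH_HDR_LEN  14   /* dst MAC + src MAC + EtherType */

/* Writes an Ethernet header followed by `payload` into `frame`.
 * Returns the total frame length, or 0 if `frame` (capacity `cap`) is too small. */
size_t build_eth_frame(uint8_t *frame, size_t cap,
                       const uint8_t *dst, const uint8_t *src, uint16_t ethertype,
                       const uint8_t *payload, size_t payload_len)
{
    if (cap < DEMO_ETH_HDR_LEN || payload_len > cap - DEMO_ETH_HDR_LEN)
        return 0;

    memcpy(frame, dst, DEMO_ETH_ADDR_LEN);                       /* bytes 0-5: destination MAC */
    memcpy(frame + DEMO_ETH_ADDR_LEN, src, DEMO_ETH_ADDR_LEN);   /* bytes 6-11: source MAC */
    frame[12] = (uint8_t)(ethertype >> 8);                       /* bytes 12-13: EtherType, big-endian */
    frame[13] = (uint8_t)(ethertype & 0xFF);
    memcpy(frame + DEMO_ETH_HDR_LEN, payload, payload_len);      /* the IP packet follows */

    return DEMO_ETH_HDR_LEN + payload_len;
}
```

Passing 0x0800 as the EtherType marks the payload as an IPv4 packet, which is what the receiving side uses to decide which protocol handler gets the frame.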

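NIC Driver Example: The finished frame is handed to the NIC driver. Which driver that is depends on the hardware. For Intel PRO/1000 adapters supported by the e1000 driver, for example, the main driver file is drivers/net/ethernet/intel/e1000/e1000_main.c; other Intel NICs use other drivers, such as e1000e, igb, or ixgbe.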

5. Physical Layer: Transmission Over the Network

At the Physical layer (Layer 1), the Ethernet frame is converted into physical signals that can travel over the network medium. This layer deals with the actual transmission of bits across various physical media like copper wires, fiber optics, or wireless signals.

NIC Driver:

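  • Role: The NIC driver hands each frame to the NIC hardware, typically by placing it on a transmit ring that the device reads via DMA. The NIC itself (its MAC and PHY) then converts the frame into physical signals.
  • Example for Intel NIC: For adapters handled by the e1000 driver, the driver file is drivers/net/ethernet/intel/e1000/e1000_main.c.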
  • GitHub Link: drivers/net/ethernet/intel/e1000/e1000_main.c

6. Receiving Side: Reversing the Journey

Once the data reaches the destination, the process reverses. The receiving NIC converts physical signals back into Ethernet frames, which are then passed up through the Network and Transport layers. The TCP layer reassembles the original data stream, which is finally delivered to the receiving application.

Invocation Flow on Receiving Side:

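  • NIC and Driver: The NIC hardware converts signals back into frames and places them in memory; the driver then hands the frames to the network stack.
  • Ethernet Decapsulation: Reads the EtherType to identify the payload protocol, strips the Ethernet header, and passes the packet to the IP layer.
  • IP Processing: Validates the IP header, reassembles fragments if necessary, strips the header, and passes the segment to TCP.
  • TCP Reassembly: Puts the TCP segments back in order using their sequence numbers and delivers the resulting byte stream to the application when it reads from the socket.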
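
The decapsulation steps above can be sketched as a small user-space parser that peels the Ethernet, IPv4, and TCP headers off a raw frame in turn. The names struct rx_info and parse_tcp_frame() are hypothetical. The sketch assumes an untagged Ethernet frame carrying an unfragmented IPv4/TCP packet, and it skips checksum verification and reordering, all of which the kernel handles.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where the TCP payload sits inside a received frame. */
struct rx_info {
    uint16_t src_port;
    uint16_t dst_port;
    size_t   payload_off;   /* offset of the TCP payload from the start of the frame */
    size_t   payload_len;
};

/* Reads a 16-bit big-endian (network byte order) value. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the frame is not a well-formed IPv4/TCP frame. */
int parse_tcp_frame(const uint8_t *f, size_t len, struct rx_info *out)
{
    /* Layer 2: 14-byte Ethernet header, EtherType 0x0800 means IPv4. */
    if (len < 14 || be16(f + 12) != 0x0800)
        return -1;

    /* Layer 3: version must be 4; IHL gives the header length in 32-bit words. */
    const uint8_t *ip = f + 14;
    size_t ip_avail = len - 14;
    if (ip_avail < 20 || (ip[0] >> 4) != 4)
        return -1;
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;
    size_t total = be16(ip + 2);
    if (ihl < 20 || total < ihl || total > ip_avail || ip[9] != 6)  /* protocol 6 = TCP */
        return -1;

    /* Layer 4: the data offset gives the TCP header length in 32-bit words. */
    const uint8_t *tcp = ip + ihl;
    size_t tcp_avail = total - ihl;
    if (tcp_avail < 20)
        return -1;
    size_t doff = (size_t)(tcp[12] >> 4) * 4;
    if (doff < 20 || doff > tcp_avail)
        return -1;

    out->src_port    = be16(tcp);
    out->dst_port    = be16(tcp + 2);
    out->payload_off = 14 + ihl + doff;
    out->payload_len = tcp_avail - doff;
    return 0;
}
```

Each check mirrors a decision the receiving stack makes on its way up: the EtherType selects the IP handler, the IP protocol field selects TCP, and the header lengths tell each layer where the next one begins.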

Why It Matters:

  • Engineer’s Insight: Being familiar with the data flow on the receiving side helps in understanding and resolving issues related to data integrity, out-of-order packets, and TCP reassembly errors. It also helps in ensuring that your application can handle network-related anomalies gracefully.

Written by Ganesh Sahu

Senior engineer at VMware. Passionate about building elegant solutions to complex problems.