Every Millisecond Accounted For: Deconstructing Go HTTP Requests with httptrace

The Go standard library offers a built-in mechanism for instrumenting the lifecycle of an outbound HTTP request, from DNS lookup to the first byte of the response. This granular visibility, provided by the net/http/httptrace package, turns network calls from opaque operations into a sequence of measurable events, letting developers pinpoint which stage is responsible for latency.

Dissecting the Standard HTTP Client Call

For many developers, an outbound network call like http.Get is treated as an atomic unit of work. The operation begins, and at some later point, it concludes, yielding either a response or an error. The primary metric available is the total duration, the time elapsed between invocation and return. While simple, this single number obscures a complex sequence of underlying steps.

A single HTTPS request is not a monolithic action but a multi-stage process. It begins with a DNS query to resolve a hostname into an IP address. With the address acquired, the client initiates a TCP connection to the server, involving the requisite three-way handshake. Following this, a TLS handshake secures the channel, a computationally non-trivial exchange of certificates and keys. Only then can the client write the actual HTTP request headers and body to the newly established connection. The client then waits for the server to process the request and begin sending its response, a period often referred to as "time to first byte." Finally, the client reads the complete response body from the connection.

The core diagnostic problem is clear: when a request is slow, the total duration alone provides no insight into which of these stages is the bottleneck. An application might be suffering from slow DNS resolution, network congestion delaying the TCP connection, a CPU-bound server causing a long wait time, or a large payload requiring a lengthy download. Without finer instrumentation, identifying the culprit is an exercise in educated guesswork (a process that can feel more like divination than engineering).

Mapping the Lifecycle with `net/http/httptrace`

The Go standard library provides a direct solution to this ambiguity in the form of the net/http/httptrace package. At its heart is the httptrace.ClientTrace struct, a collection of optional function fields, or hooks, that the http client can invoke at critical junctures during a request's execution.

These hooks map directly onto the fundamental stages of an HTTP request, allowing a developer to register a callback for the start and end of nearly every phase.

DNS Resolution: The DNSStart and DNSDone hooks are called immediately before and after the hostname lookup.
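TCP Connection: ConnectStart and ConnectDone bracket the establishment of the network connection to the server's IP address. They can fire more than once for a single request when the dialer tries several addresses, as with dual-stack ("Happy Eyeballs") dialing.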
TLS Handshake: For HTTPS requests, TLSHandshakeStart and TLSHandshakeDone are triggered, measuring the time spent negotiating the secure session.
Request and Response: WroteRequest fires after the entire request has been written to the wire, and GotFirstResponseByte marks the moment the server's response begins to arrive.

Activating this instrumentation follows a standard Go pattern built on the context package. A developer first creates an httptrace.ClientTrace value, populating the desired hooks with functions that capture timestamps or log events. The trace is then attached to a context.Context with httptrace.WithClientTrace. Finally, the resulting context is associated with the http.Request, either at construction with http.NewRequestWithContext or afterwards with the request's WithContext method, before the client dispatches it. The HTTP transport finds the trace in the request's context and invokes the populated hooks as the request progresses. Hooks may be called from different goroutines, so any state they share needs synchronization.

From Raw Timestamps to Actionable Telemetry

The raw output of these hooks is a series of events. The real value comes from converting those events into timing data. The hooks themselves do not receive timestamps: their arguments carry details such as the hostname being resolved, the address being dialed, or an error. Each hook therefore records its own time, typically with time.Now(). By capturing the time in a Start hook and again in the corresponding Done hook, one can calculate the duration of that stage, for example by subtracting the time recorded in DNSStart from the time recorded in DNSDone.

This detailed telemetry opens up several practical applications. Instead of logging every request, an application can be configured to conditionally log the full trace breakdown only for requests that exceed a latency threshold, providing rich diagnostic data precisely when it is needed most. Aggregating these individual stage durations (e.g., average DNS lookup time, 99th percentile TLS handshake time) and exporting them to a monitoring platform provides a powerful, high-level view of an application's external dependencies.

For engineers operating services at scale, ambiguity is a liability. The question is never if a dependency will slow down, but when. With the ground truth from httptrace, teams can immediately distinguish between a slow connection establishment, which points to network issues, and a long time-to-first-byte, which points to a problem with the upstream server itself.

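One of the most critical pieces of information provided by httptrace comes from the GotConn hook. The GotConnInfo struct passed to this hook contains a boolean field named Reused, which indicates whether the request was sent over a previously used connection from the HTTP client's connection pool rather than a newly dialed one. (Its WasIdle and IdleTime fields report whether the connection came from the idle pool and how long it sat there.) When Reused is true, no DNS lookup, TCP connection, or TLS handshake takes place for that request, so the corresponding hooks do not fire and those stages contribute no time. Understanding this distinction is vital for correctly interpreting metrics: averages that mix new and reused connections understate the cost of connection setup, and a stage with no recorded duration should be read as skipped, not as instantaneous.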

Situating `httptrace` in the Observability Landscape

It is important to situate httptrace correctly within the broader landscape of modern observability. It is a low-level tool designed for high-fidelity, client-side, in-process network event timing. It provides an exhaustive account of a single network hop as seen by the client making the call. It is not, by itself, a distributed tracing solution.

Tools like OpenTelemetry are built to address a different problem: tracking a request's entire journey as it propagates across service boundaries in a distributed system. OpenTelemetry achieves this by propagating context, such as trace and span IDs, from one service to the next, creating a complete, end-to-end view of a user request.

Rather than being competitors, the two are complementary. httptrace can serve as a powerful data source for enriching a distributed trace. An OpenTelemetry span representing an outbound HTTP call can be augmented with attributes or events generated from httptrace data. This embeds the granular network-level timings directly into the broader distributed trace.

A useful analogy is to think of OpenTelemetry as the narrative of the whole journey, and httptrace as the detailed logbook for a specific sea voyage within that journey. The distributed trace tells you the request went from service A to service B. The httptrace data, in contrast, tells you exactly how long it took to find service B's address, establish a secure channel, and get the first word back. Combining them provides the full story.

As software architectures continue to decompose into smaller, network-connected microservices, the performance of the "connective tissue" between them becomes paramount. The ability to dissect and measure every phase of this communication is no longer a niche requirement for network specialists but a core competency for application developers. Tools like httptrace, built directly into the language's standard library, represent a move toward making this level of observability a default, accessible capability, empowering engineers to build more resilient and performant systems by accounting for every last millisecond.