Understanding How SpringBoot Handles Requests with Tomcat and more than 1 million/s requests using…

Umesh Kumar Yadav

~7 min read · June 4, 2025 (Updated: June 4, 2025)

Understanding How SpringBoot handles 1M requests/s using Virtual Threads 🚀

During a recent interview, I was asked: "When an IP sends a request, does one IP correspond to one thread?" This question caught me off guard, as I realized I hadn't deeply explored how SpringBoot processes requests. After some research, I discovered that the answer lies in SpringBoot's default embedded server, Tomcat, and its thread management. Additionally, I explored how virtual threads in modern Java can supercharge SpringBoot to handle massive concurrency, like 1 million requests per second! 🌟 Let's dive into this topic to clarify how SpringBoot handles requests and scales with virtual threads.

How SpringBoot and Tomcat Handle Requests 🛠️

SpringBoot, by default, uses an embedded Tomcat server to run applications. Instead of focusing on how many requests SpringBoot can handle, we should examine Tomcat's capacity, as it's the engine processing the requests. Tomcat's configuration properties and their default values are listed in Spring Boot's spring-configuration-metadata.json file, and they are bound to the configuration class org.springframework.boot.autoconfigure.web.ServerProperties.

Key Tomcat Configuration Parameters ⚙️

Tomcat's ability to handle requests is governed by four key parameters:

server.tomcat.threads.min-spare: The minimum number of worker threads, defaulting to 10. These are like "permanent workers" that handle requests when concurrency is low. 👷
server.tomcat.threads.max: The maximum number of worker threads, defaulting to 200. Threads above the min-spare count act as "temporary workers", created when concurrent requests exceed the minimum thread count. 🧑‍🍳
server.tomcat.max-connections: The maximum number of connections Tomcat accepts and keeps open at once, defaulting to 8192. Connections beyond threads.max are held open but wait for a free worker thread. 🪑
server.tomcat.accept-count: The size of the waiting queue, defaulting to 100. Once max-connections is reached, further incoming connections wait in this queue (the operating system's accept backlog). 🛋️

A Restaurant Analogy 🍽️

To better understand these parameters, let's use a restaurant analogy:

Requests are guests arriving at the restaurant.
min-spare represents permanent chefs always ready to cook.
threads.max is the total number of chefs (permanent + temporary).
max-connections is the number of seats in the restaurant.
accept-count is the number of stools outside for waiting guests.

Imagine a restaurant with 15 chefs in total (threads.max = 15), 10 of them permanent (min-spare = 10), 30 seats (max-connections = 30), and 10 stools outside (accept-count = 10). If 20 guests arrive, all are seated: the 10 permanent chefs and 5 temporary chefs serve 15 of them right away, and the other 5 wait at their tables for a free chef. If 35 guests arrive, 30 are seated and 5 wait on stools. If 50 guests arrive, only 40 (30 seats + 10 stools) are accommodated, and 10 are turned away.

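Thus, the maximum number of requests SpringBoot can hold at one time is roughly max-connections + accept-count: up to max-connections are accepted, and up to accept-count more wait in the queue. Connections beyond that are refused or time out on the client side. 🚫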
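The restaurant arithmetic above can be written as a small model. This is a minimal sketch, not Tomcat's actual code: the class TomcatCapacityModel, the record Admission, and the method admit are hypothetical names, and the model assumes all guests arrive at the same instant.

```java
// Simplified model of the restaurant analogy; it is NOT how Tomcat is implemented.
class TomcatCapacityModel {

    /** Where a burst of simultaneous arrivals ends up (hypothetical helper type). */
    record Admission(int running, int seatedWaiting, int backlog, int rejected) {}

    static Admission admit(int arrivals, int maxThreads, int maxConnections, int acceptCount) {
        int seated = Math.min(arrivals, maxConnections);        // connections accepted
        int running = Math.min(seated, maxThreads);             // being processed right now
        int seatedWaiting = seated - running;                   // accepted, waiting for a worker thread
        int backlog = Math.min(arrivals - seated, acceptCount); // waiting on the "stools"
        int rejected = arrivals - seated - backlog;             // turned away
        return new Admission(running, seatedWaiting, backlog, rejected);
    }

    public static void main(String[] args) {
        // Same numbers as the example: 15 chefs, 30 seats, 10 stools.
        for (int guests : new int[] {20, 35, 50}) {
            System.out.println(guests + " guests -> " + admit(guests, 15, 30, 10));
        }
    }
}
```

For 50 guests the model reports 10 rejected, matching the 10 guests turned away in the example.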

Scaling to 1 Million Requests per Second with Virtual Threads 🚀

While Tomcat's default thread pool is effective for typical workloads, handling 1 million requests per second requires a more scalable approach. Enter virtual threads, finalized in Java 21 (via Project Loom). Spring Boot supports them out of the box from version 3.2; earlier 3.x releases need a manual executor customization. Unlike traditional platform threads, virtual threads are lightweight, managed by the JVM, and designed for high-concurrency scenarios. 🌐

Why Virtual Threads? 🤔

Lightweight: Virtual threads are not tied to heavy OS threads. Millions of virtual threads can run on a small number of platform threads, reducing resource overhead. 💡
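Cheap blocking: When a virtual thread blocks on I/O, the JVM parks it and frees the underlying platform thread for other work. Plain blocking code therefore scales well for I/O-bound tasks like HTTP requests, without requiring a reactive programming model. ⚡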
Scalability: With virtual threads, SpringBoot can handle massive concurrent requests without exhausting system resources, potentially reaching 1 million requests per second on appropriately sized hardware. 📈
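To make the "lightweight" point concrete, here is a minimal sketch that starts one virtual thread per task and lets each one block for a while. It assumes Java 21 or later; the class name VirtualThreadDemo, the method runBlockingTasks, and the task counts are illustrative choices, not taken from the original project.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Requires Java 21+. Class and method names are illustrative only.
class VirtualThreadDemo {

    /** Starts one virtual thread per task; each task blocks for blockTime, then counts itself. */
    static int runBlockingTasks(int taskCount, Duration blockTime) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    try {
                        Thread.sleep(blockTime); // parks the virtual thread, freeing its carrier thread
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits until every submitted task has finished
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, Duration.ofMillis(500));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

Run one after another, 10,000 half-second waits would add up to 5,000 seconds. Because the waits overlap on a handful of carrier threads, the whole run should take far less than that; the exact time depends on the machine.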

Configuring SpringBoot for Virtual Threads 🧵

To enable virtual threads in SpringBoot, you need:

Java 21 or later: Ensure your project uses JDK 21+.
Spring Boot 3.2 or later: This release added the spring.threads.virtual.enabled property; on Spring Boot 3.0 and 3.1 you have to set the executor programmatically.
Configuration: Enable virtual threads in application.yml or programmatically.

Here's an example configuration in application.yml:

spring:
  threads:
    virtual:
      enabled: true # Spring Boot 3.2+ on Java 21+: handle requests on virtual threads
server:
  tomcat:
    threads:
      min-spare: 10
      max: 1000 # Sizes the platform-thread pool; not the limit once virtual threads are enabled
    max-connections: 100000 # Large number to handle massive connections
    accept-count: 10000

Alternatively (for example on Spring Boot 3.0 or 3.1, which predate the built-in property), configure the executor programmatically in a @Configuration class:

@Bean
public TomcatProtocolHandlerCustomizer<?> protocolHandlerVirtualThreadExecutorCustomizer() {
    return protocolHandler -> protocolHandler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
}

How Virtual Threads Handle 1 Million Requests per Second 🌟

With virtual threads, each incoming request can be assigned to a virtual thread, which is inexpensive to create and manage. Unlike platform threads, which each reserve an OS thread and its stack (so pools are typically capped at a few hundred to a few thousand threads), virtual threads allow SpringBoot to keep a very large number of requests in flight at once. For example:

I/O-bound tasks (e.g., database queries, API calls) are handled efficiently, as virtual threads yield during I/O operations, allowing other virtual threads to run.
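High throughput: On a server with sufficient CPU and memory (e.g., 32-core, 128GB RAM), SpringBoot with virtual threads may approach 1 million requests per second for short, I/O-bound requests, provided max-connections is large enough and the handlers do little CPU work. Treat this as a target to verify with load tests, not a guaranteed figure.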
Load balancing: For ultra-high loads, use a load balancer (e.g., NGINX) and cluster multiple SpringBoot instances to distribute requests across nodes. 🖥️
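A quick way to sanity-check such targets is Little's law: requests in flight = throughput × latency. The sketch below applies it with integer arithmetic. The class and method names are illustrative, and the 50 ms latency is an assumed figure, not a measurement from this article.

```java
// Back-of-the-envelope capacity arithmetic (Little's law); names are illustrative.
class ThroughputMath {

    /** Upper bound on requests/second when each request occupies one thread for latencyMillis. */
    static long maxRequestsPerSecond(long workerThreads, long latencyMillis) {
        return workerThreads * 1000 / latencyMillis;
    }

    /** Requests that must be in flight at once to sustain targetRps at the given latency. */
    static long requiredInFlight(long targetRps, long latencyMillis) {
        return targetRps * latencyMillis / 1000;
    }

    public static void main(String[] args) {
        // Default pool: 200 threads, each busy for 500 ms per request.
        System.out.println("200 threads @ 500 ms -> " + maxRequestsPerSecond(200, 500) + " req/s");
        // Target: 1,000,000 req/s at an assumed 50 ms per request.
        System.out.println("1M req/s @ 50 ms -> " + requiredInFlight(1_000_000, 50) + " in flight");
    }
}
```

With these inputs, 200 platform threads top out at 400 requests per second, while 1 million requests per second at 50 ms each needs 50,000 requests in flight at once. That gap is why a per-request virtual thread and a large max-connections value matter.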

Practical Considerations for Extreme Concurrency ⚠️

Hardware: Achieving 1 million requests per second requires powerful hardware (e.g., multi-core CPUs, high memory) and optimized network configurations.
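Database and Backend: Ensure downstream systems (databases, APIs) can handle the load. Blocking drivers such as JDBC work with virtual threads, but the connection pool size then becomes the limit; async drivers (e.g., R2DBC for databases) are an alternative. 🗄️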
Tuning: Adjust max-connections, accept-count, and JVM settings (e.g., garbage collection) to optimize performance.
Monitoring: Use tools like Spring Boot Actuator or Prometheus to monitor thread usage and request latency. 📊

Testing the Theory with a SpringBoot Project 🧪

To validate traditional thread handling, I created a SpringBoot project with a custom configuration to simulate request handling under constrained settings.

Configuration

In the application.yml file, I set smaller values for testing:

server:
  tomcat:
    threads:
      min-spare: 10
      max: 15
    max-connections: 30
    accept-count: 10

Test Endpoint

I created a simple endpoint to log the IP address and thread name, with a 0.5-second sleep to simulate processing time and force queueing:

@GetMapping("/test")
public Response test1(HttpServletRequest request) throws Exception {
    log.info("ip:{}, thread:{}", request.getRemoteAddr(), Thread.currentThread().getName());
    Thread.sleep(500);
    return Response.buildSuccess();
}

Simulating Requests

Using APIfox, I sent 100 concurrent requests to the endpoint. With max-connections = 30 and accept-count = 10, the total capacity is 40 requests. I expected 60 requests to be discarded, and the remaining 40 to be processed in batches based on the thread limit (threads.max = 15).
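If you prefer to script the burst instead of using a GUI tool, the JDK's built-in HTTP client is enough. This is a minimal sketch under stated assumptions: the URL http://localhost:8080/test assumes the application runs locally on Spring Boot's default port, the class LoadBurst and record Result are hypothetical names, and only HTTP 200 is counted as success.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical load driver; not the tool used for the results in this article.
class LoadBurst {

    /** Responses with HTTP 200 versus everything else (errors, refusals, timeouts). */
    record Result(int ok, int failed) {}

    static Result fire(URI target, int count) {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // one connection per in-flight request
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(target)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();

        // Start all requests without waiting, so they reach the server at about the same time.
        List<CompletableFuture<Boolean>> inFlight = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            inFlight.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                    .thenApply(response -> response.statusCode() == 200)
                    .exceptionally(error -> false)); // refused or timed out
        }

        int ok = 0;
        for (CompletableFuture<Boolean> future : inFlight) {
            if (future.join()) ok++;
        }
        return new Result(ok, count - ok);
    }

    public static void main(String[] args) {
        // Assumed address of the test endpoint; adjust host and port to your setup.
        System.out.println(fire(URI.create("http://localhost:8080/test"), 100));
    }
}
```

The split between ok and failed gives a rough view of how many requests were served and how many were turned away. The exact numbers depend on the client timeouts and the operating system's backlog handling.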

Results

The test results aligned with expectations:

40 requests were processed: 15 in the first batch, 15 in the second, and 10 in the third, respecting the thread limit of 15.
60 requests were discarded, as they exceeded max-connections + accept-count (30 + 10 = 40).
The console logs confirmed that no more than 15 threads were active at any time, validating the thread pool behavior.
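The batch pattern follows directly from dividing the admitted requests by the thread limit. A minimal sketch of that arithmetic, with the hypothetical helper batchSizes and the assumption that every request takes the same 500 ms:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative arithmetic only; BatchMath and batchSizes are hypothetical names.
class BatchMath {

    /** Splits the admitted requests into waves of at most maxThreads requests each. */
    static List<Integer> batchSizes(int admitted, int maxThreads) {
        List<Integer> batches = new ArrayList<>();
        for (int remaining = admitted; remaining > 0; remaining -= maxThreads) {
            batches.add(Math.min(remaining, maxThreads));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> batches = batchSizes(40, 15);
        System.out.println("Batches: " + batches);
        // Each batch sleeps 500 ms, so the burst drains in about batches * 500 ms.
        System.out.println("Approximate drain time: " + batches.size() * 500 + " ms");
    }
}
```

For 40 admitted requests and 15 threads this gives batches of 15, 15, and 10, or about 1.5 seconds to drain the burst.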

With virtual threads, this test would scale dramatically. By setting spring.threads.virtual.enabled: true and increasing max-connections to 100,000, the same endpoint could keep tens of thousands of requests in flight at once, limited mainly by hardware and backend capacity. 🚀

Key Takeaway

With traditional platform threads, requests are processed immediately as long as the number of concurrent requests does not exceed server.tomcat.threads.max. Excess requests wait, up to a total of max-connections + accept-count. Beyond this, requests are discarded. With virtual threads, SpringBoot can handle orders of magnitude more requests, potentially reaching 1 million requests per second by leveraging lightweight threads and non-blocking I/O. 🎉

Concurrency Challenges: A Deeper Look 🔍

While exploring request handling, I encountered an interesting concurrency issue. Consider a scenario where multiple threads access a shared resource, such as a variable tracking the total number of dishes cooked in our restaurant analogy.

The Problem

Suppose we define an instance field cookSum in a singleton Spring Controller. Because the controller is a singleton, every request thread shares this one field:

private int cookSum = 0;

@GetMapping("/test")
public Response test1(HttpServletRequest request) throws Exception {
    cookSum += 1;
    log.info("Cooked {} dishes", cookSum);
    Thread.sleep(500);
    return Response.buildSuccess();
}

In a concurrent scenario, multiple threads may read and update cookSum simultaneously. For example:

Thread 1 reads cookSum = 20, calculates 20 + 1, but before writing back, Thread 2 reads cookSum = 20.
Both threads write 21 back, even though two dishes were cooked, so cookSum should be 22.

This is a classic race condition, common in Spring's singleton beans, which are shared across threads. With virtual threads, this issue persists, as they still share the same memory space. 🛑

Solution

To avoid this, use synchronization mechanisms like locks or atomic variables (e.g., AtomicInteger). For example:

private final AtomicInteger cookSum = new AtomicInteger(0);

@GetMapping("/test")
public Response test1(HttpServletRequest request) throws Exception {
    int newValue = cookSum.incrementAndGet();
    log.info("Cooked {} dishes", newValue);
    Thread.sleep(500);
    return Response.buildSuccess();
}

This ensures thread-safe updates, even with millions of virtual threads. 🔒
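The difference between the two counters can be reproduced outside Spring. The sketch below is a standalone approximation, not the controller itself: CookCounterDemo and cook are hypothetical names, and a fixed pool of 15 platform threads stands in for Tomcat's workers.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone approximation of the controller's shared counter; names are illustrative.
class CookCounterDemo {
    private int unsafeCookSum = 0;                                   // plain field: updates can be lost
    private final AtomicInteger safeCookSum = new AtomicInteger();   // atomic: no lost updates

    /** Runs `requests` increments on `threads` workers and returns {unsafe total, atomic total}. */
    static int[] cook(int requests, int threads) {
        CookCounterDemo kitchen = new CookCounterDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                kitchen.unsafeCookSum += 1;            // read-modify-write, not atomic
                kitchen.safeCookSum.incrementAndGet(); // single atomic operation
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new int[] {kitchen.unsafeCookSum, kitchen.safeCookSum.get()};
    }

    public static void main(String[] args) {
        int[] totals = cook(100_000, 15);
        System.out.println("plain int: " + totals[0] + ", AtomicInteger: " + totals[1]);
    }
}
```

The AtomicInteger total always equals the number of requests. The plain int total can come out lower, and by how much varies from run to run, which is what makes this kind of bug hard to spot in testing.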

Conclusion

The interview question about IP-to-thread mapping led me to uncover the intricacies of SpringBoot's request handling via Tomcat's thread pool and connection management. By configuring parameters like threads.max, max-connections, and accept-count, you can control how many requests are processed or queued. With virtual threads, SpringBoot can scale to handle 1 million requests per second, making it a powerhouse for high-concurrency applications. Practical testing confirmed the theoretical limits, and exploring concurrency revealed potential pitfalls in shared resource management. This deep dive not only answered the interviewer's question but also reinforced the importance of hands-on experimentation to truly understand system behavior. 🧑‍💻
