dreaife

Using Java Thread Pools

2024-02-03

cs-base

java / multi-prog / meeting / doc

Correct declaration of thread pools#

Declare thread pools manually through the ThreadPoolExecutor constructor; avoid creating them with the Executors factory methods, which can lead to OOM (OutOfMemoryError).

The thread pools returned by Executors have the following drawbacks:

FixedThreadPool and SingleThreadExecutor: use an unbounded LinkedBlockingQueue; the task queue can reach a maximum length of Integer.MAX_VALUE, so a large number of requests may accumulate and lead to OOM.
CachedThreadPool: uses the synchronous queue SynchronousQueue and allows up to Integer.MAX_VALUE threads, so a large number of threads may be created and cause OOM.
ScheduledThreadPool and SingleThreadScheduledExecutor: use the unbounded delay queue DelayedWorkQueue; the task queue can reach a maximum length of Integer.MAX_VALUE, so a large number of requests may accumulate and lead to OOM.

In short: use bounded queues and control the number of threads created.

Besides avoiding OOM, there are other reasons not to use the ready-made thread pools provided by Executors:

In practice you need to manually configure thread pool parameters according to your machine’s performance and your business scenario, such as core pool size, the task queue to use, saturation policies, etc.
We should explicitly name our thread pools, which helps us locate problems.
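As a minimal sketch of the advice above, the following declares a pool directly through the ThreadPoolExecutor constructor with a bounded queue, a capped thread count, named threads, and an explicit saturation policy. The class name OrderPoolFactory, the "order-pool-" prefix, and all sizes are illustrative assumptions, not recommended values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class OrderPoolFactory {

    private OrderPoolFactory() {
    }

    /** Builds a pool with a bounded queue, a capped thread count and named threads. */
    static ThreadPoolExecutor newOrderPool() {
        AtomicInteger seq = new AtomicInteger();
        // Hypothetical business prefix; pick one that identifies the workload.
        ThreadFactory factory = r -> new Thread(r, "order-pool-" + seq.incrementAndGet());
        return new ThreadPoolExecutor(
                4,                              // corePoolSize (illustrative)
                8,                              // maximumPoolSize (illustrative)
                60L, TimeUnit.SECONDS,          // keep-alive for threads above the core size
                new ArrayBlockingQueue<>(200),  // bounded queue: at most 200 waiting tasks
                factory,
                new ThreadPoolExecutor.CallerRunsPolicy()); // saturation policy
    }
}
```

With CallerRunsPolicy, a task that cannot be queued runs on the submitting thread, which slows producers down instead of dropping work; other policies may suit other workloads better.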

Monitoring thread pool status#

You can monitor the running state of a thread pool through various means, such as Spring Boot’s Actuator component.

In addition, you can use the APIs of ThreadPoolExecutor to build a simple monitor. ThreadPoolExecutor provides methods to obtain the current pool size, the number of active threads, the number of completed tasks, the number of tasks in the queue, and so on.

Here is a simple Demo. printThreadPoolStatus() prints the thread pool size, active count, completed task count, and the number of tasks in the queue every second.

/**
 * Print the status of a thread pool every second.
 * The logger (log) and the createThreadFactory helper are defined elsewhere in the surrounding class.
 *
 * @param threadPool the thread pool object
 */
public static void printThreadPoolStatus(ThreadPoolExecutor threadPool) {
    ScheduledExecutorService scheduledExecutorService = new ScheduledThreadPoolExecutor(1, createThreadFactory("print-images/thread-pool-status", false));
    scheduledExecutorService.scheduleAtFixedRate(() -> {
        log.info("=========================");
        log.info("ThreadPool Size: [{}]", threadPool.getPoolSize());
        log.info("Active Threads: {}", threadPool.getActiveCount());
        log.info("Completed Tasks: {}", threadPool.getCompletedTaskCount());
        log.info("Number of Tasks in Queue: {}", threadPool.getQueue().size());
        log.info("=========================");
    }, 0, 1, TimeUnit.SECONDS);
}

Use different thread pools for different types of work#

Many people encounter this question in real projects: My project has multiple business areas that require thread pools—should I define a separate pool for each area, or should I define a single shared pool?

The usual recommendation is to use different thread pools for different businesses, configuring each pool according to the current business scenario, because different businesses have different concurrency levels and resource usage, and the optimization should focus on the system’s bottlenecks.

Let’s look at a real incident in which parent tasks and their sub-tasks were submitted to the same thread pool.

Code structured that way can deadlock. Why?

Imagine an extreme scenario: suppose the core pool size of our thread pool is n, the number of parent tasks (charge deduction tasks) is n, and under each parent task there are two sub-tasks (sub-tasks under the deduction task), one of which has already completed while the other is queued. Because the parent task has exhausted the core thread resources, the sub-task cannot obtain a thread resource and cannot proceed, so it remains blocked in the queue. The parent task waits for the sub-task to complete, while the sub-task waits for the parent task to release the thread pool resources, which leads to a “deadlock”.

The solution is simple: add a new thread pool dedicated to executing the sub-tasks.
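The fix described above can be sketched as follows, assuming n parent tasks that each wait for two sub-tasks. The class SeparatePoolsDemo, the method runParents, and the pool sizes are hypothetical names and values chosen for illustration. If the sub-tasks were submitted to parentPool instead of childPool, all n threads would be held by waiting parents and the sub-tasks would never leave the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SeparatePoolsDemo {

    private static ThreadPoolExecutor newPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(100));
    }

    /** Runs n parent tasks; each waits for two sub-tasks that run on a dedicated pool. */
    static int runParents(int n) {
        ThreadPoolExecutor parentPool = newPool(n); // every thread ends up held by a parent
        ThreadPoolExecutor childPool = newPool(n);  // sub-tasks never compete with parents
        try {
            Callable<Integer> child = () -> 1;
            Callable<Integer> parent = () -> {
                Future<Integer> a = childPool.submit(child);
                Future<Integer> b = childPool.submit(child);
                return a.get() + b.get(); // blocks a parent thread, not a child thread
            };
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                results.add(parentPool.submit(parent));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(5, TimeUnit.SECONDS); // bounded wait, just in case
            }
            return total;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException("parent tasks did not complete", e);
        } finally {
            parentPool.shutdownNow();
            childPool.shutdownNow();
        }
    }
}
```

Calling SeparatePoolsDemo.runParents(3) completes because the waiting parents and the running sub-tasks draw on different threads.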

Don’t forget to name your thread pools#

When initializing a thread pool, you should explicitly name it (set a thread pool name prefix); this helps with debugging.

By default, created thread names look like pool-1-thread-n, which lacks business meaning and makes it harder to locate problems.

There are usually two ways to name the threads in a thread pool:

Using Guava’s ThreadFactoryBuilder

// Requires Guava: import com.google.common.util.concurrent.ThreadFactoryBuilder;
ThreadFactory threadFactory = new ThreadFactoryBuilder()
                        .setNameFormat(threadNamePrefix + "-%d")
                        .setDaemon(true).build();
ExecutorService threadPool = new ThreadPoolExecutor(corePoolSize, maximumPoolSize, keepAliveTime, TimeUnit.MINUTES, workQueue, threadFactory);

Implementing your own ThreadFactory.

import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Thread factory that sets thread names to help locate issues.
 */
public final class NamingThreadFactory implements ThreadFactory {

    private final AtomicInteger threadNum = new AtomicInteger();
    private final String name;

    /**
     * Create a thread factory with a name prefix.
     */
    public NamingThreadFactory(String name) {
        this.name = name;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r);
        t.setName(name + " [#" + threadNum.incrementAndGet() + "]");
        return t;
    }
}

Correctly configure thread pool parameters#

Let’s first review the common ways books and blogs recommend configuring thread pool parameters, which can serve as a reference.

General practices#

Sizing a thread pool is like deciding how many people to assign to a job: too few and work piles up, too many and they get in each other’s way. With threads, the main cost of having too many is extra context switching.

If the thread pool size is too small, when many tasks/requests arrive at once, many will queue up waiting to execute, or the queue may fill up and prevent new tasks from being processed, or a large number of tasks may accumulate in the queue causing OOM. This is clearly problematic, as the CPU is not utilized efficiently.
If the thread count is too large, many threads may compete for CPU resources, causing a lot of context switching and increasing the execution time for each thread, reducing overall efficiency.

A simple, widely applicable formula:

CPU-bound tasks (N+1): These tasks primarily consume CPU resources; you can set the thread count to N (CPU cores) + 1. The extra thread helps prevent the impact of occasional page faults or other reasons for task pauses. When a task pauses, the CPU would be idle; the extra thread can make better use of that idle time.
I/O-bound tasks (2N): In practice, these tasks spend most of their time waiting for I/O; while waiting, the threads don’t use CPU, so you can give CPU time to other threads. For I/O-bound workloads, you can configure more threads, with a rule of thumb of 2N.

How to determine CPU-bound vs IO-bound tasks?

CPU-bound tasks are those that use CPU compute, such as sorting large data in memory. Any operation involving network reads or file reads tends to be I/O-bound. These tasks typically spend little time on CPU calculations compared to waiting for I/O.

A more rigorous calculation for the optimal thread count is: bestThreadCount = N (CPU cores) * (1 + WT/ST), where ST is the time a thread spends computing on the CPU and WT is the time it spends waiting, that is, WT = total running time - ST.

The higher the proportion of waiting time, the more threads you need. The higher the proportion of compute time, the fewer threads you need.

You can use VisualVM (bundled with JDK 6 through 8, a separate download since JDK 9) to estimate the WT/ST ratio.

For CPU-bound tasks, WT/ST is close to 0, so the formula gives N * (1 + 0) = N threads, which is close to the N + 1 rule discussed above.

For I/O-bound tasks, almost all of the running time is waiting time, so WT/ST is large and the formula would suggest a very large thread count; the 2N rule of thumb is a deliberate cap that avoids creating an excessive number of threads.
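To make the arithmetic concrete, here is a small sketch of the three sizing rules. The class PoolSizing and its method names are hypothetical, and rounding to the nearest whole thread is an assumption of this sketch rather than part of the formula.

```java
final class PoolSizing {

    private PoolSizing() {
    }

    /** CPU-bound rule of thumb: N + 1. */
    static int cpuBound(int cores) {
        return cores + 1;
    }

    /** I/O-bound rule of thumb: 2N. */
    static int ioBound(int cores) {
        return 2 * cores;
    }

    /** bestThreadCount = N * (1 + WT/ST), rounded to the nearest whole thread. */
    static int byWaitRatio(int cores, double waitTime, double serviceTime) {
        if (cores <= 0 || waitTime < 0 || serviceTime <= 0) {
            throw new IllegalArgumentException("cores and serviceTime must be positive, waitTime non-negative");
        }
        return (int) Math.max(1L, Math.round(cores * (1 + waitTime / serviceTime)));
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound: " + cpuBound(n));
        System.out.println("I/O-bound: " + ioBound(n));
        // A task that waits 90 ms for every 10 ms of computation (WT/ST = 9).
        System.out.println("By ratio:  " + byWaitRatio(n, 90, 10));
    }
}
```

For example, with 8 cores and WT/ST = 9 the formula yields 8 * (1 + 9) = 80 threads, far more than the 16 suggested by the 2N rule, which is why the result is best treated as an upper reference point and verified under load.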

Note: the formulas above are only references; in real projects you rarely set thread pool parameters strictly by formula, since different scenarios have different needs. Dynamic tuning based on actual runtime conditions is advisable.

Meituan’s optimization approach#

Meituan’s technical team, in the article Java Thread Pool Implementation Principles and Practices in Meituan, discusses ideas and methods for making thread pool parameters configurable.

Their approach focuses on making core thread pool parameters customizable. The three core parameters are:

corePoolSize: core thread count; the number of threads the pool keeps alive even when they are idle. While the queue is not full, this is also the maximum number of threads that run simultaneously.
maximumPoolSize: once the queue is full, the pool may grow up to this maximum number of concurrently running threads.
workQueue: when a new task arrives, the pool first checks whether the current number of running threads has reached the core pool size; if so, the new task is stored in the queue.
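The interplay of the three parameters can be observed with a deliberately tiny pool. This is a sketch with assumed values (core 1, max 2, queue capacity 1) and a hypothetical class name, PoolGrowthDemo; the tasks block on a latch so that none of them finishes while the pool state is inspected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {

    /** Submits four blocking tasks and returns "poolSize/queued/rejected". */
    static String submitFour() {
        // core = 1, max = 2, queue capacity = 1, default AbortPolicy
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // hold the thread until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int rejected = 0;
        for (int i = 0; i < 4; i++) {
            try {
                // 1st: starts the core thread; 2nd: queued;
                // 3rd: queue full, so a second thread starts; 4th: rejected
                pool.execute(blocker);
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        String result = pool.getPoolSize() + "/" + pool.getQueue().size() + "/" + rejected;
        release.countDown(); // let the blocked tasks finish
        pool.shutdown();
        return result;
    }
}
```

PoolGrowthDemo.submitFour() returns "2/1/1": two threads, one queued task, one rejection. Note that the second thread is only created after the queue is full, which is why an unbounded queue means maximumPoolSize never takes effect.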

Why these three parameters?

These are the most important parameters for ThreadPoolExecutor; they largely determine the pool’s handling strategy for tasks.

Note that corePoolSize is special: if you call setCorePoolSize() at runtime, the pool first checks whether the current number of worker threads is greater than the new corePoolSize; if so, it interrupts idle workers so that the excess threads can be reclaimed.

Also, you’ll notice there’s no dynamic way to set the queue length above; Meituan’s approach customizes a queue called ResizableCapacityLinkedBlockingQueue (essentially removing the final modifier on the capacity field of LinkedBlockingQueue, making it mutable).
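Adjusting the thread counts at runtime needs nothing beyond the JDK setters. The sketch below uses a hypothetical helper, PoolResizer.resize, and picks the call order so that corePoolSize never exceeds maximumPoolSize at any point, because recent JDK versions reject such a combination with an IllegalArgumentException. It is not taken from the Meituan article and does not cover the queue capacity, which a standard LinkedBlockingQueue cannot change after construction.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolResizer {

    private PoolResizer() {
    }

    /** Applies new core and maximum sizes without passing through an invalid state. */
    static void resize(ThreadPoolExecutor pool, int core, int max) {
        if (core < 0 || max <= 0 || core > max) {
            throw new IllegalArgumentException("need 0 <= core <= max and max > 0");
        }
        if (max >= pool.getMaximumPoolSize()) {
            // Growing: raise the ceiling first, then the core size.
            pool.setMaximumPoolSize(max);
            pool.setCorePoolSize(core);
        } else {
            // Shrinking: lower the core size first, then the ceiling.
            pool.setCorePoolSize(core);
            pool.setMaximumPoolSize(max);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        resize(pool, 8, 16);
        System.out.println(pool.getCorePoolSize() + " / " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

In a real system the new values would typically come from a configuration center, and concurrent calls to resize would need external synchronization; both are outside the scope of this sketch.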

If your project also wants to achieve this, you can leverage existing open-source projects:

Hippo4j: asynchronous thread pool framework, supports dynamic changes, monitoring, and alerting with no code changes required. Supports multiple usage modes; aims to improve system reliability.
Dynamic TP: lightweight dynamic thread pool with built-in monitoring and alerting; integrates with third-party middleware thread pool management; based on mainstream configuration centers (Nacos, Apollo, Zookeeper, Consul, Etcd; SPI extension available).

Don’t forget to shut down thread pools#

When a thread pool is no longer needed, you should explicitly shut it down to release resources.

Thread pools offer two shutdown methods:

shutdown(): shuts down the thread pool; its state becomes SHUTDOWN. It will not accept new tasks, but tasks in the queue will be completed.
shutdownNow(): shuts down the thread pool; its state becomes STOP. It will attempt to stop currently executing tasks, halt processing of queued tasks, and return a list of tasks that are awaiting execution.

Calling shutdownNow and shutdown does not mean the shutdown is complete; it merely asynchronously requests the pool to stop. If you need to wait synchronously for the pool to fully shut down before proceeding, you should call awaitTermination to wait.

// ...
// Shut down the thread pool
executor.shutdown();
try {
    // Wait for shutdown, up to 5 minutes
    if (!executor.awaitTermination(5, TimeUnit.MINUTES)) {
        // If waiting times out, log
        System.err.println("Thread pool did not terminate within 5 minutes");
    }
} catch (InterruptedException e) {
    // Restore the interrupt status so callers can see that the wait was interrupted
    Thread.currentThread().interrupt();
}
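The two shutdown methods are often combined: request an orderly shutdown first and fall back to shutdownNow() only if the pool does not terminate in time. The following sketch follows the two-phase pattern shown in the ExecutorService Javadoc; the class name GracefulShutdown and the choice to reuse the same timeout for both phases are assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

final class GracefulShutdown {

    private GracefulShutdown() {
    }

    /** Returns true if the pool terminated within the allowed time. */
    static boolean shutdownAndAwait(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks, let queued ones finish
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, discard queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Tasks only stop early under shutdownNow() if they respond to interruption, so a false return value means some task is still running and deserves a log entry or an alert.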

Try to avoid submitting long-running tasks to the thread pool#

The purpose of a thread pool is to improve task execution efficiency and avoid the overhead of repeatedly creating and destroying threads. If you submit long-running tasks to the pool, threads may be occupied for a long time, preventing timely responses to other tasks; once every thread is occupied and the queue fills up, the whole application can appear to hang.

Therefore, when using a thread pool, try to avoid submitting time-consuming tasks to it. For long-running operations such as network requests or file I/O, consider asynchronous processing to avoid blocking threads in the pool.

Some gotchas when using thread pools#

Pitfall of repeatedly creating thread pools#

Thread pools are reusable; do not create a new thread pool for every request, for example:

@GetMapping("wrong")
public String wrong() throws InterruptedException {
    // Anti-pattern: a new thread pool is created on every request and never shut down
    ThreadPoolExecutor executor = new ThreadPoolExecutor(5,10,1L,TimeUnit.SECONDS,new ArrayBlockingQueue<>(100),new ThreadPoolExecutor.CallerRunsPolicy());

    // Process tasks
    executor.execute(() -> {
      // ......
    });
    return "OK";
}

The problem is that every request creates a new pool that is never shut down, so its core threads stay alive and threads accumulate with each call. Define the pool once and reuse it across requests.
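One way to avoid the per-request pool is to hold a single shared instance, sketched here as a static field with the same parameters as the example above. The class SharedPoolHandler and its methods are hypothetical; in a Spring application the pool would more likely be a singleton bean.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedPoolHandler {

    // Created once when the class is initialized and reused by every call to handle().
    private static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(
            5, 10, 1L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());

    // Stand-in for real work: counts how many tasks have run.
    private static final AtomicInteger HANDLED = new AtomicInteger();

    private SharedPoolHandler() {
    }

    /** Handles one request by submitting work to the shared pool. */
    static String handle() {
        EXECUTOR.execute(HANDLED::incrementAndGet);
        return "OK";
    }

    /** Exposes the shared pool, for example to shut it down when the application stops. */
    static ThreadPoolExecutor executor() {
        return EXECUTOR;
    }
}
```

No matter how many times handle() is called, the number of threads stays within the pool's configured limits. Remember to shut the shared pool down when the application stops.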

Pitfalls of Spring’s internal thread pools#

When using Spring’s internal thread pools, you must manually customize the pool with reasonable parameters; otherwise you may encounter production issues such as a new thread being created for every task, which is how Spring’s SimpleAsyncTaskExecutor behaves.

import java.util.concurrent.Executor;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class ThreadPoolExecutorConfig {

    @Bean(name="threadPoolExecutor")
    public Executor threadPoolExecutor(){
        ThreadPoolTaskExecutor threadPoolExecutor = new ThreadPoolTaskExecutor();
        int processNum = Runtime.getRuntime().availableProcessors();
        int corePoolSize = (int) (processNum / (1 - 0.2));
        int maxPoolSize = (int) (processNum / (1 - 0.5));
        threadPoolExecutor.setCorePoolSize(corePoolSize);
        threadPoolExecutor.setMaxPoolSize(maxPoolSize);
        threadPoolExecutor.setQueueCapacity(maxPoolSize * 1000);
        threadPoolExecutor.setThreadPriority(Thread.MAX_PRIORITY);
        threadPoolExecutor.setDaemon(false);
        threadPoolExecutor.setKeepAliveSeconds(300);
        threadPoolExecutor.setThreadNamePrefix("test-Executor-");
        return threadPoolExecutor;
    }
}

Pitfalls where ThreadLocal and thread pools collide#

Using a thread pool with ThreadLocal can cause a task to read stale or dirty values. This happens because the pool reuses worker threads, and ThreadLocal values are stored per thread, so they survive from one task to the next. A task may therefore read a value that an earlier task left behind on the same worker thread.
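The effect is easy to reproduce with a single-thread pool, which forces two tasks onto the same worker. This is a minimal sketch with hypothetical names (ThreadLocalLeakDemo, CURRENT_USER, the value "alice"); it also shows the plain-JDK remedy of calling remove() in a finally block, which is a different technique from context propagation libraries.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ThreadLocalLeakDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    private ThreadLocalLeakDemo() {
    }

    /** Runs two tasks on one worker thread and returns what the second task reads. */
    static String secondTaskSees(boolean cleanUp) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));
        try {
            Runnable first = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... business logic that reads CURRENT_USER ...
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove(); // clear the value before the thread is reused
                    }
                }
            };
            Callable<String> second = CURRENT_USER::get; // never sets a value itself
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("demo tasks failed", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Without cleanup, secondTaskSees(false) returns "alice" even though the second task never set it; with cleanup, secondTaskSees(true) returns null.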

Don’t assume that not explicitly using a thread pool in your code means there’s no thread pool involved; web servers like Tomcat use thread pools to handle requests and concurrency, often using custom thread pools built on top of native Java thread pools.

Of course, you could configure Tomcat to handle requests with a single thread, but this is not advisable as it would severely limit throughput.

server.tomcat.max-threads=1

A recommended solution to the above issue is Alibaba’s open-source TransmittableThreadLocal (TTL). The TransmittableThreadLocal class extends and enhances the JDK’s built-in InheritableThreadLocal. In components that pool and reuse threads, TTL provides propagation of ThreadLocal values to solve context transmission problems in asynchronous execution.

TransmittableThreadLocal project page: https://github.com/alibaba/transmittable-thread-local.

Using Java Thread Pools

https://dreaife.tokyo/en/posts/java-thread-pool/

Author

dreaife

Published at

2024-02-03

License

CC BY-NC-SA 4.0
