Java 21 Virtual Threads: A Brief Introduction

on 2024-08-15

In recent years, reactive Java has been growing in popularity, but writing reactive code is tricky to get used to due to the asynchronous (non-blocking) style it requires. Furthermore, asynchronous Java comes with some common pains such as:

  • hard to comprehend stacktraces/thread dumps
  • unfriendly debugging
  • messy chaining of operations

[Image: lots of callbacks nested inside each other, showcasing async messiness]

Async code can look like this, and it's not pleasant.
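As a rough illustration of the kind of nesting meant here, below is a minimal sketch using CompletableFuture. The class AsyncChain and the findUser/findOrder/findInvoice lookups are hypothetical stand-ins, not taken from any real codebase.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical example: three dependent async lookups chained together.
final class AsyncChain {
    // Each lookup returns immediately with a future (stand-ins for real I/O).
    static CompletableFuture<String> findUser(int id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id);
    }

    static CompletableFuture<String> findOrder(String user) {
        return CompletableFuture.supplyAsync(() -> user + "/order-1");
    }

    static CompletableFuture<String> findInvoice(String order) {
        return CompletableFuture.supplyAsync(() -> order + "/invoice-1");
    }

    // Every step that needs an earlier result adds another level of nesting.
    static CompletableFuture<String> invoiceFor(int userId) {
        return findUser(userId).thenCompose(user ->
                findOrder(user).thenCompose(order ->
                        findInvoice(order).thenApply(invoice ->
                                user + " -> " + invoice)));
    }

    public static void main(String[] args) {
        System.out.println(invoiceFor(7).join());
    }
}
```

Even with only three steps, the control flow lives inside lambdas, which is also where the stack traces end up pointing.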


Virtual threads are a new addition in Java 21 that may provide some benefits for you. A virtual thread is a thread that runs on top of a platform thread (and yes, with its own stack!). It unmounts itself from the platform thread when a blocking operation occurs, giving another virtual thread the chance to do some work.

Note: A platform thread is a thin JVM wrapper around an OS thread, and its default initial stack size requires about 1-2MB of memory depending on the platform (in contrast to kilobytes for virtual threads).

The primary goal of virtual threads is to let you write blocking, imperative-style code and still gain the benefits of non-blocking code. They won't provide a benefit if your codebase is already modeled around the asynchronous style.

Two issues present with virtual threads in Java 21 are that a virtual thread will pin itself to the platform thread when it:

  • blocks inside a synchronized block or method
  • calls native code (FFI)

The usual advice is to replace your synchronized blocks with ReentrantLock (though you should measure and monitor your specific cases). If library code contains synchronized blocks, you will need to wait until the library is updated.

Keep in mind: this is most important for blocks where a lot of the work doesn't require synchronisation.

Sample code where synchronized is used on a method

The above, refactored to use ReentrantLock instead
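A minimal sketch of that kind of refactor is below. The two counter classes are hypothetical and only there to show the shape of the change, with the lock always released in a finally block.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by the object's monitor. On Java 21, a virtual
// thread that blocks while inside a synchronized method stays pinned to its
// platform thread.
final class SynchronizedCounter {
    private int count;

    synchronized int incrementAndGet() {
        return ++count;
    }
}

// The same counter guarded by a ReentrantLock. A virtual thread waiting on
// this lock can unmount from its platform thread instead of pinning it.
final class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    int incrementAndGet() {
        lock.lock();
        try {
            return ++count;
        } finally {
            lock.unlock(); // always release, even if the guarded code throws
        }
    }
}
```

Both classes behave the same from the caller's point of view; only the locking mechanism differs.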

This is a gross oversimplification, however. Netflix wrote a blog post about an issue they faced: (Java 21 Virtual Threads - Dude, Where’s My Lock?)

The issue is really about how virtual threads interact with monitors right now. There is an open JEP draft to improve on this: (JEP draft: Adapt Object Monitors for Virtual Threads)


The goal of virtual threads is not about making things faster, but allowing scalability with blocking-style code. If your work is CPU-bound, then you would continue to use platform threads.

Since a virtual thread is still a thread, it has the same semantics as a platform thread, so there is very little mental overhead compared to mixing in lots of async code.

Bytecode of a pthread vs vthread

The bytecode above highlights this as well. (Bytecode Cheatsheet)
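To make the shared semantics concrete, here is a minimal sketch (the class ThreadKinds and its helper methods are made up for illustration). Both builders hand back a plain java.lang.Thread, so the calling code is identical either way.

```java
import java.util.List;

// Hypothetical helper showing that both kinds of thread share the Thread API.
final class ThreadKinds {
    // Creates an unstarted platform thread for the task.
    static Thread platformThread(Runnable task) {
        return Thread.ofPlatform().unstarted(task);
    }

    // Creates an unstarted virtual thread for the task.
    static Thread virtualThread(Runnable task) {
        return Thread.ofVirtual().unstarted(task);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().isVirtual() ? "virtual" : "platform");

        // Same start/join calls regardless of the kind of thread.
        for (Thread t : List.of(platformThread(task), virtualThread(task))) {
            t.start();
            t.join();
        }
    }
}
```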

The JVM has a scheduler for managing the virtual threads so you don't need to think about how they work compared to platform threads.


To set the scene if you are still a little puzzled:

Imagine you have a CPU with 4 physical cores (which makes 8 virtual cores in total). This means 8 platform threads can make use of the CPU at any one time. Virtual threads, however, can be (almost) limitless due to their lighter weight, since they are internal to the JVM with no outbound calls to the OS.

Another key detail here is that, for the overhead of creating a virtual thread to pay off, your blocking operations need to be long enough that you gain throughput compared to N platform threads. This is why you must measure to see if gains are possible.

Note: to calculate your throughput in tasks per second, with latency in milliseconds, you can use the following: (1000ms / latency) * concurrency = throughput

A trivial example:

Here are 2048 tasks that each take 2 seconds, run on virtual threads and on platform threads:

Running 2048 virtual threads (1 thread per task)


Running 20 OS threads (on 20 virtual cores) executing 2048 tasks


Virtual thread result: 2024ms

Platform threads result: 206,030ms (3m26s)
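A rough sketch of how a comparison like this could be set up is below. The class BlockingTaskBenchmark and the runMillis helper are illustrative names, not necessarily the code that produced the numbers above, and Thread.sleep stands in for a real blocking call. Timings will vary by machine.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical benchmark: every task just blocks for a fixed duration.
final class BlockingTaskBenchmark {
    // Submits `tasks` blocking tasks, waits for all of them to finish and
    // returns the elapsed wall-clock time in milliseconds.
    static long runMillis(ExecutorService executor, int tasks, Duration blockFor) {
        long start = System.nanoTime();
        try (executor) { // close() waits for the submitted tasks (Java 19+)
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(blockFor); // simulated blocking operation
                    return null;
                });
            }
        }
        return Duration.ofNanos(System.nanoTime() - start).toMillis();
    }

    public static void main(String[] args) {
        int tasks = 2048;
        Duration blockFor = Duration.ofSeconds(2);

        // One virtual thread per task.
        long virtualMs = runMillis(
                Executors.newVirtualThreadPerTaskExecutor(), tasks, blockFor);
        System.out.println("Virtual threads: " + virtualMs + "ms");

        // 20 platform threads sharing all the tasks (takes a few minutes).
        long platformMs = runMillis(
                Executors.newFixedThreadPool(20), tasks, blockFor);
        System.out.println("Platform threads: " + platformMs + "ms");
    }
}
```

The only difference between the two runs is the executor passed in, which is the point: the task code itself stays in plain blocking style.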

Based on what we know from above, this makes sense. When a virtual thread blocks, it unmounts and another virtual thread can run in its place. That 2048 threads finished in 2024ms shows how cheap the construction and cleanup of virtual threads is!

So why did the platform threads take so drastically longer? Based on the formula above:

(1000ms / 2000ms) * 20 = 10 tasks per second

So for 2048 tasks, 2048 / 10 = 204.8s (roughly 3m25s), which is close to the measured 3m26s.
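The same arithmetic can be written as a small helper. This is a sketch only, and the class and method names are made up. The batchedSeconds method is an extra assumption on my part: it treats the fixed pool as running the tasks in full rounds of 20, which accounts for the last partially filled round.

```java
// Hypothetical helper for the throughput formula used above.
final class Throughput {
    // (1000ms / latency) * concurrency, in tasks per second.
    static double perSecond(double latencyMillis, int concurrency) {
        return (1000.0 / latencyMillis) * concurrency;
    }

    // Estimated seconds to finish `tasks` at that throughput.
    static double secondsFor(int tasks, double latencyMillis, int concurrency) {
        return tasks / perSecond(latencyMillis, concurrency);
    }

    // Alternative estimate: tasks run in rounds of `concurrency`, and a
    // partially filled last round still costs a full latency.
    static double batchedSeconds(int tasks, double latencyMillis, int concurrency) {
        long rounds = (tasks + concurrency - 1) / concurrency; // ceiling division
        return rounds * latencyMillis / 1000.0;
    }

    public static void main(String[] args) {
        System.out.println(perSecond(2000, 20));
        System.out.println(secondsFor(2048, 2000, 20));
        System.out.println(batchedSeconds(2048, 2000, 20));
    }
}
```

For 2048 tasks of 2000ms on 20 threads this gives 10 tasks per second and 204.8 seconds from the formula, and 103 rounds of 2 seconds, so 206 seconds, from the round-based estimate, which lines up with the measured 206,030ms.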


It's a simple example, but I hope it highlights their usefulness.

There's still more for me to learn about virtual threads, so I'm sure there will be plenty more to say about them in the future.

Other Links

Virtual threads JEP

Virtual threads PR

[Presentation] Java's Virtual Threads - Next Steps

"Virtual Threads: An Adoption Guide" by Oracle

Project Loom mailing list