Sync Vs Async
Synchronous vs asynchronous is not just a JavaScript concept. It shows up in OS scheduling, database commits, replication, and file writes. Here is what it actually means at every layer.
There is one question that cuts through every layer of backend engineering.
Can I do other work while I wait for this?
That is it. That is the entire distinction between synchronous and asynchronous execution. Everything else — callbacks, promises, epoll, async commits, WAL flushing — is a specific answer to that question at a specific layer.
What Blocking Actually Means
When a program performs a synchronous IO operation — reading a file, waiting for a network response — it blocks. Not just "pauses." It physically cannot execute the next line of code.
If you used early Windows development tools in the nineties, you felt this directly. Build a loop in VB5 that ran for a few seconds and every button in the UI would freeze. Clicking anything did nothing. The UI thread was blocked executing your loop, so no click event could fire. There was a hack called DoEvents that would yield control briefly, letting the UI breathe. That was the workaround for synchronous execution in a world that had not yet figured out async.
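The same freeze is easy to reproduce in Node.js. The sketch below is only an illustration: `busyWait` and `measureTimerDelay` are hypothetical helper names, and the exact delay will vary by machine. A timer scheduled for 0 ms cannot fire until the synchronous loop releases the thread.

```javascript
// Hypothetical helper: spin the CPU synchronously for `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Burn CPU. Nothing else can run on this thread until the loop exits.
  }
}

// Schedules a 0 ms timer, then blocks the thread, and reports how late the timer fired.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    busyWait(blockMs); // the timer callback has to wait for this to finish
  });
}

measureTimerDelay(100).then((delayMs) => {
  console.log(`0 ms timer actually fired after ${delayMs} ms`);
});
```

The timer is the stand-in for the click event: it was ready to run, but the thread was busy.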
What happens to a blocked process at the OS level? The OS removes it from the CPU.
This is called a context switch. The scheduler says: you are not executing instructions, you are waiting for IO — I am going to evict you and give this CPU time to a process that is actually doing work. When your IO completes, I will put you back.
Context switches are not free. Each one costs microseconds. Done millions of times per second across many processes, it adds up.
The blocked process is not just idle — it is actively costing the machine work.
Two Models for Async IO
When you do not want to block, you have two options. Both are OS-level mechanisms.
Readiness model (epoll, select): You tell the OS: "Watch these file descriptors. Tell me when any of them have data ready to read." The OS monitors them. When data arrives, it notifies you. You then perform the actual read — which completes almost instantly because the data is already there.
This is epoll on Linux. It does not work well with regular file reads (files are always "ready"), which is why Node.js uses a different approach for disk IO.
Completion model (io_uring, IOCP): You tell the OS: "Go perform this read operation for me. When it is completely done, write the result to this buffer and notify me." You get notified only when the data is already sitting in your buffer, fully transferred.
This is io_uring on modern Linux and IO Completion Ports (IOCP) on Windows. Node.js uses IOCP on Windows. The completion model is generally more efficient because the OS does the work instead of just announcing readiness.
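Node.js does not expose epoll or io_uring directly, so the following is only an analogy at the API level, not a view of what the kernel does. Assuming a small scratch file (`scratchFile`, a made-up path), `readinessStyle` waits to be told that data is available and then performs the read itself, while `completionStyle` hands over a buffer and is called back once the buffer has been filled. Both function names are hypothetical.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical scratch file so the sketch is self-contained.
const scratchFile = path.join(os.tmpdir(), `io-models-${process.pid}.txt`);
fs.writeFileSync(scratchFile, 'hello');

// Readiness shape: get notified that data is available, then do the read yourself.
function readinessStyle(filePath) {
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(filePath);
    const chunks = [];
    stream.on('readable', () => {
      let chunk;
      while ((chunk = stream.read()) !== null) {
        chunks.push(chunk); // the actual read happens here, after the notification
      }
    });
    stream.on('end', () => resolve(Buffer.concat(chunks).toString()));
    stream.on('error', reject);
  });
}

// Completion shape: hand over a buffer, get called back once it has been filled.
function completionStyle(filePath) {
  return new Promise((resolve, reject) => {
    fs.open(filePath, 'r', (openErr, fd) => {
      if (openErr) return reject(openErr);
      const buffer = Buffer.alloc(64);
      fs.read(fd, buffer, 0, buffer.length, 0, (readErr, bytesRead) => {
        fs.close(fd, () => {});
        if (readErr) return reject(readErr);
        resolve(buffer.toString('utf8', 0, bytesRead)); // data is already in the buffer
      });
    });
  });
}

Promise.all([readinessStyle(scratchFile), completionStyle(scratchFile)])
  .then((results) => console.log(results));
```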
Figure: Synchronous vs asynchronous IO. A blocked caller vs a caller free to execute other work.
Two Approaches: Thread-Per-Request vs Event Loop
Before going deeper, it is worth naming the two dominant runtime architectures.
Thread-per-request model (traditional Java, PHP): A thread is assigned per request. That thread handles the entire lifecycle — reads the socket, runs the business logic, writes the response. If the request blocks on a database query, that thread blocks with it. One thread per concurrent request. Under high load, you spawn more threads until memory runs out or context-switching overhead kills throughput.
Event loop model (Node.js, Nginx): A single thread runs an event loop. It accepts many connections, dispatches IO work, and handles completions via callbacks or async/await. Waiting on a database call does not freeze the thread: the loop continues handling other connections in the meantime.
Node.js chose the event loop because most web workloads are IO-bound. Threads spending 90% of their time waiting for database responses is expensive. The event loop model reuses that thread instead of wasting it on waiting.
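A rough way to see the thread being reused is to simulate the slow database. In this sketch `fakeQuery` is a hypothetical stand-in for a real query (it just sleeps), and `handleAll` is an invented name. Three "requests" that each wait 100 ms should finish in roughly 100 ms on one thread, not 300 ms, because the waits overlap.

```javascript
const { setTimeout: sleep } = require('node:timers/promises');

// Hypothetical stand-in for a database call that takes `ms` to answer.
async function fakeQuery(id, ms) {
  await sleep(ms); // the thread is free to do other work during this wait
  return `row-${id}`;
}

// Handles all "requests" on the single event loop thread.
async function handleAll(ids, ms) {
  const start = Date.now();
  const rows = await Promise.all(ids.map((id) => fakeQuery(id, ms)));
  return { rows, elapsedMs: Date.now() - start };
}

handleAll([1, 2, 3], 100).then(({ rows, elapsedMs }) => {
  console.log(rows, `${elapsedMs} ms total`);
});
```

In a thread-per-request server the same overlap would need three threads, each blocked for the full 100 ms.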
How Node.js Actually Does This
Node.js uses epoll on Linux. For most network operations, this works cleanly — the OS watches sockets and notifies the event loop when data is ready.
File reads are the awkward case. epoll cannot signal readiness for file operations. So Node.js uses a different trick: it offloads file reads to a thread pool.
When you call fs.readFile():
- Node.js picks a worker thread from the pool (4 by default)
- That thread calls the blocking OS read
- The OS blocks that thread and context-switches it out
- The main thread — your event loop — continues running
- When the read completes, the worker thread calls back to the main thread
The main thread never blocked. Only the worker thread did.
Node.js ships with 4 worker threads by default. This is configurable via the UV_THREADPOOL_SIZE environment variable. The right value depends on your workload type:
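- IO-bound workload (mostly file reads, DNS lookups, and other operations that run on the pool): bump the thread count. The threads spend most of their time blocked waiting for IO, so more threads means more concurrent IO operations. Network sockets, including most database connections, do not use the pool at all; they go through epoll.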
- CPU-bound workload (image processing, encryption, heavy computation): do not exceed the number of physical CPU cores. Extra threads compete for CPU time instead of adding throughput, and the context-switching overhead hurts performance.
This is why async/await in Node.js is not magic — it is coordination. The blocking work is genuinely happening somewhere, just not on your main execution path.
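One way to watch this coordination is to push CPU-heavy work onto the pool and count what the main thread gets done in the meantime. The sketch below assumes `crypto.pbkdf2`, which Node.js runs on the libuv thread pool; `hashWhileTicking` and the iteration count are illustrative choices, and timings will differ across machines.

```javascript
const crypto = require('node:crypto');

// Runs one key derivation on the thread pool and counts how many timer
// callbacks the main thread managed to run while the worker was busy.
function hashWhileTicking(iterations) {
  return new Promise((resolve, reject) => {
    let ticks = 0;
    const timer = setInterval(() => { ticks += 1; }, 1);
    crypto.pbkdf2('password', 'salt', iterations, 64, 'sha512', (err, key) => {
      clearInterval(timer);
      if (err) return reject(err);
      resolve({ ticks, keyLength: key.length });
    });
  });
}

hashWhileTicking(300000).then(({ ticks }) => {
  console.log(`main thread ran ${ticks} timer callbacks during the hash`);
});
```

To try a different pool size, set the variable before the process starts, for example `UV_THREADPOOL_SIZE=8 node app.js` (where `app.js` is a placeholder script name).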
The difference is visible in execution order. Here is the three-generation evolution of async patterns in Node.js. All three do the same file read; the first two print in one order and the third in another:
// Shared by all three snippets; run each generation as its own script.
const fs = require('fs');

// Generation 1: Callbacks
console.log('1');
fs.readFile('data.txt', (err, data) => {
  if (err) throw err;
  console.log(data.toString()); // fires AFTER '2' prints
});
console.log('2');
// Output: 1 → 2 → (file contents)

// Generation 2: Promises
console.log('1');
fs.promises.readFile('data.txt').then(data => {
  console.log(data.toString()); // same async behaviour, nicer syntax
});
console.log('2');
// Output: 1 → 2 → (file contents)

// Generation 3: async/await (syntactic sugar over promises)
// Wrapped in an async function so that await is legal outside an ES module.
(async () => {
  console.log('1');
  const data = await fs.promises.readFile('data.txt'); // pauses THIS function, not the main thread
  console.log(data.toString());
  console.log('2');
})();
// Output: 1 → (file contents) → 2, ordering guaranteed inside this function

The callback and promise versions show the async reality: '2' prints before the file. The await version enforces ordering inside the current function, but the main thread is still not blocked. It can serve other requests while this function waits for its file.
async/await is not blocking. It is ordered asynchrony. You are writing constraints ("I need this before I do that") without freezing the whole thread.
Synchronicity Is a Client Property
Here is a useful framing: in a distributed system, whether an interaction is synchronous or asynchronous is almost always a client property, not a server property.
The question is: does the client block and wait, or does it move on?
A meeting is synchronous. You ask a question. You cannot do anything else until the other person answers. Skipping ahead would be weird.
Email is asynchronous. You send a message and go do other things. The reply comes when it comes.
Most modern HTTP client libraries are asynchronous. fetch(), Axios, and similar tools send a request and return a promise. Your code continues. The callback fires when the response arrives. The client is not blocked.
But at the system level, there is a subtler distinction worth separating out carefully.
Async Backend Processing
Imagine a client sends a request to compress a 2GB video file. The server starts compressing. The client has issued the request using fetch() — so at the language level, it is asynchronous. It moved on. It is doing other things.
But zoom out to the system level. The server's response is still tied to completing that compression. The client is waiting for a response that depends on finishing a long task. The client is asynchronous, but the system is synchronous — because the backend still believes someone is waiting for it to finish.
That distinction matters. To make the system truly asynchronous, you need to decouple accept from execute.
The clean solution: queues.
When the request arrives, the server does not start compressing. Instead, it puts the job into a queue and responds immediately:
{ "jobId": "job-abc-123", "status": "queued" }
The client gets this response in milliseconds. It can disconnect, save the job ID, and come back later.
A separate worker process drains the queue and does the actual compression. When done, the result is stored. When the client asks "is job-abc-123 done?", it gets the answer.
This is asynchronous backend processing. The initial request and the result retrieval are both request-response — you just split them across time.
This pattern is the foundation of every job queue system you have ever used: Sidekiq, Celery, Bull, AWS SQS. The pattern is always: accept fast, process slow, poll or push when done.
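The accept, process, poll split can be sketched in a few lines. This is an in-memory toy, not how Sidekiq, Celery, Bull, or SQS are implemented: `submitJob`, `runWorker`, and `getStatus` are hypothetical names, and a real system would keep the queue and results in durable storage and run the worker in a separate process.

```javascript
const { randomUUID } = require('node:crypto');

const jobs = new Map(); // jobId -> { status, payload, result }
const queue = [];       // pending job IDs; in-memory stand-in for a real queue

// Accept fast: record the job and answer immediately.
function submitJob(payload) {
  const jobId = randomUUID();
  jobs.set(jobId, { status: 'queued', payload, result: null });
  queue.push(jobId);
  return { jobId, status: 'queued' };
}

// Process slow: a worker drains the queue one job at a time.
async function runWorker(processFn) {
  while (queue.length > 0) {
    const jobId = queue.shift();
    const job = jobs.get(jobId);
    job.status = 'running';
    job.result = await processFn(job.payload);
    job.status = 'done';
  }
}

// Poll: the client asks about a job by ID and gets null for unknown IDs.
function getStatus(jobId) {
  const job = jobs.get(jobId);
  return job ? { jobId, status: job.status, result: job.result } : null;
}
```

A request handler would call `submitJob` and return its result straight away; a status endpoint would call `getStatus` with the ID the client saved.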
Async Commits in Postgres
Synchronous and asynchronous execution shows up inside databases too — and understanding it requires knowing how Postgres stores data.
Postgres has two data structures on disk. Pages are where the actual table rows live — all the columns, all the values, the full data. The WAL (Write-Ahead Log) is a compact, append-only journal of every change made. Think of pages as the current state and the WAL as the history of edits.
When you run a transaction, your changes accumulate in memory: pages updated in the buffer pool, WAL entries built up. The key insight: a commit only needs to flush the WAL to disk, not the pages.
Why? Because if the database crashes and restarts, it can reconstruct the correct state: load the pages from disk (which may be slightly stale), then replay all the WAL entries since the last checkpoint. The WAL is the source of truth for recovery.
So when you call COMMIT, Postgres flushes the WAL to disk, forcing it out of the OS cache and onto physical storage, and then returns success. That is a synchronous commit: the client blocks until the WAL is durably on disk.
Note: even a single INSERT without an explicit BEGIN has this behaviour. Postgres automatically wraps every statement in a transaction (auto-commit). There is no such thing as a write that bypasses transactions in Postgres.
With asynchronous commits enabled (the synchronous_commit = off setting), Postgres returns success before the WAL hits disk.
The WAL write is queued in memory. The client gets unblocked immediately. The disk write follows shortly after in the background.
The trade-off: if the server crashes in that window between "client got success" and "WAL hit disk," that committed transaction is gone. It was never persisted. For high-throughput insert workloads where losing a handful of recent writes is survivable, this is a deliberate performance trade-off. For financial transactions, it is not.
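The difference comes down to when the acknowledgement is sent relative to the flush. The sketch below is a toy log file, not Postgres's WAL: `walPath`, `commitSync`, and `commitAsync` are hypothetical names, and the toy writes through the OS cache where Postgres buffers WAL in its own memory. It only illustrates the ordering.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical toy log file; the real WAL format is far richer.
const walPath = path.join(os.tmpdir(), `toy-wal-${process.pid}.log`);
const walFd = fs.openSync(walPath, 'a');

// Synchronous commit: acknowledge only after the record has been forced to disk.
function commitSync(record) {
  fs.writeSync(walFd, record + '\n');
  fs.fsyncSync(walFd); // block until the OS reports the data is on stable storage
  return 'committed';
}

// Asynchronous commit: acknowledge first, let the flush finish in the background.
function commitAsync(record) {
  fs.writeSync(walFd, record + '\n'); // lands in memory (the OS cache) for now
  fs.fsync(walFd, (err) => {
    if (err) console.error('background flush failed:', err.message);
  });
  return 'committed'; // a crash before the flush completes loses this record
}

console.log(commitSync('tx-1'), commitAsync('tx-2'));
```

Both calls return the same answer to the client. Only the synchronous one has earned it by the time it returns.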
Async Replication
A similar trade-off exists at the replication layer.
With synchronous replication, a commit does not succeed until the primary has confirmed that at least one replica received the change. The client waits. If a replica is slow or unreachable, every write stalls.
With asynchronous replication, the primary commits and returns immediately. The replica receives the change eventually. If the primary crashes before the replica catches up, that replica is behind: reads from it may be stale, and promoting it loses the writes it never received.
The choice between these is a consistency vs availability decision. Synchronous replication protects acknowledged writes from being lost along with the primary, but adds latency to every commit. Asynchronous replication is faster but allows lag.
This trade-off is closely related to the CAP theorem, and it shows up in every distributed database: Postgres streaming replication, MySQL binlog replication, Redis replication, and more.
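The two modes differ only in whether the primary waits for the replica before answering. Here is a minimal in-memory sketch, assuming a single replica with a fixed apply delay; `makeReplica`, `makePrimary`, and the method names are all hypothetical and do not correspond to any database's API.

```javascript
const { setTimeout: sleep } = require('node:timers/promises');

// Hypothetical in-memory replica that takes `lagMs` to receive and apply a change.
function makeReplica(lagMs) {
  const log = [];
  return {
    log,
    async apply(entry) {
      await sleep(lagMs);
      log.push(entry);
    },
  };
}

function makePrimary(replica) {
  const log = [];
  return {
    log,
    // Synchronous replication: acknowledge only after the replica has the entry.
    async commitSync(entry) {
      log.push(entry);
      await replica.apply(entry);
      return 'ok';
    },
    // Asynchronous replication: acknowledge immediately, ship in the background.
    async commitAsync(entry) {
      log.push(entry);
      replica.apply(entry).catch(() => {}); // fire and forget
      return 'ok';
    },
  };
}
```

Right after `commitAsync` resolves, the primary's log contains the entry and the replica's log does not yet. That gap is the replication lag, and it is exactly what is lost if the primary dies at that moment.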
The OS Write Cache
One last layer where async appears: the OS file system cache.
When you write a file in Linux, the data does not go to disk immediately. It goes to an in-memory page cache. The OS batches writes and flushes them to disk periodically. This is asynchronous.
This is good for general-purpose workloads — and the reason goes deeper than just "performance."
SSDs have a physical constraint: every block (erasure unit) has a finite number of write cycles before it degrades. Flash cannot be overwritten in place. The drive has to erase an entire block before it can write fresh data into it. In a simplified picture, if you write one byte, then another byte, then another byte to the same block in rapid succession, each of those writes can cost its own erase-and-write cycle. The block wears out faster.
The OS page cache helps protect SSD lifespan. By batching many small writes into fewer, larger flushes, the OS reduces the number of times each erasure unit is hit. Fewer cycles. Longer disk life. Less fragmentation.
Database engineers hate it.
Databases want to control exactly when data hits disk. They cannot afford to let the OS decide. An OS crash or power failure with data still in the page cache means lost commits.
The fsync() system call forces a flush. It tells the OS: write this file's cached data to physical disk right now, and do not return until it is there. Postgres calls fsync() during WAL flushes. This is why Postgres write performance is lower than raw file write performance: it overrides the batching the OS is trying to do.
This is a deliberate correctness-over-performance choice, and it is one of the reasons database engineers and OS engineers have historically disagreed.
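Both sides of that disagreement fit in a small toy model. The class below is not how the Linux page cache works internally; `ToyPageCache` and `flushThreshold` are hypothetical, and "disk" is just an array. It shows writes returning before anything is durable, small writes being merged into one physical write, and an explicit `fsync()` forcing the issue.

```javascript
// Toy model of a write-back cache. All names here are hypothetical.
class ToyPageCache {
  constructor(flushThreshold) {
    this.flushThreshold = flushThreshold; // bytes to accumulate before flushing
    this.dirty = [];                      // writes held in memory only
    this.dirtyBytes = 0;
    this.disk = [];                       // what would survive a power failure
    this.flushCount = 0;                  // number of physical writes issued
  }

  // write(): returns immediately; the data is only in memory so far.
  write(data) {
    this.dirty.push(data);
    this.dirtyBytes += data.length;
    if (this.dirtyBytes >= this.flushThreshold) this.flush();
  }

  // fsync(): force everything dirty onto "disk" right now.
  fsync() {
    if (this.dirty.length > 0) this.flush();
  }

  // One physical write covering every buffered small write.
  flush() {
    this.disk.push(this.dirty.join(''));
    this.dirty = [];
    this.dirtyBytes = 0;
    this.flushCount += 1;
  }
}

const cache = new ToyPageCache(4096);
cache.write('a');
cache.write('b');
cache.write('c');
console.log(cache.flushCount, cache.disk); // nothing durable until fsync() or the threshold
cache.fsync();
console.log(cache.flushCount, cache.disk);
```

Three small writes cost one physical write, which is the batching the OS wants. Until `fsync()` runs, a crash would lose all three, which is the risk the database refuses to take.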
The Essentials
- Synchronous means the caller blocks — it cannot do other work while waiting. The OS removes a blocked process from the CPU. Context switches happen in microseconds but accumulate.
- Async IO has two models: readiness (epoll) and completion (io_uring/IOCP). Node.js uses epoll for network IO and a worker thread pool for file IO. The main thread never blocks in either case.
- async/await is ordered asynchrony, not blocking. The callback → promise → async/await evolution all produce the same async behaviour; only the ordering guarantees and readability differ.
- A client can be async while the system is still synchronous. Using fetch() does not decouple the system: the server still expects to respond to that request. Queues and job IDs are what break the synchronous coupling at the system level.
- Postgres WAL is the source of truth for recovery. Commit flushes the WAL to disk, not the pages. Async commits skip waiting for that flush: faster, but exposes a crash window.
- The OS page cache protects SSD lifespan by batching small writes. Databases bypass it with fsync() to guarantee durability, trading performance for correctness.
The async backend processing pattern (queues, job IDs) connects directly back to the request-response pattern — you are still making two request-response calls, just decoupled in time. The async replication trade-off discussed here is also central to high-level design when reasoning about distributed database consistency.
Further Reading and Watching
- Video: The Node.js Event Loop from the Inside Out — a conference talk that traces exactly how libuv and the event loop interact with epoll and the thread pool
- Article: PostgreSQL: Asynchronous Commit — the official Postgres docs explaining when async commits are appropriate and what you actually risk