Write-Ahead Log (WAL)

👋 Hi! I’m Bibin Wilson. In each edition, I share practical tips, guides, and the latest trends in DevOps and MLOps to make your day-to-day DevOps tasks more efficient.

As DevOps engineers, we work with databases and monitoring tools in our day-to-day jobs.

But have you ever thought about this: What if the server crashes before data is fully saved?

That is where the concept of a Write-Ahead Log (WAL) comes into play.

Write-Ahead Log (WAL)

A Write-Ahead Log (WAL) is a technique used to make systems more reliable when saving data.

It is a simple but powerful idea.

  • Every operation (e.g., inserting a row into a database) is first written to a special log file.

  • Once it is safe in the log, the system updates the actual database or storage.

  • If something fails, the system can use the log to recover the data.

In short, WAL ensures durability by logging operations before applying them to main storage.

If the system fails, it can replay the log (read back the saved operations and apply them again) to restore the data to a correct state.
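To make the idea concrete, here is a minimal Python sketch of the log-first, apply-second flow and of replay after a crash. Everything in it is an assumption for illustration: the `TinyWalStore` class, the JSON-lines log format, and the temporary file path are hypothetical and not taken from any real database.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Hypothetical key-value store that logs every write before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}      # stands in for the "actual database or storage"
        self._replay()      # rebuild state from the log, if one exists
        self._wal = open(wal_path, "a", encoding="utf-8")

    def _replay(self):
        # Read the saved operations in order and apply them again.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Step 1: append the operation to the log and force it to disk.
        self._wal.write(json.dumps({"key": key, "value": value}) + "\n")
        self._wal.flush()
        os.fsync(self._wal.fileno())
        # Step 2: only now update the main storage.
        self.data[key] = value

    def close(self):
        self._wal.close()


wal_file = os.path.join(tempfile.mkdtemp(), "wal.log")

store = TinyWalStore(wal_file)
store.put("user:1", "alice")
store.put("user:2", "bob")
store.close()  # pretend the process crashed here and store.data was lost

recovered = TinyWalStore(wal_file)  # a fresh start replays the log
print(recovered.data)  # {'user:1': 'alice', 'user:2': 'bob'}
recovered.close()
```

Real systems add details this sketch leaves out, such as checksums, log truncation after checkpoints, and batching of disk syncs.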

🧱 Use Cases

Now let's look at some real-world use cases where WAL is used.

Prometheus

The use case you can probably relate to most is Prometheus.

When Prometheus scrapes metrics, it does two things at the same time.

  • Puts the new samples in memory (fast access).

  • Writes them to the Write-Ahead Log (WAL) on disk.

If Prometheus crashes, the data in memory is lost. But since the WAL holds the same data on disk, Prometheus can replay it when the server restarts and rebuild its in-memory state.

Database Recovery

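Another key use case for WAL is database crash recovery.

Databases like PostgreSQL, SQLite, Oracle, and SQL Server write changes to a log before applying them to the data files. The name varies: PostgreSQL calls it the WAL, Oracle calls it the redo log, SQL Server calls it the transaction log, and in SQLite WAL is an optional journal mode.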

When a database crashes or needs to be recovered, the admin can first restore the last full backup. For example, if the last backup was taken yesterday at midnight, the database is restored to that point.

But what about all the changes made after the backup? As long as the WAL files written since the backup have been kept, they hold a record of every change made after it.

So the admin can replay the WAL files. This means applying the logged changes in order, step by step, until the database reaches the exact point in time you want to recover to.
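The restore-then-replay flow can be sketched in a few lines of Python. This is a simplified illustration, not how any real database stores its log: the `WalRecord` type, the integer timestamps, and the `recover_to` helper are all hypothetical, and real WAL records describe low-level changes rather than key/value pairs.

```python
from dataclasses import dataclass


@dataclass
class WalRecord:
    timestamp: int   # hypothetical: seconds since the backup was taken
    key: str
    value: object    # None means the key was deleted


def recover_to(backup, wal_records, target_time):
    """Restore a backup, then replay WAL records up to target_time."""
    state = dict(backup)            # start from the restored full backup
    for record in wal_records:      # records are assumed to be in log order
        if record.timestamp > target_time:
            break                   # stop at the chosen recovery point
        if record.value is None:
            state.pop(record.key, None)
        else:
            state[record.key] = record.value
    return state


backup = {"balance": 100}
wal = [
    WalRecord(10, "balance", 120),
    WalRecord(20, "owner", "alice"),
    WalRecord(30, "balance", 0),    # say this was an unwanted change
]

print(recover_to(backup, wal, target_time=25))
# {'balance': 120, 'owner': 'alice'}
```

Choosing `target_time=25` replays the first two records and stops before the third, which is the same idea as recovering a database to the moment just before a bad change.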

Kafka

Kafka is a WAL-like system at its core.

It depends entirely on an append-only log. Every message is first written to the log and then consumers read from there.
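Here is a rough Python sketch of that append-only model, where each consumer tracks its own position (offset) in the log. It does not use Kafka's client API; `AppendOnlyLog` and `Consumer` are hypothetical in-memory stand-ins meant only to show the shape of the idea.

```python
class AppendOnlyLog:
    """Hypothetical in-memory stand-in for a single append-only log."""

    def __init__(self):
        self._records = []

    def append(self, message):
        # Records are only ever added at the end, never changed in place.
        self._records.append(message)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset):
        return self._records[offset:]


class Consumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0  # position of the next record to read

    def poll(self):
        messages = self.log.read_from(self.offset)
        self.offset += len(messages)
        return messages


log = AppendOnlyLog()
log.append("order-created")
log.append("order-paid")

billing = Consumer(log)
print(billing.poll())   # ['order-created', 'order-paid']

log.append("order-shipped")
print(billing.poll())   # ['order-shipped']

audit = Consumer(log)   # a new consumer can replay the log from offset 0
print(audit.poll())     # ['order-created', 'order-paid', 'order-shipped']
```

Because reading does not remove anything from the log, a consumer that starts late or restarts can go back and read the same messages again.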
