Jan's blog - big data and databases

Analyzing PostgreSQL Performance Using Flame Graphs

Jan Nidzwetzki

July 05, 2025

A flame graph is a visualization that shows at a glance where a program spends most of its processing time. It is built from samples collected by a profiler while the observed software is running: at regular intervals, the profiler captures and stores the current call stack. The stored stacks are then aggregated into the flame graph, which makes the functions that consume the most time easy to spot. This is useful for understanding the characteristics of a program and for improving its performance. This blog post explores the fundamentals of flame graphs and offers a few practical tips on using them to identify and debug performance bottlenecks in PostgreSQL.
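As a rough illustration of the aggregation step, the sketch below counts identical sampled call stacks and emits them as semicolon-separated "folded" lines, the kind of input that flame graph generators typically consume. The function names in the samples and the helper `fold_stacks` are hypothetical and not taken from the post; the width of a frame in the final graph is proportional to how often it appears in the samples.

```python
from collections import Counter


def fold_stacks(samples):
    """Count identical call stacks and emit 'root;...;leaf count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples (root frame first); the names are for illustration only.
samples = [
    ["main", "exec_query", "seq_scan"],
    ["main", "exec_query", "seq_scan"],
    ["main", "exec_query", "sort"],
    ["main", "parse"],
]

folded = fold_stacks(samples)
for line in folded:
    print(line)
```

A stack that shows up in two of four samples, as `seq_scan` does here, would occupy half the width of the resulting graph.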

The Art of SQL Query Optimization

Jan Nidzwetzki

June 03, 2025

SQL is a declarative language: a query specifies only the desired result, and the DBMS must determine the exact steps to produce it. Often, multiple ways exist to calculate a query result. For example, the DBMS can choose to use an index or perform a sequential scan on a table to find the needed tuples. The query optimizer is responsible for finding the most efficient plan for a given query. The plan generator creates possible plans, which are then evaluated based on their costs. Afterward, the cheapest plan is chosen and executed. When the DBMS expects to return a large portion of the table, a full table scan can be more efficient than following the pointers in an index structure. However, it is hard to determine when the DBMS prefers one plan over another and when the switch between plans occurs. In a few evenings, I implemented a plan explorer for PostgreSQL. It iterates over a search space and generates visualizations that show when the plan changes and how many tuples are expected versus the actual number returned. This blog post examines the “art” of query optimization. It discusses the plan explorer tool, the images the tool generates, and the insights the tool provides into the decisions made by the PostgreSQL query optimizer.
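The switch between plans can be illustrated with a deliberately simplified toy cost model. This is not PostgreSQL's actual cost model: the formulas below are assumptions made for illustration, and only the two constants (1.0 for a sequential page read, 4.0 for a random page read) mirror the defaults of PostgreSQL's `seq_page_cost` and `random_page_cost` settings.

```python
def seq_scan_cost(table_pages, seq_page_cost=1.0):
    # Toy assumption: a sequential scan reads every page once.
    return table_pages * seq_page_cost


def index_scan_cost(table_rows, selectivity, random_page_cost=4.0):
    # Toy assumption: one random page fetch per matching row.
    return table_rows * selectivity * random_page_cost


def cheapest_plan(table_rows, table_pages, selectivity):
    """Return the name of the cheaper plan under the toy model."""
    seq = seq_scan_cost(table_pages)
    idx = index_scan_cost(table_rows, selectivity)
    return "index scan" if idx < seq else "sequential scan"


# Hypothetical table: 100,000 rows stored on 1,000 pages.
for selectivity in (0.001, 0.01, 0.5):
    print(selectivity, cheapest_plan(100_000, 1_000, selectivity))
```

Under these assumptions the index scan wins only for very selective predicates; once a larger fraction of the table is expected to match, the sequential scan becomes cheaper.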

Building a Query Plan Explorer using GitHub Copilot

Jan Nidzwetzki

May 18, 2025

Large language models (LLMs) that generate code are common nowadays. For a couple of weeks now, VS Code has had an agent mode that performs multi-step coding tasks. I was actively involved in web development roughly 20–25 years ago, when CGI, Perl, and early versions of PHP were popular. I have no idea how modern web development actually works. I always had some projects in mind that I wanted to create, but I never had the time to dig into one of the modern JavaScript frameworks like React. GitHub Copilot now seems like a way to create entire (web) applications just by describing the requirements (i.e., vibe coding). This post describes my experience building a PostgreSQL query plan explorer using React and VS Code in two evenings, without writing a single line of code myself.

Introduction to Snapshots and Tuple Visibility in PostgreSQL

Jan Nidzwetzki

April 03, 2024

Like many relational DBMSs, PostgreSQL uses multi-version concurrency control (MVCC) to support parallel transactions and coordinate concurrent access to tuples. Snapshots are used to determine which version of a tuple is visible in a given transaction. Each transaction that modifies data has a transaction ID (txid). Tuples are stored with two attributes (xmin, xmax) that determine in which snapshots (and transactions) they are visible. This blog post discusses some implementation details of snapshots.
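The idea can be sketched with a heavily simplified visibility check. This is an illustrative model, not PostgreSQL's implementation: the `Snapshot` class, the `committed` set, and the helper names are hypothetical, and the sketch ignores the current transaction's own changes, subtransactions, hint bits, and transaction ID wraparound. A tuple `xmax` of 0 stands for "not deleted".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int               # every txid below this had finished at snapshot time
    xmax: int               # every txid at or above this had not started yet
    in_progress: frozenset  # txids in [xmin, xmax) that were still running

    def sees(self, txid, committed):
        """True if the effects of txid are visible in this snapshot."""
        if txid >= self.xmax or txid in self.in_progress:
            return False
        return txid in committed


def tuple_visible(t_xmin, t_xmax, snap, committed):
    """A tuple is visible if its insert is visible and its delete is not."""
    if not snap.sees(t_xmin, committed):
        return False
    return t_xmax == 0 or not snap.sees(t_xmax, committed)


# Hypothetical state: transaction 102 was still running when the snapshot was taken.
snap = Snapshot(xmin=100, xmax=105, in_progress=frozenset({102}))
committed = {100, 101, 103}

print(tuple_visible(101, 0, snap, committed))    # inserted by a committed transaction
print(tuple_visible(101, 102, snap, committed))  # deleted by a still-running transaction
print(tuple_visible(101, 103, snap, committed))  # deleted by a committed transaction
```

In this model, a tuple deleted by a transaction that was still running at snapshot time remains visible, which is what allows readers to proceed without blocking on concurrent writers.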

Trace PostgreSQL Row-Level Locks with pg_row_lock_tracer

Jan Nidzwetzki

February 28, 2024

PostgreSQL uses several types of locks to coordinate parallel transactions and manage access to resources like tuples, tables, and in-memory data structures. Heavyweight locks are used to control access to tables. Lightweight locks (LWLocks) manage access to data structures, such as adding data to the write-ahead log (WAL). Row-level locks control access to individual tuples. For example, tuples need to be locked when executing an SQL statement like SELECT * FROM table WHERE i > 10 FOR UPDATE;. The tuples returned by the query are internally locked with an exclusive lock (LOCK_TUPLE_EXCLUSIVE). Another transaction attempting to lock the same tuples must wait until the first transaction releases the locks. In this article, we discuss the tool pg_row_lock_tracer, which uses eBPF and UProbes to trace PostgreSQL’s row-locking behavior. The tool can be downloaded from the pg-lock-tracer project website. This is the third article in a series about tracing PostgreSQL locks. The first article covers the tracing of heavyweight locks, and the second article focuses on LWLocks.
