Hardware breakpoints can trigger eBPF programs when specific memory addresses are accessed, leveraging CPU hardware support for low overhead. By utilizing these hardware breakpoints, we can efficiently monitor PostgreSQL’s internal variable updates, such as transaction ID generation and OID assignment. In this post, we will discuss what hardware breakpoints are, whether they have less overhead than uprobes, and how to answer questions like “How many transactions are being executed per second?” or “Which backend is consuming the most OIDs?” with bpftrace.
PostgreSQL uses a cost-based optimizer (CBO) to determine the best execution plan for a given query. The optimizer considers multiple alternative plans during the planning phase. Using the EXPLAIN command, a user can only inspect the chosen plan, but not the alternatives that were considered. To address this gap, I developed pg_plan_alternatives, a tool that uses eBPF to instrument the PostgreSQL optimizer and trace all alternative plans and their costs that were considered during the planning phase. This information helps the user understand the optimizer’s decision-making process and tune system parameters. This article explains how pg_plan_alternatives works, provides examples, and discusses the insights the tool can provide.
PostgreSQL uses a process-based architecture where each connection is handled by a separate process. Some data structures are shared between these processes, for example, the shared buffer cache or the write-ahead log (WAL). To coordinate access to these shared resources, PostgreSQL uses several locking mechanisms, including spinlocks. Spinlocks are intended for very short-term protection of shared structures: rather than immediately putting a waiting process to sleep, they busy-wait and repeatedly check whether the lock is free. Under contention, PostgreSQL also applies an adaptive backoff that can include brief sleeps. This article explains what spinlocks are and how they are implemented in PostgreSQL. It also describes how spinlocks can be monitored and demonstrates how my new pg_spinlock_tracer tool can be used to trace spinlock internals using eBPF.
A flame graph is a graphical representation that helps to quickly understand where a program spends most of its processing time. These graphs are based on sampled information collected by a profiler while the observed software is running. At regular intervals, the profiler captures and stores the current call stack. A flame graph is then generated from this data to provide a visual representation of the functions in which the software spends most of its processing time. This is useful for understanding the characteristics of a program and for improving its performance. This blog post explores the fundamentals of flame graphs and offers a few practical tips on utilizing them to identify and debug performance bottlenecks in PostgreSQL.
SQL is a declarative language; only the result of the query is specified. The exact steps to produce the result must be determined by the DBMS. Often, multiple ways exist to calculate a query result. For example, the DBMS can choose to use an index or perform a sequential scan on a table to find the needed tuples. The query optimizer is responsible for finding the most efficient plan for a given query. The plan generator creates possible plans, which are then evaluated based on their costs. Afterward, the cheapest plan is chosen and executed. When the DBMS expects to return a large portion of the table, a full table scan can be more efficient than following the pointers in an index structure. However, it is hard to determine when the DBMS prefers one plan over another and when the switch between plans occurs. In a few evenings, I implemented the plan explorer for PostgreSQL. It iterates over a search space and generates visualizations that show when the plan changes and how many tuples are expected versus the actual number returned. This blog post examines the “art” of query optimization. It discusses the plan explorer tool, the images the tool generates, and the insights the tool provides into the decisions made by the PostgreSQL query optimizer.