pg_stat_insights blog header

pg_stat_insights: PostgreSQL Performance Monitoring Extension

Executive Summary

PostgreSQL databases need comprehensive performance monitoring to identify slow queries, optimize cache usage, and track resource consumption. pg_stat_insights provides 52 comprehensive metrics across 11 pre-built views, making it the most complete pg_stat_statements replacement available. Compatible with PostgreSQL 16, 17, and 18.

Introduction: The Performance Monitoring Challenge

Database administrators need deep visibility into query performance to maintain optimal database health. Standard PostgreSQL statistics provide basic metrics, but lack the granularity needed for effective optimization and troubleshooting.

pg_stat_insights extends PostgreSQL's native statistics with enhanced analytics, response time categorization, WAL tracking, JIT statistics, and time-series aggregation—all in a single extension.

What Makes pg_stat_insights Different?

52 Comprehensive Metrics

pg_stat_insights tracks every aspect of query execution:

Execution Metrics (10 columns)

  • plans, calls, rows
  • total_plan_time, min/max/mean/stddev_plan_time
  • total_exec_time, min/max/mean/stddev_exec_time

Buffer I/O Metrics (14 columns)

  • shared_blks_hit, shared_blks_read, shared_blks_dirtied, shared_blks_written
  • local_blks_hit, local_blks_read, local_blks_dirtied, local_blks_written
  • temp_blks_read, temp_blks_written
  • shared/local/temp_blk_read_time, shared/local/temp_blk_write_time

WAL Statistics (4 columns)

  • wal_records, wal_fpi, wal_bytes, wal_buffers_full

JIT Compilation (10 columns)

  • jit_functions, jit_generation_time
  • jit_inlining_count, jit_inlining_time
  • jit_optimization_count, jit_optimization_time
  • jit_emission_count, jit_emission_time
  • jit_deform_count, jit_deform_time

Parallel Execution (2 columns)

  • parallel_workers_to_launch, parallel_workers_launched

Metadata (5 columns)

  • userid, dbid, queryid, toplevel, query

Timestamps (2 columns)

  • stats_since, minmax_stats_since

11 Pre-Built Views for Instant Insights

  1. pg_stat_insights - Main view with all 52 metrics
  2. pg_stat_insights_top_by_time - Slowest queries by total_exec_time
  3. pg_stat_insights_top_by_calls - Most frequently executed queries
  4. pg_stat_insights_top_by_io - Highest I/O consumers (shared_blks_read + temp_blks_read)
  5. pg_stat_insights_top_cache_misses - Poor cache performers with hit ratio calculation
  6. pg_stat_insights_slow_queries - Queries with mean_exec_time > 100ms
  7. pg_stat_insights_errors - Queries with execution errors
  8. pg_stat_insights_plan_errors - Plan estimation issues (estimated vs actual rows)
  9. pg_stat_insights_histogram_summary - Response time distribution (6 buckets)
  10. pg_stat_insights_by_bucket - Time-series aggregation for trend analysis
  11. pg_stat_insights_replication - WAL replication lag monitoring

Response Time Categorization

Unique to pg_stat_insights: automatic query categorization by execution time for SLA monitoring:

  • <1ms - Ultra-fast queries
  • 1-10ms - Fast queries
  • 10-100ms - Normal queries
  • 100ms-1s - Slow queries
  • 1-10s - Very slow queries
  • >10s - Critical slow queries

Installation and Configuration

Prerequisites

  • PostgreSQL 16, 17, or 18
  • PostgreSQL development headers
  • Standard build tools (make, gcc)

Installation Steps

# Clone the repository
git clone https://github.com/pgelephant/pg_stat_insights.git
cd pg_stat_insights
# Build and install
make clean && make
sudo make install

Configuration

Add to postgresql.conf:

# Load the extension (restart required)
shared_preload_libraries = 'pg_stat_insights'
# Optional: Configure parameters (11 available)
pg_stat_insights.max = 5000 # Max unique statements tracked
pg_stat_insights.track = all # Track all queries
pg_stat_insights.track_utility = on # Track utility commands
pg_stat_insights.save = on # Persist stats across restarts

Restart PostgreSQL and create the extension:

CREATE EXTENSION pg_stat_insights;

Complete Usage Guide

Finding Slow Queries

Top 10 slowest queries by total time:

SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows
FROM pg_stat_insights_top_by_time
LIMIT 10;
Output:
| query | calls | total_exec_time | mean_exec_time | rows
|------------------------------------+-------+-----------------+----------------+------
| SELECT * FROM orders WHERE status | 1247 | 12456.78 | 9.99 | 15234
| UPDATE inventory SET quantity = | 892 | 8934.12 | 10.02 | 892
| SELECT COUNT(*) FROM events WHERE | 2341 | 7823.45 | 3.34 | 89234
| INSERT INTO logs (timestamp, msg) | 15678 | 6712.34 | 0.43 | 15678
| DELETE FROM temp_data WHERE date | 234 | 3421.56 | 14.62 | 1234
|(10 rows)

Queries exceeding 100ms mean execution time:

SELECT
query,
calls,
mean_exec_time,
total_exec_time
FROM pg_stat_insights_slow_queries
ORDER BY mean_exec_time DESC
LIMIT 10;
Output:
| query | calls | mean_exec_time | total_exec_time
|------------------------------------+-------+----------------+----------------
| SELECT * FROM orders WHERE status | 1247 | 999.12 | 1245678.9
| UPDATE inventory SET quantity = | 892 | 567.23 | 505969.2
| SELECT COUNT(*) FROM events WHERE | 2341 | 234.56 | 549089.0
|(3 rows)

Cache Efficiency Analysis

Find queries with poor cache hit ratios:

SELECT
query,
calls,
shared_blks_hit,
shared_blks_read,
round((shared_blks_hit::numeric / NULLIF(shared_blks_hit + shared_blks_read, 0) * 100), 2) as cache_hit_ratio
FROM pg_stat_insights_top_cache_misses
LIMIT 10;
Output:
| query | calls | shared_blks_hit | shared_blks_read | cache_hit_ratio
|------------------------------------+-------+-----------------+------------------+----------------
| SELECT * FROM large_table WHERE id | 15234 | 892341 | 1081245 | 45.23%
| SELECT * FROM archived_orders | 8923 | 234567 | 215678 | 52.17%
| SELECT data FROM cold_storage | 3421 | 89234 | 140123 | 38.91%
|(3 rows)

Recommendations: Queries with <80% cache hit ratio need attention. Consider:

  • Increasing shared_buffers
  • Adding indexes
  • Partitioning large tables
  • Pre-warming cache for frequently accessed data

WAL Generation Tracking

Find write-heavy queries generating most WAL:

SELECT
query,
calls,
wal_records,
pg_size_pretty(wal_bytes::bigint) as wal_size,
wal_fpi as full_page_images
FROM pg_stat_insights
WHERE wal_bytes > 0
ORDER BY wal_bytes DESC
LIMIT 10;
Output:
| query | calls | wal_records | wal_size | full_page_images
|-------------------------------+-------+-------------+----------+-----------------
| CREATE INDEX CONCURRENTLY ON | 1 | 234567 | 1178 MB | 12345
| INSERT INTO events VALUES ... | 98765 | 189234 | 900 MB | 8923
| UPDATE orders SET status = | 45678 | 145678 | 692 MB | 5678
| DELETE FROM temp_data WHERE | 12345 | 98765 | 470 MB | 3421
| VACUUM ANALYZE large_table | 12 | 67890 | 323 MB | 2345
|(5 rows)

I/O Intensive Operations

Identify disk-heavy queries:

SELECT
query,
calls,
shared_blks_read,
temp_blks_read,
shared_blks_read + temp_blks_read as total_io
FROM pg_stat_insights_top_by_io
LIMIT 10;
Output:
| query | calls | shared_blks_read | temp_blks_read | total_io
|-------------------------------+-------+------------------+----------------+---------
| CREATE INDEX CONCURRENTLY ON | 1 | 1234567 | 987654 | 2222221
| SELECT * FROM events ORDER BY | 1234 | 892345 | 123456 | 1015801
| VACUUM ANALYZE large_table | 12 | 567890 | 0 | 567890
| SELECT DISTINCT user_id FROM | 5678 | 345678 | 98765 | 444443
| SELECT COUNT(*) GROUP BY date | 3421 | 234567 | 12345 | 246912
|(5 rows)

Response Time Distribution

View query distribution across time buckets:

SELECT * FROM pg_stat_insights_histogram_summary;
Output:
| bucket_name | query_count | total_time | avg_time | percentage
|-------------+-------------+------------+----------+-----------
| <1ms | 45234 | 12.34 ms | 0.27 ms | 62.3%
| 1-10ms | 18923 | 123.45 ms | 6.52 ms | 26.1%
| 10-100ms | 5678 | 345.67 ms | 60.89 ms | 7.8%
| 100ms-1s | 1892 | 678.90 ms |358.59 ms | 2.6%
| 1-10s | 567 | 2345.67 ms |4138.18 ms| 0.8%
| >10s | 123 |12345.00 ms|100365.8ms| 0.4%
|(6 rows)

JIT Compilation Statistics

Monitor JIT performance:

SELECT
query,
jit_functions,
round(jit_generation_time::numeric, 2) as gen_time_ms,
round(jit_inlining_time::numeric, 2) as inline_time_ms,
round(jit_optimization_time::numeric, 2) as opt_time_ms,
round(jit_emission_time::numeric, 2) as emit_time_ms
FROM pg_stat_insights
WHERE jit_functions > 0
ORDER BY jit_generation_time DESC
LIMIT 10;
Output:
| query | jit_functions | gen_time_ms | inline_time_ms | opt_time_ms | emit_time_ms
|-------------------------------+---------------+-------------+----------------+-------------+-------------
| SELECT * FROM large_join | 156 | 45.23 | 12.34 | 23.45 | 34.56
| SELECT COUNT(*) GROUP BY | 89 | 34.56 | 9.87 | 18.23 | 23.45
| SELECT DISTINCT id FROM | 67 | 28.91 | 7.65 | 14.32 | 19.87
|(3 rows)

Replication Monitoring

Check replication lag across standbys:

SELECT * FROM pg_stat_insights_replication;
Output:
| pid | usename | application_name | client_addr | repl_state | write_lag_bytes | flush_lag_bytes | replay_lag_bytes | write_lag_seconds
|-------+----------+------------------+-------------+------------+-----------------+-----------------+------------------+------------------
| 12345 | postgres | node2 | 10.0.1.12 | streaming | 123456 | 98765 | 87654 | 0.12
| 12346 | postgres | node3 | 10.0.1.13 | streaming | 98765 | 87654 | 76543 | 0.09
|(2 rows)

Comparison with Alternatives

vs pg_stat_statements

Featurepg_stat_statementspg_stat_insightsAdvantage
Metric Columns4452+8 additional metrics
Pre-built Views211+9 ready-to-use views
Configuration Options511More tuning flexibility
Response Time Categories✗ No6 bucketsSLA monitoring ready
Time-Series Tracking✗ NoBucket-basedTrend analysis
Cache AnalysisBasicEnhanced with ratiosBetter optimization
TAP Test CoverageStandard150 tests, 100%Higher quality
DocumentationBasic30+ pagesComprehensive guides
Prometheus IntegrationManualPre-built queriesEasier setup

vs pg_stat_monitor

Featurepg_stat_monitorpg_stat_insightsAdvantage
Metric Columns5852Optimized, focused metrics
Pre-built Views511More specialized views
Response Time Histograms✗ NoYesBetter performance tracking
Time-SeriesBasicAdvanced bucketsBetter trend analysis
TAP TestsLimited150 testsProduction quality
Grafana DashboardsManualPre-built (8 panels)Ready to use

Installation Guide

Step 1: Build and Install

cd pg_stat_insights
make clean
make
sudo make install

Step 2: Configure PostgreSQL

Add to postgresql.conf:

# Enable pg_stat_insights (restart required)
shared_preload_libraries = 'pg_stat_insights'
# Configuration parameters (optional)
pg_stat_insights.max = 5000 # Max statements tracked
pg_stat_insights.track = all # Track all queries
pg_stat_insights.track_utility = on # Include DDL/utility
pg_stat_insights.save = on # Persist across restarts

Step 3: Restart and Create Extension

# Restart PostgreSQL
sudo systemctl restart postgresql
# Create extension in your database
psql -d your_database -c "CREATE EXTENSION pg_stat_insights;"

Step 4: Verify Installation

SELECT count(*) as total_metrics FROM information_schema.columns
WHERE table_name = 'pg_stat_insights';
Output:
| total_metrics
|--------------
| 52
|(1 row)
SELECT count(*) as total_views FROM information_schema.views
WHERE table_name LIKE 'pg_stat_insights%';
Output:
| total_views
|------------
| 11
|(1 row)

Prometheus & Grafana Integration

Pre-Built Prometheus Queries

pg_stat_insights includes 5 production-ready Prometheus queries:

1. Query Rate (QPS)

rate(pg_stat_insights_calls_total[5m])

2. Cache Hit Ratio

100 * (
rate(pg_stat_insights_shared_blks_hit_total[5m]) /
(rate(pg_stat_insights_shared_blks_hit_total[5m]) +
rate(pg_stat_insights_shared_blks_read_total[5m]))
)

3. P95 Query Latency

histogram_quantile(0.95, pg_stat_insights_exec_time_bucket)

4. WAL Generation Rate

rate(pg_stat_insights_wal_bytes_total[5m])

5. Slow Query Count

count(pg_stat_insights_mean_exec_time > 0.1)

Grafana Dashboards

Pre-built dashboard with 8 panels:

  1. Query Performance Overview - QPS, latency, errors
  2. Cache Hit Ratio - Buffer cache efficiency over time
  3. Top Slow Queries - Current slowest queries
  4. Response Time Distribution - Histogram visualization
  5. WAL Generation - Write activity tracking
  6. Buffer I/O - Shared/local/temp block activity
  7. JIT Statistics - Compilation metrics
  8. Replication Lag - Standby lag monitoring

Alert Rules

11 pre-configured alert rules:

# High slow query rate
- alert: HighSlowQueryRate
expr: count(pg_stat_insights_mean_exec_time > 0.1) > 10
for: 5m
labels:
severity: warning
# Low cache hit ratio
- alert: LowCacheHitRatio
expr: pg_stat_insights_cache_hit_ratio < 80
for: 10m
labels:
severity: warning
# High WAL generation
- alert: HighWALGeneration
expr: rate(pg_stat_insights_wal_bytes_total[5m]) > 100000000
for: 5m
labels:
severity: info

Advanced Usage Scenarios

Identifying Missing Indexes

-- Find sequential scans on large tables
SELECT
query,
calls,
rows / NULLIF(calls, 0) as avg_rows_per_call,
shared_blks_read,
total_exec_time
FROM pg_stat_insights
WHERE query LIKE '%FROM large_table%'
AND shared_blks_read > 10000
ORDER BY total_exec_time DESC;

Optimizing Buffer Cache

-- Calculate optimal shared_buffers size
SELECT
pg_size_pretty(sum(shared_blks_read)::bigint * 8192) as total_data_read,
pg_size_pretty(sum(shared_blks_hit)::bigint * 8192) as total_cache_hits,
round(100.0 * sum(shared_blks_hit) / NULLIF(sum(shared_blks_hit + shared_blks_read), 0), 2) as overall_cache_hit_ratio
FROM pg_stat_insights;
Output:
| total_data_read | total_cache_hits | overall_cache_hit_ratio
|-----------------+------------------+------------------------
| 45 GB | 892 GB | 95.18%
|(1 row)

Monitoring Write Activity

-- Track WAL generation by query type
SELECT
substring(query from 1 for 30) as query_type,
count(*) as query_count,
sum(wal_records) as total_wal_records,
pg_size_pretty(sum(wal_bytes)::bigint) as total_wal_size,
sum(wal_fpi) as total_full_page_images
FROM pg_stat_insights
WHERE wal_bytes > 0
GROUP BY query_type
ORDER BY sum(wal_bytes) DESC
LIMIT 10;
Output:
| query_type | query_count | total_wal_records | total_wal_size | total_full_page_images
|--------------------------------+-------------+-------------------+----------------+-----------------------
| INSERT INTO events (timestamp | 15678 | 1892340 | 8.9 GB | 123456
| UPDATE orders SET processed = | 12345 | 1234567 | 5.8 GB | 98765
| DELETE FROM temp_data WHERE | 8923 | 892345 | 4.2 GB | 67890
|(3 rows)

Parallel Query Analysis

-- Check parallel query efficiency
SELECT
query,
calls,
parallel_workers_to_launch,
parallel_workers_launched,
round(100.0 * parallel_workers_launched / NULLIF(parallel_workers_to_launch, 0), 2) as worker_efficiency
FROM pg_stat_insights
WHERE parallel_workers_to_launch > 0
ORDER BY calls DESC
LIMIT 10;
Output:
| query | calls | workers_to_launch | workers_launched | worker_efficiency
|------------------------------------+-------+-------------------+------------------+------------------
| SELECT * FROM large_table PARALLEL | 5678 | 28390 | 27234 | 95.93%
| SELECT COUNT(*) FROM events | 3421 | 17105 | 16234 | 94.91%
| SELECT DISTINCT user_id FROM | 2345 | 11725 | 10892 | 92.90%
|(3 rows)

Performance Overhead

pg_stat_insights is designed for minimal impact:

MetricImpactNotes
CPU Overhead<1%Even under heavy load
Memory Usage~10MBPer 5000 tracked queries
Disk SpaceMinimalStats stored in shared memory
Query Latency<0.01msPer query tracking overhead

Tested at scale: Production deployments tracking 5,000+ unique queries with negligible performance impact.

Testing & Quality Assurance

TAP Test Suite

Comprehensive testing with 150 test cases:

  • 16 test files covering all functionality
  • 100% code coverage of all 52 metrics
  • PostgreSQL 16-18 compatibility verified
  • StatsInsightManager.pm custom testing framework
# Run complete test suite
./run_all_tests.sh
Output:
t/001_basic.pl ...................... ok
t/002_parameters.pl ................. ok
t/003_views.pl ...................... ok
t/004_metrics.pl .................... ok
t/005_reset.pl ...................... ok
t/006_cache_stats.pl ................ ok
t/007_wal_stats.pl .................. ok
t/008_jit_stats.pl .................. ok
t/009_parallel.pl ................... ok
t/010_restart.pl .................... ok
t/011_all_columns.pl ................ ok
t/012_block_stats.pl ................ ok
t/013_planning_stats.pl ............. ok
t/014_timestamps.pl ................. ok
t/015_user_db_tracking.pl ........... ok
t/016_execution_stats.pl ............ ok
All tests successful.
Files=16, Tests=150, Result: PASS

Production Deployment Best Practices

1. Set Appropriate Limits

# Track enough queries for your workload
pg_stat_insights.max = 5000 # Default: 5000
# Too low = miss important queries
# Too high = higher memory usage

2. Regular Statistics Reset

-- Reset all statistics (do during maintenance windows)
SELECT pg_stat_insights_reset();
-- Reset specific query
SELECT pg_stat_insights_reset(userid, dbid, queryid);

3. Monitor Statistics Age

-- Check how old your statistics are
SELECT
min(stats_since) as oldest_stat,
max(stats_since) as newest_stat,
count(*) as total_queries
FROM pg_stat_insights;

4. Export for Analysis

# Export slow queries to CSV for analysis
psql -d your_database -c "COPY (
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_insights_top_by_time
LIMIT 100
) TO STDOUT WITH CSV HEADER" > slow_queries.csv

Why pg_stat_insights Over Alternatives?

Comprehensive Metrics

Unlike pg_stat_statements (44 columns) or pg_stat_monitor (58 columns), pg_stat_insights provides 52 carefully selected metrics that cover all aspects of query performance without overwhelming users with unnecessary data.

Pre-Built Analytics

With 11 ready-to-use views, you get instant access to:

  • Slow query identification
  • Cache optimization opportunities
  • I/O bottleneck detection
  • WAL generation analysis
  • Response time distribution
  • Replication health monitoring

No need to write complex queries—the views do the work for you.

Production Quality

  • 150 TAP tests ensure reliability
  • 100% code coverage verified
  • PostgreSQL 16-18 compatibility tested
  • Zero compilation warnings - production-grade C code

Easy Integration

  • Drop-in replacement for pg_stat_statements
  • Pre-built Prometheus queries and Grafana dashboards
  • 11 production-ready alert rules
  • 30+ pages of documentation

Getting Started Today

Ready to gain deep insights into your PostgreSQL performance? Here's your quickstart:

# 1. Clone and install
git clone https://github.com/pgelephant/pg_stat_insights.git
cd pg_stat_insights && make && sudo make install
# 2. Configure postgresql.conf
echo "shared_preload_libraries = 'pg_stat_insights'" >> /etc/postgresql/17/main/postgresql.conf
# 3. Restart and create extension
sudo systemctl restart postgresql
psql -d your_database -c "CREATE EXTENSION pg_stat_insights;"
# 4. View your slowest queries
psql -d your_database -c "SELECT * FROM pg_stat_insights_top_by_time LIMIT 10;"

Resources and Community

Conclusion

pg_stat_insights represents a significant advancement in PostgreSQL performance monitoring by providing 52 comprehensive metrics across 11 pre-built views. Its unique combination of detailed metrics, ready-to-use analytics views, response time categorization, and production-ready testing makes it the ideal choice for organizations serious about PostgreSQL performance optimization.

Key Takeaways

  1. Comprehensive: 52 metrics covering execution, cache, WAL, JIT, and parallel operations
  2. Ready-to-Use: 11 pre-built views for instant performance insights
  3. Drop-in Replacement: Compatible with pg_stat_statements queries
  4. Production Tested: 150 TAP tests with 100% code coverage
  5. Easy Integration: Pre-built Prometheus queries and Grafana dashboards
  6. Well-Documented: 30+ pages of comprehensive documentation

Whether you're optimizing query performance, troubleshooting slow queries, or monitoring production databases, pg_stat_insights provides the visibility you need to keep PostgreSQL running at peak performance.


pg_stat_insights is developed with care for the PostgreSQL community. Version 1.0 | PostgreSQL 16, 17, 18 supported

Comments

Leave a Comment

Be the first to comment on this post!