Query AnalyticsDeep Performance Analysis

Comprehensive guide to analyzing and optimizing PostgreSQL query performance. Identify bottlenecks, optimize execution plans, and monitor query patterns.

Query analysis

Performance optimization

Real-time monitoring

Key Query Metrics

Essential metrics for understanding query performance

Execution Time

Total and average time spent executing queries

Key Fields

total_time: Total execution time across all calls
avg_time: Average execution time per call
min_time: Fastest execution time recorded
max_time: Slowest execution time recorded

Optimization Focus

Focus on queries with high total_time and avg_time for maximum impact

Call Frequency

How often queries are executed

Key Fields

calls: Total number of times query was executed
percentage_of_total: Percentage of all query calls
calls_per_second: Average calls per second
peak_calls: Highest call rate in any time period

Optimization Focus

Optimize frequently called queries even if individual execution time is low

Row Processing

Data volume processed by queries

Key Fields

rows: Total rows returned across all calls
avg_rows: Average rows returned per call
rows_per_second: Rows processed per second
blk_read_time: Time spent reading from disk

Optimization Focus

Look for queries processing large amounts of data inefficiently

I/O Operations

Disk and memory I/O patterns

Key Fields

shared_blks_hit: Blocks read from shared buffer cache
shared_blks_read: Blocks read from disk
local_blks_hit: Local buffer cache hits
local_blks_read: Local buffer cache reads

Optimization Focus

Improve cache hit ratios to reduce disk I/O

Query Categories

Different types of performance issues and their solutions

Slow Queries

Queries taking longer than expected to execute

Threshold

> 100ms average execution time

Impact

High - Direct user experience impact

Common Examples

Complex JOINs without proper indexes
Queries scanning large tables
Inefficient subqueries or CTEs
Missing or outdated statistics

Solutions

Add appropriate indexes
Rewrite query logic
Update table statistics
Consider query optimization techniques

Frequent Queries

Queries executed very often

Threshold

> 1000 calls per hour

Impact

Medium - Resource consumption

Common Examples

Authentication queries
Session management queries
Configuration lookups
Health check queries

Solutions

Implement query result caching
Optimize query execution plan
Consider connection pooling
Review application query patterns

High I/O Queries

Queries causing excessive disk reads

Threshold

> 50% disk reads vs cache hits

Impact

High - System performance impact

Common Examples

Full table scans
Queries on cold data
Missing indexes
Insufficient shared_buffers

Solutions

Add missing indexes
Increase shared_buffers
Implement data partitioning
Optimize query filters

Resource Intensive

Queries consuming significant system resources

Threshold

High CPU or memory usage

Impact

High - System stability impact

Common Examples

Complex analytical queries
Large data aggregations
Inefficient sorting operations
Memory-intensive operations

Solutions

Break down complex queries
Use appropriate data types
Implement query timeouts
Consider materialized views

Optimization Techniques

Proven methods for improving query performance

Index Optimization

Create and maintain optimal indexes

Implementation Steps

1Identify queries with high shared_blks_read
2Analyze WHERE clauses and JOIN conditions
3Create composite indexes for multi-column filters
4Monitor index usage with pg_stat_user_indexes
5Remove unused indexes to reduce maintenance overhead

Code Examples

-- Analyze index usage
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes 
WHERE idx_scan = 0  -- Unused indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Create composite index
CREATE INDEX idx_users_email_status 
ON users(email, status) 
WHERE status = 'active';

-- Partial index for common filter
CREATE INDEX idx_orders_recent 
ON orders(created_at) 
WHERE created_at > '2024-01-01';

Query Rewriting

Improve query structure and logic

Implementation Steps

1Replace subqueries with JOINs where possible
2Use EXISTS instead of IN for large datasets
3Avoid SELECT * and specify only needed columns
4Use appropriate data types to reduce storage
5Consider using window functions for analytical queries

Code Examples

-- Instead of subquery
SELECT * FROM users 
WHERE id IN (SELECT user_id FROM orders WHERE status = 'completed');

-- Use JOIN
SELECT DISTINCT u.* FROM users u
JOIN orders o ON u.id = o.user_id 
WHERE o.status = 'completed';

-- Use EXISTS for better performance
SELECT * FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o 
    WHERE o.user_id = u.id AND o.status = 'completed'
);

Statistics and Planning

Ensure optimal query planning

Implementation Steps

1Keep table statistics up to date with ANALYZE
2Adjust statistics targets for large tables
3Use EXPLAIN ANALYZE to verify execution plans
4Consider query hints for complex scenarios
5Monitor and tune planner parameters

Code Examples

-- Update statistics
ANALYZE users;
ANALYZE orders;

-- Increase statistics target for large table
ALTER TABLE large_table ALTER COLUMN important_column SET STATISTICS 1000;

-- Analyze query execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.*, o.total 
FROM users u 
JOIN orders o ON u.id = o.user_id 
WHERE u.created_at > '2024-01-01';

-- Check for sequential scans
SELECT schemaname, tablename, seq_scan, seq_tup_read
FROM pg_stat_user_tables 
WHERE seq_scan > 0 
ORDER BY seq_tup_read DESC;

Caching and Materialization

Reduce repeated computation

Implementation Steps

1Implement application-level query caching
2Use materialized views for complex aggregations
3Consider query result caching in application
4Use prepared statements for repeated queries
5Implement connection pooling

Code Examples

-- Create materialized view for complex aggregation
CREATE MATERIALIZED VIEW user_order_summary AS
SELECT 
    u.id,
    u.email,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent,
    MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email;

-- Refresh materialized view
REFRESH MATERIALIZED VIEW user_order_summary;

-- Create index on materialized view
CREATE INDEX idx_user_order_summary_email 
ON user_order_summary(email);

Monitoring Queries

Ready-to-use SQL queries for query performance monitoring

Top Slow Queries

Identify queries with highest execution times

SELECT 
    query,
    calls,
    total_time,
    mean_time,
    rows,
    100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements 
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY mean_time DESC 
LIMIT 10;

Most Frequent Queries

Find queries executed most often

SELECT 
    query,
    calls,
    total_time,
    mean_time,
    (calls * 100.0 / SUM(calls) OVER()) AS percent_of_total
FROM pg_stat_statements 
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY calls DESC 
LIMIT 10;

High I/O Queries

Queries causing most disk reads

SELECT 
    query,
    calls,
    shared_blks_read,
    shared_blks_hit,
    100.0 * shared_blks_read / nullif(shared_blks_hit + shared_blks_read, 0) AS read_percent
FROM pg_stat_statements 
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC 
LIMIT 10;

Query Performance Trends

Track query performance over time

SELECT 
    DATE_TRUNC('hour', now() - interval '1 hour' * generate_series(0, 23)) as hour,
    COUNT(*) as query_count,
    AVG(mean_time) as avg_execution_time,
    MAX(mean_time) as max_execution_time
FROM pg_stat_statements 
WHERE query NOT LIKE '%pg_stat_statements%'
GROUP BY hour
ORDER BY hour;

Best Practices

Guidelines for effective query performance management

Regular Monitoring

Establish consistent query performance monitoring

Set up automated alerts for slow queries
Review query performance reports weekly
Monitor trends over time, not just current state
Track both individual query performance and overall system health

Baseline Establishment

Create performance baselines for comparison

Document normal query performance ranges
Establish thresholds for different query types
Track performance before and after changes
Maintain historical performance data

Incremental Optimization

Optimize queries systematically and safely

Start with highest impact, lowest effort optimizations
Test changes in development environment first
Implement one optimization at a time
Measure impact of each change

Documentation and Knowledge

Maintain query performance knowledge base

Document query optimization decisions
Share knowledge with development team
Create query performance guidelines
Regular team training on database optimization

API Integration

Use pg_stat_insights API to integrate query analytics into your applications

API Reference

Complete API documentation and examples

Best Practices

Optimization strategies and monitoring tips

Getting Started

Quick setup and first insights