Query AnalyticsDeep Performance Analysis
Comprehensive guide to analyzing and optimizing PostgreSQL query performance. Identify bottlenecks, optimize execution plans, and monitor query patterns.
Key Query Metrics
Essential metrics for understanding query performance
Execution Time
Total and average time spent executing queries
Key Fields
- total_time: Total execution time across all calls
- avg_time: Average execution time per call
- min_time: Fastest execution time recorded
- max_time: Slowest execution time recorded
Optimization Focus
Focus on queries with high total_time and avg_time for maximum impact
Call Frequency
How often queries are executed
Key Fields
- calls: Total number of times query was executed
- percentage_of_total: Percentage of all query calls
- calls_per_second: Average calls per second
- peak_calls: Highest call rate in any time period
Optimization Focus
Optimize frequently called queries even if individual execution time is low
Row Processing
Data volume processed by queries
Key Fields
- rows: Total rows returned across all calls
- avg_rows: Average rows returned per call
- rows_per_second: Rows processed per second
- blk_read_time: Time spent reading from disk
Optimization Focus
Look for queries processing large amounts of data inefficiently
I/O Operations
Disk and memory I/O patterns
Key Fields
- shared_blks_hit: Blocks read from shared buffer cache
- shared_blks_read: Blocks read from disk
- local_blks_hit: Local buffer cache hits
- local_blks_read: Local buffer cache reads
Optimization Focus
Improve cache hit ratios to reduce disk I/O
Query Categories
Different types of performance issues and their solutions
Slow Queries
Queries taking longer than expected to execute
Threshold
> 100ms average execution time
Impact
High - Direct user experience impact
Common Examples
- Complex JOINs without proper indexes
- Queries scanning large tables
- Inefficient subqueries or CTEs
- Missing or outdated statistics
Solutions
- Add appropriate indexes
- Rewrite query logic
- Update table statistics
- Consider query optimization techniques
Frequent Queries
Queries executed very often
Threshold
> 1000 calls per hour
Impact
Medium - Resource consumption
Common Examples
- Authentication queries
- Session management queries
- Configuration lookups
- Health check queries
Solutions
- Implement query result caching
- Optimize query execution plan
- Consider connection pooling
- Review application query patterns
High I/O Queries
Queries causing excessive disk reads
Threshold
> 50% disk reads vs cache hits
Impact
High - System performance impact
Common Examples
- Full table scans
- Queries on cold data
- Missing indexes
- Insufficient shared_buffers
Solutions
- Add missing indexes
- Increase shared_buffers
- Implement data partitioning
- Optimize query filters
Resource Intensive
Queries consuming significant system resources
Threshold
High CPU or memory usage
Impact
High - System stability impact
Common Examples
- Complex analytical queries
- Large data aggregations
- Inefficient sorting operations
- Memory-intensive operations
Solutions
- Break down complex queries
- Use appropriate data types
- Implement query timeouts
- Consider materialized views
Optimization Techniques
Proven methods for improving query performance
Index Optimization
Create and maintain optimal indexes
Implementation Steps
- 1Identify queries with high shared_blks_read
- 2Analyze WHERE clauses and JOIN conditions
- 3Create composite indexes for multi-column filters
- 4Monitor index usage with pg_stat_user_indexes
- 5Remove unused indexes to reduce maintenance overhead
Code Examples
-- Analyze index usage SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_user_indexes WHERE idx_scan = 0 -- Unused indexes ORDER BY pg_relation_size(indexrelid) DESC; -- Create composite index CREATE INDEX idx_users_email_status ON users(email, status) WHERE status = 'active'; -- Partial index for common filter CREATE INDEX idx_orders_recent ON orders(created_at) WHERE created_at > '2024-01-01';
Query Rewriting
Improve query structure and logic
Implementation Steps
- 1Replace subqueries with JOINs where possible
- 2Use EXISTS instead of IN for large datasets
- 3Avoid SELECT * and specify only needed columns
- 4Use appropriate data types to reduce storage
- 5Consider using window functions for analytical queries
Code Examples
-- Instead of subquery
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders WHERE status = 'completed');
-- Use JOIN
SELECT DISTINCT u.* FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'completed';
-- Use EXISTS for better performance
SELECT * FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.user_id = u.id AND o.status = 'completed'
);Statistics and Planning
Ensure optimal query planning
Implementation Steps
- 1Keep table statistics up to date with ANALYZE
- 2Adjust statistics targets for large tables
- 3Use EXPLAIN ANALYZE to verify execution plans
- 4Consider query hints for complex scenarios
- 5Monitor and tune planner parameters
Code Examples
-- Update statistics ANALYZE users; ANALYZE orders; -- Increase statistics target for large table ALTER TABLE large_table ALTER COLUMN important_column SET STATISTICS 1000; -- Analyze query execution plan EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT u.*, o.total FROM users u JOIN orders o ON u.id = o.user_id WHERE u.created_at > '2024-01-01'; -- Check for sequential scans SELECT schemaname, tablename, seq_scan, seq_tup_read FROM pg_stat_user_tables WHERE seq_scan > 0 ORDER BY seq_tup_read DESC;
Caching and Materialization
Reduce repeated computation
Implementation Steps
- 1Implement application-level query caching
- 2Use materialized views for complex aggregations
- 3Consider query result caching in application
- 4Use prepared statements for repeated queries
- 5Implement connection pooling
Code Examples
-- Create materialized view for complex aggregation
CREATE MATERIALIZED VIEW user_order_summary AS
SELECT
u.id,
u.email,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent,
MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email;
-- Refresh materialized view
REFRESH MATERIALIZED VIEW user_order_summary;
-- Create index on materialized view
CREATE INDEX idx_user_order_summary_email
ON user_order_summary(email);Monitoring Queries
Ready-to-use SQL queries for query performance monitoring
Top Slow Queries
Identify queries with highest execution times
SELECT
query,
calls,
total_time,
mean_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY mean_time DESC
LIMIT 10;Most Frequent Queries
Find queries executed most often
SELECT
query,
calls,
total_time,
mean_time,
(calls * 100.0 / SUM(calls) OVER()) AS percent_of_total
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY calls DESC
LIMIT 10;High I/O Queries
Queries causing most disk reads
SELECT
query,
calls,
shared_blks_read,
shared_blks_hit,
100.0 * shared_blks_read / nullif(shared_blks_hit + shared_blks_read, 0) AS read_percent
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC
LIMIT 10;Query Performance Trends
Track query performance over time
SELECT
DATE_TRUNC('hour', now() - interval '1 hour' * generate_series(0, 23)) as hour,
COUNT(*) as query_count,
AVG(mean_time) as avg_execution_time,
MAX(mean_time) as max_execution_time
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
GROUP BY hour
ORDER BY hour;Best Practices
Guidelines for effective query performance management
Regular Monitoring
Establish consistent query performance monitoring
- Set up automated alerts for slow queries
- Review query performance reports weekly
- Monitor trends over time, not just current state
- Track both individual query performance and overall system health
Baseline Establishment
Create performance baselines for comparison
- Document normal query performance ranges
- Establish thresholds for different query types
- Track performance before and after changes
- Maintain historical performance data
Incremental Optimization
Optimize queries systematically and safely
- Start with highest impact, lowest effort optimizations
- Test changes in development environment first
- Implement one optimization at a time
- Measure impact of each change
Documentation and Knowledge
Maintain query performance knowledge base
- Document query optimization decisions
- Share knowledge with development team
- Create query performance guidelines
- Regular team training on database optimization
API Integration
Use pg_stat_insights API to integrate query analytics into your applications