Introduction
Misconfigured or under-optimized Linux systems are a frequent contributor to production outages. Site Reliability Engineers (SREs) are constantly challenged to keep systems running well under high workloads, which makes Linux performance tuning an essential skill. In this guide, you’ll find practical techniques to proactively optimize your Linux systems for reliability, performance, and operational efficiency. Treat the values shown as starting points and measure before and after each change.
Step-by-Step Linux Optimization Guide
Step 1: Adjust Swappiness for Optimal Memory Management
Check current swappiness:
cat /proc/sys/vm/swappiness
Set a lower swappiness value, for example 10:
sudo sysctl vm.swappiness=10
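The sysctl call above changes only the running kernel, so the value reverts at the next reboot. One way to persist it is a drop-in file under /etc/sysctl.d. The sketch below assumes that layout; the helper name write_sysctl_dropin and the 99- file prefix are illustrative choices, not a required convention.

```shell
# write_sysctl_dropin DIR KEY VALUE  (hypothetical helper)
# Writes "KEY = VALUE" to DIR/99-KEY.conf and prints the path it wrote.
write_sysctl_dropin() (
    dir=$1
    key=$2
    value=$3
    file="$dir/99-$key.conf"
    printf '%s = %s\n' "$key" "$value" > "$file" || exit 1
    printf '%s\n' "$file"
)

# On a real host, run as root against /etc/sysctl.d, then reload all drop-ins:
#   write_sysctl_dropin /etc/sysctl.d vm.swappiness 10
#   sysctl --system
```

Taking the directory as a parameter lets you try the helper against a scratch directory before pointing it at /etc/sysctl.d.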
Step 2: Increase File Descriptor Limits
Check current limits:
ulimit -n
Update limits:
echo '* soft nofile 65535' | sudo tee -a /etc/security/limits.conf
echo '* hard nofile 65535' | sudo tee -a /etc/security/limits.conf
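Entries in /etc/security/limits.conf are applied by PAM when a session starts, so they affect new logins and do not reach services started by systemd (those take LimitNOFILE= in the unit file). To see what a running process actually received, you can read its /proc/PID/limits file. The helper name nofile_limits and the nginx process in the usage comment are assumptions for illustration.

```shell
# nofile_limits  (hypothetical helper)
# Reads a /proc/PID/limits file on stdin and prints "SOFT HARD"
# for the "Max open files" row.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }'
}

# Usage on a real host:
#   nofile_limits < "/proc/$$/limits"                    # the current shell
#   nofile_limits < "/proc/$(pidof -s nginx)/limits"     # a service, if running
```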
Step 3: Resource Isolation with cgroups
Create a memory cgroup (cgcreate ships with the libcgroup tools; the path below is the cgroup v1 layout):
sudo cgcreate -g memory:/critical_service
echo $((1024*1024*1024)) | sudo tee /sys/fs/cgroup/memory/critical_service/memory.limit_in_bytes
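The memory.limit_in_bytes path exists only on the cgroup v1 hierarchy. Hosts that use the unified cgroup v2 hierarchy expose the limit as memory.max directly under the group's directory, and the presence of cgroup.controllers at the root is a common way to tell the two apart. The sketch below picks the file accordingly; mem_limit_file is a hypothetical helper.

```shell
# mem_limit_file GROUP [ROOT]  (hypothetical helper)
# Prints the memory-limit file for GROUP, choosing the cgroup v2 or v1
# layout depending on whether ROOT/cgroup.controllers exists.
mem_limit_file() (
    group=$1
    root=${2:-/sys/fs/cgroup}
    if [ -f "$root/cgroup.controllers" ]; then
        printf '%s\n' "$root/$group/memory.max"                      # cgroup v2
    else
        printf '%s\n' "$root/memory/$group/memory.limit_in_bytes"    # cgroup v1
    fi
)

# Usage on a real host (needs root). On v2 the group directory must already
# exist and the memory controller must be enabled in the parent's
# cgroup.subtree_control:
#   echo $((1024*1024*1024)) | sudo tee "$(mem_limit_file critical_service)"
```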
Step 4: Networking Optimization
Adjust TCP parameters:
sudo sysctl -w net.ipv4.tcp_tw_reuse=1
sudo sysctl -w net.core.somaxconn=1024
Step 5: Select Appropriate I/O Scheduler
Check current scheduler:
cat /sys/block/sda/queue/scheduler
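Set the deadline scheduler (named mq-deadline on multi-queue kernels, which is every kernel since Linux 5.0; older single-queue kernels call it deadline):
echo 'mq-deadline' | sudo tee /sys/block/sda/queue/scheduler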
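The scheduler file lists every scheduler the device offers and marks the active one in square brackets, and the deadline scheduler's name differs between older and newer kernels. A small sketch for reading that format follows; active_scheduler and pick_deadline are hypothetical helper names.

```shell
# active_scheduler: reads a queue/scheduler line on stdin and prints the
# entry shown in [brackets], i.e. the scheduler currently in use.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# pick_deadline: reads the same line and prints whichever deadline variant
# the kernel offers (mq-deadline or deadline); prints nothing if neither is.
pick_deadline() {
    tr -d '[]' | tr ' ' '\n' | grep -x -e mq-deadline -e deadline | head -n 1
}

# Usage on a real host:
#   active_scheduler < /sys/block/sda/queue/scheduler
#   sched=$(pick_deadline < /sys/block/sda/queue/scheduler)
#   [ -n "$sched" ] && echo "$sched" | sudo tee /sys/block/sda/queue/scheduler
```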
Step 6: Real-time Diagnostics with perf
Monitor kernel-level events:
sudo perf top
Step 7: Disable Transparent Huge Pages (THP)
Check THP status:
cat /sys/kernel/mm/transparent_hugepage/enabled
Disable THP:
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
Step 8: Enable HugePages
Configure HugePages:
sudo sysctl vm.nr_hugepages=1024
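vm.nr_hugepages counts pages rather than bytes, so the memory it reserves depends on the Hugepagesize reported in /proc/meminfo. With the common 2048 kB page size on x86-64, 1024 pages reserve 2 GiB. The sketch below works the arithmetic in the other direction, from a target size to a page count; hugepages_for is a hypothetical helper.

```shell
# hugepages_for MIB [PAGE_KB]  (hypothetical helper)
# Prints how many huge pages are needed to cover MIB mebibytes, rounded up.
# PAGE_KB defaults to the Hugepagesize value from /proc/meminfo.
hugepages_for() (
    mib=$1
    page_kb=${2:-$(awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo)}
    echo $(( (mib * 1024 + page_kb - 1) / page_kb ))
)

# Usage:
#   hugepages_for 2048 2048    # 2 GiB with 2 MiB pages -> 1024
#   sudo sysctl vm.nr_hugepages="$(hugepages_for 2048)"
```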
Step 9: Tweak Cache Behavior
Adjust dirty ratios:
sudo sysctl -w vm.dirty_ratio=15
sudo sysctl -w vm.dirty_background_ratio=5
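Both ratios are percentages of available memory, so the same numbers allow very different amounts of unwritten data on a 4 GiB host and a 256 GiB host. As a rough estimate, assuming MemAvailable from /proc/meminfo is close enough to the figure the kernel uses, the thresholds can be converted to MiB. The helper dirty_threshold_mib is hypothetical.

```shell
# dirty_threshold_mib PERCENT [AVAILABLE_KB]  (hypothetical helper)
# Approximates a dirty-page threshold in MiB for the given percentage.
# AVAILABLE_KB defaults to MemAvailable from /proc/meminfo, which is only
# an approximation of the kernel's own accounting.
dirty_threshold_mib() (
    pct=$1
    avail_kb=${2:-$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)}
    echo $(( avail_kb * pct / 100 / 1024 ))
)

# Usage:
#   dirty_threshold_mib 15 67108864    # 64 GiB available -> 9830
#   dirty_threshold_mib 5              # background threshold on this host
```

If the result is larger than your storage can flush comfortably, vm.dirty_bytes and vm.dirty_background_bytes accept absolute values in place of the ratios.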
Step 10: Optimize IRQ Balancing
Install and configure irqbalance:
sudo apt-get install irqbalance
sudo systemctl enable irqbalance
sudo systemctl start irqbalance
Step 11: Network Throughput Optimization
Adjust network backlog:
sudo sysctl -w net.core.netdev_max_backlog=5000
Step 12: Manage TCP SYN Backlog
Increase SYN backlog:
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=2048
Step 13: TCP Connection Timeout
Reduce FIN timeout:
sudo sysctl -w net.ipv4.tcp_fin_timeout=15
Step 14: Optimize TCP Buffer Sizes
Raise the maximum socket buffer sizes:
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
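These two settings cap the buffer an application can request with SO_RCVBUF and SO_SNDBUF; the limits for TCP's automatic buffer tuning live in net.ipv4.tcp_rmem and net.ipv4.tcp_wmem. A common way to judge whether a maximum is large enough is the bandwidth-delay product (link speed multiplied by round-trip time). The sketch below does that arithmetic; bdp_bytes is a hypothetical helper, and the link speeds and RTTs are example inputs.

```shell
# bdp_bytes MBIT_PER_S RTT_MS  (hypothetical helper)
# Bandwidth-delay product in bytes:
#   (MBIT_PER_S * 1,000,000 / 8) bytes/s * (RTT_MS / 1000) s
#   = MBIT_PER_S * 125 * RTT_MS
bdp_bytes() {
    echo $(( $1 * 125 * $2 ))
}

# Usage:
#   bdp_bytes 1000 100     # 1 Gbit/s at 100 ms  -> 12500000
#   bdp_bytes 10000 10     # 10 Gbit/s at 10 ms  -> 12500000
```

Both example paths need about 12.5 MB in flight, which fits under the 16777216-byte maximum set above.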
Step 15: Apply tuned-adm Profiles
Install and apply profiles:
sudo apt-get install tuned
sudo tuned-adm profile throughput-performance
Step 16: Scheduler Tunables
Optimize scheduler responsiveness:
sudo sysctl -w kernel.sched_autogroup_enabled=1
Step 17: Implement zswap
Enable zswap (it is controlled by a kernel module parameter, not a sysctl):
echo 1 | sudo tee /sys/module/zswap/parameters/enabled
Step 18: SSD Optimization with udev
Create udev rule for SSD (on kernels older than 5.0 without multi-queue, use deadline instead of mq-deadline):
echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"' | sudo tee /etc/udev/rules.d/60-ssd.rules
Step 19: Kernel Samepage Merging (KSM)
Enable KSM:
echo 1 | sudo tee /sys/kernel/mm/ksm/run
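KSM only merges memory that applications have opted in, typically through madvise(MADV_MERGEABLE) as QEMU/KVM does for guest memory, and the scanning costs CPU time. It is worth checking that it actually saves something. The kernel reports the number of deduplicated pages in pages_sharing; the sketch below converts that to KiB. ksm_saved_kib is a hypothetical helper.

```shell
# ksm_saved_kib [KSM_DIR] [PAGE_BYTES]  (hypothetical helper)
# Estimates memory saved by KSM as pages_sharing * page size, in KiB.
ksm_saved_kib() (
    dir=${1:-/sys/kernel/mm/ksm}
    page=${2:-$(getconf PAGESIZE)}
    sharing=$(cat "$dir/pages_sharing") || exit 1
    echo $(( sharing * page / 1024 ))
)

# Usage on a real host, some time after enabling KSM:
#   ksm_saved_kib
```

A result that stays at 0 suggests no process on the host has marked memory as mergeable, in which case KSM adds scanning overhead without benefit.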
Step 20: Regular fstrim
Schedule fstrim:
sudo systemctl enable fstrim.timer
sudo systemctl start fstrim.timer
Step 21: CPU Governor Adjustment
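Set performance governor on every CPU (cpufreq-set changes only CPU 0 unless -c is given):
sudo apt-get install cpufrequtils
for cpu in $(seq 0 $(($(nproc --all) - 1))); do sudo cpufreq-set -c "$cpu" -g performance; done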
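To confirm that the governor took effect on every core rather than only one, you can read scaling_governor for each CPU from sysfs. The sketch below summarizes how many CPUs use each governor; governor_summary is a hypothetical helper, and the sysfs root is a parameter so it can be pointed at a test directory.

```shell
# governor_summary [CPU_ROOT]  (hypothetical helper)
# Prints one "GOVERNOR COUNT" line per governor in use across all CPUs.
governor_summary() (
    root=${1:-/sys/devices/system/cpu}
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null \
        | sort | uniq -c | awk '{ print $2, $1 }'
)

# Usage on a real host:
#   governor_summary
# A single line such as "performance 8" means all eight CPUs match.
```

Empty output usually means the host exposes no cpufreq interface, which is common inside virtual machines.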
Automating Performance Tuning
Consistency in configuration across systems is crucial. Automate using tools like Ansible or Chef.
Example Ansible Playbook for Performance Tuning
- hosts: all
  become: true
  tasks:
    - name: Set vm.swappiness
      sysctl:
        name: vm.swappiness
        value: '10'
        state: present
        reload: yes
    - name: Increase file descriptor limits
      lineinfile:
        path: /etc/security/limits.conf
        line: '* soft nofile 65535'
        create: yes
Actionable Takeaways: Your Tuning Checklist
- Adjust kernel parameters (swappiness, file descriptors)
- Implement cgroups for resource isolation
- Optimize networking and TCP stack
- Choose appropriate I/O schedulers
- Automate tuning tasks with Ansible or Chef
- Monitor continuously using tools like perf
- Apply additional advanced optimization techniques listed above
By implementing these Linux performance tuning techniques step by step, you prepare your infrastructure to handle peak loads and improve its uptime and reliability.