As developers and system administrators, we often rely on commercial monitoring tools for our servers. However, sometimes you need a lightweight, customizable solution that you can fully control. In this guide, I'll show you how to create five practical Bash scripts that you can use to monitor various aspects of your Linux server performance.
These scripts are ideal for:
- Small to medium-sized server environments
- Developers looking to understand server monitoring basics
- Anyone who wants to automate responses to common server issues
Let's dive in!
1. CPU and Memory Usage Monitor
One of the most critical aspects of server performance is CPU and memory usage. Here's a script that monitors both and sends email alerts when thresholds are exceeded:
#!/bin/bash
# cpu_mem_monitor.sh - Monitors CPU and memory usage and sends alerts
# Configuration
THRESHOLD_CPU=80 # Alert when CPU usage exceeds 80%
THRESHOLD_MEM=80 # Alert when memory usage exceeds 80%
EMAIL="admin@yourdomain.com"
LOG_FILE="/var/log/server-monitor.log"
# Create log file if it doesn't exist
if [ ! -f "$LOG_FILE" ]; then
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
fi
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
send_alert() {
    log_message "$1"
    echo "$1" | mail -s "Server Alert: $(hostname)" "$EMAIL"
}
# Get CPU usage (excluding idle time)
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
CPU_USAGE_INT=${CPU_USAGE%.*}
# Get memory usage percentage
MEM_USAGE=$(free | grep Mem | awk '{print $3/$2 * 100.0}')
MEM_USAGE_INT=${MEM_USAGE%.*}
log_message "CPU: ${CPU_USAGE}%, Memory: ${MEM_USAGE}%"
# Check if CPU usage exceeds threshold
if [ "$CPU_USAGE_INT" -gt "$THRESHOLD_CPU" ]; then
    send_alert "HIGH CPU ALERT: Usage at ${CPU_USAGE}% (Threshold: ${THRESHOLD_CPU}%)"
fi
# Check if memory usage exceeds threshold
if [ "$MEM_USAGE_INT" -gt "$THRESHOLD_MEM" ]; then
    send_alert "HIGH MEMORY ALERT: Usage at ${MEM_USAGE}% (Threshold: ${THRESHOLD_MEM}%)"
fi
How to Use This Script:
- Save it as cpu_mem_monitor.sh
- Make it executable: chmod +x cpu_mem_monitor.sh
- Set up a cron job to run it at regular intervals:
# Run every 5 minutes
*/5 * * * * /path/to/cpu_mem_monitor.sh
Make sure you have mailutils or a similar package installed to enable email functionality: sudo apt-get install mailutils
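The top-based CPU extraction is the trickiest line in the script. A quick way to sanity-check it is to run the same sed/awk pipeline against a canned sample of top's summary line (the sample values below are illustrative, not captured from a real host):

```shell
#!/bin/bash
# Sanity-check the CPU parsing pipeline against a canned "top" summary line.
# SAMPLE is an illustrative example of the Cpu(s) line, not real output.
SAMPLE='%Cpu(s):  7.4 us,  2.1 sy,  0.0 ni, 89.2 id,  1.0 wa,  0.0 hi,  0.3 si,  0.0 st'

# Same extraction as the monitor: isolate the idle percentage, subtract from 100
CPU_USAGE=$(echo "$SAMPLE" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
echo "Parsed CPU usage: ${CPU_USAGE}%"   # 89.2% idle -> 10.8% busy
```

If top's output format on your distribution differs, testing the pipeline this way tells you immediately whether the sed pattern still isolates the idle field.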
2. Disk Space Monitor
Running out of disk space can cause serious issues for your services. This script monitors your disk space and sends alerts when it gets too low:
#!/bin/bash
# disk_monitor.sh - Monitors disk space usage and sends alerts
# Configuration
THRESHOLD=85 # Alert when usage exceeds 85%
EMAIL="admin@yourdomain.com"
LOG_FILE="/var/log/disk-monitor.log"
EXCLUDE_LIST=("tmpfs" "devtmpfs" "squashfs") # Filesystems to exclude
# Create log file if it doesn't exist
if [ ! -f "$LOG_FILE" ]; then
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
fi
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
send_alert() {
    log_message "$1"
    echo -e "$1" | mail -s "Disk Space Alert: $(hostname)" "$EMAIL"
}
# Get disk usage for all mounted filesystems (-P keeps each entry on one line;
# pseudo-filesystems are filtered out via EXCLUDE_LIST below)
DISK_USAGE=$(df -hP)
log_message "Running disk space check"
# Check each filesystem
ALERTS=""
while read -r line; do
    # Skip the header line
    if echo "$line" | grep -q "Filesystem"; then
        continue
    fi
    # Extract filesystem details
    FILESYSTEM=$(echo "$line" | awk '{print $1}')
    MOUNT_POINT=$(echo "$line" | awk '{print $NF}')
    USAGE_PERCENT=$(echo "$line" | awk '{print $5}' | sed 's/%//')
    # Skip excluded filesystems
    SKIP=0
    for EXCL in "${EXCLUDE_LIST[@]}"; do
        if echo "$FILESYSTEM" | grep -q "$EXCL"; then
            SKIP=1
            break
        fi
    done
    if [ "$SKIP" -eq 1 ]; then
        continue
    fi
    # Check if usage exceeds threshold
    if [ "$USAGE_PERCENT" -gt "$THRESHOLD" ]; then
        DETAIL=$(echo "$line" | awk '{print "Used: "$3" Free: "$4" Usage: "$5}')
        ALERTS="${ALERTS}${MOUNT_POINT} - ${DETAIL}\n"
    fi
done <<< "$DISK_USAGE"
# Send alert if needed
if [ -n "$ALERTS" ]; then
    ALERT_MESSAGE="DISK SPACE ALERT: The following filesystems exceed ${THRESHOLD}% usage:\n\n${ALERTS}"
    send_alert "$ALERT_MESSAGE"
fi
How to Use This Script:
- Save it as disk_monitor.sh
- Make it executable: chmod +x disk_monitor.sh
- Set up a cron job to run it daily or hourly:
# Run once a day at 8am
0 8 * * * /path/to/disk_monitor.sh
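Before wiring the script into cron, you can dry-run the per-line parsing against a fabricated df line (the device name and numbers below are made up):

```shell
#!/bin/bash
# Dry-run the disk parsing logic on a fabricated "df -hP" line (values are made up).
LINE='/dev/sda1        50G   45G   5G   90% /'
THRESHOLD=85

# Same field extraction as the monitor loop
MOUNT_POINT=$(echo "$LINE" | awk '{print $NF}')
USAGE_PERCENT=$(echo "$LINE" | awk '{print $5}' | sed 's/%//')

if [ "$USAGE_PERCENT" -gt "$THRESHOLD" ]; then
    echo "ALERT: $MOUNT_POINT at ${USAGE_PERCENT}%"
fi
```

Because the percentage is always column 5 and the mount point always the last column in POSIX df output, the extraction is stable even when device names vary in length.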
3. Critical Log File Monitor
This script scans your system logs for important error patterns and alerts you when they occur:
#!/bin/bash
# log_monitor.sh - Monitors critical log files for important patterns
# Configuration
EMAIL="admin@yourdomain.com"
LOG_FILE="/var/log/logmonitor.log"
# Log files to monitor with their patterns
declare -A LOG_PATTERNS
LOG_PATTERNS["/var/log/syslog"]="(error|critical|fatal|failed|failure)"
LOG_PATTERNS["/var/log/auth.log"]="(failed|invalid|error|authentication failure)"
LOG_PATTERNS["/var/log/apache2/error.log"]="(fatal|error|critical)"
LOG_PATTERNS["/var/log/mysql/error.log"]="(ERROR|CRITICAL)"
# Create log file if it doesn't exist
if [ ! -f "$LOG_FILE" ]; then
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
fi
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
send_alert() {
    log_message "$1"
    echo -e "$1" | mail -s "Log Alert: $(hostname)" "$EMAIL"
}
# Store timestamp of last run to only check new logs
TIMESTAMP_FILE="/tmp/logmonitor_timestamp"
if [ -f "$TIMESTAMP_FILE" ]; then
    LAST_RUN=$(cat "$TIMESTAMP_FILE")
else
    # If running for the first time, check the last 10 minutes
    LAST_RUN=$(date -d "10 minutes ago" "+%Y-%m-%d %H:%M:%S")
fi
# Update timestamp file with current time
date "+%Y-%m-%d %H:%M:%S" > "$TIMESTAMP_FILE"
# Function to convert a date to seconds since epoch for comparison
date_to_seconds() {
    date -d "$1" +%s
}
LAST_RUN_SECONDS=$(date_to_seconds "$LAST_RUN")
CURRENT_TIME=$(date "+%Y-%m-%d %H:%M:%S")
CURRENT_SECONDS=$(date_to_seconds "$CURRENT_TIME")
log_message "Checking logs from $LAST_RUN to $CURRENT_TIME"
# Check each log file for patterns
ALERTS=""
for LOG_PATH in "${!LOG_PATTERNS[@]}"; do
    if [ -f "$LOG_PATH" ]; then
        PATTERN="${LOG_PATTERNS[$LOG_PATH]}"
        # Get new log entries matching the pattern
        NEW_ENTRIES=$(grep -i -E "$PATTERN" "$LOG_PATH" | while read -r line; do
            # Try to extract a timestamp from the log line (formats vary)
            if [[ "$line" =~ ^[A-Za-z]{3}\ [0-9]{1,2}\ [0-9]{2}:[0-9]{2}:[0-9]{2} ]]; then
                # Traditional syslog format (no year, so assume the current year)
                LOG_DATE=$(echo "$line" | cut -d ' ' -f 1-3)
                FULL_DATE=$(date -d "$LOG_DATE $(date +%Y)" "+%Y-%m-%d %H:%M:%S")
            elif [[ "$line" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}\ [0-9]{2}:[0-9]{2}:[0-9]{2} ]]; then
                # ISO format
                FULL_DATE=$(echo "$line" | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}')
            else
                # Skip entries whose timestamp can't be parsed
                continue
            fi
            # Convert the log entry date to seconds
            ENTRY_SECONDS=$(date_to_seconds "$FULL_DATE")
            # Only keep entries newer than the last run
            if [ "$ENTRY_SECONDS" -ge "$LAST_RUN_SECONDS" ]; then
                echo "$line"
            fi
        done)
        if [ -n "$NEW_ENTRIES" ]; then
            ENTRY_COUNT=$(echo "$NEW_ENTRIES" | wc -l)
            ALERTS="${ALERTS}$LOG_PATH - $ENTRY_COUNT new critical entries found:\n"
            ALERTS="${ALERTS}$(echo "$NEW_ENTRIES" | head -n 15)\n"
            # If more than 15 entries, indicate there are more
            if [ "$ENTRY_COUNT" -gt 15 ]; then
                ALERTS="${ALERTS}... and $((ENTRY_COUNT - 15)) more entries (check the full log)\n"
            fi
            ALERTS="${ALERTS}\n"
        fi
    else
        log_message "Warning: Log file $LOG_PATH not found"
    fi
done
# Send alert if needed
if [ -n "$ALERTS" ]; then
    ALERT_MESSAGE="LOG ALERT: Critical entries found in system logs:\n\n${ALERTS}"
    send_alert "$ALERT_MESSAGE"
fi
How to Use This Script:
- Save it as log_monitor.sh
- Make it executable: chmod +x log_monitor.sh
- Set up a cron job to run it hourly:
# Run every hour
0 * * * * /path/to/log_monitor.sh
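The timestamp comparison hinges on GNU date's -d flag. Here is the conversion in isolation, with TZ pinned to UTC so the epoch values are reproducible:

```shell
#!/bin/bash
# Demonstrate the date-to-epoch conversion used for log filtering.
# TZ is pinned to UTC so the output is the same on any machine.
date_to_seconds() {
    TZ=UTC date -d "$1" +%s
}

A=$(date_to_seconds "2024-01-01 00:00:00")
B=$(date_to_seconds "2024-01-01 00:10:00")
echo "$A $B"        # 1704067200 1704067800
echo $(( B - A ))   # 600 — entries within this window would be reported
```

Note that -d is a GNU extension; on BSD or macOS you would need `date -j -f` instead, which is one reason to test this helper on the target system first.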
4. Network Connection Monitor
This script monitors network connections and alerts you about unusual connection counts:
#!/bin/bash
# network_monitor.sh - Monitors network connections and detects unusual patterns
# Configuration
EMAIL="admin@yourdomain.com"
LOG_FILE="/var/log/network-monitor.log"
MAX_CONNECTIONS=200 # Alert threshold for total connections
MAX_PER_IP=50 # Alert threshold for connections per IP
# Create log file if it doesn't exist
if [ ! -f "$LOG_FILE" ]; then
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
fi
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
send_alert() {
    log_message "$1"
    echo -e "$1" | mail -s "Network Alert: $(hostname)" "$EMAIL"
}
# Count total current connections
TOTAL_CONNECTIONS=$(netstat -an | grep ESTABLISHED | wc -l)
log_message "Total active connections: $TOTAL_CONNECTIONS"
# Check if total connections exceed threshold
if [ "$TOTAL_CONNECTIONS" -gt "$MAX_CONNECTIONS" ]; then
    # Column 4 of netstat output is the local address, so these are the
    # local (service) ports receiving the connections
    TOP_PORTS=$(netstat -an | grep ESTABLISHED | awk '{print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -5)
    TOP_SERVICES=$(for port in $(echo "$TOP_PORTS" | awk '{print $2}'); do
        service=$(grep -w "$port/tcp" /etc/services 2>/dev/null | head -1 | awk '{print $1}')
        if [ -z "$service" ]; then service="unknown"; fi
        echo "Port $port: $service"
    done)
    ALERT_MESSAGE="HIGH CONNECTION ALERT: $TOTAL_CONNECTIONS active connections (Threshold: $MAX_CONNECTIONS)\n\nTop local ports:\n$TOP_PORTS\n\nService identification:\n$TOP_SERVICES"
    send_alert "$ALERT_MESSAGE"
fi
# Check for excessive connections from single IPs
IP_CONNECTIONS=$(netstat -an | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr)
SUSPICIOUS_IPS=""
while read -r line; do
    COUNT=$(echo "$line" | awk '{print $1}')
    IP=$(echo "$line" | awk '{print $2}')
    if [ "$COUNT" -gt "$MAX_PER_IP" ]; then
        IP_DETAILS=$(whois "$IP" 2>/dev/null | grep -E "Organization|OrgName|netname|country" | head -4 | tr '\n' ' ')
        SUSPICIOUS_IPS="${SUSPICIOUS_IPS}$IP - $COUNT connections - $IP_DETAILS\n"
    fi
done <<< "$IP_CONNECTIONS"
# Send alert if suspicious IPs found
if [ -n "$SUSPICIOUS_IPS" ]; then
    ALERT_MESSAGE="SUSPICIOUS CONNECTION ALERT: The following IPs have excessive connection counts (Threshold: $MAX_PER_IP):\n\n$SUSPICIOUS_IPS"
    send_alert "$ALERT_MESSAGE"
fi
How to Use This Script:
- Save it as network_monitor.sh
- Make it executable: chmod +x network_monitor.sh
- Install required tools: sudo apt-get install net-tools whois
- Set up a cron job to run it periodically:
# Run every 15 minutes
*/15 * * * * /path/to/network_monitor.sh
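The per-IP counting is just sort | uniq -c. You can verify the pipeline shape with a hand-written list of addresses (the IPs below are documentation placeholders, not real hosts):

```shell
#!/bin/bash
# Verify the connection-counting pipeline with placeholder IPs.
IPS='10.0.0.5
192.0.2.7
10.0.0.5
10.0.0.5
192.0.2.7'

# Same shape as the monitor: count occurrences, busiest address first
TOP=$(echo "$IPS" | sort | uniq -c | sort -nr | head -1)
COUNT=$(echo "$TOP" | awk '{print $1}')
IP=$(echo "$TOP" | awk '{print $2}')
echo "$IP has $COUNT connections"   # 10.0.0.5 has 3 connections
```

One caveat for the real script: `cut -d: -f1` assumes IPv4-style addresses, since IPv6 addresses contain colons themselves; if you monitor IPv6 traffic you'll want a different field split.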
5. Service Status Monitor
This script ensures your critical services are running and can automatically restart them if they fail:
#!/bin/bash
# service_monitor.sh - Monitors critical services and restarts them if needed
# Configuration
EMAIL="admin@yourdomain.com"
LOG_FILE="/var/log/service-monitor.log"
# Define services to monitor and whether to restart them
# Format: "service_name:restart_flag (0 or 1)"
SERVICES=(
    "nginx:1"
    "mysql:1"
    "ssh:1"
    "apache2:1"
    "mongodb:0"  # Monitor but don't restart automatically
)
# Create log file if it doesn't exist
if [ ! -f "$LOG_FILE" ]; then
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
fi
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
send_alert() {
    log_message "$1"
    echo -e "$1" | mail -s "Service Alert: $(hostname)" "$EMAIL"
}
# Check each service
log_message "Checking service status"
FAILED_SERVICES=""
RESTARTED_SERVICES=""
for SERVICE_DEF in "${SERVICES[@]}"; do
    # Split definition into service name and restart flag
    SERVICE_NAME=$(echo "$SERVICE_DEF" | cut -d: -f1)
    RESTART_FLAG=$(echo "$SERVICE_DEF" | cut -d: -f2)
    # Check if the service is active
    if systemctl is-active --quiet "$SERVICE_NAME"; then
        log_message "$SERVICE_NAME is running"
    else
        log_message "$SERVICE_NAME is not running"
        FAILED_SERVICES="${FAILED_SERVICES}$SERVICE_NAME "
        if [ "$RESTART_FLAG" -eq 1 ]; then
            log_message "Attempting to restart $SERVICE_NAME"
            systemctl restart "$SERVICE_NAME"
            # Check whether the restart succeeded
            sleep 2
            if systemctl is-active --quiet "$SERVICE_NAME"; then
                log_message "$SERVICE_NAME successfully restarted"
                RESTARTED_SERVICES="${RESTARTED_SERVICES}$SERVICE_NAME "
            else
                log_message "Failed to restart $SERVICE_NAME"
            fi
        fi
    fi
done
# Send alerts if needed
if [ -n "$FAILED_SERVICES" ]; then
    ALERT_MESSAGE="SERVICE ALERT: The following services were detected as not running:\n$FAILED_SERVICES\n\n"
    if [ -n "$RESTARTED_SERVICES" ]; then
        ALERT_MESSAGE="${ALERT_MESSAGE}Successfully restarted services: $RESTARTED_SERVICES\n\n"
    fi
    # List services that are still down
    STILL_FAILED=""
    for SERVICE in $FAILED_SERVICES; do
        if ! systemctl is-active --quiet "$SERVICE"; then
            STILL_FAILED="${STILL_FAILED}$SERVICE "
        fi
    done
    if [ -n "$STILL_FAILED" ]; then
        ALERT_MESSAGE="${ALERT_MESSAGE}Services still not running: $STILL_FAILED\n\n"
        ALERT_MESSAGE="${ALERT_MESSAGE}Please check these services manually."
    fi
    send_alert "$ALERT_MESSAGE"
fi
How to Use This Script:
- Save it as service_monitor.sh
- Make it executable: chmod +x service_monitor.sh
- Edit the SERVICES array to include the services you want to monitor
- Set up a cron job to run it frequently:
# Run every 5 minutes
*/5 * * * * /path/to/service_monitor.sh
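The service:flag split uses cut; bash parameter expansion does the same without spawning extra processes, which matters little at this scale but is a handy idiom to know:

```shell
#!/bin/bash
# Split a "service:restart_flag" definition two equivalent ways.
SERVICE_DEF="nginx:1"

# cut, as used in the script above
NAME_CUT=$(echo "$SERVICE_DEF" | cut -d: -f1)
FLAG_CUT=$(echo "$SERVICE_DEF" | cut -d: -f2)

# pure-bash parameter expansion: strip from the first ":" onward / up to the last ":"
NAME_PE="${SERVICE_DEF%%:*}"
FLAG_PE="${SERVICE_DEF##*:}"

echo "$NAME_CUT $FLAG_CUT $NAME_PE $FLAG_PE"   # nginx 1 nginx 1
```

Either form works; parameter expansion is the more idiomatic choice inside a loop that runs every few minutes.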
Setting Up as a Comprehensive Monitoring System
These scripts work well independently, but they're even more powerful when used together as part of a comprehensive monitoring system. Here's how to set everything up:
Step 1: Create a Central Directory
sudo mkdir -p /opt/server-monitor/scripts
sudo mkdir -p /opt/server-monitor/logs
Step 2: Copy All Scripts
Save all five scripts to /opt/server-monitor/scripts/ and make them executable:
sudo chmod +x /opt/server-monitor/scripts/*.sh
Step 3: Create a Central Configuration File
Create /opt/server-monitor/config.sh:
#!/bin/bash
# Central configuration for all monitoring scripts
# Email settings
EMAIL_ADMIN="admin@yourdomain.com"
EMAIL_ALERTS="alerts@yourdomain.com"
# Thresholds
CPU_THRESHOLD=80
MEM_THRESHOLD=80
DISK_THRESHOLD=85
NET_MAX_CONNECTIONS=200
NET_MAX_PER_IP=50
# Log file locations
LOG_DIR="/opt/server-monitor/logs"
CPU_MEM_LOG="$LOG_DIR/cpu-mem-monitor.log"
DISK_LOG="$LOG_DIR/disk-monitor.log"
LOG_MONITOR_LOG="$LOG_DIR/log-monitor.log"
NETWORK_LOG="$LOG_DIR/network-monitor.log"
SERVICE_LOG="$LOG_DIR/service-monitor.log"
# Services to monitor (service:restart_flag)
SERVICES=(
    "nginx:1"
    "mysql:1"
    "ssh:1"
    "apache2:1"
    "mongodb:0"
)
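For the central config to pay off, each monitoring script should source it rather than carry its own hardcoded values. A sketch of that pattern, with defaults so the script still runs standalone if the config file is missing (the path and variable names follow the config above):

```shell
#!/bin/bash
# Source the central config if present, then fall back to defaults.
CONFIG="/opt/server-monitor/config.sh"
if [ -f "$CONFIG" ]; then
    # shellcheck source=/dev/null
    source "$CONFIG"
fi

# ":=" assigns the default only when the variable is unset or empty,
# so values from config.sh always win when it exists
: "${CPU_THRESHOLD:=80}"
: "${MEM_THRESHOLD:=80}"
echo "CPU threshold: $CPU_THRESHOLD, MEM threshold: $MEM_THRESHOLD"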
Step 4: Set Up a Main Wrapper Script
Create /opt/server-monitor/monitor.sh:
#!/bin/bash
# Main monitoring wrapper script
# Source configuration
source /opt/server-monitor/config.sh
# Create log directory if it doesn't exist
if [ ! -d "$LOG_DIR" ]; then
    mkdir -p "$LOG_DIR"
    chmod 755 "$LOG_DIR"
fi
# Run all monitoring scripts
/opt/server-monitor/scripts/cpu_mem_monitor.sh
/opt/server-monitor/scripts/disk_monitor.sh
/opt/server-monitor/scripts/log_monitor.sh
/opt/server-monitor/scripts/network_monitor.sh
/opt/server-monitor/scripts/service_monitor.sh
# Generate daily report (optional) — only on the first run of the 6 AM hour,
# since this wrapper fires every 10 minutes and would otherwise send six reports
if [ "$(date +%H)" = "06" ] && [ "$(date +%M)" -lt 10 ]; then
    REPORT="$LOG_DIR/daily_report.txt"
    echo "Daily System Health Report - $(date '+%Y-%m-%d')" > "$REPORT"
    echo "----------------------------------------" >> "$REPORT"
    echo "CPU/Memory Summary:" >> "$REPORT"
    grep "CPU:" "$CPU_MEM_LOG" | tail -24 >> "$REPORT"
    echo -e "\nDisk Usage:" >> "$REPORT"
    df -h >> "$REPORT"
    echo -e "\nService Status:" >> "$REPORT"
    for SERVICE_DEF in "${SERVICES[@]}"; do
        systemctl status "${SERVICE_DEF%%:*}" | head -3 >> "$REPORT"
    done
    # Send the daily report
    mail -s "Daily System Health Report - $(hostname)" "$EMAIL_ADMIN" < "$REPORT"
fi
Step 5: Set Up the Cron Jobs
sudo crontab -e
Add these lines:
# Run the main monitoring script every 10 minutes
*/10 * * * * /opt/server-monitor/monitor.sh
# Run disk check only once per day (to avoid excessive alerts)
0 7 * * * /opt/server-monitor/scripts/disk_monitor.sh
# Rotate logs weekly (truncate in place with truncate; a shell redirection like
# "> {}" would be intercepted by the shell before find runs and never empty the logs)
0 0 * * 0 find /opt/server-monitor/logs -type f -name "*.log" -exec cp {} {}.old \; -exec truncate -s 0 {} \;
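The find-based rotation keeps only one previous generation. If logrotate is installed (it usually is on Debian/Ubuntu), a drop-in config is more robust; a sketch, assuming the standard /etc/logrotate.d mechanism:

```
# /etc/logrotate.d/server-monitor (sketch)
# Rotate the monitor logs weekly, keep 4 generations, compress old ones
/opt/server-monitor/logs/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
```

With this in place you can drop the rotation line from the crontab entirely, since logrotate runs from the system's own daily cron job.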
Conclusion
These five scripts provide a solid foundation for monitoring your Linux servers without relying on complex external tools. You can customize them to fit your specific needs and environment, and they're particularly useful for:
- Small to medium-sized deployments
- Development and staging environments
- Personal projects where commercial monitoring is overkill
- Learning how server monitoring works under the hood
As you become more comfortable with these scripts, you can extend them to perform more sophisticated monitoring and even integrate them with other tools like Prometheus or Grafana for visualization.
Remember that while these scripts are powerful, they're not a complete replacement for enterprise monitoring solutions in large, critical production environments. However, they can complement those solutions by providing a lightweight, customizable alternative that you fully control.
Next Steps
- Add a simple web UI to display the monitoring data
- Set up SMS alerts for critical issues
- Create a central monitoring server that collects data from multiple servers
- Add graphing capabilities using tools like gnuplot
What other monitoring scripts would you find useful? Let me know in the comments!