Linux Fundamentals and Shell Scripting

Linux powers the vast majority of servers, cloud instances, and containers. Proficiency with the Linux command line is a non-negotiable skill for software engineers. This chapter covers the Linux filesystem, essential commands, shell scripting, service management, and system administration.

Linux Filesystem Hierarchy

The Filesystem Hierarchy Standard (FHS) defines the directory structure of Linux systems. Understanding it helps you find configuration files, logs, and binaries:

/
├── bin/         → Essential user binaries (ls, cp, cat) — symlink to /usr/bin on modern systems
├── boot/        → Kernel and bootloader files (vmlinuz, grub)
├── dev/         → Device files (everything is a file: /dev/sda, /dev/null, /dev/urandom)
├── etc/         → System-wide configuration files (/etc/nginx/, /etc/ssh/, /etc/hosts)
├── home/        → User home directories (/home/alice, /home/bob)
├── lib/         → Shared libraries for /bin and /sbin
├── mnt/         → Temporary mount points
├── opt/         → Optional/third-party software (/opt/myapp)
├── proc/        → Virtual filesystem — process and kernel info (/proc/cpuinfo, /proc/PID/)
├── root/        → Root user's home directory
├── run/         → Runtime data (PID files, sockets) — cleared on reboot
├── sbin/        → System binaries (iptables, fdisk) — symlink to /usr/sbin on modern systems
├── srv/         → Service data (web server files, FTP data)
├── sys/         → Virtual filesystem — kernel and device info (sysfs)
├── tmp/         → Temporary files (cleared on reboot or by tmpwatch)
├── usr/         → User programs and data (read-only after install)
│   ├── bin/     → User binaries
│   ├── lib/     → Libraries
│   ├── local/   → Locally installed software (/usr/local/bin)
│   └── share/   → Architecture-independent data (man pages, docs)
└── var/         → Variable data (changes during operation)
    ├── log/     → Log files (/var/log/syslog, /var/log/nginx/)
    ├── cache/   → Application caches
    ├── lib/     → State data (databases, package manager state)
    └── tmp/     → Temporary files preserved between reboots
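
The virtual filesystems under /proc and /sys can be explored directly. A small sketch, reading a few pseudo-files (note that /proc/cpuinfo field names such as "model name" vary by architecture):

```shell
# /proc files are generated by the kernel at read time, so they occupy no disk space
head -3 /proc/self/status             # status of the very process doing the reading
cat /proc/uptime                      # seconds since boot, seconds idle
grep -m1 'model name' /proc/cpuinfo   # CPU model (x86 field name; differs on ARM)
```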

Everything Is a File

In Linux, almost everything is represented as a file:

- Regular files: Text, binaries, images
- Directories: Special files containing references to other files
- Device files: /dev/sda (block device — disk), /dev/tty (character device — terminal)
- Pseudo-files: /proc/cpuinfo (kernel information exposed as readable files)
- Sockets: /var/run/docker.sock (inter-process communication)
- Named pipes (FIFOs): For inter-process data streaming
- Symbolic links: Pointers to other files

# Hard link: another name for the same inode (same data on disk)
# Cannot cross filesystems, cannot link directories
ln original.txt hardlink.txt
# Both original.txt and hardlink.txt point to the same inode
# Deleting original.txt doesn't affect hardlink.txt (data persists until all links are removed)

# Symbolic (soft) link: pointer to a path (like a shortcut)
# Can cross filesystems, can link directories
ln -s /var/log/nginx/access.log ~/nginx-access.log
# If the target is moved/deleted, the symlink becomes a "dangling" link

Inodes

Every file on a Linux filesystem has an inode — a data structure storing metadata (permissions, ownership, timestamps, disk block locations) but NOT the filename. Directory entries map filenames to inode numbers.

ls -i file.txt          # Show inode number
stat file.txt           # Show full inode information
df -i                   # Show inode usage per filesystem (running out of inodes is possible!)
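
To see inodes in action, a quick sketch in a throwaway temp directory: a hard link reuses the original's inode, and the data survives deleting the first name.

```shell
cd "$(mktemp -d)"
echo data > original.txt
ln original.txt hardlink.txt       # hard link: same inode
ln -s original.txt symlink.txt     # symlink: its own inode, points to a path

ls -i original.txt hardlink.txt    # identical inode numbers
stat -c '%h' original.txt          # link count is now 2
rm original.txt
cat hardlink.txt                   # "data" is still reachable via the other link
```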

Essential Linux Commands

File System Navigation and Management

# Navigation
pwd                          # Print working directory
ls -la                       # List files with details and hidden files
ls -lhS                      # List sorted by size (human-readable)
ls -lt                       # List sorted by modification time (newest first)
cd /path/to/dir              # Change directory
cd ~                         # Go to home directory
cd -                         # Go to previous directory
pushd /tmp && popd           # Push/pop directory stack

# File operations
cp source dest               # Copy file
cp -r source/ dest/          # Copy directory recursively
cp -a source/ dest/          # Copy preserving all attributes (archive mode)
mv source dest               # Move/rename
rm file                      # Remove file
rm -rf directory/            # Remove directory recursively (DANGEROUS!)
mkdir -p path/to/dir         # Create nested directories
touch file.txt               # Create empty file or update timestamp

# Viewing files
cat file.txt                 # Print entire file
less file.txt                # Paginated view (q to quit, /pattern to search)
head -n 20 file.txt          # First 20 lines
tail -n 20 file.txt          # Last 20 lines
tail -f /var/log/app.log     # Follow log file in real-time
tail -F /var/log/app.log     # Follow + retry if file is rotated

# File information
file document.pdf            # Determine file type
wc -l file.txt               # Count lines
wc -w file.txt               # Count words
du -sh directory/            # Directory size (human-readable)
du -sh * | sort -rh | head   # Largest items in current directory
df -h                        # Disk usage of mounted filesystems

# Finding files
find / -name "*.log" -type f              # Find by name
find / -name "*.log" -mtime -7            # Modified in last 7 days
find / -size +100M -type f                # Files larger than 100MB
find . -name "*.pyc" -delete              # Find and delete
find . -name "*.py" -exec grep -l "TODO" {} \;  # Find files containing "TODO"
find . -type f -newer reference.txt       # Files newer than reference.txt
find . -maxdepth 2 -name "*.js"           # Limit search depth

# locate: faster alternative (uses database, updated by updatedb)
locate "*.conf"                           # Find all .conf files (from database)
sudo updatedb                             # Update the locate database

# which/whereis: find binaries
which python3                # Path to the executable
whereis nginx                # Binary, source, and man page locations
type ls                      # Show how a command is resolved (alias, builtin, file)

Text Processing

# grep — search for patterns
grep "error" app.log                    # Lines containing "error"
grep -i "error" app.log                 # Case-insensitive
grep -r "TODO" src/                     # Recursive search in directory
grep -rn "function" src/                # Recursive with line numbers
grep -c "error" app.log                 # Count matches
grep -E "error|warning" app.log         # Extended regex (OR)
grep -v "debug" app.log                 # Invert match (exclude lines)
grep -A 3 "Exception" app.log           # Show 3 lines After match
grep -B 2 "Exception" app.log           # Show 2 lines Before match
grep -C 2 "Exception" app.log           # Show 2 lines of Context (before+after)
grep -P '\d{4}-\d{2}-\d{2}' app.log    # Perl-compatible regex (PCRE)
grep -l "TODO" *.py                     # List filenames with matches only
grep --include="*.py" -r "import" .     # Recursive but only in .py files

# ripgrep (rg) — modern, faster alternative to grep
rg "error" app.log                      # Basic search (respects .gitignore)
rg -t py "import"                       # Search only Python files
rg -C 3 "Exception"                     # Context lines
rg --json "pattern"                     # Machine-readable JSON output

# sed — stream editor (find and replace)
sed 's/old/new/' file.txt               # Replace first occurrence per line
sed 's/old/new/g' file.txt              # Replace all occurrences
sed -i 's/old/new/g' file.txt           # In-place edit (modifies file)
sed -i.bak 's/old/new/g' file.txt      # In-place with backup (.bak)
sed -n '10,20p' file.txt                # Print lines 10-20
sed '/^#/d' config.txt                  # Delete lines starting with #
sed '/^$/d' file.txt                    # Delete blank lines
sed '1i\HEADER LINE' file.txt           # Insert line at beginning

# awk — pattern scanning and processing
awk '{print $1, $3}' file.txt           # Print 1st and 3rd columns
awk -F',' '{print $2}' data.csv         # Use comma as delimiter
awk '/error/ {count++} END {print count}' log.txt  # Count error lines
awk '{sum += $1} END {print sum}' nums.txt          # Sum a column
awk '{print NR": "$0}' file.txt         # Add line numbers
awk 'NR==10,NR==20' file.txt            # Print lines 10-20
awk -F: '$3 >= 1000 {print $1}' /etc/passwd  # Users with UID >= 1000

# Real-world awk: parse Nginx access log for slow requests
awk '$NF > 1.0 {print $7, $NF"s"}' access.log  # Requests taking > 1 second
# $NF = last field (request time), $7 = URL path

# sort, uniq, cut
sort file.txt                           # Sort lines alphabetically
sort -n file.txt                        # Sort numerically
sort -k2 -t',' file.txt                # Sort by 2nd column (comma-delimited)
sort -rh file.txt                       # Reverse, human-readable numbers
uniq                                     # Remove adjacent duplicates
sort file.txt | uniq -c | sort -rn      # Frequency count (most common first)
cut -d',' -f1,3 data.csv               # Extract columns 1 and 3
cut -c1-10 file.txt                     # Extract first 10 characters per line
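
The `sort | uniq -c | sort -rn` idiom above is worth internalizing; a quick demonstration on inline data:

```shell
# Count occurrences, most frequent first (uniq -c prefixes each line with its count)
printf 'web2\nweb1\nweb2\nweb2\nweb1\n' | sort | uniq -c | sort -rn
```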

# Other text tools
tr 'a-z' 'A-Z' < file.txt              # Translate characters (uppercase)
tr -d '\r' < windows.txt > unix.txt     # Remove carriage returns
tr -s ' ' < file.txt                    # Squeeze repeated spaces
tee output.log                          # Read stdin, write to stdout AND file
diff file1.txt file2.txt                # Show differences between files
diff -u file1.txt file2.txt             # Unified diff format (for patches)
comm <(sort file1) <(sort file2)        # Compare sorted files (unique/common lines)

# jq — JSON processing
echo '{"name":"Alice","age":30}' | jq '.name'        # "Alice"
echo '{"users":[{"id":1},{"id":2}]}' | jq '.users[].id'  # 1, 2
cat data.json | jq '.[] | select(.status == "active")'     # Filter objects
cat data.json | jq '[.[] | {name: .name, email: .email}]'  # Transform

# Combining tools with pipes
cat access.log | grep "POST" | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
# → Top 10 IPs making POST requests

# Parse Apache/Nginx logs for 5xx errors by endpoint
grep " 5[0-9][0-9] " access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -20

Process Management

# Viewing processes
ps aux                       # All running processes
ps aux | grep python         # Filter for python processes
ps -ef --forest              # Process tree (shows parent-child relationships)
top                          # Interactive process viewer
htop                         # Better interactive viewer (if installed)
pgrep -la python             # Find processes by name

# Process control
kill PID                     # Send SIGTERM (graceful shutdown)
kill -9 PID                  # Send SIGKILL (force kill — last resort)
kill -HUP PID                # Send SIGHUP (reload config, common for Nginx)
kill -USR1 PID               # Send SIGUSR1 (application-defined, e.g., log rotation)
pkill -f "python app.py"     # Kill by command pattern
killall nginx                # Kill all processes by name

nohup command &              # Run command immune to hangups (survives logout)
jobs                         # List background jobs
fg %1                        # Bring job 1 to foreground
bg %1                        # Resume job 1 in background
disown %1                    # Detach job from terminal (survives terminal close)

Linux signals in detail:

| Signal  | Number | Default Action | Common Use |
|---------|--------|----------------|------------|
| SIGHUP  | 1  | Terminate | Reload config (Nginx, Apache), terminal disconnect |
| SIGINT  | 2  | Terminate | Ctrl+C — interrupt from keyboard |
| SIGQUIT | 3  | Core dump | Ctrl+\ — quit with core dump (for debugging) |
| SIGKILL | 9  | Terminate | Force kill — cannot be caught or ignored |
| SIGTERM | 15 | Terminate | Graceful shutdown — process can clean up |
| SIGUSR1 | 10 | Terminate | User-defined (app-specific: log rotation, debug toggle) |
| SIGUSR2 | 12 | Terminate | User-defined |
| SIGSTOP | 19 | Stop | Pause process — cannot be caught (Ctrl+Z sends SIGTSTP instead) |
| SIGCONT | 18 | Continue | Resume stopped process (bg, fg) |
| SIGCHLD | 17 | Ignore | Child process terminated (parent notification) |
| SIGPIPE | 13 | Terminate | Write to a broken pipe (common in pipeline failures) |

Process states (visible in ps and top):

| State      | Code | Description |
|------------|------|-------------|
| Running    | R | Actively executing on CPU |
| Sleeping   | S | Waiting for an event (I/O, signal) — interruptible |
| Disk Sleep | D | Waiting for I/O — uninterruptible (can't be killed) |
| Stopped    | T | Stopped by signal (SIGSTOP/SIGTSTP) |
| Zombie     | Z | Process finished but parent hasn't read its exit status (wait()) |

Zombie processes: A zombie is a process that has completed execution but still has an entry in the process table because its parent hasn't called wait(). Zombies consume no CPU or memory but consume a PID. A few zombies are normal; many indicate a buggy parent process. Fix: kill the parent process (the init system will adopt and reap the children).
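
A zombie can be reproduced for observation. In this sketch, `exec` replaces the parent shell with a `sleep` that never calls wait(), so the exited child lingers in state Z until the parent itself exits:

```shell
# The child (sleep 1) exits quickly; the parent, now `sleep 10`, never reaps it,
# so the child remains a zombie for roughly 9 seconds.
bash -c 'sleep 1 & exec sleep 10' &
sleep 2
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'    # the defunct child shows state Z
```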

# System information
uname -a                     # System info (kernel, architecture)
uname -r                     # Kernel version
uptime                       # How long the system has been running + load averages
free -h                      # Memory usage (total, used, free, buffers/cache)
lscpu                        # CPU information
cat /etc/os-release          # Distribution information
hostnamectl                  # Hostname and OS info

Networking Commands

# Connectivity
ping -c 5 host               # Send 5 ICMP echo requests
curl -v https://api.example.com/health  # HTTP request with verbose output
curl -s -o /dev/null -w "%{http_code}" https://example.com  # Just get status code
curl -X POST -H "Content-Type: application/json" \
  -d '{"key": "value"}' https://api.example.com/data       # POST JSON
curl -L https://example.com  # Follow redirects
curl --retry 3 --retry-delay 5 https://api.example.com     # Retry on failure
wget -q https://example.com/file.tar.gz   # Download file (quiet mode)
wget -r -l 2 https://example.com/docs/    # Recursive download (2 levels deep)

# DNS
nslookup example.com         # DNS lookup
dig example.com              # Detailed DNS lookup
dig +short example.com       # Just the IP
dig @8.8.8.8 example.com     # Query specific DNS server
host example.com             # Simple DNS lookup

# Network inspection
ss -tlnp                     # Show listening TCP ports (modern replacement for netstat)
    # -t: TCP, -l: listening, -n: numeric (don't resolve names), -p: show process
ss -tunap                    # All TCP/UDP connections with processes
netstat -tlnp                # Legacy: show listening TCP ports
lsof -i :8080                # What process is using port 8080
lsof -i -P -n               # All network connections
traceroute example.com       # Trace route to host (ICMP)
mtr example.com              # Combines ping and traceroute (live updates)
ip addr show                 # Show network interfaces and IPs
ip route show                # Show routing table

# Packet capture (requires root)
tcpdump -i eth0 port 80                    # Capture HTTP traffic
tcpdump -i any -A 'port 8080' | head -100  # Capture with ASCII output
tcpdump -w capture.pcap -i eth0            # Save to file (open in Wireshark)

# Testing connectivity
nc -zv host 443              # Test if port 443 is open (netcat)
nc -l 8080                   # Listen on port 8080 (simple server)
echo "hello" | nc host 8080  # Send data to a port

Permissions and Ownership

# Permission format: rwxrwxrwx (owner-group-others)
# r=4, w=2, x=1

chmod 755 script.sh          # rwxr-xr-x (owner: full, group/others: read+execute)
chmod 644 file.txt           # rw-r--r-- (owner: read+write, group/others: read)
chmod +x script.sh           # Add execute permission
chmod -R 755 directory/      # Recursive permission change
chmod u+s binary             # Set SUID — run as file owner (e.g., /usr/bin/passwd)
chmod g+s directory/         # Set SGID — new files inherit group
chmod +t /tmp                # Set sticky bit — only owner can delete files

chown user:group file.txt    # Change owner and group
chown -R user:group dir/     # Recursive ownership change

Special permissions:

| Permission        | On Files | On Directories |
|-------------------|----------|----------------|
| SUID (4xxx)       | Execute as file owner, not caller | No effect |
| SGID (2xxx)       | Execute as file group | New files inherit directory's group |
| Sticky bit (1xxx) | No effect | Only file owner can delete files (e.g., /tmp) |

# umask: default permission mask for new files
umask                        # Show current umask (usually 022)
umask 027                    # Set: new files get 750 (dirs) / 640 (files)
# New file permissions = 666 & ~umask (files), 777 & ~umask (directories)
# (simple subtraction gives the same answer for common umasks like 022 and 027)
# umask 022: files=644, dirs=755 (default)
# umask 027: files=640, dirs=750 (more restrictive)
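
The arithmetic can be verified in a throwaway directory (the subshell keeps the umask change from leaking into the current shell):

```shell
(
    cd "$(mktemp -d)"
    umask 027
    touch file && mkdir dir
    stat -c '%a %n' file dir    # 640 file / 750 dir
)
```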

# ACLs (Access Control Lists) — fine-grained permissions beyond owner/group/other
getfacl file.txt                           # View ACLs
setfacl -m u:alice:rw file.txt             # Give alice read+write
setfacl -m g:developers:rx directory/      # Give developers group read+execute
setfacl -R -m u:alice:rwx project/         # Recursive ACL
setfacl -b file.txt                        # Remove all ACLs

systemd and Service Management

systemd is the init system and service manager for modern Linux distributions. It manages the boot process, services, timers, mounts, and more.

Core Concepts

# Service management
systemctl start nginx             # Start a service
systemctl stop nginx              # Stop a service
systemctl restart nginx           # Restart (stop + start)
systemctl reload nginx            # Reload config without restart (graceful)
systemctl enable nginx            # Start on boot
systemctl disable nginx           # Don't start on boot
systemctl status nginx            # Show service status
systemctl is-active nginx         # Check if running (returns "active" or "inactive")
systemctl is-enabled nginx        # Check if enabled on boot

# List services
systemctl list-units --type=service              # All loaded services
systemctl list-units --type=service --state=running  # Running services
systemctl list-unit-files --type=service         # All installed services

# Failed services
systemctl --failed                # List failed services
systemctl reset-failed nginx      # Clear failed state

Writing Custom Service Files

# /etc/systemd/system/myapp.service
[Unit]
Description=My Application Server
Documentation=https://docs.myapp.com
After=network.target postgresql.service    # Start after these units
Requires=postgresql.service                # Hard dependency (fails if postgres fails)
Wants=redis.service                        # Soft dependency (starts but doesn't require)

[Service]
Type=notify                    # Service notifies systemd when ready
User=myapp                     # Run as this user (don't run as root!)
Group=myapp
WorkingDirectory=/opt/myapp
Environment="NODE_ENV=production"
EnvironmentFile=/opt/myapp/.env     # Load environment from file

ExecStartPre=/opt/myapp/migrate.sh  # Run before starting
ExecStart=/usr/bin/node /opt/myapp/server.js
ExecReload=/bin/kill -HUP $MAINPID  # How to reload

# Restart policy
Restart=on-failure                   # Restart on non-zero exit
RestartSec=5                         # Wait 5 seconds before restart
StartLimitBurst=5                    # Max 5 restarts...
StartLimitIntervalSec=300            # ...within 300 seconds

# Resource limits
LimitNOFILE=65535                    # Max open file descriptors
MemoryMax=512M                       # Memory limit (cgroup)
CPUQuota=200%                        # CPU limit (200% = 2 cores)

# Security hardening
NoNewPrivileges=yes
ProtectSystem=strict                 # Read-only filesystem (except specified)
ProtectHome=yes                      # No access to /home
ReadWritePaths=/opt/myapp/data /var/log/myapp
PrivateTmp=yes                       # Isolated /tmp

[Install]
WantedBy=multi-user.target           # Enable in multi-user mode (normal boot)

# After creating/modifying a service file:
sudo systemctl daemon-reload         # Reload systemd to pick up changes
sudo systemctl enable --now myapp    # Enable and start in one command

systemd Timers (Modern Cron)

# /etc/systemd/system/backup.timer
[Unit]
Description=Daily backup timer

[Timer]
OnCalendar=*-*-* 03:00:00       # Every day at 3 AM
# Other formats:
# OnCalendar=hourly              # Every hour
# OnCalendar=Mon *-*-* 09:00:00  # Every Monday at 9 AM
# OnCalendar=*-*-01 00:00:00     # First of every month
Persistent=true                  # Run immediately if missed (system was off)
RandomizedDelaySec=300           # Random delay up to 5 min (prevent thundering herd)

[Install]
WantedBy=timers.target

# /etc/systemd/system/backup.service
[Unit]
Description=Daily backup job

[Service]
Type=oneshot                     # Run once and exit
ExecStart=/opt/scripts/backup.sh
User=backup

systemctl enable --now backup.timer  # Enable and start the timer
systemctl list-timers               # Show all active timers
systemctl status backup.timer       # Timer status (next run time)
journalctl -u backup.service        # View backup job logs

journalctl (systemd Logs)

# Viewing logs
journalctl -u nginx                 # Logs for a specific service
journalctl -u nginx --since "1 hour ago"  # Time-filtered
journalctl -u nginx --since "2025-01-15" --until "2025-01-16"
journalctl -u nginx -f              # Follow (like tail -f)
journalctl -u nginx -n 50           # Last 50 lines
journalctl -p err                   # Only error-level and above
journalctl -p warning -u nginx      # Warnings and above for nginx
journalctl --no-pager               # Don't use pager (for scripting)
journalctl -o json-pretty -u nginx  # JSON output (for parsing)
journalctl --disk-usage             # How much disk logs are using
journalctl --vacuum-size=500M       # Trim logs to 500MB
journalctl -b                       # Logs from current boot
journalctl -b -1                    # Logs from previous boot

SSH and Remote Access

SSH Key Management

# Generate SSH key pair (Ed25519 is recommended — faster, smaller, more secure than RSA)
ssh-keygen -t ed25519 -C "alice@company.com"
# Creates: ~/.ssh/id_ed25519 (private) and ~/.ssh/id_ed25519.pub (public)

# For systems that don't support Ed25519:
ssh-keygen -t rsa -b 4096 -C "alice@company.com"

# Copy public key to remote server
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@server

# SSH agent (avoids typing passphrase repeatedly)
eval "$(ssh-agent -s)"          # Start agent
ssh-add ~/.ssh/id_ed25519       # Add key to agent
ssh-add -l                      # List loaded keys

SSH Config File

# ~/.ssh/config — Simplify SSH connections

# Jump through a bastion host
Host production-*
    User deploy
    ProxyJump bastion

Host bastion
    HostName bastion.example.com
    User admin
    IdentityFile ~/.ssh/id_ed25519
    ForwardAgent yes

Host production-web
    HostName 10.0.1.10

Host production-db
    HostName 10.0.2.10

# Default settings for all hosts
Host *
    ServerAliveInterval 60        # Send keepalive every 60 seconds
    ServerAliveCountMax 3         # Disconnect after 3 missed keepalives
    AddKeysToAgent yes            # Auto-add keys to ssh-agent
    IdentitiesOnly yes            # Only use specified identity files
    StrictHostKeyChecking ask     # Prompt on new host keys

# Now you can simply:
ssh production-web               # Instead of: ssh -i key -J bastion deploy@10.0.1.10

Port Forwarding (Tunneling)

# Local port forwarding: access remote service through local port
ssh -L 5432:db-server:5432 bastion
# Now localhost:5432 connects to db-server:5432 through bastion
# Use case: Connect to a database that's only accessible from within the VPC

# Remote port forwarding: expose local service to remote
ssh -R 8080:localhost:3000 server
# server:8080 now forwards to your localhost:3000
# Use case: Show your local development server to someone on the remote server

# Dynamic port forwarding (SOCKS proxy)
ssh -D 1080 server
# Creates a SOCKS proxy on localhost:1080 — all traffic through the tunnel
# Use case: Browse the web as if you were on the remote server

# File transfer
scp file.txt user@server:/path/        # Copy file to remote
scp user@server:/path/file.txt .       # Copy file from remote
scp -r directory/ user@server:/path/   # Copy directory recursively

# rsync: better than scp (incremental, resume, compression)
rsync -avz --progress local/ user@server:/path/
# -a: archive (preserves permissions, timestamps, symlinks)
# -v: verbose
# -z: compress during transfer
# --progress: show progress
# --delete: delete files on destination that don't exist on source (mirror)
rsync -avz --exclude='node_modules' --exclude='.git' . server:/deploy/

SSH Security Hardening

# /etc/ssh/sshd_config — Key security settings

PermitRootLogin no               # Never allow root SSH login
PasswordAuthentication no         # Disable password auth (keys only)
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
MaxAuthTries 3
LoginGraceTime 30                # 30 seconds to authenticate
AllowUsers deploy admin          # Whitelist specific users
ClientAliveInterval 300          # Disconnect idle sessions after 5 min
ClientAliveCountMax 2
X11Forwarding no
PermitEmptyPasswords no

Shell Scripting (Bash)

Script Safety

#!/bin/bash
set -euo pipefail  # The holy trinity of bash safety

# set -e: Exit immediately if a command exits with non-zero status
# Without -e: script continues after errors, potentially causing damage

# set -u: Treat unset variables as errors
# Without -u: $UNDEFINED silently expands to empty string

# set -o pipefail: Pipeline fails if ANY command in the pipe fails
# Without pipefail: only the last command's exit code matters
# Example: `false | true` exits 0 without pipefail, exits 1 with it
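
The pipefail difference is easy to observe with two subshells:

```shell
bash -c 'false | true; echo "exit=$?"'                   # exit=0 (only the last command counts)
bash -c 'set -o pipefail; false | true; echo "exit=$?"'  # exit=1 (the failing command is reported)
```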

Trap for Cleanup

#!/bin/bash
set -euo pipefail

TMPDIR=$(mktemp -d)

# Cleanup function — runs on EXIT (normal or error)
cleanup() {
    echo "Cleaning up temporary files..."
    rm -rf "$TMPDIR"
}
trap cleanup EXIT    # Always runs, even if script fails

# trap can catch specific signals:
trap 'echo "Caught SIGINT"; exit 1' INT        # Ctrl+C
trap 'echo "Caught SIGTERM"; exit 1' TERM       # kill command
trap 'echo "Error on line $LINENO"; exit 1' ERR # Fires on any failing command

# Now use $TMPDIR safely — cleanup happens automatically
cp important_file "$TMPDIR/"
process_file "$TMPDIR/important_file"

Argument Parsing with getopts

#!/bin/bash
set -euo pipefail

usage() {
    echo "Usage: $0 [-v] [-n count] [-o output_file] input_file"
    echo "  -v          Verbose mode"
    echo "  -n count    Number of iterations (default: 1)"
    echo "  -o file     Output file (default: stdout)"
    echo "  -h          Show this help"
    exit 1
}

VERBOSE=false
COUNT=1
OUTPUT="/dev/stdout"

while getopts "vn:o:h" opt; do
    case $opt in
        v) VERBOSE=true ;;
        n) COUNT="$OPTARG" ;;
        o) OUTPUT="$OPTARG" ;;
        h) usage ;;
        *) usage ;;
    esac
done

# Shift past parsed options to get positional arguments
shift $((OPTIND - 1))

# Validate required arguments
if [[ $# -lt 1 ]]; then
    echo "Error: input_file is required" >&2
    usage
fi

INPUT_FILE="$1"

if [[ "$VERBOSE" == true ]]; then
    echo "Processing $INPUT_FILE ($COUNT iterations) → $OUTPUT"
fi

I/O Redirection Deep Dive

# File descriptors: 0=stdin, 1=stdout, 2=stderr

command > file.txt       # Redirect stdout to file (overwrite)
command >> file.txt      # Redirect stdout to file (append)
command 2> error.log     # Redirect stderr to file
command > out.log 2>&1   # Redirect both stdout and stderr to same file
command &> all.log       # Shorthand: redirect both stdout and stderr
command 2>/dev/null      # Discard stderr
command > /dev/null 2>&1 # Discard all output
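
One classic gotcha: redirections are processed left to right, so `2>&1 > file` is not the same as `> file 2>&1`. A quick sketch:

```shell
tmp=$(mktemp)

# Both streams into the file: stdout is redirected first, then stderr follows it
bash -c 'echo out; echo err >&2' > "$tmp" 2>&1
grep -c . "$tmp"     # 2 lines captured

# Reversed order: 2>&1 duplicates stderr onto the OLD stdout (the terminal)
# BEFORE stdout is redirected, so "err" still appears on screen
bash -c 'echo out; echo err >&2' 2>&1 > "$tmp"
```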

# Process substitution: treat command output as a file
diff <(sort file1.txt) <(sort file2.txt)    # Compare sorted versions
# <(command) creates a temporary file descriptor containing command's output

# Here documents (multi-line input)
cat << 'EOF' > config.yaml
database:
  host: localhost
  port: 5432
  name: myapp
EOF
# Single quotes around EOF prevent variable expansion inside

# Here strings (single-line input)
grep "error" <<< "$log_output"
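
Whether the heredoc delimiter is quoted controls variable expansion; a quick comparison:

```shell
NAME=world

cat << EOF          # unquoted delimiter: variables expand
greeting: $NAME
EOF

cat << 'EOF'        # quoted delimiter: contents are taken literally
greeting: $NAME
EOF
# The first cat prints "greeting: world"; the second prints "greeting: $NAME" verbatim.
```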

# Named pipes (FIFOs) — for inter-process communication
mkfifo /tmp/mypipe
producer_command > /tmp/mypipe &   # Producer writes to pipe (blocks until reader)
consumer_command < /tmp/mypipe     # Consumer reads from pipe
rm /tmp/mypipe

Practical Deployment Script

#!/bin/bash
# Script: deploy.sh — Production deployment with safety features

set -euo pipefail

# --- Variables ---
APP_NAME="myapp"
DEPLOY_DIR="/opt/${APP_NAME}"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/opt/backups/${APP_NAME}/${TIMESTAMP}"
LOG_FILE="/var/log/${APP_NAME}/deploy.log"
HEALTH_URL="http://localhost:8080/health"
MAX_HEALTH_RETRIES=30

# --- Functions ---
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

rollback() {
    log "ERROR: Deployment failed. Rolling back..."
    if [[ -d "$BACKUP_DIR" ]]; then
        rm -rf "$DEPLOY_DIR"
        mv "$BACKUP_DIR" "$DEPLOY_DIR"
        log "Rollback complete."
    else
        log "ERROR: No backup found. Manual intervention required."
    fi
    exit 1
}

# --- Set trap for rollback on failure ---
trap rollback ERR

# --- Main deployment ---
log "Starting deployment of ${APP_NAME}..."

# 1. Check prerequisites
for cmd in docker curl; do
    if ! command -v "$cmd" &> /dev/null; then
        log "ERROR: $cmd is not installed"
        exit 1
    fi
done

# 2. Backup current deployment
log "Backing up current deployment..."
mkdir -p "$BACKUP_DIR"
cp -r "$DEPLOY_DIR"/* "$BACKUP_DIR"/ 2>/dev/null || true

# 3. Pull latest image
log "Pulling latest Docker image..."
docker pull "registry.example.com/${APP_NAME}:latest"

# 4. Run database migrations
log "Running database migrations..."
docker run --rm \
    --env-file "${DEPLOY_DIR}/.env" \
    "registry.example.com/${APP_NAME}:latest" \
    python manage.py migrate

# 5. Restart services
log "Restarting services..."
docker compose -f "${DEPLOY_DIR}/docker-compose.yml" up -d

# 6. Health check with retry
log "Running health check..."
for i in $(seq 1 "$MAX_HEALTH_RETRIES"); do
    if curl -sf "$HEALTH_URL" > /dev/null 2>&1; then
        log "Health check passed on attempt $i!"
        break
    fi
    if [[ "$i" -eq "$MAX_HEALTH_RETRIES" ]]; then
        log "ERROR: Health check failed after $MAX_HEALTH_RETRIES attempts"
        exit 1  # Will trigger rollback via trap
    fi
    log "Health check attempt $i/$MAX_HEALTH_RETRIES failed, retrying in 2s..."
    sleep 2
done

log "Deployment of ${APP_NAME} complete!"

Key Bash Constructs Reference

# Conditionals
if [[ -f "file.txt" ]]; then
    echo "File exists"
elif [[ -d "directory" ]]; then
    echo "Directory exists"
else
    echo "Neither exists"
fi

# Common test operators:
# -f file    → file exists and is a regular file
# -d dir     → directory exists
# -e path    → path exists (file or directory)
# -r file    → file is readable
# -w file    → file is writable
# -x file    → file is executable
# -s file    → file is not empty
# -z "$var"  → variable is empty
# -n "$var"  → variable is not empty
# -eq, -ne, -lt, -gt, -le, -ge → numeric comparisons
# ==, != → string comparisons (inside [[ ]])
# =~ → regex match (inside [[ ]])

# Regex matching
if [[ "$input" =~ ^[0-9]+$ ]]; then
    echo "$input is a number"
fi

# Loops
for file in *.log; do
    echo "Processing $file"
    gzip "$file"
done

for i in {1..10}; do
    echo "Iteration $i"
done

for server in web{1..5}; do
    ssh "$server" "sudo systemctl restart nginx"
done

while read -r line; do
    echo "Line: $line"
done < input.txt

# Read CSV with custom delimiter
while IFS=',' read -r name email role; do
    echo "User: $name, Email: $email, Role: $role"
done < users.csv
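One pitfall with `while read`: piping into the loop (`cmd | while read ...`) runs the loop in a subshell, so variables set inside it are lost when the loop ends. Process substitution keeps the loop in the current shell. A quick sketch:

```shell
# Count lines from a command's output; the loop runs in the current shell,
# so $count is still visible after the loop.
count=0
while read -r line; do
    count=$((count + 1))
done < <(printf 'a\nb\nc\n')
echo "Counted $count lines"    # prints: Counted 3 lines
```

With `printf ... | while read ...` instead, `count` would still be 0 after the loop.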

# Arrays
declare -a servers=("web1" "web2" "web3")
echo "${servers[0]}"              # First element
echo "${servers[@]}"              # All elements
echo "${#servers[@]}"             # Array length
servers+=("web4")                 # Append
for server in "${servers[@]}"; do
    ssh "$server" "sudo systemctl restart nginx"
done

# Associative arrays (bash 4+)
declare -A colors
colors[red]="#FF0000"
colors[green]="#00FF00"
echo "${colors[red]}"
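To iterate over an associative array, expand the keys with `${!array[@]}` (note the `!`):

```shell
declare -A colors=([red]="#FF0000" [green]="#00FF00")
for key in "${!colors[@]}"; do      # ! expands keys; without it you get values
    echo "$key -> ${colors[$key]}"
done
```

Key order is not guaranteed in bash associative arrays, so don't rely on it.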

# String operations
name="Hello-World"
echo "${name^^}"           # HELLO-WORLD (uppercase)
echo "${name,,}"           # hello-world (lowercase)
echo "${name/Hello/Hi}"    # Hi-World (substitution)
echo "${name:0:5}"         # Hello (substring)
echo "${#name}"            # 11 (length)

file="archive.tar.gz"
echo "${file%.gz}"         # archive.tar (remove shortest suffix match)
echo "${file%%.*}"         # archive (remove longest suffix match)
echo "${file##*.}"         # gz (remove longest prefix match, i.e. the extension)

# Default values
echo "${UNDEFINED_VAR:-default}"     # Use default if unset
echo "${UNDEFINED_VAR:=default}"     # Set AND use default if unset
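Two related expansions worth knowing: `:?` aborts with an error when a variable is unset (handy for required configuration), and `:+` substitutes only when the variable IS set:

```shell
# :? → abort with a message if unset (common guard for required env vars):
# DB_HOST="${DB_HOST:?DB_HOST must be set}"

# :+ → use the alternate value only if the variable is set
VERBOSE=1
echo "${VERBOSE:+--verbose}"    # prints --verbose, because VERBOSE is set
unset VERBOSE
echo "${VERBOSE:+--verbose}"    # prints an empty line
```

The `:+` form is useful for building optional command-line flags from environment variables.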

Cron and Scheduled Tasks

Crontab Syntax

# ┌───────────── minute (0-59)
# │ ┌───────────── hour (0-23)
# │ │ ┌───────────── day of month (1-31)
# │ │ │ ┌───────────── month (1-12)
# │ │ │ │ ┌───────────── day of week (0-7, 0 and 7 = Sunday)
# │ │ │ │ │
# * * * * * command
# Edit crontab
crontab -e                    # Edit current user's crontab
crontab -l                    # List current user's crontab
sudo crontab -u alice -e      # Edit alice's crontab

# Common patterns
0 * * * *     /opt/scripts/hourly.sh          # Every hour on the hour
0 3 * * *     /opt/scripts/daily-backup.sh    # Daily at 3 AM
0 0 * * 0     /opt/scripts/weekly-cleanup.sh  # Weekly on Sunday midnight
0 0 1 * *     /opt/scripts/monthly-report.sh  # Monthly on the 1st
*/5 * * * *   /opt/scripts/check-health.sh    # Every 5 minutes
0 9-17 * * 1-5 /opt/scripts/business-hours.sh # Hourly during business hours, weekdays

# CRITICAL: Cron gotchas
# 1. PATH: Cron has a minimal PATH. Always use full paths:
0 * * * *  /usr/bin/python3 /opt/scripts/process.py

# 2. Environment: Cron doesn't source .bashrc. Set variables explicitly:
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=ops@company.com

# 3. Output: Capture output or it goes to email/nowhere:
0 * * * *  /opt/scripts/backup.sh >> /var/log/backup.log 2>&1

# 4. Locking: Prevent overlapping runs:
*/5 * * * *  flock -n /tmp/myjob.lock /opt/scripts/slow-job.sh
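The four gotchas above can be combined into a single wrapper script that the crontab entry calls. A sketch (the paths and filenames are illustrative):

```shell
#!/usr/bin/env bash
# cron wrapper: explicit PATH, logging, and flock-based locking in one place
set -euo pipefail
export PATH=/usr/local/bin:/usr/bin:/bin

LOCKFILE=/tmp/myjob.lock
LOGFILE=/tmp/myjob.log

{
    # Fail fast if a previous run still holds the lock (held on fd 9)
    flock -n 9 || { echo "$(date '+%F %T') previous run still active, skipping"; exit 0; }
    echo "$(date '+%F %T') job started"
    # ... actual job commands go here ...
    echo "$(date '+%F %T') job finished"
} 9>"$LOCKFILE" >>"$LOGFILE" 2>&1
```

The crontab line then reduces to `*/5 * * * * /opt/scripts/cron-wrapper.sh`, with the locking, logging, and PATH handling kept in one reviewable file.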

Disk and Storage Management

# Disk usage
df -h                        # Filesystem usage (human-readable)
df -ih                       # Inode usage (running out of inodes is possible!)
du -sh /var/log/*            # Size of each item in /var/log
du -h --max-depth=1 /        # Size of top-level directories (-s conflicts with --max-depth)
ncdu /                       # Interactive disk usage explorer (if installed)

# Disk information
lsblk                        # List block devices (disks, partitions)
fdisk -l                     # List all partitions
blkid                        # Show block device UUIDs and types

# Mount/unmount
mount /dev/sdb1 /mnt/data    # Mount a partition
umount /mnt/data             # Unmount
mount -o remount,rw /        # Remount root as read-write

# fstab: permanent mounts
# /etc/fstab format: device  mountpoint  type  options  dump  pass
# UUID=abc-123  /data  ext4  defaults,noatime  0  2

# Filesystem operations
mkfs.ext4 /dev/sdb1          # Create ext4 filesystem
fsck /dev/sdb1               # Filesystem check (unmount first!)
tune2fs -l /dev/sda1         # Show filesystem parameters

RAID levels overview:

Level    Description                       Min Disks  Capacity  Fault Tolerance    Use Case
RAID 0   Striping (speed, no redundancy)   2          100%      None               Temporary data, scratch space
RAID 1   Mirroring (redundancy)            2          50%       1 disk failure     Boot drives, critical data
RAID 5   Striping + parity                 3          (N-1)/N   1 disk failure     General purpose storage
RAID 6   Striping + double parity          4          (N-2)/N   2 disk failures    Large arrays
RAID 10  Mirrored stripes                  4          50%       1 per mirror pair  Databases (performance + safety)
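The Capacity column follows a simple formula per level. A small helper to sanity-check it (a sketch; assumes N equal-size disks, and the function name is illustrative):

```shell
# Usable capacity, in disks' worth, for N equal-size disks at a given RAID level
raid_capacity() {
    local level=$1 n=$2
    case "$level" in
        0)  echo "$n" ;;            # striping: every disk usable
        1)  echo 1 ;;               # mirroring: one copy's worth
        5)  echo $((n - 1)) ;;      # one disk's worth lost to parity
        6)  echo $((n - 2)) ;;      # two disks' worth lost to parity
        10) echo $((n / 2)) ;;      # half lost to mirroring
        *)  return 1 ;;
    esac
}

raid_capacity 5 4    # prints 3: a 4-disk RAID 5 yields 3 disks of capacity
```

Multiply the result by the disk size to get usable bytes, e.g. four 4 TB disks in RAID 5 give roughly 12 TB usable.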

Environment and Shell Configuration

Shell Startup Sequence

# Login shell (ssh, console login):
#   /etc/profile, then the FIRST one found of: ~/.bash_profile, ~/.bash_login, ~/.profile

# Non-login interactive shell (opening a terminal in GUI):
#   /etc/bash.bashrc → ~/.bashrc

# Non-interactive shell (scripts):
#   $BASH_ENV (if set)

# Best practice: Put configuration in ~/.bashrc and source it from ~/.bash_profile:
# ~/.bash_profile:
if [ -f ~/.bashrc ]; then
    source ~/.bashrc
fi

Useful .bashrc Configuration

# ~/.bashrc

# --- Aliases ---
alias ll='ls -lah'
alias la='ls -A'
alias grep='grep --color=auto'
alias ..='cd ..'
alias ...='cd ../..'
alias k='kubectl'
alias g='git'
alias dc='docker compose'

# --- Functions ---
# Create and cd into directory
mkcd() { mkdir -p "$1" && cd "$1"; }

# Quick HTTP server
serve() { python3 -m http.server "${1:-8000}"; }

# Extract any archive
extract() {
    case "$1" in
        *.tar.bz2) tar xjf "$1" ;;
        *.tar.gz)  tar xzf "$1" ;;
        *.tar.xz)  tar xJf "$1" ;;
        *.bz2)     bunzip2 "$1" ;;
        *.gz)      gunzip "$1" ;;
        *.tar)     tar xf "$1" ;;
        *.zip)     unzip "$1" ;;
        *)         echo "Unknown format: $1" ;;
    esac
}

# --- PATH ---
export PATH="$HOME/.local/bin:$HOME/go/bin:$PATH"

# --- Prompt ---
export PS1='\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '

# --- History ---
export HISTSIZE=10000
export HISTFILESIZE=20000
export HISTCONTROL=ignoreboth:erasedups  # Skip dupes and space-prefixed commands
shopt -s histappend                       # Append to history, don't overwrite

Log Management

System Logs

Log file                                             Contents
/var/log/syslog (Debian) / /var/log/messages (RHEL)  General system messages
/var/log/auth.log / /var/log/secure                  Authentication logs (SSH, sudo)
/var/log/kern.log                                    Kernel messages
/var/log/nginx/access.log                            Nginx access logs
/var/log/nginx/error.log                             Nginx error logs
/var/log/apt/ / /var/log/yum.log                     Package manager logs
/var/log/dmesg                                       Kernel ring buffer (hardware, drivers)

Log Rotation

# /etc/logrotate.d/myapp — logrotate configuration
/var/log/myapp/*.log {
    daily                    # Rotate daily
    missingok                # Don't error if log is missing
    rotate 14                # Keep 14 rotated files
    compress                 # gzip old logs
    delaycompress            # Don't compress the most recent rotated file
    notifempty               # Don't rotate if empty
    create 0640 myapp myapp  # Create new log with these permissions
    sharedscripts            # Run postrotate only once (not per file)
    postrotate
        systemctl reload myapp  # Notify app to reopen log files
    endscript
}

System Monitoring and Troubleshooting

# CPU and load
top                          # Interactive process viewer
htop                         # Better interactive viewer
uptime                       # Load averages (1, 5, 15 minutes)
mpstat -P ALL 1              # Per-CPU utilization every second
vmstat 1                     # System overview (memory, swap, I/O, CPU)

# Memory
free -h                      # Memory overview
vmstat -s                    # Detailed memory stats
cat /proc/meminfo            # Raw memory information
slabtop                      # Kernel memory cache usage

# Disk I/O
iostat -xz 1                 # Disk I/O stats (extended, skip zeros, every 1 sec)
iotop                        # Top for disk I/O (see which process is doing I/O)

# System calls
strace -p PID                # Trace system calls of running process
strace -e trace=open,read command    # Trace specific syscalls
strace -c command            # Count system calls (summary)
ltrace command               # Trace library calls

# Open files and connections
lsof -p PID                  # All files opened by process
lsof -i :8080                # What's using port 8080
lsof -u alice                # All files opened by user alice

# Kernel messages
dmesg                        # Kernel ring buffer
dmesg -T                     # With human-readable timestamps
dmesg --level=err,warn       # Only errors and warnings
dmesg -w                     # Follow (like tail -f)

# /proc and /sys pseudo-filesystems
cat /proc/cpuinfo            # CPU details
cat /proc/meminfo            # Memory details
cat /proc/loadavg            # Load averages
cat /proc/PID/status         # Process details
ls -l /proc/PID/fd/          # File descriptors for a process (a directory, so ls not cat)
cat /proc/net/tcp            # Active TCP connections
echo 3 > /proc/sys/vm/drop_caches   # Drop filesystem caches (root only)
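As an example of reading /proc programmatically, a short check that compares the 1-minute load average against the CPU count (a sketch; the warning threshold of "load exceeds CPU count" is a common rule of thumb, not a hard limit):

```shell
# /proc/loadavg looks like "0.42 0.30 0.25 1/234 5678" — first field is the 1-min average
read -r load1 _ < /proc/loadavg
cpus=$(nproc)

# awk handles the floating-point comparison that bash can't do natively
if awk -v l="$load1" -v c="$cpus" 'BEGIN { exit !(l > c) }'; then
    echo "WARNING: 1-min load $load1 exceeds $cpus CPUs"
else
    echo "OK: 1-min load $load1 on $cpus CPUs"
fi
```

The same pattern (read a /proc file, parse with `read` or awk) works for /proc/meminfo, /proc/PID/status, and friends.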

Useful Shell One-Liners

# Find large files (> 100MB)
find / -type f -size +100M 2>/dev/null | head -20

# Find files modified in last 24 hours
find /var/log -mmin -1440 -type f

# Watch a command output (refresh every 2 seconds)
watch -n 2 'docker ps'

# Parallel execution with xargs (ssh -n stops ssh from consuming the remaining input)
xargs -P 4 -I {} ssh -n {} "uptime" < servers.txt

# Quick HTTP server (Python)
python3 -m http.server 8000

# JSON processing with jq
curl -s https://api.github.com/users/octocat | jq '.name, .location'

# Monitor disk space
while true; do df -h / | tail -1; sleep 60; done

# Find and replace in multiple files
find . -name "*.py" -exec sed -i 's/old_function/new_function/g' {} +

# Count lines of Python code, largest files first (-print0/-0 handles spaces in names)
find . -name "*.py" -print0 | xargs -0 wc -l | sort -rn | head -20

# Show TCP connections by state
ss -tan | awk '{print $1}' | sort | uniq -c | sort -rn

# Kill all processes matching a pattern
pgrep -f "zombie_process" | xargs -r kill -9   # -r skips kill if nothing matched; pkill -9 -f is equivalent

# Monitor a log file for errors and alert
tail -F /var/log/app.log | grep --line-buffered "ERROR" | while read -r line; do
    echo "$line" | mail -s "Error Alert" ops@company.com
done

# Create a compressed archive excluding certain directories
tar czf backup.tar.gz --exclude='node_modules' --exclude='.git' ./project/

# Quick benchmarking (average response time)
for i in {1..100}; do
    curl -s -o /dev/null -w "%{time_total}\n" https://api.example.com/health
done | awk '{sum+=$1} END {print "avg: " sum/NR "s"}'

Security Hardening Basics

# Firewall (ufw — simple frontend for iptables)
sudo ufw enable
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh                    # Allow SSH (port 22)
sudo ufw allow 80/tcp                 # Allow HTTP
sudo ufw allow 443/tcp                # Allow HTTPS
sudo ufw allow from 10.0.0.0/8 to any port 5432  # PostgreSQL from private network only
sudo ufw status verbose               # Show current rules

# fail2ban — ban IPs with too many failed login attempts
sudo apt install fail2ban
# /etc/fail2ban/jail.local:
# [sshd]
# enabled = true
# maxretry = 3
# bantime = 3600      # Ban for 1 hour
# findtime = 600      # Within 10 minutes

# Automatic security updates (Debian/Ubuntu)
sudo apt install unattended-upgrades
sudo dpkg-reconfigure unattended-upgrades

# User management
sudo useradd -m -s /bin/bash alice    # Create user with home dir and bash shell
sudo passwd alice                      # Set password
sudo usermod -aG docker alice          # Add user to group
sudo userdel -r alice                  # Delete user and home directory
sudo passwd -l alice                   # Lock account (disable login)

# Audit who logged in
last                                   # Recent logins
lastb                                  # Failed login attempts
who                                    # Currently logged in users
w                                      # Who is logged in and what they're doing