Website Downtime Troubleshooting: Find the Root Cause Fast

A website outage costs money and erodes user trust. When a site goes down, the first priority is diagnosing the root cause as fast as possible. Is it DNS? SSL? The server? The network? Each failure type has distinct symptoms and a different fix. This guide provides a systematic diagnostic approach.

Step 1: Is It Down for Everyone or Just You?

First, determine whether the outage is global or local to you:

Use the ShowDNS Ping Tool to check whether the server responds from multiple global locations.
Ask a colleague on a different network to try accessing the site.
Try from your mobile phone on cellular data (bypasses your home/office network).

If the site loads on mobile data but not on your network, the problem is your network or local DNS — not the website.

Step 2: Check DNS Resolution

DNS failures prevent any connection from being made:

bash

# Check if the domain resolves at all
nslookup example.com

# Test with multiple resolvers
nslookup example.com 8.8.8.8   # Google
nslookup example.com 1.1.1.1   # Cloudflare

# Use dig for more detail
dig example.com A +short

DNS Result	Likely Cause	Next Step
NXDOMAIN (domain not found)	Domain expired, DNS misconfiguration, or record deleted	Check domain expiry; check A/CNAME records
SERVFAIL	Nameserver error, DNSSEC failure, or all NS unreachable	Check nameserver health; check DNSSEC
Returns wrong IP	DNS hijacking, stale cache, or misconfigured record	Flush DNS cache; verify A record at authoritative NS
Resolves correctly	DNS is fine — problem is at the server	Proceed to Step 3

Use the ShowDNS DNS Lookup tool to check DNS from multiple global locations simultaneously.
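
The table above can be turned into a small helper. This is a minimal sketch, assuming `dig` is installed; the function names `dig_status` and `dns_next_step` are illustrative and not part of any tool. It reads the `status:` field from the header that dig prints and echoes the matching next step.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: the function names are illustrative only.

# Extract the response status (NOERROR, NXDOMAIN, SERVFAIL, ...) from
# dig output supplied on stdin.
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Map a status to the next step suggested in the table above.
dns_next_step() {
  case "$1" in
    NXDOMAIN) echo "Check domain expiry; check A/CNAME records" ;;
    SERVFAIL) echo "Check nameserver health; check DNSSEC" ;;
    NOERROR)  echo "Verify the returned IP; if correct, check the server" ;;
    *)        echo "Unrecognized status: $1" ;;
  esac
}

# Usage (requires network access):
#   dns_next_step "$(dig example.com A +noall +comments | dig_status)"
```

An empty status usually means dig got no response at all (for example, a timeout), which the sketch reports as unrecognized.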

Step 3: Check Server Connectivity

If DNS resolves but the site doesn't load, test if the server itself is reachable:

bash

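# Get the server IP (first IPv4 address; skips any CNAME lines in the output)
SERVER_IP=$(dig example.com A +short | grep -E '^[0-9.]+$' | head -n 1)

# Ping the server
ping -c 4 "$SERVER_IP"

# Check if port 80 (HTTP) is open
nc -zv "$SERVER_IP" 80

# Check if port 443 (HTTPS) is open
nc -zv "$SERVER_IP" 443

# Try a direct request, pinning the hostname to the server IP so the
# correct Host header and TLS server name (SNI) are sent
curl -v --resolve "example.com:80:$SERVER_IP" http://example.com/
curl -v --resolve "example.com:443:$SERVER_IP" https://example.com/ --insecure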

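If ping succeeds but port 80/443 is closed: the web server is probably not running or not listening on that port, or a firewall is blocking it. Log in to the server and check.

If ping fails: the server may be unreachable, but some hosts and firewalls drop ICMP, so a failed ping alone is not conclusive. If the port checks fail as well, check firewall rules, server status, and the hosting provider's status page.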
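
The decision logic above can be sketched as two small functions. This assumes bash built with `/dev/tcp` support (the default on most distributions) and the coreutils `timeout` command; the names `port_open` and `interpret_connectivity` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and messages are illustrative.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so nc is not required.
port_open() {
  local host=$1 port=$2
  timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Turn two yes/no observations into a likely diagnosis.
interpret_connectivity() {
  local ping_ok=$1 port_ok=$2
  if [ "$port_ok" = yes ]; then
    echo "Port reachable: check the web server and application logs"
  elif [ "$ping_ok" = yes ]; then
    echo "Host up, port closed: web server stopped or firewall blocking"
  else
    echo "No response: check server status, firewall, and provider status"
  fi
}

# Usage (requires network access):
#   port_open "$SERVER_IP" 443 && echo open || echo closed
```

The port result is checked first because an open port proves the host is reachable even when ICMP is filtered.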

Step 4: Check Web Server Logs

SSH into the server and examine logs for errors:

bash

# Check Nginx status
sudo systemctl status nginx

# Check Nginx error log
sudo tail -50 /var/log/nginx/error.log

# Check Apache status
sudo systemctl status apache2

# Check Apache error log
sudo tail -50 /var/log/apache2/error.log

# Check if web server process is running
ps aux | grep nginx
ps aux | grep apache2
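
When the error log is long, a count per severity level shows quickly whether the server is logging real errors or just noise. The sketch below assumes the Nginx error log format, where the level appears in square brackets such as `[error]`; Apache writes levels as `[module:level]`, so the pattern would need adjusting there. The function name `count_log_levels` is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count log lines per severity level.
# Reads Nginx error log text on stdin and prints "count [level]" rows,
# most frequent first.
count_log_levels() {
  grep -oE '\[(debug|info|notice|warn|error|crit|alert|emerg)\]' \
    | sort | uniq -c | sort -rn
}

# Usage:
#   sudo tail -n 1000 /var/log/nginx/error.log | count_log_levels
```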

Step 5: Check SSL Certificate

An expired or invalid SSL certificate makes browsers show a full-page security warning instead of the site, which turns away most visitors:

bash

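# Check certificate expiry
# (-servername sends SNI; </dev/null closes stdin so the command exits instead of hanging)
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | openssl x509 -noout -dates

# Check for SSL errors
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | head -20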

Use the ShowDNS SSL Checker for a quick check. If the certificate is expired, see the SSL error fix guide.
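
To turn the expiry date into a number you can alert on, the `notAfter=` line printed by `openssl x509 -noout -enddate` can be converted to days remaining. This is a sketch that assumes GNU `date` (for the `-d` option); the function name `days_until_expiry` is hypothetical, and the optional second argument exists only to make the calculation repeatable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: days until a certificate's notAfter date.
# $1: a line such as "notAfter=Jan  1 00:00:00 2030 GMT"
# $2: optional "now" as a Unix epoch (defaults to the current time)
# Assumes GNU date. The result is truncated toward zero.
days_until_expiry() {
  local not_after=${1#notAfter=}
  local now=${2:-$(date +%s)}
  local expiry
  expiry=$(date -d "$not_after" +%s) || return 1
  echo $(( (expiry - now) / 86400 ))
}

# Usage (requires network access):
#   line=$(openssl s_client -connect example.com:443 -servername example.com \
#            </dev/null 2>/dev/null | openssl x509 -noout -enddate)
#   days_until_expiry "$line"
```

A result of zero or less means the certificate has expired or expires within a day.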

Step 6: Check Hosting Provider Status

Infrastructure outages at your hosting provider cause downtime that you cannot fix yourself:

Check your hosting provider's status page (usually at status.provider.com).
Check social media for reports from other customers.
Check cloud provider dashboards (AWS Health Dashboard, Google Cloud Status, Azure Status).
Open a support ticket if a provider-side issue is confirmed.

Set up uptime monitoring: use an uptime monitoring service to alert you within a minute or two of an outage. This gives you a head start before users report the issue. Many services offer free tiers with 1-minute check intervals.
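
For a sense of what such a service does on each check, here is a minimal sketch using `curl`. It is not a replacement for a hosted monitor, since it runs from a single location; the function names and the choice to treat 2xx and 3xx as "up" are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a single uptime check.

# Return 0 when an HTTP status code counts as "up" (2xx or 3xx).
is_up_status() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Print the HTTP status code for a URL; curl prints 000 if it
# cannot get a response at all.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Run one check and print a timestamped UP/DOWN line.
check_once() {
  local url=$1 code
  code=$(http_status "$url")
  if is_up_status "$code"; then
    echo "$(date -u +%FT%TZ) UP $url ($code)"
  else
    echo "$(date -u +%FT%TZ) DOWN $url ($code)"
    return 1
  fi
}

# Usage (requires network access), e.g. from cron once a minute:
#   check_once https://example.com/
```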

Step 7: Common Downtime Causes and Quick Fixes

Domain Expired

bash

whois example.com | grep -i "expir"
# If expired — log in to registrar and renew immediately

Server Disk Full

bash

# Check disk usage
df -h

# Find large files
du -sh /* 2>/dev/null | sort -rh | head -20

# Quick fix — clear old logs or temp files
sudo journalctl --vacuum-time=7d
sudo apt clean   # Ubuntu/Debian
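
To spot a full filesystem without reading the whole `df` table, the output can be filtered by a usage threshold. A minimal sketch, assuming the POSIX output format of `df -P` and mount points without spaces; the function name `full_filesystems` is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print mount points at or above a usage threshold.
# Reads `df -P` output on stdin; $1 is the threshold in percent (default 90).
full_filesystems() {
  local threshold=${1:-90}
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)           # strip the percent sign
      if (use + 0 >= t) print $6, $5
    }'
}

# Usage:
#   df -P | full_filesystems 90
```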

Out of Memory (OOM)

bash

# Check memory usage
free -h

# Check if web server was OOM-killed
sudo dmesg | grep -Ei "oom|killed"

# Restart web server
sudo systemctl restart nginx
# or
sudo systemctl restart apache2
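
Restarting fixes the symptom, but it helps to know which process the kernel actually killed. The sketch below assumes the kernel log contains lines of the form `Killed process <pid> (<name>)`, which is what recent Linux kernels print; the function name `oom_victims` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list processes terminated by the OOM killer.
# Reads kernel log text on stdin and prints "count name" rows,
# most frequent first.
oom_victims() {
  grep -oE 'Killed process [0-9]+ \([^)]+\)' \
    | sed -E 's/.*\(([^)]+)\)/\1/' \
    | sort | uniq -c | sort -rn
}

# Usage:
#   sudo dmesg | oom_victims
```

If the same process appears repeatedly, lowering its memory use or adding memory is likely a better fix than restarting it.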

Web Server Crash

bash

# Test the config first; a syntax error will prevent the server from starting
sudo nginx -t

# Restart web server
sudo systemctl restart nginx

# Verify it started correctly
sudo systemctl status nginx

Firewall Blocking Traffic

bash

# Check UFW rules (Ubuntu)
sudo ufw status

# Allow HTTP and HTTPS if blocked
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Check iptables rules
sudo iptables -L -n | grep -E "80|443"

Building a Post-Incident Checklist

After resolving downtime, run a post-incident review:

What was the root cause?
How long did the outage last?
How was it detected (user report vs monitoring)?
What can prevent this class of issue in the future?
Are monitoring and alerting in place for this failure mode?

Frequently Asked Questions

My site loads for me but not for some users — why?

This is typically a DNS propagation issue (different resolvers serving different IPs), a CDN caching issue serving a cached error page to some users, or geographic routing that sends different users to different servers with different health.
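
One way to test the DNS explanation is to compare the answers from two public resolvers. A minimal sketch, assuming `dig` is installed; the function names are illustrative. A mismatch is not always a fault, since round-robin or geographic DNS can legitimately return different addresses.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing resolver answers.

# Return 0 if two newline-separated answer lists contain the same
# entries, ignoring order and duplicates.
same_answers() {
  [ "$(printf '%s\n' "$1" | sort -u)" = "$(printf '%s\n' "$2" | sort -u)" ]
}

# Compare the A records that Google and Cloudflare return for a domain.
compare_resolvers() {
  local domain=$1 a b
  a=$(dig +short "$domain" A @8.8.8.8)
  b=$(dig +short "$domain" A @1.1.1.1)
  if same_answers "$a" "$b"; then
    echo "Resolvers agree"
  else
    echo "Resolvers disagree:"
    echo "8.8.8.8 -> $a"
    echo "1.1.1.1 -> $b"
  fi
}

# Usage (requires network access):
#   compare_resolvers example.com
```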

The server is running but the site still shows an error — why?

Common causes: the web server is running but the application layer (PHP-FPM, Node.js, Docker container) crashed; a database is down; a configuration change introduced a syntax error; or a deployment broke the application code. Check application logs, not just web server logs.
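
The HTTP status code often hints at which layer failed. The mapping below is a rule of thumb for a typical reverse-proxy setup, not a definitive diagnosis; the function name `layer_hint` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: suggest where to look first for a status code.
# The mapping is a rule of thumb for reverse-proxy setups.
layer_hint() {
  case "$1" in
    500) echo "Application error: check application logs" ;;
    502) echo "Bad gateway: the backend (e.g. PHP-FPM, Node.js) may be down" ;;
    503) echo "Service unavailable: backend overloaded or in maintenance" ;;
    504) echo "Gateway timeout: backend or database responding too slowly" ;;
    *)   echo "No specific hint for status $1" ;;
  esac
}

# Usage (requires network access):
#   layer_hint "$(curl -s -o /dev/null -w '%{http_code}' https://example.com/)"
```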

How do I recover if I cannot SSH into the server?

Use your hosting provider's out-of-band access: VPS providers typically offer a rescue mode or console access that bypasses the network. Cloud providers offer serial console access. This lets you diagnose issues even if SSH is down.
