Website Downtime Troubleshooting: Find the Root Cause Fast

When a website goes down, every minute matters. This guide provides a systematic approach to diagnosing the root cause — whether it's DNS, SSL, server, or network — and fixing it fast.


A website outage costs money and erodes user trust, so the first priority is to find the root cause quickly. Is it DNS? SSL? The server? The network? Each failure type has distinct symptoms and a different fix, and the steps below work through them in order.
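As a first triage, the exit code of a single curl request often hints at which layer failed. The sketch below maps a few exit codes from the curl man page to the layers this guide covers. The function name and the layer labels are illustrative assumptions, not part of any tool.

```bash
# Map a curl exit code to the layer that most likely failed.
# The numeric codes come from the EXIT CODES section of the curl man page;
# the labels and step references are this sketch's own.
triage_curl_exit() {
  case "$1" in
    0)     echo "ok: HTTP response received, check the status code" ;;
    6)     echo "dns: could not resolve host (Step 2)" ;;
    7)     echo "server: could not connect to host (Step 3)" ;;
    28)    echo "network: operation timed out (Step 3)" ;;
    35|60) echo "ssl: TLS handshake or certificate problem (Step 5)" ;;
    *)     echo "other: see EXIT CODES in 'man curl'" ;;
  esac
}

# Usage (needs network access):
#   curl -sS -o /dev/null --max-time 10 https://example.com; triage_curl_exit $?
```

An exit code of 0 only means that an HTTP response arrived. The response itself can still be an error page.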

Step 1: Is It Down for Everyone or Just You?

First, determine whether the outage is global or local to you:

  • Use the ShowDNS Ping Tool to check whether the server responds from multiple global locations.
  • Ask a colleague on a different network to try accessing the site.
  • Try from your mobile phone on cellular data (bypasses your home/office network).

If the site loads on mobile data but not on your network, the problem is your network or local DNS — not the website.
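One way to test the "local DNS" theory from a terminal is to compare what your default resolver returns with what a public resolver returns. The helper below is a minimal sketch. The function name is hypothetical, and it only compares the two answer sets as text.

```bash
# Compare two newline-separated answer sets, ignoring order.
# Prints "match", "differ", or a note when both are empty.
same_answers() {
  if [ -z "$1" ] && [ -z "$2" ]; then
    echo "no answer from either resolver"
    return
  fi
  a=$(printf '%s\n' "$1" | sort)
  b=$(printf '%s\n' "$2" | sort)
  if [ "$a" = "$b" ]; then echo "match"; else echo "differ"; fi
}

# Usage (needs network access): default resolver vs. Cloudflare
#   same_answers "$(dig +short example.com A)" "$(dig +short example.com A @1.1.1.1)"
```

A "differ" result is a hint, not proof. Sites behind a CDN or geographic load balancing can legitimately return different addresses to different resolvers.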

Step 2: Check DNS Resolution

If DNS resolution fails, the browser never learns which IP address to contact, so no connection is attempted:

bash
# Check if the domain resolves at all
nslookup example.com

# Test with multiple resolvers
nslookup example.com 8.8.8.8   # Google
nslookup example.com 1.1.1.1   # Cloudflare

# Use dig for more detail
dig example.com A +short

| DNS Result | Likely Cause | Next Step |
|---|---|---|
| NXDOMAIN (domain not found) | Domain expired, DNS misconfiguration, or record deleted | Check domain expiry; check A/CNAME records |
| SERVFAIL | Nameserver error, DNSSEC failure, or all NS unreachable | Check nameserver health; check DNSSEC |
| Returns wrong IP | DNS hijacking, stale cache, or misconfigured record | Flush DNS cache; verify A record at authoritative NS |
| Resolves correctly | DNS is fine; the problem is at the server | Proceed to Step 3 |

Use the ShowDNS DNS Lookup tool to check DNS from multiple global locations simultaneously.
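The status values in the table above appear in the header of full dig output (without +short). The sketch below pulls the status out of that header and maps it to the matching next step. Both function names are hypothetical, and the messages simply restate the table.

```bash
# Extract the response status (NOERROR, NXDOMAIN, SERVFAIL, ...) from
# full dig output read on stdin.
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Map a status to the next step from the table.
dns_next_step() {
  case "$1" in
    NOERROR)  echo "compare the answer with the expected IP, then go to Step 3" ;;
    NXDOMAIN) echo "check domain expiry and A/CNAME records" ;;
    SERVFAIL) echo "check nameserver health and DNSSEC" ;;
    "")       echo "no status found: the resolver may be unreachable" ;;
    *)        echo "unexpected status: $1" ;;
  esac
}

# Usage (needs network access):
#   dns_next_step "$(dig example.com A | dig_status)"
```

NOERROR can also come with an empty answer section when the name exists but has no A record, so it is worth reading the answer as well as the status.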

Step 3: Check Server Connectivity

If DNS resolves but the site doesn't load, test if the server itself is reachable:

bash
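# Get the server IP (first A record only; skips any CNAME lines)
SERVER_IP=$(dig example.com A +short | grep -E '^[0-9.]+$' | head -n 1)

# Ping the server
ping -c 4 "$SERVER_IP"

# Check if port 80 (HTTP) is open
nc -zv "$SERVER_IP" 80

# Check if port 443 (HTTPS) is open
nc -zv "$SERVER_IP" 443

# Try a direct HTTP request to the IP with the right Host header
curl -v "http://$SERVER_IP" -H "Host: example.com"

# For HTTPS, pin the hostname to the IP so SNI and certificate checks still apply
curl -v --resolve "example.com:443:$SERVER_IP" https://example.com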

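If ping succeeds but port 80/443 is closed, the web server is probably not running, or a firewall is blocking the port. Log in to the server and check.

If ping fails, the server may be unreachable, but many hosts and firewalls drop ICMP, so confirm with the port checks above. If those fail too, check firewall rules, server status, and the hosting provider's status page.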
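The decision logic of this step can be written down as a small helper. This is a sketch with a hypothetical function name. It treats the port check as more reliable than ping, which is an assumption based on how often ICMP is filtered.

```bash
# Turn the two observations from this step into a next action.
# Arguments: "yes" or "no" for "ping succeeded" and "port is open".
diagnose_connectivity() {
  ping_ok=$1
  port_ok=$2
  if [ "$port_ok" = "yes" ]; then
    echo "server reachable: continue with logs (Step 4) and SSL (Step 5)"
  elif [ "$ping_ok" = "yes" ]; then
    echo "host up, port closed: check the web server process and firewall"
  else
    echo "no response: check firewall, server status, and provider status page"
  fi
}

# Usage (needs network access and SERVER_IP from the commands above):
#   ping -c 2 "$SERVER_IP" >/dev/null 2>&1 && p=yes || p=no
#   nc -z -w 5 "$SERVER_IP" 443 >/dev/null 2>&1 && t=yes || t=no
#   diagnose_connectivity "$p" "$t"
```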

Step 4: Check Web Server Logs

SSH into the server and examine logs for errors:

bash
# Check Nginx status
sudo systemctl status nginx

# Check Nginx error log
sudo tail -n 50 /var/log/nginx/error.log

# Check Apache status (the service is named httpd on RHEL-based systems)
sudo systemctl status apache2

# Check Apache error log
sudo tail -n 50 /var/log/apache2/error.log

# Check if the web server process is running (pgrep does not match itself, unlike ps | grep)
pgrep -a nginx
pgrep -a apache2
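When the error log is long, a count per severity level gives a quick overview before reading individual lines. The sketch below assumes the default Nginx error log format, where the level appears in square brackets such as [error]. Apache uses a different format, so this would need adjusting there. The function name is hypothetical.

```bash
# Count Nginx error-log lines per severity level (warn and above).
# Reads log lines on stdin and prints "level count", sorted by level name.
count_log_levels() {
  awk 'match($0, /\[(emerg|alert|crit|error|warn)\]/) {
         lvl = substr($0, RSTART + 1, RLENGTH - 2)
         n[lvl]++
       }
       END { for (l in n) print l, n[l] }' | sort
}

# Usage:
#   sudo tail -n 500 /var/log/nginx/error.log | count_log_levels
```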

Step 5: Check SSL Certificate

An expired or invalid SSL certificate makes browsers show a full-page security warning instead of the site. If the site uses HSTS, visitors cannot click through the warning at all:

bash
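# Check certificate expiry (-servername sends SNI; </dev/null stops s_client waiting for input)
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | openssl x509 -noout -dates

# Check for SSL errors (0 means the chain verified)
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null | grep -i "verify return code"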

Use the ShowDNS SSL Checker for a quick check. If the certificate is expired, see the SSL error fix guide.
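To turn the expiry date into a number you can alert on, the notAfter line from openssl can be converted to days remaining. This is a minimal sketch that assumes GNU date (for the -d option). The function name is hypothetical.

```bash
# Days between "now" and the date in a "notAfter=..." line, as printed by
# `openssl x509 -noout -enddate`. A negative result means already expired.
# Arguments: the notAfter line, and optionally "now" as epoch seconds
# (defaults to the current time). Requires GNU date.
days_until_expiry() {
  not_after=${1#notAfter=}
  now=${2:-$(date +%s)}
  expiry=$(date -d "$not_after" +%s) || return 1
  echo $(( (expiry - now) / 86400 ))
}

# Usage (needs network access):
#   end=$(openssl s_client -connect example.com:443 -servername example.com \
#         </dev/null 2>/dev/null | openssl x509 -noout -enddate)
#   days_until_expiry "$end"
```

The optional second argument exists mainly so the calculation can be checked against a fixed point in time.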

Step 6: Check Hosting Provider Status

Infrastructure outages at your hosting provider cause downtime that you cannot fix yourself:

  • Check your hosting provider's status page (usually at status.provider.com).
  • Check social media for reports from other customers.
  • Check cloud provider dashboards (AWS Health Dashboard, Google Cloud Status, Azure Status).
  • Open a support ticket if a provider-side issue is confirmed.

Set up uptime monitoring: Use an uptime monitoring service that alerts you as soon as a check fails. This gives you a head start before users report the issue. Many services offer free tiers with 1-minute check intervals, which means you hear about an outage within about a minute.
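If you want a stopgap before adopting a monitoring service, a basic probe loop is easy to sketch. The example below is an illustration only. The URL, the 60-second interval, and the three-failure threshold are all assumed values, and both function names are hypothetical.

```bash
# Track consecutive failed checks so a single blip does not trigger an alert.
# Arguments: previous failure count, latest result ("up" or "down").
# Prints the new failure count.
next_failure_count() {
  if [ "$2" = "up" ]; then echo 0; else echo $(( $1 + 1 )); fi
}

# One probe: succeeds only if the request completes with a status below 400.
probe() {
  curl -fsS -o /dev/null --max-time 10 "$1"
}

# Usage (needs network access; run it from a machine other than the web server):
#   fails=0
#   while true; do
#     if probe "https://example.com"; then r=up; else r=down; fi
#     fails=$(next_failure_count "$fails" "$r")
#     [ "$fails" -eq 3 ] && echo "ALERT: 3 consecutive failures" >&2
#     sleep 60
#   done
```

Running the loop on the web server itself would miss exactly the outages that matter most, which is why a hosted service or a separate machine is the better long-term choice.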

Step 7: Common Downtime Causes and Quick Fixes

Domain Expired

bash
whois example.com | grep -i "expir"
# If expired, log in to your registrar and renew immediately

Server Disk Full

bash
# Check disk usage
df -h

# Find the largest top-level directories
sudo du -sh /* 2>/dev/null | sort -rh | head -20

# Quick fix: clear old logs or temp files
sudo journalctl --vacuum-time=7d
sudo apt clean   # Ubuntu/Debian
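To spot the full filesystem without reading the whole df table, the output can be filtered by usage. This is a sketch with a hypothetical function name. The 90% default is an arbitrary example threshold, and mount points containing spaces are not handled.

```bash
# Read `df -P` output on stdin and print "mountpoint usage" for every
# filesystem at or above the given percentage (default 90).
full_filesystems() {
  awk -v limit="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Usage:
#   df -P | full_filesystems 90
```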

Out of Memory (OOM)

bash
# Check memory usage
free -h

# Check if the web server was OOM-killed (-E is needed for the | alternation)
sudo dmesg | grep -iE "oom|killed"

# Restart web server
sudo systemctl restart nginx
# or
sudo systemctl restart apache2
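After a restart, it helps to know how close the server still is to running out of memory. The sketch below reads /proc/meminfo (Linux only) and prints the available memory as a percentage of the total. The function name is hypothetical.

```bash
# Read /proc/meminfo-style input on stdin and print MemAvailable as a
# whole-number percentage of MemTotal.
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Usage:
#   mem_available_pct < /proc/meminfo
```

A value that stays in the low single digits suggests the OOM kill is likely to repeat until memory use is reduced or the server is resized.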

Web Server Crash

bash
# Test the config before restarting
sudo nginx -t

# Restart web server
sudo systemctl restart nginx

# Verify it started correctly
sudo systemctl status nginx

Firewall Blocking Traffic

bash
# Check UFW rules (Ubuntu)
sudo ufw status

# Allow HTTP and HTTPS if blocked
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

# Check iptables rules (-w matches 80 and 443 as whole numbers, not 8080)
sudo iptables -L -n | grep -wE "80|443"

Building a Post-Incident Checklist

After resolving downtime, run a post-incident review:

  • What was the root cause?
  • How long did the outage last?
  • How was it detected (user report vs monitoring)?
  • What can prevent this class of issue in the future?
  • Are monitoring and alerting in place for this failure mode?

Frequently Asked Questions

My site loads for me but not for some users — why?

This is typically a DNS propagation issue (different resolvers serving different IPs), a CDN caching issue serving a cached error page to some users, or geographic routing that sends different users to different servers with different health.

The server is running but the site still shows an error — why?

Common causes: the web server is running but the application layer (PHP-FPM, Node.js, Docker container) crashed; a database is down; a configuration change introduced a syntax error; or a deployment broke the application code. Check application logs, not just web server logs.
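The HTTP status code usually shows whether the web server or the application behind it is failing. The mapping below is a rough sketch of typical meanings behind a reverse proxy such as Nginx. The function name is hypothetical, and the interpretations are common cases, not guarantees.

```bash
# Interpret an HTTP status code as reported by curl's %{http_code}.
classify_http_status() {
  case "$1" in
    000)     echo "no response: revisit Steps 2 and 3" ;;
    2??|3??) echo "ok: web server and application responded" ;;
    500)     echo "app: application error, check application logs" ;;
    502|504) echo "upstream: backend (PHP-FPM, Node.js, container) is down or too slow" ;;
    503)     echo "unavailable: backend overloaded or in maintenance mode" ;;
    *)       echo "other: status $1" ;;
  esac
}

# Usage (needs network access):
#   classify_http_status "$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 https://example.com)"
```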

How do I recover if I cannot SSH into the server?

Use your hosting provider's out-of-band access: VPS providers typically offer a rescue mode or console access that bypasses the network. Cloud providers offer serial console access. This lets you diagnose issues even if SSH is down.
