Your website will break at the worst possible time.
Not because your team is careless. Because websites sit on top of a stack of moving parts: hosting, DNS, SSL, plugins, payment processors, form tools, analytics scripts, CDNs, email services, browser changes, and human beings pushing updates on a Friday afternoon.
When something breaks, the difference between a bad hour and a bad week is usually not technical brilliance. It’s whether someone knows what to check first, who has authority to make decisions, and what to say to customers while the fix is happening.
This is the website incident response plan I recommend for small businesses that rely on their site for leads, bookings, ecommerce, quote requests, recruiting, or customer support. It’s built for owners, marketing managers, web teams, and outsourced developers who need a clear playbook before the next outage, hacked page, form failure, or botched launch.
What counts as a website incident?
A website incident is any failure that blocks revenue, trust, compliance, security, or customer access.
That includes obvious problems like the site being offline, but it also includes quieter problems that cost money for days before anyone notices. A contact form that stops sending email is an incident. A payment form that rejects good cards is an incident. A hacked page promoting spam is an incident. A bad robots.txt file that blocks Google is an incident.
Use this rule: if the issue can stop a customer, lead, search engine, or employee from doing something important, treat it like an incident.
The risk is real. IBM’s 2025 Cost of a Data Breach report put the global average data breach cost at $4.44 million. Verizon’s 2025 DBIR coverage reported ransomware in 88% of SMB breach incidents. Uptime Institute’s 2025 outage analysis continues to track the material costs and consequences of IT and data center outages. Even if your business is nowhere near enterprise scale, the same basic lesson applies: downtime and security failures get expensive fast when nobody owns the clock.
The 60-minute website incident checklist
Print this section. Put it in your shared drive. Add the names and phone numbers before you need them.
Minutes 0 to 5: Confirm the incident
Don’t start changing things until you know what failed.
Open the website from a normal browser, an incognito window, and a mobile connection if possible. Check the homepage, one service page, the contact page, and any revenue-critical page like checkout, booking, quote request, or login. If the problem is search traffic, check Google Search Console and the Google Search Status Dashboard before assuming your site is the only system having trouble.
Capture proof immediately. Take screenshots, copy error messages, note the time, and write down who first reported it. That record helps your developer, host, payment vendor, cyber insurer, or attorney if the incident grows.
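If you have someone technical on call, the page checks above can be scripted so they run the same way every time and leave a timestamped record. Here is a minimal sketch in Python, not a monitoring product. The paths in `CRITICAL_PATHS` and the function names are hypothetical placeholders; swap in your own revenue-critical URLs. It only confirms that each page answers, so you still need to submit the form or run a test checkout by hand.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone

# Hypothetical paths; replace with your own revenue-critical pages.
CRITICAL_PATHS = ["/", "/services/", "/contact/", "/checkout/"]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # The server answered, but with an error status.
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, timeout, refused connection, or TLS error.


def check_pages(base_url, paths=CRITICAL_PATHS, fetch=fetch_status):
    """Check each page and return timestamped rows for the incident record."""
    rows = []
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        rows.append({
            "checked_at": datetime.now(timezone.utc).isoformat(),
            "path": path,
            "status": status,
            "ok": status is not None and 200 <= status < 400,
        })
    return rows
```

Calling `check_pages("https://www.example.com")` (a placeholder domain) returns one row per page. Paste the rows into the incident record next to your screenshots.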
Minutes 5 to 10: Name the incident owner
One person runs the response. Not five people in a group chat.
The owner does not need to be the best developer. They need to keep decisions moving, assign work, record what happened, and stop people from making conflicting changes. Atlassian defines incident metrics like MTTA (mean time to acknowledge) and MTTR (mean time to recovery) around how quickly teams acknowledge and recover from incidents, and those numbers are only useful when someone is actually managing the response clock from the start.
Write this down:
- Incident owner:
- Technical lead:
- Business approver:
- Customer communication owner:
- Vendor contact:
If you’re a small business, one person may hold two roles. That’s fine. Just don’t leave the roles blank.
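The response clock the incident owner manages is easy to measure once timestamps are written down. The sketch below computes MTTA and MTTR in minutes from a list of past incidents. The field names (`reported`, `acknowledged`, `resolved`) are assumptions for illustration, and teams define MTTR in different ways; here it means time from first report to recovery.

```python
from datetime import datetime
from statistics import mean


def minutes_between(start, end):
    """Minutes elapsed between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def response_metrics(incidents):
    """Mean time to acknowledge and mean time to recover, in minutes."""
    return {
        "mtta_minutes": mean(
            minutes_between(i["reported"], i["acknowledged"]) for i in incidents
        ),
        "mttr_minutes": mean(
            minutes_between(i["reported"], i["resolved"]) for i in incidents
        ),
    }
```

Even two or three incidents are enough to show whether acknowledgment or recovery is the slow part.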
Minutes 10 to 15: Set severity
Severity keeps the team from overreacting to small bugs and underreacting to real damage.
Use a four-level scale:
| Severity | Definition | Examples | Response target |
|---|---|---|---|
| SEV1 | Revenue, security, or public trust is actively at risk | Site offline, checkout down, hacked pages, customer data exposure | Immediate response |
| SEV2 | Major function broken, but workaround exists | Contact form failing, booking tool down, major page template broken | Same business day |
| SEV3 | Visible bug with limited business impact | Layout issue, broken image, noncritical integration failure | Within 2 business days |
| SEV4 | Minor issue or improvement | Typo, low-priority tracking cleanup | Normal queue |
Treat suspected security issues as SEV1 until proven otherwise. CISA’s ransomware guidance recommends coordinated response practices and regular backup testing because ransomware response is not something you want to invent under pressure after systems are already locked.
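If your team logs incidents in a ticketing tool or script, the scale above can be encoded so nobody debates severity mid-incident. This is a rough sketch: the flag names are hypothetical, and the rule that the worst condition wins is one reasonable reading of the table, including the advice to treat suspected security issues as SEV1.

```python
RESPONSE_TARGETS = {
    "SEV1": "Immediate response",
    "SEV2": "Same business day",
    "SEV3": "Within 2 business days",
    "SEV4": "Normal queue",
}


def classify_severity(*, security_suspected=False, revenue_blocked=False,
                      major_function_broken=False, visible_bug=False):
    """Map incident facts to the four-level scale; the worst condition wins."""
    if security_suspected or revenue_blocked:
        return "SEV1"  # Suspected security issues stay SEV1 until disproven.
    if major_function_broken:
        return "SEV2"
    if visible_bug:
        return "SEV3"
    return "SEV4"
```

For example, `classify_severity(major_function_broken=True)` returns `"SEV2"`, and `RESPONSE_TARGETS["SEV2"]` gives the matching response target.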
Minutes 15 to 25: Stop the bleeding
The first fix is not always the final fix. The first fix is whatever reduces harm without making recovery harder.
For a bad deploy, roll back to the last known good version. For a broken plugin update, disable the plugin if the site can function without it. For a hacked WordPress site, take it out of public view, preserve evidence, and get a clean backup ready before overwriting files. For a broken payment system, add a clear notice and a phone or invoice workaround. For a broken lead form, route calls to a phone number and add a temporary email link.
If the whole site must be unavailable during planned maintenance, Google recommends returning a 503 Service Unavailable status so search engines understand the downtime is temporary. Don’t show a normal 200 OK page that says “we’re down” if every important URL is actually unavailable.
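To make the 503 advice concrete, here is a small sketch of a stand-in server that answers every request with 503 Service Unavailable. It uses only the Python standard library. The `Retry-After` header is a standard HTTP header that this example adds as an assumption, and the one-hour value is a placeholder. Most hosts and CDNs have their own maintenance mode, so treat this as an illustration of the response, not a recommended deployment.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

RETRY_AFTER_SECONDS = 3600  # Assumed maintenance window; adjust to your estimate.


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET and HEAD request with 503 and a Retry-After header."""

    def _send_maintenance(self, include_body):
        body = b"Temporarily down for maintenance. Please try again soon.\n"
        self.send_response(503)
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._send_maintenance(include_body=True)

    def do_HEAD(self):
        self._send_maintenance(include_body=False)


def make_server(port=8080):
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)
```

The point of the design is that every URL returns the same 503 status, so a crawler never sees a 200 OK page that only says the site is down.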
Minutes 25 to 35: Communicate internally
Most incident waste comes from people asking the same questions in different places.
Create one source of truth. That can be a Slack channel, Teams thread, Google Doc, ticket, or project management card. The tool matters less than the discipline.
Use this internal update format:
- Status: Investigating, identified, fixing, monitoring, or resolved
- Impact: Who is affected and what they can’t do
- Start time: When the issue began, or when it was first detected
- Current action: What is being checked or changed now
- Next update: Exact time for the next update
- Owner: Person accountable for the next move
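If updates are posted by a script or bot, the format above can be enforced so no field gets skipped under pressure. A minimal sketch follows; the class and field names are hypothetical, and the status list mirrors the five values in the format.

```python
from dataclasses import dataclass

VALID_STATUSES = ("Investigating", "Identified", "Fixing", "Monitoring", "Resolved")


@dataclass
class IncidentUpdate:
    status: str
    impact: str
    start_time: str
    current_action: str
    next_update: str
    owner: str

    def render(self):
        """Return the update as plain text, one labeled field per line."""
        if self.status not in VALID_STATUSES:
            raise ValueError(f"Unknown status: {self.status!r}")
        return "\n".join([
            f"Status: {self.status}",
            f"Impact: {self.impact}",
            f"Start time: {self.start_time}",
            f"Current action: {self.current_action}",
            f"Next update: {self.next_update}",
            f"Owner: {self.owner}",
        ])
```

Because every field is required, an update with no owner or no next update time fails before it is posted.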
Atlassian’s incident communication guidance says effective downtime communication can protect trust when services break because people know what’s happening. Silence makes customers fill in the blanks, and they rarely fill them in kindly.
Minutes 35 to 45: Communicate externally if customers are affected
You don’t need a long apology while the facts are still moving. You need a calm, accurate update.
Use this customer-facing template:
We’re currently investigating an issue affecting [specific page or function]. The issue began around [time] and may prevent some visitors from [impact]. Our team is working on it now. We’ll post the next update by [time]. If you need immediate help, contact us at [phone/email].
If security or payment data may be involved, don’t guess. Say you’re investigating. Preserve logs. Contact the right legal, insurance, hosting, and security support before making public claims you may need to correct later.
Minutes 45 to 60: Verify the fix and watch for relapse
A fix is not done because one person refreshed the homepage.
Verify the user path that failed. Submit the contact form. Complete a test checkout. Check confirmation emails. Look at server logs. Confirm analytics events if tracking was affected. Test on mobile. Ask the original reporter to confirm if possible.
Then monitor for at least one normal traffic cycle. Some website incidents come back because of caching, cron jobs, rate limits, malware reinfection, DNS propagation, or delayed webhook failures.
The incident response plan template
Copy this template into a document your team can edit.
1. Critical website inventory
List the systems that matter before you need passwords at 9 p.m.
| System | Provider | Admin URL | Owner | Backup contact | Notes |
|---|---|---|---|---|---|
| Domain registrar | | | | | |
| DNS host | | | | | |
| Website host | | | | | |
| CMS | | | | | |
| CDN/WAF | | | | | |
| Email delivery | | | | | |
| Form tool | | | | | |
| Payment processor | | | | | |
| Analytics | | | | | |
| Backup service | | | | | |
CISA tells small and medium businesses to write down backup procedures and make sure the team can recover systems, networks, and data from backups before a crisis. Your website inventory is the same idea applied to your public-facing revenue system.
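An inventory only helps if it has no blanks. If you keep it as structured data, a short check like the sketch below can flag systems with a missing owner or backup contact. The dictionary layout and field names are assumptions that mirror the table columns.

```python
REQUIRED_FIELDS = ("provider", "admin_url", "owner", "backup_contact")


def find_inventory_gaps(inventory):
    """Return {system: [missing fields]} for rows with blank required fields."""
    gaps = {}
    for system, row in inventory.items():
        # Treat missing keys, None, and whitespace-only values as blank.
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            gaps[system] = missing
    return gaps
```

An empty result means every listed system has a provider, an admin URL, an owner, and a backup contact on record.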
2. Access rules
Every critical account should have a named owner, multi-factor authentication, and a backup admin. If the only domain registrar login belongs to a former employee or one developer’s personal email, that’s not a small admin detail. That’s a business continuity problem.
Document who can:
- Change DNS
- Restore backups
- Deploy code
- Disable plugins
- Access customer form submissions
- Edit payment settings
- Publish emergency website notices
- Contact hosting or security vendors
Keep emergency access in a password manager, not a spreadsheet. The plan should say where access lives, not expose the passwords themselves.
3. Incident categories
Define the most likely failures for your business.
For a local service company, the nightmare may be calls and quote forms going silent. For an ecommerce shop, checkout and shipping integrations matter most. For a B2B firm, a hacked page or broken lead routing system can damage trust before sales even knows leads stopped arriving.
Common website incident categories:
| Category | What to check first | Likely owner |
|---|---|---|
| Site offline | Hosting status, DNS, SSL, CDN, recent deploys | Developer or host |
| Forms not working | Form logs, email routing, CRM integration, spam filters | Marketing or web team |
| Checkout broken | Payment gateway, cart logs, fraud rules, recent plugin updates | Ecommerce owner |
| Hacked content | CMS users, file changes, server logs, search results | Security or developer |
| SEO traffic crash | Search Console, robots.txt, noindex tags, redirects, Google status | SEO or web team |
| Slow site | Host metrics, CDN, large files, third-party scripts | Developer |
| Bad launch | Redirect map, analytics, forms, key templates, DNS | Project owner |
4. Decision thresholds
Small teams lose time because nobody knows when to escalate. Decide now.
Escalate immediately when customer data may be exposed, payment pages are affected, the site is offline during business hours, Google-indexed pages are showing spam, DNS has been changed unexpectedly, or a restore could overwrite new orders or leads.
Escalation doesn’t mean panic. It means the issue is above normal ticket level.
5. Rollback and restore rules
Backups are not a plan unless you’ve tested restoring them.
Write down:
- Backup frequency
- Backup location
- How long backups are retained
- Who can restore
- What data could be lost during restore
- Last successful restore test date
- Pages or functions that must be checked after restore
CISA’s StopRansomware guide specifically recommends testing backup procedures regularly because untested backups fail when they’re needed most.
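The "last successful restore test date" is the item most likely to go stale. A tiny check like this one can flag it. The 90-day limit is an assumption that roughly matches a quarterly restore test; pick whatever interval your plan commits to.

```python
from datetime import date

MAX_DAYS_BETWEEN_RESTORE_TESTS = 90  # Assumption: roughly one test per quarter.


def restore_test_is_stale(last_test, today=None,
                          max_days=MAX_DAYS_BETWEEN_RESTORE_TESTS):
    """True if the last successful restore test is older than max_days."""
    today = today or date.today()
    return (today - last_test).days > max_days
```

Run it from a monthly reminder or a scheduled job so an overdue restore test shows up before an incident does.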
6. Customer communication templates
Prepare messages now, while nobody is upset.
Outage template
We’re aware of an issue affecting [website/function]. Our team is working on it now. If you need help while the site is unavailable, contact [phone/email]. Next update: [time].
Resolved template
The issue affecting [website/function] has been resolved. We confirmed [specific user action] is working again at [time]. If you’re still seeing a problem, please contact [phone/email].
Security investigation template
We’re investigating a potential security issue affecting [system/page]. We have restricted access while we review the issue. We’ll share confirmed updates as we have them. If you believe you’re affected, contact [phone/email].
Keep the language plain. Customers don’t need your stack trace. They need impact, workaround, and next update time.
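One practical risk with prepared templates is publishing a message with a placeholder still in it. The sketch below stores two of the templates above with Python’s `string.Template` and refuses to render if any field is missing. The placeholder names (`$function`, `$contact`, and so on) are hypothetical stand-ins for the bracketed fields.

```python
from string import Template

OUTAGE = Template(
    "We're aware of an issue affecting $function. Our team is working on it now. "
    "If you need help while the site is unavailable, contact $contact. "
    "Next update: $next_update."
)

RESOLVED = Template(
    "The issue affecting $function has been resolved. "
    "We confirmed $user_action is working again at $time. "
    "If you're still seeing a problem, please contact $contact."
)


def render_notice(template, **fields):
    """Fill a template; raises KeyError if any placeholder is left unfilled."""
    return template.substitute(**fields)
```

`substitute` fails loudly on a missing field, which is the behavior you want for a public notice.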
What to check after the incident is fixed
The cleanup matters almost as much as the fix.
NIST’s incident response lifecycle (SP 800-61 Rev. 2) has four phases: preparation; detection and analysis; containment, eradication, and recovery; and post-incident activity. That last phase is where small businesses usually skip the work. Don’t.
Run a short post-incident review within 3 business days.
Answer these questions:
| Question | Why it matters |
|---|---|
| What happened? | Creates a shared factual record |
| When did it start? | Shows whether monitoring caught it quickly |
| How was it detected? | Reveals gaps in alerts, testing, or customer reporting |
| What was the customer impact? | Helps prioritize prevention work |
| What fixed it? | Prevents guessing next time |
| Could the fix have caused data loss? | Protects records, orders, and leads |
| What should change? | Turns the incident into a better system |
| Who owns each follow-up task? | Stops lessons from becoming meeting notes only |
Then check the basics that often get missed: Search Console coverage, sitemap access, robots.txt, analytics tracking, form delivery, call tracking, CRM routing, checkout events, page speed, broken redirects, and security scans.
The website incident roles your business needs
You don’t need a big IT department. You need named responsibility.
The business owner decides when revenue, legal, or customer trust risk is high enough to escalate. The web lead handles code, hosting, CMS, and recovery steps. The marketing lead handles public messaging, tracking, Search Console, and lead flow. The operations lead confirms phones, email, booking, payments, and customer workarounds. The vendor lead contacts hosting, DNS, payment, CRM, security, or software providers.
If your web support is outsourced, put response expectations in writing. A web maintenance plan that replies “within 2 business days” is not an incident response plan. That may be fine for a typo. It is not fine for checkout failure, malware, DNS problems, or a broken lead form.
Ask your vendor these questions before an incident:
- What qualifies as emergency support?
- Who responds after hours?
- What systems do you monitor?
- How fast do you usually respond to SEV1 incidents?
- Do you have access to restore backups?
- What actions require our approval?
- What happens if the problem is caused by hosting, DNS, or a plugin vendor?
- Will you provide a post-incident summary?
If the answers are vague, tighten the agreement.
Website incidents that deserve special handling
Hacked pages or malware
Don’t just delete the visible spam and move on. Look for the entry point, changed users, modified files, injected scripts, fake admin accounts, altered redirects, and search index contamination. If customer data might be affected, involve legal and security support before publishing a public statement.
Broken forms
Form failures are sneaky because the website looks fine. Test forms weekly and send submissions to a shared inbox or CRM, not one person’s email. Check spam filtering and CRM logs if lead volume suddenly drops.
SEO traffic collapse
Before rewriting pages, check technical causes: noindex tags, robots.txt, canonical errors, redirect chains, sitemap changes, server errors, manual actions, and Google-side incidents. The Google Search Status Dashboard should be part of your triage routine.
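The robots.txt check is easy to automate as a first pass. This sketch uses Python’s standard `urllib.robotparser` to test which of your key paths a robots.txt file disallows. The function name and the default `Googlebot` user agent are illustrative choices, and Python’s parser does not match Google’s rule precedence in every case, so confirm anything surprising in Search Console.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]
```

Pass in the text of your live robots.txt and a short list of pages that must stay crawlable. A stray `Disallow: /` left over from a staging site shows up immediately, because every path comes back blocked.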
DNS and domain issues
DNS problems can make every other system look broken. Keep registrar access current, protect the domain with MFA, document DNS records, and restrict who can make changes. If your domain expires, email, website traffic, tracking, and customer trust can all fail together.
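Documented DNS records are most useful when you can compare them with what actually resolves. The sketch below does that for IPv4 address (A) records only, using the standard `socket` module. The function names and the shape of the `documented` dictionary are assumptions. If your site sits behind a CDN that rotates addresses, expect false alarms and compare against your DNS host’s dashboard instead.

```python
import socket


def resolve_a_records(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def find_dns_drift(documented, resolve=resolve_a_records):
    """Compare documented A records with live lookups and return mismatches."""
    drift = {}
    for hostname, expected in documented.items():
        try:
            actual = sorted(resolve(hostname))
        except OSError:
            actual = []  # Lookup failed, which is itself worth flagging.
        if sorted(expected) != actual:
            drift[hostname] = {"expected": sorted(expected), "actual": actual}
    return drift
```

An empty result means the live lookups match your documentation. Anything else is either an unexpected change or documentation that needs updating.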
Payment or checkout failure
Treat checkout issues as SEV1 during selling hours. Add a temporary phone, invoice, or payment link workaround if possible. After the fix, reconcile orders, abandoned carts, failed payments, and customer support messages.
The simple maintenance rhythm that prevents bigger fires
A good incident plan is not just reactive. It changes how you maintain the site.
- Weekly: test forms, checkout, booking, search box, lead routing, and key mobile pages.
- Monthly: review backups, plugin or CMS updates, security alerts, uptime reports, Search Console errors, and analytics tracking.
- Quarterly: run a restore test, review emergency contacts, confirm vendor access, check DNS records, and update communication templates.
This is not busywork. It’s cheaper to find a broken form on Tuesday morning than to discover after a slow month that 42 quote requests disappeared into a dead inbox.
FAQ
Do small businesses really need a website incident response plan?
Yes, if the website produces leads, sales, appointments, customer support, hiring applications, or local trust. The plan can be short, but the roles, access, backup process, and communication steps need to be written down.
Who should own website incident response?
One business-side owner should own the process, and one technical lead should own the fix. If you outsource development, your internal owner still needs authority to approve downtime messages, customer workarounds, payment changes, and vendor escalation.
How often should we test the plan?
Run a light test quarterly. Confirm access, restore steps, emergency contacts, form routing, and backup status. A one-hour tabletop exercise is enough for most small businesses.
What’s the first thing to do when the website goes down?
Confirm the problem from more than one device or network, capture screenshots and error messages, assign one incident owner, then check hosting, DNS, SSL, CDN, and recent changes. Don’t let multiple people make changes at once.
Should we tell customers about every website issue?
No. Tell customers when the issue affects their ability to buy, book, contact you, access their account, or trust the site. For minor internal bugs, document and fix them without creating noise.
Need a web team that doesn’t disappear when the site breaks?
A good website is not just design and launch day. It’s monitoring, maintenance, backups, recovery, and calm decision-making when something goes wrong.
If you want a website partner who treats uptime, lead flow, and business risk like they matter, start here. We’ll help you build a site that is easier to manage before the next incident tests it.
Richard Kastl
Founder & Lead Engineer
Richard Kastl has spent 14 years engineering websites that generate revenue. He combines expertise in web development, SEO, digital marketing, and conversion optimization to build sites that make the phone ring. His work has helped generate over $30M in pipeline for clients ranging from industrial manufacturers to SaaS companies.