ISO 27001 A.8.6: Capacity Management

Control Definition

Organizations must track how their computing and supporting resources are being consumed, and tune, expand, or free up capacity based on both what current operations demand and what projected future requirements will need.

Control Objective

To ensure that systems have adequate capacity to meet current and future demands by monitoring resource utilization, projecting future capacity needs, and proactively planning capacity additions to prevent performance degradation and security incidents caused by resource exhaustion.


What This Really Means

Capacity management means monitoring how much of your computing resources (CPU, memory, storage, network bandwidth) is in use and planning to add more before you run out. Running out of capacity causes security problems: systems crash and become unavailable, logs stop being written so security incidents go unrecorded, backups fail for lack of storage, overloaded security monitoring tools miss alerts, and denial-of-service attacks succeed because systems have no capacity headroom.

Think of it like managing electrical capacity for a building: monitor power consumption, predict increases when adding floors or equipment, upgrade transformers before reaching limits, and maintain buffer capacity for peak loads. If you max out electrical capacity, circuit breakers trip and cause blackouts. IT systems need the same planning: monitor server CPU and memory, predict growth as users increase, add capacity before hitting limits, and maintain a buffer for traffic spikes.

This control requires you to establish capacity monitoring for critical systems (servers, databases, networks, cloud services, storage), set thresholds triggering capacity planning (alert when storage reaches 70%, plan expansion at 80%), project future capacity needs based on business growth and historical trends, implement capacity additions before exhaustion (provision new servers, upgrade network links, expand cloud resources), and document capacity management procedures. The goal is preventing availability and security incidents caused by insufficient capacity.

Why It Matters

Capacity exhaustion creates cascading security failures. When systems run out of resources, security controls break: authentication systems reject valid users, security logs are truncated or not written, backup jobs fail, monitoring tools cannot process alerts, and systems become unstable making incident response impossible.

Without capacity management, organizations face:

• Availability Incidents and Service Outages: Systems crash when capacity is exhausted, disrupting the business and violating availability commitments in SLAs and ISO 27001 objectives.
• Security Monitoring Blind Spots: A SIEM cannot ingest logs when its storage is full, an IDS drops packets when overloaded, and security tools miss alerts during resource exhaustion. Attackers exploit these blind spots.
• Backup and Recovery Failures: Backups fail to complete for lack of storage; during a ransomware attack, the organization discovers that months of backups are unusable, leading to data loss and extended downtime.
• Log Deletion and Audit Trail Loss: When disk space is exhausted, systems delete old logs to free space. Compliance requirements (CERT-In 180-day retention, ISO 27001 log preservation) are violated, and forensic investigation after an incident becomes impossible.
• Performance Degradation Enabling Attacks: Overloaded systems respond slowly, which can make brute-force attacks easier (timeouts may prevent account lockouts from being applied), gives attackers room to exploit race conditions, and masks malicious activity in the general slowness.

Cost-pressured organizations often run infrastructure at 90%+ utilization to minimize spend, leaving no buffer for spikes or incidents. This penny-wise, pound-foolish approach creates security and availability risks.

Implementation Guidance

Identify Critical Systems Requiring Capacity Monitoring

Determine which systems need capacity management: servers (application servers, databases, domain controllers, security appliances), storage systems (SAN, NAS, backup storage, log archives), network infrastructure (routers, switches, firewalls, internet bandwidth), cloud services (compute instances, managed databases, object storage, serverless quotas), and security systems (SIEM storage, IDS/IPS processing capacity, DLP bandwidth). Prioritize based on criticality: systems supporting business-critical applications, compliance requirements (log retention), or security functions (authentication, monitoring) are highest priority.

Implement Capacity Monitoring and Alerting

Deploy monitoring tools to track resource utilization: for on-premises infrastructure use monitoring platforms (Nagios, Zabbix, PRTG, Prometheus, Datadog), for cloud use native tools (AWS CloudWatch, Azure Monitor, Google Cloud Operations) plus third-party unified monitoring. Monitor key metrics: CPU utilization (average, peak, per-core), memory usage (used, available, swap), disk space (used %, free space, inode usage on Linux), network throughput (bandwidth utilization, packet loss, errors), and database capacity (table sizes, connection pool usage, query queue length). Set alerts: warning at 70-75% utilization, critical at 80-85%, and emergency at 90%+. Avoid alerting only at 95%—too late for planning.
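As a rough illustration of the alert levels described above, the following Python sketch classifies a utilization reading and measures disk usage with the standard library's `shutil.disk_usage`. The function names and the choice of the lower bound of each range (70%, 80%, 90%) are assumptions made for this example; a real deployment would normally express the same rules in the monitoring platform itself.

```python
import shutil

# Alert levels from the text: warning at 70-75%, critical at 80-85%, emergency at 90%+.
# Assumption: this sketch uses the lower bound of each range as the trigger.
ALERT_LEVELS = [(90.0, "emergency"), (80.0, "critical"), (70.0, "warning")]


def classify_utilization(used_percent: float) -> str:
    """Return the highest alert level whose threshold is met, or 'ok'."""
    for threshold, level in ALERT_LEVELS:
        if used_percent >= threshold:
            return level
    return "ok"


def disk_used_percent(path: str = ".") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    percent = disk_used_percent(".")
    print(f"Filesystem is {percent:.1f}% full: {classify_utilization(percent)}")
```

The same classification function can be reused for CPU, memory, or network readings, provided each is expressed as a percentage.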

Establish Capacity Thresholds and Response Procedures

Define the actions triggered by each capacity threshold:

• Green zone (0-70%): normal operations and a quarterly capacity review.
• Yellow zone (70-85%): capacity planning is initiated, expansion options and budget requirements are identified, and the timeline to reach critical capacity is calculated.
• Orange zone (85-95%): immediate capacity planning, expedited procurement, temporary mitigations (log rotation, archiving old data, increasing cloud quotas), and escalation to management for budget approval.
• Red zone (95%+): emergency procedures, non-critical services disabled, emergency capacity addition, and an incident declared.

Document the procedures so the operations team knows what to do when alerts fire.
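The zone model can be captured in a small lookup so that runbooks and tooling agree on which response applies. The sketch below is illustrative only: the `Zone` class and `zone_for` function are hypothetical names, and it assumes a reading that sits exactly on a boundary (70%, 85%, 95%) belongs to the higher zone, which the procedure itself does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    lower_bound: float  # inclusive lower edge of the zone, in percent
    response: str


# Ordered from highest to lowest so the first match wins.
ZONES = (
    Zone("red", 95.0, "emergency procedures, emergency capacity, declare incident"),
    Zone("orange", 85.0, "expedited procurement, temporary mitigations, escalate"),
    Zone("yellow", 70.0, "initiate capacity planning, estimate time to critical"),
    Zone("green", 0.0, "normal operations, quarterly capacity review"),
)


def zone_for(used_percent: float) -> Zone:
    """Map a utilization percentage to its response zone."""
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("utilization must be between 0 and 100 percent")
    for zone in ZONES:
        if used_percent >= zone.lower_bound:
            return zone
    raise AssertionError("unreachable: green zone starts at 0")


if __name__ == "__main__":
    for reading in (42.0, 78.5, 91.0, 97.2):
        zone = zone_for(reading)
        print(f"{reading:5.1f}% -> {zone.name}: {zone.response}")
```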

Project Future Capacity Needs Based on Growth Trends

Capacity planning is predictive, not reactive: analyze historical utilization trends (monthly growth rate over 6-12 months), correlate them with business metrics (user growth, transaction volume, data ingestion rates), factor in planned initiatives (new product launches, marketing campaigns, seasonal peaks like Diwali sales), and project when capacity thresholds will be reached. Use conservative assumptions (plan for 20% higher growth than the historical average). Calculate the lead time for capacity additions: cloud can scale quickly (hours), but physical servers may require 4-8 weeks (procurement, delivery, installation, testing). Initiate capacity expansion when utilization is projected to reach 80% within the next 2-3 months, accounting for lead time.
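A simple way to turn this guidance into a number is a linear projection over monthly utilization samples. The sketch below is a minimal illustration, not a forecasting method to rely on: it assumes roughly linear growth, applies the 20% conservative uplift mentioned above, and treats "accounting for lead time" as "lead time plus a two-month margin", which is an interpretation chosen for this example. All function and parameter names are hypothetical.

```python
import math


def monthly_growth(samples: list[float]) -> float:
    """Average month-over-month change in utilization, in percentage points."""
    if len(samples) < 2:
        raise ValueError("need at least two monthly samples")
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def months_until(samples: list[float], threshold: float = 80.0,
                 uplift: float = 1.2) -> float:
    """Months until `threshold` is reached, with growth inflated by `uplift`."""
    current = samples[-1]
    if current >= threshold:
        return 0.0
    growth = monthly_growth(samples) * uplift
    if growth <= 0:
        return math.inf  # flat or shrinking usage never reaches the threshold
    return (threshold - current) / growth


def should_initiate(samples: list[float], lead_time_months: float,
                    margin_months: float = 2.0, threshold: float = 80.0) -> bool:
    """True when the projected date falls inside lead time plus safety margin."""
    return months_until(samples, threshold) <= lead_time_months + margin_months


if __name__ == "__main__":
    history = [60.0, 63.0, 66.0, 69.0, 72.0]  # hypothetical monthly storage %
    print(f"Months until 80%: {months_until(history):.1f}")
    print(f"Start expansion now (2-month lead time): {should_initiate(history, 2.0)}")
```

For workloads with seasonal peaks or exponential growth, a linear fit will understate demand, so the result is best treated as a lower bound on urgency.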

Implement Capacity Optimization and Efficiency Measures

Before adding capacity, optimize existing usage: identify underutilized resources (servers at 10% CPU can be decommissioned or consolidated via virtualization), implement compression and deduplication for storage (which can reduce backup storage by 50-70%), archive old data to cheaper storage tiers (move 1-year-old logs to object storage), optimize applications (for example, inefficient database queries consuming excessive CPU or memory), and rightsize cloud resources (match instance sizes to actual usage patterns). Optimization extends existing capacity and reduces cost. Balance optimization efforts with timely expansion: do not delay critical capacity additions to squeeze out the last 5% of efficiency.

Document and Budget for Capacity Expansion

Run a formal capacity planning process: produce quarterly capacity reports showing current utilization, growth trends, projected capacity exhaustion dates, and recommended actions (add storage, upgrade network, expand cloud quotas). Include cost estimates for capacity additions and submit them to budget planning. Maintain a capacity roadmap of planned expansions over the next 12-24 months, aligned with business growth projections. Communicate capacity constraints to business stakeholders, and ask them to flag major initiatives (such as a customer onboarding campaign) in advance so IT can provision capacity. Capacity management is a partnership between IT and the business.

Monitor Capacity of Security Systems Specifically

Security systems have their own capacity requirements. SIEM log storage must accommodate log retention policies (CERT-In requires 180 days in India, and other compliance regimes may require years): calculate daily log volume × retention period × growth factor. IDS/IPS must inspect traffic at line rate without dropping packets, so monitor drop rates and upgrade before performance degrades. Security monitoring tools must process alerts in real time; a backlog of hours-old alerts defeats the purpose. Backup systems must complete within the backup window, so monitor backup duration trends and expand capacity before backups exceed the available time. Security cannot function without adequate capacity.

Audit Evidence

During your ISO 27001 certification audit, auditors will expect to see the following evidence to demonstrate compliance with A.8.6:

Documentation

Capacity management policy and procedures
List of critical systems with defined capacity thresholds
Capacity monitoring configuration and alerting rules
Quarterly capacity reports showing utilization trends and projections
Capacity expansion projects and budget approvals

Interviews

IT operations team about capacity monitoring processes
System administrators about how they respond to capacity alerts
Management about capacity planning and budget allocation

Observations

Review of capacity monitoring dashboards showing current utilization
Demonstration of capacity alerting when thresholds exceeded
Verification that critical systems have capacity headroom
Evidence of capacity planning based on historical trends

Practitioner Insights

A failure pattern I see across audits: SIEM storage fills up during a festive-season traffic spike, log ingestion silently stops for days, and an intrusion sails through the monitoring blind spot. Teams assume cloud storage auto-expands infinitely—it does not without configuration and budget approval. Always monitor security system capacity separately from general IT capacity because security failures have different consequences than performance slowness.

Surendra Pal Singh · CISO, DPO, CISA, ISO 27001, 27701, 42001 Lead Auditor

Capacity management is not just an IT operations concern; it is a security control. I see organizations where backup storage runs out, backups silently fail for weeks, and nobody notices until ransomware hits and they discover no recent backups exist. Treat backup capacity monitoring with the same urgency as production storage. Failed backups = lost data = business impact.

Saundhi Chauhan · ISO 27001, 27701 Lead Auditor

Common Challenges & Solutions

Challenge

Cloud services offer automatic scaling, making capacity management seem unnecessary.

Solution

Cloud auto-scaling prevents immediate outages but does not eliminate capacity management: (1) auto-scaling costs money, and unconstrained scaling during an attack or bug can cause budget overruns (set billing alerts and quotas), (2) cloud quotas and limits exist (AWS has soft and hard limits on instances, Google Cloud has quota requests requiring approval), (3) some resources do not auto-scale (some managed database storage requires manual expansion, route table entries have hard limits), (4) scaling events can indicate underlying issues (a sudden 10x spike may be an attack or application bug, not legitimate growth). Monitor cloud capacity and costs; auto-scaling is a safety net, not a replacement for planning.

Challenge

Historical growth trends are unreliable due to volatile business conditions or startup phase with exponential growth.

Solution

When historical data is insufficient or unreliable, use business-driven capacity planning (project capacity from business plans: launching in 3 new markets will add X users, a new product feature will increase transaction volume by Y%), scenario planning (model best-case, expected, and worst-case growth, and plan for the worst case), and a larger capacity buffer (instead of a 20% buffer, keep 50% headroom when uncertainty is high). For cloud workloads, use elastic architectures that can scale rapidly, which buys time to observe actual growth patterns. Review and adjust projections monthly during high-growth phases instead of quarterly.
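Scenario planning can be sketched as compound growth under several assumed monthly rates, sizing for the worst case. The rates and starting usage below are invented placeholders, the function names are hypothetical, and "50% headroom" is interpreted here as half of provisioned capacity remaining unused; other readings are possible.

```python
def projected_usage(current: float, monthly_rate: float, months: int) -> float:
    """Usage after `months` of compound growth at `monthly_rate` (0.10 = 10%)."""
    return current * (1.0 + monthly_rate) ** months


def required_capacity(current: float, scenarios: dict[str, float],
                      months: int, headroom: float = 0.5) -> float:
    """Capacity needed so the worst-case scenario still leaves `headroom` unused."""
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be a fraction in [0, 1)")
    worst = max(projected_usage(current, rate, months) for rate in scenarios.values())
    return worst / (1.0 - headroom)


if __name__ == "__main__":
    # Hypothetical inputs: 10 TB in use today, three assumed growth scenarios.
    scenarios = {"best": 0.02, "expected": 0.05, "worst": 0.10}
    for name, rate in scenarios.items():
        print(f"{name:>8}: {projected_usage(10.0, rate, 12):.1f} TB after 12 months")
    print(f"Provision: {required_capacity(10.0, scenarios, 12):.1f} TB")
```

Rerunning the calculation monthly with observed growth in place of the assumed rates keeps the projection honest during high-growth phases.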

Challenge

Capacity alerts fire constantly (alert fatigue) causing operations to ignore or disable alerts.

Solution

Tune alerting thresholds and frequency: (1) use multi-level thresholds (warning at 75%, alert at 85%, critical at 95%) to reduce noise from transient spikes, (2) require a sustained threshold violation (alert only if utilization stays above 85% for 30+ minutes, not on momentary peaks), (3) alert on growth rate (a disk filling 5% per day will hit critical in 3 days) in addition to absolute levels, (4) suppress alerts during known events (maintenance windows, scheduled batch jobs), (5) route alerts to different destinations based on severity (critical alerts page the on-call engineer, warnings go into a daily email digest). The goal is alerts that are actionable and timely, not overwhelming.
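Two of these tuning ideas, sustained violations and growth-rate alerts, are easy to express in code. The sketch below is illustrative: it assumes readings arrive every 5 minutes, so six consecutive readings approximate the 30-minute window, and the function names are hypothetical. Most monitoring platforms offer equivalent built-in options, which are usually preferable to custom logic.

```python
import math


def sustained_breach(samples: list[float], threshold: float = 85.0,
                     min_samples: int = 6) -> bool:
    """True if the most recent `min_samples` readings are all above `threshold`."""
    if len(samples) < min_samples:
        return False  # not enough history to call the breach sustained
    return all(value > threshold for value in samples[-min_samples:])


def days_until_limit(used_percent: float, growth_per_day: float,
                     limit: float = 95.0) -> float:
    """Days until `limit` is reached at a constant daily growth rate."""
    if used_percent >= limit:
        return 0.0
    if growth_per_day <= 0:
        return math.inf
    return (limit - used_percent) / growth_per_day


if __name__ == "__main__":
    readings = [80.0, 90.0, 91.0, 92.0, 93.0, 94.0, 96.0]  # hypothetical CPU %
    print("Sustained breach:", sustained_breach(readings))
    print("Days until critical:", days_until_limit(80.0, 5.0))
```

A momentary dip below the threshold resets the sustained check, which is what filters out short spikes.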

Challenge

Budget constraints prevent capacity expansions even when monitoring shows critical need.

Solution

Build a business case for capacity investment: quantify the risk of capacity exhaustion (revenue loss from an outage, compliance fines from log retention failure, recovery costs from backup failure, security incident costs from monitoring blind spots), compare that risk to the capacity investment (capacity costs are often a fraction of potential incident costs), provide options at different price points (basic expansion meeting immediate need vs. optimal expansion with future headroom), and escalate to executive management with clear risk acceptance (if capacity is not approved, document the residual risk that leadership must accept). Capacity is not an optional nice-to-have; it is risk management.

Challenge

Shadow IT and unapproved systems escape capacity monitoring, causing unexpected resource exhaustion.

Solution

Discover all systems consuming resources: use network scanning to find devices, cloud asset inventory to identify all instances and services, expense management to find unapproved cloud accounts (corporate cards used for AWS or Azure), and regular asset reconciliation comparing discovered systems to the official inventory. Implement policies requiring capacity approval: all new systems must go through capacity planning, and all cloud resources must be tagged with a cost center and owner for monitoring. Integrate capacity monitoring with asset management, because you cannot monitor the capacity of systems you do not know exist.
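The reconciliation step amounts to a set comparison between what discovery finds and what the inventory lists. The sketch below shows the idea with made-up hostnames; the `reconcile` function and its result keys are hypothetical, and real inputs would come from scanner exports and the asset register.

```python
def reconcile(discovered: set[str], inventory: set[str]) -> dict[str, set[str]]:
    """Compare discovered systems against the official inventory."""
    return {
        # Running but unknown to the inventory: likely unmonitored shadow IT.
        "unmonitored": discovered - inventory,
        # Inventoried but not seen by discovery: possibly retired or unreachable.
        "not_seen": inventory - discovered,
    }


if __name__ == "__main__":
    # Hypothetical identifiers for illustration only.
    discovered = {"app-01", "db-01", "test-vm-07", "marketing-bucket"}
    inventory = {"app-01", "db-01", "siem-01"}
    result = reconcile(discovered, inventory)
    print("Add to capacity monitoring:", sorted(result["unmonitored"]))
    print("Investigate:", sorted(result["not_seen"]))
```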

Frequently Asked Questions

What capacity utilization threshold should trigger capacity planning?

Common practice: initiate capacity planning at 70-75% utilization, approve and procure at 80-85%, complete expansion before reaching 90%. Thresholds vary by resource type: storage can run higher (85-90%) as expansion is relatively quick; CPU/memory should stay lower (70-80%) as performance degrades non-linearly at high utilization. For security systems (SIEM, backups, logs), be more conservative (start planning at 60-70%) because failures have compliance and incident response consequences. Account for lead time: if capacity expansion takes 2 months, initiate when projected to reach threshold in 3-4 months.

How do we monitor capacity of SaaS applications where we do not control infrastructure?

For SaaS services: monitor quota usage exposed by vendor (Salesforce storage limits, Office 365 mailbox quotas, Google Workspace user limits), track application-level metrics (number of users approaching license count, API rate limit consumption), review vendor service health dashboards for capacity issues, and include capacity in SLA reviews with vendor. Request capacity roadmap from vendor for critical services. For highly critical SaaS, negotiate contractual SLAs requiring vendor notification if approaching capacity constraints. Do not assume vendor handles all capacity—you must monitor usage against purchased capacity.

Should capacity monitoring be handled by IT operations team or security team?

It is a shared responsibility. IT operations owns general capacity management (servers, storage, network) because it manages the infrastructure. The security team owns capacity for security systems (SIEM, IDS/IPS, DLP, backups, security logs) and must monitor it independently to ensure security functions are not compromised by capacity issues. Security should receive capacity alerts for security-relevant systems even if operations monitors them. Coordination is required: security log growth affects storage capacity (an operations concern) and security monitoring capability (a security concern). Both teams must understand the dependencies.

How does capacity management relate to ISO 27001 availability objectives?

Capacity management directly supports availability: insufficient capacity causes service outages violating availability objectives in Statement of Applicability and SLAs. ISO 27001 auditors expect: documented capacity monitoring for critical systems, evidence of capacity planning preventing exhaustion, capacity included in risk assessments (risk: inadequate storage causes backup failures), and capacity incidents tracked and reviewed. Capacity-related availability failures are control failures during audits. Maintain capacity metrics showing uptime and performance meeting objectives; use capacity management to prevent availability incidents proactively.

What capacity planning is needed for log retention to meet CERT-In and compliance requirements?

Calculate log storage requirements: identify all systems generating security logs (servers, network devices, applications, cloud services), measure daily log volume per system (use a representative sampling period), multiply by the retention period (CERT-In requires 180 days, and other compliance requirements may call for 1-7 years), add a 50% buffer for growth and for variation in compression efficiency, and provision the storage capacity. Example: 100 GB daily logs × 180 days retention × 1.5 buffer = 27 TB storage. Review quarterly: as systems are added or log verbosity increases, storage needs grow. Use tiered storage: recent logs (30 days) on fast SSD for SIEM queries, older logs on cheaper object storage for compliance retention.
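The worked example can be reproduced, and split across storage tiers, with a few lines of Python. This is a sketch under stated assumptions: it uses decimal units (1 TB = 1000 GB) to match the 27 TB figure, applies the same 1.5 buffer to both tiers, and uses hypothetical function names.

```python
def log_storage_tb(daily_gb: float, retention_days: int, buffer: float = 1.5) -> float:
    """Total storage in TB: daily volume x retention x buffer (1 TB = 1000 GB)."""
    return daily_gb * retention_days * buffer / 1000.0


def tiered_storage_tb(daily_gb: float, retention_days: int, hot_days: int = 30,
                      buffer: float = 1.5) -> tuple[float, float]:
    """Split the total into (hot, cold) TB, with the newest `hot_days` kept hot."""
    hot = log_storage_tb(daily_gb, min(hot_days, retention_days), buffer)
    total = log_storage_tb(daily_gb, retention_days, buffer)
    return hot, total - hot


if __name__ == "__main__":
    total = log_storage_tb(100, 180)
    hot, cold = tiered_storage_tb(100, 180)
    print(f"Total: {total} TB (hot SSD: {hot} TB, cold object storage: {cold} TB)")
```

With the figures from the example, 30 days of hot storage accounts for 4.5 TB of the 27 TB, leaving 22.5 TB for the cheaper tier.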

How do we handle sudden, unexpected capacity demands from security incidents or attacks?

Maintain emergency capacity headroom: (1) keep 20-30% unused capacity buffer on critical systems for unexpected spikes, (2) establish emergency capacity expansion procedures (approved vendors, pre-negotiated terms, expedited procurement), (3) use cloud burst capacity where on-premises workloads can temporarily shift to cloud during emergencies, (4) have temporary mitigation playbooks (what non-critical services can be disabled to free capacity during incident). DDoS attacks, malware outbreaks, and incident investigations often require rapid capacity expansion. During active incidents, normal procurement processes are too slow—have pre-approved emergency capacity budget and processes.

Related Controls

Inventory of information and other associated assets

Information transfer

Information backup

Documented operating procedures
