M-209 cipher machine.

Operations Domain

Uptime Institute Data Center Site Infrastructure Tier Standard: Topology

| Feature | Tier I | Tier II | Tier III | Tier IV |
|---|---|---|---|---|
| Active components supporting IT load | N | N+1 | N+1 | 2N or 2N+1 (the point is to always have N after any failure) |
| Distribution paths | 1 | 1 | 1 active and 1 alternate | 2 simultaneously active |
| Concurrently maintainable | | | Yes | Yes |
| Fault tolerant | | | | Yes |
| Compartmentalization | | | | Yes |
| Continuous cooling | | | | Yes |
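The "always have N after any failure" idea is just arithmetic on unit counts. Here is a minimal sketch that only counts active components; it ignores distribution paths and everything else that separates the tiers. The function names are mine, not anything from the Uptime Institute.

```python
def installed_units(n: int, scheme: str) -> int:
    """Number of units installed for a load that needs n units (hypothetical helper)."""
    schemes = {"N": n, "N+1": n + 1, "2N": 2 * n, "2N+1": 2 * n + 1}
    return schemes[scheme]


def still_has_n(n: int, scheme: str, failed: int = 1) -> bool:
    """True if at least n units remain after `failed` units are lost."""
    return installed_units(n, scheme) - failed >= n


if __name__ == "__main__":
    # A load that needs 4 units, losing one unit under each scheme.
    for scheme in ("N", "N+1", "2N", "2N+1"):
        print(scheme, still_has_n(4, scheme))
```

With plain N, a single failure already drops you below what the load needs. With 2N you can lose an entire set of N units and still carry the load.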

Environmental controls

Temperature: min 18 °C (64.4 °F), max 27 °C (80.6 °F)
Humidity: min 40%, max 60%
Raised floor height: at least 24"
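If you wanted to check sensor readings against those limits, it might look something like the sketch below. The limits come from the list above; the function name and the idea of returning a list of problems are my own assumptions, not part of any standard.

```python
# Limits taken from the environmental controls listed above.
TEMP_RANGE_C = (18.0, 27.0)
HUMIDITY_RANGE_PCT = (40.0, 60.0)


def check_environment(temp_c: float, humidity_pct: float) -> list[str]:
    """Return a list of out-of-range findings (empty if everything is fine)."""
    problems = []
    low, high = TEMP_RANGE_C
    if not low <= temp_c <= high:
        problems.append(f"temperature {temp_c} C outside {low}-{high} C")
    low, high = HUMIDITY_RANGE_PCT
    if not low <= humidity_pct <= high:
        problems.append(f"humidity {humidity_pct}% outside {low}-{high}%")
    return problems


if __name__ == "__main__":
    print(check_environment(22.0, 50.0))
    print(check_environment(30.0, 35.0))
```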

UPS and Generators

I've done work for Cummins, so here we go much further down the rabbit hole than is appropriate. Ignore this when preparing for the test:

  • LNG or Liquefied Natural Gas is a cryogenic fluid, about 99% methane or CH4. It's good for bus engines and is stored in a Dewar tank, like a giant Thermos bottle. It has good energy density and is easy to handle and store.
  • CNG or Compressed Natural Gas might be vaporized LNG or maybe compressed pipeline gas, which might be more like 95-96% methane. It's stored at 5,000 PSI, so it needs sturdy tanks: either heavy steel or aluminum internal skin with carbon fiber overwrap, which is expensive.
  • Propane is CH3-CH2-CH3 and automotive spec propane is pretty pure. The bottles for grills and heaters are less pure, with other hydrocarbons in the mix. Propane is stored at 30-50 PSI, so the tanks still need periodic hydrostatic testing, but it's relatively cheap.
  • LP or Liquefied Petroleum Gas is a mix of propane and other hydrocarbons, mostly heavier.
  • Gas engines (meaning LNG/CNG/propane, not gasoline) take longer than diesel to get to full rated power, so the data center UPS will have to support full load a little longer as the genset spins up. One option might be something like magnetically levitated flywheels spinning generators.

Time / Frequency Concepts

"About twice a year we have a major storage failure. We make backups nightly starting at 1 AM. Our goal is to get data restored within 1 hour. If we went 8 hours without data, our company would financially suffer. Over the past year, our data recovery process has averaged 41 minutes. While recovering one file system, we need at least 80% normal performance on the other unaffected file systems." For that story:
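One way to read the story is to map each sentence onto a recovery metric. The sketch below does that in Python. The mapping to MTBF, RPO, RTO, MTD, MTTR, and RSL is my own reading of the story, and the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryMetrics:
    mtbf_hours: float    # mean time between failures
    rpo_hours: float     # recovery point objective (worst-case data loss window)
    rto_hours: float     # recovery time objective (the goal)
    mtd_hours: float     # maximum tolerable downtime
    mttr_hours: float    # mean time to repair (what was actually measured)
    rsl_fraction: float  # recovery service level during recovery

    def availability(self) -> float:
        """Steady-state availability, MTBF / (MTBF + MTTR)."""
        return self.mtbf_hours / (self.mtbf_hours + self.mttr_hours)

    def is_consistent(self) -> bool:
        """Measured recovery should meet the goal, and the goal should beat the MTD."""
        return self.mttr_hours <= self.rto_hours <= self.mtd_hours


story = RecoveryMetrics(
    mtbf_hours=365 * 24 / 2,  # "about twice a year we have a major storage failure"
    rpo_hours=24,             # nightly backups, so up to a day of changes at risk
    rto_hours=1,              # "our goal is to get data restored within 1 hour"
    mtd_hours=8,              # "if we went 8 hours without data..."
    mttr_hours=41 / 60,       # "averaged 41 minutes"
    rsl_fraction=0.80,        # "at least 80% normal performance"
)

if __name__ == "__main__":
    print(f"consistent: {story.is_consistent()}")
    print(f"availability: {story.availability():.5f}")
```

The ordering check is the useful part: the measured 41 minutes is inside the 1 hour goal, which in turn is well inside the 8 hours the company could tolerate.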

Maintenance Mode

Used when updating or reconfiguring the host, that is, the physical machine where the hypervisor runs.

Clustered Hosts and Resource Sharing

"Everyone gets a sandwich, and Elite customers get 2. No one can have more than 4 sandwiches. After everyone gets their promised sandwiches, we'll fairly distribute what's left over."
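The sandwich story can be sketched as code. Here the promised sandwiches play the role of a reservation, the cap of 4 is a limit, and the leftovers are handed out one at a time in round-robin order. That round-robin rule is my assumption about what "fairly" means; a real hypervisor would typically weight this by shares. The function and parameter names are made up.

```python
def allocate_sandwiches(customers: dict[str, bool], total: int, limit: int = 4):
    """customers maps name -> is_elite. Returns (allocation, undistributed leftovers)."""
    # Reservation: everyone gets 1, Elite customers get 2.
    alloc = {name: (2 if elite else 1) for name, elite in customers.items()}
    remaining = total - sum(alloc.values())
    if remaining < 0:
        raise ValueError("not enough sandwiches to honor the promises")

    # Leftovers: hand out one at a time, skipping anyone already at the limit.
    while remaining > 0:
        eligible = [name for name in alloc if alloc[name] < limit]
        if not eligible:
            break  # everyone is at the limit, so the rest goes unused
        for name in eligible:
            if remaining == 0:
                break
            alloc[name] += 1
            remaining -= 1
    return alloc, remaining


if __name__ == "__main__":
    people = {"alice": True, "bob": False, "carol": False}  # alice is Elite
    print(allocate_sandwiches(people, total=9))
    print(allocate_sandwiches(people, total=20))
```

Note that the function refuses to run if the promises can't be met, which is the same reason a cluster won't power on a VM whose reservation it can't satisfy.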

SDN Orchestration

What follows is far deeper than you need to know for the test, but cloud services like Google Cloud, AWS, and Microsoft Azure must use SDN. Here's what the AWS dashboard shows you of the orchestration parts of a multi-VM deployment with network orchestration. Amazon calls this "CloudFormation". Here we're starting multiple:

AWS dashboard view of SDN (or software-defined networking) orchestration

Many thanks to Carter Elmore for the screenshot!
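To give a feel for what CloudFormation orchestrates, here is a rough sketch that builds a tiny template as a Python dictionary and prints it as JSON. It is not the deployment in the screenshot. The resource types (`AWS::EC2::VPC`, `AWS::EC2::Subnet`, `AWS::EC2::Instance`) are real CloudFormation types, but the logical names, the CIDR blocks, the instance type, and especially the image ID are placeholders I made up.

```python
import json


def build_template(vm_count: int) -> dict:
    """Build a minimal CloudFormation-style template: one network, several VMs."""
    resources = {
        # The software-defined network pieces.
        "DemoVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        },
        "DemoSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": {"Ref": "DemoVpc"}, "CidrBlock": "10.0.1.0/24"},
        },
    }
    # The VMs, each attached to the subnet defined above.
    for i in range(1, vm_count + 1):
        resources[f"DemoVm{i}"] = {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-PLACEHOLDER",  # hypothetical, not a real image ID
                "InstanceType": "t3.micro",
                "SubnetId": {"Ref": "DemoSubnet"},
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative multi-VM deployment with its own network",
        "Resources": resources,
    }


template = build_template(3)

if __name__ == "__main__":
    print(json.dumps(template, indent=2))
```

The `Ref` entries are the orchestration part: they tell the service that the subnet depends on the VPC and the VMs depend on the subnet, so it can create everything in the right order.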