The Cloud and Security
Cloud computing is a hot technology area and behind a large and growing business sector. But cloud computing is not new. John McCarthy suggested that "computation may someday be organized as a public utility" in a speech at MIT's centennial in 1961.
And, what people refer to as "cloud computing" often is not computing in the traditional sense.
So, what is it?
What is "The Cloud?"
Networks have been drawn as abstract clouds for ages. When you draw a network map, you don't bother showing all the Ethernet cables. Each puffy cloud represents a local-area network with all its component hosts, Ethernet switches, and interconnecting cables. Routers are little boxes connecting these clouds together.
Now, for today's cloud computing, cloud services, cloud storage and so on, what people really mean is that the computers and storage devices are far away and owned by someone else. You may not have to think about them in much detail, and even if you try, you don't get to see the actual hardware. It definitely remains an abstraction.
Management loves this concept because, in their terms, they can trade off a capital cost against an operational cost.
In simple terms, with cloud computing you don't buy computers and storage, you rent them.
Larry Ellison complained, "The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do.... I don't understand what we would do differently in the light of cloud computing other than change the wording of some of our ads."
Cloud computing is very much like the outsourcing and remote hosting that has been done for years and years. It's used for storage and network services, and to a growing extent for high performance computation.
The difference between what came before and today's cloud offerings is that some current cloud providers have enormous network bandwidth and reserve virtualized hosts, to the point that you can (almost) always add more hosts and you (almost) never notice delays or bandwidth limitations. What's more, it's "pay as you go": you deploy as many cloud-based hosts, or "instances", as you would like, and use them for as long or for as little time as you would like. You deploy your own cloud resources and you pay only for what you use. Most organizations are going with the impressively powerful and reasonably priced major cloud services such as Amazon Web Services.
Users typically start by using a web-based dashboard type of interface. Once they become serious, they typically migrate to command-line tools and API calls from their programs.
Where is this "Cloud"?
Manhattan offers some unusual opportunities to see just a little of the physical form of the cloud. At least you can see some buildings that play significant roles in the infrastructure of the Internet, including major cloud computing sites for both Amazon and Google.
Amazon is the biggest player by far in cloud computing. They have operations in North America, South America, Australia, Asia, and Europe.
Amazon's US-East (Northern Virginia) geographic region is nominally based very close to Washington D.C., the home of their biggest customer. Their US-East (N VA) operation has its largest single point at Ashburn, Virginia, but it is distributed throughout many locations from the east coast to the Mississippi River, plus Texas. Amazon's other North American geographic regions are US-East (Ohio); US-West (Northern California), with multiple sites in the San Francisco Bay area plus south through California; US-West (Oregon), which probably also has sites around Seattle and Vancouver; and Canada. Their Asian and Pacific geographic regions are nominally based in Tokyo, Seoul, Beijing, Singapore, Mumbai, and Sydney. Their European regions are nominally based in Dublin, London, and Frankfurt. São Paulo is their main South American site. In all of these regions they have multiple "edge locations" or geographically distributed facilities.
Google is another example of a very large and geographically distributed cloud provider.
Other things we can consider as "the cloud" include e-mail services run by major ISPs and some large web-hosting operations like GoDaddy. Major ISP mail servers may be near their headquarters, such as mx1.comcast.net in eastern Pennsylvania. GoDaddy, which I think really is fair to consider as Platform-as-a-Service (or PaaS) if you are doing heavy PHP or other server-side programming, is in Arizona.
For almost all of us, those services are "out there in the cloud", not distributed around as with Amazon but certainly abstracted far away and out of sight.
What is Unique or New About Cloud Security?
There is absolutely nothing unique or new about the technology involved in cloud security. The technology used for security in the cloud is exactly what you should already be using to protect your data in-house. Encryption to provide data confidentiality, hash functions to detect unauthorized modification or the violation of data integrity, a variety of technologies for user and host authentication (mostly cryptography) and access control (cryptography plus rule-based systems), and best practices to maintain availability of data and the computing and network resources.
The difference is that you are turning over control, and therefore visibility, of some of the processes to a provider. This is the only difference, but it is a significant one!
Exactly what control and visibility you surrender depends on what combination of cloud technologies you use. The more you give up, the harder it will be to achieve regulatory compliance.
My impression is that management often wants not only to avoid buying hardware but also to avoid paying for staff. That pushes them toward Software-as-a-Service (or SaaS) offerings like Google Apps, Google Docs, Microsoft Office 365, and so on. The problem is that the cloud provider takes complete control of these SaaS offerings, and you cannot even see what they are doing. You have no way of proving to an auditor that the right things are being done.
With Infrastructure-as-a-Service (or IaaS) you are effectively renting the use of servers in the cloud. You must provide the skilled system administration staff, but the benefit is that you are in almost complete control (outside of physical security and electrical and network connectivity, things that cloud providers tend to be much better at than the overwhelming majority of their customers). You have to have some skilled staff doing work, but you have control and visibility, and therefore you can show that you are doing the right things in the right way.
How secure is the cloud?
The cloud environment is new and at least a little different to most people, and therefore it seems threatening. People worry: Just how risky is the cloud?
The answer in most situations comes as a surprise, to the extent that many people are hard to convince:
Used correctly, the cloud probably provides better information security than you have in house.
Really. Let's look at availability first:
Think about this: When was the last time that Amazon was down? Or Google? And how long were they down?
Yes, in the period of December 2010 through November 2012, Amazon had five well-publicized outages, or at least degradations. However, these events seem to have been well publicized but poorly understood by much of the cloud market.
The first of these was a brief degradation of service on the first of December, 2010. A mirror of the Wikileaks archive of U.S. State Department cables, the so-called "cablegate" collection, was running on a server in Amazon's US-East region. This became publicized, and the U.S. Government, Amazon's largest customer, told them to halt that server. Amazon did so, and became (along with MasterCard, Visa, and a few other perceived "enablers" of the U.S. Government) a target of a DDOS attack.
The second was in April 2011, which Amazon has described in detail as being a "mirroring storm" at its core. This also affected the US-East region, specifically the Virginia site.
The third was in the first week of August, 2011, when the electric utility provider in Dublin, Ireland suffered a failure of a 110 kV 10 MW transformer, resulting in a total loss of power to all its customers. This included both Amazon's and Microsoft's primary data centers. Amazon has said that their backup generators failed due to faulty programmable logic controllers, the UPS batteries quickly drained, and power was lost to almost all EC2 instances and 58 percent of the EBS volumes in that Availability Zone. Amazon has also provided a detailed description of this event.
The fourth was caused by the June 2012 derecho, which left about a million customers without electrical power in Virginia and took down 911 service for a while. See Amazon's detailed description for more on this one.
The fifth, in October 2012, was yet another event that Amazon described as "a small number of Amazon Elastic Block Store (EBS) volumes in one of our five Availability Zones in the US-East Region began seeing degraded performance, and in some cases, became 'stuck' (i.e. unable to process further I/O requests)." The root cause was a bug in an operational data collection agent. Coupled with this, a server was replaced after a hardware failure but the accompanying DNS record change was not propagated to all internal DNS servers. That led to said data collection agent repeatedly attempting to contact the missing server. While the agent is tolerant of missing data, the bug caused a memory leak, consuming memory on the servers where this agent was installed.
Amazon very strongly suggests that customers deploy instances in and balance activity across multiple availability zones per region. Most customers who followed that recommendation noticed little to no service degradation in the first two events. The second did spread briefly to other Availability Zones due to resource depletion, something that Amazon says they have addressed in a partial redesign.
The third, the Dublin power loss, seems to have been the worst outage. It also degraded service in all EU-West Availability Zones because the EC2 management service is distributed across all zones.
Near-worst case: Some customers had EC2 servers running within one Availability Zone only, and they were unreachable for 24-48 hours.
Worst case: Some customers were using EBS storage within one Availability Zone only, and in the second event a small percentage of data was irretrievably lost. As Amazon reported, "Ultimately, 0.07% of the volumes in the affected Availability Zone could not be restored for customers in a consistent state."
Outcome for the cautious user: With servers in multiple Availability Zones, and S3 storage (which is automatically distributed across multiple Availability Zones), there was probably nothing noticeable in the first event, a minor degradation in the second event, and a more major service degradation in the third event.
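The benefit of spreading servers across Availability Zones can be sketched with simple arithmetic. This is only a rough illustration: it assumes zones fail independently, which the Dublin event shows is not strictly true, and the per-zone availability figure and the function name are made up for the example rather than taken from Amazon.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one zone is up, ASSUMING independent failures."""
    if not 0.0 <= per_zone <= 1.0:
        raise ValueError("per_zone must be a probability between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # All zones must be down at once for the service to be unreachable.
    return 1.0 - (1.0 - per_zone) ** zones


if __name__ == "__main__":
    hypothetical_zone_availability = 0.995  # illustrative only, not an Amazon figure
    for n in (1, 2, 3):
        a = combined_availability(hypothetical_zone_availability, n)
        print(f"{n} zone(s): combined availability {a:.7f}")
```

Under these assumptions each added zone multiplies the chance of a total outage by the per-zone failure probability. Shared components, such as a management service that spans zones, make the real improvement smaller.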
Amazon's S3 or Simple Storage Service quotes 99.99% availability of data at any moment, and a long-term durability of 99.999999999%. This is done by redundant storage across multiple facilities within a geographic region, designed to withstand the concurrent loss of data in two facilities. That costs just US$ 0.14 per gigabyte per year. Actually that's the maximum cost; the price drops with increased volume.
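To make those percentages concrete, here is a small worked calculation. It is a sketch under two assumptions of mine: that the availability figure applies uniformly over a 365-day year, and that the durability figure is an annual per-object survival probability. The function names are hypothetical.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: str) -> Fraction:
    """Minutes per year the data may be unreachable at the quoted availability."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return unavailable * MINUTES_PER_YEAR


def expected_objects_lost_per_year(durability_percent: str, objects: int) -> Fraction:
    """Expected objects lost per year, ASSUMING durability is an annual per-object rate."""
    annual_loss_rate = 1 - Fraction(durability_percent) / 100
    return annual_loss_rate * objects


if __name__ == "__main__":
    minutes = downtime_minutes_per_year("99.99")
    print(f"99.99% availability: up to {float(minutes):.2f} minutes unreachable per year")

    lost = expected_objects_lost_per_year("99.999999999", 10_000_000)
    print(f"10,000,000 objects: about one loss every {float(1 / lost):,.0f} years")
```

Exact fractions are used instead of floats because subtracting 0.99999999999 from 1 in floating point loses precision at exactly the digits that matter here.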
Next, what about user authentication, which can lead to inappropriate access if you don't do it right?
Let's consider a standalone system at your facility. Let's say it runs Linux, for meaningful comparison to Amazon's most popular EC2 (or Elastic Compute Cloud) server platform.
User accounts and their authentication can be defined in a local database, with the sensitive bits in the shadow file and the non-sensitive bits in the password file. The shadow file should be readable only by the superuser and the operating system, so we should be OK there.
User accounts can also be stored in a remote database communicated over NIS/NIS+, LDAP or LDAP/S, and even SMB/CIFS from an Active Directory server. But unless you're using LDAP/S or encrypting all traffic with IPsec, anyone with a network connection can capture the network traffic and save the password hashes. Hash functions are not reversible, but with some awareness of what tiny subset of the possible password space is actually used by users, an off-line attack can find the corresponding passwords.
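The off-line attack described above needs no reversal of the hash function, only guessing. The following is a minimal sketch of the idea. The salted SHA-256 scheme, the salt, the word list, and the function names are all stand-ins chosen for illustration; real systems store hashes in other formats, usually with many more rounds.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Illustrative salted hash, NOT the format used by any real login system."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def offline_guess(captured_hash: bytes, salt: bytes, candidates):
    """Hash each candidate and return the first one that matches, or None."""
    for guess in candidates:
        if hmac.compare_digest(hash_password(guess, salt), captured_hash):
            return guess
    return None


if __name__ == "__main__":
    salt = b"demo-salt"  # hypothetical value
    captured = hash_password("letmein", salt)  # stands in for a sniffed hash
    common_passwords = ["123456", "password", "qwerty", "letmein"]
    print(offline_guess(captured, salt, common_passwords))
```

The attacker never contacts the server after capturing the hash, so account lockouts and login rate limits do not help. Only strong passwords and slow hash functions do.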
Then there are risks of physical access. Someone can boot a machine from media, at which point they completely own it. Or even steal the hardware. Yes, your servers are locked in a room. And your physical security is perfect? Congratulations!
Then there are all the network authentication methods using cleartext logins and passwords. Get the dsniff tool for easy capture and collection of logins and passwords from at least this list of application layer protocols listed on its manual page: FTP, Telnet, SMTP, HTTP, POP, poppass, NNTP, IMAP, SNMP, LDAP, Rlogin, RIP, OSPF, PPTP MS-CHAP, NFS, VRRP, YP/NIS, SOCKS, X11, CVS, IRC, AIM, ICQ, Napster, PostgreSQL, Meeting Maker, Citrix ICA, Symantec pcAnywhere, NAI Sniffer, Microsoft SMB, Oracle SQL*Net, Sybase and Microsoft SQL protocols.
All of these remote and local authentication protocols can use multiple methods of authentication thanks to PAM, or Pluggable Authentication Modules. So local authentication could involve some combination of the keyboard, a smart card reader, and a fingerprint scanner.
But here's the thing about PAM: It can make things more secure if you manage to get it exactly right. But you can make things far worse if you get it wrong, possibly far worse in a way you never notice until you're trying to figure out how you got hacked.
Summary so far: In-house systems require a lot of work to harden user authentication.
Let's compare that complicated mess to authentication into a cloud-based server. I'll use Amazon EC2 as a good example of how to do this:
1: Authenticate into SSH, using the SSH-2 protocol only, authenticating only via cryptographic key based authentication.
And... that's it!
There is no other way to get into the AWS EC2 server. Sure, once you deploy it you can get in and open it up, turning on Telnet and setting the administrator's password to the literal word "password" and other nonsense, but what they give you is relatively easy to secure and maintain because there are very few opportunities for an attacker.
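If you want to confirm that a server you manage still allows only key-based SSH logins, a quick audit of its SSH daemon configuration is one way to check. The sketch below is not Amazon's configuration or tooling. It reads the standard OpenSSH keywords PasswordAuthentication and PubkeyAuthentication from configuration text, assumes both default to "yes" when absent, and ignores Match blocks and keyboard-interactive settings. The function names are hypothetical.

```python
def effective_sshd_options(config_text: str) -> dict:
    """Map lowercase keyword -> value for global settings; the first value seen wins."""
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # simplification: drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # this sketch only reads the global section
        value = parts[1].strip() if len(parts) > 1 else ""
        options.setdefault(keyword, value)
    return options


def key_only_login(config_text: str) -> bool:
    """True if passwords are disabled and public keys are not disabled."""
    opts = effective_sshd_options(config_text)
    # Assumed defaults when a keyword is absent: both "yes".
    password_off = opts.get("passwordauthentication", "yes").lower() == "no"
    pubkey_on = opts.get("pubkeyauthentication", "yes").lower() != "no"
    return password_off and pubkey_on


if __name__ == "__main__":
    sample = "PasswordAuthentication no\nPubkeyAuthentication yes\n"
    print(key_only_login(sample))
```

A check like this catches the "open it up" mistakes early, before someone else finds them for you.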
As far as you can tell, your server is on its own VLAN behind a router with packet filtering rules. Yes, really there's virtualization and multitenancy, but you have to look fairly hard to notice it.
Finally, information security in the cloud is exactly like information security anywhere else. Data confidentiality can be protected with encryption, data integrity violations can be detected with hash functions, and we have no cryptographic tools (and therefore no math, and therefore no numbers) for data availability and so we cannot formally compare availability levels.
Encryption is still encryption: select good ciphers and good protocols using them, and be very careful with your key management. Hash functions are still the tool to verify integrity; collisions are possible with any hash but the probability of two different files colliding under two hash functions is so low that I haven't noticed anyone even trying to figure out just how low it is.
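Checking integrity under two hash functions takes only a few lines. This sketch uses SHA-256 and SHA3-256 from Python's standard library as my own choice of pair. The function names are hypothetical, and the recorded fingerprints are assumed to be stored somewhere an attacker cannot modify.

```python
import hashlib


def fingerprints(data: bytes) -> dict:
    """Digest the same data under two unrelated hash functions."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha3_256": hashlib.sha3_256(data).hexdigest(),
    }


def unchanged(data: bytes, recorded: dict) -> bool:
    """True only if BOTH digests still match the recorded values."""
    return fingerprints(data) == recorded


if __name__ == "__main__":
    original = b"quarterly report, final version"
    recorded = fingerprints(original)  # store these somewhere trustworthy
    print(unchanged(original, recorded))
    print(unchanged(original + b" (edited)", recorded))
```

An undetected change would have to collide under both functions at once, which is the situation the paragraph above calls too unlikely to bother calculating.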
There is no magic security dust, for the cloud or otherwise.
Remember that the cloud computing service model of SaaS, PaaS, and IaaS defines who has responsibility for the hardware and software at the cloud provider or service end. The customer is always responsible for everything at the client end.
|Maintained by|Software / Hardware|
|Provider|Programming environment: PHP, Perl, Python, .NET, Java, MySQL/SQL|
||Operating system: Linux, Windows, Solaris|
||Virtualization: Xen, VMware, KVM|
||Hardware platform: computers, switches, routers, HVAC, facility|
Hardware platform and virtualization are entirely maintained by the provider.
Example services: Google App Engine, Microsoft SQL Azure
Cloud Computing and the USA PATRIOT Act
The USA PATRIOT Act makes cloud providers' promises of confidentiality a sad joke.
In June 2011, Microsoft's U.K. managing director publicly admitted that Microsoft will hand over data stored in any of its world-wide facilities when asked to do so by the U.S. Government.
- Even if the customer is an EU company
- Even if the data subjects are EU citizens
- Even if the Microsoft data center is located within the E.U.
It gets worse: there is no guarantee that the cloud customers or data subjects would be notified if the demand was accompanied by a gag order, injunction, or U.S. National Security Letter (which almost certainly would be the case).
The same is true, of course, for Amazon, Google, or any other U.S.-based company. Microsoft is simply the first one to admit this out loud.
Politico did a great story about the impact of the USA PATRIOT Act on U.S. corporations attempting to do business in the E.U., or really anywhere outside the U.S. with any concern about privacy and data confidentiality.
Wired also had a story on this.
The great Dark Reading mailing list discussed how further regulations being considered by the European Union may clarify information security requirements while making it even harder for potential customers to use U.S. businesses.
See the government surveillance page for more on the revelations of sweeping NSA wiretapping that started appearing in mid-2013.