Cloud Data Security Domain
Intellectual Property
- Copyright for expressions of ideas (books, movies, music). See "fair use" exceptions (e.g., class uses pages from a book)
- Trademark for specific words and logos
- Patent for inventions, processes, materials
- Trade secrets for things that can't be patented (recipes, client and supplier lists, etc)
- DMCA has been abused: "takedown notices" have been used to harass or DoS a site
Data Rights Management / Information Rights Management
- Can work via:
- Rudimentary Reference Check: enter a phrase or number from manual
- Online Reference Check: enter product key at installation, OS will check online later
- Local Agent Check: install a tool that does this, used with games
- Presence of Licensed Media: is CD in tray?
- Support-Based Licensing: pay annual support to get updates and patches
- DRM should provide:
- Persistent Protection — follow the content
- Dynamic Policy Control — allow creators and owners to modify permissions
- Automatic Expiration
- Continuous Auditing — allow for monitoring of use and access history
- Replication Restrictions — including screen-capture, screen-scraping, print, electronic copy
- Remote Rights Revocation
- Might provide more: control printing, add watermark if printed, prohibit copy/paste, prohibit screenshot
Data Storage Models
- Volume storage — like an attached disk, often associated with IaaS
- File storage — the usual file system hierarchy
- Block storage — block device like a blank disk (e.g., AWS EBS volumes attached to EC2 instances)
- Object-based storage — like AWS S3, with metadata describing content, usually associated with PaaS. Accessible through an API. Often described as part of a hierarchy, but not a file system as with volume storage.
- Structured — easy to include in a database
- Unstructured — messy pile of multimedia, email messages, photos, audio files, presentations
- Databases — usually with PaaS and SaaS
- CDN — Content Delivery Network, stream data to SaaS apps
- Raw storage — concept for provider — RDM (Raw Device Mapping) in VMware, or Pass-Through Disks in Microsoft Hyper-V
Database encryption
- File-level encryption (encrypt the DB file)
- Transparent encryption (runs within database)
- Application-level encryption (part of application accessing the DB)
Data Masking — hide, replace, or omit sensitive data
Approaches:
- Random substitution
- Hashing — replace with hash, which will distort format and other characteristics, cannot be reversed
- Algorithmic substitution (allows for two-way substitution, so this is very low-security encryption!)
- Shuffle (shuffle values within the same column)
- Masking (hide content with characters, credit card becomes XXXX XXXX XX65 4321). This is for internal use, for testing and training, not for printing receipts.
- Deletion (null value or delete it)
Methods:
- Static — new copy is created with masking
- Dynamic — on-the-fly, hide some data when records are accessed
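A few of the approaches above can be sketched in Python. This is only an illustration: the function names and the record layout (a "card" field) are assumptions, not part of any product. It shows masking, hashing, shuffling, and the static method of building a separate masked copy.

```python
import hashlib
import random
from typing import Optional


def mask_card(card_number: str, visible: int = 6) -> str:
    """Masking: replace all but the last `visible` digits with X, keeping spaces."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    to_hide = total_digits - visible
    out = []
    for ch in card_number:
        if ch.isdigit() and to_hide > 0:
            out.append("X")
            to_hide -= 1
        else:
            out.append(ch)
    return "".join(out)


def hash_value(value: str) -> str:
    """Hashing: one-way replacement; the result no longer has the input's format."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def shuffle_column(values: list, seed: Optional[int] = None) -> list:
    """Shuffle: the same values, reassigned to different rows of the column."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def static_mask(records: list) -> list:
    """Static method: build a new masked copy and leave the originals untouched."""
    return [{**r, "card": mask_card(r["card"])} for r in records]


print(mask_card("1234 5678 9065 4321"))  # XXXX XXXX XX65 4321
```

A dynamic approach would call something like mask_card at query time instead of storing a second copy.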
Data Anonymization
Similar to masking, but also removes indirect identifiers so that analysis cannot infer what the PII would have shown directly.
Used to analyze statistics on a large collection containing PII.
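As a rough sketch of the idea, the direct identifier can be dropped and the indirect identifiers coarsened before the data goes to analysts. The field names and sample values here are made up for illustration, and how coarse is "coarse enough" depends on the data set.

```python
def generalize(record: dict) -> dict:
    """Drop the direct identifier (name) and coarsen indirect ones (age, ZIP)."""
    decade = (record["age"] // 10) * 10
    return {
        "age_range": f"{decade}-{decade + 9}",
        "zip_prefix": record["zip"][:3] + "**",
        "diagnosis": record["diagnosis"],
    }


# Hypothetical records, for illustration only.
patients = [
    {"name": "Ann Lee", "age": 34, "zip": "47906", "diagnosis": "asthma"},
    {"name": "Bob Roy", "age": 38, "zip": "47901", "diagnosis": "asthma"},
]
anonymized = [generalize(p) for p in patients]
print(anonymized[0])
```

Both sample records end up with the same age range and ZIP prefix, so the statistics survive but neither row points at one person.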
Data Tokenization
Replace a sensitive data element with a token, a random value with shape and form of original. A tokenization application maps between the tokens and actual values. Needs a second database.
PCI DSS requires stored cardholder data (the card number in particular) to be protected by methods such as encryption or tokenization.
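A minimal sketch of the mapping, assuming an in-memory dictionary stands in for the second database. The TokenVault class and its method names are hypothetical. A real tokenization service would add access control, persistence, and auditing.

```python
import secrets
import string


class TokenVault:
    """Hypothetical stand-in for the tokenization application and its database."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def _random_like(self, value: str) -> str:
        """Random string with the same shape: digits stay digits, letters stay letters."""
        out = []
        for ch in value:
            if ch.isdigit():
                out.append(secrets.choice(string.digits))
            elif ch.isalpha():
                out.append(secrets.choice(string.ascii_letters))
            else:
                out.append(ch)  # keep spaces, dashes, and other separators
        return "".join(out)

    def tokenize(self, value: str) -> str:
        if not any(ch.isalnum() for ch in value):
            raise ValueError("nothing to tokenize")
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = self._random_like(value)
        # Retry on the rare collision with an existing token or the value itself.
        while token in self._token_to_value or token == value:
            token = self._random_like(value)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("4111 1111 1111 1111")
print(token, "->", vault.detokenize(token))
```

The token is random rather than derived from the card number, so it cannot be reversed without the vault.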
Bit Splitting
Encrypt, split ciphertext and key across storage locations. With redundancy, your data survives individual drive failures, or seizures of some media by law enforcement.
Generate a random 256-bit key, encrypt your data with AES-CBC. For each 8-bit block of the ciphertext and the key, store:
- Bits 123456 at cloud #1
- Bits 345678 at cloud #2
- Bits 125678 at cloud #3
- Bits 123478 at cloud #4
You could reassemble the ciphertext and key with the data from any two clouds. That's all you need to understand for the test.
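The four-cloud layout above can be checked with a short Python sketch. It assumes bit 1 is the most significant bit of each byte, and it keeps each share as a full byte with the missing bits zeroed. A real implementation would pack the six stored bits to save space. The function names are illustrative.

```python
from itertools import combinations

# Bit positions (1 = most significant, an assumption) kept by each cloud.
CLOUD_BITS = {
    1: (1, 2, 3, 4, 5, 6),
    2: (3, 4, 5, 6, 7, 8),
    3: (1, 2, 5, 6, 7, 8),
    4: (1, 2, 3, 4, 7, 8),
}


def mask_for(bits) -> int:
    """Build a byte mask with the listed bit positions set."""
    mask = 0
    for b in bits:
        mask |= 1 << (8 - b)
    return mask


MASKS = {cloud: mask_for(bits) for cloud, bits in CLOUD_BITS.items()}


def split(data: bytes) -> dict:
    """Return one share per cloud; bits a cloud does not hold are zeroed."""
    return {cloud: bytes(byte & m for byte in data) for cloud, m in MASKS.items()}


def reassemble(share_a: bytes, share_b: bytes) -> bytes:
    """OR two shares together; any two masks jointly cover all 8 bits."""
    return bytes(a | b for a, b in zip(share_a, share_b))


blob = bytes(range(256))  # stand-in for the ciphertext plus key
shares = split(blob)
for a, b in combinations(shares, 2):
    assert reassemble(shares[a], shares[b]) == blob
```

Every pair of clouds recovers the full data, while each cloud on its own is missing two bits of every byte.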
In the real world, each data center and its corporate headquarters would have to be in a separate country. And in the really real world with the US CLOUD Act, no more than one could be in the U.S. or another Five Eyes country, or any other country where the U.S. has strong influence. Chile, South Africa, India, and Singapore might work, as long as the cloud providers have their headquarters in those countries.
More advanced, possible but less likely to appear:
- SSMS (Secret Sharing Made Short) — encrypt data, use IDA (information dispersal algorithm) to split the data into fragments using erasure coding. Split the key, sign and distribute fragments of ciphertext and key to different cloud storage services. User must have m out of n fragments of data and key.
- AONT-RS (All-Or-Nothing Transform with Reed-Solomon) — similar approach
Cryptography
Here is a set of terms you should know.
- Plaintext or Cleartext
- Ciphertext or cryptogram
- Cryptosystem is the entire system — the cipher or algorithm, plus all the details of how the keys are generated, agreed upon or exchanged, and used.
- Cryptovariable is obviously the key, plus maybe an IV or other data.
- Initialization vector or IV means that even if you encrypt the same cleartext with the same key, the ciphertext will be different. Patterns won't leak through. Often loosely called a nonce. A salt is similar, but it's used with password hashes.
- Session key means you use a new randomly generated key for each message, or for each encrypted file, or for each web or VPN session. Breaking one message gives your attacker just that message and nothing else.
- Key space and work factor
- Symmetric versus asymmetric
- Asymmetric ciphers include:
- Elliptic-curve or ECC or just EC
- RSA
- ElGamal is another, but much less likely to appear.
- Stream ciphers:
- RC4 shouldn't be used any more!
- Salsa20 and ChaCha20; ChaCha20 is used in TLS
- Block ciphers:
- AES aka Rijndael is the obvious one.
- And modes of operation, for file-like data:
- Electronic Code Book or ECB is a bad idea.
- Cipher Block Chaining or CBC is commonly used for files, file systems, entire storage devices — AES-CBC.
- And for stream-like data, where we want authenticated encryption:
- Galois Counter Mode or GCM, used for TLS — AES-GCM.
- Counter Mode with CBC-MAC Protocol or CCMP, used for 802.11i or Wi-Fi — AES-CCMP.
- Kerckhoffs's Principle — if you think you have to keep the algorithm secret, then you're hiding a weakness. The strongest ciphers are those for which you don't worry about your adversary having a copy of the code. Or, really, having the entire cryptosystem except for the cryptovariable you used to convert your cleartext into ciphertext. See what I did there? Make sure you can use these terms in sentences, because the exam certainly does!
- Perfect Forward Secrecy or PFS, or just Forward Secrecy, with ephemeral keys. Let's say you use a long-term asymmetric key pair to negotiate symmetric session keys. Your adversary is recording ciphertext. If you have been using PFS, then if your long-term private key is exposed today, they still can't decrypt the old messages. "A breach today doesn't expose secrets from the past."
- Hash functions:
- SHA-2 is a family including SHA-256 and SHA-512. May be written "SHA-2" to mean one of them, "SHA-256" and "SHA-512", or "SHA-2-256" and "SHA-2-512". Less likely, SHA-224 and SHA-384 could show up.
- SHA-3 — there was a big scare when the weaknesses in MD5 and SHA-1 came out, and US NIST announced a contest to replace the SHA-2 family. Keccak, designed by a team including a member of the Rijndael/AES team (Go, Belgium!) won, but... It turns out we didn't need SHA-3 immediately after all. Some day it will be a drop-in replacement for the SHA-2 suite.
- PKI or Public Key Infrastructure:
- Digital signatures give you Proof of Origin (sender identity) plus Proof of Content (message integrity), and that combination gives you Non-Repudiation.
- X.509v3, the standard format for a digital certificate
- You make a Certificate Signing Request or CSR to ask the CA to create a certificate containing your public key.
- Certificate Authority versus a Registration Authority
- Certificate Revocation List or CRL, or an online revocation check via OCSP
- Certification Practice Statement or CPS is the CA's published rules for issuing and managing certificates.
- Homomorphic encryption — Still under development, this would allow you to process data while it is encrypted. Ciphertext input, process that, ciphertext output. The person who had encrypted the input could then decrypt the output and see the correct answer. It sounds like science fiction, it largely was for ages, but it's real. See homomorphicencryption.org for an open industry / government / academic consortium.
- M-of-N control for Key Escrow: You must be able to recover your encrypted data, so you must always have decryption keys available. But you could lose them, so you keep copies in "key escrow". Since you cannot find one perfect 100% trustworthy person available 100% of the time, you:
- Put the decryption keys into one storage area, locked shut (encrypted) with a master key.
- You choose two numbers M and N, and...
- Split the master key into N parts, giving each one to a reasonably trustworthy and reasonably available person. Not perfect, but good enough.
- Now any subset of M of them can build the master key and access the escrow storage.
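The notes above don't name a splitting scheme. One common way to get the "any M of N" property is Shamir's Secret Sharing, sketched here in Python under that assumption. The function names are illustrative, and production code should come from a vetted library rather than a sketch like this.

```python
import secrets

PRIME = 2**521 - 1  # a Mersenne prime, larger than any 256-bit master key


def split_secret(secret: int, m: int, n: int) -> list:
    """Split `secret` into n shares so that any m of them recover it."""
    if not 0 <= secret < PRIME or not 1 <= m <= n:
        raise ValueError("need 0 <= secret < PRIME and 1 <= m <= n")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


master_key = secrets.randbits(256)
shares = split_secret(master_key, m=3, n=5)
assert recover_secret(shares[:3]) == master_key
assert recover_secret([shares[0], shares[2], shares[4]]) == master_key
```

With M = 3 and N = 5, any three key holders can rebuild the master key, and two absent or uncooperative holders don't block recovery.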
Quantum Science
Quantum computing is offensive, a threat to break ciphers and expose secrets. A truly general-purpose quantum computer with enough stable qubits could run Shor's algorithm to quickly solve the now "too difficult" problems that protect asymmetric ciphers: factoring for RSA and discrete logarithm for ECC. Symmetric ciphers should (as far as we currently understand) be relatively safe: Grover's algorithm reduces a 256-bit cipher to the resistance of a 128-bit cipher against brute-force search.
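The 256-to-128 figure follows from Grover's quadratic speedup. As a rough worked version, where n is the key length in bits (a symbol introduced here) and constant factors are ignored:

```latex
\text{classical brute force: } O\!\left(2^{n}\right) \text{ trials}
\qquad
\text{Grover: } O\!\left(\sqrt{2^{n}}\right) = O\!\left(2^{n/2}\right) \text{ trials}
\qquad
n = 256 \;\Rightarrow\; 2^{256/2} = 2^{128}
```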
Quantum cryptography is defensive, to protect secrets. It's really about QKD or Quantum Key Distribution, using single-photon signaling to transmit a key to be used in a conventional symmetric cipher. China is a world leader in this, see one of my "Just Enough Cryptography" pages for details on the Chinese quantum Internet.
Responsibility depending on type of cloud service
Layer | IaaS | PaaS | SaaS |
Security GRC (Governance, Risk, and Compliance) | Enterprise | Enterprise | Enterprise |
Data Security | Enterprise | Enterprise | Enterprise |
Application Security | Enterprise | Enterprise | Shared |
Platform Security | Enterprise | Shared | CSP |
Infrastructure Security | Shared | CSP | CSP |
Physical Security | CSP | CSP | CSP |
Shared because:
- IaaS — Provider hosts images, their standard offerings plus whatever you create and store. You maintain your virtual machines.
- PaaS — Provider maintains run-time libraries (Java, Python, PHP, Perl, .NET, etc) and the development environment. You create the code and back up your software.
- SaaS — Provider maintains the application. You provision users, possibly configure application options, and train users to use the application carefully.
Now That You Know Cryptography...
Exam Language Tricks
There are some questions where knowing all the technology doesn't give you the correct answer. You must carefully analyze the English prose.
(ISC)2 isn't as bad as CompTIA about doing this, but they still do it on some questions. See my page with explanations and examples.
Example Question
Question: Your company has decided to start selling products through your website, accepting payment by credit and debit cards. You will do this in a public cloud setting, and your staff will administer the servers' operating systems and applications. A secure tunnel connects your cloud server to the payment processing firm. Your staff must install client-side certificates on your VMs so they can automatically authenticate into the payment processor. All purchase records will be stored in your virtual private cloud, in object storage protected by encryption (except, of course, the CVV, which is not stored). The payment processor returns values which you store in the purchase records to support any later refunds. What do you need?
A: IaaS
B: TLS
C: X.509v3
D: AES-CBC
E: PCI-DSS
F: Tokenization