
Secure Distributed Logging:
Syslog, TLS, and Amazon EC2 Cloud Servers

Collect Log Information Carefully

Government and industry regulations require that you collect and archive log data in a trustworthy way. This page shows how to securely collect log data with the industry-standard Syslog protocol, using a combination of the Rsyslog implementation and the Transport Layer Security (TLS) protocol. This provides security against the primary threats:

Disclosure — We must prevent unauthorized access to the content of syslog messages, as some may be sensitive and useful to an attacker.

Modification — We must prevent modification of a syslog message in transit, as this could hide an attack or mislead analysis into triggering undesirable responses.

Masquerade — We must not accept syslog messages from unauthorized sources, and they must not be sent to unauthorized collection points.

We will also be secure against the secondary threat of message stream modification, in which individual messages might be deleted, re-transmitted, or re-ordered, as our Syslog messages will be sent through an encrypted stream protocol.

Two lesser malicious threats remain: denial of service in the form of a data flood and traffic analysis in the form of analyzing the mesh of Syslog sources and collectors.

Our concern about malicious denial of service is limited to two cases: one of our own system administrators going rogue (and in that case we have bigger things to worry about), and attacks by third parties against our publicly reachable Syslog collection servers.

As for traffic analysis, the only concerns would be surveillance in preparation for a denial of service attack, and attempts to discover system relationships we had hoped to keep covert. However, if our systems are all deployed throughout Amazon's Elastic Compute Cloud (EC2), then only Amazon and Internet backbone providers present a traffic analysis threat.

Data availability cannot be guaranteed, and without mathematical (that is, cryptographic) tools we cannot specify just how available a data set will be. So, to do what we can, we will use "live" distributed logging, sending log messages as the events happen to multiple geographically distributed collection points over channels cryptographically protected against disclosure, modification and masquerade.

[Diagram: Hosts in our Syslog TLS network: CA or Certificate Authority, message sources, and message collectors.]

Here are the hosts. One of them, our Certificate Authority or CA, should not be out in the cloud. All of our cryptographic security relies on the CA machine's data confidentiality, data integrity, and user authentication. Be very careful with this one!

The log sources are the servers doing the company's work. The log collectors keep the company compliant.

Nothing is perfect, including the availability of Amazon Web Services. However, the significant outages have mostly been limited to a single availability zone: a DDoS attack against a range of IP addresses in the US-East region on 1 Dec 2010 in retaliation for Amazon pulling its service to Wikileaks; what Amazon described as a "mirroring storm" in one availability zone in US-East on 21-25 Apr 2011; a power outage in one availability zone in EU-West on 6 Aug 2011; and a 25-minute network connectivity outage within US-East on 8 Aug 2011.

The log message sources and collectors could be anywhere, but in my example they are all instances in Amazon's EC2 cloud. The collectors are not co-located with each other or with any of the sources.

At the time of this writing, Amazon has six geographic regions: US-East, US-West-1, US-West-2, EU-West, Southeast Asia, and Northeast Asia. These nominally correspond to Virginia, the San Francisco Bay area of California, Oregon, Dublin, Singapore, and Tokyo, respectively. Each region has multiple "Availability Zones", and while it isn't clear just what a given Availability Zone means in terms of geography, we can put each log collector in a geographic zone shared with no source or other collector.

[Diagram: Key distribution in our Syslog TLS network: CA or Certificate Authority, message sources, and message collectors.]

The CA will first generate its own signing key and corresponding self-signed certificate. It will then generate key pairs and corresponding CA-signed certificates specifying the public keys. The CA's public key certificate and the host keys and certificates will be copied into place, as shown here.

There are three ways of verifying certificates with the GnuTLS library used by Rsyslog:

1. x509/name — Certificate validation and subject name authentication as described in IETF's Internet draft draft-ietf-syslog-transport-tls-12 (since published as RFC 5425).

2. x509/fingerprint — Certificate fingerprint as described in IETF's Internet draft draft-ietf-syslog-transport-tls-12.

3. x509/certvalid — Certificate validation only.
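
The verification mode is selected with Rsyslog stream driver directives in rsyslog.conf. As a sketch of what the two stricter modes look like on a message source — the collector name and the truncated fingerprint here are hypothetical placeholders:

## x509/name: validate the certificate, then require a subject name match
## (hypothetical peer name):
$ActionSendStreamDriverAuthMode x509/name
$ActionSendStreamDriverPermittedPeer collector1.example.com

## x509/fingerprint: require a certificate with one specific fingerprint
## (placeholder value):
$ActionSendStreamDriverAuthMode x509/fingerprint
$ActionSendStreamDriverPermittedPeer SHA1:5C:82:...:0E

We will see the x509/certvalid directives in the configuration files below.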

The problem as I see it for x509/name is that I would need to create a unique certificate for each host deployed. But one feature of the cloud, to many people one of its strongest points, is the ease and speed of deployment, use, and termination. It's as easy to start arbitrarily many instances simultaneously as it is to start just one. Either way, it's just a matter of clicking through a few screens and entering either "1" or a larger number on one of them. Then you just wait perhaps another 20 to 30 seconds for your new host(s) to be copied into place and booted up. It would take me longer to generate a digital certificate for just one of them, much longer yet if the CA machine were not accessible across even an internal network, a likely situation if you're being truly careful about your core CA host.

The x509/fingerprint authentication does not seem to me to provide significant security beyond simply verifying that the certificate offered by the remote host is valid (based on a test with a CA public key believed to be valid) and then requiring the remote host to authenticate itself during the TLS handshake.

So, I will show you how to set this up with x509/certvalid host authentication.

[Diagram: Secure TLS connections in our Syslog TLS network: CA or Certificate Authority, message sources, and message collectors.]

TCP port 6514 has been allocated as the default port for Syslog over TLS. Each source host will initiate a TLS handshake with all configured collectors as soon as it has something to log — which will be the moment its local Syslog service starts. The TCP connection will stay up, being re-established as needed after network timeouts or after the Syslog service is stopped on either the source or the collector.

We will configure our sources to log one marker message every 60 seconds, providing a means of continuously verifying that event logs are being captured.

That's enough background, let's do this!

Step 1: Create the CA

We just need the certtool command. It should be on your system, or easily added:

RHEL/CentOS Linux: gnutls-utils package
Mageia Linux: gnutls package
OpenBSD: gnutls package
Mac OS: It should already be there
Windows: Install any of the above

The steps for creating the CA are:

1. Generate a private key. The private key must be safely stored on the CA host so that no unauthorized parties (and that means pretty much everyone) can read it.

2. Generate a self-signed certificate from that. The self-signed certificate will be installed on our other hosts. It is self-signed in the sense that the CA host is effectively saying "This is my public key, and you should believe that it is my public key because I say it is, and I am the source of information about keys."

Start by setting aside some safe directory on the CA host. Now generate the private key, changing the RSA key length as appropriate. Some of these commands are long, so I will break them across multiple lines with backslashes. They may make a little more sense to the reader with each option and its parameter isolated on its own line. Of course you don't need to type the backslashes, but I have included them so the commands work for copy-and-paste artists:

ca:$ certtool --generate-privkey	\
	--sec-param=normal		\
	--outfile ca-key.pem 

Your choices for sec-param are: low (1248), normal (2432), high (3248), and ultra (15424).
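
If you want to check what you generated, certtool can also display the key parameters. This reads the private key, so only do it on the CA host:

ca:$ certtool --key-info --infile ca-key.pem
[ ... key algorithm, size, and modulus appear here ... ]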

Generate the self-signed certificate:

ca:$ certtool --generate-self-signed	\
	--load-privkey ca-key.pem	\
	--outfile ca.pem

That command will ask you a sequence of questions. You will probably want to make the certificate last a long time before expiring. The expiration period is specified in days: 3650 would mean about ten years, and 7300 about twenty.

Answer the other questions as needed for your situation, but answer "y" to the question "Does the certificate belong to an authority?", specify -1 (no constraint) for the path length, and answer "y" to the question "Will the certificate be used to sign other certificates?"
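
If you would rather script this step than answer questions interactively, certtool can also take its answers from a template file. A minimal sketch, using a hypothetical organization name and the answers described above:

ca:$ cat ca.tmpl
## Hypothetical CA name:
cn = "Example Corp Syslog CA"
expiration_days = 3650
ca
cert_signing_key
ca:$ certtool --generate-self-signed	\
	--load-privkey ca-key.pem	\
	--template ca.tmpl		\
	--outfile ca.pem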

Let's make sure that everything worked right:

ca:$ ls -l
total 8
-rw------- 1 yourlogin yourlogin 1675 today 11:13 ca-key.pem
-rw-r--r-- 1 yourlogin yourlogin 1367 today 11:17 ca.pem
ca:$ openssl x509 -in ca.pem -text
[ ... certificate details appear here ... ]
ca:$ openssl asn1parse -in ca.pem
[ ... certificate details appear here ... ] 

Step 2: Host Key and Certificate Generation

As I explained, I am going to use certificate validation for authentication and authorization. Furthermore, I am going to trust my system administrators not to enable a denial of service, which they could do either by copying key and certificate information to other hosts under their control or by giving that information away to others.

Finally, these keys and certificates will be used only for the Syslog/TLS project, not for any other host or user authentication.

Given those assumptions, I don't really need one unique key and certificate per host; I can use the same key and certificate everywhere. So:

Generate a host private key, a 2048-bit RSA key:

ca:$ certtool --generate-privkey	\
	--bits 2048	\
	--outfile key.pem 

Generate a certificate signing request. Most of the Y/N questions will be N, the default, but answer Y to making it both a TLS web client and a TLS web server certificate.

ca:$ certtool --generate-request	\
	--load-privkey key.pem		\
	--outfile request.pem

Now generate the certificate:

ca:$ certtool --generate-certificate	\
	--load-request request.pem	\
	--load-ca-certificate ca.pem	\
	--load-ca-privkey ca-key.pem	\
	--outfile cert.pem

Again, let's see what we have. We don't need the certificate request file any more, so we'll get rid of it:

ca:$ rm request.pem
ca:$ ls -l
total 16
-rw------- 1 yourlogin yourlogin 1675 2011-10-05 11:13 ca-key.pem
-rw-r--r-- 1 yourlogin yourlogin 1367 2011-10-05 11:17 ca.pem
-rw-r--r-- 1 yourlogin yourlogin 1493 2011-10-05 11:26 cert.pem
-rw------- 1 yourlogin yourlogin 1675 2011-10-05 11:22 key.pem
File       Contents
ca.pem     Certificate with the CA's signing public key
key.pem    Private key for this host
cert.pem   CA-signed certificate with the other end's public key

The file ca-key.pem is the extremely sensitive one. It might make sense to run your CA on a host that is not connected to a network, using a USB thumbdrive to export data from it. And, of course, your CA should not run an operating system that does silly things like automatically running executable files found on removable media.

Move the files ca.pem, cert.pem, and key.pem to a workstation on your network. The file key.pem is the most sensitive of these three, as an attacker with that file could masquerade as a legitimate source of log messages and launch a data-flooding denial of service attack. The CA's own signing key, ca-key.pem, stays behind on the isolated and hardened CA system.

In theory, the same attacker could masquerade as a legitimate collector of log data, but this would require either taking the place of a log collector or misleading the administrator of a log source into configuring that system to also send log messages to the attacker's machine. These seem unlikely enough that I don't see the point in worrying about them. If you want to worry about them, you are going to have to use full name-based authentication and generate a unique certificate for every cloud server you deploy.
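
If you do decide to go that route, each host's certificate must carry that host's name. With certtool that means one template per host, something like this hypothetical one for a collector, used when generating that host's request and certificate:

ca:$ cat collector1.tmpl
## Hypothetical per-host name:
cn = "collector1.example.com"
dns_name = "collector1.example.com"
tls_www_server
tls_www_client
signing_key
encryption_key

Each source would then name its permitted collectors, and each collector its permitted sources, with the corresponding PermittedPeer directives.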

Step 3: Deploy the Hosts

I had downloaded the Amazon EC2 API tools, a collection of command-line tools that allow you to do things like upload (import) public keys. Generate a key pair with ssh-keygen, being careful to specify that the new keys should not be stored in the default location of ~/.ssh/ but instead somewhere like ~/.ec2. The files are id_rsa and id_rsa.pub for the private and public keys, respectively.
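
The key pair generation itself is just one command; the -f option is what keeps the new keys out of ~/.ssh/:

$ ssh-keygen -t rsa -b 2048 -f ~/.ec2/id_rsa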

Then set up your environment for the EC2 API toolkit:

$ export JAVA_HOME=/usr
$ export EC2_HOME=/path/to/ec2-api-tools-1.4.4.2
$ export PATH=${PATH}:${EC2_HOME}/bin
$ export EC2_PRIVATE_KEY=~/.ec2/pk-FZK5RTU5SPP5RDHVTCXM3W6EE2CJOYYM.pem
$ export EC2_CERT=~/.ec2/cert-FZK5RTU5SPP5RDHVTCXM3W6EE2CJOYYM.pem

Key file(s): ~/.ec2/pk-*.pem
Identity: Corporate AWS account
Capabilities: Deploy, reboot and terminate cloud instances; issue new user / sysadmin credentials; modify firewall rule sets

Key file(s): ~/.ec2/id_rsa and ~/.ec2/id_rsa.pub
Identity: User on cloud instance
Capabilities: Elevate privileges to root by simply typing "sudo bash"; at that point, the user has unlimited privileges on that one instance

Make sure that you understand the purposes of these two sets of keys:

Your corporate account with Amazon, and thus the ability to deploy servers and incur corporate costs, is used to authenticate to the AWS management interface. This could be the AWS Management Console web interface, or it could be the command-line EC2 API tools. For the web interface you interactively enter an e-mail address and password. The command-line interface uses the private key specified by the EC2_PRIVATE_KEY environment variable, which typically specifies ~/.ec2/pk-*.pem. A provisioner authorized to spend corporate money is given this key.

User identities are used to authenticate to deployed servers, as the system administrator on some images or as a privileged user who can trivially become the administrator on Amazon Linux images. The credentials might conveniently be ~/.ec2/id_rsa and ~/.ec2/id_rsa.pub for the private and public keys, respectively. Users trusted to administer individual cloud servers are given this key. In the interest of security, compartmentalize administrative access to your servers by generating multiple SSH key pairs with ssh-keygen, uploading all the public keys to the AWS cloud under meaningful names, and carefully distributing the private keys to your administrators.

The private key needs to be on your desktop, but the public key must be uploaded to AWS so it can be installed on newly deployed instances. Upload one SSH public key to all the AWS regions, and use the corresponding private key for all authentications into the cloud servers. Use something more imaginative than "keyname" for the key's name — this is the name of the user credentials you will specify when deploying a new instance.

$ ec2-describe-regions
REGION  eu-west-1       ec2.eu-west-1.amazonaws.com
REGION  us-east-1       ec2.us-east-1.amazonaws.com
REGION  ap-northeast-1  ec2.ap-northeast-1.amazonaws.com
REGION  us-west-1       ec2.us-west-1.amazonaws.com
REGION  ap-southeast-1  ec2.ap-southeast-1.amazonaws.com
$ for REGION in eu-west-1 us-east-1 ap-northeast-1 us-west-1 ap-southeast-1
> do
>   ec2-import-keypair keyname --region $REGION --public-key-file ~/.ec2/id_rsa.pub
> done

Or, to be fancy about it:

$ for REGION in $( ec2-describe-regions | awk '{print $2}' )
> do
>   ec2-import-keypair keyname --region $REGION --public-key-file ~/.ec2/id_rsa.pub
> done
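
To verify that the import worked everywhere, ask each region to list its key pairs; each should report your key name and its fingerprint:

$ for REGION in $( ec2-describe-regions | awk '{print $2}' )
> do
>   ec2-describe-keypairs --region $REGION
> done
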
[Diagram: Hosts in our Syslog TLS network: CA or Certificate Authority, message sources, and message collectors.]

Deploy your cloud servers, both the Syslog sources and the collectors. Log in to the AWS Management Console, select a geographic region and go to the AMIs view. For the log collectors, select Amazon Images and then Amazon Linux in the pull-down menus. Then narrow the search by typing i386.manifest in the search box. You probably want a production image, not a release candidate or beta test version with rc-N or beta in its name.

Select one and start to launch it. The smallest instance size will be more than enough for this.

It will probably be helpful if you give each machine a meaningful name in the screen where you can set up key-value pairs. Something like Virginia Collector or Dublin Logger. This happens on the second screen of the Instance Details section of the process.

As for the Security Group, Amazon's term for firewall rule set, the collectors only need to allow connections over SSH (TCP/22) and Syslog/TLS (TCP/6514).

Repeat as needed to deploy your cloud servers, which will be the log sources. Select whatever image you need to get the job done. Specify Security Groups (firewall rule sets) in terms of whatever they need to accomplish. They will be able to establish their own outbound TLS connections regardless of the inbound rule set. Also give them meaningful names, like Tokyo Server 1 and so on.

The rather long hostnames will be accessible through the AWS management console once you select Instances and then click Show/Hide and show Public DNS. Make a note of these somewhere, as it's a slow bother to change between the regions in the AWS management console. For my test shown here, I had the following. I put these in a text file so I could cat the file and then copy and paste hostnames into command lines:

Role          Hostname                                                   Region
Source #1     ec2-176-32-73-88.ap-northeast-1.compute.amazonaws.com      Tokyo
Source #2     ec2-175-41-156-226.ap-southeast-1.compute.amazonaws.com    Singapore
Collector #1  ec2-50-19-188-9.compute-1.amazonaws.com                    US East
Collector #2  ec2-46-137-0-211.eu-west-1.compute.amazonaws.com           Dublin

[Diagram: Key distribution in our Syslog TLS network: CA or Certificate Authority, message sources, and message collectors.]

Step 4: Key Distribution

Given my simple approach to host authentication, I need the same three files on every cloud server. Of course, the private key on the other host will happen to be identical to the private key on this host, and the certificate this host offers will be identical to the certificate offered by the other host:

File       Contents
ca.pem     Certificate with the CA's signing public key
key.pem    Private key for this host
cert.pem   CA-signed certificate with the other end's public key

Copy the key and certificates to every cloud server. The Amazon Linux configuration prohibits root login, so you must log in as ec2-user using cryptographic authentication. I show IP addresses rather than those long hostnames here just to keep the code display width reasonable:

$ scp -i ~/.ec2/id_rsa ca.pem key.pem cert.pem ec2-user@176.32.73.88:
$ scp -i ~/.ec2/id_rsa ca.pem key.pem cert.pem ec2-user@175.41.156.226:
$ scp -i ~/.ec2/id_rsa ca.pem key.pem cert.pem ec2-user@50.19.188.9:
$ scp -i ~/.ec2/id_rsa ca.pem key.pem cert.pem ec2-user@46.137.0.211:
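
Or, in the same spirit as the region loop above, do it with a loop:

$ for HOST in 176.32.73.88 175.41.156.226 50.19.188.9 46.137.0.211
> do
>   scp -i ~/.ec2/id_rsa ca.pem key.pem cert.pem ec2-user@${HOST}:
> done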

We are getting to the point where it becomes easy to lose track of what you are doing and where you are doing it. I started an X terminal for each cloud server, and gave the Syslog sources, the actual servers doing the work, one color, and the Syslog collectors another. Yellow and orange are a little garish, but they work. Adjust the colors to your taste:

$ xterm -fg black -bg yellow -sb &
$ xterm -fg black -bg yellow -sb &
$ xterm -fg black -bg orange -sb &
$ xterm -fg black -bg orange -sb &

In every one of those windows, become root and move the files into place, renaming the host key and certificate to match the names we will use in the Rsyslog configuration below. Note that you may not need the mkdir and chmod commands, as the directory is probably there already (at least on Amazon Linux):

[ec2-user@ip-10-150-167-151 ~]$ sudo bash
[root@ip-10-150-167-151 ~]# mkdir /etc/pki/rsyslog
[root@ip-10-150-167-151 ~]# chmod 700 /etc/pki/rsyslog
[root@ip-10-150-167-151 ~]# mv ~ec2-user/ca.pem /etc/pki/rsyslog/
[root@ip-10-150-167-151 ~]# mv ~ec2-user/cert.pem /etc/pki/rsyslog/rsyslog-cert.pem
[root@ip-10-150-167-151 ~]# mv ~ec2-user/key.pem /etc/pki/rsyslog/rsyslog-key.pem
[root@ip-10-150-167-151 ~]# chown root.root /etc/pki/rsyslog/*
[root@ip-10-150-167-151 ~]# ls -la /etc/pki/rsyslog
total 20
drwx------ 2 root root 4096 today 15:35 .
drwxr-xr-x 8 root root 4096 today 01:29 ..
-rw-r--r-- 1 root root 1367 today 15:33 ca.pem
-rw-r--r-- 1 root root 1493 today 15:33 rsyslog-cert.pem
-rw------- 1 root root 1675 today 15:33 rsyslog-key.pem

As shown in that example output, the EC2 hosts have even more cryptic hostnames than you might have anticipated! Each is behind a (virtual) NAT router on its own private VLAN. Their default hostnames are based on their NAT IP addresses or their Ethernet MAC addresses, something like ip-10-150-167-151 or domU-12-31-39-16-C9-69. Give all of your cloud servers more meaningful hostnames, then end and restart your root sessions so your prompts are meaningful. The immediate benefit is that this helps you for the next several minutes while you are trying to keep track of all those X terminals. More importantly, this host renaming is required for your logs, as they will be nearly meaningless without it. The Syslog service uses the current hostname value as the identifying field in the message. If you were to skip this step, all your log data would be in terms of names meaningful only within the context of that one NAT-hidden VLAN:

[root@ip-10-150-167-151 ~]# hostname server-japan-1
[root@ip-10-150-167-151 ~]# exit
[ec2-user@ip-10-150-167-151 ~]$ sudo bash
[root@server-japan-1 ~]# 

It would be a very good idea to also edit the file /etc/sysconfig/network and replace the current default localhost.localdomain with your more meaningful hostname. Also fix /etc/hosts while you're at it. This will re-apply your fix if your server reboots:

[root@server-japan-1 ~]# more /etc/sysconfig/network /etc/hosts
::::::::::::::
/etc/sysconfig/network
::::::::::::::
NETWORKING=yes
## I commented out the old line and added a useful one:
## HOSTNAME=localhost.localdomain
HOSTNAME=server-japan-1.subdomain.example.com
NOZEROCONF=yes
NETWORKING_IPV6=no
IPV6INIT=no
IPV6_ROUTER=no
IPV6_AUTOCONF=no
IPV6FORWARDING=no
IPV6TO4INIT=no
IPV6_CONTROL_RADVD=no
::::::::::::::
/etc/hosts
::::::::::::::
## I added a useful field:
127.0.0.1   localhost localhost.localdomain server-japan-1
## I added a useful line, although this is less critical:
10.150.167.151 server-japan-1.subdomain.example.com

Now go back up to where this page says "In every one of those windows" and do the key file moving and ownership change, and the immediate and permanent host renaming, on all the other cloud servers. Alternatively, you might do this one step at a time in each of all those colored X terminals.

Step 5: Install the Needed Package on All Hosts

To skip ahead a bit, we're going to need a TLS shared library for Rsyslog. It's normally in /lib/rsyslog/lmnsd_gtls.so, but you may not have that file there. If not, you can ask the YUM package management system to figure out where to get it, and then install that. Answer "y" when it asks if it's OK to install the requested package plus its dependencies of gnutls and libtasn1 (or just run the second yum command with the -y option):

[root@server-japan-1 ~]# yum provides /lib/rsyslog/lmnsd_gtls.so
Loaded plugins: fastestmirror, priorities, security, update-motd
Loading mirror speeds from cached hostfile
 * amzn-main: packages.us-east-1.amazonaws.com
 * amzn-updates: packages.us-east-1.amazonaws.com
rsyslog-gnutls-4.6.2-3.11.amzn1.i686 : TLS protocol support for rsyslog
Matched from:
Filename    : /lib/rsyslog/lmnsd_gtls.so

[root@server-japan-1 ~]# yum install rsyslog-gnutls

Step 6: Configure the Log Collectors

Save a backup copy of /etc/rsyslog.conf somewhere, and then edit the file to make it look like the below. The sections marked with "## BEGIN ADDED CONFIGURATION" comments are what I added or changed. Blocks that were entirely commented out in the original have been deleted.

#rsyslog v3 config file

# if you experience problems, check
# http://www.rsyslog.com/troubleshoot for assistance

#### MODULES ####

$ModLoad imuxsock.so    # provides support for local system logging (e.g. via logger command)
$ModLoad imklog.so      # provides kernel logging support (previously done by rklogd)
#$ModLoad immark.so     # provides --MARK-- message capability


#### GLOBAL DIRECTIVES ####

# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat


############################################################################
## BEGIN ADDED CONFIGURATION
##
## Set up rsyslog/TLS on the server
##
## Load the TCP listen module, and use the gtls driver by default.
## This will automatically load /lib/rsyslog/lmnsd_gtls.so, no
## need to explicitly load it and trying to do so will cause
## error messages.
$ModLoad imtcp
$DefaultNetstreamDriver gtls

## Specify the certificate files
$DefaultNetstreamDriverCAFile    /etc/pki/rsyslog/ca.pem
$DefaultNetstreamDriverCertFile  /etc/pki/rsyslog/rsyslog-cert.pem
$DefaultNetstreamDriverKeyFile   /etc/pki/rsyslog/rsyslog-key.pem

## Because we can't predict hostnames, and because cloud servers
## are dynamically deployed and we don't want to have to create
## unique certificates for each one, we will validate the
## certificate itself but not insist that the name match.
$InputTCPServerStreamDriverAuthMode x509/certvalid

## How to accept connections
$InputTCPServerStreamDriverMode 1 # run driver in TLS-only mode

# Log every host in its own file
$template RemoteHost,"/var/log/servers/%HOSTNAME%"


############################################################################
## BEGIN THE ORIGINAL LOCAL RULE SET, MOVED TO THE END
$RuleSet local 

#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog

# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 *

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log


############################################################################
## END THE ORIGINAL LOCAL RULE SET, BEGIN ADDED CONFIGURATION
##

############################################################################
##  Use the "local" RuleSet as default
$DefaultRuleSet local

############################################################################
## Define "remote" RuleSet, using the previous 'RemoteHost' template,
## Also keep a copy of everything in one file.
## Bind the RuleSet to the TCP listener and start the listener.
$RuleSet remote
*.info		?RemoteHost
*.info		/var/log/servers-all
$InputTCPServerBindRuleset remote
$InputTCPServerRun 6514  

Restart the service and verify that you see no error messages. Fix your configuration as needed until it restarts cleanly (I have broken the log lines for display here). Also make sure that it is listening for connections. If syslog-tls is not listed in your /etc/services file and this lsof command produces an error, use tcp:6514 instead.

[root@collector-virginia ~]# /etc/init.d/rsyslog restart
Shutting down system logger:                               [  OK  ]
Starting system logger:                                    [  OK  ]
[root@collector-virginia ~]# tail /var/log/messages 
[ ... several lines deleted ... ]
Oct  7 01:09:32 collector-virginia kernel: Kernel logging (proc) stopped.
Oct  7 01:09:32 collector-virginia rsyslogd: [origin software="rsyslogd"
	swVersion="4.6.2" x-pid="21686" x-info="http://www.rsyslog.com"]
	exiting on signal 15.
Oct  7 01:09:32 collector-virginia kernel: imklog 4.6.2, log source =
	/proc/kmsg started.
Oct  7 01:09:32 collector-virginia rsyslogd: [origin software="rsyslogd"
	swVersion="4.6.2" x-pid="21927" x-info="http://www.rsyslog.com"] (re)start
[root@collector-virginia ~]# lsof -i tcp:syslog-tls
COMMAND    PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 21927 root 5u IPv4  38770      0t0  TCP *:syslog-tls (LISTEN)
rsyslogd 21927 root 6u IPv6  38771      0t0  TCP *:syslog-tls (LISTEN) 
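
You can also probe a collector's listener from your desktop with openssl s_client, verifying both that TCP/6514 is reachable through the Security Group and that the offered certificate verifies against your CA certificate from Step 1. Press Ctrl-C to drop the test connection:

$ openssl s_client -connect ec2-50-19-188-9.compute-1.amazonaws.com:6514 -CAfile ca.pem
[ ... the certificate chain and "Verify return code: 0 (ok)" should appear here ... ]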

Step 7: Configure the Log Sources

Save a backup copy of /etc/rsyslog.conf somewhere, and then edit the file to make it look like the below. The sections marked with "## BEGIN ADDED CONFIGURATION" comments are what I added or changed. Blocks that were entirely commented out in the original have been deleted.

#rsyslog v3 config file

# if you experience problems, check
# http://www.rsyslog.com/troubleshoot for assistance

#### MODULES ####

$ModLoad imuxsock.so    # provides support for local system logging (e.g. via logger command)
$ModLoad imklog.so      # provides kernel logging support (previously done by rklogd)
#$ModLoad immark.so     # provides --MARK-- message capability


#### GLOBAL DIRECTIVES ####

# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat


#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog

# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 *

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log


############################################################################
## BEGIN ADDED CONFIGURATION
##
## Set up rsyslog/TLS on the client
##
## Load the TCP listen module, and use the gtls driver by default.
## This will automatically load /lib/rsyslog/lmnsd_gtls.so, no
## need to explicitly load it and trying to do so will cause
## error messages.
$ModLoad imtcp
$DefaultNetstreamDriver gtls

# Mark the log file every 60 seconds to ensure it's still alive.
$ModLoad immark.so
$MarkMessagePeriod 60

## Specify certificate files
$DefaultNetstreamDriverCAFile    /etc/pki/rsyslog/ca.pem
$DefaultNetstreamDriverCertFile  /etc/pki/rsyslog/rsyslog-cert.pem
$DefaultNetstreamDriverKeyFile   /etc/pki/rsyslog/rsyslog-key.pem

## Because we can't predict hostnames, and because cloud servers
## are dynamically deployed and we don't want to have to create
## unique certificates for each one, we will validate the
## certificate itself but not insist that the name match.
$ActionSendStreamDriverAuthMode x509/certvalid

## How and where to send messages:  Mode=1 means TLS-only.
$ActionSendStreamDriverMode 1
*.info    @@ec2-50-19-188-9.compute-1.amazonaws.com:6514
*.info    @@ec2-46-137-0-211.eu-west-1.compute.amazonaws.com:6514  

Restart the service and verify that you see no error messages. Fix your configuration as needed until it restarts cleanly (I have broken the log lines for display here). Also make sure that it has established connections to both collectors. If syslog-tls is not listed in your /etc/services file and this lsof command produces an error, use tcp:6514 instead. Notice that I used the -n option here to keep the lsof output numeric, as the full hostnames make the output enormously long:

[root@server-japan-1 ~]# /etc/init.d/rsyslog restart
Shutting down system logger:                               [  OK  ]
Starting system logger:                                    [  OK  ]
[root@server-japan-1 ~]# tail /var/log/messages 
[ ... several lines deleted ... ]
Oct  7 01:17:57 server-japan-1 kernel: Kernel logging (proc) stopped.
Oct  7 01:17:57 server-japan-1 rsyslogd: [origin software="rsyslogd"
	swVersion="4.6.2" x-pid="21079" x-info="http://www.rsyslog.com"]
	exiting on signal 15.
Oct  7 01:17:58 server-japan-1 kernel: imklog 4.6.2, log source =
	/proc/kmsg started.
Oct  7 01:17:58 server-japan-1 rsyslogd: [origin software="rsyslogd"
	swVersion="4.6.2" x-pid="21085" x-info="http://www.rsyslog.com"] (re)start
[root@server-japan-1 ~]# lsof -n -i tcp:syslog-tls
COMMAND    PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 21085 root 5u IPv4  22211      0t0  TCP 10.150.167.151:53767->50.19.188.9:syslog-tls (ESTABLISHED)
rsyslogd 21085 root 6u IPv6  22214      0t0  TCP 10.150.167.151:33563->46.137.0.211:syslog-tls (ESTABLISHED) 

Step 8: Investigate Your Secure Distributed Logs

Back on both of your log collectors, run both these commands:

[root@collector-virginia ~]# ls -ltr /var/log
[root@collector-virginia ~]# ls -l /var/log/servers

Verify that you see the unified log file /var/log/servers-all and a collection of files with names like:
/var/log/servers/server-japan-1
/var/log/servers/server-singapore-1
and so on.

Also verify that lsof -i tcp:syslog-tls shows that it has an established connection with each log source and it is still listening for more connections.

Run these commands, and let them run for a while. You should see all the log sources contributing MARK messages every 60 seconds in the unified file, and you should see that the individual files are also growing:

[root@collector-virginia ~]# tail -f /var/log/servers-all
[ ... log data appears here ... ]
^C
[root@collector-virginia ~]# tail -f /var/log/servers/server-japan-1
[ ... log data appears here ... ]
^C
[root@collector-virginia ~]# tail -f /var/log/servers/server-singapore-1
[ ... log data appears here ... ]
^C 

From your desktop system, make SSH connections to your cloud servers (the syslog sources). Also make some that intentionally fail, by using the wrong key, not using a key at all, or specifying a bogus user name.

Then verify that all of that activity was captured in each server's dedicated log and in the big unified one.
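
The logger command gives you a way to inject a distinctive test message on any source and then watch for it arriving on the collectors:

[root@server-japan-1 ~]# logger -p user.info "TLS logging test from server-japan-1"

Then, on each collector:

[root@collector-virginia ~]# grep "TLS logging test" /var/log/servers-all /var/log/servers/server-japan-1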

If it is ever the case that one of the files /var/log/servers/* is more than one minute old, then some remote logging process has died.
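
That check is easy to automate with GNU find. This lists any per-host log file not written to within the last two minutes, allowing a minute of slack beyond the mark interval; no output means all sources are current:

[root@collector-virginia ~]# find /var/log/servers -type f -mmin +2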

Be aware that logs will automatically be rotated at 0400 UTC every Sunday morning. See the logrotate.conf file to modify how many old copies it keeps around, whether old versions are compressed with bzip2, and so on. Log files tend to have enough redundancy to compress to just about 5% of their original size. A scheduled job inside your organization could automatically pull down the newly archived compressed logs from the previous week at, say, 0500 UTC every Sunday morning. A cautious script would then verify that the download worked and that it contains the expected data.
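
Here is a minimal sketch of such a script, assuming a hypothetical internal archive host holding the id_rsa key, a hypothetical /archive/logs directory, and bzip2 compression of the rotated logs:

#!/bin/sh
## fetch-logs.sh -- run from cron at 0500 UTC Sunday, for example:
##   0 5 * * 0  /usr/local/bin/fetch-logs.sh
## Pull the rotated, compressed logs from one collector, then verify them.
## /archive/logs is a hypothetical local archive directory.
COLLECTOR=ec2-50-19-188-9.compute-1.amazonaws.com
scp -i ~/.ec2/id_rsa "ec2-user@${COLLECTOR}:/var/log/servers-all*.bz2" /archive/logs/ || exit 1
for FILE in /archive/logs/servers-all*.bz2
do
	## bzip2 -t tests archive integrity without writing decompressed output.
	bzip2 -t "$FILE" || echo "WARNING: $FILE failed integrity check"
done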

Now you're ready to do some log analysis with awk and sed, or maybe with perl, or with higher level tools.
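
As one tiny example, in the traditional timestamp format the hostname is the fourth whitespace-separated field, so this counts messages per source in the unified log:

[root@collector-virginia ~]# awk '{print $4}' /var/log/servers-all | sort | uniq -c | sort -rn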

