Visualizing Log Patterns with Color

Apache Logs in Color

Web server logs reveal patterns of activity by web crawlers. Some are indexing crawlers operated by search engines; others are mysterious. Another pattern is the systematic blind search for vulnerable server-side executables or other configuration problems. The pattern you want to see is the interested user who follows some path through the hyperlinks on your site, taking time to read the pages.

Maybe we could use color to help spot these patterns?

Maybe...

First, let's look at the result; the explanation comes later. Here are the most recent client requests, most recent first. Your request for this page won't appear there, because it wasn't complete when PHP automatically generated this page. But if you reload the page, you should see your initial request near the top.

34.228.38.35 US, United States 25/Apr/2019:01:47:40 /robots.txt
95.163.255.150 RU, Russian Federation 25/Apr/2019:01:47:38 /travel/japan/ise/
40.77.167.190 US, United States 25/Apr/2019:01:46:46 /
47.29.185.210 IN, India 25/Apr/2019:01:46:40 /
192.188.130.10 US, United States 25/Apr/2019:01:46:39 /cybersecurity/comptia/
54.224.214.196 US, United States 25/Apr/2019:01:46:09 /travel/france/normandy/arromanches.html
54.224.214.196 US, United States 25/Apr/2019:01:46:07 /robots.txt
157.55.39.212 US, United States 25/Apr/2019:01:45:55 /cybersecurity/cyberwar/
34.235.161.196 US, United States 25/Apr/2019:01:45:54 /travel/france/normandy/arromanches.html
68.151.142.18 CA, Canada 25/Apr/2019:01:45:12 /travel/france/normandy/pegasus-bridge.html
184.54.110.116 US, United States 25/Apr/2019:01:45:06 /open-source/performance-tuning/tcp.html
40.77.167.190 US, United States 25/Apr/2019:01:45:05 /open-source/performance-tuning/tcp.html
58.213.108.71 CN, China 25/Apr/2019:01:45:05 /open-source/performance-tuning/disks.html
82.37.83.176 GB, United Kingdom 25/Apr/2019:01:44:43 /turkish/nouns.html
1.127.106.113 AU, Australia 25/Apr/2019:01:44:29 /travel/belgium/bastogne-ardennes/
148.64.56.127 GB, United Kingdom 25/Apr/2019:01:44:02 /travel/usa/new-york-revolutionary/catholic-worker.html
209.193.50.98 US, United States 25/Apr/2019:01:43:49 /technical/dsl/
107.77.198.122 US, United States 25/Apr/2019:01:43:42 /technical/dsl/
107.3.199.97 US, United States 25/Apr/2019:01:42:39 /radio/tv-antenna.html
183.171.135.107 MY, Malaysia 25/Apr/2019:01:42:00 /travel/turkey/buses/
199.16.157.181 US, United States 25/Apr/2019:01:41:44 /travel/usa/new-york-jewish-les/?s=tweetbot
148.64.56.118 GB, United Kingdom 25/Apr/2019:01:41:40 /travel/usa/new-york-jewish-les/stanton-rivington.html
148.64.56.70 GB, United Kingdom 25/Apr/2019:01:41:15 /travel/france/jim-morrison-paris.html
54.36.148.245 FR, France 25/Apr/2019:01:40:58 /radio/probes.html
65.36.114.14 US, United States 25/Apr/2019:01:40:42 /travel/france/normandy/brecourt-manor.html
34.235.161.196 US, United States 25/Apr/2019:01:40:34 /travel/france/normandy/arromanches.html
34.235.161.196 US, United States 25/Apr/2019:01:40:33 /robots.txt
199.16.157.183 US, United States 25/Apr/2019:01:40:09 /travel/usa/new-york-jewish-les/food-along-houston.html?s=tweetbot
107.3.199.97 US, United States 25/Apr/2019:01:39:52 /radio/tv-antenna.html
173.168.222.62 US, United States 25/Apr/2019:01:39:42 /cybersecurity/cyberwar/israel.html
89.149.88.149 MD, Moldova, Republic of 25/Apr/2019:01:39:38 /technical/samsung-galaxy/linux.html
54.36.150.62 FR, France 25/Apr/2019:01:39:29 /text/reports/3alice.1.txt
123.125.71.87 CN, China 25/Apr/2019:01:38:14 /technical/
67.72.99.20 NL, Netherlands 25/Apr/2019:01:37:27 /ads.txt
96.250.4.133 US, United States 25/Apr/2019:01:37:24 /travel/usa/new-york-jewish-les/stanton-rivington.html
173.168.222.62 US, United States 25/Apr/2019:01:37:14 /cybersecurity/cyberwar/history.html
40.77.167.190 US, United States 25/Apr/2019:01:37:13 /technical/hp-printer-ready-message.html
173.168.222.62 US, United States 25/Apr/2019:01:36:34 /cybersecurity/cyberwar/
119.109.47.243 CN, China 25/Apr/2019:01:36:15 /open-source/letsencrypt-tls-cert-godaddy.html
98.167.55.86 US, United States 25/Apr/2019:01:34:57 /turkish/verbs.html
66.249.79.155 US, United States 25/Apr/2019:01:33:43 /ads.txt
68.151.142.18 CA, Canada 25/Apr/2019:01:33:23 /travel/france/normandy/omaha-beach.html
174.255.66.25 US, United States 25/Apr/2019:01:33:08 /technical/dsl/
67.36.61.30 US, United States 25/Apr/2019:01:32:24 /travel/usa/us-wash-masonic.html
207.38.87.148 US, United States 25/Apr/2019:01:30:52 /cybersecurity/root-password.html
208.64.194.249 US, United States 25/Apr/2019:01:30:49 /open-source/performance-tuning/tcp.html
24.177.194.31 US, United States 25/Apr/2019:01:30:46 /turkish/background.html
66.249.83.27 US, United States 25/Apr/2019:01:30:28 /
207.46.13.131 US, United States 25/Apr/2019:01:30:26 /turkish/nouns.html?ref=SeksDE.Com
207.46.13.131 US, United States 25/Apr/2019:01:30:25 /cybersecurity/crypto/digital-signatures-and-certificates.html
24.177.194.31 US, United States 25/Apr/2019:01:30:19 /turkish/
162.237.207.112 US, United States 25/Apr/2019:01:29:45 /cybersecurity/isc2-ccsp/operations.html
74.67.42.31 US, United States 25/Apr/2019:01:29:10 /travel/uk/ben-nevis/
150.95.186.38 JP, Japan 25/Apr/2019:01:29:08 /technical/samsung-galaxy/secret-code.html
150.95.186.38 JP, Japan 25/Apr/2019:01:29:07 /robots.txt
162.104.210.23 US, United States 25/Apr/2019:01:29:02 /open-source/multiboot-windows-openbsd/
162.237.207.112 US, United States 25/Apr/2019:01:28:50 /cybersecurity/isc2-ccsp/
210.227.113.138 JP, Japan 25/Apr/2019:01:28:19 /networking/netstat-s.html
66.249.79.151 US, United States 25/Apr/2019:01:28:17 /cybersecurity/root-password.html
207.46.13.131 US, United States 25/Apr/2019:01:27:30 /fun/teaching/nsa-vs-spyware.html
207.46.13.131 US, United States 25/Apr/2019:01:27:25 /oliver-cromwell/
46.229.161.131 US, United States 25/Apr/2019:01:26:44 /cybersecurity/monitoring.html
148.64.56.112 GB, United Kingdom 25/Apr/2019:01:26:13 /travel/usa/us-wash-ah.html
157.55.39.212 US, United States 25/Apr/2019:01:26:12 /travel/japan/kamakura/path-yagura.html
107.77.227.203 US, United States 25/Apr/2019:01:26:01 /travel/usa/new-york-revolutionary/catholic-worker.html
46.4.83.150 DE, Germany 25/Apr/2019:01:25:59 /fun/strange-signs.html
131.220.6.132 DE, Germany 25/Apr/2019:01:25:50 /
162.237.207.112 US, United States 25/Apr/2019:01:25:32 /cybersecurity/isc2-ccsp/
3.213.113.87 US, United States 25/Apr/2019:01:25:11 /cybersecurity/cyberwar/
68.151.142.18 CA, Canada 25/Apr/2019:01:25:06 /travel/france/normandy/utah-beach.html
157.55.39.212 US, United States 25/Apr/2019:01:25:05 /travel/uk/the-road-to-the-isles/steall-glen-nevis.html
107.77.227.203 US, United States 25/Apr/2019:01:24:53 /travel/usa/new-york-revolutionary/anarchists.html
68.151.142.18 CA, Canada 25/Apr/2019:01:24:51 /travel/france/normandy/
68.151.142.18 CA, Canada 25/Apr/2019:01:24:35 /travel/france/normandy/
216.154.55.209 CA, Canada 25/Apr/2019:01:23:57 /technical/dsl/
46.229.168.147 US, United States 25/Apr/2019:01:23:28 /networking/terminology.html
140.186.17.4 US, United States 25/Apr/2019:01:23:18 /open-source/sendmail-ssl.html
208.180.190.222 US, United States 25/Apr/2019:01:23:07 /networking/switch-programming.html
66.249.79.153 US, United States 25/Apr/2019:01:22:48 /travel/uk/lee-ho-fook/
46.229.168.149 US, United States 25/Apr/2019:01:22:08 /technical/samsung-galaxy/memory-upgrade.html
100.27.22.32 US, United States 25/Apr/2019:01:21:45 /travel/france/mont-saint-michel-saint-malo/mont-saint-michel.html
162.243.127.7 US, United States 25/Apr/2019:01:21:33 /technical/samsung-galaxy/linux.html
103.5.140.134 JP, Japan 25/Apr/2019:01:21:32 /travel/france/mont-saint-michel-saint-malo/mont-saint-michel.html
66.249.80.59 US, United States 25/Apr/2019:01:20:35 /technical/dsl/
5.255.250.17 US, United States 25/Apr/2019:01:20:20 /travel/russia/hospital.html
157.55.39.212 US, United States 25/Apr/2019:01:19:59 /turkish/
42.236.99.154 CN, China 25/Apr/2019:01:19:14 /
42.236.99.242 CN, China 25/Apr/2019:01:19:07 /
42.236.101.194 CN, China 25/Apr/2019:01:19:03 /
68.151.142.18 CA, Canada 25/Apr/2019:01:18:54 /travel/france/normandy/
40.77.167.190 US, United States 25/Apr/2019:01:18:19 /travel/china/guangzhou-2.html
157.55.39.212 US, United States 25/Apr/2019:01:18:12 /technical/samsung-galaxy/secret-code.html
66.249.79.153 US, United States 25/Apr/2019:01:17:21 /open-source/sendmail-ssl.html
172.243.245.221 US, United States 25/Apr/2019:01:16:36 /travel/france/jim-morrison-paris.html
148.64.56.121 GB, United Kingdom 25/Apr/2019:01:16:00 /technical/current-box.html
66.249.79.155 US, United States 25/Apr/2019:01:14:40 /travel/usa/new-york-skate-manhattan/
68.151.142.18 CA, Canada 25/Apr/2019:01:14:39 /travel/belgium/bastogne-ardennes/ardennes-forest.html
66.249.83.29 US, United States 25/Apr/2019:01:14:32 /travel/usa/new-york-mcgees/
144.76.14.17 DE, Germany 25/Apr/2019:01:13:39 /
54.36.150.69 FR, France 25/Apr/2019:01:13:09 /travel/france/loire-valley/val-de-loire.html
24.169.13.45 US, United States 25/Apr/2019:01:12:41 /radio/
108.44.122.242 US, United States 25/Apr/2019:01:12:39 /travel/usa/us-wash-ah.html

Here's what's going on.

Each line above is a request from a client, extracted from Apache's /var/www/logs/access_log file. The client IP address, timestamp, and requested path were selected with awk, and the client IP address was converted to a country, where possible, with geoiplookup.
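The field selection can be sketched on a single line. This is a minimal example, assuming Apache's combined log format; the sample line reuses an address, timestamp, and path from the listing above, but its timezone, byte count, referrer, and user agent are invented for illustration, and `extract_fields` is a hypothetical helper name. The country lookup is left out because it needs geoiplookup and its database to be installed.

```shell
#!/bin/sh
# Hypothetical log line in Apache's combined format. The timezone, byte
# count, referrer, and user agent are made up for this example.
line='34.228.38.35 - - [25/Apr/2019:01:47:40 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"'

# Replace the quotes with spaces and strip the square brackets, then let
# awk pick whitespace-separated fields:
#   $1 = client IP, $4 = timestamp, $7 = requested path
extract_fields() {
	sed -e 's/"/ /g' -e 's/\[//g' -e 's/\]//g' |
		awk '{print $1, $4, $7}'
}

fields=$(printf '%s\n' "$line" | extract_fields)
echo "$fields"
```

This prints `34.228.38.35 25/Apr/2019:01:47:40 /robots.txt`. The script further down reads fields 2, 5, and 8 instead, because `cat -n` puts a line number in front of each line.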

The first 3 octets, that is, the first 24 bits, of the IP address specify the hue. Chroma is fixed at 75% and 0.25 is added to every channel, so the strongest channel is always at 100% intensity. The resulting red, green, and blue values are scaled to the range 0-255 and printed as two-digit hexadecimal in an HTML-style color string.
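The conversion can be tried on its own with a small function. This is a sketch, not the page's script: `ip_to_color` is a hypothetical helper name, and it assumes the three octets are weighted by powers of 256 so that the hue always falls in the range 0 to 6.

```shell
#!/bin/sh
# ip_to_color (hypothetical helper): print the HTML color string for the
# first three octets of an IPv4 address given as $1.
ip_to_color() {
	echo "$1" | awk -F. '{
		chroma = 0.75
		# Assumption: weight the octets by powers of 256, so the
		# 24-bit prefix maps onto a hue in [0, 6).
		hue = 6 * ($1*65536 + $2*256 + $3) / 16777216
		# Middle channel: chroma * (1 - |hue mod 2 - 1|)
		d = hue % 2 - 1
		if (d < 0) d = -d
		x = chroma * (1 - d)
		if      (hue < 1) { r = chroma; g = x;      b = 0 }
		else if (hue < 2) { r = x;      g = chroma; b = 0 }
		else if (hue < 3) { r = 0;      g = chroma; b = x }
		else if (hue < 4) { r = 0;      g = x;      b = chroma }
		else if (hue < 5) { r = x;      g = 0;      b = chroma }
		else              { r = chroma; g = 0;      b = x }
		# Add 0.25 to every channel, scale to 0-255, print as hex.
		printf("#%02x%02x%02x\n",
			(r + 0.25)*255, (g + 0.25)*255, (b + 0.25)*255)
	}'
}

ip_to_color 34.228.38.35
ip_to_color 128.0.0.1
```

The two calls print `#ffdc3f` (a yellow) and `#3fffff` (cyan). The fourth octet is ignored, so every address in the same /24 gets the same color.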

Low-numbered /8 networks appear as red, 20.0.0.0/8 through 40.0.0.0/8 are orange shifting to yellow, 50.0.0.0/8 through 110.0.0.0/8 are shades of green, the /16 networks 130.0.0.0/16 through about 180.0.0.0/16 are shades of blue, then it's shades of purple into magenta for the /24 networks 192.0.0.0/24 and up through 223.255.255.0/24.
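To see roughly where a network lands on the color wheel, the first octet alone is enough, since it dominates the hue. The sketch below is an approximation under that assumption: `hue_band` is a hypothetical helper, it ignores the second and third octets, and the band names are just labels for the six sectors of the hue range.

```shell
#!/bin/sh
# hue_band (hypothetical helper): given only the first octet, print the
# approximate hue (0-6) and the pair of colors that hue lies between.
hue_band() {
	awk -v o="$1" 'BEGIN {
		n = split("red-yellow yellow-green green-cyan cyan-blue blue-magenta magenta-red", band, " ")
		# Assumption: the first octet contributes o/256 of the hue range.
		hue = 6 * o / 256
		printf("%d.0.0.0 hue %.2f %s\n", o, hue, band[int(hue) + 1])
	}'
}

for octet in 10 30 90 150 200
do
	hue_band "$octet"
done
```

Each output line names the sector for one first octet, for instance red-yellow for 10 and blue-magenta for 200, which can be compared against the ranges described above.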

The HTML file on the server has a line where PHP uses passthru() to call the following shell script:

#!/bin/sh

# Initial pipeline:
# tail		Just the last 200 lines (somewhat fewer remain after the grep)
# grep		... keep only the GET requests answered with status 200
# cat -n | sort -nr	... number the lines, then sort into reverse order
# sed		... replace the quotes with spaces, remove the square brackets
# awk		... print the IP address twice, timestamp, and requested path
# sed		... replace the first 3 dots with spaces to split the first
#			copy of the IP address into octets, and remove any
#			characters that could cause trouble when inserted
#			into this page
# I need to use the client IP address, field #5 at that point, to call
# geoiplookup.  So, send the initial pipeline into a while loop that
# assigns variables, sets a new variable, and then echoes the resulting
# collection into awk.
tail -n 200 /var/www/logs/access_log |
	grep '"GET [^"]*" 200 ' |
	cat -n | sort -nr |
	sed -e 's/"/ /g' -e 's/\[//g' -e 's/\]//g' |
	awk '{print $2, $2, $5, $8}' |
	sed -e 's/\./ /' -e 's/\./ /' -e 's/\./ /' -e 's/[<>]//g' |
	while read IP1 IP2 IP3 IP4 CLIENTIP TIMESTAMP URL
	do
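		COUNTRY=$( geoiplookup "$CLIENTIP" |
				sed 's/.*Edition: //' |
				sed 's/IP Address not found/Unknown/' )
		# Quote the request data so that a URL containing * or ? is
		# not expanded by the shell as a filename pattern.  COUNTRY
		# stays unquoted so its output collapses onto one line.
		echo "$IP1 $IP2 $IP3 $IP4 $CLIENTIP" $COUNTRY "$TIMESTAMP $URL" |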
		awk '{
			ip1 = $1;
			ip2 = $2;
			ip3 = $3;
			chroma = 0.75;
			# Octets are base-256 digits, so weight by powers of 256.
			hue = 6*(ip1*256*256 + ip2*256 + ip3)/(256*256*256);
			if (hue%2 > 1) {
				x = chroma*(1.0 - (hue%2 - 1));
			} else {
				x = chroma*(1.0 - (1 - hue%2));
			}
			if (hue < 1.0) {
				r = chroma;
				g = x;
				b = 0;
			} else if (hue < 2.0) {
				r = x;
				g = chroma;
				b = 0;
			} else if (hue < 3.0) {
				r = 0;
				g = chroma;
				b = x;
			} else if (hue < 4.0) {
				r = 0;
				g = x;
				b = chroma;
			} else if (hue < 5.0) {
				r = x;
				g = 0;
				b = chroma;
			} else {
				r = chroma;
				g = 0;
				b = x;
			}
			r = (r + 0.25)*255;
			g = (g + 0.25)*255;
			b = (b + 0.25)*255;

			printf("<div class=\"col-12 textleft\" ");
			printf("style=\"color:#000; background:#%02x%02x%02x;\"> ", r, g, b);
			for (i = 5; i <= NF; i++) {
				printf("%s ", $i);
			}
			printf("</div>\n");
		}'
	done 
