
Visualizing Log Patterns with Color

Nginx and Apache Logs in Color

Web server logs reveal patterns of activity by web crawlers. Some are indexing crawlers operated by search engines, some are mysterious. Another pattern is systematic blind searches for vulnerable server-side executables or other configuration problems. The pattern you want to see is the interested user who follows some path through the hyperlinks on your site, taking time to read the pages.

Maybe we could use color to help spot these patterns?


First, let's look at the result; the explanation comes later. Here are the most recent client requests, most recent first. Your request for this page won't appear because it had not completed when PHP generated the page. If you reload the page, you should see your initial request near the top.

US, United States 12/Apr/2024:13:25:48 /robots.txt
US, United States 12/Apr/2024:13:25:41 /travel/uk/edinburgh/
FR, France 12/Apr/2024:13:24:50 /travel/usa/venice-california/
UA, Ukraine 12/Apr/2024:13:24:20 /travel/japan/katakana-hiragana/
US, United States 12/Apr/2024:13:24:10 /cybersecurity/
CH, Switzerland 12/Apr/2024:13:24:09 /travel/usa/new-york-hp-lovecraft/?s=mb
IR, Iran, Islamic Republic of 12/Apr/2024:13:23:47 /travel/usa/new-york-hp-lovecraft/?s=mb
DE, Germany 12/Apr/2024:13:23:46 /travel/usa/new-york-hp-lovecraft/?s=mb
DE, Germany 12/Apr/2024:13:23:36 /travel/belgium/art-nouveau-architecture/?s=mb
IN, India 12/Apr/2024:13:22:57 /travel/turkey/istanbul/?s=mb
Unknown 12/Apr/2024:13:22:39 /travel/usa/new-york-jewish-les/synagogues.html
Unknown 12/Apr/2024:13:22:38 /robots.txt
CN, China 12/Apr/2024:13:22:37 /travel/japan/tokyo-yasukuni-shrine/
US, United States 12/Apr/2024:13:22:28 /cybersecurity/crypto/?s=mb
US, United States 12/Apr/2024:13:22:25 /travel/usa/earthworks/merom-site/
IN, India 12/Apr/2024:13:22:11 /travel/usa/new-york-marvel/?s=mb
US, United States 12/Apr/2024:13:22:05 /travel/usa/poe-in-new-york/
US, United States 12/Apr/2024:13:22:02 /travel/uk/avebury/?s=mb
US, United States 12/Apr/2024:13:21:50 /travel/usa/new-york-roosevelts/?s=mb
US, United States 12/Apr/2024:13:21:15 /travel/france/school-lunch-menus/
Unknown 12/Apr/2024:13:20:57 /travel/usa/north-carolina/?s=mb
UA, Ukraine 12/Apr/2024:13:20:56 /travel/japan/katakana-hiragana/
SK, Slovakia 12/Apr/2024:13:20:54 /travel/usa/new-york-roosevelts/?s=mb
GB, United Kingdom 12/Apr/2024:13:20:49 /travel/japan/katakana-hiragana/
US, United States 12/Apr/2024:13:20:45 /travel/usa/new-york-roosevelts/?s=mb
UA, Ukraine 12/Apr/2024:13:20:20 /travel/japan/katakana-hiragana/
US, United States 12/Apr/2024:13:20:14 /travel/belgium/bastogne-ardennes/bastogne.html
US, United States 12/Apr/2024:13:20:08 /travel/usa/poe-in-new-york/
US, United States 12/Apr/2024:13:20:02 /travel/usa/north-carolina/?s=tb
US, United States 12/Apr/2024:13:20:02 /robots.txt
US, United States 12/Apr/2024:13:20:02 /robots.txt
US, United States 12/Apr/2024:13:19:45 /cybersecurity/availability/cloud-archiving.html?s=mc
US, United States 12/Apr/2024:13:19:44 /travel/usa/new-york-roosevelts/?s=mb
CN, China 12/Apr/2024:13:19:31 /open-source/
FR, France 12/Apr/2024:13:19:20 /travel/uk/ben-nevis/
US, United States 12/Apr/2024:13:19:11 /travel/uk/glen-nevis/?s=mb
FR, France 12/Apr/2024:13:18:42 /travel/usa/new-york-roosevelts/?s=mb
IS, Iceland 12/Apr/2024:13:18:32 /travel/usa/new-york-hp-lovecraft/?s=mb
US, United States 12/Apr/2024:13:18:13 /open-source/torrent-magnet-links.html
US, United States 12/Apr/2024:13:18:10 /travel/usa/poe-in-new-york/
IT, Italy 12/Apr/2024:13:18:10 /travel/usa/new-york-roosevelts/?s=mb
IN, India 12/Apr/2024:13:18:04 /travel/japan/kofun/emperor-kaika.html?s=mb
US, United States 12/Apr/2024:13:17:44 /travel/usa/us-wash-masonic.html?s=mb
US, United States 12/Apr/2024:13:17:34 /travel/japan/tokyo-asakusa/hoppy-street.html?s=mb
US, United States 12/Apr/2024:13:17:19 /open-source/sendmail-ssl.html
SG, Singapore 12/Apr/2024:13:17:16 /travel/usa/new-york-roosevelts/?s=mb
US, United States 12/Apr/2024:13:17:11 /travel/usa/new-york-roosevelts/?s=mb
AP, Asia/Pacific Region 12/Apr/2024:13:17:09 /travel/uk/Index.html
CA, Canada 12/Apr/2024:13:17:07 /travel/usa/new-york-roosevelts/?s=mb
US, United States 12/Apr/2024:13:17:07 /travel/usa/new-york-roosevelts/?s=mb
US, United States 12/Apr/2024:13:17:04 /travel/usa/new-york-roosevelts/?s=mb
ZA, South Africa 12/Apr/2024:13:17:04 /travel/usa/new-york-roosevelts/?s=mb
US, United States 12/Apr/2024:13:17:02 /travel/usa/new-york-roosevelts/?s=mb
NL, Netherlands 12/Apr/2024:13:17:00 /travel/usa/new-york-roosevelts/?s=mb
FR, France 12/Apr/2024:13:17:00 /travel/usa/new-york-roosevelts/?s=mb

Here's what's going on.

Each line above is a request from a client, extracted from Nginx's /var/www/logs/httpd-access.log file. The client IP address, timestamp, and requested path are selected with awk, and geoiplookup converts the client IP address to a country where possible.
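As a rough sketch of that extraction step, the function below reduces one log line to the three fields of interest. It assumes the log uses the common "combined" format (client IP first, timestamp in square brackets, request in double quotes). The name extract_fields and the sample line are hypothetical, the address 203.0.113.7 comes from a documentation-only range, and the geoiplookup call is left out so the sketch runs without a GeoIP database.

```shell
#!/bin/sh
# Hypothetical helper: reduce combined-format log lines on stdin to
# "client-IP timestamp path". Assumes the combined log format.
extract_fields() {
	# Turn quotes into spaces and drop square brackets so that awk
	# can split the line on whitespace alone.
	sed -e 's/"/ /g' -e 's/\[//g' -e 's/\]//g' |
		awk '{print $1, $4, $7}'
}

# Made-up sample line, for illustration only.
sample='203.0.113.7 - - [12/Apr/2024:13:25:48 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"'
printf '%s\n' "$sample" | extract_fields
```

The page's own script numbers the lines with cat -n before this step, which shifts every field one position to the right; that is why it prints $2, $5, and $8 rather than $1, $4, and $7.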

Geolocate IP

You can use a service such as Abstract's IP geolocation to check if the conversion was successful, or if the client IP address is the exit portal of a VPN.

The first 3 octets or first 24 bits of the IP address are used to specify the hue, with chroma at 75% and intensity at 100%. The resulting red, green, and blue values are scaled to the range of 0-255 and printed as two-character hexadecimal in an HTML style string.
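The mapping just described can be sketched as a small standalone function. This is a minimal sketch rather than the page's actual code: the name ip_to_color is hypothetical, and the arithmetic mirrors the awk program in the script shown further down (hue from the first three octets, chroma 0.75, and an offset of 0.25 so the brightest channel reaches 255).

```shell
#!/bin/sh
# Hypothetical helper: print an HTML color string for a dotted-quad
# IPv4 address given as the first argument.
ip_to_color() {
	printf '%s\n' "$1" | awk -F. '{
		chroma = 0.75
		# Scale the first three octets to a hue between 0 and 6.
		hue = 6 * ($1*255*255 + $2*255 + $3) / (255*255*255)
		# x ramps up and down inside each pair of sextants:
		# x = chroma * (1 - |hue mod 2 - 1|)
		d = hue % 2 - 1
		if (d < 0) d = -d
		x = chroma * (1 - d)
		if      (hue < 1) { r = chroma; g = x;      b = 0 }
		else if (hue < 2) { r = x;      g = chroma; b = 0 }
		else if (hue < 3) { r = 0;      g = chroma; b = x }
		else if (hue < 4) { r = 0;      g = x;      b = chroma }
		else if (hue < 5) { r = x;      g = 0;      b = chroma }
		else              { r = chroma; g = 0;      b = x }
		# Lift every channel by 1 - chroma, then scale to 0-255.
		m = 1 - chroma
		printf("#%02x%02x%02x\n", int((r+m)*255), int((g+m)*255), int((b+m)*255))
	}'
}

ip_to_color 0.0.0.0      # prints #ff3f3f (pale red)
ip_to_color 85.0.0.1     # prints #3fff3f (pale green)
ip_to_color 170.0.0.1    # prints #3f3fff (pale blue)
```

Because no channel ever drops below 0x3f, black text stays readable on every background the function can produce.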

Low-numbered /8 networks appear as red, followed by orange shifting to yellow and then shades of green. The /16 networks are shades of blue, and the /24 networks run from shades of purple into magenta.
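Where each band starts follows from the hue formula in the script below: the hue is roughly 6 times the first octet divided by 255, so each of the six hue sextants covers about 42 first-octet values. The sketch below lists those ranges. It ignores the small contribution of the second and third octets, and the function name hue_bands and the band labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list which first octets fall in each hue sextant,
# ignoring the second and third octets.
hue_bands() {
	awk 'BEGIN {
		n = split("red-yellow yellow-green green-cyan cyan-blue blue-magenta magenta-red", name, " ")
		for (o = 0; o <= 255; o++) {
			s = int(6 * o / 255)
			# Octet 255 gives a hue of exactly 6, which the script
			# handles in its final else branch, so fold it into band 5.
			if (s > 5) s = 5
			if (!(s in first)) first[s] = o
			last[s] = o
		}
		for (s = 0; s < n; s++)
			printf("%d-%d %s\n", first[s], last[s], name[s + 1])
	}'
}

hue_bands
```

Under these assumptions the bands come out as 0-42, 43-84, 85-127, 128-169, 170-212, and 213-255.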

The HTML file on the server has a line where PHP uses passthru() to call the following shell script:


# Initial pipeline:
# tail		Just the last 200 (or slightly less after the grep)
# grep		... just the requests out of that
# cat | sort	... put into reverse order
# sed		... remove the quotes and square brackets
# awk		... print the IP address twice, timestamp, and requested path
# sed		... remove the first 3 dots to split first version of IP
#			address into octets, and remove any characters that
#			could cause trouble when inserted into this page
# I need to use the client IP address, field #5 at that point, to call
# geoiplookup.  So, send the initial pipeline into a while loop that
# assigns variables, sets a new variable, and then echoes the resulting
# collection into awk.
# The loop header, the line feeding awk, and the closing lines are a
# reconstruction consistent with the comments above; variable names
# other than CLIENTIP and COUNTRY are assumed.
tail -n 200 /var/www/logs/access_log |
	grep 'GET.*200' |
	cat -n | sort -nr |
	sed -e 's/"/ /g' -e 's/\[//g' -e 's/\]//g' |
	awk '{print $2, $2, $5, $8}' |
	sed -e 's/\./ /' -e 's/\./ /' -e 's/\./ /' -e 's/[<>]//g' |
	while read -r IP1 IP2 IP3 IP4 CLIENTIP TIMESTAMP REQUEST
	do
		COUNTRY=$( geoiplookup "$CLIENTIP" |
				sed 's/.*Edition: //' |
				sed 's/IP Address not found/Unknown/' )
		# Fields 1-3 drive the color; fields 5 and up are displayed.
		printf '%s\n' "$IP1 $IP2 $IP3 $IP4 $COUNTRY $TIMESTAMP $REQUEST" |
		awk '{
			ip1 = $1;
			ip2 = $2;
			ip3 = $3;
			chroma = 0.75;
			# Map the first three octets onto a hue from 0 to 6.
			hue = 6*(ip1*255*255 + ip2*255 + ip3)/(255*255*255);
			# x = chroma * (1 - |hue mod 2 - 1|)
			if (hue%2 > 1) {
				x = chroma*(1.0 - (hue%2 - 1));
			} else {
				x = chroma*(1.0 - (1 - hue%2));
			}
			if (hue < 1.0) {
				r = chroma;
				g = x;
				b = 0;
			} else if (hue < 2.0) {
				r = x;
				g = chroma;
				b = 0;
			} else if (hue < 3.0) {
				r = 0;
				g = chroma;
				b = x;
			} else if (hue < 4.0) {
				r = 0;
				g = x;
				b = chroma;
			} else if (hue < 5.0) {
				r = x;
				g = 0;
				b = chroma;
			} else {
				r = chroma;
				g = 0;
				b = x;
			}
			# Add 1 - chroma so intensity is 100%, then scale to 0-255.
			r = (r + 0.25)*255;
			g = (g + 0.25)*255;
			b = (b + 0.25)*255;

			printf("<div class=\"col-12 textleft\" ");
			printf("style=\"color:#000; background:#%02x%02x%02x;\"> ", r, g, b);
			# Country, timestamp, and requested path.
			for (i = 5; i <= NF; i++) {
				printf("%s ", $i);
			}
			printf("</div>\n");
		}'
	done
