
Textual Analysis for Network Attack Recognition
Real Attack Data Patterns

Real Data and Common Patterns

The following data comes from syslog records collected on some of the Linux hosts on one /24 subnet at a major U.S. university over 12 consecutive months. None of the hosts were intended as "honeypot" systems; they were all in laboratory use for a mixture of compute service and desktop use.

Not all systems were contributing syslog data throughout the period. Some joined the project later, while others had their operating systems re-installed by graduate students who did not always reconfigure the local syslog service.

The table below includes links to the reports. Click on a month name if you want to see a rather large detailed report.

Month  Contributing Target Hosts  Attacking Hosts  Attack Sequences Captured  Report Size
October  3  100  229  285 kbytes
November  3  347  741  1.5 Mbytes
December  3  113  294  364 kbytes
January  4  120  289  404 kbytes
February  10  105  750  440 kbytes
March  10  129  844  588 kbytes
April  9  124  824  540 kbytes
May  9  759  2126  4.0 Mbytes
June  9  183  778  624 kbytes
July  12  171  799  616 kbytes
August  12  192  838  736 kbytes
September  10  955  1965  6.9 Mbytes
Summary  average = 7.83  total = 3298  total = 10477  17.4 Mbytes

Each monthly report starts with a table ranking the attacking hosts in decreasing order of number of attacks detected. Then a large table shows a summary of the set of attacks from one host. The attack descriptions are in order of their start time. Here is one early example from October:

Attacker Target Start End Password guesses for:
root non-root invalid users all users
71.170.120.217
pool-71-170-120-217.dllstx.fios.verizon.net.

OrgName: Verizon Internet Services Inc.
Address: 1880 Campus Commons Dr
City: Reston
StateProv: VA
Country: US
NetName: VIS-BLOCK
NetHandle: NET-71-169-192-0-1
Parent: NET-71-0-0-0-0
ipanema Oct 1 08:09:05 Oct 1 08:10:22
77 seconds
15 13 / 13 140 / 120 168 / 134
0.46 sec/guess
copacabana Oct 1 08:09:06 Oct 1 08:10:23
77 seconds
15 13 / 13 140 / 120 168 / 134
0.46 sec/guess
total:
2 targets
336 probes
Oct 1 08:09:05 Oct 1 08:10:23
78 seconds
30 26 / 13 280 / 120 336 / 134
0.23 sec/guess

The attacking host was at IP address 71.170.120.217. A DNS PTR lookup successfully resolved that to the fully-qualified domain name pool-71-170-120-217.dllstx.fios.verizon.net. Because PTR lookups frequently failed, a whois lookup was also performed on the IP address.

That attack hit two target hosts, ipanema and copacabana, listed in order of when each part of the attack started.

This was a multi-threaded vertical attack, based on the almost completely overlapping periods. The attack made 15 guesses for the root password, 1 guess each for the passwords of 13 accounts that happened to exist on the system, and a total of 140 guesses for the passwords of 120 accounts that did not exist. The precise sequence of guesses is as follows, with root marked in red, existing accounts marked in yellow, and the remaining invalid accounts marked in green.

staff sales recruit alias office samba tomcat webadmin spam virus cyrus oracle michael ftp test webmaster postmaster postfix postgres paul root guest admin linux user david web apache pgsql mysql info tony core newsletter named visitor ftpuser username administrator library test root root admin guest master root root root root root admin admin admin admin root root test test webmaster username user root admin test root root root danny alex brett mike alan data www-data http httpd pop nobody root backup info shop sales web www wwwrun adam stephen richard george john news angel games pgsql mail adm ident webpop susan sunny steven ssh search sara robert richard party amanda rpm operator sgi sshd users admins admins bin daemon lp sync shutdown halt uucp smmsp dean unknown securityagent tokend windowserver appowner xgridagent agent xgridcontroller jabber amavisd clamav appserver mailman cyrusimap qtss eppc telnetd identd gnats jeff irc list eleve proxy sys zzz frank dan james snort radiomail harrypotter divine popa3d aptproxy desktop workshop mailnull nfsnobody rpcuser rpc gopher

As these types of attack go, this one was rather aggressive, with about two guesses per second on each target host.
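The "almost completely overlapping periods" used above to call this a multi-threaded attack can be quantified. The following is a minimal sketch, not the analysis software behind these reports: the function names and the intersection-over-union measure are assumptions chosen for illustration, and the year is a placeholder because syslog timestamps do not carry one.

```python
from datetime import datetime

def ts(text):
    # Syslog timestamps carry no year; 2000 is an arbitrary placeholder.
    return datetime.strptime("2000 " + text, "%Y %b %d %H:%M:%S")

def overlap_fraction(a_start, a_end, b_start, b_end):
    """Shared time divided by total time spanned; 0.0 if the periods are disjoint."""
    shared = (min(a_end, b_end) - max(a_start, b_start)).total_seconds()
    spanned = (max(a_end, b_end) - min(a_start, b_start)).total_seconds()
    if shared <= 0 or spanned <= 0:
        return 0.0
    return shared / spanned

# Start and end times of the October 1 attack on the two targets.
ipanema = (ts("Oct 1 08:09:05"), ts("Oct 1 08:10:22"))
copacabana = (ts("Oct 1 08:09:06"), ts("Oct 1 08:10:23"))

fraction = overlap_fraction(*ipanema, *copacabana)
print(f"{fraction:.3f}")  # 76 shared seconds out of 78 spanned
```

A value near 1.0 suggests the targets were attacked in parallel, while a value near 0.0 suggests one target after another.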

The opposite extreme of timing appears to occur in the very next attack sequence captured, coming from Jiangsu Province in the People's Republic of China.

This appears to be an unusually subtle attack, spread out over nearly a week and making just 24 guesses on one target and 14 on another. That works out to about 6.2 hours between guesses on target ipanema and about 11 hours on target copacabana.
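Those hourly figures can be checked against the durations and guess counts in the summary table for this attack. The sec/guess values in the reports appear to be the duration divided by the number of gaps between guesses (guesses minus one). That is inferred from the numbers, not stated anywhere in the reports, and the function name is an illustrative assumption.

```python
def seconds_per_guess(duration_seconds, guesses):
    """Average gap between consecutive guesses: duration / (guesses - 1)."""
    if guesses < 2:
        raise ValueError("need at least two guesses to measure a gap")
    return duration_seconds / (guesses - 1)

# Durations and guess counts as listed in the report for 218.3.120.196.
ipanema_gap = seconds_per_guess(514542, 24)
copacabana_gap = seconds_per_guess(513962, 14)

print(round(ipanema_gap, 2), round(ipanema_gap / 3600, 1))        # 22371.39 6.2
print(round(copacabana_gap, 2), round(copacabana_gap / 3600, 1))  # 39535.54 11.0
```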

The sequences appeared to start almost ten minutes apart and continue for nearly a week, indicating another multi-threaded vertical attack, and then to stop within one second of each other. My first guess was that the attacking host had been shut down at that point, or the attack process otherwise killed off.

However, closer investigation of the captured log data revealed a complication for this study, one that required additional analytic software development! This one compromised host was used as an attack platform for two separate, similar but non-identical attacks: one lasting about 35 minutes on the morning of October 1st, and the second lasting about 45 minutes on the morning of October 7th.

Target Start End Password guesses for:
root non-root invalid users all users
218.3.120.196


inetnum: 218.3.120.192 - 218.3.120.223
netname: ZHENJIANG-DY-E_EDUCATION-CENTER
descr: Danyang E_Education Center
descr: Zhenjiang City
descr: Jiangsu Province
country: CN
address: No.18,Dianli Road,Zhenjiang 212007
address: XINMINDONG ROAD,DANYANG
ipanema Oct 1 09:19:11 Oct 7 08:14:53
514542 seconds
    24 / 5 24 / 5
22371.39 sec/guess
copacabana Oct 1 09:28:50 Oct 7 08:14:52
513962 seconds
    14 / 3 14 / 3
39535.54 sec/guess
total:
2 targets
38 probes
Oct 1 09:19:11 Oct 7 08:14:53
514542 seconds
    38 / 5 38 / 5
13906.54 sec/guess

A cormorant fisherman on the Li River at Yangshuo, in south-eastern China, the source of so many of these attacks.

One of the early stages of log analysis was redesigned to detect and separate different attacks from the same attacking host against the same target within one general period of time. Investigation indicated that guesses within one attack sequence occur within 10 seconds of each other, while distinct attack sequences from one attacking host, as observed so far, have been separated by a few days. An arbitrary threshold of 300 seconds (5 minutes) was found useful to detect the start of a new sequence.
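The gap-threshold rule can be sketched in a few lines. This is only an illustration of the idea, not the actual analysis code: the function name, the (seconds, login) event format, and the sample timestamps are all made up for the example. Only the 300-second threshold comes from the text above.

```python
def split_sequences(events, gap_threshold=300):
    """Split (epoch_seconds, login) pairs into separate attack sequences.

    A new sequence starts whenever the gap since the previous guess
    exceeds gap_threshold seconds.
    """
    sequences = []
    previous_time = None
    # Sort by time only, so guesses logged in the same second keep their order.
    for when, login in sorted(events, key=lambda event: event[0]):
        if previous_time is None or when - previous_time > gap_threshold:
            sequences.append([])
        sequences[-1].append(login)
        previous_time = when
    return sequences

# Hypothetical guesses from one attacker against one target:
# three guesses a few seconds apart, then two more six days later.
events = [(0, "test"), (4, "admin"), (9, "guest"),
          (518400, "test"), (518405, "admin")]
print(split_sequences(events))
# [['test', 'admin', 'guest'], ['test', 'admin']]
```

Any threshold between the 10-second spacing within a sequence and the multi-day spacing between sequences would separate these two cases equally well.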

Once the sequences were separated, their significant differences were obvious. They were all based on the small set {admin, guest, mysql, test, webmaster}, but the sequences varied in length, order, and members:

Date  Target  Sequence
1 Oct  ipanema   test admin test admin test admin guest mysql webmaster test admin guest
1 Oct  copacabana   test admin guest test admin
7 Oct  ipanema   test test admin guest test admin guest test test admin test admin
7 Oct  copacabana   test test test admin test admin test admin guest

Moving further into October, here is the first instance in these logs of what will become a familiar pattern:

Target Start End Password guesses for:
root non-root invalid users all users
140.124.181.244


inetnum: 140.117.0.0 - 140.138.255.255
netname: TANET-BNETA
descr: imported inetnum object for MOEC
country: TW
address: Ministry of Education computer Center
address: 12F, No 106, Sec. 2, Heping E. Rd., Taipei
address: Taipei Taiwan
inetnum: 140.124.0.0 - 140.124.255.255
netname: T-NTUT.EDU.TW-NET
descr: National Taipei University of Technology
descr: Taipei Taiwan
address: National Taipei University of Technology
copacabana Oct 24 15:54:44 Oct 24 15:55:16
32 seconds
3   6 / 4 9 / 5
4.00 sec/guess
ipanema Oct 24 15:54:44 Oct 24 15:55:14
30 seconds
3   6 / 4 9 / 5
3.75 sec/guess
xoanon Oct 24 15:54:45 Oct 24 15:55:15
30 seconds
3   6 / 4 9 / 5
3.75 sec/guess
total:
3 targets
27 probes
Oct 24 15:54:44 Oct 24 15:55:16
32 seconds
9   18 / 4 27 / 5
1.23 sec/guess

Another multi-threaded vertical attack, not overly aggressive on any one host. A period of 3.5 to 4 seconds per guess seems to be pretty typical across all the attacks logged on this set of targets.

The interesting feature of this attack is the sequence of accounts guessed:
test guest admin admin user root root root test
That sequence had shown up again and again, and was the original motivation for the investigation resulting in this collection of web pages! Here are the instances of the "9/5" attack seen just in the first four months of data collection. With the exception of the attack on November 5, these attacks were identical — multi-threaded vertical attacks with one guess per target every 3 to 4 seconds, guessing passwords for identical sequences of logins.

Instances of the "9/5" or "test guest admin admin user root root root test" attack
Attacker Start time Targets Notes
140.124.181.244
Ministry of Education Computer Center, Taipei, Taiwan
Oct 24 15:54:44 xoanon, ipanema, copacabana
200.226.124.15
cheeseegg.ig.com.br
Internet Group do Brasil Ltda
Oct 30 22:08:33 xoanon, ipanema, copacabana This host returns for another attack on November 23!
218.1.65.233
China Telecom, Room 805, 61 North Si Chuan Road, Shanghai, PRC
Nov 5 15:45:24 xoanon, ipanema, copacabana This is some variation, as it spread the guesses over 18 days on xoanon and copacabana. It only attempted:
test test test guest
on copacabana, and:
test guest admin
on ipanema, only attacking ipanema during the final 19 seconds of the overall attack sequence.
60.248.162.135
60-248-162-135.HINET-IP.hinet.net
Chunghwa Telecom Co., Ltd. Data-Bldg 6F, No.21, Sec.21, Hsin-Yi Rd., Taipei Taiwan
Nov 10 12:58:19 xoanon, ipanema, copacabana
221.4.182.146
CNC Group Guangdong province network, PRC
Nov 20 19:22:34 xoanon, ipanema, copacabana
200.226.124.15
cheeseegg.ig.com.br
Internet Group do Brasil Ltda
Nov 23 21:08:32 xoanon, ipanema, copacabana This host had already done this attack against these same targets on October 30!
212.87.231.34
kpts.pcz.czest.pl
Institute of Computer and Information Science, Technical University of Czestochowa, Poland
Nov 27 14:38:23 xoanon, ipanema, copacabana
80.55.184.58
wc58.internetdsl.tpnet.pl
IDSL customer, Tychy company, Warszawa, Poland
Dec 2 06:38:21 ipanema, copacabana
59.120.195.104
59-120-195-104.HINET-IP.hinet.net
CHTD, Chunghwa Telecom Co., Ltd. Data-Bldg 6F, No.21, Sec.21, Hsin-Yi Rd., Taipei, Taiwan
Dec 13 08:18:51 ipanema, copacabana
62.115.65.34
62-115-65-34.customer.teliacarrier.com
Ariave Satcom LTD, CallSat Telecom, 122 Athalassas Ave, Nicosia, Cyprus
Jan 4 17:37:16 ipanema
66.0.90.52
Piedmont Municipal, apparently near Atlanta GA, USA
Jan 18 19:21:08 xoanon, ipanema, copacabana
140.116.214.87
Ministry of Education computer Center, Taipei, Taiwan
Jan 19 12:00:53 xoanon, ipanema, copacabana
222.124.169.163
Pt. Telekomunikasi Indonesia Jakarta, Indonesia
Jan 21 07:55:48 xoanon, ipanema, copacabana
211.203.181.6
Hanaro Telecom Co, Seoul, Korea
Jan 28 22:14:42 xoanon, ipanema, copacabana
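The "9/5" label is simply the tally of guesses over distinct logins for this sequence, in the same "guesses / accounts" notation the report tables use. A small sketch of that tally follows. The function name and its existing_accounts parameter are illustrative assumptions. The October 24 report shows an empty non-root column, so the sketch assumes that none of the guessed non-root logins existed on the targets.

```python
def tally(guesses, existing_accounts=frozenset()):
    """Return (guess count, distinct logins) for each report column."""
    buckets = {"root": [], "non-root": [], "invalid": [], "all": list(guesses)}
    for login in guesses:
        if login == "root":
            buckets["root"].append(login)
        elif login in existing_accounts:
            buckets["non-root"].append(login)   # real account other than root
        else:
            buckets["invalid"].append(login)    # no such account on the target
    return {name: (len(items), len(set(items))) for name, items in buckets.items()}

nine_five = "test guest admin admin user root root root test".split()
summary = tally(nine_five)
print(summary)
# {'root': (3, 1), 'non-root': (0, 0), 'invalid': (6, 4), 'all': (9, 5)}
```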

A relatively simple attack like this "9/5" sequence can be noticed just by looking at a table summarizing attacks. The problem is that attacks are often very similar without being identical. Here is an example of that phenomenon in an attack from Korea:

Target Start End Password guesses for:
root non-root invalid users all users
125.245.59.159


inetnum: 125.240.0.0 - 125.247.255.255
netname: PUBNETPLUS
descr: DACOM-PUBNETPLUS
descr: DACOM Bldg, 65-228. Hangangro3ga. Yongsan-gu, SEOUL, 140-716
descr: Allocated to KRNIC Member.
descr: If you would like to find assignment
descr: information in detail please refer to
descr: the KRNIC Whois Database at:
descr: "http://whois.nic.or.kr/english/index.html"
country: KR
address: 65-228, 3Ga, Hangang-ro, Yongsan-gu, Seoul
inetnum: 125.240.0.0 - 125.251.255.255
netname: PUBNETPLUS-KR
ipanema Oct 16 01:36:43 Oct 16 01:43:02
379 seconds
26 4 / 4 45 / 30 75 / 35
5.12 sec/guess
copacabana Oct 16 01:36:43 Oct 16 01:43:32
409 seconds
26 4 / 4 49 / 34 79 / 39
5.24 sec/guess
xoanon Oct 16 01:36:57 Oct 16 01:48:19
682 seconds
27 10 / 10 89 / 69 126 / 80
5.46 sec/guess
total:
3 targets
280 probes
Oct 16 01:36:43 Oct 16 01:48:19
696 seconds
79 18 / 11 183 / 69 280 / 80
2.49 sec/guess

As usual, a multi-threaded vertical attack. It's interesting that two of these three started within one second while the third started 14 seconds later. I would guess that many other threads were started during that time, attacking other targets not observed in this syslog collection.

Notice that the first two threads terminated early, but not simultaneously. Simultaneous termination would have suggested that the attacking host had been shut down or the attack processes all killed. The longer attack against target xoanon continued for about five minutes more.

Let's consider the attack against target xoanon as the intended pattern. Red background indicates logins attacked on all three targets, yellow indicates logins attacked on xoanon and copacabana only, and blue indicates logins attacked on xoanon only:
root fluffy admin test guest webmaster mysql oracle library info shell linux unix webadmin ftp test root admin guest master apache root root network word root root root root root root root root root root root root root root admin admin admin admin root root test test webmaster user username username user root admin test root root root root danny sharon aron alex brett mike alan data www-data http httpd nobody root backup info shop sales web www wwwrun adam stephen richard george michael john david paul news angel games pgsql pgsql mail adm ident resin mikael mike suva webpop technicom susan sunsun root sunny steven ssh search sara robert richard postmaster party michael amanda mysql rpm operator sgi Aaliyah Aaron Aba Abel Jewel sshd users

Based on Observations So Far, What We Should Expect:

Many (but not all) attacks are multi-threaded vertical scans.

Sequences against targets may be identical, but frequently one target list terminates early.

Less frequently, we may see guesses skipped within the list.

So, we are looking to discover clusters of similar lists. They will be highly similar, possibly identical, at least to some point. Members of a cluster likely terminate or skip entries at different positions in the sequence.
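One way to sketch such clustering is shown below. It is an illustration under stated assumptions, not the method used to produce these reports: Python's difflib stands in as the sequence matcher, the helper names are made up, and the 0.8 threshold is arbitrary. The score divides the matched guesses by the length of the shorter list, so a list that merely terminates early still scores as a full match. The sample lists are the "9/5" sequence and the two shortened variants seen in the November 5 attack.

```python
from difflib import SequenceMatcher

def common_prefix_length(a, b):
    """Number of leading guesses the two lists share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def coverage(a, b):
    """Matched guesses divided by the length of the shorter list."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    return sum(block.size for block in blocks) / shorter

def cluster(sequences, threshold=0.8):
    """Greedy clustering against the first member of each existing cluster."""
    clusters = []
    for seq in sequences:
        for group in clusters:
            if coverage(group[0], seq) >= threshold:
                group.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters

full = "test guest admin admin user root root root test".split()
early = "test guest admin".split()        # terminates early
other = "test test test guest".split()    # different ordering

print(common_prefix_length(full, early))  # 3
clusters = cluster([full, list(full), early, other])
print([len(group) for group in clusters])
```

The position where the common prefix ends shows where a cluster member terminated or began to diverge from the others.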

