Networking Fundamentals and Terminology
What are these mysterious terms?
What's a LAN? WLAN? WAN? Wi-Fi?
Switch? Router? Firewall?
Networking seems to have its own language, and an
awful lot of acronyms.
Let's understand the language first, or we will always be lost!
Like any technical area, networking has its own terminology
for a reason.
The instrument panel of an airliner has more than just three
instruments labeled "How Fast", "How High", and "How Much
Further".
Similarly, in networking we need to speak far more
precisely than "The thing could not send this stuff."
The point of this page is to help you understand.
Some terms are highlighted here to help you find the
explanations you need.
I'm not trying to make you memorize terminology
or speak a certain way, I'm just trying to help you
understand the cryptic and complex terminology of networking.
What is a network?
First, a network. That's just devices connected together so they can communicate. A little more formally, we can talk about nodes on one link. That is, devices (which could be from tiny gadgets up to big servers) speaking a common language, or protocol, connected together for fairly direct intercommunication. Originally the connections were wires and the signals were electrical. Now we also have wireless networks with radio signals, and optical networks with light pulses.
We used to say "computer networks" as if they were completely separate from telephone networks and telegraph networks and other networks. Well, originally that was the case. But for many years now, all the networks have been blended together at the large scale for communicating between distant pairs of cities.
Communication is all packets of data now. Your e-mails, phone calls, text messages, and so on all travel as streams of packets across the world-wide Internet. Now this blending process is spreading out to the edges, where you may notice that your smart phone can seamlessly switch between using the GSM "phone network" and a wireless "computer network". The Internet is what its designers intended, an enormous interconnected network of networks.
Local-Area Network (LAN)
Networks are built in different sizes. A local-area network or LAN might connect everything within a small building. Or maybe just part of one floor of a building, maybe just one room, or even just part of one room. Everything is connected to everything else on a LAN. One computer on this LAN could talk directly to any other computer on the LAN with no controlling device between.
No controlling device, that is.
Let's say you had 10 computers or other devices in a network,
and you wanted to directly connect each one to every other one.
There would have to be nine cables connected to each one,
and a total of 45 cables.
If you had 100 computers, each would need 99 connections
and we would have to run 4,950 cables!
You need n*(n-1)/2 cables to make a fully connected mesh of n devices, and this would not be practical.
So, every computer connects to one central device that not only serves as a common connection point, but it also cleans up the signal at the very least. Now the number of cables is equal to the number of devices and we have a manageable situation.
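The cable counts above are easy to check. Here is a minimal sketch in Python; the function names are my own, picked just for this illustration.

```python
def mesh_cables(n: int) -> int:
    """Cables needed to connect each of n devices directly to every other one."""
    return n * (n - 1) // 2


def star_cables(n: int) -> int:
    """Cables needed when each of n devices connects to one central device."""
    return n


for n in (10, 100):
    print(f"{n} devices: {mesh_cables(n)} cables as a mesh, "
          f"{star_cables(n)} with a central device")
```

The mesh count grows with the square of the number of devices, while the central-device count grows in step with it.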
Network Cables
On LANs you are going to mostly see Ethernet cables. There are four pairs of wires inside the jacket, each pair twisted together to make a transmission line that carries high-frequency signals without too much loss. Not zero loss, but good enough. More on this in a moment.
The connector looks similar to a telephone modular plug, although it's larger and has more electrical contacts. The connector is informally known as an RJ45, but to be a little pedantic it's really an 8P8C modular connector. 8 positions (pins), 8 contacts (or conductors).
You will hear people casually say Cat 5, which refers to just one of several "Category" ratings. These days you commonly see Category 5, 5e, 6, and 7 cabling. These support signaling at usefully high network speeds. 100 Mbps means 100 million bits per second, or 100 megabits per second. 1000 Mbps might be written 1 Gbps and pronounced as 1000 megabits or 1 gigabit per second. And yes, 10 Gbps, 40 Gbps and 100 Gbps cabling exists. You will see 10 GbE, 40 GbE and 100 GbE to refer to Ethernet cabling for those data rates.
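To get a feel for what those data rates mean, here is a rough sketch that converts a file size and a link speed into a transfer time. The function name is hypothetical, and it assumes an idealized link with no protocol overhead, so real transfers will take somewhat longer.

```python
def transfer_seconds(size_bytes: int, rate_mbps: float) -> float:
    """Idealized time to move size_bytes over a link running at rate_mbps.

    Assumes 8 bits per byte, 1 Mbps = 1,000,000 bits per second,
    and no header or retransmission overhead.
    """
    bits = size_bytes * 8
    return bits / (rate_mbps * 1_000_000)


one_gigabyte = 1_000_000_000
for rate in (100, 1000, 10_000):
    print(f"{rate} Mbps: {transfer_seconds(one_gigabyte, rate)} seconds")
```

Note the factor of 8: link speeds are quoted in bits per second, while file sizes are usually quoted in bytes.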
Higher data speeds require each of the four pairs in the cable to be a higher quality transmission line, able to carry higher frequency signals with less loss and less cross-talk (that is, interference from the signals on another pair).
Optical fiber can also be used, but you don't see it nearly as often. At least you, the typical reader of a page explaining basic terminology, won't see it, as it isn't commonly used to connect out to the desktops.
Fiber might be used within a server room or to interconnect crucial network infrastructure, but the cost of the interfaces and connections means that the end users' computers are going to have Ethernet ports and not optical ones. You can buy Ethernet cables at the same WalMart where the hillbillies buy their trucker hats and XXXXL clothes, but you aren't going to find optical interconnects there.
The top device here is barely visible because of all of its network connections. Most of them are Ethernet cables — the thick blue, green and grey cables, plus the one thick yellow cable coming in from the left. The two thin yellow lines and the thin orange lines coming from the right are optical connections.
These devices are Ethernet switches, which brings us to...
Repeaters, Hubs and Switches
Signals degrade as they travel. That central device might do nothing more than clean up the signals, make sure that the voltages (or waveforms, or light pulses) are back to the proper levels (or phase shifts, or whatever) to make precise "0" versus "1" signaling. If the device only cleans up the signals, we call it a repeater if it only has two ports, or a hub if it has several. A repeater would be placed every so often along a fiber line run under an ocean. A hub could be at the center of a LAN in an office.
A hub repeats every signal out of every port, so all the connected devices share the same capacity. If you connect too many devices to one hub, none of them can get a word in edgewise. You can't easily buy hubs these days, because they aren't useful outside an introductory network class (nor can you easily buy quill pens, buggy whips, or spittoons). Today you buy a switch instead of a hub. Here are pictures of a rather old switch and a newer one. A picture above showed several switches in a rack, one of them with both Ethernet and fiber ports.
Data moves in packets, and each packet starts with a header specifying the hardware addresses of the interfaces it's coming from and going to. A switch looks at the destination hardware address, and if it remembers which port connects to that device, just sends the packet out that port. If it doesn't recognize the destination address, it sends it out every port (other than the one it came from). At the same time it looks at the source address, so when a future packet needs to go to the device this packet is coming from, it knows where to send it.
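That learn-and-forward behavior can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how any real switch is implemented. The class name, the port numbering, and the short made-up addresses are all assumptions.

```python
class LearningSwitch:
    """Toy model of a switch that learns which port leads to which address."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self.table: dict[str, int] = {}  # hardware address -> port number

    def receive(self, in_port: int, src: str, dst: str) -> list[int]:
        """Return the list of ports the packet is sent out of."""
        # Learn: the source address is reachable through the arrival port.
        self.table[src] = in_port
        if dst in self.table:
            out_port = self.table[dst]
            # Known destination: one port, unless it is where the packet came from.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: every port other than the one it came from.
        return [p for p in range(self.num_ports) if p != in_port]


switch = LearningSwitch(num_ports=4)
print(switch.receive(0, src="aa", dst="bb"))  # bb unknown, so it is flooded
print(switch.receive(1, src="bb", dst="aa"))  # aa was learned on port 0
print(switch.receive(0, src="aa", dst="bb"))  # bb was learned on port 1
```

A real switch also ages out old table entries, which this sketch leaves out.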
The terminology is that a switch forwards Ethernet frames based on their hardware address, also known as their physical address, or their MAC (Media Access Control) address. These hardware addresses are useful for this, as they should be unique. With Ethernet it's a 48-bit value. The first half indicates the manufacturer ("Intel made the chips in this network interface") and the second half is a serial number that the manufacturer should make unique to each device they make.
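Splitting a 48-bit hardware address into its two halves looks something like this. The function name is hypothetical and the address in the example is made up for illustration, so don't read anything into which manufacturer it might belong to.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a 48-bit hardware address into (manufacturer half, serial half)."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 48-bit hardware address: {mac!r}")
    int("".join(octets), 16)  # raises ValueError if any digit is not hexadecimal
    return ":".join(octets[:3]), ":".join(octets[3:])


manufacturer, serial = split_mac("00:1B:21:3C:4D:5E")  # made-up example address
print(manufacturer, serial)
```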
Now we have a network! But it's just a local-area network, a LAN. It has limitations. First, you can only connect so many devices to one network. Switches let you interconnect many more devices than hubs, but they still have limits.
Second, you have to have timing constraints on a network. Any device needs to make sure that it's OK for it to transmit a packet, so devices are forced to be polite and listen before talking. Signals can travel no faster than the speed of light, so if your cable to the switch is so many meters long, you are forced to wait at least a very specific amount of time to give another similarly connected device the opportunity to realize that your last packet is done and it's OK for it to transmit its packet. So, a network cable can only be so long. With Ethernet, what we almost always use for LAN cabling, that limit is 100 meters.
Connecting Networks with Routers
Routers are devices that we can use to connect two networks. The hardware addresses used by switches are useful for the work switches do, but they don't scale to large systems. There isn't any logic to them, they are unique but they are essentially random.
IP addresses are logical. They're binary patterns, 32 bits for IP version 4 (or IPv4) and 128 bits for IP version 6 (or IPv6). The first part of the IP address answers the question "Which network?", and the rest of it answers the question "Which device on that network?" Something called the netmask specifies where the network part (the netid) ends and the rest (the hostid) starts.
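Python's standard ipaddress module can show how the netmask splits an address. This is a small sketch. The address 192.168.1.37 with a 24-bit network part is just an example I picked, and the function name is my own.

```python
import ipaddress


def split_address(cidr: str) -> tuple[str, str, int]:
    """Return (netmask, netid, hostid) for an address written like 192.168.1.37/24."""
    iface = ipaddress.ip_interface(cidr)
    netmask = iface.network.netmask
    netid = iface.network.network_address
    # The host part is whatever is left in the bits the netmask does not cover.
    hostid = int(iface.ip) & int(iface.network.hostmask)
    return str(netmask), str(netid), hostid


print(split_address("192.168.1.37/24"))
```

The "/24" says that the first 24 bits are the network part, which is the same information as the netmask 255.255.255.0.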
Routers use the network part of the IP address to figure out how to forward packets. Yes, of course there's a context-specific term for packets when we're thinking about them at this level, they're called datagrams.
Anyway, if a router is directly connected to the destination network, it's directly connected to the destination host. It asks that host to respond with its hardware address if it isn't already known, and then it sends the Ethernet frame through the switch.
If the router isn't directly connected to the destination network, it consults its internal routing table to figure out what the next hop should be. That's going to be some other router. If the destination is very far away at all, this router probably won't know the details and will simply use its default route, which ought to send it in the direction of some router that can figure out the details.
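Here is a rough sketch of that lookup. The routing table contents are invented for illustration, and I am assuming the usual rule that the most specific matching block wins, with the default route matching everything as the least specific entry.

```python
import ipaddress

# Hypothetical routing table for a small router: (destination block, next hop).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "directly connected"),
    (ipaddress.ip_network("10.0.0.0/8"), "192.168.1.254"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),  # the default route
]


def next_hop(destination: str) -> str:
    """Pick the next hop for a destination address from ROUTES."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The longest network part is the most specific match.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


for dst in ("192.168.1.20", "10.4.5.6", "173.194.75.99"):
    print(dst, "->", next_hop(dst))
```

Anything the router has no specific entry for falls through to the default route, which is the "Not here!" case.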
Getting somewhat ahead of the story, default routes tend to point toward the backbone of the Internet. The routers on the Internet backbone have to know where everything is, at least in general terms. So if I'm sending data from my home in Indiana to somewhere in Europe, my poor router doesn't have to know the detailed topology of all the networks in Europe. It just says "Not here!" and sends it to the next hop, which would be some router that Comcast operates. Comcast's router won't know either, so it hands it toward the backbone. Before long, the packet reaches a router that knows that the address belongs to an enormous block of addresses that is, generally speaking, at the other end of a fiber running under the Atlantic. When it comes out over there, the packet will be forwarded through routers that have increasingly specific information about smaller blocks of addresses, and it will soon be routed to its destination.
We have multiple networks. They are separate in the physical sense, they are based around physically separate switch-based interconnections. They are also separate in the logical sense, each is assigned a unique block of IP addresses. Information about that block, that netid, is propagated out to the routers interested in recording its details in their routing tables.
We have interconnected those multiple networks with routers. This network-of-networks is called an internetwork or an internet (with a little "i"). You can use two switches, a router, and some cables and build an internet in your home. Now, to be the Internet, with a big "I", well, that's the big one. But the technology is the same.
If the informality of that explanation bothered you, you should read my more formal description of IP routing logic. Meanwhile, on to the next piece.
Wide-Area Networks
So far I just told you about Ethernet cables and how they can't be any longer than 100 meters. So let's see, one cable from here to the switch, and another from the switch to the other device, which could be a router. I suppose if we could run the cables in perfectly straight lines and we had a router every 200 meters...
We need different physical-layer technology to make long point-to-point connections!
Now we are talking about wide-area network connections, or WAN links in what will be a network-of-networks.
These are collections of physically long point-to-point connections, typically a network of long links with the nodes at major cities. These can be based on copper wire, and the connection from a customer site to the ISP might still be, but fiber is preferable.
Here are DS3 and OC3 router interfaces on a Lucent Packetstar AX 600 and a Cisco 7204. This is the point in a router room where a customer's network connects to the link to an ISP connecting them into the Internet.
Here you see 32 Avenue of the Americas just south of Canal Street in Manhattan, New York. It is one of the nodes in the WAN links making up the backbone of the Internet. Click here to see more pictures of the Internet infrastructure itself, at least in Manhattan.
It gets referred to as The Hub for its large meet-me room, where several backbone routing providers can interconnect their networks. The floors of the building have 14 to 19 foot ceilings. CoreSite advertises 50,000 square feet of data center here, with up to 25 kW of electrical power per cabinet backed up by UPS and generators.
32 Avenue of the Americas was completed in 1932 and became the AT&T Long Lines Headquarters, housing switching equipment and control systems for AT&T's North American and Transatlantic networks. AT&T reached Denver by 1911, and supported coast to coast telegraph and voice communication in 1914. Television signals were first relayed across the continent in 1951 on the Long Lines network. Technology progressed from open-wire copper lines in the beginning to coaxial cable, then microwave links, and then fiber optic lines. Be sure to look at my page showing the telecommunications infrastructure in Manhattan if you are interested in large-scale Internet nodes.
The two 120-foot communications masts at the top have microwave antennas providing line of sight connectivity to thousands of buildings throughout the five boroughs of New York City and across the Hudson into New Jersey. And that brings us to wireless...
Wireless: WLAN and Wi-Fi
You can build a LAN with Ethernet cables and a switch. However, those cables can be a hassle. The short version is, "Wi-Fi is how I can get on the Internet at the coffee shop and the library and ..." A few more details follow.
A wireless local-area network, a wireless LAN or simply a WLAN, is a LAN running over microwave radio signals instead of copper wire. This is what the IEEE generally classifies as 802.11 networking (wired Ethernet is 802.3 but you don't see that mentioned as often). Different letters such as 802.11a, 802.11b, 802.11h, 802.11n and so on, specify the microwave frequency (2.4 vs 5 GHz), the modulation scheme and the data bit rate. All the electrical engineer stuff, but then the IEEE is the Institute of Electrical and Electronics Engineers, after all.
As you might imagine, "802.11n WLAN with 802.11i security" is the sort of alphanumeric soup that everyone except electrical engineers finds impenetrable. So, Wi-Fi was coined as a sort of retro-hip reference to "Hi-Fi", a pre-Reagan term for audio equipment.
Wireless networking needs a wireless access point, which people will shorten to AP or WAP. The typical home or small office access point is like those seen in the picture above. It's actually fairly sophisticated on a small scale, as it includes:
- A microwave radio interface to form a wireless LAN.
- A small Ethernet switch, maybe 4 ports or so.
- A router that also supports firewall rules and network address translation or NAT, as described below.
Often the access point is a little box running Linux or BSD Unix.
The wireless access point is intended to provide easy and reliable connection within your office or home, which means that with a better antenna and some patience, it can be used from further away than intended.
Coffee shops and restaurants will use free Wi-Fi as a mechanism to entice people to visit. Hotels will provide it. In order to keep people from sitting in the parking lot and mooching, they might have a security system that requires people to first interact with a web server to enter the secret code given out at the sales counter. Or they might just dial back the transmit power on the access point and let the laws of physics limit what can be done.
It should be pretty obvious that wireless networking opens up all sorts of security holes. Everything is being sent back and forth over the air. There are mechanisms for securing Wi-Fi connections.
The only one worth considering is called WPA2, as in Wi-Fi Protected Access, Version 2 (because they didn't get it right the first time). This is also known as 802.11i. This uses good stuff: CCMP, a protocol built on AES or Advanced Encryption Standard cryptography, with fresh session keys negotiated in a four-way handshake each time a client connects. The older TKIP or Temporal Key Integrity Protocol comes from the original WPA and should be avoided.
The sad and scary thing is that when I travel to the
Washington DC area and stay in hotels where people who work
at government agencies and major defense contractors also stay,
there's not a bit of WPA2 to be seen on the air
but plenty of unsecured network traffic all the same.
When I instead stay in the $20-30 per night hostel,
they secure their wireless networks with
WPA2 using keys that look like:
0x4d317934ccc4cb87cf191e5513f3d4d5b081cff1.
TCP, UDP, and Ports
Network services take many forms. The Network Time Protocol or NTP, which can tightly synchronize system clocks, and the Domain Name System or DNS, which can look up matches between host names and IP addresses, need only short messages. Others, like the Hypertext Transfer Protocol or HTTP, which downloads web pages to your browser, deal with data too large to fit into a single packet. Some, like Secure Shell or SSH, provide a remote connection to a command line for an arbitrary length of time.
So, we have the User Datagram Protocol or UDP for datagram or message service, and the Transmission Control Protocol or TCP for connection-oriented service. UDP is like postcards: they're simple, cheap to send, and they will probably get to the other end in pretty much the order they were sent, but there is no guarantee. TCP is like a phone call: it's a connection that takes a 3-part handshake to establish and a similar exchange of up to four segments to cleanly shut down, but the programs at the two ends have guaranteed 2-way communication for the duration of the connection.
We need further distinctions. Imagine a web server. We want our browser to connect to it and download web pages in both insecure (HTTP) and secure (HTTPS) ways, and we want to both upload new pages and connect through a command line to do administrative work (both those with SSH). Those are three types of reliable connections, three TCP services. And, we want it to keep its system clock in sync with the world (NTP), and for it to be able to look up matches between host names and IP addresses (DNS). Those are two types of datagrams or messages, two UDP services.
How can this work?
Both TCP and UDP use the concept of numbered ports to distinguish between the programs running at each end. An SSH server process listens for connections on TCP port 22. A web server process, like Apache or Nginx, listens on TCP port 80 for HTTP, and if it runs HTTPS, it listens for that on TCP port 443. NTP uses UDP port 123, and DNS uses UDP port 53. TCP and UDP headers have 16-bit fields for ports, so port numbers can range from 0 through 65,535.
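Here is a small sketch of those 16-bit port fields, using a UDP header because it is the simpler of the two: source port, destination port, length, and checksum, each 16 bits. The function names are my own, and the checksum is simply left as zero rather than computed.

```python
import struct


def build_udp_header(src_port: int, dst_port: int, payload_len: int) -> bytes:
    """Pack an 8-byte UDP header; the checksum field is left as 0 (not computed)."""
    length = 8 + payload_len  # header plus data, in bytes
    # "!" means network byte order, "H" means an unsigned 16-bit field.
    return struct.pack("!HHHH", src_port, dst_port, length, 0)


def parse_ports(header: bytes) -> tuple[int, int]:
    """Return (source port, destination port) from the start of a header."""
    src_port, dst_port = struct.unpack("!HH", header[:4])
    return src_port, dst_port


header = build_udp_header(50000, 53, payload_len=32)  # a client talking to DNS
print(parse_ports(header))
```

A TCP header starts with the same two port fields, so parse_ports would read them from a TCP header as well. Trying to pack a port number of 65,536 raises struct.error, because it does not fit in 16 bits.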
The client also uses ports.
When your browser was loading this page it needed to get the page itself, the file terminology.html, from my server. But that file told your browser that it also needed to get the large picture at the top of the page, and some style and Javascript components needed to make the page easier to read and navigate, all of that also coming from my server. And, it should download the ads from Google's and Amazon's servers, the Google Translate gadget from a different Google server, and the Facebook and Twitter buttons from their servers.
So, the browser had asked the operating system for multiple sockets or TCP endpoints. A chunk of this page arrives at your computer because the routers along the way were looking at the packet's IP address. Once it's at your computer, your operating system processes it with the TCP or UDP module as appropriate (TCP in the case of web traffic), and looks at the port number to figure out which process gets it. Maybe you're running both Firefox and Chrome, which of those two browser processes gets this segment of the TCP stream? Then the browser knows that it's the next segment of this page (and not part of the picture, or the Google translate button, or a social media button) because of which port (or socket) and thus which stream it belongs to.
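As a toy model of that last step, picture the operating system keeping a table of open sockets and looking up each arriving segment in it. Everything here is made up for illustration: the port numbers, the server address (taken from a block reserved for documentation), and the descriptions of who owns each socket.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    """What identifies one TCP stream arriving at this computer."""
    local_port: int
    remote_ip: str
    remote_port: int


# Hypothetical table of open sockets and which process and stream owns each.
sockets: dict[Endpoint, str] = {
    Endpoint(51514, "203.0.113.80", 80): "firefox: page text",
    Endpoint(51515, "203.0.113.80", 80): "firefox: large picture",
    Endpoint(51600, "203.0.113.80", 80): "chrome: page text",
}


def deliver(dst_port: int, src_ip: str, src_port: int) -> Optional[str]:
    """Find the owner of an arriving segment, or None if no socket matches."""
    return sockets.get(Endpoint(dst_port, src_ip, src_port))


print(deliver(51515, "203.0.113.80", 80))
print(deliver(60000, "203.0.113.80", 80))
```

All three streams come from the same server port, so it is the client-side port that tells them apart.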
The TCP and UDP port number assignments are pretty arbitrary. The file /etc/services lists many of the "well-known" or pre-defined services. The most popular services use port numbers below 1024. It is supposed to be the case that only privileged processes, those running as the administrator root or as the operating system itself, are allowed to open these low-numbered "privileged ports". The idea is that you should be able to feel confident that you aren't connecting to some Trojan horse of a service that's trying to trick you into sending your password to an intruder's process.
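Here is a sketch of reading entries in the /etc/services style. To keep it self-contained it parses a short sample string rather than the real file. The sample holds only the five services mentioned above, and the function names are my own.

```python
SAMPLE = """\
# service-name  port/protocol  [aliases]
ssh             22/tcp
domain          53/udp
http            80/tcp          www
ntp             123/udp
https           443/tcp
"""


def parse_services(text: str) -> dict[tuple[str, str], int]:
    """Map (service name, protocol) to port number."""
    table: dict[tuple[str, str], int] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *_aliases = line.split()
        port, proto = port_proto.split("/")
        table[(name, proto)] = int(port)
    return table


def is_privileged(port: int) -> bool:
    """True for the low-numbered ports reserved for privileged processes."""
    return port < 1024


services = parse_services(SAMPLE)
for (name, proto), port in services.items():
    print(name, proto, port, "privileged" if is_privileged(port) else "")
```

To try it on a real system, pass the contents of /etc/services to parse_services in place of SAMPLE.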
Microsoft, of course, did not follow that rule in the design of the Windows operating systems.
The client will get some random high-numbered port. It is supposed to be significantly large; the operating system should not simply start at 1025 and count up for client ports the way Windows does.
Firewalls can use port numbers to distinguish between specific services and tell which end is the service and which is the client. That allows a firewall to enforce policy by distinguishing between allowed and disallowed traffic types.
Firewalls and Network Address Translation or NAT
Routers are very good at their assigned task, forwarding packets to where their destination IP addresses specify. The problem is, those packets might have been sent by people trying to cause trouble.
Firewalls attempt to impose restrictions causing your security policy to overrule the purely logical rules of IP routing. I say "attempt" because while logic is straightforward, it's not really practical to enforce your desires for network control, partly because those rules tend to be a bit vague. "Don't let my employees look at bad web sites!", you say. Ummm, precisely what do you mean by "bad"? I'm going to need a very precise specification for "good" versus "bad", and that isn't going to be practical.
Let's assume for a moment that we're in an imaginary world where you only hired people who behave nicely and waste no work time bidding on ephemera, narrating every bodily function, prattling tersely but endlessly, carrying on affairs, or otherwise abusing their Internet access. We need to protect these rare benighted souls who are useful.
So, a firewall can impose policy decisions like these:
- Host #1 inside can connect to any outside server using any application protocol (web browser, sending and receiving mail, and lots more). We trust the person who is supposed to use that computer.
- Host #2 inside can connect to any outside server using any of the application protocols in this list, but not others. We're pretty trusting of this person.
- Host #3 inside can connect to any of the outside servers in this list using any application protocol. We're pretty trusting of this person, although in a slightly different way than the person using host #2.
- Host #4 inside cannot connect to anything outside. We want to protect this system a little more than the others.
- Some companies or government agencies would use this approach to control someone they didn't trust. But why were these people hired in the first place?!
- All the other hosts can connect only to these listed combinations of outside servers and application protocols. This is our default policy.
- No connections are allowed from the outside world into our protected organization.
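The policy list above could be written down as a decision function along these lines. This is only a sketch of the logic, not a real firewall configuration. The host names, protocol names, and server addresses are all placeholders I invented.

```python
# Hypothetical lists that the policy refers to.
HOST2_PROTOCOLS = {"http", "https", "smtp"}
HOST3_SERVERS = {"203.0.113.10", "203.0.113.20"}
DEFAULT_ALLOWED = {("203.0.113.10", "https"), ("203.0.113.20", "smtp")}


def allow(direction: str, inside_host: str, outside_server: str, protocol: str) -> bool:
    """Decide whether a new connection is permitted under the policy above."""
    if direction == "inbound":
        return False  # nothing outside may connect in
    if inside_host == "host1":
        return True  # any server, any protocol
    if inside_host == "host2":
        return protocol in HOST2_PROTOCOLS  # any server, listed protocols
    if inside_host == "host3":
        return outside_server in HOST3_SERVERS  # listed servers, any protocol
    if inside_host == "host4":
        return False  # no outside connections at all
    # Default policy: only the listed server and protocol combinations.
    return (outside_server, protocol) in DEFAULT_ALLOWED


print(allow("outbound", "host2", "198.51.100.5", "https"))
print(allow("outbound", "host9", "198.51.100.5", "https"))
```

Notice that every rule is stated in terms of hosts, servers, and protocols. Nothing in it can express "bad web sites", which is the practical limit described earlier.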
Network address translation or NAT is a way for a router to hide an entire network or, for that matter, internal internetwork. Everything inside, behind the NAT router, gets a private IP address like 10.*.*.* or 192.168.*.* (or, less often, 172.16.*.* through 172.31.*.*). That provides just 65,534 usable host addresses in the case of 192.168.*.*, 1,048,574 addresses for 172.(16-31).*.*, and up to 16,777,214 addresses for the 10.*.*.* network block.
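Those address counts can be checked with Python's standard ipaddress module. This sketch treats each private block as one big network and subtracts the two addresses that are not usable for hosts, the network address itself and the broadcast address. The function name is my own.

```python
import ipaddress

PRIVATE_BLOCKS = ["192.168.0.0/16", "172.16.0.0/12", "10.0.0.0/8"]


def usable_hosts(cidr: str) -> int:
    """Host addresses in a block, excluding the network and broadcast addresses."""
    return ipaddress.ip_network(cidr).num_addresses - 2


for block in PRIVATE_BLOCKS:
    print(f"{block}: {usable_hosts(block):,} usable host addresses")
```

The block written as 172.16.0.0/12 is the same range as 172.16.*.* through 172.31.*.* above.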
Click here for more details on NAT, but the short version is that hosts inside can do whatever they want among themselves, and in connecting to the outside world, but there simply is no way to connect from the outside to anything inside.
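One common way to get that behavior is for the router to keep a translation table that only outbound traffic can add to. The sketch below is a heavily simplified model under that assumption. The class name, the starting port number, and the addresses are all invented, and a real NAT router also tracks the remote end of each connection and expires old entries.

```python
from typing import Optional


class Nat:
    """Toy port-translating NAT: only outbound traffic creates table entries."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound_map: dict[tuple[str, int], int] = {}
        self.inbound_map: dict[int, tuple[str, int]] = {}

    def outbound(self, inside_ip: str, inside_port: int) -> tuple[str, int]:
        """Return the public (address, port) that the outside world will see."""
        key = (inside_ip, inside_port)
        if key not in self.outbound_map:
            self.outbound_map[key] = self.next_port
            self.inbound_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound_map[key]

    def inbound(self, public_port: int) -> Optional[tuple[str, int]]:
        """Return the inside (address, port) for a reply, or None to drop it."""
        return self.inbound_map.get(public_port)


nat = Nat("198.51.100.7")
print(nat.outbound("192.168.1.10", 51514))  # an inside host connects out
print(nat.inbound(40000))                   # the reply finds its way back
print(nat.inbound(12345))                   # unsolicited traffic matches nothing
```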
Your border router can impose more policy rules, only allowing certain machines inside to make limited connections to the outside world, imposing those limits in terms of the protocols used or the remote servers outside.
SOHO — Small Office / Home Office
At home or in a small office, you very likely have a small device connecting you to the Internet. Your ISP or Internet Service Provider is probably the phone company or a cable TV company. The technology might involve cable, telephone lines called DSL for Digital Subscriber Line, or maybe optical fiber.
The small box that is your interface out to the world has circuitry to speak the physical signals of the exterior side, a small router that runs NAT, and probably a small Ethernet switch with three to five ports. You might call that box a DSL router, or a cable modem, or a FiOS router, or similar.
Some people would point at it and say "That's the Internet." That's like pointing at a lamp and saying "That's the electrical power grid", the sort of nonsense this page tries to correct.
Above and below are a DSL router on the Frontier Communications network in the U.S. A small-diameter white cable runs to an RJ-45 jack that leads out to the NID or Network Interface Device, the plastic box on the exterior wall of the house. It's an RJ-11 or voice telephone type port on the router. When I installed DSL in the house, of course I used Category 6 cable with RJ-45 jacks in the rooms. That way a future owner could put an Ethernet switch at the central point and have their NID there. For now, there's a Frontier NID connected to that central point. An RJ-11 (or 6P2C) plug like what's on the other end of the white cable snaps into an RJ-45 (or 8P8C) socket, using the central six conductors in that 8-conductor socket.
It has four Ethernet ports, one of which is marked "uplink" and can be used to connect another switch.
Wait, where are the names like Google and Amazon?
Routers are very good with 32-bit IPv4 and 128-bit IPv6 addresses. Switches are very good with 48-bit Ethernet hardware addresses. Neither of them has any clue about names and implied meaning.
People are good with words, names and meaning. But most of us find binary patterns or their representations pretty meaningless.
You surely have some idea of what these mean:
- www.google.com
- www.amazon.com
- en.wikipedia.org
But what about these?
- 173.194.75.99
  10101101 11000010 01001011 01100011
- 2607:f8b0:400c:c01::69
  0010011000000111 1111100010110000 0100000000001100 0000110000000001 0000000000000000 0000000000000000 0000000000000000 0000000001101001
The first is one of the multiple 32-bit IPv4 addresses used for www.google.com, shown in base 10 and then in base 2 or binary. The second is a 128-bit IPv6 address used for the same fully-qualified domain name, shown in base 16 or hexadecimal and then in base 2 or binary.
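You can reproduce those binary patterns with Python's standard ipaddress module. A minimal sketch, with a function name of my own choosing:

```python
import ipaddress


def to_binary(address: str) -> str:
    """Write an IPv4 or IPv6 address as groups of binary digits."""
    addr = ipaddress.ip_address(address)
    group = 8 if addr.version == 4 else 16  # bits per displayed group
    bits = format(int(addr), f"0{addr.max_prefixlen}b")  # 32 or 128 digits
    return " ".join(bits[i:i + group] for i in range(0, len(bits), group))


print(to_binary("173.194.75.99"))
print(to_binary("2607:f8b0:400c:c01::69"))
```

The "::" in the IPv6 address is shorthand for a run of all-zero groups, which is why three groups of sixteen zeros appear in the binary form.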
DNS or the Domain Name System is a hierarchical distributed naming system for networked devices. The hierarchy of the names reflects organization: www.google.com is a machine named www, presumably a web server, one of many machines owned by google.com, an organization named google which is a company, as indicated by .com.
There is some meaning to us humans in that name.
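The hierarchy reads from right to left, from the most general part of the name to the most specific. A small sketch that walks down it, with a hypothetical function name:

```python
def name_hierarchy(fqdn: str) -> list[str]:
    """List the levels of a domain name from most general to most specific."""
    labels = fqdn.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(name_hierarchy("www.google.com"))
print(name_hierarchy("en.wikipedia.org"))
```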
DNS is also a network protocol used by DNS servers to exchange and provide information about the mapping between human-friendly names and router-friendly IP addresses. All routers need are the IP addresses, contained in the headers of every datagram. But the human users of the Internet need meaningful names, so we have to have DNS to map the names we specify to IP addresses.
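From a program's point of view, the mapping is usually one call to the operating system's resolver. Here is a minimal sketch using Python's standard socket module. The function name is my own, and what a real name returns depends on your network and on when you ask.

```python
import socket


def resolve(name: str) -> list[str]:
    """Ask the system resolver for the IP addresses that go with a name."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each result ends with a socket address whose first item is the IP address.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("www.google.com") on a connected machine should return a list of IPv4 addresses, plus IPv6 addresses where those are available. A name that cannot be resolved raises socket.gaierror.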