Exploring sysfs and the PCI Buses
What's in /sys, and Why?
The /sys file system is a feature of
the Linux kernel that makes kernel information
available to user processes.
It's a virtual, in-memory file system: kernel data
structures appear to the user as a tree of directories,
files, and symbolic links.
The files are mostly ASCII text, and most of them contain
one short (if cryptic) value.
Nothing is stored on disk; the file system exists only in RAM.
The simple utilities cd, ls, tree, find, cat, and echo
allow a user to explore, examine, and modify kernel objects
and their attributes.
What you find here corresponds to hardware in your system:
SATA and SCSI controllers and disks, Ethernet interfaces,
and much more.
If you can explore and manipulate sysfs through /sys,
then you can inventory hardware,
discover how the kernel is using the hardware, and,
where possible, tune the kernel for performance.
We will start with a tour of /dev and some
of the devices, and then consider the devices that aren't
in /dev.
Then we can explore the PCI, USB, and SCSI buses with
lspci, lsusb, and lsscsi.
Then we will be ready to go into /sys.
Devices in Linux /dev
A device is detected by a kernel module, sometimes called
a "device driver".
The module might be compiled into the monolithic kernel,
/boot/vmlinuz-release, which the boot loader loads.
Or it might be a loadable module that the running kernel
has loaded from its collection rooted at
/lib/modules/release.
The device name is assigned by the module that detected it.
For example, serial ports are named ttyS0, ttyS1, ttyS2, and so on.
Disks, both SATA and SCSI, are named sda, sdb, sdc, and so on.
Any partitions (or slices) of those disks
defined by either a GPT (or GUID Partition Table) or the
legacy IBM PC MBR Partition Table are numbered.
Partitions of the first disk are sda1, sda2, sda3, and so on.
Install the kernel source, or add the kernel-doc package
or similar, to get the kernel documentation.
The kernel source puts this in
/usr/src/linux-release/Documentation.
The kernel-doc package will probably put it in
/usr/share/doc/kernel-doc-release.
See the file devices.txt at the top of
whichever hierarchy you get
(newer kernel trees keep it in the admin-guide subdirectory).
Or just read devices.txt online at kernel.org.
Unlike the Solaris operating system, where you have to do a
"configuration boot" to scan and detect devices, Linux starts
every time with an empty /dev directory
and populates it as the devices are detected.
So, /dev contains only the devices
that have been detected.
It is possible that a device is present but the kernel has
not yet detected it,
possibly because the appropriate kernel module is not loaded,
or possibly because the device was inserted after booting
and its module needs to be told to re-scan.
We will see how to do that below.
When you first look at the contents of /dev,
it may seem like a lot of confusing clutter.
Refer to that devices.txt file to make sense of the names.
You will see that the kernel uses
major and minor device numbers
for the special files representing devices.
In the output of ls -l they appear
as "major, minor" where
you expect to find the file size.
Generally, the major number indicates which kernel module
has detected the devices while the minor number indicates
which specific device of that type, possibly out of many.
# ls -l /tmp/foo /dev/sda /dev/ttyS0
-rw-r--r--. 1 root root 1382848 Mar 20 14:51 /tmp/foo
brw-rw----. 1 root disk 8, 0 Mar 25 09:12 /dev/sda
crw-rw----. 1 root dialout 4, 64 Mar 25 09:12 /dev/ttyS0
The file type is either c for "character"
(or unbuffered) or b for "block"
(or buffered).
The terms "character" and "block" are historical and can be
misleading; what really matters is unbuffered versus buffered.
Unbuffered "character" devices provide direct access to
the device,
but some (for example, a character device
referring to a disk) will enforce block boundaries and
will not let you read or write a byte at a time.
Buffered "block" devices are accessed through a
buffer or cache which can greatly improve
performance,
and you can read or write in blocks of any size
including a single byte at a time.
The downside is that after you write data into a buffered
device you know that the data is in the buffer but you
don't necessarily know if the contents of that buffer
have been flushed to the device.
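The same type and major/minor information that ls -l prints can be read from a program. What follows is a minimal Python sketch, not a definitive tool; the helper names describe_device and describe_path are my own invention, and only the standard os and stat modules are used.

```python
import os
import stat


def describe_device(st_mode, st_rdev):
    """Return (kind, major, minor) for a device node, or None otherwise.

    st_mode and st_rdev are the fields of the same name from os.stat().
    """
    if stat.S_ISBLK(st_mode):
        kind = "block"        # shown as 'b' by ls -l
    elif stat.S_ISCHR(st_mode):
        kind = "character"    # shown as 'c' by ls -l
    else:
        return None           # regular file, directory, and so on
    return kind, os.major(st_rdev), os.minor(st_rdev)


def describe_path(path):
    """Stat a path and describe it; raises OSError if it cannot be read."""
    info = os.stat(path)
    return describe_device(info.st_mode, info.st_rdev)
```

Calling describe_path("/dev/sda") on the system shown above should report a block device with major 8 and minor 0, matching the "8, 0" in the ls -l output, but the result depends on what your own /dev contains.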
Finally, some of what you find in /dev
are pseudo-devices not corresponding to hardware.
These expose functions implemented by the kernel itself.
For example, /dev/zero is an endless source
of all-zero bytes,
and /dev/null is the "bit bucket": it accepts
and discards all input.
The random devices /dev/random and /dev/urandom
provide streams of pseudo-random numbers.
See my page on random devices
for the difference between the two, and for the
hardware random number generator
provided on some Linux SoC platforms like the
Raspberry Pi.
Devices not in Linux /dev
After you examine /dev long enough
to have some idea of what you're looking at,
you realize that many critical pieces of system hardware
are not represented here.
Maybe you didn't know where to look (the mouse is
probably under /dev/input, the keyboard
likely has an unexpected name, and USB ports are under
/dev/bus/usb/*), but where are things
like the Ethernet interfaces?
And we see the SATA disks, but where are the
SATA controllers?
Exploring the PCI Buses with lspci
An Ethernet interface is connected to some PCI bus. Let's see what's on our PCI buses, or at least on my system's. Use these same commands on your system to see what you have.
# lspci 00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02) 00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port B) 00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port D) 00:0b.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (NB-SB link) 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B) 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller 00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller 00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller 00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 42) 00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) (rev 40) 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller (rev 40) 00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge (rev 40) 00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller 00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0) 00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 1) 00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB900 PCI to PCI bridge (PCIE port 2) 00:15.3 PCI bridge: Advanced Micro Devices, Inc. 
[AMD/ATI] SB900 PCI to PCI bridge (PCIE port 3) 00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller 00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor HyperTransport Configuration 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Address Map 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor DRAM Controller 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Miscellaneous Control 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Link Control 01:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2) 01:00.1 Audio device: NVIDIA Corporation High Definition Audio Controller (rev a1) 04:00.0 PCI bridge: ASMedia Technology Inc. Device 1182 05:03.0 PCI bridge: ASMedia Technology Inc. Device 1182 05:07.0 PCI bridge: ASMedia Technology Inc. Device 1182 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07) 07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07) 08:05.0 USB controller: NEC Corporation OHCI USB Controller (rev 43) 08:05.1 USB controller: NEC Corporation OHCI USB Controller (rev 43) 08:05.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04) 09:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01) 0a:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller 0b:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller 0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09)
Each line starts with the PCI bus address formatted as
bus:slot.function.
Buses are numbered 00 to ff,
slots 00 to 1f,
and functions 0 to 7.
My system has a single PCI domain.
If it had more than one I could have used the -D
option to get the PCI domain inserted as 4-character
hexadecimal in the range 0000-ffff.
The resulting output is domain:bus:slot.function,
for example, 0000:00:00.0.
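To make the address format concrete, here is a small Python sketch that splits such an address into its numeric parts. The function name parse_pci_address is hypothetical, and I am assuming the 4-digit domain form that lspci -D prints on this system; a missing domain is treated as domain 0.

```python
import re

# domain is optional; bus is two hex digits; slot is 00-1f; function is 0-7
_PCI_ADDRESS = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[01][0-9a-fA-F])\."
    r"(?P<function>[0-7])$"
)


def parse_pci_address(text):
    """Return (domain, bus, slot, function) as integers.

    Accepts 'bus:slot.function' or 'domain:bus:slot.function'.
    Raises ValueError for anything outside the ranges given above.
    """
    match = _PCI_ADDRESS.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    domain = int(match.group("domain") or "0", 16)
    bus = int(match.group("bus"), 16)
    slot = int(match.group("slot"), 16)
    function = int(match.group("function"), 16)
    return domain, bus, slot, function
```

For example, parse_pci_address("0000:0c:00.0") gives domain 0, bus 12, slot 0, function 0, which is the motherboard Ethernet controller in the listing above.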
This motherboard has buses 00 through 0c. Bus 01 has a device in slot 00 with video and audio controllers as functions 0 and 1 respectively. Buses 02 and 03 have nothing connected. Bus 04 has a bridge, and bus 05 has a pair of bridges at slots 03 and 07. Buses 06 and 07 have one Ethernet controller each in their slot 00. Bus 08 has a multi-function USB device in slot 05, with OHCI controllers as functions 0 and 1 and a USB 2.0 controller as function 2. Bus 09 has a SATA controller in slot 00. Buses 0a and 0b have USB 3.0 controllers in their slot 00, and 0c has an Ethernet controller in its slot 00.
Bus 00 has everything else: There are PCI bridges in slots 02, 04, 0b, and 0d. Slot 11 has a SATA controller, and slots 12 and 13 each have a pair of USB controllers as functions 0 and 2. Slot 14 has a multi-function device, with an SMBus controller, audio device, ISA bridge, PCI bridge, and USB controller as functions 0, 2, 3, 4, and 5 respectively. Slot 15 has a multi-function device, four PCI bridges as functions 0, 1, 2, and 3. Slot 16 has a pair of USB controllers as functions 0 and 2. Slot 18 has a multi-function host bridge.
All the PCI buses and bridges can make the above output
confusing.
We could ask for a tree representation of just the buses
with -t (which will include the PCI domain).
Or, more usefully, for the PCI bus structure in tree form
filled in with the devices occupying the slots and functions,
by specifying both -t for tree and
-v for relatively verbose.
The PCI and PCI Express bridge endpoints explicitly listed in the above output appear below only as counted bridge endpoints [01] through [0c].
# lspci -vt -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) +-02.0-[01]--+-00.0 NVIDIA Corporation GT218 [GeForce 210] | \-00.1 NVIDIA Corporation High Definition Audio Controller +-04.0-[02]-- +-0b.0-[03]-- +-0d.0-[04-07]----00.0-[05-07]--+-03.0-[06]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller | \-07.0-[07]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller +-11.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] +-12.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-12.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-13.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-13.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-14.0 Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller +-14.2 Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) +-14.3 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller +-14.4-[08]--+-05.0 NEC Corporation OHCI USB Controller | +-05.1 NEC Corporation OHCI USB Controller | \-05.2 NEC Corporation uPD72010x USB 2.0 Controller +-14.5 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller +-15.0-[09]----00.0 ASMedia Technology Inc. ASM1062 Serial ATA Controller +-15.1-[0a]----00.0 ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller +-15.2-[0b]----00.0 ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller +-15.3-[0c]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller +-16.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-16.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-18.0 Advanced Micro Devices, Inc. 
[AMD] Family 10h Processor HyperTransport Configuration +-18.1 Advanced Micro Devices, Inc. [AMD] Family 10h Processor Address Map +-18.2 Advanced Micro Devices, Inc. [AMD] Family 10h Processor DRAM Controller +-18.3 Advanced Micro Devices, Inc. [AMD] Family 10h Processor Miscellaneous Control \-18.4 Advanced Micro Devices, Inc. [AMD] Family 10h Processor Link Control
Let's look closer at my Ethernet interfaces.
# lspci | grep Ethernet
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07)
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07)
0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09)
All have a Realtek RTL8111/8168/8411 chipset. The two marked as "rev 07" at PCI bus addresses 06:00.0 and 07:00.0 are on an expansion card plugged into a PCIe slot. Notice in the detailed output that they share one serial number and report a Realtek subsystem. The one marked "rev 09" at PCI bus address 0c:00.0 is on the motherboard, and its subsystem names the motherboard vendor. Let's see more details of just those devices:
# lspci -v -s 06:00.0 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07) Subsystem: Realtek Semiconductor Co., Ltd. Device 0123 Flags: bus master, fast devsel, latency 0, IRQ 42 I/O ports at b000 [size=256] Memory at fe200000 (64-bit, non-prefetchable) [size=4K] Memory at d2200000 (64-bit, prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 01 Capabilities: [b0] MSI-X: Enable- Count=4 Masked- Capabilities: [d0] Vital Product Data Capabilities: [100] Advanced Error Reporting Capabilities: [140] Virtual Channel Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00 Kernel driver in use: r8169 Kernel modules: r8169 # lspci -v -s 07:00.0 07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07) Subsystem: Realtek Semiconductor Co., Ltd. Device 0123 Flags: bus master, fast devsel, latency 0, IRQ 43 I/O ports at a000 [size=256] Memory at fe100000 (64-bit, non-prefetchable) [size=4K] Memory at d2100000 (64-bit, prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 01 Capabilities: [b0] MSI-X: Enable- Count=4 Masked- Capabilities: [d0] Vital Product Data Capabilities: [100] Advanced Error Reporting Capabilities: [140] Virtual Channel Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00 Kernel driver in use: r8169 Kernel modules: r8169 # lspci -v -s 0c:00.0 0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09) Subsystem: ASUSTeK Computer Inc. 
P8 series motherboard Flags: bus master, fast devsel, latency 0, IRQ 41 I/O ports at c000 [size=256] Memory at d2304000 (64-bit, prefetchable) [size=4K] Memory at d2300000 (64-bit, prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 01 Capabilities: [b0] MSI-X: Enable- Count=4 Masked- Capabilities: [d0] Vital Product Data Capabilities: [100] Advanced Error Reporting Capabilities: [140] Virtual Channel Capabilities: [160] Device Serial Number 95-00-00-00-68-4c-e0-00 Kernel driver in use: r8169 Kernel modules: r8169
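If you want to compare a few fields across devices, as I did by eye with the serial numbers and subsystems, the verbose output can be reduced with a short script. This is only a sketch under assumptions: it expects the usual lspci -v layout in which each device starts at the left margin and its details are indented, the function name summarize_lspci_v is made up, and SAMPLE is an abbreviated copy of the output above rather than a fresh capture.

```python
SAMPLE = """\
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07)
\tSubsystem: Realtek Semiconductor Co., Ltd. Device 0123
\tCapabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
\tKernel driver in use: r8169

0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 09)
\tSubsystem: ASUSTeK Computer Inc. P8 series motherboard
\tCapabilities: [160] Device Serial Number 95-00-00-00-68-4c-e0-00
\tKernel driver in use: r8169
"""


def summarize_lspci_v(text):
    """Return one dict per device with address, subsystem, serial, driver."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue                      # blank line between devices
        if not line[0].isspace():         # unindented line starts a device
            address, _, description = line.partition(" ")
            current = {"address": address, "description": description,
                       "subsystem": None, "serial": None, "driver": None}
            records.append(current)
            continue
        if current is None:
            continue                      # indented text before any device
        field = line.strip()
        if field.startswith("Subsystem: "):
            current["subsystem"] = field[len("Subsystem: "):]
        elif "Device Serial Number " in field:
            current["serial"] = field.split("Device Serial Number ", 1)[1]
        elif field.startswith("Kernel driver in use: "):
            current["driver"] = field[len("Kernel driver in use: "):]
    return records
```

On a live system you could feed it the text of lspci -v captured with subprocess.run; I have left that out so the sketch runs without lspci installed.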
USB and lsusb
The USB system has a similar tool, lsusb.
I have the usual routine items plugged in:
- Genesys Logic multi-memory-card reader
- Terminus Technology "media dashboard" with SATA, memory card reader slots, and USB ports
- Ralink wireless LAN adaptor
- Logitech trackball
- A rather old Memorex MEM48U flat-bed scanner
I also have these plugged in:
- Lexar USB stick
- My phone
- An external disk, which appears as the USB-PATA interface in its case
We see the hexadecimal vendor:product ID, and then
lsusb also decodes that for us.
We can get the USB bus in tree form with -t, and we can
ask for more details for a specific device by
specifying its bus:device with -v -s.
# lsusb Bus 007 Device 003: ID 05e3:0745 Genesys Logic, Inc. Bus 007 Device 002: ID 1a40:0201 Terminus Technology Inc. FE 2.1 7-port Hub Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 012 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 002 Device 002: ID 148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 011 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 008 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 014 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 013 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 020: ID 14cd:6600 Super Top M110E PATA bridge Bus 004 Device 018: ID 05dc:c753 Lexar Media, Inc. JumpDrive TwistTurn Bus 004 Device 019: ID 04e8:6860 Samsung Electronics Co., Ltd GT-I9100 Phone [Galaxy S II], GT-I9300 Phone [Galaxy S III], GT-P7500 [Galaxy Tab 10.1] , GT-I9500 [Galaxy S 4] Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 010 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 009 Device 006: ID 046d:c404 Logitech, Inc. TrackMan Wheel Bus 009 Device 024: ID 05d8:4005 Ultima Electronics Corp. 
MEM48U Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub # lsusb -t /: Bus 14.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/2p, 12M /: Bus 13.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/3p, 12M /: Bus 12.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/4p, 12M /: Bus 11.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/2p, 12M /: Bus 10.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/5p, 12M /: Bus 09.Port 1: Dev 1, Class=root_hub, Driver=ohci-pci/5p, 12M |__ Port 3: Dev 24, If 0, Class=Vendor Specific Class, Driver=, 12M |__ Port 5: Dev 6, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M /: Bus 08.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/5p, 480M /: Bus 07.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/4p, 480M |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 480M |__ Port 2: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 480M /: Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M /: Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M /: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/5p, 480M |__ Port 1: Dev 19, If 0, Class=Vendor Specific Class, Driver=, 480M |__ Port 1: Dev 19, If 1, Class=Communications, Driver=cdc_acm, 480M |__ Port 1: Dev 19, If 2, Class=CDC Data, Driver=cdc_acm, 480M |__ Port 1: Dev 19, If 3, Class=Vendor Specific Class, Driver=usbfs, 480M |__ Port 4: Dev 18, If 0, Class=Mass Storage, Driver=usb-storage, 480M |__ Port 5: Dev 20, If 0, Class=Mass Storage, Driver=usb-storage, 480M /: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=rt2800usb, 480M /: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/5p, 480M
The description of the Lexar thumb drive goes on for some length, and that of the Samsung smart phone is about two and a half times as long. You can get a lot of information from this tool if you really need it.
# lsusb -v -s 004:018 Bus 004 Device 018: ID 05dc:c753 Lexar Media, Inc. JumpDrive TwistTurn Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x05dc Lexar Media, Inc. idProduct 0xc753 JumpDrive TwistTurn bcdDevice 1.03 iManufacturer 1 Lexar iProduct 2 USB Flash Drive iSerial 3 20120221185614484FE0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 32 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 200mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 8 Mass Storage bInterfaceSubClass 6 SCSI bInterfaceProtocol 80 Bulk-Only iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Device Qualifier (for other device speed): bLength 10 bDescriptorType 6 bcdUSB 2.00 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 bNumConfigurations 1 Device Status: 0x0000 (Bus Powered)
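The one-line-per-device form of lsusb output is regular enough to pick apart with a pattern. Here is a hedged Python sketch; parse_lsusb_line and attached_devices are names I made up, SAMPLE holds three lines copied from the listing above, and treating any description that ends in "root hub" as a root hub is my own simplification.

```python
import re

_LSUSB_LINE = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-f]{4}):([0-9a-f]{4})\s*(.*)$"
)

SAMPLE = [
    "Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub",
    "Bus 004 Device 018: ID 05dc:c753 Lexar Media, Inc. JumpDrive TwistTurn",
    "Bus 009 Device 006: ID 046d:c404 Logitech, Inc. TrackMan Wheel",
]


def parse_lsusb_line(line):
    """Split one lsusb line into bus, device, vendor, product, description.

    Returns None when the line does not look like lsusb output.
    """
    match = _LSUSB_LINE.match(line.strip())
    if match is None:
        return None
    bus, device, vendor, product, description = match.groups()
    return {
        "bus": int(bus),            # "004" becomes 4
        "device": int(device),
        "vendor": vendor,           # kept as hex strings, as lsusb prints them
        "product": product,
        "description": description,
    }


def attached_devices(lines):
    """Parse lines and drop the root hubs, leaving what is plugged in."""
    parsed = (parse_lsusb_line(line) for line in lines)
    return [entry for entry in parsed
            if entry is not None
            and not entry["description"].endswith("root hub")]
```

The bus and device numbers it returns are the same pair you would pass to lsusb -v -s, for example 004:018 for the Lexar stick.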
Disks and lsscsi
There used to be two separate sets of drivers,
one for IDE (now usually called PATA) and one for SCSI.
PATA disks were named hd[a-z]* and SCSI disks
were sd[a-z]*.
Then SATA started to appear, there were new drivers and
mergings of existing subsystems, and now
all disks are named sd[a-z]*, because the
libATA library presents ATA disks through the SCSI subsystem.
There is one kernel module for IDE interfaces and
many for the specific SCSI and SATA chip sets,
but a disk is a disk is a disk.
I have no SCSI interfaces, but lsscsi
shows me all the disk-like storage devices.
Each SCSI(-like) device is listed with its 4-tuple of
SCSI host, channel, target number, and LUN in square brackets,
either actual SCSI details or how lsscsi sees it.
Storage connected by Fibre Channel and IEEE 1394 will
also appear here.
Adding the -s option yields that last column
of sizes.
# lsscsi -s
[0:0:0:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sda 1.00TB
[1:0:0:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdb 1.00TB
[2:0:0:0] disk ATA WDC WD20EARS-00M AB51 /dev/sdc 2.00TB
[3:0:0:0] disk ATA WDC WD40EZRX-00S 0A80 /dev/sdd 4.00TB
[4:0:0:0] cd/dvd ATAPI iHAS324 E 4L16 /dev/sr0 -
[7:0:0:0] disk ATA SSD2SC120G1CS175 1101 /dev/sde 120GB
[16:0:0:0] disk Lexar USB Flash Drive 8.07 /dev/sdf 8.01GB
Disks sda, sdb, sdc, and sdd
are the SATA magnetic disks in my system.
Device sr0 is a DVD writer.
Device sde is a solid-state drive.
Device sdf is a USB-connected storage device.
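The bracketed 4-tuple and the trailing device node and size can be pulled out of each lsscsi -s line. This Python sketch assumes the -s form shown above, where the last two columns are the device node and the size; the name parse_lsscsi_line is hypothetical, and the vendor, model, and revision are kept together as one string because those columns can contain spaces.

```python
import re

_LSSCSI_LINE = re.compile(
    r"^\[(\d+):(\d+):(\d+):(\d+)\]\s+"   # [host:channel:target:lun]
    r"(\S+)\s+"                          # device type, e.g. disk or cd/dvd
    r"(.*?)\s+"                          # vendor, model, revision together
    r"(\S+)\s+(\S+)$"                    # device node, then size from -s
)


def parse_lsscsi_line(line):
    """Return a dict for one 'lsscsi -s' line, or None if it does not match."""
    match = _LSSCSI_LINE.match(line.strip())
    if match is None:
        return None
    host, channel, target, lun, kind, identity, node, size = match.groups()
    return {
        "hctl": (int(host), int(channel), int(target), int(lun)),
        "type": kind,
        "identity": identity,
        "node": node,
        "size": size,   # "-" when lsscsi reports no size, as for the DVD writer
    }
```

Applied to the Lexar line above, it reports the tuple (16, 0, 0, 0), the node /dev/sdf, and the size 8.01GB.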
The systool command is one last bus-examination
program.
At this point it becomes obvious that
something else is going on.
You can ask for information about a specific bus with -b,
or about a class of devices with -c,
and you can ask for the path to those devices with -p.
# systool -b scsi Bus = "scsi" Device = "0:0:0:0" Device = "16:0:0:0" Device = "1:0:0:0" Device = "2:0:0:0" Device = "3:0:0:0" Device = "4:0:0:0" Device = "7:0:0:0" Device = "8:0:0:0" Device = "host0" Device = "host1" Device = "host16" Device = "host2" Device = "host3" Device = "host4" Device = "host5" Device = "host6" Device = "host7" Device = "host8" Device = "target0:0:0" Device = "target16:0:0" Device = "target1:0:0" Device = "target2:0:0" Device = "target3:0:0" Device = "target4:0:0" Device = "target7:0:0" Device = "target8:0:0" # systool -c net Class = "net" Class Device = "em1" Device = "0000:0c:00.0" Class Device = "lo" Class Device = "p3p1" Device = "0000:06:00.0" Class Device = "p7p1" Device = "0000:07:00.0" Class Device = "wlp10s0u1" Device = "2-1:1.0" # systool -p -c net Class = "net" Class Device = "em1" Class Device path = "/sys/devices/pci0000:00/0000:00:15.3/0000:0c:00.0/net/em1" Device = "0000:0c:00.0" Device path = "/sys/devices/pci0000:00/0000:00:15.3/0000:0c:00.0" Class Device = "lo" Class Device path = "/sys/devices/virtual/net/lo" Class Device = "p3p1" Class Device path = "/sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0/net/p3p1" Device = "0000:06:00.0" Device path = "/sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0" Class Device = "p7p1" Class Device path = "/sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:07:00.0/net/p7p1" Device = "0000:07:00.0" Device path = "/sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:07:00.0" Class Device = "wlp10s0u1" Class Device path = "/sys/devices/pci0000:00/0000:00:15.1/0000:0a:00.0/usb2/2-1/2-1:1.0/net/wlp10s0u1" Device = "2-1:1.0" Device path = "/sys/devices/pci0000:00/0000:00:15.1/0000:0a:00.0/usb2/2-1/2-1:1.0"
The network interface devices have appeared at last:
Ethernet interfaces em1, p3p1, and p7p1,
wireless LAN interface wlp10s0u1,
and virtual loopback device lo.
But what is going on in the output of that last command?
Welcome to sysfs
The /sys file system is a memory-resident file
system.
It isn't stored on a disk; it is made up of kernel data
structures presented as a user-friendly directory tree.
You may think that it's large, complicated, and confusing,
but be glad you have /sys and you aren't
stuck trying to figure it out by using a debugger
on the running kernel.
It's much more user-friendly than the alternative,
in the way that LDAP is Lightweight in comparison to X.500,
or SNMP is Simpler than whatever enormous and proprietary
protocols came before.
# df /sys
Filesystem 1K-blocks Used Available Use% Mounted on
sysfs 0 0 0 - /sys
The file system hierarchy beneath /sys maps
between internal kernel constructs and file system
objects in this way:
| Internal | File System |
| --- | --- |
| Kernel objects | Directories |
| Object attributes | Regular files |
| Object relationships | Symbolic links |
There is a lot of data under /sys:
42,843 files, symbolic links, and directories on my system.
But we can look at its overall structure without becoming
overwhelmed, if we are careful.
# find /sys | wc -l
42843
# find /sys -type f | wc -l
34180
# find /sys -type d | wc -l
5865
# find /sys -type l | wc -l
2798
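The three find commands each walk the whole tree. The same tallies can be gathered in one pass, which I sketch here in Python. The names count_entries and demo_counts are my own, and the demo builds a tiny scratch directory so that the sketch runs anywhere, even without /sys; like find, it does not follow symbolic links and it counts the starting directory itself.

```python
import os
import tempfile
from collections import Counter


def count_entries(root):
    """Count directories, regular files, and symbolic links under root."""
    counts = Counter(dir=1)          # the root itself, as find counts it
    pending = [root]
    while pending:
        current = pending.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_symlink():
                        counts["link"] += 1       # relationship, not followed
                    elif entry.is_dir(follow_symlinks=False):
                        counts["dir"] += 1        # object; descend later
                        pending.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        counts["file"] += 1       # attribute
                    else:
                        counts["other"] += 1
        except OSError:
            continue                 # unreadable directory: skip it
    return counts


def demo_counts():
    """Build a scratch tree (one object, one attribute, one link) and count it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "obj"))
        with open(os.path.join(root, "obj", "attr"), "w") as handle:
            handle.write("1\n")
        os.symlink("obj", os.path.join(root, "rel"))
        return dict(count_entries(root))
```

Running count_entries("/sys") as root should give totals comparable to the find output above, though the numbers change as devices and modules come and go.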
Let's see the eleven major subsystems known to sysfs,
and then we will look a little deeper into the
block, bus, class, and devices
subsystems, looking at them in an order that lets the
story make a little more sense.
# ls -F /sys
block/ class/ devices/ fs/ kernel/ power/
bus/ dev/ firmware/ hypervisor/ module/
/sys/devices
The devices directory contains a hierarchy
of all the devices detected by the system.
At the level of /sys/devices the subdirectories
are categories.
Many are rather complex or obscure, but the cpu
category is an easier place to start.
# ls -F /sys/devices LNXSYSTM:00/ ibs_fetch/ pci0000:00/ software/ virtual/ breakpoint/ ibs_op/ platform/ system/ cpu/ msr/ pnp0/ tracepoint/ # tree -F /sys/devices/cpu /sys/devices/cpu/ |-- events/ | |-- branch-instructions | |-- branch-misses | |-- cache-misses | |-- cache-references | |-- cpu-cycles | `-- instructions |-- format/ | |-- cmask | |-- edge | |-- event | |-- inv | `-- umask |-- perf_event_mux_interval_ms |-- power/ | |-- autosuspend_delay_ms | |-- control | |-- runtime_active_time | |-- runtime_status | `-- runtime_suspended_time |-- rdpmc |-- subsystem -> ../../bus/event_source/ |-- type `-- uevent 4 directories, 20 files
Remember that directories are objects, files are object attributes, and symbolic links are object relationships.
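That three-way mapping can be turned into a small inspection helper. This is a sketch under assumptions, not part of any sysfs tooling: describe_object and demo_object are hypothetical names, attribute files are read as short ASCII values, and the demo uses a scratch directory with made-up contents that only imitate the shape of /sys/devices/cpu.

```python
import os
import tempfile


def describe_object(path, max_bytes=4096):
    """Describe one sysfs-style directory without following links.

    Regular files become attributes, symbolic links become relationships,
    and subdirectories are listed as child objects.
    """
    attributes, relationships, children = {}, {}, []
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda item: item.name):
            if entry.is_symlink():
                relationships[entry.name] = os.readlink(entry.path)
            elif entry.is_dir(follow_symlinks=False):
                children.append(entry.name)
            elif entry.is_file(follow_symlinks=False):
                try:
                    with open(entry.path, "rb") as handle:
                        raw = handle.read(max_bytes)
                    attributes[entry.name] = raw.decode("ascii", "replace").strip()
                except OSError as err:
                    # some attributes are write-only or refuse to be read
                    attributes[entry.name] = f"<unreadable: {err.strerror}>"
    return {"attributes": attributes,
            "relationships": relationships,
            "children": children}


def demo_object():
    """Scratch object with one attribute, one link, one child (made-up values)."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "power"))
        with open(os.path.join(root, "type"), "w") as handle:
            handle.write("4\n")
        os.symlink("../../bus/event_source", os.path.join(root, "subsystem"))
        return describe_object(root)
```

Pointing describe_object at /sys/devices/cpu on a real system should list events, format, and power as children and subsystem as a relationship, in line with the tree output above.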
The tree under /sys/devices/pci* is especially
messy, as its long path names describe the full path through
PCI buses (and bridges to other buses) and controllers.
/sys/bus
The bus subdirectory contains a subdirectory
for each supported bus type.
# ls -F /sys/bus acpi/ event_source/ memory/ platform/ workqueue/ clockevents/ hdaudio/ mipi-dsi/ pnp/ xen/ clocksource/ hid/ node/ rapidio/ xen-backend/ container/ i2c/ parport/ scsi/ cpu/ machinecheck/ pci/ serio/ edac/ mei/ pci_express/ usb/
Let's go a little deeper into the PCI and PCI Express
bus hierarchies,
which we have already examined with lspci.
The devices subdirectories contain links
to the /sys/devices tree,
and the drivers subdirectories are trees of
information about the named kernel drivers registered for
these buses.
This gets very large,
so we will look at just the first two layers.
Notice how this shows that PCI buses 01 and 04 are reached
through the PCI bus bridges at 0000:00:02.0 and 0000:00:0d.0,
respectively (we saw the bridges in the lspci output).
# tree -L 2 -F /sys/bus/pci* /sys/bus/pci |-- devices/ | |-- 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0/ | |-- 0000:00:02.0 -> ../../../devices/pci0000:00/0000:00:02.0/ | |-- 0000:00:04.0 -> ../../../devices/pci0000:00/0000:00:04.0/ | |-- 0000:00:0b.0 -> ../../../devices/pci0000:00/0000:00:0b.0/ | |-- 0000:00:0d.0 -> ../../../devices/pci0000:00/0000:00:0d.0/ | |-- 0000:00:11.0 -> ../../../devices/pci0000:00/0000:00:11.0/ | |-- 0000:00:12.0 -> ../../../devices/pci0000:00/0000:00:12.0/ | |-- 0000:00:12.2 -> ../../../devices/pci0000:00/0000:00:12.2/ | |-- 0000:00:13.0 -> ../../../devices/pci0000:00/0000:00:13.0/ | |-- 0000:00:13.2 -> ../../../devices/pci0000:00/0000:00:13.2/ | |-- 0000:00:14.0 -> ../../../devices/pci0000:00/0000:00:14.0/ | |-- 0000:00:14.2 -> ../../../devices/pci0000:00/0000:00:14.2/ | |-- 0000:00:14.3 -> ../../../devices/pci0000:00/0000:00:14.3/ | |-- 0000:00:14.4 -> ../../../devices/pci0000:00/0000:00:14.4/ | |-- 0000:00:14.5 -> ../../../devices/pci0000:00/0000:00:14.5/ | |-- 0000:00:15.0 -> ../../../devices/pci0000:00/0000:00:15.0/ | |-- 0000:00:15.1 -> ../../../devices/pci0000:00/0000:00:15.1/ | |-- 0000:00:15.2 -> ../../../devices/pci0000:00/0000:00:15.2/ | |-- 0000:00:15.3 -> ../../../devices/pci0000:00/0000:00:15.3/ | |-- 0000:00:16.0 -> ../../../devices/pci0000:00/0000:00:16.0/ | |-- 0000:00:16.2 -> ../../../devices/pci0000:00/0000:00:16.2/ | |-- 0000:00:18.0 -> ../../../devices/pci0000:00/0000:00:18.0/ | |-- 0000:00:18.1 -> ../../../devices/pci0000:00/0000:00:18.1/ | |-- 0000:00:18.2 -> ../../../devices/pci0000:00/0000:00:18.2/ | |-- 0000:00:18.3 -> ../../../devices/pci0000:00/0000:00:18.3/ | |-- 0000:00:18.4 -> ../../../devices/pci0000:00/0000:00:18.4/ | |-- 0000:01:00.0 -> ../../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/ | |-- 0000:01:00.1 -> ../../../devices/pci0000:00/0000:00:02.0/0000:01:00.1/ | |-- 0000:04:00.0 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/ | |-- 0000:05:03.0 -> 
../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/ | |-- 0000:05:07.0 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/ | |-- 0000:06:00.0 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0/ | |-- 0000:07:00.0 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:07:00.0/ | |-- 0000:08:05.0 -> ../../../devices/pci0000:00/0000:00:14.4/0000:08:05.0/ | |-- 0000:08:05.1 -> ../../../devices/pci0000:00/0000:00:14.4/0000:08:05.1/ | |-- 0000:08:05.2 -> ../../../devices/pci0000:00/0000:00:14.4/0000:08:05.2/ | |-- 0000:09:00.0 -> ../../../devices/pci0000:00/0000:00:15.0/0000:09:00.0/ | |-- 0000:0a:00.0 -> ../../../devices/pci0000:00/0000:00:15.1/0000:0a:00.0/ | |-- 0000:0b:00.0 -> ../../../devices/pci0000:00/0000:00:15.2/0000:0b:00.0/ | `-- 0000:0c:00.0 -> ../../../devices/pci0000:00/0000:00:15.3/0000:0c:00.0/ |-- drivers/ | |-- agpgart-intel/ | |-- agpgart-sis/ | |-- agpgart-via/ | |-- ahci/ | |-- ehci-pci/ | |-- intel_mid_gpio/ | |-- k10temp/ | |-- mei_me/ | |-- nouveau/ | |-- ohci-pci/ | |-- parport_pc/ | |-- pcieport/ | |-- piix4_smbus/ | |-- r8169/ | |-- serial/ | |-- shpchp/ | |-- snd_hda_intel/ | |-- tsi721/ | |-- xen-platform-pci/ | `-- xhci_hcd/ |-- drivers_autoprobe |-- drivers_probe |-- rescan |-- resource_alignment |-- slots/ `-- uevent /sys/bus/pci_express |-- devices/ | |-- 0000:04:00.0:pcie18 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:04:00.0:pcie18/ | |-- 0000:05:03.0:pcie28 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:05:03.0:pcie28/ | `-- 0000:05:07.0:pcie28 -> ../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:05:07.0:pcie28/ |-- drivers/ | |-- aer/ | |-- pcie_pme/ | `-- pciehp/ |-- drivers_autoprobe |-- drivers_probe `-- uevent 71 directories, 8 files
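Each link target in /sys/bus/pci/devices spells out the chain of bridges leading to a device, so the bridge behind each bus can be read straight off the paths. Here is a hedged Python sketch of that idea; pci_chain and bridge_for_each_bus are hypothetical names, and SAMPLE_TARGETS holds two link targets copied from the listing above.

```python
import re

# matches a full address such as 0000:05:03.0, but not "pci0000:00"
# and not PCI Express port service names such as 0000:04:00.0:pcie18
_FULL_ADDRESS = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")

SAMPLE_TARGETS = [
    "../../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/",
    "../../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0/",
]


def pci_chain(target):
    """Return the PCI addresses in a sysfs path, ordered from root to device."""
    return [part for part in target.split("/") if _FULL_ADDRESS.match(part)]


def bridge_for_each_bus(targets):
    """Map each bus number (two hex digits) to the bridge that leads to it."""
    bridges = {}
    for target in targets:
        chain = pci_chain(target)
        # each address in the chain is the bridge for the one that follows
        for parent, child in zip(chain, chain[1:]):
            bus = child.split(":")[1]
            bridges[bus] = parent
    return bridges
```

For the two sample targets this maps bus 01 to the bridge at 0000:00:02.0 and bus 04 to the bridge at 0000:00:0d.0, with buses 05 and 06 hanging off the ASMedia bridges further down the chain.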
/sys/class
This contains a list of every device class, or functional type or category of device. Let's investigate two of them.
# ls -F /sys/class
ata_device/  dma/       i2c-adapter/  net/           regulator/    tty/
ata_link/    dmi/       ieee80211/    pci_bus/       rfkill/       vc/
ata_port/    drm/       input/        phy/           rtc/          vtconsole/
backlight/   extcon/    iommu/        pktcdvd/       scsi_device/  watchdog/
bdi/         firmware/  leds/         power_supply/  scsi_disk/    wmi/
block/       gpio/      mei/          ppdev/         scsi_host/
bsg/         graphics/  mem/          printer/       sound/
devcoredump/ hidraw/    misc/         pwm/           thermal/
devfreq/     hwmon/     msr/          rapidio_port/  tpm/
# tree -F /sys/class/net
/sys/class/net
|-- em1 -> ../../devices/pci0000:00/0000:00:15.3/0000:0c:00.0/net/em1/
|-- lo -> ../../devices/virtual/net/lo/
|-- p3p1 -> ../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0/net/p3p1/
|-- p7p1 -> ../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:07:00.0/net/p7p1/
`-- wlp10s0u1 -> ../../devices/pci0000:00/0000:00:15.1/0000:0a:00.0/usb2/2-1/2-1:1.0/net/wlp10s0u1/

5 directories, 0 files
/sys/block
This directory contains symbolic links to the details about all block devices currently recognized by the kernel.
# tree /sys/class/block
/sys/class/block
|-- loop0 -> ../../devices/virtual/block/loop0/
|-- loop1 -> ../../devices/virtual/block/loop1/
|-- loop2 -> ../../devices/virtual/block/loop2/
|-- loop3 -> ../../devices/virtual/block/loop3/
|-- loop4 -> ../../devices/virtual/block/loop4/
|-- loop5 -> ../../devices/virtual/block/loop5/
|-- loop6 -> ../../devices/virtual/block/loop6/
|-- loop7 -> ../../devices/virtual/block/loop7/
|-- ram0 -> ../../devices/virtual/block/ram0/
|-- ram1 -> ../../devices/virtual/block/ram1/
|-- ram10 -> ../../devices/virtual/block/ram10/
|-- ram11 -> ../../devices/virtual/block/ram11/
|-- ram12 -> ../../devices/virtual/block/ram12/
|-- ram13 -> ../../devices/virtual/block/ram13/
|-- ram14 -> ../../devices/virtual/block/ram14/
|-- ram15 -> ../../devices/virtual/block/ram15/
|-- ram2 -> ../../devices/virtual/block/ram2/
|-- ram3 -> ../../devices/virtual/block/ram3/
|-- ram4 -> ../../devices/virtual/block/ram4/
|-- ram5 -> ../../devices/virtual/block/ram5/
|-- ram6 -> ../../devices/virtual/block/ram6/
|-- ram7 -> ../../devices/virtual/block/ram7/
|-- ram8 -> ../../devices/virtual/block/ram8/
|-- ram9 -> ../../devices/virtual/block/ram9/
|-- sda -> ../../devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/
|-- sda1 -> ../../devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1/
|-- sdb -> ../../devices/pci0000:00/0000:00:11.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/
|-- sdb1 -> ../../devices/pci0000:00/0000:00:11.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/sdb1/
|-- sdc -> ../../devices/pci0000:00/0000:00:11.0/ata3/host2/target2:0:0/2:0:0:0/block/sdc/
|-- sdc1 -> ../../devices/pci0000:00/0000:00:11.0/ata3/host2/target2:0:0/2:0:0:0/block/sdc/sdc1/
|-- sdd -> ../../devices/pci0000:00/0000:00:11.0/ata4/host3/target3:0:0/3:0:0:0/block/sdd/
|-- sdd1 -> ../../devices/pci0000:00/0000:00:11.0/ata4/host3/target3:0:0/3:0:0:0/block/sdd/sdd1/
|-- sde -> ../../devices/pci0000:00/0000:00:15.0/0000:09:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sde/
|-- sde1 -> ../../devices/pci0000:00/0000:00:15.0/0000:09:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sde/sde1/
|-- sde2 -> ../../devices/pci0000:00/0000:00:15.0/0000:09:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sde/sde2/
|-- sde3 -> ../../devices/pci0000:00/0000:00:15.0/0000:09:00.0/ata8/host7/target7:0:0/7:0:0:0/block/sde/sde3/
|-- sdf -> ../../devices/pci0000:00/0000:00:16.2/usb7/7-1/7-1.2/7-1.2:1.0/host8/target8:0:0/8:0:0:0/block/sdf/
|-- sdg -> ../../devices/pci0000:00/0000:00:13.2/usb4/4-4/4-4:1.0/host16/target16:0:0/16:0:0:0/block/sdg/
|-- sdg1 -> ../../devices/pci0000:00/0000:00:13.2/usb4/4-4/4-4:1.0/host16/target16:0:0/16:0:0:0/block/sdg/sdg1/
`-- sr0 -> ../../devices/pci0000:00/0000:00:11.0/ata5/host4/target4:0:0/4:0:0:0/block/sr0/

40 directories, 0 files
Look at the paths corresponding to those disk devices. We could use this device path for the first SATA disk:
# tree /sys/devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/ [... 97 lines of output not shown ...]
However, there are much more convenient ways to refer to this device if all we want is to learn about the first SATA drive, and we don't care about specifying it by PCI bus and controller path:
# tree /sys/block/sda
[... 97 lines of output not shown ...]
# tree /sys/class/block/sda
[... 97 lines of output not shown ...]
Those names are equivalent because of this:
# find /sys -name sda
/sys/devices/pci0000:00/0000:00:08.0/ata1/host0/target0:0:0/0:0:0:0/block/sda
/sys/block/sda
/sys/class/block/sda
# ls -ldF $( find /sys -name sda )
lrwxrwxrwx  1 root root 0 Mar 21 10:38 /sys/block/sda -> ../devices/pci0000:00/0000:00:08.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/
lrwxrwxrwx  1 root root 0 Mar 21 10:38 /sys/class/block/sda -> ../../devices/pci0000:00/0000:00:08.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/
drwxr-xr-x 11 root root 0 Mar 26 08:53 /sys/devices/pci0000:00/0000:00:08.0/ata1/host0/target0:0:0/0:0:0:0/block/sda/
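A quick way to confirm that two of these names really are the same kernel object is to resolve both to their canonical path and compare. This is only a minimal sketch: the helper name same_sysfs_object is made up for illustration, and it assumes a readlink that supports -f (GNU coreutils and BusyBox both do).

```sh
#!/bin/sh
# same_sysfs_object PATH1 PATH2   (hypothetical helper, not a standard tool)
# Exit status 0 when both paths resolve to the same canonical path.
same_sysfs_object() {
    a=$(readlink -f -- "$1") || return 1
    b=$(readlink -f -- "$2") || return 1
    [ "$a" = "$b" ]
}
```

For example, same_sysfs_object /sys/block/sda /sys/class/block/sda && echo same should report that both links lead to one directory under /sys/devices.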
The drawback to using the much simpler names is
that we can't count on the disks to be detected, and thus
named, in the same order every time.
However, when the kernel detects a disk or disk partition
holding a file system such as Ext4, XFS, or Btrfs, udev reads
the label and UUID from the file system's header
and creates symbolic links under /dev/disk/by-*
.
Let's see how that works.
The external disk is usually plugged into my Blu-Ray player,
which like most of them,
runs Linux.
So, I put an Ext4 file system labeled "BluRay Player"
on that disk.
Symbolic link names can't contain literal "/" and so we see
"\\x2f
" and "\\x20
" for the ASCII
codes 0x2f and 0x20 for "/" and a blank space.
# tree -F /dev/disk /dev/disk |-- by-id/ | |-- ata-ATAPI_iHAS324_E_3524706_2L8422503739 -> ../../sr0 | |-- ata-SSD2SC120G1CS1754D117-551_PNY13150000742110219 -> ../../sde | |-- ata-SSD2SC120G1CS1754D117-551_PNY13150000742110219-part1 -> ../../sde1 | |-- ata-SSD2SC120G1CS1754D117-551_PNY13150000742110219-part2 -> ../../sde2 | |-- ata-SSD2SC120G1CS1754D117-551_PNY13150000742110219-part3 -> ../../sde3 | |-- ata-WDC_WD1001FALS-00J7B0_WD-WMATV8393587 -> ../../sda | |-- ata-WDC_WD1001FALS-00J7B0_WD-WMATV8393587-part1 -> ../../sda1 | |-- ata-WDC_WD1001FALS-00J7B0_WD-WMATV8828069 -> ../../sdb | |-- ata-WDC_WD1001FALS-00J7B0_WD-WMATV8828069-part1 -> ../../sdb1 | |-- ata-WDC_WD20EARS-00MVWB0_WD-WMAZA3949022 -> ../../sdc | |-- ata-WDC_WD20EARS-00MVWB0_WD-WMAZA3949022-part1 -> ../../sdc1 | |-- ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E3LHE6AS -> ../../sdd | |-- ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E3LHE6AS-part1 -> ../../sdd1 | |-- usb-Generic_STORAGE_DEVICE_000000000903-0:0 -> ../../sdf | |-- usb-Lexar_USB_Flash_Drive_20120221185614484FE0-0:0 -> ../../sdg | |-- usb-Lexar_USB_Flash_Drive_20120221185614484FE0-0:0-part1 -> ../../sdg1 | |-- wwn-0x50014ee057b9a2e5 -> ../../sda | |-- wwn-0x50014ee057b9a2e5-part1 -> ../../sda1 | |-- wwn-0x50014ee057d3c9a5 -> ../../sdb | |-- wwn-0x50014ee057d3c9a5-part1 -> ../../sdb1 | |-- wwn-0x50014ee0ad3ea3bd -> ../../sdc | |-- wwn-0x50014ee0ad3ea3bd-part1 -> ../../sdc1 | |-- wwn-0x50014ee2b674b342 -> ../../sdd | |-- wwn-0x50014ee2b674b342-part1 -> ../../sdd1 | |-- wwn-0x5f8db4c135110219 -> ../../sde | |-- wwn-0x5f8db4c135110219-part1 -> ../../sde1 | |-- wwn-0x5f8db4c135110219-part2 -> ../../sde2 | `-- wwn-0x5f8db4c135110219-part3 -> ../../sde3 |-- by-label/ | |-- BluRay\\x20Player -> ../../sdf1 | |-- LEXAR -> ../../sdg1 | |-- \\x2fhome -> ../../sdd1 | |-- \\x2fhome2 -> ../../sda1 | |-- \\x2fhome3 -> ../../sdb1 | `-- \\x2fhome4 -> ../../sdc1 |-- by-partlabel/ | `-- primary -> ../../sdb1 |-- by-partuuid/ | |-- 25c51d6b-1bd2-4f2b-8aa9-3f86758d4c05 -> ../../sde3 | 
|-- 5dd49874-05b7-4eb8-9e4d-656ac46cdabd -> ../../sde2 | |-- d1aecf0d-91ed-4e10-92db-91136f7c0990 -> ../../sdd1 | |-- d2d717e0-9356-4183-9f10-4ecd94956f0d -> ../../sde1 | `-- e6affcfb-6920-4046-8507-15ee5239e5b3 -> ../../sdb1 |-- by-path/ | |-- pci-0000:00:13.2-usb-0:4:1.0-scsi-0:0:0:0 -> ../../sdg | |-- pci-0000:00:13.2-usb-0:4:1.0-scsi-0:0:0:0-part1 -> ../../sdg1 | `-- pci-0000:00:16.2-usb-0:1.2:1.0-scsi-0:0:0:0 -> ../../sdf `-- by-uuid/ |-- 220c8a49-f75f-4af3-8618-c374599edb96 -> ../../sde2 |-- 36B5-4947 -> ../../sdg1 |-- 3C9205C077D3E2A8 -> ../../sdf1 |-- 5fd32dac-b2e9-411c-bff3-a109f29734aa -> ../../sdb1 |-- 5fd4396d-d2d3-4520-9dff-adfb8bfac2c5 -> ../../sdd1 |-- 6a0701ca-2759-43aa-b461-2c750917d017 -> ../../sda1 |-- 77aa6e61-df08-4480-a3ae-73bf40116336 -> ../../sde3 |-- EA43-35E5 -> ../../sde1 `-- b16973c5-3ad7-4a60-bb95-fdeb3f3a57d2 -> ../../sdc1 6 directories, 52 files
So, I can use the links in /dev/disk/by-*
to get the current name for the device holding a
file system with a specific label or UUID, and then look up
or manipulate details of that device under
/sys/block/sdX
.
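The lookup just described can be sketched as a small shell function. This is an illustration under stated assumptions, not a standard tool: the name sysfs_dir_for_label and its optional second and third arguments (alternate /dev and /sys roots, handy for experimenting) are invented here. It points at /sys/class/block rather than /sys/block because partitions such as sdg1 have their own entry there.

```sh
#!/bin/sh
# sysfs_dir_for_label LABEL [DEVROOT] [SYSROOT]   (hypothetical helper)
# Print the sysfs directory of the block device currently carrying LABEL.
sysfs_dir_for_label() {
    label=$1
    devroot=${2:-/dev}
    sysroot=${3:-/sys}
    link=$devroot/disk/by-label/$label
    [ -L "$link" ] || { echo "no such label: $label" >&2; return 1; }
    # Follow the symbolic link to the device node, e.g. /dev/sdg1
    node=$(readlink -f -- "$link") || return 1
    # Keep only the kernel name (sdg1) and map it into sysfs
    printf '%s/class/block/%s\n' "$sysroot" "${node##*/}"
}
```

On the system shown above, sysfs_dir_for_label LEXAR should print /sys/class/block/sdg1. A label containing a space or slash has to be given in the escaped form that appears in the link name.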
Examining One Device in sysfs
As explained elsewhere on my
How Linux Boots
and
TCP/IP Network Commands
pages, the Ethernet interface names eth0
and
eth1
are the old-fashioned ones.
You will only have those if you have older BIOS firmware.
Use the dmidecode
command to see your
SMBIOS version.
The new interface naming convention requires
SMBIOS 2.6 or later.
Ethernet devices can have names
like eno1
or em1
(onboard),
ens1
(PCIe hotplug),
enp4s1
or p4p1
(PCI bus address)
or enx0011951E8EB6
(MAC address).
Look at the name of my WLAN interface,
wlp10s0u1
!
The "wl
" means it's a wireless LAN interface,
"p10s0
" means that the USB controller is at
PCI bus address 0a:00.0 (the numbers in the name are decimal),
and "u1
" means it's on USB port 1.
On another system it was
wlp0s2f1u4
.
The "wl
" again means it's a wireless LAN interface,
"p0s2f1
" means that the USB controller is at
PCI bus 0, slot 2, function 1, or PCI bus address 00:02.1,
and "u4
" means it's on USB port 4.
All that is likely to leave you wondering what
network interface device names will be on a system!
You can no longer assume that
Ethernet is ethN
and WLAN is wlanN
.
My experience is that the new network device names are
consistent but they are not predictable.
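The PCI-path style of name can be unpacked mechanically. The following is a rough sketch, assuming names of the form en or wl, then p<bus>s<slot>, then an optional f<function> and an optional u<port>, with all numbers in decimal. The function name decode_ifname is invented for this example, and the other styles (eno1, ens1, em1, enx...) are deliberately not handled.

```sh
#!/bin/sh
# decode_ifname NAME   (hypothetical helper)
# Print the interface type and the PCI address encoded in NAME.
# The body runs in a subshell so its variables do not leak.
decode_ifname() (
    name=$1
    case $name in
        en*) kind=ethernet ;;
        wl*) kind=wlan ;;
        *) echo "unrecognized prefix: $name" >&2; return 1 ;;
    esac
    rest=${name#??}                 # drop the two-letter type prefix
    case $rest in
        p[0-9]*s[0-9]*) ;;
        *) echo "not a PCI-path name: $name" >&2; return 1 ;;
    esac
    rest=${rest#p}
    bus=${rest%%s*}                 # decimal bus number
    rest=${rest#*s}
    slot=${rest%%[fu]*}             # decimal slot number
    func=0
    port=
    case $rest in *f*) tmp=${rest#*f}; func=${tmp%%u*} ;; esac
    case $rest in *u*) port=${rest#*u} ;; esac
    # lspci prints bus and slot in hexadecimal, so convert
    printf '%s pci=%02x:%02x.%x' "$kind" "$bus" "$slot" "$func"
    if [ -n "$port" ]; then printf ' usb_port=%s' "$port"; fi
    printf '\n'
)
```

For example, decode_ifname wlp10s0u1 prints wlan pci=0a:00.0 usb_port=1, which matches the PCI address seen in the sysfs path for that interface.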
To see the network device names, simply list the directory:
# ls /sys/class/net
em1  lo  p3p1  p7p1  wlp10s0u1
# tree /sys/class/net
/sys/class/net
|-- em1 -> ../../devices/pci0000:00/0000:00:15.3/0000:0c:00.0/net/em1
|-- lo -> ../../devices/virtual/net/lo
|-- p3p1 -> ../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0/net/p3p1
|-- p7p1 -> ../../devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:07.0/0000:07:00.0/net/p7p1
`-- wlp10s0u1 -> ../../devices/pci0000:00/0000:00:15.1/0000:0a:00.0/usb2/2-1/2-1:1.0/net/wlp10s0u1

5 directories, 0 files
Let's investigate that p3p1
device.
It's accessed via the first PCI bus 0000:00,
via a PCI bridge at 0000:00:0d.0 and the bridges below it,
to PCI bus 06 and address 0000:06:00.0.
That PCI-connected device is at
/sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0
within sysfs. Let's see the full hierarchy rooted there.
# lspci | egrep '00:0d.0|06:00.0' 00:0d.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx1 port B) 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 07) # tree /sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0 /sys/devices/pci0000:00/0000:00:0d.0/0000:04:00.0/0000:05:03.0/0000:06:00.0 |-- broken_parity_status |-- class |-- config |-- consistent_dma_mask_bits |-- d3cold_allowed |-- device |-- dma_mask_bits |-- driver -> ../../../../../../bus/pci/drivers/r8169 |-- driver_override |-- enable |-- irq |-- local_cpulist |-- local_cpus |-- modalias |-- msi_bus |-- msi_irqs | `-- 42 |-- net | `-- p3p1 | |-- addr_assign_type | |-- addr_len | |-- address | |-- broadcast | |-- carrier | |-- carrier_changes | |-- dev_id | |-- dev_port | |-- device -> ../../../0000:06:00.0 | |-- dormant | |-- duplex | |-- flags | |-- gro_flush_timeout | |-- ifalias | |-- ifindex | |-- iflink | |-- link_mode | |-- mtu | |-- name_assign_type | |-- netdev_group | |-- operstate | |-- phys_port_id | |-- phys_port_name | |-- phys_switch_id | |-- power | | |-- autosuspend_delay_ms | | |-- control | | |-- runtime_active_time | | |-- runtime_status | | `-- runtime_suspended_time | |-- proto_down | |-- queues | | |-- rx-0 | | | |-- rps_cpus | | | `-- rps_flow_cnt | | `-- tx-0 | | |-- byte_queue_limits | | | |-- hold_time | | | |-- inflight | | | |-- limit | | | |-- limit_max | | | `-- limit_min | | |-- tx_maxrate | | |-- tx_timeout | | `-- xps_cpus | |-- speed | |-- statistics | | |-- collisions | | |-- multicast | | |-- rx_bytes | | |-- rx_compressed | | |-- rx_crc_errors | | |-- rx_dropped | | |-- rx_errors | | |-- rx_fifo_errors | | |-- rx_frame_errors | | |-- rx_length_errors | | |-- rx_missed_errors | | |-- rx_over_errors | | |-- rx_packets | | |-- tx_aborted_errors | | |-- tx_bytes | | |-- tx_carrier_errors | | |-- tx_compressed | | |-- tx_dropped | | |-- 
tx_errors | | |-- tx_fifo_errors | | |-- tx_heartbeat_errors | | |-- tx_packets | | `-- tx_window_errors | |-- subsystem -> ../../../../../../../../class/net | |-- tx_queue_len | |-- type | `-- uevent |-- numa_node |-- power | |-- autosuspend_delay_ms | |-- control | |-- runtime_active_time | |-- runtime_status | |-- runtime_suspended_time | |-- wakeup | |-- wakeup_abort_count | |-- wakeup_active | |-- wakeup_active_count | |-- wakeup_count | |-- wakeup_expire_count | |-- wakeup_last_time_ms | |-- wakeup_max_time_ms | `-- wakeup_total_time_ms |-- remove |-- rescan |-- reset |-- resource |-- resource0 |-- resource2 |-- resource4 |-- resource4_wc |-- subsystem -> ../../../../../../bus/pci |-- subsystem_device |-- subsystem_vendor |-- uevent |-- vendor `-- vpd 14 directories, 109 files
Notice all the LAN networking parameters in the tree
below net/p3p1
, let's see what form they take.
And instead of that unwieldy multi-part PCI bus path,
let's use the symbolic links with their, well, more
symbolic names!
# ls -lR /sys/class/net/p3p1/ /sys/class/net/p3p1/: total 0 -r--r--r-- 1 root root 4096 Dec 9 22:58 addr_assign_type -r--r--r-- 1 root root 4096 Dec 9 22:58 addr_len -r--r--r-- 1 root root 4096 Dec 9 22:58 address -r--r--r-- 1 root root 4096 Dec 9 22:58 broadcast -rw-r--r-- 1 root root 4096 Dec 9 22:58 carrier -r--r--r-- 1 root root 4096 Dec 9 22:58 carrier_changes -r--r--r-- 1 root root 4096 Dec 9 22:58 dev_id -r--r--r-- 1 root root 4096 Dec 9 22:58 dev_port lrwxrwxrwx 1 root root 0 Dec 9 22:09 device -> ../../../0000:06:00.0 -r--r--r-- 1 root root 4096 Dec 9 22:58 dormant -r--r--r-- 1 root root 4096 Dec 9 22:58 duplex -rw-r--r-- 1 root root 4096 Dec 9 22:58 flags -rw-r--r-- 1 root root 4096 Dec 9 22:58 gro_flush_timeout -rw-r--r-- 1 root root 4096 Dec 9 22:58 ifalias -r--r--r-- 1 root root 4096 Dec 9 22:58 ifindex -r--r--r-- 1 root root 4096 Dec 9 22:58 iflink -r--r--r-- 1 root root 4096 Dec 9 22:58 link_mode -rw-r--r-- 1 root root 4096 Dec 9 22:58 mtu -r--r--r-- 1 root root 4096 Dec 9 22:58 name_assign_type -rw-r--r-- 1 root root 4096 Dec 9 22:58 netdev_group -r--r--r-- 1 root root 4096 Dec 9 22:58 operstate -r--r--r-- 1 root root 4096 Dec 9 22:58 phys_port_id -r--r--r-- 1 root root 4096 Dec 9 22:58 phys_port_name -r--r--r-- 1 root root 4096 Dec 9 22:58 phys_switch_id drwxr-xr-x 2 root root 0 Dec 9 22:58 power -rw-r--r-- 1 root root 4096 Dec 9 22:58 proto_down drwxr-xr-x 4 root root 0 Dec 9 22:58 queues -r--r--r-- 1 root root 4096 Dec 9 22:58 speed drwxr-xr-x 2 root root 0 Dec 9 22:58 statistics lrwxrwxrwx 1 root root 0 Dec 9 22:09 subsystem -> ../../../../../../../../class/net -rw-r--r-- 1 root root 4096 Dec 9 22:58 tx_queue_len -r--r--r-- 1 root root 4096 Dec 9 22:58 type -rw-r--r-- 1 root root 4096 Dec 9 22:58 uevent /sys/class/net/p3p1/power: total 0 -rw-r--r-- 1 root root 4096 Dec 9 22:58 autosuspend_delay_ms -rw-r--r-- 1 root root 4096 Dec 9 22:58 control -r--r--r-- 1 root root 4096 Dec 9 22:58 runtime_active_time -r--r--r-- 1 root root 4096 Dec 9 22:58 
runtime_status -r--r--r-- 1 root root 4096 Dec 9 22:58 runtime_suspended_time /sys/class/net/p3p1/queues: total 0 drwxr-xr-x 2 root root 0 Dec 9 22:58 rx-0 drwxr-xr-x 3 root root 0 Dec 9 22:58 tx-0 /sys/class/net/p3p1/queues/rx-0: total 0 -rw-r--r-- 1 root root 4096 Dec 9 22:58 rps_cpus -rw-r--r-- 1 root root 4096 Dec 9 22:58 rps_flow_cnt /sys/class/net/p3p1/queues/tx-0: total 0 drwxr-xr-x 2 root root 0 Dec 9 22:58 byte_queue_limits -rw-r--r-- 1 root root 4096 Dec 9 22:58 tx_maxrate -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_timeout -rw-r--r-- 1 root root 4096 Dec 9 22:58 xps_cpus /sys/class/net/p3p1/queues/tx-0/byte_queue_limits: total 0 -rw-r--r-- 1 root root 4096 Dec 9 22:58 hold_time -r--r--r-- 1 root root 4096 Dec 9 22:58 inflight -rw-r--r-- 1 root root 4096 Dec 9 22:58 limit -rw-r--r-- 1 root root 4096 Dec 9 22:58 limit_max -rw-r--r-- 1 root root 4096 Dec 9 22:58 limit_min /sys/class/net/p3p1/statistics: total 0 -r--r--r-- 1 root root 4096 Dec 9 22:58 collisions -r--r--r-- 1 root root 4096 Dec 9 22:58 multicast -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_bytes -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_compressed -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_crc_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_dropped -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_fifo_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_frame_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_length_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_missed_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_over_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 rx_packets -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_aborted_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_bytes -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_carrier_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_compressed -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_dropped -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_fifo_errors 
-r--r--r-- 1 root root 4096 Dec 9 22:58 tx_heartbeat_errors -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_packets -r--r--r-- 1 root root 4096 Dec 9 22:58 tx_window_errors
Most of the files (attributes) are short ASCII strings,
and all are world-readable in this area.
Their contents can be read with cat
and
more
.
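Because each attribute is a small text file, a whole directory of them can be dumped with a short loop. A minimal sketch follows; the function name dump_attrs is made up for this example. It skips subdirectories and symbolic links to directories, and it quietly skips any attribute whose read fails, since some attributes return an error depending on the state of the device.

```sh
#!/bin/sh
# dump_attrs DIR   (hypothetical helper)
# Print "name: value" for every readable regular file directly in DIR.
dump_attrs() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        # A read can fail even on a readable file, so test the result
        if value=$(cat -- "$f" 2>/dev/null); then
            printf '%s: %s\n' "${f##*/}" "$value"
        fi
    done
}
```

Calling dump_attrs /sys/class/net/p3p1 would list the top-level attributes of that interface, and dump_attrs /sys/class/net/p3p1/statistics its counters.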
Interface p7p1
connects to a cable modem
leading toward the Internet, p3p1
and em1
are on interior LANs.
Notice that p7p1
has the usual MTU of 1500.
The cable modem would have autonegotiated just 576 bytes
and prohibited the use of IPv6,
if I hadn't explicitly set the MTU to 1500 as described on my
IPv6 on Linux
page.
# grep . /sys/class/net/[ep]*/mtu
/sys/class/net/em1/mtu:1500
/sys/class/net/p3p1/mtu:1500
/sys/class/net/p7p1/mtu:1500
And let's compare the traffic moved each way on the three interfaces.
# grep . /sys/class/net/[ep]*/statistics/[rt]x_bytes
/sys/class/net/em1/statistics/rx_bytes:343582270
/sys/class/net/em1/statistics/tx_bytes:33863
/sys/class/net/p3p1/statistics/rx_bytes:3025010313
/sys/class/net/p3p1/statistics/tx_bytes:1520737729
/sys/class/net/p7p1/statistics/rx_bytes:5861719964
/sys/class/net/p7p1/statistics/tx_bytes:857737409
And let's see some of the current state.
A script could read these files instead of
parsing the output of ethtool
.
The interface is connected to a live switch
and running at 1000 Mbps full duplex:
# grep . /sys/class/net/p3p1/{carrier,duplex,speed}
/sys/class/net/p3p1/carrier:1
/sys/class/net/p3p1/duplex:full
/sys/class/net/p3p1/speed:1000
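Here is a sketch of what such a script might look like. The function name link_state and its optional second argument (an alternate sysfs root, useful for testing) are assumptions made for this example. Reading carrier can fail while an interface is administratively down, so a failed read is reported as "unknown" rather than treated as fatal.

```sh
#!/bin/sh
# link_state IFACE [SYSROOT]   (hypothetical helper)
# Print carrier, duplex, and speed for IFACE as key=value lines.
link_state() {
    dir=${2:-/sys}/class/net/$1
    [ -d "$dir" ] || { echo "no such interface: $1" >&2; return 1; }
    for attr in carrier duplex speed; do
        # Fall back to "unknown" when the file is missing or unreadable
        value=$(cat "$dir/$attr" 2>/dev/null) || value=unknown
        printf '%s=%s\n' "$attr" "$value"
    done
}
```

The key=value output is easy to consume from another script, for example with link_state p3p1 | grep '^speed='.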
Modifying Kernel Objects with echo
Yes, it's that simple! We can use this for performance tuning, and also to force controllers to re-scan their buses and fully discover new hardware.
Performance Tuning
The biggest performance limitation is disk I/O.
DDR RAM is semiconductor memory, among the fastest components in the system. Magnetic disks are mechanical and by far the slowest. This is why changing to solid-state drives (SSDs) brings such an improvement in performance. SSD storage is still relatively expensive. It's getting better, but most of us still store most or all of our data on magnetic disks.
You need to be careful about how you design your file systems to use the disks. Don't have a lot of activity on one physical disk sending the read/write heads back and forth, making everything wait. Limit the activity per physical disk.
The XFS and Btrfs file systems have performance advantages over Ext4, at least in typical situations. Use an appropriate file system.
Processes take a huge performance hit if they need to send memory pages to and from the disk. Paging is bad enough, swapping is far worse as it sends an entire process's memory space from RAM to disk and then back again. Install enough RAM that you don't really need any swap area, even for light paging.
That PATA disk of mine is convenient but not necessary, I use it infrequently and it mostly holds non-critical data. Don't try to do sophisticated things with unsophisticated hardware.
If you follow those suggestions, the Linux kernel is pretty well designed and you will probably be happy with performance as long as you aren't doing anything too unexpected. (Or unrealistic, although there's nothing we can do about that.) But if you do want to tune things, sysfs provides a way.
The Linux kernel has three different scheduling algorithms for disk I/O. Whoever built your kernel may have made a conscious decision, or else accepted the default without realizing it. There is no one best answer, it depends on what you want to optimize.
The Deadline scheduler attempts to provide the lowest latency. You easily notice latency even when it's a small fraction of a second. Latency is usually why users perceive slow performance. So Deadline is a good scheduler for systems supporting interactive services.
Completely Fair Queuing or CFQ provides better throughput, at the cost of increased latency. So it's worse for interactive use but better for jobs with larger data sets. Each process or thread gets an equal time slice for disk I/O under CFQ.
NOOP or No Scheduler simply passes requests in the order they are received. This usually provides the best throughput, especially for storage hardware with its own queueing (such as a SAN or Storage Area Network, SSD or solid-state drives, and intelligent RAID controllers). This usually provides the worst latency, especially with SAN and RAID controllers with long I/O queues. But it's good for non-interactive compute jobs like scientific computing and rendering movie frames. It would also be a good choice for storage devices used for backup.
The kernel maintains a separate scheduler for each disk. Let's compare different distributions with simple shell loops. The kernel data structure lists the supported schedulers with the currently selected one in square brackets:
# lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: RedHatEnterpriseWorkstation
Description:    Red Hat Enterprise Linux Workstation release 7.0 (Maipo)
Release:        7.0
Codename:       Maipo
# for DEV in /sys/block/sd*/queue/scheduler
> do
>     echo $DEV
>     cat $DEV
> done
/sys/block/sda/queue/scheduler
noop [deadline] cfq
/sys/block/sdb/queue/scheduler
noop [deadline] cfq

# lsb_release -a
LSB Version:    *
Distributor ID: Mageia
Description:    Mageia 4
Release:        4
Codename:       thornicroft
# for DEV in /sys/block/sd*/queue/scheduler
> do
>     echo $DEV
>     cat $DEV
> done
/sys/block/sda/queue/scheduler
noop deadline [cfq]
/sys/block/sdb/queue/scheduler
noop deadline [cfq]
/sys/block/sdc/queue/scheduler
noop deadline [cfq]
/sys/block/sdd/queue/scheduler
noop deadline [cfq]
/sys/block/sde/queue/scheduler
noop deadline [cfq]
/sys/block/sdf/queue/scheduler
noop deadline [cfq]

# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 7.8 (wheezy)
Release:        7.8
Codename:       wheezy
# cat /sys/block/mmcblk0/queue/scheduler
noop [deadline] cfq

# lsb_release -a
LSB Version:    :core-4.1-arm:core-4.1-noarch
Distributor ID: Pidora
Description:    Pidora release 2014 (Raspberry Pi Fedora Remix)
Release:        2014
Codename:       RaspberryPiFedoraRemix
# cat /sys/block/mmcblk0/queue/scheduler
noop deadline [cfq]
Red Hat uses the Deadline scheduler (optimizing latency for
interactive use) for non-SATA disks, and CFQ (tradeoff between
latency and throughput) for SATA.
Mageia uses CFQ for all.
On Raspberry Pi where the solid-state disk is
/dev/mmcblk0
,
Raspbian (derived from Debian) uses Deadline
and Pidora (derived from Fedora) uses CFQ.
But you can easily change the kernel's I/O scheduler for individual disks to do performance experiments! Simply echo the appropriate string into the data structure. Don't worry, you can't ask for anything unsupported. I'll make an intentional typo in the second command to demonstrate what happens.
# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
# echo cqf > /sys/block/sda/queue/scheduler
bash: echo: write error: Invalid argument
# echo cfq > /sys/block/sda/queue/scheduler
# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
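For scripted experiments it helps to pull out the bracketed entry and to check a name against the offered list before writing it. The two functions below are a sketch; the names active_scheduler and set_scheduler, and the optional third argument giving an alternate sysfs root, are invented for this example.

```sh
#!/bin/sh
# active_scheduler   (hypothetical helper)
# Read a scheduler line on standard input, print the bracketed entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# set_scheduler DISK NAME [SYSROOT]   (hypothetical helper)
# Write NAME to the disk's scheduler file only if the kernel offers it.
set_scheduler() {
    file=${3:-/sys}/block/$1/queue/scheduler
    # Remove the brackets, then look for NAME as a whole word
    case " $(tr -d '[]' < "$file") " in
        *" $2 "*) printf '%s\n' "$2" > "$file" ;;
        *) echo "$2 is not offered for $1" >&2; return 1 ;;
    esac
}
```

For example, active_scheduler < /sys/block/sda/queue/scheduler prints just the current choice, and set_scheduler sda cfq must be run as root to take effect.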
Depending on the scheduler you select, some set of tunable
parameters appear under the iosched
subdirectory.
Of course, if you pick NOOP or no scheduler
then there is nothing to be tuned.
# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
# ls /sys/block/sda/queue/iosched/
back_seek_max      fifo_expire_sync  quantum         slice_idle
back_seek_penalty  group_idle        slice_async     slice_sync
fifo_expire_async  low_latency       slice_async_rq  target_latency
# echo deadline > /sys/block/sda/queue/scheduler
# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
# ls /sys/block/sda/queue/iosched/
fifo_batch  front_merges  read_expire  write_expire  writes_starved
# echo noop > /sys/block/sda/queue/scheduler
# cat /sys/block/sda/queue/scheduler
[noop] deadline cfq
# ls /sys/block/sda/queue/iosched/
In some specific situations you can improve performance
by modifying the low_latency
,
quantum
, and
slice_idle
parameters of the CFQ scheduler.
Again, you simply echo appropriate strings into the files.
With the Deadline scheduler, the read_expire
parameter might be reduced from its default of
500 (milliseconds) to 100 for better perceived performance.
Changes to the write_expire
and
fifo_batch
parameters are less likely to be
noticeably helpful in most situations.
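As an illustration of the read_expire change described above, here is a minimal sketch. The function name tune_read_expire and its optional third argument (an alternate sysfs root) are assumptions for this example. It refuses to do anything when the file is absent, which is the case whenever the Deadline scheduler is not the one selected for that disk.

```sh
#!/bin/sh
# tune_read_expire DISK MILLISECONDS [SYSROOT]   (hypothetical helper)
# Set the Deadline scheduler's read_expire and report the change.
tune_read_expire() {
    knob=${3:-/sys}/block/$1/queue/iosched/read_expire
    # The file only exists while Deadline is the active scheduler
    [ -w "$knob" ] || { echo "$1: no writable read_expire" >&2; return 1; }
    old=$(cat "$knob")
    printf '%s\n' "$2" > "$knob" || return 1
    printf '%s: read_expire %s -> %s\n' "$1" "$old" "$(cat "$knob")"
}
```

Run as root, tune_read_expire sda 100 would apply the value suggested above. The setting is lost at reboot, or when the scheduler for that disk is changed.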
For tuning Ethernet LAN performance,
you will be better off using the ethtool
command
to tune most of the parameters.
An exception would be if you want to increase the kernel input
queue length on busy systems connected to high-speed networks,
although that's another tradeoff of increasing throughput
at the expense of making latency worse.
You do that by echoing a string into
/proc/sys/net/core/netdev_max_backlog
,
in another memory-resident file system that maps
kernel parameters into user-navigable file system hierarchies.
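A cautious way to do that is to raise the value only when the requested number is larger than the current one. This is a sketch under stated assumptions: the function name raise_backlog and its optional second argument (an alternate root in place of /proc) are invented here, and a suitable value for any given system is something to measure, not something this example can supply.

```sh
#!/bin/sh
# raise_backlog NEW [PROCROOT]   (hypothetical helper)
# Raise netdev_max_backlog to NEW if NEW is larger; print the final value.
raise_backlog() {
    knob=${2:-/proc}/sys/net/core/netdev_max_backlog
    cur=$(cat "$knob") || return 1
    if [ "$1" -gt "$cur" ]; then
        printf '%s\n' "$1" > "$knob" || return 1
    fi
    cat "$knob"
}
```

Writing the file as root has the same effect as sysctl -w net.core.netdev_max_backlog=N, and like the sysfs changes above it does not persist across a reboot.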
Forcing Device Discovery Through Scans
Let's say you add some disks to a running system. Some platforms support hot-swappable disks. Or let's say you are running on top of virtualization and you created and attached some new virtual disk.
What will likely happen is that the bus controller will
notice that there is a new SCSI device, and the corresponding
hierarchy below /sys/devices/pci*
will be created.
But the device itself will not be automatically scanned,
recognized as to device class, and the corresponding new
/dev/sd*
disk device created.
There is no need to reboot!
The SCSI host class device has a scan
attribute file with the unusual permission of 0200.
It's a write-only file, the only thing you can do is
write to it as root
.
# find /sys/devices/ -name scan
/sys/devices/pci0000:00/0000:00:02.1/usb1/1-8/1-8:1.0/host10/scsi_host/host10/scan
/sys/devices/pci0000:00/0000:00:04.0/0000:01:09.2/usb2/2-5/2-5:1.0/host11/scsi_host/host11/scan
/sys/devices/pci0000:00/0000:00:06.0/ata5/host4/scsi_host/host4/scan
/sys/devices/pci0000:00/0000:00:06.0/ata6/host5/scsi_host/host5/scan
/sys/devices/pci0000:00/0000:00:08.0/ata1/host0/scsi_host/host0/scan
/sys/devices/pci0000:00/0000:00:08.0/ata2/host1/scsi_host/host1/scan
/sys/devices/pci0000:00/0000:00:08.1/ata3/host2/scsi_host/host2/scan
/sys/devices/pci0000:00/0000:00:08.1/ata4/host3/scsi_host/host3/scan
# ls -l $( find /sys/devices/ -name scan )
--w------- 1 root root 4096 Mar 25 17:17 /sys/devices/pci0000:00/0000:00:02.1/usb1/1-8/1-8:1.0/host10/scsi_host/host10/scan
--w------- 1 root root 4096 Mar 25 17:17 /sys/devices/pci0000:00/0000:00:04.0/0000:01:09.2/usb2/2-5/2-5:1.0/host11/scsi_host/host11/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:06.0/ata5/host4/scsi_host/host4/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:06.0/ata6/host5/scsi_host/host5/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:08.0/ata1/host0/scsi_host/host0/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:08.0/ata2/host1/scsi_host/host1/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:08.1/ata3/host2/scsi_host/host2/scan
--w------- 1 root root 4096 Mar 23 22:07 /sys/devices/pci0000:00/0000:00:08.1/ata4/host3/scsi_host/host3/scan
If you write a special string, "- - -
"
or three dashes separated by single spaces,
you cause a SCSI device scan and the new
/dev/sd*
device files will appear.
The three fields are the channel, target, and LUN to scan,
and a dash is a wildcard meaning all of them.
All you need is a simple shell loop.
This will also cause re-scans of the disks
already recognized, which causes no problem.
# for DEV in $( find /sys/devices/ -name scan )
> do
>     echo '- - -' > $DEV
> done
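A variation on that loop can also report which block devices appeared as a result. This is a sketch, not a polished tool: the function name rescan_scsi_hosts and its optional argument (an alternate sysfs root) are made up here. It reaches the same scan files through the shorter /sys/class/scsi_host/host*/scan paths, must be run as root, and a device that is slow to respond might not be listed yet when the function returns.

```sh
#!/bin/sh
# rescan_scsi_hosts [SYSROOT]   (hypothetical helper)
# Trigger a wildcard scan on every SCSI host, then list new block devices.
rescan_scsi_hosts() {
    sysroot=${1:-/sys}
    before=$(ls "$sysroot/class/block" 2>/dev/null)
    for scan in "$sysroot"/class/scsi_host/host*/scan; do
        [ -e "$scan" ] || continue
        # "- - -" means all channels, all targets, all LUNs
        echo '- - -' > "$scan"
    done
    after=$(ls "$sysroot/class/block" 2>/dev/null)
    # Report names present now that were not present before the scan
    for dev in $after; do
        case " $(echo $before) " in
            *" $dev "*) ;;
            *) echo "new: $dev" ;;
        esac
    done
}
```

When nothing new is found the function prints nothing, so its output can be tested directly in a script.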
To Explore Further
The best place to start is Patrick Mochel's paper The sysfs Filesystem.
The kernel documentation includes some short but useful
files on sysfs.
You may have this installed under
/usr/src/linux/Documentation/filesystems
or you can view these on-line:
sysfs.txt,
sysfs-pci.txt,
and
sysfs-tagging.txt.
For programmers, the API is described in more detail in
Ananth Mavinakayanahalli's and Daniel Stekloff's paper
Libsysfs — a programming interface to gather
device information in Linux
and in the text file libsysfs.txt
included
in a package named libsysfs-devel
or
lib64sysfs-devel
or similar.
The text file will likely end up in
/usr/share/doc/lib*sysfs-devel/libsysfs.txt
.
Olivier Daudel wrote a book for O'Reilly, /Proc et /sys. It was only in print for a short time, and seems to have never been translated out of French. All this is based on kernel internal data structures, and so the details will change with little to no warning as new kernel releases appear.