23/11/2001: This page is not valid. Please go to docum.org
I found this in the doc directory of the wrr project:
Introduction
============
This document describes some of the Traffic Control features of the
Linux 2.2 kernels. It was not written by someone who knows a lot about the
Linux kernel, so some of the information and terminology presented here may
be wrong, and things may be considerably more complicated than this text
suggests. I hope the information presented here will nevertheless be useful
to many users.
What you need
=============
Traffic Control is new in the Linux 2.2 kernels, so you will need a 2.2 kernel
or later to use it. You will also need to select the options in the section
"QoS and/or fair queuing" when compiling the kernel. Furthermore, you will
need the user space program called "tc" (Traffic Control). The program is
part of the iproute2+tc package, available from:
ftp://ftp.inr.ac.ru/ip-routing
The big picture
===============
Each NIC (Network Interface Controller = network card) in your Linux
box is handled by a Network Driver, which controls the hardware. Basically,
two things can be done with such network drivers:
1) The Linux Networking Code can request the network driver to send a
packet on the physical network.
2) The network driver can give packets it has received on the physical
network to the Linux Networking Code.
The Traffic Control features deal with the first of these only. That is, they
deal with the transmission of packets from your machine. Traditionally,
packets sent from your machine have traveled the following way:
+------------+    +------------+
|   Linux    |    |  Network   |
| Networking | -> |   Driver   | ->
|    Code    |    |            |
+------------+    +------------+
Note that the packets from the Networking Code could have been generated in
several different ways. For example, they could have been created at the
request of some application (Netscape, for example) running on your box. If
your machine is acting as a firewall, router or Ethernet bridge, they could
also have been received on one network interface and then sent out on another
interface by the Networking Code.
With Traffic Control an extra box is inserted in this picture:
+------------+    +------------+    +------------+
|   Linux    |    |  Traffic   |    |  Network   |
| Networking | -> |  Control   | -> |   Driver   | ->
|    Code    |    |            |    |            |
+------------+    +------------+    +------------+
An important thing to understand is that basically only the following can be
done with the Traffic Control box:
1) The Linux Networking Code can give a packet to the box.
2) The Network Driver can request a packet from the box.
What characterizes the Traffic Control box in a given setup is which packets it
decides to give to the Network Driver, in which order and at which speed.
Queuing disciplines
===================
When the Linux kernel starts up there is no Traffic Control box, and the picture
looks like the first drawing above.
But suppose we want to insert a so-called FIFO queue of 10 packets in the box.
The properties of such a queue are:
1) Packets are given to the Network Driver in the same order as they
arrived from the Networking Code.
2) Up to 10 packets can be stored in the queue.
You can think of a FIFO queue as a queue of humans in which there is only room
for 10 people. We are just queuing packets instead of humans. Note that it is
not very useful to set up such a queue, since the Network Driver itself has
a queue of maybe 100 packets.
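The behaviour of such a queue can be sketched in a few lines of Python. This is only a toy model and not the kernel code: the class name PacketFifo and its method names are made up for illustration, and it assumes that a packet arriving at a full queue is simply dropped.

```python
from collections import deque


class PacketFifo:
    """Toy model of a packet-limited FIFO queue (not the kernel code)."""

    def __init__(self, limit=10):
        self.limit = limit      # maximum number of packets held at once
        self.queue = deque()
        self.dropped = 0        # packets refused because the queue was full

    def enqueue(self, packet):
        """The Networking Code gives a packet to the queue."""
        if len(self.queue) >= self.limit:
            self.dropped += 1   # assumption: a full queue drops the newcomer
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """The Network Driver requests a packet; None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


fifo = PacketFifo(limit=10)
accepted = [fifo.enqueue(f"pkt{i}") for i in range(12)]  # 12 arrivals, room for 10
first_out = fifo.dequeue()                               # the oldest packet leaves first
```

With 12 arrivals and nothing taken out in between, the last two packets do not fit, and the first packet handed to the driver is the first one that arrived.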
To set up such a FIFO queue for the Traffic Control box on eth0, use the command:
tc qdisc add dev eth0 root pfifo limit 10
The FIFO queue is an example of a so-called qdisc, which is an abbreviation for
queuing discipline. Queuing disciplines are the most fundamental concept in
the Traffic Control functions of Linux. A qdisc has the same properties as the
Traffic Control box, that is:
1) You can give a packet to it (this is called enqueuing a packet)
2) You can request a packet from it (this is called dequeuing a packet)
So in fact, the Traffic Control box can be viewed as a socket into which you
can plug an arbitrary queuing discipline. The root keyword in the command
above tells tc that the pfifo queue should be put into this socket.
Now try to type:
tc qdisc ls dev eth0
You will see a line describing what you have just created:
qdisc pfifo 8001: dev eth0 limit 10p
This illustrates another aspect of qdiscs: all qdiscs are given a handle, in
this case the handle "8001:" (or "8001:0").
You can specify a handle when you create a qdisc. If you don't, the system
will pick one, which is what has happened here. The handle of a qdisc is a
hexadecimal number of up to four digits followed by a colon.
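To make the notation concrete, here is a small Python sketch that splits a handle into its two hexadecimal parts. The function name parse_handle is made up for this illustration; it is not part of tc.

```python
def parse_handle(text):
    """Split a handle such as "8001:" or "8001:2" into (major, minor)."""
    major, _, minor = text.partition(":")
    # Both parts are hexadecimal; a missing part after the colon means 0,
    # which is why "8001:" and "8001:0" name the same thing.
    return int(major, 16), (int(minor, 16) if minor else 0)


qdisc_handle = parse_handle("8001:")
same_handle = parse_handle("8001:0")
```

Both calls give the same pair of numbers, with 0 as the second part.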
Finally, you can delete the qdisc again with the command:
tc qdisc del dev eth0 root
The bfifo, sfq and tbf queuing disciplines
==========================================
Many other qdiscs exist in the Linux Traffic Control code besides the
pfifo queuing discipline.
The bfifo queue is like pfifo, but instead of containing a limited number
of packets it will at most contain a limited number of bytes.
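The difference from a packet-counting queue can be sketched as follows. This is a toy Python model with made-up names (ByteFifo, limit_bytes); packets are represented only by their size in bytes, and a packet that would push the queue over its limit is assumed to be dropped.

```python
from collections import deque


class ByteFifo:
    """Toy model of a byte-limited FIFO; a packet is just its size in bytes."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.backlog = 0        # bytes currently waiting in the queue
        self.queue = deque()

    def enqueue(self, size):
        if self.backlog + size > self.limit_bytes:
            return False        # would exceed the byte limit: drop
        self.queue.append(size)
        self.backlog += size
        return True

    def dequeue(self):
        if not self.queue:
            return None
        size = self.queue.popleft()
        self.backlog -= size
        return size


q = ByteFifo(limit_bytes=3000)
results = [q.enqueue(size) for size in (1500, 1000, 1500, 400)]
```

Here the third packet is refused because it would bring the backlog to 4000 bytes, while the smaller fourth packet still fits. A packet-counting queue would not make that distinction.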
The sfq (Stochastic Fairness Queueing) qdisc is more advanced. As far as I have
understood, it divides the packets into so-called flows. That is, if you open
two different TCP connections, packets from these two connections will probably
be considered as belonging to two different flows. It then tries to distribute
bandwidth equally between the different flows.
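The idea can be sketched in Python as below. This is a heavily simplified model and not the real sfq algorithm: the class ToySfq is made up, a flow is identified only by a pair of port numbers, the hash is a plain sum, and the flows are served one packet at a time in turn.

```python
from collections import OrderedDict, deque


class ToySfq:
    """Toy model: hash packets into per-flow queues, serve the queues in turn."""

    def __init__(self, buckets=128):
        self.buckets = buckets
        self.flows = OrderedDict()   # bucket number -> queue, in service order

    def enqueue(self, flow_key, packet):
        # Toy hash for illustration only. Two flows may land in the same
        # bucket and then share a queue, hence "stochastic".
        bucket = sum(flow_key) % self.buckets
        self.flows.setdefault(bucket, deque()).append(packet)

    def dequeue(self):
        if not self.flows:
            return None
        bucket, queue = next(iter(self.flows.items()))
        packet = queue.popleft()
        if queue:
            self.flows.move_to_end(bucket)   # this flow waits for its next turn
        else:
            del self.flows[bucket]
        return packet


sfq = ToySfq()
flow_a = (40000, 80)   # hypothetical (source port, destination port) pairs
flow_b = (40001, 80)
for name in ("a1", "a2", "a3"):
    sfq.enqueue(flow_a, name)
sfq.enqueue(flow_b, "b1")
order = [sfq.dequeue() for _ in range(4)]
```

Although the packet of the second flow arrived last, it is sent second. A plain FIFO would have made it wait behind all three packets of the first flow.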
The tbf (Token Bucket Filter) queue can be used to limit bandwidth. It is not
possible to take packets from a tbf queue at a rate greater than the one you
specify. For example, if you don't want packets to be sent faster than
128 Kbit/s, you can use a tbf queue.
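The token bucket idea behind this can be sketched in Python. This is only a model of the principle, not of the kernel implementation: the class and parameter names are made up, 128 Kbit/s is taken to mean 128000 bits per second, and the burst size of 1600 bytes is an arbitrary choice for the example.

```python
class TokenBucket:
    """Toy token bucket: tokens are bytes that may be sent right now."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # the bucket never holds more than this
        self.tokens = burst_bytes    # start with a full bucket
        self.last = 0.0              # time of the previous call, in seconds

    def may_send(self, size, now):
        """Return True if a packet of `size` bytes may leave at time `now`."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size > self.tokens:
            return False             # not enough tokens yet: the packet waits
        self.tokens -= size
        return True


bucket = TokenBucket(rate_bps=128_000, burst_bytes=1600)
results = [bucket.may_send(1500, now) for now in (0.0, 0.0, 0.05, 0.1)]
```

The first 1500 byte packet leaves at once because the bucket starts full. The next one has to wait until roughly a tenth of a second has passed, which is the time it takes to collect enough tokens at 16000 bytes per second.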
If you want to attach one of these qdiscs to your network interface, you can
start with a command of the form:
tc qdisc add tbf help
where tbf is replaced by the name of the qdisc you are interested in.
This will print a short description of the extra parameters the queue
takes. The queue can then be installed as in the pfifo example above. Note
that you need to delete any existing root qdisc before you can create a new
one in the way described above.
Classes
=======
The basic building blocks of the Traffic Control system are qdiscs, as described
above. The queuing disciplines mentioned in the previous section are
comparatively simple to deal with. That is, they are set up and maybe given
some parameters. Afterwards packets can be enqueued to them and dequeued from
them as described.
But many queuing disciplines are of a different nature. These qdiscs do not
store packets themselves. Instead, they contain other qdiscs, which they give
packets to and take packets from. Such qdiscs are known as qdiscs with classes.
For example, one could imagine a priority-based queuing discipline with the
following properties:
1) Each packet enqueued to the queuing discipline is assigned a priority. For
example the priority could be deduced from the source or destination IP
address of the packet. Let us say that the priority is a number
between 1 and 5.
2) When asked to dequeue, the queuing discipline always hands out the packet
it contains with the lowest priority number.
A way to implement such a queuing discipline is to make the priority-based
queuing discipline contain 5 other queuing disciplines numbered from 1 to 5.
The priority-based queuing discipline will then do the following:
1) When a packet is enqueued, it calculates the priority number, i.e. a number
between 1 and 5. It then enqueues the packet to the queuing discipline
indicated by this number.
2) When a packet is dequeued, it always dequeues from the non-empty queuing
discipline with the lowest number.
What is interesting about this is that the 5 contained queuing disciplines
could be arbitrary queuing disciplines, for example sfq queues or any other
kind of queue.
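The priority-based queuing discipline described above can be sketched in Python as follows. It is a toy model with made-up names (Fifo, ToyPrio, classify), not the code of any real qdisc. Each of the 5 inner queues here is a plain FIFO, but any object with the same enqueue and dequeue methods could take its place.

```python
from collections import deque


class Fifo:
    """Simplest possible inner queue."""

    def __init__(self):
        self.packets = deque()

    def enqueue(self, packet):
        self.packets.append(packet)

    def dequeue(self):
        return self.packets.popleft() if self.packets else None


class ToyPrio:
    """Stores no packets itself; it only hands them to its inner queues."""

    def __init__(self, classify, classes=5):
        self.classify = classify   # function: packet -> number from 1 to classes
        self.children = {n: Fifo() for n in range(1, classes + 1)}

    def enqueue(self, packet):
        self.children[self.classify(packet)].enqueue(packet)

    def dequeue(self):
        # Always serve the non-empty inner queue with the lowest number.
        for number in sorted(self.children):
            packet = self.children[number].dequeue()
            if packet is not None:
                return packet
        return None


prio = ToyPrio(classify=lambda packet: packet["priority"])
for name, priority in [("bulk", 5), ("web", 3), ("ssh", 1), ("mail", 3)]:
    prio.enqueue({"name": name, "priority": priority})
order = [prio.dequeue()["name"] for _ in range(4)]
```

The packets come out sorted by priority number, and packets with the same number keep their arrival order because the inner queue is a FIFO.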
In Linux this concept is handled by classes. That is, a queuing discipline
might contain classes. In this example, the priority queuing discipline has 5
classes. Each class can be viewed as a socket into which you can plug any
other queuing discipline. When a qdisc with classes is created, it will
typically assign simple FIFO queues to the classes it contains. But these can
be replaced with other qdiscs using the tc program.
There is a qdisc called prio, which I believe does exactly what is described
here. I have not used it myself, however.
If a qdisc has the handle "8001:", its classes are given handles of the form
"8001:wxyz", where wxyz is a non-zero hexadecimal number. So in this example
the prio qdisc might have the handle "8001:" and its classes the
handles "8001:1" to "8001:5". To insert an sfq queue in the socket of the
class "8001:2" you can use the command:
tc qdisc add dev eth0 parent 8001:2 sfq
Filters
=======
The last main concept that needs to be explained is filters. In the example
with the priority qdisc above, it was said that the qdisc could select a
priority depending on, for example, the IP addresses of the packets. The job of
the filters is exactly to map packets to classes. That is, if a qdisc contains
classes, you will typically also be able to assign a filter to that qdisc. Each
time a packet is enqueued to the qdisc, the qdisc will ask the filter which
class this packet should go to. The qdisc will then enqueue the packet to the
qdisc plugged into that class.
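What a filter does can be sketched in Python like this. The rule table, the addresses and the class handles are all made up for the example, and real filters can look at much more than the destination address. The sketch only shows the mapping from a packet to a class.

```python
import ipaddress

# Hypothetical rules: (destination network, class handle), checked in order.
FILTERS = [
    (ipaddress.ip_network("10.0.0.0/24"), "8001:1"),
    (ipaddress.ip_network("10.0.0.0/8"), "8001:2"),
]
DEFAULT_CLASS = "8001:5"   # assumption: unmatched packets go to the last class


def classify(dst_ip):
    """Return the handle of the class a packet to dst_ip should go to."""
    address = ipaddress.ip_address(dst_ip)
    for network, class_id in FILTERS:
        if address in network:
            return class_id    # the first matching rule wins
    return DEFAULT_CLASS


chosen = [classify(ip) for ip in ("10.0.0.7", "10.9.9.9", "192.168.1.1")]
```

The first address matches the first rule, the second address only matches the wider network of the second rule, and the third address matches nothing and falls back to the default class.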
So what typically distinguishes qdiscs with classes from each other is which
class they decide to dequeue a packet from when they are asked to dequeue.
The author of this document has not experimented with filters. But as far as
he can figure out, some of the qdiscs (cbq and prio) seem to be able both to
handle filters as described here AND to use some kind of built-in filter.
Other material
==============
For more information, look at:
http://qos.ittc.ukans.edu
and at
http://www.ds9a.nl/2.4Routing/
-------------------------------------------------------------------
This document was written by Christian Worm Mortensen, worm@diku.dk