Linux uses the term bonding for two different things
In Suse bonding can be configured via YaST, but often the resulting config files are incomplete: the most typical problem is that out of two cards which you want to bond together only one is visible, the second needs to be added manually in config files. Here is the outline on how to do it.
Submitted by Rachelsdad on 26 October 2009 - 11:30am.
The Linux bonding driver originally came from Donald Becker's beowulf patches for kernel 2.0. It evolved considerably since that time.
When bonding NICs under various flavors of SuSE, I tend to use YaST (on other distros, I just hack the config files). To do this under YaST:
Leave the two (or more) NICs to be bonded (slaves) unconfigured, and add an interface.
Configure the interface as a Bond Network. Assign the static IP address and subnet mask to this interface, and set the default route.
On the main configuration page for the interface, the two (or more) unconfigured NICs should be listed with checkboxes next to them to set them as bonding slaves. Check the desired NICs, and then set the Bond Driver Options as appropriate: either select a preconfigured option from the drop-down, say, mode=balance-rr miimon=100, or type your own options. Save this, and leave the physical NICs which are to be the slaves unconfigured.
Finish your YaST network device configuration, and you should end up with a bonded pair (or trio, etc.).
RHEL and SLES kernels ship with the bonding driver already available as a module, and the ifenslave user-level control program is installed.
The available bonding driver parameters are listed below. If a parameter is not specified the default value is used. When initially configuring a bond, it is recommended "tail -f /var/log/messages" be run in a separate window to watch for bonding driver error messages.
At least the miimon parameter should be specified in typical configurations.
Options with textual values will accept either the text name or, for backwards compatibility, the option value. E.g., "mode=802.3ad" and "mode=4" set the same mode.
The parameters are as follows:
mode
Specifies one of the bonding policies. Possible values are:
- balance-rr or 0 Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance. For example:
BONDING_MODULE_OPTS='miimon=100 mode=0 use_carrier=0'
BONDING_SLAVE0='eth-id-00:26:55:29:4f:0e'
BONDING_SLAVE1='eth-id-00:26:55:29:4f:0c'
- active-backup or 1 Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and each VLAN interface configured above it, provided that the interface has at least one IP address configured.
Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN id. This mode provides fault tolerance. The primary option, documented below, affects the behavior of this mode.
- balance-xor or 2 XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple XOR of the source and destination MAC addresses, taken modulo the slave count. Alternate transmit policies may be selected via the xmit_hash_policy option. This mode provides load balancing and fault tolerance.
- broadcast or 3 Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.
- 802.3ad or 4 IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option, documented below. Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.
Prerequisites:
- Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
- A switch that supports IEEE 802.3ad Dynamic link aggregation. Most switches will require some type of configuration to enable 802.3ad mode.
- balance-tlb or 5 Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
- Prerequisite: Ethtool support in the base drivers for retrieving the speed of each slave.
- balance-alb or 6 Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation.
- The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.
- Receive traffic from connections created by the server is also balanced. When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet.
- When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond.
- A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond. Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond.
- When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected mac address to each of the clients. The updelay parameter (detailed below) must be set to a value equal or greater than the switch's forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch.
Prerequisites:
Ethtool support in the base drivers for retrieving the speed of each slave.
Base driver support for setting the hardware address of a device while it is open. This is required so that there will always be one slave in the team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond. If the curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen.
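Because every mode can be given either by name or by number, scripts that generate bonding options sometimes need to normalize one form into the other. The following is a minimal sketch of such a lookup, built only from the name and number pairs listed above. The function name bond_mode_number is hypothetical and is not part of the bonding driver or ifenslave.

```shell
# bond_mode_number MODE
# Prints the numeric value of a bonding mode given by name (or by number).
# Hypothetical helper for illustration; the pairs come from the mode list above.
bond_mode_number() {
    case "$1" in
        balance-rr)    echo 0 ;;
        active-backup) echo 1 ;;
        balance-xor)   echo 2 ;;
        broadcast)     echo 3 ;;
        802.3ad)       echo 4 ;;
        balance-tlb)   echo 5 ;;
        balance-alb)   echo 6 ;;
        [0-6])         echo "$1" ;;   # already numeric: pass it through
        *)             echo "unknown bonding mode: $1" >&2; return 1 ;;
    esac
}
```

For instance, bond_mode_number 802.3ad prints 4, which matches the statement that "mode=802.3ad" and "mode=4" set the same mode.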
use_carrier
If bonding insists that the link is up when it should not be, it may be that your network device driver does not support netif_carrier_on/off. The default state for netif_carrier is "carrier on," so if a driver does not support netif_carrier, it will appear as if the link is always up. In this case, setting use_carrier to 0 will cause bonding to revert to the MII / ETHTOOL ioctl method to determine the link state.
A value of 1 enables the use of netif_carrier_ok(); a value of 0 will use the deprecated MII / ETHTOOL ioctls. The default value is 1.
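xmit_hash_policy
Selects the transmit hash policy used for slave selection in balance-xor and 802.3ad modes. Possible values are:
layer2 Uses XOR of hardware MAC addresses to generate the hash.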
layer3+4 This policy uses upper layer protocol information, when available, to generate the hash. This allows for traffic to a particular network peer to span multiple slaves, although a single connection will not span multiple slaves.
For fragmented TCP or UDP packets and all other IP protocol traffic, the source and destination port information is omitted. For non-IP traffic, the formula is the same as for the layer2 transmit hash policy.
This policy is intended to mimic the behavior of certain switches, notably Cisco switches with PFC2 as well as some Foundry and IBM products.
This algorithm is not fully 802.3ad compliant. A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets striped across two interfaces. This may result in out of order delivery. Most traffic types will not meet this criteria, as TCP rarely fragments traffic, and most UDP traffic is not involved in extended conversations. Other implementations of 802.3ad may or may not tolerate this noncompliance.
The default value is layer2. This option was added in bonding version 2.6.3. In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy.
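To make the layer2 policy more concrete, here is a rough sketch of how an XOR of MAC addresses can select a slave. It assumes, as a simplification that may not hold for every driver version, that only the last octet of each address enters the XOR and that the result is taken modulo the number of slaves. The function name layer2_slave is made up for this illustration.

```shell
# layer2_slave SRC_MAC DST_MAC SLAVE_COUNT
# Prints the index (starting at 0) of the slave that would carry the frame.
# Assumption: hash = (last octet of SRC XOR last octet of DST) % SLAVE_COUNT.
layer2_slave() {
    src_octet=${1##*:}    # text after the last colon, e.g. "0e"
    dst_octet=${2##*:}
    echo $(( (0x$src_octet ^ 0x$dst_octet) % $3 ))
}
```

Under this assumption, all traffic between one pair of MAC addresses always maps to the same slave, which is why layer2 keeps a conversation in order but cannot spread a single peer across several links.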
Next, to determine if your installation supports bonding, issue the command:
$ grep ifenslave /sbin/ifup
If this returns any matches, then your initscripts or sysconfig has support for bonding.
The SuSE SLES 9 and 10 network configuration system supports bonding, and the YaST system configuration front end provides a means to work with bonding devices.
First, if they have not already been configured, configure the slave devices. On SLES this is most easily done by running the yast2 sysconfig configuration utility. The name of the configuration file for each device will be of the form:
ifcfg-id-xx:xx:xx:xx:xx:xx
Before editing, the file will contain multiple lines, and will look something like this:
BOOTPROTO='dhcp'
STARTMODE='on'
USERCTL='no'
UNIQUE='XNzu.WeZGOGF+4wE'
_nm_name='bus-pci-0001:61:01.0'
Change the BOOTPROTO and STARTMODE lines to the following:
BOOTPROTO='none'
STARTMODE='off'
Do not alter the UNIQUE or _nm_name lines. Remove any other lines (USERCTL, etc).
Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified, it's time to create the configuration file for the bonding device itself. This file is named ifcfg-bondX, where X is the number of the bonding device to create, starting at 0. The first such file is ifcfg-bond0, the second is ifcfg-bond1, and so on. The sysconfig network configuration system will correctly start multiple instances of bonding.
The contents of the ifcfg-bondX file is as follows:
BOOTPROTO="static"
BROADCAST="10.0.2.255"
IPADDR="10.0.2.10"
NETMASK="255.255.255.0"
NETWORK="10.0.2.0"
REMOTE_IPADDR=""
STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=active-backup miimon=100"
BONDING_SLAVE0="eth0"
BONDING_SLAVE1="bus-pci-0000:06:08.1"
Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK values with the appropriate values for your network.
The STARTMODE specifies when the device is brought online.
The possible values are:
The line BONDING_MASTER='yes' indicates that the device is a bonding master device. The only useful value is "yes."
The contents of BONDING_MODULE_OPTS are supplied to the instance of the bonding module for this device. Specify the options for the bonding mode, link monitoring, and so on here. Do not include the max_bonds bonding parameter; this will confuse the configuration system if you have multiple bonding devices.
Finally, supply one BONDING_SLAVEn="slave device" for each slave, where "n" is an increasing value, one for each slave. The "slave device" is either an interface name, e.g., "eth0", or a device specifier for the network device. The interface name is easier to find, but the ethN names are subject to change at boot time if, e.g., a device early in the sequence has failed. The device specifiers (bus-pci-0000:06:08.1 in the example above) specify the physical network device, and will not change unless the device's bus location changes (for example, it is moved from one PCI slot to another). The example above uses one of each type for demonstration purposes; most configurations will choose one or the other for all slave devices.
When all configuration files have been modified or created, networking must be restarted for the configuration changes to take effect. This can be accomplished via the following:
# /etc/init.d/network restart
Note that the network control script (/sbin/ifdown) will remove the bonding module as part of the network shutdown processing, so it is not necessary to remove the module by hand if, e.g., the module parameters have changed.
Additional general options and details of the ifcfg file format can be found in an example ifcfg template file:
/etc/sysconfig/network/ifcfg.template
Note that the template does not document the various BONDING settings described above, but does describe many of the other options.
This section applies to distros using a version of initscripts with bonding support, for example, Red Hat Enterprise Linux version 4. On these systems, the network initialization scripts have some knowledge of bonding, and can be configured to control bonding devices, but they do not automatically load the network adapter driver unless the ethX device is configured with an IP address. Because of this constraint, users must manually configure a network-script file for all physical adapters that will be members of a bondX link. Network script files are located in the directory:
/etc/sysconfig/network-scripts
The file name must be prefixed with "ifcfg-eth" and suffixed with the adapter's physical adapter number. For example, the script for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0. Place the following text in the file:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
The DEVICE= line will be different for every ethX device and must correspond with the name of the file, i.e., ifcfg-eth1 must have a device line of DEVICE=eth1. The setting of the MASTER= line will also depend on the final bonding interface name chosen for your bond. As with other network devices, these typically start at 0, and go up one for each device, i.e., the first bonding instance is bond0, the second is bond1, and so on.
Next, create a bond network script. The file name for this script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is the number of the bond. For bond0 the file is named "ifcfg-bond0", for bond1 it is named "ifcfg-bond1", and so on. Within that file, place the following text:
DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
Be sure to change the networking specific lines (IPADDR,
NETMASK, NETWORK and BROADCAST) to match your network configuration.
Finally, it is necessary to edit /etc/modules.conf (or
/etc/modprobe.conf, depending upon your distro) to load the bonding
module with your desired options when the bond0 interface is brought
up. The following lines in /etc/modules.conf (or modprobe.conf) will
load the bonding module, and select its options:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
Replace the sample parameters with the appropriate set of options for your configuration.
Finally run "/etc/rc.d/init.d/network restart" as root. This
will restart the networking subsystem and your bond link should be now
up and running.
At this writing, the initscripts package does not directly support loading the bonding driver multiple times, so the process for doing so is the same as described in the "Configuring Multiple Bonds Manually" section, below.
NOTE: It has been observed that some Red Hat supplied kernels are apparently unable to rename modules at load time (the "-obonding1" part). Attempts to pass that option to modprobe will produce an "Operation not permitted" error. This has been reported on some Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels exhibiting this problem, it will be impossible to configure multiple bonds with differing parameters.
Red Hat Enterprise Linux 5 supports multiple bonds, even when they are configured with different modes. Edit /etc/modprobe.conf and add:
alias bond0 bonding
alias bond1 bonding
options bonding max_bonds=2
Edit or create the file /etc/sysconfig/network-scripts/ifcfg-bond0. The configuration is the same as before except for one added option, BONDING_OPTS:
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=dhcp
USERCTL=no
BONDING_OPTS="mode=1 miimon=100 primary=eth0"
The second device can use a different mode, for example mode=0. Edit or create the file /etc/sysconfig/network-scripts/ifcfg-bond1 and specify its bonding options:
DEVICE=bond1
ONBOOT=yes
BOOTPROTO=dhcp
USERCTL=no
BONDING_OPTS="mode=0 miimon=100"
This section applies to distros that have /etc/net already integrated, or to hand-made /etc/net installations. Bonding interfaces are ordinary /etc/net interfaces; the only thing you need to do is decide which interfaces you will assign to the bond and which bond options you will use. In this example we will set up a high-availability Ethernet bond from two Ethernet cards. /etc/net keeps information about interfaces in
/etc/net/ifaces
First of all we have to create a configuration directory for each interface involved in configuration:
# mkdir /etc/net/ifaces/primary
# mkdir /etc/net/ifaces/backup
# mkdir /etc/net/ifaces/failover
Then we will fill options files for ethernet interfaces:
# cat > /etc/net/ifaces/primary/options
TYPE=eth
MODULE=e100
^D
# cat > /etc/net/ifaces/backup/options
TYPE=eth
MODULE=e100
^D
# cat >> /etc/net/iftab
primary mac 00:10:dc:9e:af:d5
backup mac 00:10:dc:9e:af:d6
^D
We have configured two ethernet cards and fixed their names with iftab. Now it's time to configure bonding:
# cat > /etc/net/ifaces/failover/options
TYPE=bond
BONDMODE=1
HOST='primary backup'
BONDOPTIONS='use_carrier=1 miimon=100 primary=primary'
^D
# cat > /etc/net/ifaces/failover/ipv4address
192.168.1.1/24
^D
# cat > /etc/net/ifaces/failover/ipv4route
default via 192.168.1.254
^D
After that the only thing we have to do is
# ifup failover
/etc/net will automatically discover (from HOST option) the correct order of initialization. You can configure as many bonds as you need. DHCP is currently not supported for bonding interfaces in /etc/net.
Each bonding device has a read-only file residing in the /proc/net/bonding directory. The file contents include information about the bonding configuration, options and state of each slave.
For example, the contents of /proc/net/bonding/bond0 after the driver is loaded with parameters of mode=0 and miimon=1000 is generally as follows:
Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004)
Bonding Mode: load balancing (round-robin)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 1

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
The precise format and contents will change depending upon the bonding configuration, state, and version of the bonding driver.
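Since the file is plain text, the per-slave state can be pulled out with a few lines of awk. The sketch below assumes the "Slave Interface:" and "MII Status:" labels shown in the sample output for driver 2.6.1. Other driver versions may label or order the fields differently, so treat the patterns as an assumption. The function name bond_slave_status is hypothetical.

```shell
# bond_slave_status FILE
# Prints one "slave status" line for each "Slave Interface:" stanza in FILE,
# e.g. for FILE=/proc/net/bonding/bond0.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        # The bond-level "MII Status:" line comes before any slave stanza,
        # so it is skipped while no slave name has been seen yet.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$1"
}
```

A typical call would be bond_slave_status /proc/net/bonding/bond0, run on a host where the bonding driver is loaded.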
The network configuration can be inspected using the ifconfig command.
Bonding devices will have the MASTER flag set; Bonding slave devices will
have the SLAVE flag set. The ifconfig output does not
contain information on which slaves are associated with which masters.
In the example below, the bond0 interface is the master (MASTER) while eth0 and eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address (HWaddr) as bond0 for all modes except TLB and ALB that require a unique MAC address for each slave.
# /sbin/ifconfig
bond0     Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:0

eth0      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x1080

eth1      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:9 Base address:0x1400
For this section, "switch" refers to whatever system the bonded devices are directly connected to (i.e., where the other end of the cable plugs into). This may be an actual dedicated switch device, or it may be another regular system (e.g., another computer running Linux).
The active-backup, balance-tlb and balance-alb modes do not require any specific configuration of the switch.
The 802.3ad mode requires that the switch have the appropriate ports configured as an 802.3ad aggregation. The precise method used to configure this varies from switch to switch, but, for example, a Cisco 3550 series switch requires that the appropriate ports first be grouped together in a single etherchannel instance, then that etherchannel is set to mode "lacp" to enable 802.3ad (instead of standard EtherChannel).
The balance-rr, balance-xor and broadcast modes generally require that the switch have the appropriate ports grouped together. The nomenclature for such a group differs between switches; it may be called an "etherchannel" (as in the Cisco example, above), a "trunk group" or some other similar variation. For these modes, each switch will also have its own configuration options for the switch's transmit policy to the bond. Typical choices include XOR of either the MAC or IP addresses. The transmit policy of the two peers does not need to match. For these three modes, the bonding mode really selects a transmit policy for an EtherChannel group; all three will interoperate with another EtherChannel group.
It is possible to configure VLAN devices over a bond interface using the 8021q driver. However, only packets coming from the 8021q driver and passing through bonding will be tagged by default. Self generated packets, for example, bonding's learning packets or ARP packets generated by either ALB mode or the ARP monitor mechanism, are tagged internally by bonding itself. As a result, bonding must "learn" the VLAN IDs configured above it, and use those IDs to tag self generated packets.
For reasons of simplicity, and to support the use of adapters that can do VLAN hardware acceleration offloading, the bonding interface declares itself as fully hardware offloading capable, it gets the add_vid/kill_vid notifications to gather the necessary information, and it propagates those actions to the slaves. In case of mixed adapter types, hardware accelerated tagged packets that should go through an adapter that is not offloading capable are "un-accelerated" by the bonding driver so the VLAN tag sits in the regular location.
VLAN interfaces must be added on top of a bonding interface only after enslaving at least one slave. The bonding interface has a hardware address of 00:00:00:00:00:00 until the first slave is added. If the VLAN interface is created prior to the first enslavement, it would pick up the all-zeroes hardware address. Once the first slave is attached to the bond, the bond device itself will pick up the slave's hardware address, which is then available for the VLAN device.
Also, be aware that a similar problem can occur if all slaves are released from a bond that still has one or more VLAN interfaces on top of it. When a new slave is added, the bonding interface will obtain its hardware address from the first slave, which might not match the hardware address of the VLAN interfaces (which was ultimately copied from an earlier slave).
There are two methods to ensure that the VLAN device operates with the correct hardware address if all slaves are removed from a bond interface:
Note that changing a VLAN interface's HW address would set the underlying device (i.e., the bonding interface) to promiscuous mode, which might not be what you want.
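The all-zeroes hardware address described above is easy to test for before a VLAN interface is created on top of a bond. The following is only a sketch: the function name bond_has_mac is hypothetical, and the caller is assumed to supply the address, for example by reading /sys/class/net/bond0/address on a system where sysfs is mounted.

```shell
# bond_has_mac MAC
# Succeeds (exit status 0) if MAC is a non-empty address other than the
# all-zeroes placeholder that a bond carries before its first slave is added.
bond_has_mac() {
    [ -n "$1" ] && [ "$1" != "00:00:00:00:00:00" ]
}
```

A guard along the lines of bond_has_mac "$(cat /sys/class/net/bond0/address)" && vconfig add bond0 100 would then create the VLAN only after a slave has been enslaved. The VLAN ID 100 here is an arbitrary example value.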
The bonding driver at present supports two schemes for monitoring a slave device's link state: the ARP monitor and the MII monitor. At the present time, due to implementation restrictions in the bonding driver itself, it is not possible to enable both ARP and MII monitoring simultaneously.
The ARP monitor operates as its name suggests: it sends ARP queries to one or more designated peer systems on the network, and uses the response as an indication that the link is operating. This gives some assurance that traffic is actually flowing to and from one or more peers on the local network.
The ARP monitor relies on the device driver itself to verify that traffic is flowing. In particular, the driver must keep up to date the last receive time, dev->last_rx, and transmit start time, dev->trans_start. If these are not updated by the driver, then the ARP monitor will immediately fail any slaves using that driver, and those slaves will stay down. If network monitoring (tcpdump, etc.) shows the ARP requests and replies on the network, then it may be that your device driver is not updating last_rx and trans_start.
While ARP monitoring can be done with just one target, it can be useful in a High Availability setup to have several targets to monitor. In the case of just one target, the target itself may go down or have a problem making it unresponsive to ARP requests. Having an additional target (or several) increases the reliability of the ARP monitoring.
Multiple ARP targets must be separated by commas as follows:
# example options for ARP monitoring with three targets
alias bond0 bonding
options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9
For just a single target the options would resemble:
# example options for ARP monitoring with one target
alias bond0 bonding
options bond0 arp_interval=60 arp_ip_target=192.168.0.100
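When the target list is maintained elsewhere (an inventory file, for instance), the comma-separated form can be generated rather than typed. This is a small sketch that produces the same two modules.conf lines as the examples above. The function name bond_arp_options is hypothetical, and no validation of the addresses is attempted.

```shell
# bond_arp_options BOND INTERVAL_MS TARGET [TARGET...]
# Prints the alias and options lines for ARP monitoring of BOND.
bond_arp_options() {
    if [ "$#" -lt 3 ]; then
        echo "usage: bond_arp_options BOND INTERVAL_MS TARGET..." >&2
        return 1
    fi
    bond=$1
    interval=$2
    shift 2
    targets=$(IFS=,; echo "$*")    # join the remaining arguments with commas
    printf 'alias %s bonding\noptions %s arp_interval=%s arp_ip_target=%s\n' \
        "$bond" "$bond" "$interval" "$targets"
}
```

The output is meant to be reviewed and then placed in /etc/modules.conf or /etc/modprobe.conf by hand. The sketch does not edit either file.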
The MII monitor monitors only the carrier state of the local network interface. It accomplishes this in one of three ways: by depending upon the device driver to maintain its carrier state, by querying the device's MII registers, or by making an ethtool query to the device.
If the use_carrier module parameter is 1 (the default value), then the MII monitor will rely on the driver for carrier state information (via the netif_carrier subsystem). As explained in the use_carrier parameter information, above, if the MII monitor fails to detect carrier loss on the device (e.g., when the cable is physically disconnected), it may be that the driver does not support netif_carrier.
If use_carrier is 0, then the MII monitor will first query the device's MII registers (via ioctl) and check the link state. If that request fails (not just that it returns carrier down), then the MII monitor will make an ethtool ETHTOOL_GLINK request to attempt to obtain the same information. If both methods fail (i.e., the driver either does not support or had some error in processing both the MII register and ethtool requests), then the MII monitor will assume the link is up.
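The order of checks described above can be summarized as a small decision function. This is a model of the described behavior for illustration only, not the driver's code. The function name mii_link_state and the "up", "down" and "fail" tokens are invented for the sketch, with "fail" standing for a query that could not be carried out at all.

```shell
# mii_link_state USE_CARRIER CARRIER MII ETHTOOL
#   USE_CARRIER  1 or 0
#   CARRIER      "up" or "down" (state reported by the driver)
#   MII, ETHTOOL "up", "down" or "fail" (query could not be processed)
# Prints the link state the monitor would conclude.
mii_link_state() {
    if [ "$1" = 1 ]; then
        echo "$2"      # use_carrier=1: trust the driver's carrier state
    elif [ "$3" != fail ]; then
        echo "$3"      # the MII register query gave an answer
    elif [ "$4" != fail ]; then
        echo "$4"      # fall back to the ethtool query
    else
        echo up        # both queries failed: the link is assumed to be up
    fi
}
```

The last branch is the one to keep in mind when troubleshooting: a driver that supports neither query never reports a failed link with use_carrier=0.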
When bonding is configured, it is important that the slave devices not have routes that supersede routes of the master (or, generally, not have routes at all). For example, suppose the bonding device bond0 has two slaves, eth0 and eth1, and the routing table is as follows:
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth0
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth1
10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 bond0
127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo
This routing configuration will likely still update the receive/transmit times in the driver (needed by the ARP monitor), but may bypass the bonding driver (because outgoing traffic to, in this case, another host on network 10 would use eth0 or eth1 before bond0).
The ARP monitor (and ARP itself) may become confused by this configuration, because ARP requests (generated by the ARP monitor) will be sent on one interface (bond0), but the corresponding reply will arrive on a different interface (eth0). This reply looks to ARP as an unsolicited ARP reply (because ARP matches replies on an interface basis), and is discarded. The MII monitor is not affected by the state of the routing table.
The solution here is simply to ensure that slaves do not have routes of their own, and if for some reason they must, those routes do not supersede routes of their master. This should generally be the case, but unusual configurations or errant manual or automatic static route additions may cause trouble.
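A quick way to spot such stray routes is to filter the routing table for entries whose interface column names a slave. The sketch below assumes the table layout shown above, where the interface name is the last field on each line. The function name slave_routes is hypothetical.

```shell
# slave_routes "SLAVE1 SLAVE2 ..."   (routing table on standard input)
# Prints every route line whose last field (Iface) is one of the slaves.
slave_routes() {
    awk -v slaves="$1" '
        BEGIN {
            n = split(slaves, names, " ")
            for (i = 1; i <= n; i++) is_slave[names[i]] = 1
        }
        ($NF in is_slave) { print }
    '
}
```

For example, route -n | slave_routes "eth0 eth1" prints nothing on a correctly configured host. Any output is a route that should be reviewed.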
By default, bonding enables the use_carrier option, which instructs bonding to trust the driver to maintain carrier state.
As discussed in the options section, above, some drivers do not support the netif_carrier_on/_off link state tracking system. With use_carrier enabled, bonding will always see these links as up, regardless of their actual state.
Additionally, other drivers do support netif_carrier, but do not maintain it in real time, e.g., only polling the link state at some fixed interval. In this case, miimon will detect failures, but only after some long period of time has expired. If it appears that miimon is very slow in detecting link failures, try specifying use_carrier=0 to see if that improves the failure detection time. If it does, then it may be that the driver checks the carrier state at a fixed interval, but does not cache the MII register values (so the use_carrier=0 method of querying the registers directly works). If use_carrier=0 does not improve the failover, then the driver may cache the registers, or the problem may be elsewhere.
Also, remember that miimon only checks for the device's carrier state. It has no way to determine the state of devices on or beyond other ports of a switch, or if a switch is refusing to pass traffic while still maintaining carrier on.
If running SNMP agents, the bonding driver should be loaded before any network drivers participating in a bond. This requirement is due to the interface index (ipAdEntIfIndex) being associated to the first interface found with a given IP address. That is, there is only one ipAdEntIfIndex for each IP address. For example, if eth0 and eth1 are slaves of bond0 and the driver for eth0 is loaded before the bonding driver, the interface for the IP address will be associated with the eth0 interface. This configuration is shown below, the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0 in the ifDescr table (ifDescr.2).
interfaces.ifTable.ifEntry.ifDescr.1 = lo
interfaces.ifTable.ifEntry.ifDescr.2 = eth0
interfaces.ifTable.ifEntry.ifDescr.3 = eth1
interfaces.ifTable.ifEntry.ifDescr.4 = eth2
interfaces.ifTable.ifEntry.ifDescr.5 = eth3
interfaces.ifTable.ifEntry.ifDescr.6 = bond0
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
This problem is avoided by loading the bonding driver before any network drivers participating in a bond. Below is an example of loading the bonding driver first; the IP address 192.168.1.1 is correctly associated with ifDescr.2.
interfaces.ifTable.ifEntry.ifDescr.1 = lo
interfaces.ifTable.ifEntry.ifDescr.2 = bond0
interfaces.ifTable.ifEntry.ifDescr.3 = eth0
interfaces.ifTable.ifEntry.ifDescr.4 = eth1
interfaces.ifTable.ifEntry.ifDescr.5 = eth2
interfaces.ifTable.ifEntry.ifDescr.6 = eth3
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
While some distributions may not report the interface name in
ifDescr, the association between the IP address and IfIndex remains
and SNMP functions such as Interface_Scan_Next will report that
association.
When running network monitoring tools, e.g., tcpdump, it is common to enable promiscuous mode on the device, so that all traffic is seen (instead of seeing only traffic destined for the local host).
The bonding driver handles promiscuous mode changes to the bonding master device (e.g., bond0), and propagates the setting to the slave devices.
If two hosts (or a host and a single switch) are directly connected via multiple physical links, then there is no availability penalty to optimizing for maximum bandwidth. In this case, there is only one switch (or peer), so if it fails, there is no alternative access to fail over to. Additionally, the bonding load balance modes support link monitoring of their members, so if individual links fail, the load will be rebalanced across the remaining devices.
In a single switch configuration, the best method to maximize throughput depends upon the application and network environment. The various load balancing modes each have strengths and weaknesses in different environments, as detailed below.
For this discussion, we will break down the topologies into two categories. Depending upon the destination of most traffic, we categorize them into either "gatewayed" or "local" configurations.
In a gatewayed configuration, the "switch" is acting primarily as a router, and the majority of traffic passes through this router to other networks. An example would be the following:
+----------+                     +----------+
|          |eth0            port1|          | to other networks
| Host A   +---------------------+ router   +------------------->
|          +---------------------+          | Hosts B and C are out
|          |eth1            port2|          | here somewhere
+----------+                     +----------+
The router may be a dedicated router device, or another host acting as a gateway. For our discussion, the important point is that the majority of traffic from Host A will pass through the router to some other network before reaching its final destination.
In a gatewayed network configuration, although Host A may communicate with many other systems, all of its traffic will be sent and received via one other peer on the local network, the router.
Note that the case of two systems connected directly via multiple physical links is, for purposes of configuring bonding, the same as a gatewayed configuration. In that case, it happens that all traffic is destined for the "gateway" itself, not some other network beyond the gateway.
In a local configuration, the "switch" is acting primarily as a switch, and the majority of traffic passes through this switch to reach other stations on the same network. An example would be the following:
+----------+            +----------+       +--------+
|          |eth0   port1|          +-------+ Host B |
|  Host A  +------------+  switch  |port3  +--------+
|          +------------+          |                  +--------+
|          |eth1   port2|          +------------------+ Host C |
+----------+            +----------+port4             +--------+
Again, the switch may be a dedicated switch device, or another host acting as a gateway. For our discussion, the important point is that the majority of traffic from Host A is destined for other hosts on the same local network (Hosts B and C in the above example).
In summary, in a gatewayed configuration, traffic to and from the bonded device will be to the same MAC level peer on the network (the gateway itself, i.e., the router), regardless of its final destination. In a local configuration, traffic flows directly to and from the final destinations, thus, each destination (Host B, Host C) will be addressed directly by their individual MAC addresses.
This distinction between a gatewayed and a local network configuration is important because many of the load balancing modes available use the MAC addresses of the local network source and destination to make load balancing decisions. The behavior of each mode is described below.
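To make the distinction concrete, here is a minimal sketch of a MAC-based XOR selection. It is a simplification that assumes only the last octet of each MAC address takes part in the hash; the actual driver's hash may differ. The function name xor_slave and the sample MAC addresses are invented for illustration.

```shell
# Pick a slave index from (source MAC XOR destination MAC) modulo slave count.
# Simplified: only the last octet of each MAC address is used.
xor_slave() {
    src_last=${1##*:}    # last octet of the source MAC
    dst_last=${2##*:}    # last octet of the destination MAC
    echo $(( (0x$src_last ^ 0x$dst_last) % $3 ))
}

# Gatewayed: every flow is addressed to the router's MAC, so the result repeats.
xor_slave 00:11:22:33:44:55 00:aa:bb:cc:dd:01 2    # prints 0
xor_slave 00:11:22:33:44:55 00:aa:bb:cc:dd:01 2    # prints 0

# Local: a different destination MAC can select a different slave.
xor_slave 00:11:22:33:44:55 00:aa:bb:cc:dd:02 2    # prints 1
```

With a single MAC level peer the hash input never changes, which is why a gatewayed configuration tends to place all outgoing traffic on one device.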
This configuration is the easiest to set up and to understand, although you will have to decide which bonding mode best suits your needs. The trade offs for each mode are detailed below:
balance-rr: Striping traffic across multiple interfaces generally results in peer systems receiving packets out of order, causing TCP/IP's congestion control system to kick in, often by retransmitting segments.
It is possible to adjust TCP/IP's congestion limits by altering the net.ipv4.tcp_reordering sysctl parameter. The usual default value is 3, and the maximum useful value is 127. For a four interface balance-rr bond, expect that a single TCP/IP stream will utilize no more than approximately 2.3 interface's worth of throughput, even after adjusting tcp_reordering.
Note that this out of order delivery occurs when both the sending and
receiving systems are utilizing a multiple interface bond. Consider a configuration
in which a balance-rr bond feeds into a single higher capacity network channel
(e.g., multiple 100Mb/sec ethernets feeding a single gigabit ethernet via
an etherchannel capable switch). In this configuration, traffic sent from
the multiple 100Mb devices to a destination connected to the gigabit device
will not see packets out of order. However, traffic sent from the gigabit
device to the multiple 100Mb devices may or may not see traffic out of order,
depending upon the balance policy of the switch. Many switches do not support
any modes that stripe traffic (instead choosing a port based upon IP or
MAC level
addresses); for those devices, traffic flowing from the gigabit device to
the many 100Mb devices will only utilize one interface.
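As a small illustration of the tcp_reordering adjustment mentioned above, the sketch below prints a sysctl.conf line for a requested value and caps it at 127, the maximum useful value cited in the text. The helper name tcp_reordering_line is made up, and persisting the setting through /etc/sysctl.conf is an assumption about how your system loads sysctl settings.

```shell
# Print a sysctl.conf line for net.ipv4.tcp_reordering.
# Values above 127 are capped, since larger values are not useful.
tcp_reordering_line() {
    value=$1
    if [ "$value" -gt 127 ]; then
        value=127
    fi
    echo "net.ipv4.tcp_reordering = $value"
}

# Possible usage (both steps need root):
#   tcp_reordering_line 127 >> /etc/sysctl.conf
#   sysctl -p
```

Raising the value only makes TCP more tolerant of reordering; it does not remove the throughput ceiling described above.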
If you are utilizing protocols other than TCP/IP, UDP for example, and your application can tolerate out of order delivery, then this mode can allow for single stream datagram performance that scales near linearly as interfaces are added to the bond.
This mode requires the switch to have the appropriate ports configured for "etherchannel" or "trunking."
active-backup: There is not much advantage in this network topology to the active-backup mode, as the inactive backup devices are all connected to the same peer as the primary. In this case, a load balancing mode (with link monitoring) will provide the same level of network availability, but with increased available bandwidth. On the plus side, active-backup mode does not require any configuration of the switch, so it may have value if the hardware available does not support any of the load balance modes.
Additionally, the Linux bonding 802.3ad implementation distributes traffic
by peer (using an XOR of MAC addresses),
so in a "gatewayed" configuration, all outgoing traffic will generally use
the same device. Incoming traffic may also end
up on a single device, but that is dependent upon the balancing policy of
the peer's 802.3ad implementation. In a
"local" configuration, traffic will be distributed across the devices in
the bond.
Finally, the 802.3ad mode mandates the use of the MII monitor, therefore, the ARP monitor is not available in this mode.
Unlike 802.3ad, interfaces may be of differing speeds, and no special switch configuration is required. On the down side, in this mode all incoming traffic arrives over a single interface, this mode requires certain ethtool support in the network device driver of the slave interfaces, and the ARP monitor is not available.
The only additional down side to this mode is that the network
device driver must support changing the hardware address while
the device is open.
The choice of link monitoring may largely depend upon which mode you choose to use. The more advanced load balancing modes do not support the use of the ARP monitor, and are thus restricted to using the MII monitor (which does not provide as high a level of end to end assurance as the ARP monitor).
Multiple switches may be utilized to optimize for throughput when they are configured in parallel as part of an isolated network between two or more systems, for example:
                +-----------+
                |  Host A   |
                +-+---+---+-+
                  |   |   |
         +--------+   |   +---------+
         |            |             |
  +------+---+  +-----+----+  +-----+----+
  | Switch A |  | Switch B |  | Switch C |
  +------+---+  +-----+----+  +-----+----+
         |            |             |
         +--------+   |   +---------+
                  |   |   |
                +-+---+---+-+
                |  Host B   |
                +-----------+
In this configuration, the switches are isolated from one another. One reason to employ a topology such as this is cost: for an isolated network with many hosts (a cluster configured for high performance, for example), using multiple smaller switches can be more cost effective than a single larger switch. For example, on a network with 24 hosts, three 24 port switches can be significantly less expensive than a single 72 port switch.
If access beyond the network is required, an individual host can be equipped with an additional network device connected to an external network; this host then additionally acts as a gateway.
In actual practice, the bonding mode typically employed in configurations of this type is balance-rr. Historically, in this network configuration, the usual caveats about out of order packet delivery are mitigated by the use of network adapters that do not do any kind of packet coalescing (via the use of NAPI, or because the device itself does not generate interrupts until some number of packets has arrived). When employed in this fashion, the balance-rr mode allows individual connections between two hosts to effectively utilize greater than one interface's bandwidth.
Again, in actual practice, the MII monitor is most often used in this configuration, as performance is given preference over availability. The ARP monitor will function in this topology, but its advantages over the MII monitor are mitigated by the volume of probes needed as the number of systems involved grows (remember that each host in the network is configured with bonding).
Some switches exhibit undesirable behavior with regard to the timing of link up and down reporting by the switch.
First, when a link comes up, some switches may indicate that the link
is up (carrier available), but not pass traffic over the interface for some
period of time. This delay is typically due to some type of autonegotiation
or routing protocol, but may also occur during switch initialization (e.g.,
during recovery after a switch failure). If you find this to be a problem,
specify an appropriate value to the updelay bonding module option to delay
the use of the
relevant interface(s).
Second, some switches may "bounce" the link state one or more times while a link is changing state. This occurs most commonly while the switch is initializing. Again, an appropriate updelay value may help.
Note that when a bonding interface has no active links, the driver will immediately reuse the first link that goes up, even if the updelay parameter has been specified (the updelay is ignored in this case). If there are slave interfaces waiting for the updelay timeout to expire, the interface that first went into that state will be immediately reused. This reduces down time of the network if the value of updelay has been overestimated, and since this occurs only in cases with no connectivity, there is no additional penalty for ignoring the updelay.
In addition to the concerns about switch timings, if your switches take a long time to go into backup mode, it may be desirable to not activate a backup interface immediately after a link goes down. Failover may be delayed via the downdelay bonding module option.
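As a rough illustration of the two delay options, the sketch below builds a module options line in which updelay and downdelay are rounded down to a multiple of miimon, which is how the bonding driver documentation describes the handling of these values. The helper name bond_delay_opts and the millisecond values are made up; suitable delays depend on how your switch behaves.

```shell
# Build an "options bonding" line with updelay/downdelay rounded down
# to the nearest multiple of the miimon interval (all values in ms).
bond_delay_opts() {
    miimon=$1
    updelay=$(( $2 / miimon * miimon ))
    downdelay=$(( $3 / miimon * miimon ))
    echo "options bonding miimon=$miimon updelay=$updelay downdelay=$downdelay"
}

# Example: a 250 ms updelay with miimon=100 is effectively 200 ms.
bond_delay_opts 100 250 200
```

The resulting line could go into the same module configuration file that carries the other bonding options.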
It is not uncommon to observe a short burst of duplicated traffic when the bonding device is first used, or after it has been idle for some period of time. This is most easily observed by issuing a "ping" to some other host on the network, and noticing that the output from ping flags duplicates (typically one per slave).
For example, on a bond in active-backup mode with five slaves all connected to one switch, the output may appear as follows:
# ping -n 10.0.4.2
PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data.
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms
64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms
64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms
This is not due to an error in the bonding driver, rather, it is a side effect of how many switches update their MAC forwarding tables. Initially, the switch does not associate the MAC address in the packet with a particular switch port, and so it may send the traffic to all ports until its MAC forwarding table is updated. Since the interfaces attached to the bond may occupy multiple ports on a single switch, when the switch (temporarily) floods the traffic to all ports, the bond device receives multiple copies of the same packet (one per slave device).
The duplicated packet behavior is switch dependent, some switches exhibit this, and some do not. On switches that display this behavior, it can be induced by clearing the MAC forwarding table (on most Cisco switches, the privileged command "clear mac address-table dynamic" will accomplish this).
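To check whether a given switch shows this behavior, one option is to count the replies that ping flags as duplicates. The sketch below is only a convenience wrapper around grep; the function name count_dup_replies is made up, and it assumes your ping marks duplicates with the "DUP!" tag as in the output above.

```shell
# Count ping reply lines flagged as duplicates; reads ping output on stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function usable in scripts that stop on errors.
count_dup_replies() {
    grep -c 'DUP!' || true
}

# Possible usage after the bond has been idle for a while:
#   ping -c 5 10.0.4.2 | count_dup_replies
```

A non-zero count only on the first sequence number is consistent with the flooding behavior described above.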
This section contains additional information for configuring bonding on specific hardware platforms, or for interfacing bonding with particular switches or other devices.
All JS20s come with two Broadcom Gigabit Ethernet ports integrated on
the planar (that's "motherboard" in IBM-speak). In the BladeCenter chassis,
the eth0 port of all JS20 blades is hard wired to I/O Module #1; similarly,
all eth1 ports are wired to I/O Module #2. An add-on Broadcom daughter card
can be installed on a JS20 to provide
two more Gigabit Ethernet ports. These ports, eth2 and eth3, are wired to
I/O Modules 3 and 4, respectively.
Each I/O Module may contain either a switch or a passthrough module (which allows ports to be directly connected to an external switch). Some bonding modes require a specific BladeCenter internal network topology in order to function; these are detailed below.
Additional BladeCenter-specific networking information can be found in three IBM Redbooks (www.ibm.com/redbooks):
"IBM eServer BladeCenter Networking Options"
"IBM eServer BladeCenter Layer 2-7 Network Switching"
"Cisco Systems Intelligent Gigabit Ethernet Switch Module for the IBM BladeCenter"
Because a BladeCenter can be configured in a very large number of ways, this discussion will be confined to describing basic configurations.
Normally, Ethernet Switch Modules (ESMs) are used in I/O modules 1 and 2. In this configuration, the eth0 and eth1 ports of a JS20 will be connected to different internal switches (in the respective I/O modules).
A passthrough module (OPM or CPM, optical or copper, passthrough module) connects the I/O module directly to an external switch. By using PMs in I/O module #1 and #2, the eth0 and eth1 interfaces of a JS20 can be redirected to the outside world and connected to a common external switch.
Depending upon the mix of ESMs and PMs, the network will appear to bonding
as either a single switch topology (all PMs) or as a multiple switch topology
(one or more ESMs, zero or more PMs). It is also possible to connect ESMs
together, resulting in a configuration much like the example in "High Availability
in a Multiple Switch
Topology," above.
The balance-rr mode requires the use of passthrough modules for devices in the bond, all connected to a common external switch. That switch must be configured for "etherchannel" or "trunking" on the appropriate ports, as is usual for balance-rr.
The balance-alb and balance-tlb modes will function with either switch
modules or passthrough modules (or a mix). The only specific requirement
for these modes is that all network interfaces must be able to reach all
destinations for traffic sent over the bonding device (i.e., the network
must converge at some point outside
the BladeCenter).
The active-backup mode has no additional requirements.
When an Ethernet Switch Module is in place, only the ARP
monitor will reliably detect link loss to an external switch. This is
nothing unusual, but examination of the BladeCenter cabinet would
suggest that the "external" network ports are the ethernet ports for
the system, when in fact there is a switch between these "external"
ports and the devices on the JS20 system itself. The MII monitor is
only able to detect link failures between the ESM and the JS20 system.
When a passthrough module is in place, the MII monitor does
detect failures to the "external" port, which is then directly
connected to the JS20 system.
Note: There is a special feature (Trunk Failover) available on some of
the IBM switch modules (the Cisco IGESM for one) that will provide feedback
to the internal connections, such that a failure on the external uplinks
can be relayed back to the internal server facing links. This allows the
use of MII monitor to detect an external uplink failure. Details on its
use and configuration can be found in section 7.7 of the IBM Redpaper at:
http://www.redbooks.ibm.com/abstracts/redp3869.html
The Serial Over LAN (SoL) link is established over the primary ethernet (eth0) only, therefore, any loss of link to eth0 will result in losing your SoL connection. It will not fail over with other network traffic, as the SoL system is beyond the control of the bonding driver.
It may be desirable to disable spanning tree on the switch (either the internal Ethernet Switch Module, or an external switch) to avoid fail-over delay issues when using bonding.
October 26, 2009
Submitted by Rachelsdad on 26 October 2009 - 11:30am.
When bonding NICs under various flavors of SuSE, I tend to use YaST
(on other distros, I just hack the config files). To do this under YaST:
Leave the two (or more) NICs to be bonded (slaves) unconfigured, and add an interface.
Configure the interface as a Bond Network. Assign the static IP and subnet mask to this interface, and set the default route.
On the main configuration page for the interface, the two (or more) unconfigured NICs should be listed with checkboxes next to them to set them as bonding slaves. Check the desired NICs, and then set the Bond Driver Options as appropriate, e.g., from the drop down, select a preconfigured option, say, mode=balance-rr miimon=100, or type your options. Save this, and leave the physical NICs which are to be the slaves unconfigured.
Finish your YaST network device configuration, and you should end up with a bonded pair (or trio, etc.).
January 9, 2009
This article has the steps to bond two network interfaces on a SLES 10 server to increase network throughput or to achieve NIC failover for high availability configurations.
- Find out whether the network card supports MII or ethtool link monitoring. This determines the bonding module options used later in the configuration.
# ethtool eth0
If you see output like the following, you can use miimon:
Settings for eth0
Current message level: 0x000000ff (255)
Link Detected: yes
- Configure your network cards in YaST. Give the first network card the IP address and other network settings that you want the bonded interface to have. Give the other network card a dummy IP address; this dummy configuration is not used anywhere, so its value does not matter.
- Go to a terminal window, cd to /etc/sysconfig/network/, and make a copy of the configuration file of the network card you just configured with the IP address and other network information. The name of the network configuration file starts with ifcfg-eth-id*.
We will be using this file as a template for our bonding configuration. Name of the destination file should be ifcfg-bond0 for the first bonded pair.
# cd /etc/sysconfig/network/
# ls ifcfg-eth-id*
# cp ifcfg-eth-id-<your 1st network card> ifcfg-bond0
Note: You can use the yast2 network configuration window to find the config file of your first network card. Note the MAC address of the first network card in YaST and compare it with the names of the ifcfg-eth-id-* files, which have the card's MAC address appended.
- We will use the "ifcfg-bond0" file created above as a template. Next, we need to discover the PCI bus IDs of the two 'real' NICs. At the prompt, cd to /etc/sysconfig/network and type "grep bus-pci ifcfg-eth-id*".
# cd /etc/sysconfig/network
# grep bus-pci ifcfg-eth-id*
You should see something like this:
:_nm_name='bus-pci-0000:05:00.0'
:_nm_name='bus-pci-0000:04:00.0'
- The above command gives us the addresses of the two physical network cards. Using this information, we can now modify our ifcfg-bond0 file to tell it which cards to use.
Add in a section like this at the end of the ifcfg-bond0 file and save it.
BONDING_MASTER=yes
BONDING_SLAVE_0='bus-pci-0000:05:00.0'
BONDING_SLAVE_1='bus-pci-0000:04:00.0'
- The next step is to tell the system which driver to load when bond0 is referenced. To do this, open the file /etc/modprobe.conf.local.
# vi /etc/modprobe.conf.local
Add the following lines at the end of the file and save it:
alias bond0 bonding
options bonding miimon=100 mode=0 use_carrier=0
The above specifies that when bond0 is referenced, the bonding driver is loaded with the parameters given. The 'miimon=100' value tells the driver to use MII monitoring, checking every 100 milliseconds for a link failure. The 'mode' parameter selects the bonding policy.
Note: The default is round-robin. Four of the possible mode values are:
0 Round-robin policy: Transmit in a sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.
1 Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. This mode provides fault tolerance.
2 XOR policy: Transmit based on [(source MAC address XOR'd with destination MAC address) modulo slave count]. This selects the same slave for each destination MAC address. This mode provides load balancing and fault tolerance.
3 Broadcast policy: Transmits everything on all slave interfaces. This mode provides fault tolerance.
- We now need to clear out the old ifcfg files that we no longer need. Delete the ifcfg-eth-id* files in the /etc/sysconfig/network directory, or move them elsewhere as a backup, and restart the network.
# cd /etc/sysconfig/network
# mv ifcfg-eth-id* /backup
# rcnetwork restart
- If you have done the configuration correctly, you will see the bond0 interface appearing with the correct IP address and 'as bonding master', followed by two 'enslaving eth' lines. Verify the configuration using ifconfig and you'll notice the MAC addresses for all the cards are identical, just as the IP addresses for eth0, eth1 and bond0 are identical.
- Test your configuration by plugging network cables into both network cards and starting a ping to another machine on the network. Now unplug the cable from card 2 and verify that the ping continues. Then plug the cable back into card 2 and remove the cable from card 1. If the ping still continues, your configuration and setup are correct.
- If needed, you can repeat the process for a second bond, just modify the modprobe.conf.local with an 'alias bond1 bonding' line and carry on as before.
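While running the cable-pull test above, it can help to watch the per-slave link state rather than ping alone. The sketch below assumes the bonding driver on your kernel exposes a status file such as /proc/net/bonding/bond0 with "Slave Interface:" and "MII Status:" lines; that path and those labels are not taken from this article, and the function name bond_slave_status is made up.

```shell
# Print "<slave> <MII status>" for each slave listed in a bonding status file.
# The bond-wide "MII Status" line appears before any slave and is skipped.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2 }
        /^MII Status:/ && slave != "" { print slave, $2 }
    ' "$1"
}

# Possible usage during the failover test:
#   bond_slave_status /proc/net/bonding/bond0
```

After a cable is pulled, the corresponding slave would be expected to report "down" within roughly one miimon interval.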
Submitted by Conz on 22 April 2009 - 1:33am.
Worked like a charm for me on a Proliant dl360 with broadcom nics.
I did change the mode to 4 (lacp) to create a switch based link aggregation / etherchannel which was preferable in our case.
Configuration besides the mode=4 is identical.
Posted the additional modes below for other people who might be looking into setting up ethernet bonding. Copied from http://www.linuxhorizon.ro/bonding.html
mode=4 (802.3ad)
IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Pre-requisites:
1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
2. A switch that supports IEEE 802.3ad Dynamic link aggregation.
Most switches will require some type of configuration to enable 802.3ad mode.
mode=5 (balance-tlb)
Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
Prerequisite:
Ethtool support in the base drivers for retrieving the speed of each slave.
mode=6 (balance-alb)
Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPV4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.
The most used are the first four mode types...
You can also use multiple bond interfaces, but for that you must load the bonding module once for each bond you need.
Presuming that you want two bond interfaces, you must configure /etc/modules.conf as follows:
alias bond0 bonding
options bond0 -o bond0 mode=0 miimon=100
alias bond1 bonding
options bond1 -o bond1 mode=1 miimon=100
I'm not an expert (yet), but from what I've gathered so far, check out the site http://www.linux-corner.info/bonding.html for info (and a howto) for "bonding" 2 NICs in Linux 2.6 (should work for virtually any distro).
Also of note: bonding doesn't require anything special from your other networking infrastructure -- but trunking requires "smart" switches (ones that can detect and not barf at the presence of one address on multiple paths).
I'll re-post if/when I find an article as easy to use as the one above.
I hope you found this useful!
Posted by cseader under Linux/Kernel, Tools/Utils
Application:
If you are using AutoYaST and need a way to setup NIC Bonding, then you can just follow the steps outlined here. The setup in this text is a generic setup and should work with most every hardware.
Explanation:
In order to get this to work properly you will first need to change your networking tags a bit in your AutoYaST xml. Below is an example of what it should look like.
<networking>
<dhcp_options>
<dhclient_hostname_option>AUTO</dhclient_hostname_option>
</dhcp_options>
<dns>
<dhcp_hostname config:type="boolean">true</dhcp_hostname>
<dhcp_resolv config:type="boolean">false</dhcp_resolv>
<domain>localdomain.com</domain>
<hostname>localhost</hostname>
<nameservers config:type="list">
<nameserver></nameserver>
<nameserver></nameserver>
</nameservers>
<searchlist config:type="list">
<search>localdomain.com</search>
</searchlist>
</dns>
<interfaces config:type="list">
<interface>
<bootproto>none</bootproto>
<device>eth0</device>
<startmode>off</startmode>
</interface>
<interface>
<bootproto>none</bootproto>
<device>eth1</device>
<startmode>off</startmode>
</interface>
</interfaces>
<routing>
<ip_forward config:type="boolean">false</ip_forward>
</routing>
</networking>
Notice the two interface tags for eth0 and eth1 (You could have an eth2 and eth3 as well). The bootproto tag should be changed to none and the startmode tag should be changed to off. This is so that these interfaces don't get enabled at all except through the bond device.
In order to set up the bond device we use the files tag in AutoYaST to lay down a configuration file in /etc/sysconfig/network/ for the bond device. (You could set up multiple bond devices if you had 4 network interfaces.) Below is a sample configuration of a bond device.
<files config:type="list">
<file><file_contents><![CDATA[
STARTMODE='onboot'
BOOTPROTO='dhcp'
#IPADDR='x.x.x.x/24'
BONDING_MASTER='yes'
BONDING_SLAVE_0='eth0'
BONDING_SLAVE_1='eth1'
BONDING_MODULE_OPTS='mode=1 primary=eth1 arp_interval=1000'
]]></file_contents>
<file_owner>root</file_owner>
<file_path>/etc/sysconfig/network/ifcfg-bond0</file_path>
<file_permissions>644</file_permissions>
</file>
</files>
Notice that BOOTPROTO is dhcp for this config. You can change it to static, uncomment the IPADDR line, and put in a static address, but then the configuration won't be generic enough to lay across dozens of machines. The BONDING_SLAVE lines name the network interfaces that will be the slaves of the bond. (You could add a 3rd slave if you wanted.) The BONDING_MODULE_OPTS line carries the bonding kernel module options that set the mode and other settings on the bonding device. You can list all of the options by issuing the following as root from the command line.
:~ # modinfo bonding
parm:arp_ip_target:arp targets in n.n.n.n form (array of charp)
parm:arp_interval:arp interval in milliseconds (int)
parm:xmit_hash_policy:XOR hashing method: 0 for layer 2 (default), 1 for layer 3+4 (charp)
parm:lacp_rate:LACPDU tx rate to request from 802.3ad partner (slow/fast) (charp)
parm:primary:Primary network device to use (charp)
parm:mode:Mode of operation : 0 for balance-rr, 1 for active-backup, 2 for balance-xor, 3 for broadcast, 4 for 802.3ad, 5 for balance-tlb, 6 for balance-alb (charp)
parm:use_carrier:Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default) (int)
parm:downdelay:Delay before considering link down, in milliseconds (int)
parm:updelay:Delay before considering link up, in milliseconds (int)
parm:miimon:Link check interval in milliseconds (int)
parm:max_bonds:Max number of bonded devices (int)
So at this point feel free to modify the module options to best fit your environment. I am including a full networking and files tags together so you can just copy and paste into your AutoYaST configuration.
<networking>
<dhcp_options>
<dhclient_hostname_option>AUTO</dhclient_hostname_option>
</dhcp_options>
<dns>
<dhcp_hostname config:type="boolean">true</dhcp_hostname>
<dhcp_resolv config:type="boolean">false</dhcp_resolv>
<domain>localdomain.com</domain>
<hostname>localhost</hostname>
<nameservers config:type="list">
<nameserver></nameserver>
<nameserver></nameserver>
</nameservers>
<searchlist config:type="list">
<search>localdomain.com</search>
</searchlist>
</dns>
<interfaces config:type="list">
<interface>
<bootproto>none</bootproto>
<device>eth0</device>
<startmode>off</startmode>
</interface>
<interface>
<bootproto>none</bootproto>
<device>eth1</device>
<startmode>off</startmode>
</interface>
</interfaces>
<routing>
<ip_forward config:type="boolean">false</ip_forward>
</routing>
</networking>
<files config:type="list">
<file>
<file_contents><![CDATA[
STARTMODE='onboot'
BOOTPROTO='dhcp'
#IPADDR='x.x.x.x/24'
BONDING_MASTER='yes'
BONDING_SLAVE_0='eth0'
BONDING_SLAVE_1='eth1'
BONDING_MODULE_OPTS='mode=1 primary=eth1 arp_interval=1000'
]]></file_contents>
<file_owner>root</file_owner>
<file_path>/etc/sysconfig/network/ifcfg-bond0</file_path>
<file_permissions>644</file_permissions>
</file>
</files>
Mar 16, 2005
By Aaron Gresko
Bonding network interfaces together allows for increased network throughput or failover in high availability configurations. Getting SUSE Linux Enterprise Server 8 (SLES8) configured with a bonded device is just a little more complicated than in SLES9, though neither is very difficult to set up.
This article will show how to configure bonding in SLES8 using the example of setting up a bonded interface (bond0) in a server with two ethernet cards already configured with static IP addresses (eth0 and eth1).
One note of caution before proceeding: make sure the SLES8 system has been updated to Service Pack 3. The configuration given will not work on a system that has not been updated.
Setting up a bonded interface on SLES8 requires the following:
- Modify the bonding module parameters
- Create the bond interface
- Enslave devices to the bond
Modify the Bonding Module Parameters
To set up the bonding module and enable failover mode, do the following:
- Open a terminal and su to root.
- Open /etc/modules.conf and add an alias for the bond device, for example:
alias bond0 bonding
- Configure the bonding module options. The necessary parameters are mode and miimon.
The mode parameter can be set to 0 for round robin operation, 1 for active backup mode (failover), or 2 for XOR operation. When set for failover operation, MII link monitoring should be enabled by setting the miimon parameter to a value higher than zero.
Add an option line for each bond interface, for example:
options bonding mode=1 miimon=200
- The changes to modules.conf will take effect the next time the system is started. To set up the bonding module immediately, enter modprobe bonding mode=1 miimon=200 at the prompt.
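To double check what an options line in modules.conf actually asks for, a short sketch like the following can help. It splits the line into parameters and translates a numeric mode into its name, using the mode list from the modinfo output quoted earlier on this page. The function names are hypothetical, and nothing here touches the running kernel.

```python
# Mode numbers and names as listed by `modinfo bonding`.
MODE_NAMES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}


def parse_options_line(line):
    """Split 'options <module> key=value ...' into (module, params dict)."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        raise ValueError("not an options line: %r" % line)
    params = dict(item.split("=", 1) for item in fields[2:])
    return fields[1], params


def mode_name(value):
    """Translate a numeric mode such as '1' into its name; pass names through."""
    return MODE_NAMES[int(value)] if value.isdigit() else value


if __name__ == "__main__":
    module, params = parse_options_line("options bonding mode=1 miimon=200")
    print(module, mode_name(params["mode"]), params["miimon"])
```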
Create the Bond Interface
The bond interface will be created using a configuration file in /etc/sysconfig/network/. Do the following:
- Open a terminal and su to root.
- cd to /etc/sysconfig/network/.
- Create the bond configuration file by entering vi ifcfg-bond0.
- In the configuration file, enter the following:
DEVICE='bond0'
IPADDR='192.168.1.1'
NETMASK='255.255.255.0'
BROADCAST='192.168.1.255'
BOOTPROTO='static'
STARTMODE='onboot'
- Start the bond interface by entering ifup bond0 at the command line. Alternatively, restart the network by entering rcnetwork restart at the command line.
- Run ifconfig to verify the bond interface is up.
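The BROADCAST value in ifcfg-bond0 has to agree with IPADDR and NETMASK. When you substitute your own address, the broadcast address can be derived rather than guessed. A small sketch with Python's standard ipaddress module follows; the helper name broadcast_for is made up for this example.

```python
import ipaddress


def broadcast_for(ipaddr, netmask):
    """Return the broadcast address of the network that holds ipaddr."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{ipaddr}/{netmask}", strict=False)
    return str(network.broadcast_address)


if __name__ == "__main__":
    # Values from the ifcfg-bond0 example.
    print(broadcast_for("192.168.1.1", "255.255.255.0"))
```

For the example values this gives 192.168.1.255, which matches the BROADCAST line in the configuration file.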
Enslave Devices To The Bond
With the bond interface up and running, the final step is to enslave interfaces to the bond interface. The enslaving is done with the ifenslave command. In the case of the example system, the devices eth0 and eth1 will be enslaved to bond0.
- Open a terminal and su to root
- Enslave the devices by entering ifenslave bond0 eth0 eth1
- Confirm the ethernet devices are enslaved by entering ifconfig
The output of ifconfig should show that the devices are enslaved, as shown in the illustration below:
Running rcnetwork restart will not enslave the interfaces, nor will the devices be enslaved at system startup. The ability to define master/slave relationships in the interface configuration files is present in SLES9 though.
To have the interfaces enslaved when the bond interface is brought up, follow the workaround described in TID 10096919 in the Novell Support Knowledgebase.
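Since the enslaving is lost on every restart in SLES8, it is handy to check the MASTER and SLAVE flags mechanically instead of reading ifconfig output by eye. The sketch below assumes the classic net-tools ifconfig layout (interface name in column one, indented flag line containing MASTER or SLAVE); the sample text is abbreviated and made up for illustration, and the function name is hypothetical.

```python
def bonding_roles(ifconfig_text):
    """Map each interface name to 'master', 'slave' or 'plain'."""
    roles = {}
    current = None
    for line in ifconfig_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new interface block.
            current = line.split()[0]
            roles[current] = "plain"
        elif current:
            words = line.split()
            if "MASTER" in words:
                roles[current] = "master"
            elif "SLAVE" in words:
                roles[current] = "slave"
    return roles


# Abbreviated, illustrative sample in the old net-tools format.
SAMPLE = """\
bond0     Link encap:Ethernet  HWaddr 00:0C:29:C6:BE:59
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1

eth0      Link encap:Ethernet  HWaddr 00:0C:29:C6:BE:59
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
"""

if __name__ == "__main__":
    for name, role in bonding_roles(SAMPLE).items():
        print(name, role)
```

If eth0 or eth1 comes back as plain after rcnetwork restart, the ifenslave step has to be repeated.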
Linux bond or team multiple network interfaces (NIC) into single interface
LinuxTitli
Finally today I implemented NIC bonding (binding both NICs so that they work as a single device). We have two Dell servers that need to be set up with Intel dual gigabit NICs. My idea is to improve performance by pumping out more data from both NICs without using any other method. This box acts as a heavy duty FTP server. Each night I need to transfer over 200GB of data from this box to another box. Therefore, the network setup is two servers on a switch using dual network cards. I am using Red Hat Enterprise Linux version 4.0.
Linux allows binding multiple network interfaces into a single channel/NIC using a special kernel module called bonding. According to the official bonding documentation, "The Linux bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface. The behavior of the bonded interfaces depends upon the mode; generally speaking, modes provide either hot standby or load balancing services. Additionally, link integrity monitoring may be performed."
Setting up bonding is easy with RHEL v4.0.
Step #1: Create a bond0 configuration file
Red Hat Linux stores network configuration in /etc/sysconfig/network-scripts/ directory. First, you need to create bond0 config file:
# vi /etc/sysconfig/network-scripts/ifcfg-bond0
Append the following lines to it:
DEVICE=bond0
IPADDR=192.168.1.20
NETWORK=192.168.1.0
NETMASK=255.255.255.0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
Replace the above IP address with your actual IP address. Save the file and exit to the shell prompt.
Step #2: Modify eth0 and eth1 config files
Open both configuration files using the vi text editor and make sure they read as follows. For the eth0 interface:
# vi /etc/sysconfig/network-scripts/ifcfg-eth0
Modify/append directives as follows:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
Open the eth1 configuration file using the vi text editor:
# vi /etc/sysconfig/network-scripts/ifcfg-eth1
Make sure the file reads as follows for the eth1 interface:
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
Save the file and exit to the shell prompt.
Step #3: Load bond driver/module
Make sure bonding module is loaded when the channel-bonding interface (bond0) is brought up. You need to modify kernel modules configuration file:
# vi /etc/modprobe.conf
Append the following two lines:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
Save the file and exit to the shell prompt. You can learn more about all bonding options in the kernel source documentation file (Documentation/networking/bonding.txt).
Step #4: Test configuration
First, load the bonding module:
# modprobe bonding
Restart the networking service in order to bring up the bond0 interface:
# service network restart
Verify everything is working:
# less /proc/net/bonding/bond0
Output:
Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:0c:29:c6:be:59

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:0c:29:c6:be:63
List all interfaces:
# ifconfig
Output:
bond0     Link encap:Ethernet  HWaddr 00:0C:29:C6:BE:59
          inet addr:192.168.1.20  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:2804 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1879 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:250825 (244.9 KiB)  TX bytes:244683 (238.9 KiB)

eth0      Link encap:Ethernet  HWaddr 00:0C:29:C6:BE:59
          inet addr:192.168.1.20  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fec6:be59/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2809 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1390 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:251161 (245.2 KiB)  TX bytes:180289 (176.0 KiB)
          Interrupt:11 Base address:0x1400

eth1      Link encap:Ethernet  HWaddr 00:0C:29:C6:BE:59
          inet addr:192.168.1.20  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fec6:be59/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:502 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:258 (258.0 b)  TX bytes:66516 (64.9 KiB)
          Interrupt:10 Base address:0x1480
Now you have bonded multiple network interfaces into a single channel (NIC). Read the official howto; it covers the following additional topics:
- VLAN Configuration
- Cisco switch related configuration
- Advanced routing and troubleshooting
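The status file /proc/net/bonding/bond0 used in Step #4 is plain "key: value" text, so it is easy to check from a script, for example in a nightly cron job before a big transfer. The following is a rough sketch only; the parser and its names are hypothetical, and the sample is the output shown in the article.

```python
def parse_bonding_status(text):
    """Split /proc/net/bonding/bondX text into (bond settings, list of slaves)."""
    bond = {}
    slaves = []
    target = bond  # lines before the first "Slave Interface" describe the bond
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Slave Interface":
            target = {"name": value}
            slaves.append(target)
        else:
            target[key] = value
    return bond, slaves


SAMPLE = """\
Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:0c:29:c6:be:59

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:0c:29:c6:be:63
"""

if __name__ == "__main__":
    bond, slaves = parse_bonding_status(SAMPLE)
    down = [s["name"] for s in slaves if s.get("MII Status") != "up"]
    print(bond["Bonding Mode"], "| slaves down:", down or "none")
```

On a real system you would read the text from /proc/net/bonding/bond0 instead of using the SAMPLE string.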
Is it SMP safe?
Yes. The old 2.0.xx channel bonding patch was not SMP safe.
The new driver was designed to be SMP safe from the start.
What type of cards will work with it?
Any Ethernet type cards (you can even mix cards - an Intel
EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes,
devices need not be of the same speed.
How many bonding devices can I have?
There is no limit.
How many slaves can a bonding device have?
This is limited only by the number of network interfaces Linux
supports and/or the number of network cards you can place in your
system.
What happens when a slave link dies?
If link monitoring is enabled, then the failing device will be
disabled. The active-backup mode will fail over to a backup link, and
other modes will ignore the failed link. The link will continue to be
monitored, and should it recover, it will rejoin the bond (in whatever
manner is appropriate for the mode). See the sections on High
Availability and the documentation for each mode for additional
information.
Link monitoring can be enabled via either the miimon or
arp_interval parameters (described in the module parameters section,
above). In general, miimon monitors the carrier state as sensed by
the underlying network device, and the arp monitor (arp_interval)
monitors connectivity to another host on the local network.
If no link monitoring is configured, the bonding driver will
be unable to detect link failures, and will assume that all links are
always available. This will likely result in lost packets, and a
resulting degradation of performance. The precise performance loss
depends upon the bonding mode and network configuration.
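Because a bond without link monitoring fails silently, it is worth checking the option set before loading the driver. The sketch below flags two cases: neither miimon nor arp_interval set, and arp_interval set without an arp_ip_target (the ARP monitor needs at least one target address to probe). The function name and the warning wording are my own, not driver messages.

```python
def monitoring_warnings(params):
    """Return a list of warnings for a dict of bonding module options."""
    warnings = []
    miimon = int(params.get("miimon", 0))
    arp_interval = int(params.get("arp_interval", 0))
    if miimon == 0 and arp_interval == 0:
        warnings.append("no link monitoring: link failures will go undetected")
    if arp_interval > 0 and not params.get("arp_ip_target"):
        warnings.append("arp_interval is set but arp_ip_target is missing")
    return warnings


if __name__ == "__main__":
    examples = [
        {"mode": "1"},
        {"mode": "1", "miimon": "200"},
        {"mode": "1", "primary": "eth1", "arp_interval": "1000"},
    ]
    for options in examples:
        print(options, "->", monitoring_warnings(options) or "ok")
```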
Can bonding be used for High Availability?
Yes. See the section on High Availability for details.
Which switches/systems does it work with?
The full answer to this depends upon the desired mode.
In the basic balance modes (balance-rr and balance-xor), it
works with any system that supports etherchannel (also called
trunking). Most managed switches currently available have such
support, and many unmanaged switches as well.
The advanced balance modes (balance-tlb and balance-alb) do
not have special switch requirements, but do need device drivers that
support specific features (described in the appropriate section under
module parameters, above).
In 802.3ad mode, it works with systems that support IEEE
802.3ad Dynamic Link Aggregation. Most managed and many unmanaged
switches currently available support 802.3ad.
The active-backup mode should work with any Layer-II switch.
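The switch requirements above can be condensed into a small lookup table, which is convenient when a script has to decide whether a chosen mode is usable with the switch at hand. This sketch only restates the text; the broadcast mode is not discussed there, so it is deliberately left out, and the names are hypothetical.

```python
# Requirements per mode, restating the FAQ answer above.
SWITCH_REQUIREMENT = {
    "balance-rr": "switch must support etherchannel (trunking)",
    "balance-xor": "switch must support etherchannel (trunking)",
    "802.3ad": "switch must support IEEE 802.3ad Dynamic Link Aggregation",
    "balance-tlb": "no special switch support; needs suitable device drivers",
    "balance-alb": "no special switch support; needs suitable device drivers",
    "active-backup": "any Layer-II switch",
}


def switch_requirement(mode):
    """Look up the requirement for a mode name; unknown modes are reported as such."""
    return SWITCH_REQUIREMENT.get(mode, "not covered by the FAQ text")


if __name__ == "__main__":
    for mode in ("active-backup", "802.3ad", "balance-alb"):
        print(mode, "->", switch_requirement(mode))
```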
Where does a bonding device get its MAC address from?
If not explicitly configured (with ifconfig or ip link), the
MAC address of the bonding device is taken from its first slave
device. This MAC address is then passed to all following slaves and
remains persistent (even if the first slave is removed) until the
bonding device is brought down or reconfigured.
If you wish to change the MAC address, you can set it with
ifconfig or ip link:
# ifconfig bond0 hw ether 00:11:22:33:44:55
# ip link set bond0 address 66:77:88:99:aa:bb
The MAC address can be also changed by bringing down/up the
device and then changing its slaves (or their order):
# ifconfig bond0 down ; modprobe -r bonding
# ifconfig bond0 .... up
# ifenslave bond0 eth...
This method will automatically take the address from the next
slave that is added.
To restore your slaves' MAC addresses, you need to detach them
from the bond (`ifenslave -d bond0 eth0'). The bonding driver will
then restore the MAC addresses that the slaves had before they were
enslaved.
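The MAC address rules are easier to follow with a toy model. The class below is not driver code; it only mimics the behaviour described above under simplifying assumptions: the bond takes the MAC of its first slave, keeps it when that slave is removed, and a detached slave gets its original MAC back. All names and addresses are illustrative.

```python
class BondModel:
    """Toy model of how a bond picks and keeps its MAC address."""

    def __init__(self):
        self.mac = None       # MAC currently used by the bond
        self.original = {}    # slave name -> MAC it had before enslaving

    def enslave(self, name, mac):
        """Add a slave; the first one donates its MAC to the bond."""
        self.original[name] = mac
        if self.mac is None:
            self.mac = mac
        return self.mac       # address the slave now uses

    def release(self, name):
        """Detach a slave (like ifenslave -d) and return its restored MAC."""
        return self.original.pop(name)


if __name__ == "__main__":
    bond = BondModel()
    bond.enslave("eth0", "00:0c:29:c6:be:59")
    bond.enslave("eth1", "00:0c:29:c6:be:63")
    restored = bond.release("eth0")
    # The bond keeps the address taken from eth0 even after eth0 is gone.
    print("bond:", bond.mac, "eth0 restored to:", restored)
```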
The latest version of the bonding driver can be found in the latest
version of the linux kernel, found on
http://kernel.org.
The latest version of this document can be found in either the latest
kernel source (named Documentation/networking/bonding.txt), or on the
bonding site.
Discussions regarding the bonding driver take place primarily on the
bonding-devel mailing list, hosted at sourceforge.net. If you have questions
or problems, post them to the list.
The administrative interface (to subscribe or unsubscribe) can be found
at:
https://lists.sourceforge.net/lists/listinfo/bonding-devel
Donald Becker's Ethernet Drivers and diag programs may be found at :
http://www.scyld.com/network/
You will also find a lot of information regarding Ethernet, NWay, MII,
etc. at
www.scyld.com.
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: December 12, 2020