CHAPTER 7
More Complicated Configurations
This chapter describes more complicated Lustre configurations.
Servers megan and oscar each have three TCP NICs (eth0, eth1, and eth2) and an Elan NIC. The eth2 NIC is used for management purposes and should not be used by LNET. TCP clients have a single TCP interface and Elan clients have a single Elan interface.
Options under modprobe.conf are used to specify the networks available to a node. You can choose between two options: the networks option, which explicitly lists the networks available, and the ip2nets option, which provides a list-matching lookup. Only one of the two can be used at any one time. The order of LNET lines in modprobe.conf is important when configuring multi-homed servers: if a server node can be reached using more than one network, the first network specified in modprobe.conf is used.
options lnet networks=tcp0(eth0),tcp1(eth1)
options lnet networks=elan0
TCP-only clients:
options lnet networks=tcp0
options lnet networks="iib0"
options kiiblnd ipif_basename=ib0
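The ordering rule for the networks option can be illustrated with a small Python sketch. This is not LNET's own parser; `parse_networks` is a hypothetical helper that only splits a networks value into its entries so that the first-listed network is easy to see.

```python
import re

def parse_networks(value):
    """Split a 'networks=' value into (network, [interfaces]) pairs, in order.

    Illustrative only: LNET parses this option itself inside the kernel module.
    """
    entries = []
    # Each entry is a network name, optionally followed by "(if1,if2,...)".
    for match in re.finditer(r"([a-z0-9]+)(?:\(([^)]*)\))?", value):
        net, ifaces = match.group(1), match.group(2)
        entries.append((net, ifaces.split(",") if ifaces else []))
    return entries

nets = parse_networks("tcp0(eth0),tcp1(eth1)")
print(nets)        # [('tcp0', ['eth0']), ('tcp1', ['eth1'])]
print(nets[0][0])  # tcp0, the first network listed
```

An empty interface list, as in `parse_networks("elan0")`, means that no interfaces were named for that network.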
The ip2nets option is typically used to provide a single, universal modprobe.conf file that can be run on all servers and clients. An individual node identifies the locally available networks based on the listed IP address patterns that match the node's local IP addresses. Note that the IP address patterns listed in the ip2nets option are only used to identify the networks that an individual node should instantiate. They are not used by LNET for any other communications purpose. The servers megan and oscar have eth0 IP addresses 192.168.0.2 and .4. They also have IP over Elan (eip) addresses of 132.6.1.2 and .4. TCP clients have IP addresses 192.168.0.5-255. Elan clients have eip addresses of 132.6.[2-3].2, .4, .6, .8.
modprobe.conf is identical on all nodes:
options lnet 'ip2nets="tcp0(eth0,eth1) 192.168.0.[2,4]; \
tcp0 192.168.0.*; elan0 132.6.[1-3].[2-8/2]"'
Note - LNET lines in modprobe.conf are only used by the local node to determine what to call its interfaces. They are not used for routing decisions.
Because megan and oscar match the first rule, LNET uses eth0 and eth1 for tcp0 on those machines. Although they also match the second rule, it is the first matching rule for a particular network that is used. The servers also match the (only) Elan rule. The [2-8/2] format matches the range 2-8 stepping by 2; that is 2,4,6,8. For example, clients at 132.6.3.5 would not find a matching Elan network.
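The matching behavior described above can be sketched in Python. This is a simplified model for illustration, not the LNET implementation: the helper names (`octet_matches`, `ip_matches`, `local_networks`) are hypothetical, and the sketch assumes each rule has exactly the form "network[(interfaces)] pattern" with a four-field address pattern.

```python
def octet_matches(pattern, value):
    """Return True if one address-field pattern matches the integer value."""
    if pattern == "*":
        return True
    if pattern.startswith("[") and pattern.endswith("]"):
        # A bracket holds comma-separated items: "n", "lo-hi", or "lo-hi/step".
        for item in pattern[1:-1].split(","):
            rng, _, step = item.partition("/")
            lo, _, hi = rng.partition("-")
            lo, hi = int(lo), int(hi or lo)
            if lo <= value <= hi and (value - lo) % int(step or 1) == 0:
                return True
        return False
    return int(pattern) == value

def ip_matches(pattern, ip):
    """Return True if a dotted address pattern matches a dotted IPv4 address."""
    fields = pattern.split(".")
    octets = [int(o) for o in ip.split(".")]
    return len(fields) == 4 and all(
        octet_matches(p, o) for p, o in zip(fields, octets))

def local_networks(ip2nets, local_ips):
    """Return {network: interfaces}, keeping the first matching rule per network."""
    chosen = {}
    for rule in ip2nets.split(";"):
        net_spec, pattern = rule.split()
        net, _, ifaces = net_spec.partition("(")
        if net not in chosen and any(ip_matches(pattern, ip) for ip in local_ips):
            # An empty list means the rule did not name any interfaces.
            chosen[net] = ifaces.rstrip(")").split(",") if ifaces else []
    return chosen

RULES = "tcp0(eth0,eth1) 192.168.0.[2,4]; tcp0 192.168.0.*; elan0 132.6.[1-3].[2-8/2]"

print(local_networks(RULES, ["192.168.0.2", "132.6.1.2"]))  # megan
print(local_networks(RULES, ["192.168.0.5"]))               # a TCP client
print(local_networks(RULES, ["132.6.3.5"]))                 # no Elan match
```

For megan the sketch selects tcp0 on eth0 and eth1 from the first rule and ignores the second tcp0 rule, while the address 132.6.3.5 yields an empty result because 5 is not in the stepped range 2, 4, 6, 8.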
For the combined MGS/MDT with TCP network, run:
$ mkfs.lustre --fsname spfs --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt
For the MGS on a separate node with TCP network, run:
$ mkfs.lustre --mgs /dev/sda
$ mkdir -p /mnt/mgs
$ mount -t lustre /dev/sda /mnt/mgs
To start the MDT on node mds16 with the MGS on node mgs16, run:
$ mkfs.lustre --fsname=spfs --mdt --mgsnode=mgs16@tcp0 /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt
To start the OST on a TCP-based network, run:
$ mkfs.lustre --fsname spfs --ost --mgsnode=mgs16@tcp0 /dev/sda
$ mkdir -p /mnt/test/ost0
$ mount -t lustre /dev/sda /mnt/test/ost0
TCP clients can use the host name or IP address of the MDS. To mount the file system, run:
mount -t lustre megan@tcp0:/mdsA/client /mnt/lustre
To start the Elan clients, run:
mount -t lustre 2@elan0:/mdsA/client /mnt/lustre
Servers megan and oscar are on the Elan network with eip addresses 132.6.1.2 and .4. Megan is also on the TCP network at 192.168.0.2 and routes between TCP and Elan. There is also a standalone router, router1, at Elan 132.6.1.10 and TCP 192.168.0.10. Clients are on either Elan or TCP.
modprobe.conf is identical on all nodes:
options lnet 'ip2nets="tcp0 192.168.0.*; elan0 132.6.1.*"' \
'routes="tcp [2,10]@elan0; elan 192.168.0.[2,10]@tcp0"'
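The routes setting above pairs each remote network with the gateway NIDs that reach it: megan (Elan ID 2, TCP 192.168.0.2) and router1 (Elan ID 10, TCP 192.168.0.10). The following Python sketch expands the bracketed lists to make that explicit. It is an illustration only; `expand` and `parse_routes` are hypothetical helpers, and the sketch handles only comma-separated lists, not ranges.

```python
import re

def expand(pattern):
    """Expand the first "[a,b,...]" list in a NID pattern into concrete NIDs."""
    m = re.search(r"\[([^\]]*)\]", pattern)
    if not m:
        return [pattern]
    return [pattern[:m.start()] + item + pattern[m.end():]
            for item in m.group(1).split(",")]

def parse_routes(routes):
    """Map each remote network to the list of gateway NIDs that reach it."""
    table = {}
    for entry in routes.split(";"):
        remote_net, gateways = entry.split()
        table.setdefault(remote_net, []).extend(expand(gateways))
    return table

table = parse_routes("tcp [2,10]@elan0; elan 192.168.0.[2,10]@tcp0")
for remote_net, gateways in table.items():
    print(remote_net, "via", gateways)
```

Read this way, a node on the Elan network reaches the tcp network through 2@elan0 or 10@elan0, and a node on the TCP network reaches the elan network through 192.168.0.2@tcp0 or 192.168.0.10@tcp0.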
modprobe lnet
lctl network configure
To start megan and oscar, run:
$ mkfs.lustre --fsname spfs --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt
$ mount -t lustre mgs16@tcp0,1@elan:/testfs /mnt/testfs
mount -t lustre megan:/mdsA/client /mnt/lustre/
mount -t lustre 2@elan0:/mdsA/client /mnt/lustre
There is one OSS with two InfiniBand HCAs. Lustre clients have only one InfiniBand HCA, using the native Lustre driver o2iblnd. LNET balances the load across both HCAs on the OSS.
Lustre users have options available on the following networks.
options lnet ip2nets="o2ib0(ib0),o2ib1(ib1) 192.168.10.[101-102]"
options lnet ip2nets="o2ib0(ib0) 192.168.10.[103-253/2]"
options lnet ip2nets="o2ib1(ib0) 192.168.10.[102-254/2]"
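The two single-network rules above split clients by the parity of the last address field: the stepped range starting at 103 covers odd addresses and the one starting at 102 covers even addresses. A minimal Python sketch of that split follows. The function `client_network` is a hypothetical helper written for this example, and it checks the rules in the order they are listed.

```python
def client_network(ip):
    """Return the o2ib network a client address selects under the two client rules."""
    prefix, _, last = ip.rpartition(".")
    if prefix != "192.168.10":
        return None
    host = int(last)
    if host in range(103, 254, 2):   # [103-253/2]: odd addresses 103..253
        return "o2ib0"
    if host in range(102, 255, 2):   # [102-254/2]: even addresses 102..254
        return "o2ib1"
    return None

for ip in ("192.168.10.103", "192.168.10.104", "192.168.10.253"):
    print(ip, "->", client_network(ip))
```

Because neighboring addresses alternate between o2ib0 and o2ib1, clients are spread statically across the two HCAs of the OSS.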
To start the MGS and MDT server, run:
modprobe lnet
$ mkfs.lustre --fsname lustre --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt
$ mount -t lustre mgs@o2ib0:/lustre /mnt/mdt
$ mkfs.lustre --fsname lustre --ost --mgsnode=mds@o2ib0 /dev/sda
$ mkdir -p /mnt/test/ost
$ mount -t lustre /dev/sda /mnt/test/ost
$ mount -t lustre mgs@o2ib0:/lustre /mnt/ost
mount -t lustre 192.168.10.101@o2ib0,192.168.10.102@o2ib1:/mds/client /mnt/lustre
To aggregate bandwidth across both rails of a dual-rail IB cluster (o2iblnd)[1] using LNET, consider these points:
As an example, consider a two-rail IB cluster running the OFA stack (OFED) with these IPoIB address assignments.
          ib0                  ib1
Servers   192.168.0.*          192.168.1.*
Clients   192.168.[2-127].*    192.168.[128-253].*
You could create these configurations:
ip2nets="o2ib0(ib0), o2ib1(ib1) 192.168.[0-1].* #all servers;\
o2ib0(ib0) 192.168.[2-253].[0-252/2] #even clients;\
o2ib1(ib1) 192.168.[2-253].[1-253/2] #odd clients"
This configuration gives every server two NIDs, one on each network, and statically load-balances clients between the rails.
ip2nets="o2ib0(ib0) 192.168.[0-1].[0-252/2] #even servers;\
o2ib1(ib1) 192.168.[0-1].[1-253/2] #odd servers;\
o2ib0(ib0),o2ib1(ib1) 192.168.[2-253].* #clients"
This configuration gives every server a single NID on one rail or the other. Clients have a NID on both rails.
ip2nets="o2ib0(ib0),o2ib2(ib1) 192.168.[0-1].[0-252/2] #even servers;\
o2ib1(ib0),o2ib3(ib1) 192.168.[0-1].[1-253/2] #odd servers;\
o2ib0(ib0),o2ib3(ib1) 192.168.[2-253].[0-252/2] #even clients;\
o2ib1(ib0),o2ib2(ib1) 192.168.[2-253].[1-253/2] #odd clients"
This configuration includes two additional proxy o2ib networks to work around Lustre's simplistic NID selection algorithm. It connects "even" clients to "even" servers with o2ib0 on rail0, and "odd" servers with o2ib3 on rail1. Similarly, it connects "odd" clients to "odd" servers with o2ib1 on rail0, and "even" servers with o2ib2 on rail1.
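The pairing described above can be checked with a short Python sketch that models the four rules by address parity. It is a simplification for illustration: `node_networks` and `shared` are hypothetical helpers, servers are identified only by a third address field of 0 or 1, and the upper bounds of the ranges are not enforced.

```python
def node_networks(ip):
    """Return {network: interface} for a node under the four-network configuration."""
    octets = [int(o) for o in ip.split(".")]
    is_server = octets[2] in (0, 1)   # servers: 192.168.[0-1].*
    is_even = octets[3] % 2 == 0
    if is_server:
        if is_even:
            return {"o2ib0": "ib0", "o2ib2": "ib1"}
        return {"o2ib1": "ib0", "o2ib3": "ib1"}
    if is_even:
        return {"o2ib0": "ib0", "o2ib3": "ib1"}
    return {"o2ib1": "ib0", "o2ib2": "ib1"}

def shared(client_ip, server_ip):
    """Networks, with the client-side interface, common to a client and a server."""
    client, server = node_networks(client_ip), node_networks(server_ip)
    return {net: iface for net, iface in client.items() if net in server}

print(shared("192.168.2.4", "192.168.0.2"))  # even client, even server
print(shared("192.168.2.4", "192.168.0.3"))  # even client, odd server
print(shared("192.168.2.5", "192.168.0.3"))  # odd client, odd server
print(shared("192.168.2.5", "192.168.0.2"))  # odd client, even server
```

Every client and server pair shares exactly one network, so the rail used for a given pair is fixed by the configuration rather than left to NID selection.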
Copyright © 2008 Sun Microsystems, Inc. All Rights Reserved.