CHAPTER 7

More Complicated Configurations

This chapter describes more complicated Lustre configurations: multihomed servers, Elan to TCP routing, load balancing with InfiniBand, and multi-rail configurations with LNET.


7.1 Multihomed Servers

Servers megan and oscar each have three TCP NICs (eth0, eth1, and eth2) and an Elan NIC. The eth2 NIC is used for management purposes and should not be used by LNET. TCP clients have a single TCP interface and Elan clients have a single Elan interface.

7.1.1 Modprobe.conf

Options in modprobe.conf specify the networks available to a node. Two options are available: the networks option, which explicitly lists the available networks, and the ip2nets option, which provides a list-matching lookup. Only one of the two can be used at a time. The order of LNET lines in modprobe.conf matters when configuring multihomed servers: if a server node can be reached through more than one network, the first network specified in modprobe.conf is used.

Networks

On the servers:

options lnet networks=tcp0(eth0),tcp1(eth1)

Elan-only clients:

options lnet networks=elan0

TCP-only clients:

options lnet networks=tcp0

IB-only clients:

options lnet networks="iib0"
options kiiblnd ipif_basename=ib0


Note - In the case of TCP-only clients, all available IP interfaces are used for tcp0 since the interfaces are not specified. If there is more than one, the IP of the first one found is used to construct the tcp0 ID.


ip2nets

The ip2nets option is typically used to provide a single, universal modprobe.conf file that can be run on all servers and clients. An individual node identifies the locally available networks based on the listed IP address patterns that match the node's local IP addresses. Note that the IP address patterns listed in the ip2nets option are only used to identify the networks that an individual node should instantiate. They are not used by LNET for any other communications purpose.

The servers megan and oscar have eth0 IP addresses 192.168.0.2 and .4. They also have IP over Elan (eip) addresses of 132.6.1.2 and .4. TCP clients have IP addresses 192.168.0.5-255. Elan clients have eip addresses of 132.6.[2-3].2, .4, .6, .8.

modprobe.conf is identical on all nodes:

options lnet 'ip2nets="tcp0(eth0,eth1) 192.168.0.[2,4]; \
tcp0 192.168.0.*; elan0 132.6.[1-3].[2-8/2]"'


Note - LNET lines in modprobe.conf are only used by the local node to determine what to call its interfaces. They are not used for routing decisions.


Because megan and oscar match the first rule, LNET uses eth0 and eth1 for tcp0 on those machines. Although they also match the second rule, only the first matching rule for a particular network is used. The servers also match the (only) Elan rule. The [2-8/2] format matches the range 2-8 stepping by 2; that is, 2, 4, 6, and 8. A client at 132.6.3.5, for example, would not find a matching Elan network.
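The matching behavior described above can be sketched in Python. This is an illustrative model only, not LNET's parser: the names expand_octet, ip_matches, and networks_for are hypothetical, and the rule list is transcribed by hand from the ip2nets line rather than parsed from it.

```python
def expand_octet(expr):
    """Expand one octet pattern ('*', '5', '[2,4]' or '[2-8/2]') into a set of ints."""
    if expr == "*":
        return set(range(256))
    values = set()
    for part in expr.strip("[]").split(","):
        rng, _, step = part.partition("/")
        low, _, high = rng.partition("-")
        high = high or low  # a single number is a range of one
        values.update(range(int(low), int(high) + 1, int(step or 1)))
    return values


def ip_matches(pattern, ip):
    """Return True if a dotted-quad address matches a four-octet pattern."""
    return all(int(octet) in expand_octet(expr)
               for expr, octet in zip(pattern.split("."), ip.split(".")))


# Hand-transcribed from the ip2nets line: (network, interfaces, pattern).
RULES = [
    ("tcp0", "eth0,eth1", "192.168.0.[2,4]"),
    ("tcp0", "", "192.168.0.*"),
    ("elan0", "", "132.6.[1-3].[2-8/2]"),
]


def networks_for(local_ips):
    """Map each network to the interfaces of its first matching rule."""
    chosen = {}
    for net, ifaces, pattern in RULES:
        if net not in chosen and any(ip_matches(pattern, ip) for ip in local_ips):
            chosen[net] = ifaces
    return chosen
```

Calling networks_for(["192.168.0.2", "132.6.1.2"]) models megan: the first tcp0 rule wins, so the second tcp0 rule is skipped, and the Elan rule also matches.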

7.1.2 Start Servers

For a combined MGS/MDT on a TCP network, run:

$ mkfs.lustre --fsname spfs --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt

- OR -

For an MGS on a separate node on a TCP network, run:

$ mkfs.lustre --mgs /dev/sda
$ mkdir -p /mnt/mgs
$ mount -t lustre /dev/sda /mnt/mgs

To start the MDT on node mds16 with the MGS on node mgs16, run:

$ mkfs.lustre --fsname=spfs --mdt --mgsnode=mgs16@tcp0 /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt

To start an OST on a TCP-based network, run:

$ mkfs.lustre --fsname spfs --ost --mgsnode=mgs16@tcp0 /dev/sda
$ mkdir -p /mnt/test/ost0
$ mount -t lustre /dev/sda /mnt/test/ost0

7.1.3 Start Clients

TCP clients can use the host name or IP address of the MDS. Run:

mount -t lustre megan@tcp0:/mdsA/client /mnt/lustre

To start the Elan clients, run:

mount -t lustre 2@elan0:/mdsA/client /mnt/lustre


Note - If the MGS node has multiple interfaces (for instance, cfs21 and 1@elan), only the client mount command has to change. The MGS NID specifier must be an appropriate nettype for the client (for example, a TCP client could use uml1@tcp0, and an Elan client could use 1@elan). Alternatively, a list of all MGS NIDs can be given, and the client chooses the correct one. For example:

$ mount -t lustre mgs16@tcp0,1@elan:/testfs /mnt/testfs
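How a client might pick a usable NID from such a list can be sketched as follows. This is a simplified model, not the Lustre mount code: the function names are hypothetical, and the selection rule (first listed NID on a network the client also has) is an assumption made for illustration. The sketch relies on LNET treating a network name without a number as network 0, so elan and elan0 name the same network.

```python
def normalize_net(net):
    """Append the implicit network number: 'elan' becomes 'elan0'."""
    return net if net[-1:].isdigit() else net + "0"


def parse_mount_source(source):
    """Split 'nid[,nid...]:/fsname' into (address, network) pairs and the fsname."""
    nid_list, _, fsname = source.partition(":/")
    nids = []
    for nid in nid_list.split(","):
        address, _, net = nid.partition("@")
        nids.append((address, normalize_net(net)))
    return nids, fsname


def pick_mgs_nid(source, local_nets):
    """Return the first MGS NID on a network the client also has, or None."""
    nids, _ = parse_mount_source(source)
    local = {normalize_net(n) for n in local_nets}
    for address, net in nids:
        if net in local:
            return f"{address}@{net}"
    return None
```

With the mount source from the note, a TCP-only client would select mgs16@tcp0 and an Elan-only client would select 1@elan0.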



7.2 Elan to TCP Routing

Servers megan and oscar are on the Elan network with eip addresses 132.6.1.2 and .4. Megan is also on the TCP network at 192.168.0.2 and routes between TCP and Elan. There is also a standalone router, router1, at Elan 132.6.1.10 and TCP 192.168.0.10. Clients are on either Elan or TCP.

7.2.1 Modprobe.conf

modprobe.conf is identical on all nodes:

options lnet 'ip2nets="tcp0 192.168.0.*; elan0 132.6.1.*"' \
'routes="tcp [2,10]@elan0; elan 192.168.0.[2,10]@tcp0"'
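To make the routes option concrete, the sketch below expands the routes string into a table of gateways per target network and looks up which gateways a node would use. It is a minimal model under stated assumptions, not LNET's implementation: all function names are hypothetical, only comma lists such as [2,10] are expanded, and a node is assumed to need no gateway for a network it is directly attached to.

```python
import re


def normalize_net(net):
    """Append the implicit network number: 'tcp' becomes 'tcp0'."""
    return net if net[-1:].isdigit() else net + "0"


def expand_gateways(spec):
    """Expand '192.168.0.[2,10]@tcp0' into one NID per listed value."""
    address, _, net = spec.partition("@")
    net = normalize_net(net)
    match = re.search(r"\[([^\]]+)\]", address)
    if not match:
        return [f"{address}@{net}"]
    head, tail = address[:match.start()], address[match.end():]
    return [f"{head}{item}{tail}@{net}" for item in match.group(1).split(",")]


def parse_routes(routes):
    """Build {target network: [gateway NIDs]} from a routes string."""
    table = {}
    for entry in routes.split(";"):
        target, gateway_spec = entry.split()
        table.setdefault(normalize_net(target), []).extend(expand_gateways(gateway_spec))
    return table


def gateways_to(table, local_nets, target_net):
    """Gateways reachable on a local network that lead to target_net."""
    local = {normalize_net(n) for n in local_nets}
    target = normalize_net(target_net)
    if target in local:
        return []  # directly attached, no gateway needed
    return [gw for gw in table.get(target, []) if gw.split("@")[1] in local]
```

For a TCP-only client, gateways_to(parse_routes(...), ["tcp0"], "elan0") yields the TCP-side NIDs of megan and router1.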

7.2.2 Start Servers

To start router1, run:

modprobe lnet
lctl network configure

To start megan and oscar, run:

$ mkfs.lustre --fsname spfs --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt	
$ mount -t lustre mgs16@tcp0,1@elan:/testfs /mnt/testfs

7.2.3 Start Clients

For the TCP client, run:

mount -t lustre megan:/mdsA/client /mnt/lustre/

For the Elan client, run:

mount -t lustre 2@elan0:/mdsA/client /mnt/lustre


7.3 Load Balancing with InfiniBand

There is one OSS with two InfiniBand HCAs. Lustre clients have only one InfiniBand HCA each and use the native Lustre o2iblnd driver. Load balancing across both HCAs on the OSS is done with the help of LNET.

7.3.1 Modprobe.conf

The following modprobe.conf options apply to each type of node.

Dual-HCA OSS:

options lnet ip2nets="o2ib0(ib0),o2ib1(ib1) 192.168.10.[101-102]"

Clients with odd IP addresses:

options lnet ip2nets="o2ib0(ib0) 192.168.10.[103-253/2]"

Clients with even IP addresses:

options lnet ip2nets="o2ib1(ib0) 192.168.10.[102-254/2]"
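The two client rules split clients by the parity of the last address octet, which is what spreads them across the two OSS HCAs. The short sketch below models only that split; client_network is a hypothetical helper, and the ranges are copied from the [103-253/2] and [102-254/2] patterns.

```python
def client_network(last_octet):
    """Return the o2ib network a client joins, based on its last address octet."""
    if last_octet in range(103, 254, 2):  # [103-253/2]: odd addresses
        return "o2ib0"
    if last_octet in range(102, 255, 2):  # [102-254/2]: even addresses
        return "o2ib1"
    return None


# Count how a fully populated client range 103-254 is divided.
counts = {}
for octet in range(103, 255):
    net = client_network(octet)
    counts[net] = counts.get(net, 0) + 1
```

In this model the clients divide evenly between o2ib0 and o2ib1, so each HCA on the OSS serves half of them.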

7.3.2 Start Servers

On the MGS and MDT server, load the LNET module:

modprobe lnet

To start the MGS and MDT, run:

$ mkfs.lustre --fsname lustre --mdt --mgs /dev/sda
$ mkdir -p /mnt/test/mdt
$ mount -t lustre /dev/sda /mnt/test/mdt	
$ mount -t lustre mgs@o2ib0:/lustre /mnt/mdt

To start the OSS, run:

$ mkfs.lustre --fsname lustre --ost --mgsnode=mds@o2ib0 /dev/sda
$ mkdir -p /mnt/test/ost
$ mount -t lustre /dev/sda /mnt/test/ost	
$ mount -t lustre mgs@o2ib0:/lustre /mnt/ost

7.3.3 Start Clients

For the InfiniBand client, run:

mount -t lustre 192.168.10.101@o2ib0,192.168.10.102@o2ib1:/mds/client /mnt/lustre


7.4 Multi-Rail Configurations with LNET

To aggregate bandwidth across both rails of a dual-rail IB cluster (o2iblnd)[1] using LNET, consider these points:

As an example, consider a two-rail IB cluster running the OFA stack (OFED) with these IPoIB address assignments.

            ib0                  ib1
Servers     192.168.0.*          192.168.1.*
Clients     192.168.[2-127].*    192.168.[128-253].*

You could create these configurations:

ip2nets="o2ib0(ib0),o2ib1(ib1)  192.168.[0-1].*            #all servers;\
         o2ib0(ib0)             192.168.[2-253].[0-252/2]  #even clients;\
         o2ib1(ib1)             192.168.[2-253].[1-253/2]  #odd clients"

This configuration gives every server two NIDs, one on each network, and statically load-balances clients between the rails.

ip2nets="o2ib0(ib0)             192.168.[0-1].[0-252/2]    #even servers;\
         o2ib1(ib1)             192.168.[0-1].[1-253/2]    #odd servers;\
         o2ib0(ib0),o2ib1(ib1)  192.168.[2-253].*          #clients"

This configuration gives every server a single NID on one rail or the other. Clients have a NID on both rails.

ip2nets="o2ib0(ib0),o2ib2(ib1)  192.168.[0-1].[0-252/2]    #even servers;\
         o2ib1(ib0),o2ib3(ib1)  192.168.[0-1].[1-253/2]    #odd servers;\
         o2ib0(ib0),o2ib3(ib1)  192.168.[2-253].[0-252/2]  #even clients;\
         o2ib1(ib0),o2ib2(ib1)  192.168.[2-253].[1-253/2]  #odd clients"

This configuration includes two additional proxy o2ib networks to work around Lustre's simplistic NID selection algorithm. It connects "even" clients to "even" servers with o2ib0 on rail0, and "odd" servers with o2ib3 on rail1. Similarly, it connects "odd" clients to "odd" servers with o2ib1 on rail0, and "even" servers with o2ib2 on rail1.
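The pairing described above can be checked with a small sketch. The table is transcribed by hand from the four ip2nets rules; the names NETS, parity, and shared_rail are hypothetical, and parity is taken from the last octet of the address that matches the rule. The sketch only finds the network a client and server have in common. It does not model LNET's NID selection.

```python
# Transcribed from the ip2nets rules: (role, parity) -> {network: interface}.
NETS = {
    ("server", "even"): {"o2ib0": "ib0", "o2ib2": "ib1"},
    ("server", "odd"): {"o2ib1": "ib0", "o2ib3": "ib1"},
    ("client", "even"): {"o2ib0": "ib0", "o2ib3": "ib1"},
    ("client", "odd"): {"o2ib1": "ib0", "o2ib2": "ib1"},
}


def parity(ip):
    """Classify an address as 'even' or 'odd' by its last octet."""
    return "even" if int(ip.rsplit(".", 1)[1]) % 2 == 0 else "odd"


def shared_rail(client_ip, server_ip):
    """Return (network, interface) for the single network both nodes share."""
    client = NETS[("client", parity(client_ip))]
    server = NETS[("server", parity(server_ip))]
    common = sorted(set(client) & set(server))
    if len(common) != 1:
        raise ValueError("expected exactly one shared network")
    return common[0], client[common[0]]
```

Every client and server pair shares exactly one network, so the choice of rail is fixed by the parity of the two addresses: equal parity uses ib0 (rail0), and different parity uses ib1 (rail1).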

 


[1] Multi-rail configurations are only supported by o2iblnd; other IB LNDs do not support multiple interfaces.