CHAPTER 6

Configuring Lustre - Examples

This chapter provides Lustre configuration examples and includes the following section:


6.1 Simple TCP Network

This section presents several examples of Lustre configurations on a simple TCP network.

6.1.1 Lustre with Combined MGS/MDT

Below is an example of a Lustre setup “datafs” that has a combined MGS/MDT, four OSTs, and a number of Lustre clients.

6.1.1.1 Installation Summary

6.1.1.2 Configuration Generation and Application

1. Install the Lustre RPMs (per Installing Lustre) on all nodes that are going to be part of the Lustre filesystem. Boot the nodes, including the clients, in the Lustre kernel.

2. Add the following line to modprobe.conf.

options lnet networks=tcp

3. Configure Lustre on the MGS/MDT node.

$ mkfs.lustre --fsname datafs --mdt --mgs /dev/sda

4. Make a mount point on MDT/MGS for the filesystem and mount it.

$ mkdir -p /mnt/data/mdt
$ mount -t lustre /dev/sda /mnt/data/mdt

5. Configure Lustre on all four OSTs.

$ mkfs.lustre --fsname datafs --ost --mgsnode=mds16@tcp0 /dev/sda
$ mkfs.lustre --fsname datafs --ost --mgsnode=mds16@tcp0 /dev/sdd
$ mkfs.lustre --fsname datafs --ost --mgsnode=mds16@tcp0 /dev/sda1
$ mkfs.lustre --fsname datafs --ost --mgsnode=mds16@tcp0 /dev/sdb


Note - When creating the filesystem, make sure you are not using the disk that holds the operating system.


6. Make a mount point on all the OSTs for the filesystem and mount it.

$ mkdir -p /mnt/data/ost0
$ mount -t lustre /dev/sda /mnt/data/ost0
 
$ mkdir -p /mnt/data/ost1
$ mount -t lustre /dev/sdd /mnt/data/ost1
 
$ mkdir -p /mnt/data/ost2
$ mount -t lustre /dev/sda1 /mnt/data/ost2
 
$ mkdir -p /mnt/data/ost3
$ mount -t lustre /dev/sdb /mnt/data/ost3
 
7. Mount the filesystem on each Lustre client.

$ mount -t lustre mds16@tcp0:/datafs /mnt/datafs
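
The OST steps above repeat the same three commands for each device, so they lend themselves to a loop. The following is a minimal dry-run sketch, not part of the Lustre tools: it only prints the commands so that they can be reviewed and then run on the node that owns each device. The function names ost_commands and plan_osts are hypothetical; the filesystem name, MGS NID, and devices are the ones used in this example.

```sh
#!/bin/sh
# Dry run: print the OST commands from the steps above instead of running them.
FSNAME=datafs
MGSNODE=mds16@tcp0

# Print the format and mount commands for one OST.
# $1 = OST index (used only for the mount point), $2 = block device
ost_commands() {
    echo "mkfs.lustre --fsname $FSNAME --ost --mgsnode=$MGSNODE $2"
    echo "mkdir -p /mnt/data/ost$1"
    echo "mount -t lustre $2 /mnt/data/ost$1"
}

# Print the commands for every device given as an argument, in index order.
plan_osts() {
    index=0
    for device in "$@"; do
        ost_commands "$index" "$device"
        index=$((index + 1))
    done
}

plan_osts /dev/sda /dev/sdd /dev/sda1 /dev/sdb
```

Because nothing is executed, the output can be compared line by line with steps 5 and 6 before any disk is formatted.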

6.1.2 Lustre with Separate MGS and MDT

The following example describes a Lustre filesystem “datafs” having an MGS and an MDT on separate nodes, four OSTs, and a number of Lustre clients.

6.1.2.1 Installation Summary

6.1.2.2 Configuration Generation and Application

1. Install the Lustre RPMs (per Installing Lustre) on all the nodes that are going to be a part of the Lustre filesystem. Boot the nodes in the Lustre kernel, including the clients.

2. Add the following line to modprobe.conf.

options lnet networks=tcp

3. Configure Lustre on the MGS node.

$ mkfs.lustre --mgs /dev/sda1

4. Make a mount point on MGS for the filesystem and mount it.

$ mkdir -p /mnt/mgs
$ mount -t lustre /dev/sda1 /mnt/mgs

5. Configure Lustre on the MDT node.

$ mkfs.lustre --fsname=datafs --mdt --mgsnode=mgsnode@tcp0 /dev/sda2

6. Make a mount point on the MDT node for the filesystem and mount it.

$ mkdir -p /mnt/data/mdt
$ mount -t lustre /dev/sda2 /mnt/data/mdt

7. Configure Lustre on all four OSTs.

$ mkfs.lustre --fsname datafs --ost --mgsnode=mgsnode@tcp0 /dev/sda
$ mkfs.lustre --fsname datafs --ost --mgsnode=mgsnode@tcp0 /dev/sdd
$ mkfs.lustre --fsname datafs --ost --mgsnode=mgsnode@tcp0 /dev/sda1
$ mkfs.lustre --fsname datafs --ost --mgsnode=mgsnode@tcp0 /dev/sdb

8. Make a mount point on all the OSTs for the filesystem and mount it.

$ mkdir -p /mnt/data/ost0
$ mount -t lustre /dev/sda /mnt/data/ost0
 
$ mkdir -p /mnt/data/ost1
$ mount -t lustre /dev/sdd /mnt/data/ost1
 
$ mkdir -p /mnt/data/ost2
$ mount -t lustre /dev/sda1 /mnt/data/ost2
 
$ mkdir -p /mnt/data/ost3
$ mount -t lustre /dev/sdb /mnt/data/ost3
 
9. Mount the filesystem on each Lustre client.

$ mount -t lustre mgsnode@tcp0:/datafs /mnt/datafs
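
Step 2 of this procedure edits modprobe.conf by hand on every node. A small helper can make that edit safe to repeat. The sketch below is an illustration under stated assumptions: the function name add_lnet_option is hypothetical, and the file is passed as an argument because its location (commonly /etc/modprobe.conf) is assumed rather than given here.

```sh
#!/bin/sh
# Append the LNET options line to a modprobe configuration file only if the
# exact line is not already present, so the step can be repeated safely.
# $1 = path to the file (assumed to be /etc/modprobe.conf on a real node)
add_lnet_option() {
    line='options lnet networks=tcp'
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$line" "$1" 2>/dev/null; then
        echo "$line" >> "$1"
    fi
}

# Example (as root, on each node):
#   add_lnet_option /etc/modprobe.conf
```

Trying the helper on a copy of the file first shows what it would change without touching the live configuration.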

6.1.2.3 Configuring Lustre with a CSV File

A new utility, the /usr/sbin/lustre_config script, can be used to configure Lustre 1.6. This script enables you to automate the formatting and setup of disks on multiple nodes.

Describe your entire installation in a Comma Separated Values (CSV) file and pass it to the script. The script contacts multiple Lustre targets simultaneously, formats the drives, updates modprobe.conf, and produces HA configuration files using definitions in the CSV file. (The lustre_config -h option shows several samples of CSV files.)



Note - The CSV file format is a file type that stores tabular data. Many popular spreadsheet programs, such as Microsoft Excel, can read from/write to CSV files.


How lustre_config Works

The lustre_config script parses each line in the CSV file and executes remote commands, like mkfs.lustre, to format each Lustre target in the Lustre cluster.
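
To make the line-by-line idea concrete, the sketch below turns one Lustre target line into the mkfs.lustre command it roughly corresponds to. It is a simplified approximation written for illustration, not the logic that lustre_config actually uses: the function name target_to_mkfs is hypothetical, only four fields are considered, and quoted fields that contain commas are not handled.

```sh
#!/bin/sh
# Illustration only: print a mkfs.lustre command for one Lustre target line.
# Fields used: 3 = device name, 5 = device type, 6 = fsname, 7 = mgs nids.
target_to_mkfs() {
    echo "$1" | awk -F, '{
        cmd = "mkfs.lustre"
        if ($6 != "") cmd = cmd " --fsname=" $6
        # A device type such as mgs|mdt becomes --mgs --mdt
        n = split($5, types, "[|]")
        for (i = 1; i <= n; i++) cmd = cmd " --" types[i]
        if ($7 != "") cmd = cmd " --mgsnode=" $7
        print cmd " " $3
    }'
}

target_to_mkfs 'ost1.clusterfs.com,options lnet networks=tcp,/dev/sda,/mnt/ost1,ost,,192.168.16.34@tcp0'
```

The real script also runs the command on the remote host named in the first field; the sketch stops at printing it.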

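Optionally, the lustre_config script can also:

- Verify network connectivity and hostnames in the cluster
- Configure Linux MD/LVM devices
- Update modprobe.conf with the Lustre networking module options
- Add the new Lustre targets to /etc/fstab
- Produce High-Availability software configurations (Heartbeat version 1 or 2)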

How to Create a CSV File

Five line formats are available for creating a CSV file. Each line format represents one type of target. The targets and their line formats are described below:

Linux MD device

The CSV line format is:

hostname, MD, md name, operation mode, options, raid level, component devices

Where:


Variable            Description
hostname            Hostname of the node in the cluster.
MD                  Marker of the MD device line.
md name             MD device name, for example: /dev/md0
operation mode      Operation mode, either create or remove. Default is create.
options             A ‘catchall’ for other mdadm options, for example: -c 128
raid level          RAID level: 0, 1, 4, 5, 6, 10, linear and multipath.
component devices   Block devices to be combined into the MD device. Multiple devices are separated by spaces or by using shell expansions, for example: /dev/sd{a,b,c}
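
As an illustration of the field order, the following sketch assembles an MD device line from its parts. The function name md_line is hypothetical; the host and device names are example values. An empty operation mode is passed to select the default (create).

```sh
#!/bin/sh
# Print one MD device line in the field order given above.
# $1 hostname, $2 md name, $3 operation mode (may be empty), $4 mdadm options,
# $5 raid level, remaining arguments: component devices
md_line() {
    host=$1 name=$2 mode=$3 opts=$4 level=$5
    shift 5
    # "$*" joins the remaining arguments with single spaces
    echo "$host,MD,$name,$mode,$opts,$level,$*"
}

md_line mds16.clusterfs.com /dev/md0 '' '-c 128' 5 /dev/sdb /dev/sdc /dev/sdd
```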


Linux LVM PV (Physical Volume)

The CSV line format is:

hostname, PV, pv names, operation mode, options

Where:


Variable            Description
hostname            Hostname of the node in the cluster.
PV                  Marker of the PV line.
pv names            Devices or loopback files to be initialized for later use by LVM or to wipe the label, for example: /dev/sda. Multiple devices or files are separated by spaces or by using shell expansions, for example: /dev/sd{a,b,c}
operation mode      Operation mode, either create or remove. Default is create.
options             A ‘catchall’ for other pvcreate/pvremove options, for example: -vv


Linux LVM VG (Volume Group)

The CSV line format is:

hostname, VG, vg name, operation mode, options, pv paths

Where:


Variable            Description
hostname            Hostname of the node in the cluster.
VG                  Marker of the VG line.
vg name             Name of the volume group, for example: ost_vg
operation mode      Operation mode, either create or remove. Default is create.
options             A ‘catchall’ for other vgcreate/vgremove options, for example: -s 32M
pv paths            Physical volumes to construct this VG, required by the create mode. Multiple PVs are separated by spaces or by using shell expansions, for example: /dev/sd[k-m]1


Linux LVM LV (Logical Volume)

The CSV line format is:

hostname, LV, lv name, operation mode, options, lv size, vg name

Where:


Variable            Description
hostname            Hostname of the node in the cluster.
LV                  Marker of the LV line.
lv name             Name of the logical volume to be created (optional) or path of the logical volume to be removed (required by the remove mode).
operation mode      Operation mode, either create or remove. Default is create.
options             A ‘catchall’ for other lvcreate/lvremove options, for example: -i 2 -I 128
lv size             Size [kKmMgGtT] to be allocated for the new LV. Default unit is megabytes (MB).
vg name             Name of the VG in which the new LV is created.
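
The three LVM line formats are usually written together, one layer on top of the next. The sketch below prints a PV, a VG, and an LV line for one host in the field orders given above. The function names pv_line, vg_line, and lv_line are hypothetical, the host, volume group, and device names are example values, and the operation mode is left empty so that the default (create) applies.

```sh
#!/bin/sh
# $1 hostname, remaining arguments: pv names
pv_line() { host=$1; shift; echo "$host,PV,$*"; }

# $1 hostname, $2 vg name, $3 vgcreate options, remaining arguments: pv paths
vg_line() { host=$1 vg=$2 opts=$3; shift 3; echo "$host,VG,$vg,,$opts,$*"; }

# $1 hostname, $2 lv name, $3 lvcreate options, $4 lv size, $5 vg name
lv_line() { echo "$1,LV,$2,,$3,$4,$5"; }

pv_line oss161.clusterfs.com /dev/md0 /dev/md1
vg_line oss161.clusterfs.com oss_data '-s 32M' /dev/md0 /dev/md1
lv_line oss161.clusterfs.com ost0 '-i 2 -I 128' 2G oss_data
```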


Lustre target

The CSV line format is:

hostname, module_opts, device name, mount point, device type, fsname, mgs nids, index, format options, mkfs options, mount options, failover nids

Where:


Variable            Description
hostname            Hostname of the node in the cluster. It must match uname -n.
module_opts         Lustre networking module options. Use the newline character (\n) to delimit multiple options.
device name         Lustre target (block device or loopback file).
mount point         Lustre target mount point.
device type         Lustre target type (mgs, mdt, ost, mgs|mdt, mdt|mgs).
fsname              Lustre filesystem name (limit is 8 characters).
mgs nids            NID(s) of the remote MGS node, required for MDT and OST targets. If this item is not given for an MDT, it is assumed that the MDT is also an MGS (according to mkfs.lustre).
index               Lustre target index.
format options      A ‘catchall’ that contains options to be passed to mkfs.lustre, for example: --device-size, --param, and so on.
mkfs options        Format options to be wrapped with --mkfsoptions= and passed to mkfs.lustre.
mount options       If this script is invoked with the -m option, the value of this item is wrapped with --mountfsoptions= and passed to mkfs.lustre; otherwise, the value is added to /etc/fstab.
failover nids       NID(s) of the failover partner node.




Note - For a single node, all NIDs are delimited by commas (','). To use comma-separated NIDs in a CSV file, they must be enclosed in quotation marks, for example: "lustre-mgs2,2@elan"

When multiple nodes are specified, they are delimited by a colon (':').

If you leave a field blank, it is set to its default value.
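
Some of the constraints in the table above can be checked before the file is handed to lustre_config. The following is a minimal sketch under stated assumptions, not a feature of the Lustre tools: the function name check_target_line is hypothetical, only a few rules are tested (field count, device type, fsname length, and the MGS NID for an OST), and fields are split on every comma, so quoted fields that contain commas are not handled.

```sh
#!/bin/sh
# Check one Lustre target line against constraints stated in the table above.
# Prints "ok" or a reason, and returns non-zero on failure.
check_target_line() {
    echo "$1" | awk -F, '
        NF > 12                 { print "too many fields"; exit 1 }
        $1 == ""                { print "missing hostname"; exit 1 }
        $5 !~ /^(mgs|mdt|ost|mgs[|]mdt|mdt[|]mgs)$/ {
                                  print "bad device type: " $5; exit 1 }
        length($6) > 8          { print "fsname longer than 8 characters"; exit 1 }
        $5 == "ost" && $7 == "" { print "ost requires mgs nids"; exit 1 }
                                { print "ok" }'
}

check_target_line 'ost1.clusterfs.com,options lnet networks=tcp,/dev/sda,/mnt/ost1,ost,,192.168.16.34@tcp0'
```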


The lustre_config.csv file looks like:

{mdtname}.{domainname},options lnet networks=tcp,/dev/sdb,/mnt/mdt,mgs|mdt
{ost2name}.{domainname},options lnet networks=tcp,/dev/sda,/mnt/ost1,ost,,192.168.16.34@tcp0
{ost1name}.{domainname},options lnet networks=tcp,/dev/sda,/mnt/ost0,ost,,192.168.16.34@tcp0


Note - In the first field of every row, provide the Fully Qualified Domain Name (FQDN) of the node. This applies to all nodes that are part of the filesystem. For example:

mdt1.clusterfs.com,options lnet networks=tcp,/dev/sdb,/mnt/mdt,mgs|mdt

- AND -

ost1.clusterfs.com,options lnet networks=tcp,/dev/sda,/mnt/ost1,ost,,192.168.16.34@tcp0
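
A quick way to catch short hostnames is to scan the first field of each row. The sketch below is a rough heuristic rather than a real FQDN validation: it only reports rows whose first field has no dot-separated domain part. The function name check_fqdn is hypothetical.

```sh
#!/bin/sh
# List rows of a CSV file whose first field does not look like an FQDN,
# that is, has no dot-separated domain part. Comment lines and blank lines
# are skipped. Returns non-zero if any such row is found.
# $1 = path to the CSV file
check_fqdn() {
    awk -F, '
        /^[ \t]*(#|$)/ { next }
        $1 !~ /^[^.]+\.[^.]/ { print FILENAME ":" NR ": " $1; bad = 1 }
        END { exit bad + 0 }' "$1"
}

# Example:
#   check_fqdn lustre_config.csv
```

A row that was wrapped across two lines also shows up in the report, because its continuation does not start with a hostname.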


Using CSV with lustre_config

Once you have created the CSV file, you can start to configure the filesystem by using the lustre_config script.

1. List the available parameters. At the command prompt, type:

$ lustre_config
lustre_config: Missing csv file!

Usage: lustre_config [options] <csv file>
This script is used to format and set up multiple lustre servers from a csv file.

Options:
-h  help and examples
-a  select all the nodes from the csv file to operate on
-w  hostname,hostname,...
    select the specified list of nodes (separated by commas) to operate on rather than all the nodes in the csv file
-x  hostname,hostname,...
    exclude the specified list of nodes (separated by commas)
-t  HAtype
    produce High-Availability software configurations
    The argument following -t is used to indicate the High-Availability software type. The HA software types which are currently supported are: hbv1 (Heartbeat version 1) and hbv2 (Heartbeat version 2).
-n  no net - don’t verify network connectivity and hostnames in the cluster
-d  configure Linux MD/LVM devices before formatting the Lustre targets
-f  force-format the Lustre targets using --reformat option OR you can specify --reformat in the ninth field of the target line in the csv file
-m  no fstab change - don’t modify /etc/fstab to add the new Lustre targets. If using this option, then the value of "mount options" item in the csv file will be passed to mkfs.lustre, else the value will be added into the /etc/fstab
-v  verbose mode

csv file
    is a spreadsheet that contains configuration parameters (separated by commas) for each target in a Lustre cluster

Example 1: Simple Lustre configuration with CSV (use the following command):

$ lustre_config -v -a -f lustre_config.csv 

This command starts the configuration of the nodes or targets listed in lustre_config.csv, prompting you for the password to log in with root access to the nodes. To avoid this prompt, configure a remote shell such as pdsh or SSH so that it can log in without a password.

After completing the above steps, the script makes Lustre target entries in the /etc/fstab file on Lustre server nodes, such as:

/dev/sdb  /mnt/mdt  lustre  defaults  0  0
/dev/sda  /mnt/ost  lustre  defaults  0  0

2. Run mount /dev/sdb and mount /dev/sda to start the Lustre services.
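
On a server with several targets, the mount commands can be derived from the fstab entries instead of being typed one by one. The following is a minimal dry-run sketch: the function name lustre_mount_commands is hypothetical, and it only prints the commands so that they can be reviewed first.

```sh
#!/bin/sh
# Print a mount command for every Lustre entry in an fstab-format file.
# $1 = path to the file (/etc/fstab on a server node)
lustre_mount_commands() {
    # Field 1 is the device and field 3 is the filesystem type;
    # comment lines are skipped.
    awk '$1 !~ /^#/ && $3 == "lustre" { print "mount " $1 }' "$1"
}

# Example:
#   lustre_mount_commands /etc/fstab
```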



Note - Use the /usr/sbin/lustre_createcsv script to collect information on Lustre targets from a running Lustre cluster and generate a CSV file. It is a reverse utility (compared to lustre_config) and should be run on the MGS node.


Example 2: More complicated Lustre configuration with CSV:

For RAID and LVM-based configuration, the lustre_config.csv file looks like this:

# Configuring RAID 5 on mds16.clusterfs.com
mds16.clusterfs.com,MD,/dev/md0,,-c 128,5,/dev/sdb /dev/sdc /dev/sdd

# Configuring multiple RAID 5 devices on oss161.clusterfs.com
oss161.clusterfs.com,MD,/dev/md0,,-c 128,5,/dev/sdb /dev/sdc /dev/sdd
oss161.clusterfs.com,MD,/dev/md1,,-c 128,5,/dev/sde /dev/sdf /dev/sdg

# Configuring LVM2-PV from the RAID 5 devices from the above steps on oss161.clusterfs.com
oss161.clusterfs.com,PV,/dev/md0 /dev/md1

# Configuring LVM2-VG from the PV and RAID 5 devices from the above steps on oss161.clusterfs.com
oss161.clusterfs.com,VG,oss_data,,-s 32M,/dev/md0 /dev/md1

# Configuring LVM2-LV from the VG, PV and RAID 5 devices from the above steps on oss161.clusterfs.com
oss161.clusterfs.com,LV,ost0,,-i 2 -I 128,2G,oss_data
oss161.clusterfs.com,LV,ost1,,-i 2 -I 128,2G,oss_data

# Configuring LVM2-PV on oss162.clusterfs.com
oss162.clusterfs.com,PV,/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

# Configuring LVM2-VG from the PV from the above steps on oss162.clusterfs.com
oss162.clusterfs.com,VG,vg_oss1,,-s 32M,/dev/sdb /dev/sdc /dev/sdd
oss162.clusterfs.com,VG,vg_oss2,,-s 32M,/dev/sde /dev/sdf /dev/sdg

# Configuring LVM2-LV from the VG and PV from the above steps on oss162.clusterfs.com
oss162.clusterfs.com,LV,ost3,,-i 3 -I 64,1G,vg_oss2
oss162.clusterfs.com,LV,ost2,,-i 3 -I 64,1G,vg_oss1

# Configuring the Lustre file system on MDS/MGS, OSS and OST with the RAID and LVM devices created above
mds16.clusterfs.com,options lnet networks=tcp,/dev/md0,/mnt/mdt,mgs|mdt,,,,,,,
oss161.clusterfs.com,options lnet networks=tcp,/dev/oss_data/ost0,/mnt/ost0,ost,,192.168.16.34@tcp0,,,,
oss161.clusterfs.com,options lnet networks=tcp,/dev/oss_data/ost1,/mnt/ost1,ost,,192.168.16.34@tcp0,,,,
oss162.clusterfs.com,options lnet networks=tcp,/dev/vg_oss1/ost2,/mnt/ost2,ost,,192.168.16.34@tcp0,,,,
oss162.clusterfs.com,options lnet networks=tcp,/dev/vg_oss2/ost3,/mnt/ost3,ost,,192.168.16.34@tcp0,,,,

Then run the following command:

$ lustre_config -v -a -d -f lustre_config.csv
 

This command creates RAID and LVM, and then configures Lustre on the nodes or targets specified in lustre_config.csv. The script prompts you for the password to log in with root access to the nodes.
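
In a file of this size it is easy for a Lustre target line to name a logical volume that no LV line creates. The sketch below cross-checks the two; it is an illustration under stated assumptions, not part of lustre_config. The function name check_lv_devices is hypothetical, every target device of the form /dev/<vg>/<lv> is assumed to be a logical volume defined in the same file, and quoted fields that contain commas are not handled.

```sh
#!/bin/sh
# Report Lustre target lines whose device looks like /dev/<vg>/<lv> but has no
# matching LV line (same hostname, vg name and lv name) in the same CSV file.
# Returns non-zero if a mismatch is found.
# $1 = path to the CSV file
check_lv_devices() {
    awk -F, '
        /^[ \t]*(#|$)/ { next }
        # LV line: field 3 is the lv name, field 7 is the vg name
        $2 == "LV" { defined[$1 "," $7 "/" $3] = 1; next }
        $2 == "MD" || $2 == "PV" || $2 == "VG" { next }
        # Lustre target line: field 3 is the device name
        $3 ~ "^/dev/[^/]+/[^/]+$" { n++; host[n] = $1; dev[n] = $3 }
        END {
            for (i = 1; i <= n; i++) {
                key = host[i] "," substr(dev[i], 6)   # drop the leading "/dev/"
                if (!(key in defined)) {
                    print "no LV line for " host[i] " " dev[i]
                    bad = 1
                }
            }
            exit bad + 0
        }' "$1"
}

# Example:
#   check_lv_devices lustre_config.csv
```

The comparison is done after the whole file has been read, so the order of LV lines and target lines does not matter.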

After completing the above steps, the script makes Lustre target entries in the /etc/fstab file on Lustre server nodes, such as:

For the MDS (MGS/MDT):

/dev/md0           /mnt/mdt   lustre  defaults  0  0

For an OSS:

/dev/vg_oss1/ost2  /mnt/ost2  lustre  defaults  0  0

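3. To start the Lustre services, mount each target listed in /etc/fstab on its server node. For example, for the entries shown above:

mount /mnt/mdt
mount /mnt/ost2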