CHAPTER 3

Installing Lustre

Lustre installation involves two procedures: meeting the installation prerequisites and installing the Lustre software, either from RPMs or from source code. This chapter includes these sections: Preparing to Install Lustre, Installing Lustre from RPMs, and Installing Lustre from Source Code.

Lustre can be installed from either packaged binaries (RPMs) or freely available source code. Installing from the package release is straightforward and is recommended for new users. Integrating Lustre into an existing kernel and building the associated Lustre software from source is a more involved process.

For either installation method, the following are required:



Note - When installing Lustre and creating components on devices, a certain amount of space is reserved, so less than 100% of storage space will be available. Lustre servers use the ext3 file system to store user-data objects and system data. By default, ext3 file systems reserve 5% of space that cannot be used by Lustre. Additionally, Lustre reserves up to 400 MB on each OST for journal use[1]. This reserved space is unusable for general storage. For this reason, you will see up to 400 MB of space used on each OST before any file object data is saved to it.
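As a rough illustration of the note above, the following sketch estimates how much space remains for file objects on a single OST. It assumes the default 5% ext3 reserve and the worst-case 400 MB journal; the function name ost_usable_mb is hypothetical, and actual figures depend on how the device was formatted.

```shell
# Rough estimate of usable space on one OST, in MB.
# Assumptions (taken from the note above, not measured):
#   - ext3 reserves 5% of the device
#   - the journal takes the worst-case 400 MB
ost_usable_mb() {
    size_mb=$1
    reserved_mb=$(( size_mb * 5 / 100 ))
    journal_mb=400
    echo $(( size_mb - reserved_mb - journal_mb ))
}

# Example: a 1,000,000 MB OST device
ost_usable_mb 1000000   # prints 949600
```

The estimate is deliberately pessimistic, so treat it as a planning floor rather than an exact figure.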



3.1 Preparing to Install Lustre

To successfully install and run Lustre, make sure the following installation prerequisites have been met:

3.1.1 Supported Operating System, Platform and Interconnect

Lustre 1.6 supports the following operating systems, platforms[2] and interconnects. To install Lustre from downloaded packages (RPMs), you must use a supported configuration.


| Configuration Component | Supported Type |
|---|---|
| Operating system | Red Hat Enterprise Linux 4 and 5; SuSE Linux Enterprise Server 9 and 10; Linux 2.6 and a kernel >= 2.6.16. NOTE: Lustre does not support security-enhanced (SE) Linux (including clients and servers). |
| Platform | x86, IA-64, x86-64 (EM64T and AMD64); PowerPC architectures (for clients only) and mixed-endian clusters |
| Interconnect | TCP/IP; Quadrics Elan 3 and 4; Myri-10G and Myrinet-2000; Mellanox; InfiniBand (Voltaire, OpenIB, Silverstorm and any OFED-supported InfiniBand adapter) |




Note - Lustre clients running on architectures with different endianness are supported. One limitation is that the PAGE_SIZE kernel macro on the client must be at least as large as the PAGE_SIZE of the server. In particular, ia64 clients with large pages (up to 64kB pages) can run with i386 servers (4kB pages). If you are running i386 clients with ia64 servers, you must compile the ia64 kernel with a 4kB PAGE_SIZE (so the server page size is not larger than the client page size).
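The page-size constraint in the note above can be checked before installation. The sketch below compares a client page size with a server page size, both in bytes. The function name page_size_ok and the SERVER_PAGE_SIZE variable are hypothetical; the server value is assumed to have been collected separately, for example by running getconf PAGESIZE on the server.

```shell
# Succeeds when the client page size is at least as large as the
# server page size. Both arguments are page sizes in bytes.
page_size_ok() {
    client_bytes=$1
    server_bytes=$2
    [ "$client_bytes" -ge "$server_bytes" ]
}

# SERVER_PAGE_SIZE is a hypothetical value gathered from the server.
SERVER_PAGE_SIZE=4096

# getconf PAGESIZE reports the page size of the local (client) node.
if page_size_ok "$(getconf PAGESIZE)" "$SERVER_PAGE_SIZE"; then
    echo "page size OK"
else
    echo "client page size is smaller than the server page size"
fi
```

With these inputs, an ia64 client with 64kB pages passes against an i386 server, while an i386 client fails against an ia64 server built with 64kB pages.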


3.1.2 Required Lustre Software

To install Lustre, the following are required:

These packages can be downloaded from the Lustre download site.

3.1.3 Required Tools and Utilities

Several third-party utilities are required:



Note - Lustre-patched e2fsprogs utility only needs to be installed on machines that mount backend (ldiskfs) file systems, such as the OSS, MDS and MGS nodes. It does not need to be loaded on clients.


3.1.4 (Optional) High-Availability Software

If you plan to enable failover server functionality with Lustre (either on an OSS or the MDS), you must add high-availability (HA) software to your cluster software. You can use any HA software package with Lustre.[3] For more information, see Failover.

3.1.5 Debugging Tools

Lustre is a complex system and you may encounter problems when using it. You should have debugging tools on hand to help figure out how and why a problem occurred. A variety of diagnostic and analysis tools are available to debug issues with the Lustre software. Some of these are provided in Linux distributions, while others have been developed and are made available by the Lustre project.

These in-kernel debug mechanisms are incorporated into the Lustre software:

These tools are also provided with the Lustre software:

These general debugging tools are provided as a part of the standard Linux distribution:

These logging and data collection tools can be used to collect information for debugging Lustre kernel issues:

To debug Lustre in a development environment, use:

A variety of debuggers and analysis tools are available including:

For detailed information about these debugging tools, see Tools for Lustre Debugging.

3.1.6 Environmental Requirements

Make sure the following environmental requirements are met before installing Lustre:

3.1.7 Memory Requirements

This section describes the memory requirements of Lustre.

3.1.7.1 MDS Memory Requirements

MDS memory requirements are determined by the following factors:

The amount of memory used by the MDS is a function of how many clients are on the system, and how many files they are using in their working set. This is driven, primarily, by the number of locks a client can hold at one time. The default maximum number of locks for a compute node is 100*num_cores, and interactive clients can hold in excess of 10,000 locks at times. For the MDS, this works out to approximately 2 KB per file, including the Lustre DLM lock and kernel data structures for it, just for the current working set.

By default, 400 MB is used for the file system journal. Additional RAM is used to cache file data for the larger working set that is not actively in use by clients but should be kept "hot" for improved access times. Having file data in cache can improve metadata performance by a factor of 10 or more compared to reading it from disk. Approximately 1.5 KB per file is needed to keep a file in cache.

For example, for a single MDT on an MDS with 1,000 clients, 16 interactive nodes, and a 2 million file working set (of which 400,000 files are cached on the clients):

File system journal = 400 MB

1,000 clients * 4 cores/client * 100 files/core * 2 KB = 800 MB

16 interactive clients * 10,000 files * 2 KB = 320 MB

1,600,000-file extra working set * 1.5 KB/file = 2,400 MB

Thus, the minimum requirement for a system with this configuration is about 4 GB of RAM (the items above total 3,920 MB). However, additional memory may significantly improve performance[4].
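The worked example above can be reproduced with a few lines of shell arithmetic. This is a minimal sketch, not a sizing tool: the function name mds_min_ram_mb is hypothetical, and it hard-codes the figures quoted in this section (400 MB journal, 100 locks per core, 10,000 locks per interactive client, 2 KB per locked file, 1.5 KB per cached file). Like the example, it treats 1 MB as 1,000 KB.

```shell
# Estimate minimum MDS RAM in MB, following the worked example above.
# Arguments: compute clients, cores per client, interactive clients,
#            files in the extra (cached but not locked) working set.
mds_min_ram_mb() {
    clients=$1
    cores=$2
    interactive=$3
    extra_files=$4
    journal_mb=400
    compute_kb=$(( clients * cores * 100 * 2 ))      # 100 files/core, 2 KB each
    interactive_kb=$(( interactive * 10000 * 2 ))    # 10,000 files, 2 KB each
    cache_kb=$(( extra_files * 15 / 10 ))            # 1.5 KB per cached file
    echo $(( journal_mb + (compute_kb + interactive_kb + cache_kb) / 1000 ))
}

# 1,000 4-core clients, 16 interactive nodes, 1,600,000 extra files
mds_min_ram_mb 1000 4 16 1600000   # prints 3920
```

Rounding the result up to the next common memory size gives the 4 GB figure quoted above.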

If there are directories containing 1 million or more files, you may benefit significantly from having more memory. For example, in an environment where clients randomly access one of 10 million files, having extra memory for the cache significantly improves performance.

3.1.7.2 OSS Memory Requirements

When planning the hardware for an OSS node, consider the memory usage of several components in the Lustre system (i.e., journal, service threads, file system metadata, etc.).

Because of these memory requirements, the following calculations should be taken as determining the absolute minimum RAM required in an OSS node.

Calculating OSS Memory Requirements

The minimum, recommended RAM size for an OSS with two OSTs is computed below:

This consumes about 1,700 MB just for the pre-allocated buffers, and an additional 900 MB for minimal file system and kernel usage. Therefore, for a non-failover configuration, the minimum RAM would be 3 GB for an OSS node with two OSTs. For a failover configuration, the minimum RAM would be at least 4 GB. For 4 OSTs on each OSS in a failover configuration, 8 GB of RAM is reasonable.

As a reasonable rule of thumb, about 1 GB of base memory plus 1 GB per OST can be used. In failover configurations, about 2 GB per OST is needed.
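The rule of thumb above can be written as a small helper. This sketch reflects one reading of the text, chosen because it reproduces the figures given in this section (3 GB for two OSTs without failover, 4 GB for two OSTs and 8 GB for four OSTs with failover): 1 GB base plus 1 GB per OST without failover, and 2 GB per OST with failover. The function name oss_ram_gb is hypothetical.

```shell
# Rule-of-thumb OSS RAM in GB.
# Arguments: number of OSTs, and "failover" for a failover configuration
# (any other value, or none, means non-failover).
oss_ram_gb() {
    osts=$1
    mode=$2
    if [ "$mode" = "failover" ]; then
        echo $(( 2 * osts ))        # about 2 GB per OST
    else
        echo $(( 1 + osts ))        # 1 GB base plus 1 GB per OST
    fi
}

oss_ram_gb 2             # prints 3
oss_ram_gb 4 failover    # prints 8
```

As with the calculations above, these are minimums; extra RAM is used for caching.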


3.2 Installing Lustre from RPMs

This procedure describes how to install Lustre from the RPM packages. This is the easier installation method and is recommended for new users.

Alternatively, you can install Lustre directly from the source code. For more information on this installation method, see Installing Lustre from Source Code.



Note - In all Lustre installations, the server kernel that runs on an MDS, MGS or OSS must be patched. However, running a patched kernel on a Lustre client is optional and only required if the client will be used for multiple purposes, such as running as both a client and an OST.




Caution - Lustre contains kernel modifications which interact with storage devices and may introduce security issues and data loss if not installed, configured or administered properly. Before installing Lustre, be cautious and back up ALL data.


Use this procedure to install Lustre from RPMs.

1. Verify that all Lustre installation requirements have been met.

For more information on these prerequisites, see Preparing to Install Lustre.

2. Download the Lustre RPMs.

a. On the Lustre download site, select your platform.

The files required to install Lustre (kernels, modules and utilities RPMs) are listed for the selected platform.

b. Download the required files.

Use the Download Manager or download the files individually.

3. Install the Lustre packages.

Some Lustre packages are installed on servers (MDS and OSSs), and others are installed on Lustre clients. Lustre packages must be installed in a specific order.



Caution - For a non-production Lustre environment or for testing, a Lustre client and server can run on the same machine. However, for best performance in a production environment, use dedicated clients. Performance and other issues can occur when an MDS or OSS and a client are running on the same machine[5]. The MDS and MGS can run on the same machine.


a. For each Lustre package, determine if it needs to be installed on servers and/or clients. Use TABLE 3-1 to determine where to install a specific package. Depending on your platform, not all of the listed files need to be installed.


TABLE 3-1 Lustre required packages, descriptions and installation guidance

| Lustre Package | Description | Install on servers | Install on patchless clients | Install on patched clients |
|---|---|---|---|---|
| Lustre kernel RPMs | | | | |
| kernel-lustre-smp-<ver> | Lustre-patched kernel package for RHEL 4 and 5 (i686, ia64 and x86_64) and SuSE Server 9 and 10 (x86_64) platforms. | X | | X[6] |
| kernel-lustre-bigsmp-<ver> | Lustre-patched kernel package for SuSE Server 9 and 10 (i686) platform. | X | | X[6] |
| kernel-ib-<ver> | Lustre OFED package. Install if the network interconnect is InfiniBand (IB). | X | X | X[6] |
| Lustre module RPMs | | | | |
| lustre-modules-<ver> | Lustre modules for the patched kernel. | X | | X[6] |
| lustre-client-modules-<ver> | Lustre modules for patchless clients. | | X | |
| Lustre utilities | | | | |
| lustre-<ver> | Lustre utilities package. This includes userspace utilities to configure and run Lustre. | X | | X[6] |
| lustre-ldiskfs-<ver> | Lustre-patched backing file system kernel module package for the ext3 file system. | X | | |
| e2fsprogs-<ver> | Utilities package used to maintain the ext3 backing file system. | X | | |
| lustre-client-<ver> | Lustre utilities for patchless clients. | | X | |


b. Install the kernel, modules and ldiskfs packages.

Use the rpm -ivh command to install the kernel, module and ldiskfs packages. For example:

$ rpm -ivh kernel-lustre-smp-<ver> \
kernel-ib-<ver> \
lustre-modules-<ver> \
lustre-ldiskfs-<ver>

c. Install the utilities/userspace packages.

Use the rpm -ivh command to install the utilities packages. For example:

$ rpm -ivh lustre-<ver>

d. Install the e2fsprogs package.

Use the rpm -ivh command to install the e2fsprogs package. For example:

$ rpm -ivh e2fsprogs-<ver>
 
If e2fsprogs is already installed on your Linux system, install the Lustre-specific e2fsprogs version by using rpm -Uvh to update the existing e2fsprogs package. For example:

$ rpm -Uvh e2fsprogs-<ver>

The rpm command options --force or --nodeps are not required to install or update the Lustre-specific e2fsprogs package. We specifically recommend that you not use these options. If errors are reported, notify Lustre Support by filing a bug.

e. (Optional) If you want to add optional packages to your Lustre file system, install them now.

Optional packages include file system creation and repair tools, debugging tools, test programs and scripts, Linux kernel and Lustre source code, and other packages. A complete list of optional packages for your platform is provided on the Lustre download site.

4. Verify that the boot loader (grub.conf or lilo.conf) has been updated to load the patched kernel.

5. Reboot the patched clients and the servers.

a. If you applied the patched kernel to any clients, reboot them.

Unpatched clients do not need to be rebooted.

b. Reboot the servers.

Once all machines have rebooted, go to Configuring Lustre to configure Lustre Networking (LNET) and the Lustre file system.
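Steps 4 and 5 of the procedure above can be spot-checked from the shell. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a Lustre-patched kernel carries the string "lustre" in its release name, as in 2.6.18-92.1.10.el5_lustre.1.6.6. The boot loader check is a plain text search, so it also matches comments that mention Lustre.

```shell
# Succeeds if the given kernel release string looks Lustre-patched.
# Assumption: patched kernels include "lustre" in the release string.
is_lustre_kernel() {
    case "$1" in
        *lustre*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Succeeds if the boot loader configuration file ($1) mentions a
# Lustre kernel anywhere in its text.
bootloader_has_lustre() {
    grep -q 'lustre' "$1" 2>/dev/null
}

# After rebooting, report whether the running kernel is Lustre-patched.
if is_lustre_kernel "$(uname -r)"; then
    echo "running a Lustre-patched kernel: $(uname -r)"
else
    echo "not running a Lustre-patched kernel: $(uname -r)"
fi
```

Before rebooting, a call such as bootloader_has_lustre /etc/grub.conf (adjust the path for your boot loader) gives a quick indication of whether the patched kernel entry was added.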


3.3 Installing Lustre from Source Code

If you need to build a customized Lustre server kernel or are using a Linux kernel that has not been tested with the version of Lustre you are installing, you may need to build and install Lustre from source code. This involves several steps:

Please note that the Lustre/kernel configurations available at the Lustre download site have been extensively tested and verified with Lustre. The recommended method for installing Lustre servers is to use these pre-built binary packages (RPMs). For more information on this installation method, see Installing Lustre from RPMs.



Caution - Lustre contains kernel modifications which interact with storage devices and may introduce security issues and data loss if not installed, configured and administered correctly. Before installing Lustre, be cautious and back up ALL data.


Note - When using third-party network hardware with Lustre, the third-party modules (typically, the drivers) must be linked against the Linux kernel. The LNET modules in Lustre also need these references. To meet these requirements, a specific process must be followed to install and recompile Lustre. See Installing Lustre with a Third-Party Network Stack, for an example showing how to install Lustre 1.6.6 using the Myricom MX 1.2.7 driver. The same process can be used for other third-party network stacks.

3.3.1 Patching the Kernel

If you are using non-standard hardware, plan to apply a Lustre patch, or have another reason not to use packaged Lustre binaries, you have to apply several Lustre patches to the core kernel and run the Lustre configure script against the kernel.

3.3.1.1 Introducing the Quilt Utility

To simplify the process of applying Lustre patches to the kernel, we recommend that you use the Quilt utility.

Quilt manages a stack of patches on a single source tree. A series file lists the patch files and the order in which they are applied. Patches are applied, incrementally, on the base tree and all preceding patches. You can:

A variety of Quilt packages (RPMs, SRPMs and tarballs) are available from various sources. Use the most recent version you can find. Quilt depends on several other utilities, e.g., the coreutils RPM that is only available in RedHat 9. For other RedHat kernels, you have to get the required packages to successfully install Quilt. If you cannot locate a Quilt package or fulfill its dependencies, you can build Quilt from a tarball, available at the Quilt project website:

http://savannah.nongnu.org/projects/quilt

For additional information on using Quilt, including its commands, see Introduction to Quilt and the quilt(1) man page.

3.3.1.2 Get the Lustre Source and Unpatched Kernel

The Lustre Engineering Team has targeted several Linux kernels for use with Lustre servers (MDS/OSS) and provides a series of patches for each one. The Lustre patches are maintained in the kernel_patches directory bundled with the Lustre source code.



Note - Each patch series has been tailored to a specific kernel version, and may or may not apply cleanly to other versions of the kernel.


To obtain the Lustre source and unpatched kernel:

1. Verify that all of the Lustre installation requirements have been met.

For more information on these prerequisites, see Preparing to Install Lustre.

2. Download the Lustre source code. On the Lustre download site, select a version of Lustre to download and then select Source as the platform.

3. Download the unpatched kernel.

For convenience, Sun maintains an archive of unpatched kernel sources at:

http://downloads.lustre.org/public/kernels/

4. To save time later, download e2fsprogs now.

The source code for Sun’s Lustre-enabled e2fsprogs distribution can be found at:

http://downloads.lustre.org/public/tools/e2fsprogs/

3.3.1.3 Patch the Kernel

This procedure describes how to use Quilt to apply the Lustre patches to the kernel. To illustrate the steps in this procedure, a RHEL 5 kernel is patched for Lustre 1.6.5.1.

1. Unpack the Lustre source and kernel to separate source trees.

a. Unpack the Lustre source.

For this procedure, we assume that the resulting source tree is in /tmp/lustre-1.6.5.1

b. Unpack the kernel.

For this procedure, we assume that the resulting source tree (also known as the destination tree) is in /tmp/kernels/linux-2.6.18

2. Select a config file for your kernel, located in the kernel_configs directory (lustre/kernel_patches/kernel_configs).

The kernel_configs directory contains the .config files, which are named to indicate the kernel and architecture with which they are associated. For example, the configuration file for the 2.6.18 kernel shipped with RHEL 5 (suitable for i686 SMP systems) is kernel-2.6.18-2.6-rhel5-i686-smp.config.

3. Select the series file for your kernel, located in the series directory (lustre/kernel_patches/series).

The series file contains the patches that need to be applied to the kernel.

4. Set up the necessary symlinks between the kernel patches and the Lustre source.

This example assumes that the Lustre source files are unpacked under /tmp/lustre-1.6.5.1 and that you have chosen the 2.6-rhel5.series file. Run:

$ cd /tmp/kernels/linux-2.6.18
$ rm -f patches series
$ ln -s /tmp/lustre-1.6.5.1/lustre/kernel_patches/series/2.6-rhel5.series ./series
$ ln -s /tmp/lustre-1.6.5.1/lustre/kernel_patches/patches .

5. Use Quilt to apply the patches in the selected series file to the unpatched kernel. Run:

$ cd /tmp/kernels/linux-2.6.18
$ quilt push -av

The patched destination tree acts as a base Linux source tree for Lustre.
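Before running quilt push in the procedure above, it can be useful to confirm that every patch named in the series file is present in the patches directory. The sketch below does this with a simple loop; the function name missing_patches is hypothetical, and it assumes the series file lists one patch per line, optionally followed by patch options, with blank lines and # comments ignored.

```shell
# Print each patch named in a series file ($1) that is absent from the
# patches directory ($2). Blank lines and "#" comments are skipped.
missing_patches() {
    series_file=$1
    patches_dir=$2
    # Read the first word of each line as the patch name; the trailing
    # test also handles a final line that lacks a newline.
    while read -r patch _rest || [ -n "$patch" ]; do
        case "$patch" in
            ''|'#'*) continue ;;
        esac
        [ -f "$patches_dir/$patch" ] || echo "$patch"
    done < "$series_file"
}
```

Run from the kernel tree after creating the symlinks, missing_patches series patches prints nothing when the series and patches directories match.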

3.3.2 Create and Install the Lustre Packages

After patching the kernel, configure it to work with Lustre, create the Lustre packages (RPMs) and install them.

1. Configure the patched kernel to run with Lustre. Run:

$ cd <path to kernel tree>
$ cp /boot/config-`uname -r` .config
$ make oldconfig || make menuconfig
$ make include/asm
$ make include/linux/version.h
$ make SUBDIRS=scripts
$ make include/linux/utsrelease.h

2. Run the Lustre configure script against the patched kernel and create the Lustre packages.

$ cd <path to lustre source tree>
$ ./configure --with-linux=<path to kernel tree>
$ make rpms

This creates a set of RPMs in /usr/src/redhat/RPMS/<arch> with an appended date-stamp. On SuSE, the path is /usr/src/packages.



Note - You do not need to run the Lustre configure script against an unpatched kernel.


Example set of RPMs:

lustre-1.6.5.1-2.6.18_53.xx.xx.el5_lustre.1.6.5.1.custom_20081021.i686.rpm
lustre-debuginfo-1.6.5.1-2.6.18_53.xx.xx.el5_lustre.1.6.5.1.custom_20081021.i686.rpm
lustre-modules-1.6.5.1-2.6.18_53.xx.xx.el5_lustre.1.6.5.1.custom_20081021.i686.rpm
lustre-source-1.6.5.1-2.6.18_53.xx.xx.el5_lustre.1.6.5.1.custom_20081021.i686.rpm


Note - If the steps to create the RPMs fail, contact Lustre Support by reporting a bug. See Reporting a Lustre Bug.




Note - Lustre supports several features and packages that extend the core functionality of Lustre. These features and packages can be enabled at build time by passing the appropriate arguments to the configure command. For a list of supported features and packages, run ./configure --help in the Lustre source tree. The configs/ directory of the kernel source contains the config files matching each kernel version. Copy one to .config at the root of the kernel tree.


3. Create the kernel package. Navigate to the kernel source directory and run:

$ make rpm

Example result:

kernel-2.6.95.0.3.EL_lustre.1.6.5.1custom-1.i686.rpm


Note - Step 3 is only valid for RedHat and SuSE kernels. If you are using a stock Linux kernel, you need to get a script to create the kernel RPM.


4. Install the Lustre packages.

Some Lustre packages are installed on servers (MDS and OSSs), and others are installed on Lustre clients. For guidance on where to install specific packages, see TABLE 3-1, which lists required packages and for each package, where to install it. Depending on the selected platform, not all of the packages listed in TABLE 3-1 need to be installed.



Note - Running the patched server kernel on the clients is optional. It is not necessary unless the clients will be used for multiple purposes, for example, to run as a client and an OST.


Lustre packages should be installed in this order:

a. Install the kernel, modules and ldiskfs packages.

Navigate to the directory where the RPMs are stored, and use the rpm -ivh command to install the kernel, module and ldiskfs packages.

$ rpm -ivh kernel-lustre-smp-<ver> \
kernel-ib-<ver> \
lustre-modules-<ver> \
lustre-ldiskfs-<ver>

b. Install the utilities/userspace packages.

Use the rpm -ivh command to install the utilities packages. For example:

$ rpm -ivh lustre-<ver>

c. Install the e2fsprogs package.

Make sure the e2fsprogs package downloaded in Step 4 of Get the Lustre Source and Unpatched Kernel is unpacked, and use the rpm -i command to install it. For example:

$ rpm -i e2fsprogs-<ver>

d. (Optional) If you want to add optional packages to your Lustre system, install them now.

5. Verify that the boot loader (grub.conf or lilo.conf) has been updated to load the patched kernel.

6. Reboot the patched clients and the servers.

a. If you applied the patched kernel to any clients, reboot them.

Unpatched clients do not need to be rebooted.

b. Reboot the servers.

Once all the machines have rebooted, the next steps are to configure Lustre Networking (LNET) and the Lustre file system. See Configuring Lustre.

3.3.3 Installing Lustre with a Third-Party Network Stack

When using third-party network hardware, you must follow a specific process to install and recompile Lustre. This section provides an installation example, describing how to install Lustre 1.6.6 while using the Myricom MX 1.2.7 driver. The same process is used for other third-party network stacks, by replacing MX-specific references in Step 2 with the stack-specific build and using the proper --with option when configuring the Lustre source code.

1. Compile and install the Lustre kernel.

a. Install the necessary build tools. GCC and related tools must also be installed. For more information, see Required Lustre Software.

$ yum install rpm-build redhat-rpm-config
$ mkdir -p rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
$ echo '%_topdir %(echo $HOME)/rpmbuild' > .rpmmacros

b. Install the patched Lustre source code.

This RPM is available at the Lustre download page.

$ rpm -ivh kernel-lustre-source-2.6.18-92.1.10.el5_lustre.1.6.6.x86_64.rpm

c. Build the Linux kernel RPM.

$ cd /usr/src/linux-2.6.18-92.1.10.el5_lustre.1.6.6
$ make distclean
$ make oldconfig dep bzImage modules
$ cp /boot/config-`uname -r` .config
$ make oldconfig || make menuconfig
$ make include/asm
$ make include/linux/version.h
$ make SUBDIRS=scripts
$ make rpm

d. Install the Linux kernel RPM.

If you are building a set of RPMs for a cluster installation, this step is not necessary. Source RPMs are only needed on the build machine.

$ rpm -ivh ~/rpmbuild/kernel-lustre-2.6.18-92.1.10.el5_lustre.1.6.6.x86_64.rpm
$ mkinitrd /boot/2.6.18-92.1.10.el5_lustre.1.6.6

e. Update the boot loader (/etc/grub.conf) with the new kernel boot information.

$ /sbin/shutdown 0 -r

2. Compile and install the MX stack.

$ cd /usr/src/
$ gunzip mx_1.2.7.tar.gz    # can be obtained from www.myri.com/scs/
$ tar -xvf mx_1.2.7.tar
$ cd mx-1.2.7
$ ln -s common include
$ ./configure --with-kernel-lib
$ make
$ make install

3. Compile and install the Lustre source code.

a. Install the Lustre source (this can be done via RPM or tarball). The source file is available at the Lustre download page. This example shows installation via the tarball.

$ cd /usr/src/
$ gunzip lustre-1.6.6.tar.gz
$ tar -xvf lustre-1.6.6.tar

b. Configure and build the Lustre source code.

The ./configure --help command shows a list of all of the --with options. All third-party network stacks are built in this manner.

$ cd lustre-1.6.6
$ ./configure --with-linux=/usr/src/linux --with-mx=/usr/src/mx-1.2.7
$ make
$ make rpms

The make rpms command output shows the location of the generated RPMs.

4. Use the rpm -ivh command to install the RPMs.

$ rpm -ivh lustre-1.6.6-2.6.18_92.1.10.el5_lustre.1.6.6smp.x86_64.rpm
$ rpm -ivh lustre-modules-1.6.6-2.6.18_92.1.10.el5_lustre.1.6.6smp.x86_64.rpm
$ rpm -ivh lustre-ldiskfs-3.0.6-2.6.18_92.1.10.el5_lustre.1.6.6smp.x86_64.rpm

5. Add the following lines to the /etc/modprobe.conf file.

options kmxlnd hosts=/etc/hosts.mxlnd
options lnet networks=mx0(myri0),tcp0(eth0)

6. Populate the myri0 configuration with the proper IP addresses.

$ vim /etc/sysconfig/network-scripts/myri0

7. Add the following line to the /etc/hosts.mxlnd file.

IP HOST BOARD EP_ID

8. Start Lustre.

Once all the machines have rebooted, the next steps are to configure Lustre Networking (LNET) and the Lustre file system. See Configuring Lustre.
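Because each entry in /etc/hosts.mxlnd follows the four-field layout shown in Step 7 (IP HOST BOARD EP_ID), a quick field-count check can catch malformed lines before Lustre is started. This is a minimal sketch: the function name check_mxlnd_hosts is hypothetical, and skipping blank lines and # comments is an assumption about the file format rather than documented behavior.

```shell
# Report lines in a hosts.mxlnd-style file ($1) that do not have exactly
# four whitespace-separated fields (IP HOST BOARD EP_ID).
# Assumption: blank lines and lines starting with "#" can be ignored.
# Exit status is 0 when every remaining line has four fields.
check_mxlnd_hosts() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }
        NF != 4 {
            printf "line %d: expected 4 fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, check_mxlnd_hosts /etc/hosts.mxlnd prints one message per malformed line and returns a non-zero status if any are found. The check only counts fields; it does not validate the IP address, board number or endpoint ID.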


1 (Footnote) Additionally, a few bytes outside the journal are used to create accounting data for Lustre.
2 (Footnote) We encourage the use of 64-bit platforms.
3 (Footnote) In this manual, the Linux-HA (Heartbeat) package is referenced, but you can use any HA software.
4 (Footnote) Having more RAM is always prudent, given the relatively low cost of this component compared to the total system cost.
5 (Footnote) Running the MDS and a client on the same machine can cause recovery and deadlock issues, and can cause the performance of other Lustre clients to suffer. Running the OSS and a client on the same machine can cause issues with low memory and memory pressure. The client consumes all of the memory and tries to flush pages to disk. The OSS needs to allocate pages to receive data from the client, but cannot perform this operation due to low memory. This can result in OOM kills and other issues.
6 (TableFootnote) Only install this kernel RPM if you want to patch the client kernel. You do not have to patch the clients to run Lustre.