One of the main factors behind the high performance of Lustre file systems is the ability to stripe data across multiple object storage targets (OSTs) in a round-robin fashion. A file can be split into multiple chunks, which are then stored on different OSTs across the Lustre system.
Any file is just a linear sequence of bytes. The logical view of a file, divided into segments, may appear as the following:
In the physical view, the five segments may be striped across four OSTs:
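The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Lustre code: the function names are hypothetical, and the slot numbers are positions within the file's own list of OSTs (0 to stripe count minus 1), not real OST indices.

```python
def ost_slot_for_offset(offset, stripe_size, stripe_count):
    """Return which of the file's stripe_count objects holds this byte offset."""
    segment = offset // stripe_size      # index of the stripe-sized segment
    return segment % stripe_count        # round-robin over the file's OSTs


def layout(num_segments, stripe_count):
    """Map each segment index to the 0-based slot of the OST that stores it."""
    return [seg % stripe_count for seg in range(num_segments)]


# Five segments over four OSTs: the fifth segment wraps back to the first OST.
five_on_four = layout(5, 4)
print(five_on_four)
```

With five segments and four OSTs, the first OST ends up holding two segments and the others one each.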
Striping offers two benefits: 1) an increase in bandwidth because multiple processes can simultaneously access the same file, and 2) the ability to store large files that would take more space than a single OST can provide. However, striping is not without disadvantages: 1) increased overhead due to network operations and server contention, and 2) increased risk of file damage due to hardware malfunction, since a file spread over more OSTs is affected if any one of them fails. Users have the option of configuring the size and number of stripes used for any file.
The default stripe settings vary across machines, but the stripe count is generally set to 1 or 2 and the stripe size is generally 1 MB. To determine the stripe settings for a file or directory, use the lfs getstripe command:
> lfs getstripe movie.mov
movie.mov
lmm_stripe_count:   4
lmm_stripe_size:    1048576
lmm_stripe_offset:  186
        obdidx           objid           objid           group
           186        52153455      0x31bcc6f                0
           258        53124880      0x32a9f10                0
            25        52477227      0x320bd2b                0
            97        52444876      0x3203ecc                0
In this example, the file movie.mov is striped across 4 OSTs with a stripe size of 1 MB (1048576 bytes). The obdidx numbers listed are the indices of the OSTs used in the striping of this file. Using getstripe on a directory gives information for the directory plus the files contained in the directory. You can limit getstripe to show only directory information by using the -d option. Alternatively, you can use the -r option to recursively follow all subdirectories.
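When the stripe settings are needed inside a script, the getstripe output can be parsed. The following is a minimal sketch that assumes the output looks like the movie.mov example above (lmm_* header lines followed by a four-column table); the exact layout can differ between Lustre versions, and parse_getstripe is a hypothetical helper, not part of Lustre.

```python
import re

# Sample text laid out like the movie.mov example; the column spacing is assumed.
SAMPLE = """\
movie.mov
lmm_stripe_count:   4
lmm_stripe_size:    1048576
lmm_stripe_offset:  186
        obdidx           objid           objid           group
           186        52153455      0x31bcc6f                0
           258        53124880      0x32a9f10                0
            25        52477227      0x320bd2b                0
            97        52444876      0x3203ecc                0
"""


def parse_getstripe(text):
    """Extract the lmm_* header fields and the obdidx column from getstripe text."""
    info = {"obdidx": []}
    for line in text.splitlines():
        header = re.match(r"\s*(lmm_\w+):\s+(\d+)\s*$", line)
        if header:
            info[header.group(1)] = int(header.group(2))
            continue
        fields = line.split()
        # Object rows have four columns and start with a decimal OST index.
        if len(fields) == 4 and fields[0].isdigit():
            info["obdidx"].append(int(fields[0]))
    return info


parsed = parse_getstripe(SAMPLE)
print(parsed["lmm_stripe_count"], parsed["obdidx"])
```

In real use, the text would come from running lfs getstripe on the file rather than from a hard-coded sample.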
Large files benefit from higher stripe counts. By striping a large file over many OSTs, you increase bandwidth for accessing the file and can benefit from having many processes operating on a single file concurrently. Conversely, a very large file that is striped across only one or two OSTs can degrade the performance of the entire Lustre system by filling up those OSTs unnecessarily. A good practice is to keep dedicated directories with high stripe counts and to write very large files into them.
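To make the effect of the stripe count on OST usage concrete, the sketch below computes how many bytes of a file land on each OST. The 1 TiB file size and the stripe count of 64 are illustrative assumptions, and bytes_per_ost is a hypothetical helper that only models the round-robin arithmetic.

```python
def bytes_per_ost(file_size, stripe_size, stripe_count):
    """Return how many bytes of the file land on each of its OSTs."""
    full_segments, tail = divmod(file_size, stripe_size)
    # Whole rounds put exactly one full segment on every OST.
    full_rounds, leftover_segments = divmod(full_segments, stripe_count)
    per_ost = [full_rounds * stripe_size] * stripe_count
    # Remaining full segments go to the first few OSTs in order.
    for slot in range(leftover_segments):
        per_ost[slot] += stripe_size
    # A final partial segment goes to the next OST in the rotation.
    if tail:
        per_ost[leftover_segments] += tail
    return per_ost


MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

narrow = bytes_per_ost(TIB, MIB, 2)    # 1 TiB file on 2 OSTs
wide = bytes_per_ost(TIB, MIB, 64)     # same file on 64 OSTs
print(max(narrow) // GIB, "GiB per OST vs", max(wide) // GIB, "GiB per OST")
```

The narrowly striped file places half of its data on each of two OSTs, while the widely striped one spreads the same data thinly over many.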
Another scenario to avoid is having small files with large stripe counts. This can be detrimental to performance due to the unnecessary communication overhead to multiple OSTs. A good practice is to make sure small files are written to a directory with a stripe count of 1 (effectively, no striping).
The lfs setstripe command is used to dictate a particular striping configuration for a file or directory. For a file, setstripe:
Using setstripe on a directory:
Note: Once a file has been written to Lustre with a particular stripe configuration, you cannot simply use setstripe to change it. The file must be re-written with a new configuration. Generally, if you need to change the striping of a file, you can do one of two things:
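One way to re-write a file with a new layout is sketched below, under stated assumptions: it pre-creates an empty file with lfs setstripe -c, copies the data into it, and then swaps it into place. The function name and the ".restripe" suffix are hypothetical, the copy does not preserve permissions or timestamps, and the approach relies on the copy writing into the already-created file rather than replacing it.

```python
import os
import shutil
import subprocess


def restripe_by_copy(path, stripe_count, run=subprocess.check_call):
    """Rewrite `path` so that its data lands in a file with a new stripe count."""
    tmp = path + ".restripe"  # hypothetical temporary name next to the original
    # Create an empty file that already carries the desired layout.
    run(["lfs", "setstripe", "-c", str(stripe_count), tmp])
    # Copy the bytes; the data is written into the pre-created file.
    shutil.copyfile(path, tmp)
    # Swap the new file into the place of the old one.
    os.replace(tmp, path)
    return tmp
```

A call such as restripe_by_copy("movie.mov", 8) would need to run on a machine with the lfs tool and a Lustre mount; the run parameter exists so the command can be stubbed out elsewhere.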
The options for lfs setstripe are:
> lfs setstripe -c 50 -s 32m bigdir
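The command above asks for 50 stripes of 32 MB each on the directory bigdir. As a rough sketch, a script could check a requested stripe size before building the same command line. The helper names are hypothetical, the sketch accepts only explicit positive sizes, and it assumes that stripe sizes must be multiples of 64 KiB. The -s flag is copied from the example above; some Lustre releases spell the stripe-size option -S, so check the local lfs documentation.

```python
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
STRIPE_UNIT = 64 * 1024  # assumed granularity: stripe sizes in multiples of 64 KiB


def parse_size(text):
    """Convert a size such as '32m' or '1048576' to a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def setstripe_argv(target, stripe_count, stripe_size):
    """Build the argument list for lfs setstripe after a basic sanity check."""
    size_bytes = parse_size(stripe_size)
    if size_bytes <= 0 or size_bytes % STRIPE_UNIT:
        raise ValueError("stripe size should be a positive multiple of 64 KiB")
    # Flag spelling follows the example in the text (-c count, -s size).
    return ["lfs", "setstripe", "-c", str(stripe_count), "-s", stripe_size, target]


argv = setstripe_argv("bigdir", 50, "32m")
print(" ".join(argv))
```

With these settings, one full round of stripes covers 50 × 32 MiB = 1600 MiB of the file before the layout wraps back to the first OST.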
For more detailed information and I/O benchmarks see here.