BeeGFS Parallel File System - What's All The BuZzzzz About

March 21, 2022

An Overview of BeeGFS

For decades, parallel file systems have been crucial in building storage environments that are capable of keeping up with today's most demanding technology solutions. From storing massive, mission-critical datasets, to providing high-speed scratch space, to storing the results of expensive long-running computation, parallel file systems are vital to the operations of many industries.

If you’re familiar with parallel cluster file systems, then you're most likely familiar with BeeGFS.

BeeGFS is an open source parallel file system suited to large data stores and environments that require high scalability. It was designed as a storage solution for high-performance computing by the Fraunhofer Institute for Industrial Mathematics. Originally known as the Fraunhofer Gesellschaft File System (FhGFS), BeeGFS is a leading parallel cluster file system, developed with a strong focus on performance and on being easy to install and manage.

Figure: BeeGFS architecture (beegfs-architecture.png)

As the term "parallel file system" suggests, BeeGFS stripes files across multiple server nodes to maximize the read/write performance and scalability of the file system. These server nodes work together to deliver a single file system that can be mounted and accessed simultaneously by other nodes, commonly known as clients. In a nutshell, clients can see and use this distributed file system much as they would a local file system such as NTFS, XFS, or ext4.
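The striping idea can be illustrated with a small model. The sketch below is a simplified round-robin layout, not BeeGFS's actual implementation: the function name `locate_chunk`, the 512 KiB chunk size, and the four storage targets are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class ChunkLocation(NamedTuple):
    target_index: int     # which storage target holds the byte (hypothetical numbering)
    chunk_index: int      # global chunk number within the file
    offset_in_chunk: int  # byte offset inside that chunk


def locate_chunk(file_offset: int, chunk_size: int, num_targets: int) -> ChunkLocation:
    """Map a byte offset in a file to a storage target using round-robin striping."""
    if file_offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk size and target count must be > 0")
    chunk_index, offset_in_chunk = divmod(file_offset, chunk_size)
    # Consecutive chunks go to consecutive targets, wrapping around.
    return ChunkLocation(chunk_index % num_targets, chunk_index, offset_in_chunk)


if __name__ == "__main__":
    chunk_size = 512 * 1024  # assumed chunk size for this example
    for offset in (0, chunk_size, 3 * chunk_size + 100, 4 * chunk_size):
        print(offset, locate_chunk(offset, chunk_size, num_targets=4))
```

Because consecutive chunks land on different targets, a large sequential read or write is spread over all four targets at once, which is where the parallelism comes from.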

BeeGFS vs. Traditional File Systems

BeeGFS was created to overcome the limitations of traditional file systems, avoid architectural bottlenecks, and meet the I/O demands of modern HPC and cognitive computing workloads.

Why use BeeGFS?

Traditionally, parallel file systems have been seen as complex to deploy and manage. BeeGFS, however, has been designed for flexibility and simplicity from its inception.

BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can scale the performance and capacity of the file system to the level you need, from small clusters all the way up to enterprise-class systems with thousands of nodes.

This makes BeeGFS both high-performing and cost-effective. It can also be tested easily in a small proof-of-concept environment to prove out the performance and value.
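As a rough way to reason about scale-out, the sketch below models aggregate throughput as an idealized upper bound: the system can move no more data than the servers can deliver in total, and no more than the clients can consume in total. The function name and all bandwidth figures are hypothetical, and the model ignores network topology, metadata load, and contention.

```python
def aggregate_throughput_gbs(
    num_servers: int,
    per_server_gbs: float,
    num_clients: int,
    per_client_gbs: float,
) -> float:
    """Idealized upper bound on aggregate throughput in GB/s.

    The bound is set by whichever side, servers or clients, saturates first.
    """
    if min(num_servers, num_clients) < 0 or min(per_server_gbs, per_client_gbs) < 0:
        raise ValueError("counts and bandwidths must be non-negative")
    return min(num_servers * per_server_gbs, num_clients * per_client_gbs)


if __name__ == "__main__":
    # Hypothetical cluster: 16 clients at 2 GB/s each, servers at 5 GB/s each.
    for servers in (2, 4, 8):
        bound = aggregate_throughput_gbs(servers, 5.0, 16, 2.0)
        print(f"{servers} servers -> at most {bound} GB/s")
```

In this toy model, adding servers raises the bound until the clients become the limit, which is one reason a small proof-of-concept is useful for finding where a real workload actually saturates.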

BeeGFS: A New Standard for Performance, Scalability and Flexibility

BeeGFS is easy to use and requires no kernel patches. The client is a patchless kernel module, while the server components are user space daemons. It comes with graphical cluster installation tools and lets you add clients and servers to the running system at any time.

BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across the storage servers on a per-file or per-directory basis, as well as distributed metadata. Its client and server components are available for Linux on x86, x86_64, ARM64, and other architectures.

BeeGFS is optimized to provide maximum performance, scalability, and flexibility:

  • Maximum performance and scalability
    File contents and metadata are both distributed, with striping across storage servers configurable on a per-file or per-directory basis.

  • Best-in-class client throughput
    8 GB/s with a single streaming process on a 100 Gbit network, while a few streams can fully saturate the network.

  • Linear scalability through dynamic metadata namespace partitioning.

  • Flexible choice of underlying file system to fit the given storage hardware. BeeGFS Storage Pools make different types of storage devices available within the same namespace. With SSDs and HDDs in separate pools, pinning a user project to the flash pool gives that project all-flash performance, while other data still benefits from the cost-efficient high capacity of spinning disks.

  • BeeGFS supports a wide range of Linux distributions, such as RHEL/Fedora, SLES/openSUSE, and Debian/Ubuntu, as well as a wide range of Linux kernels, from the ancient 2.6.18 up to the latest vanilla kernel. The storage services run on top of an existing local file system (such as XFS or ZFS) using the normal POSIX interface, and clients and servers can be added to an existing system without downtime.

  • BeeGFS supports multiple networks and dynamic failover in case one of the network connections is down.

  • BeeGFS client and server components can also run on the same physical machines. Thus, BeeGFS can turn a compute rack into a cost-efficient converged data processing and shared storage unit, eliminating the need for external storage resources and providing simplified management.
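To put the single-stream throughput figure from the list above in context, the short calculation below converts the 100 Gbit link speed to bytes and compares it with the quoted 8 GB/s. It assumes decimal units (1 GB = 10^9 bytes) and ignores protocol overhead, so it is a back-of-the-envelope sketch rather than a benchmark; the function names are made up for this example.

```python
import math


def line_rate_gbytes(link_gbits: float) -> float:
    """Convert a link speed in Gbit/s to GB/s (8 bits per byte, no overhead)."""
    return link_gbits / 8.0


def streams_to_saturate(link_gbits: float, per_stream_gbytes: float) -> int:
    """Smallest number of equal streams whose combined rate reaches line rate."""
    if link_gbits <= 0 or per_stream_gbytes <= 0:
        raise ValueError("link speed and per-stream rate must be positive")
    return math.ceil(line_rate_gbytes(link_gbits) / per_stream_gbytes)


if __name__ == "__main__":
    link_gbits = 100.0  # network speed quoted in the article
    stream_gbs = 8.0    # single-process throughput quoted in the article

    rate = line_rate_gbytes(link_gbits)
    print(f"Line rate: {rate} GB/s")
    print(f"One stream uses {stream_gbs / rate:.0%} of the link")
    print(f"Streams needed to saturate: {streams_to_saturate(link_gbits, stream_gbs)}")
```

Under these assumptions a single stream already uses roughly two thirds of the link, which is consistent with the claim that only a few streams are needed to saturate it.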

BeeGFS Primary Enterprise Features

  • Storage and metadata mirroring with high availability
  • System-wide disk space allocation, quota tracking and enforcement
  • Access control lists (ACLs)
  • Storage pools
  • BeeOND: BeeGFS On Demand (Burst Buffering)

Recent BeeGFS releases have added improvements such as:

  • Added support for new operating system and Mellanox OFED releases.
  • Added option KRELEASE to client makefile to facilitate client builds in a chroot.
  • Added option to disable SSL certificate checks in beegfs-mon to enable self-signed certificates for InfluxDB.
  • Customer contribution: RDMA completion queue interrupts will now be distributed across all CPU cores instead of always using the first core.
  • Multiple updates to documentation.

At Exxact Corporation, we pride ourselves on building high-quality turnkey storage solutions. Our BeeGFS Solution is built on validated hardware for the most demanding HPC workloads. If you’d like to learn more, contact our sales engineering team. They're ready to build out the right solution for you.