Phone: 949.380.7288
Email: 4sales@pssclabs.com
PSSC Labs

SUPPORT: CBeST

What is CBeST?

Overview
What do you need out of your Linux Supercomputer? Reliability, performance, ease of use and excellent support are probably at the top of your list. PSSC Labs understands the significance of your work and your need for the most powerful, reliable Linux Supercomputer possible. PSSC Labs is the answer to your Linux Supercomputer needs. PSSC Labs Linux Experts create an elegant, easy-to-use Linux Supercomputer solution that is ready for use right out of the box, so you will not need to spend valuable time configuring it. Instead, PSSC Labs Linux Experts deliver your supercomputer with CBeST, the Complete Beowulf Software Toolkit. CBeST is much more than cluster management software. It is a detailed process designed to deliver exactly what you need out of your Linux Supercomputer: reliability, performance, ease of use and excellent support.

Like most great software packages, CBeST evolved out of open source components. CBeST now includes PSSC Labs specific custom scripts and integration protocols that have been developed over the past five years and have a proven track record of success. These CBeST components are elegantly integrated, allowing you to easily manage, monitor, maintain, support and upgrade your Linux Supercomputer. PSSC Labs' Linux Experts customize and optimize CBeST to your specific computing needs. This customization consists of kernel optimization for performance enhancement and custom scripts for supercomputer maintenance. PSSC Labs Linux Supercomputers are the most turnkey solution possible.

PSSC Labs has delivered over 400 Linux Supercomputers to the world's most demanding computing environments, including NASA, the United States Department of Defense, United States National Labs and over 100 prestigious universities. Through this unmatched and expanding experience, PSSC Labs develops CBeST to accommodate all types of Linux Supercomputer users. Whether you are a NASA systems administrator, a research scientist or a first-time Linux user, CBeST is the perfect tool.

Your PSSC Labs Linux Supercomputer comes with unlimited phone and email support for the lifetime of your supercomputer. Our experience allows us to support all aspects of your Linux Supercomputer including troubleshooting open source software such as MPI and Open PBS. PSSC Labs Supercomputers come with a user manual that walks you through every component of your new supercomputer.

CBeST Components - Customization and Optimization For Superior Performance

Optimized Kernel
Most Linux Supercomputer manufacturers focus on the nuts and bolts of a Linux Supercomputer. PSSC Labs pays careful attention to your supercomputer's hardware stability and performance. Once the hardware is fully tested and benchmarked, PSSC Labs Linux Experts go to work, installing and optimizing your supercomputer's Linux kernel to match your specific hardware and software needs. Our Linux Experts carefully customize all aspects of the Linux kernel, turning kernel options on or off to best match your hardware. This prevents unnecessary drain on system resources. PSSC Labs Cluster Experts utilize ACPI (Advanced Configuration and Power Interface) to enhance performance through better resource management at the motherboard level. Hardware drivers are updated for maximum performance. PSSC Labs Linux Experts also update and patch file system drivers and userland utilities. One important note: the optimized kernel remains completely open source, so you can make any necessary adjustments.

Kernel optimization is a time-consuming process. However, through this process PSSC Labs Linux Experts can improve cluster performance by as much as 15%. Some Linux builders skip this important step and simply install a prepackaged version of Linux. PSSC Labs takes this extra time in order to deliver the most complete, turnkey high performance computing solution, not just a bunch of computers. Complete instructions for kernel upgrades are included.

Custom Scripts
PSSC Labs includes several custom scripts to ease cluster maintenance. Although the scripts are custom, PSSC Labs can provide source information if required. Below is a sample of the scripts included with each PSSC Labs Beowulf Linux Cluster.

gsh
Executes commands in parallel on all running compute nodes. It uses Ganglia to build a node list of active nodes. It uses rcp/scp.
Arguments: A single command with or without arguments or a string of multiple commands enclosed in quotes.

Examples:
:: gsh uptime
:: gsh 'sensors | grep Temp'
:: gsh 'cd /var/spool/up2date && rpm -Fv *rpm'
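
The source does not show how gsh is implemented. As a rough sketch of the parallel pattern described above, the function below starts one command per node at the same time and labels each line of output with the node name. The name run_parallel, the RUNNER argument and the node names are hypothetical. On a real cluster the runner would be a remote shell such as ssh or rsh, and the node list would come from Ganglia.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual gsh source.
# Usage: run_parallel RUNNER 'COMMAND' NODE...
run_parallel() {
    local runner=$1 cmd=$2 node
    shift 2
    for node in "$@"; do
        # Start one background job per node and tag its output with the node name.
        "$runner" "$node" "$cmd" 2>&1 | sed "s/^/$node: /" &
    done
    # Block until every background job has finished.
    wait
}
```

A call such as run_parallel ssh 'sensors | grep Temp' n01 n02 n03 would then behave roughly like the second gsh example. Output lines from different nodes may arrive in any order.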

rshall/sshall
Executes commands sequentially on all running compute nodes one node at a time. If you don't give it a command it will start a shell on each node. Only one shell per node is started at a time. To move to the next node you need to exit from the current shell. It uses Ganglia to build a node list of active nodes.
Arguments: A single command with or without arguments or a string of multiple commands enclosed in quotes.

Examples:
:: rshall free
:: rshall df -h /scratch
:: sshall 'cd /scratch/ && mkdir -v test && cp -av /net/master/scratch/data/* test'
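
For contrast with the parallel case, here is a minimal sketch of the one-node-at-a-time behavior described above. It is not the actual rshall/sshall source. The name run_sequential, the RUNNER argument and the node names are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual rshall/sshall source.
# Usage: run_sequential RUNNER 'COMMAND' NODE...
run_sequential() {
    local runner=$1 cmd=$2 node
    shift 2
    for node in "$@"; do
        echo "=== $node ==="
        # Runs in the foreground: the next node starts only after this returns.
        if [ -n "$cmd" ]; then
            "$runner" "$node" "$cmd"
        else
            # No command given: with ssh as the runner this opens a shell
            # on the node, and exiting it moves on to the next node.
            "$runner" "$node"
        fi
    done
}
```

For example, run_sequential ssh 'df -h /scratch' n01 n02 prints the disk usage of each node in order, under a header line per node.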

rshallbg/sshallbg 'commands'
Same as the rshall/sshall scripts with the addition of backgrounding each rsh/ssh "fork". It uses Ganglia to build a node list of active nodes.
Arguments: A single command with or without arguments or a string of multiple commands enclosed in quotes.

Examples:
:: rshallbg 'df -h /scratch'
:: sshallbg 'cd /scratch/ && mkdir -v test && cp -av /net/master/scratch/data/* test'

node-wakeup
This script uses WOL (Wake-on-LAN) to remotely turn on nodes. It uses the MAC addresses found in /etc/dhcpd.conf. Note that WOL is not a fully reliable service. It uses Ganglia to build a node list of active nodes.
Arguments: all or short node name.

Examples:
:: node-wakeup all
:: node-wakeup n12
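
The lookup step, finding a node's MAC address in /etc/dhcpd.conf, might look something like the sketch below. This is an illustration and not the actual node-wakeup source. It assumes host entries of the usual "host NAME { hardware ethernet MAC; ... }" form, and the function name node_macs is hypothetical. Sending the Wake-on-LAN packet itself would be left to a separate utility, which is not shown here.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual node-wakeup source.
# Reads dhcpd.conf text on stdin and prints "NODE MAC" for one node or for all.
# Usage: node_macs all|NODE < /etc/dhcpd.conf
node_macs() {
    awk -v want="$1" '
        # Remember the name from the most recent "host NAME {" declaration.
        $1 == "host" { name = $2; sub(/[{].*/, "", name) }
        # The field after "ethernet" is the MAC address, terminated by ";".
        /hardware ethernet/ {
            for (i = 1; i < NF; i++) {
                if ($i == "ethernet") {
                    mac = $(i + 1)
                    sub(/;.*/, "", mac)
                    if (want == "all" || want == name) print name, mac
                }
            }
        }
    '
}
```

Running node_macs n12 < /etc/dhcpd.conf prints the name and MAC address for n12 only, while node_macs all lists every host entry that has a hardware address.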

node-reboot
This script reboots all the currently running compute nodes. It uses gsh or rsh/ssh for remote command execution.
Arguments: all or short node name.

Examples:
:: node-reboot all
:: node-reboot n12

node-poweroff
This script halts (and powers off when supported) all the running compute nodes. It uses gsh or rsh/ssh for remote command execution.
Arguments: all or short node name.

Examples:
:: node-poweroff all
:: node-poweroff n12

node-up2date
This script updates out-of-date RPM packages on the running compute nodes. It uses gsh or rsh/ssh for remote command execution.
Arguments: None.

Examples:
:: node-up2date

cluster-syslog-monitor
This script uses a syslog FIFO log file to help you see what your cluster is currently doing. All of the nodes send a copy of their syslog messages to the head node.
Arguments: None.
Examples:
:: cluster-syslog-monitor
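
To illustrate the FIFO mechanism only, the self-contained demo below creates a named pipe, has a background writer stand in for the syslog daemon, and reads the messages back line by line the way a monitor would. The function name, the temporary path and the "[cluster]" prefix are all hypothetical. How the head node's syslog daemon is configured to feed the real FIFO is not shown in the source.

```shell
#!/bin/sh
# Hypothetical demo of reading log lines from a FIFO.
# This is not the actual cluster-syslog-monitor source.
# Usage: demo_fifo_monitor 'MESSAGE'...
demo_fifo_monitor() {
    local dir fifo line
    dir=$(mktemp -d) || return 1
    fifo=$dir/cluster.fifo
    mkfifo "$fifo" || return 1
    # Writer: stands in for the syslog daemon writing node messages.
    printf '%s\n' "$@" > "$fifo" &
    # Reader: prints each message as soon as it arrives, until the writer closes.
    while IFS= read -r line; do
        printf '[cluster] %s\n' "$line"
    done < "$fifo"
    wait
    rm -rf "$dir"
}
```

Unlike a regular log file, a FIFO keeps nothing on disk: messages are seen only while a reader is attached, which suits a live view of current cluster activity.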

pssc-gmetric-sensors
This script is used to record LM_Sensors and other system health data using Ganglia's gmetric command. The metrics are then available with the use of the "ganglia" command or through the Ganglia web interface.
Arguments: None.
Configuration File: /etc/sysconfig/pssc-gmetric-sensors
Examples:

:: crontab -l | grep pssc-gmetric-sensors

0 * * * *       /opt/sbin/pssc-gmetric-sensors
5 * * * *       /opt/sbin/pssc-gmetric-sensors
10 * * * *      /opt/sbin/pssc-gmetric-sensors
15 * * * *      /opt/sbin/pssc-gmetric-sensors
20 * * * *      /opt/sbin/pssc-gmetric-sensors
25 * * * *      /opt/sbin/pssc-gmetric-sensors
30 * * * *      /opt/sbin/pssc-gmetric-sensors
45 * * * *      /opt/sbin/pssc-gmetric-sensors
50 * * * *      /opt/sbin/pssc-gmetric-sensors
55 * * * *      /opt/sbin/pssc-gmetric-sensors

mygetimage-nodes
This command is used to grab the SystemImager golden image (usually from n01).
Arguments: None.

Examples:
:: ssh n01
prepareclient
service autofs stop
exit

:: mygetimage-nodes

Network Configuration & Custom Settings
Network performance is crucial to the overall performance of your Linux Supercomputer. In most Linux Supercomputers, the network is the most significant performance bottleneck. PSSC Labs understands this and pays careful attention to your supercomputer's network configuration. Before CBeST installation begins on your new supercomputer, a PSSC Labs Linux Expert will contact you to discuss your particular network environment and software usage needs. We will then properly configure the supercomputer, including network layout, custom IP settings and security settings. The latest network drivers for your specific hardware are installed to ensure maximum performance and stability. DHCP/BOOTP is used to centralize node IP configuration and to facilitate rapid deployment of additional nodes using System Imager. Multiple MPI jobs are run to test performance and stability before your Linux Supercomputer leaves our facility.

PSSC Labs supports all forms of networking including Gigabit Ethernet, 10 Gigabit Ethernet, Myrinet, InfiniBand, Dolphinics and Quadrics. CBeST documentation includes detailed explanations of these network settings and information on how to change them.

Hard Drive Partitioning
The Linux operating system is very flexible, allowing PSSC Labs Linux Experts to partition your head node and slave node hard drives for optimum performance and reliability. This includes separating the operating system from data files, so that you can easily reinstall the operating system without wiping out data.

Swap space plays a key role in overall performance. PSSC Labs Linux Experts will work with you making recommendations to maximize performance. You can select the amount of swap space required for your specific application. Don't know how much swap space is needed? Ask a PSSC Labs Linux Expert. We can help. Details on how your hard drive is partitioned are included in the CBeST user manual.

Message Passing Software

MPI
At the core of your Linux Supercomputer is the message passing interface. PSSC Labs installs any or all message passing software packages depending on your needs. This includes PVM (Parallel Virtual Machine), MPICH (Portable Implementation of MPI), MPICH-GM (Myricom Developed Port of MPICH) and LAM MPI (An Implementation of the Message Passing Interface). Before your Linux Supercomputer leaves our facility, extensive message passing code testing is conducted to ensure maximum performance and stability. CBeST documentation includes several examples of running MPI jobs. PSSC Labs Linux Experts are available to answer all your troubleshooting questions.
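
As a small illustration of preparing an MPI run, the helper below prints a machine file with one "NODE:CPUS" line per node, a format accepted by MPICH's mpirun. The function name, node names and CPU count are hypothetical, and other MPI implementations such as LAM use different host file formats. The examples in the CBeST documentation may differ.

```shell
#!/bin/sh
# Hypothetical helper: print one "NODE:CPUS" line per node (MPICH-style machine file).
# Usage: make_machinefile CPUS_PER_NODE NODE...
make_machinefile() {
    local cpus=$1 node
    shift
    for node in "$@"; do
        printf '%s:%s\n' "$node" "$cpus"
    done
}
```

With MPICH, the result could be used along the lines of make_machinefile 2 n01 n02 > machines followed by mpirun -np 4 -machinefile machines ./my_mpi_program, where my_mpi_program is a placeholder for your own binary.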

Batch Scheduling Software

Open PBS
Most Linux Supercomputers delivered by PSSC Labs perform tasks other than parallel processing. PSSC Labs Linux Supercomputers are excellent for batch scheduling applications as well. Open PBS is installed, tested and delivered with your Linux Supercomputer. Open PBS is a powerful and flexible batch queuing system developed in the early nineties. Open PBS allows you to control all resources including user priority, memory access, process access, node access, duration of execution, job priorities and much more. Contact a PSSC Labs Linux Expert to find out more about Open PBS.
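
To make the resource controls above more concrete, the sketch below prints a simple Open PBS job script that requests a number of nodes, processors per node and a wall-clock limit. The helper name make_pbs_job and all of the values are hypothetical. The #PBS directives and the PBS_O_WORKDIR variable are standard PBS features, but the queues and limits on your own system may differ.

```shell
#!/bin/sh
# Hypothetical helper: print an Open PBS job script to stdout.
# Usage: make_pbs_job NAME NODES PPN WALLTIME 'COMMAND'
make_pbs_job() {
    cat <<EOF
#!/bin/sh
#PBS -N $1
#PBS -l nodes=$2:ppn=$3
#PBS -l walltime=$4
#PBS -j oe
cd "\$PBS_O_WORKDIR"
$5
EOF
}
```

For example, make_pbs_job test 4 2 01:00:00 './my_program' > test.pbs writes a job file that can then be submitted with qsub test.pbs. Here my_program is a placeholder for your own executable.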

Optional Batch Schedulers
PSSC Labs also delivers supercomputers using Sun Grid Engine.

Supercomputer Management & Monitoring

Ganglia Supercomputer Monitoring Utility
Ganglia allows you to quickly and easily view all important operating conditions of your Linux Supercomputer through a GUI. You can remotely access details regarding CPU usage, CPU temperatures, chassis and CPU fan speeds, hard drive temperatures, memory utilization, hard drive swap space and many more metrics. Ganglia even allows you to define and view your own set of metrics.
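
Custom metrics are published to Ganglia with its gmetric command. As a minimal sketch, the function below turns temperature lines in the style of the sensors command into gmetric invocations. It is not the pssc-gmetric-sensors script. The function name, the assumed "LABEL: +45.0 C" line format and the choice of units are illustrative, and real sensors output varies by motherboard.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual pssc-gmetric-sensors source.
# Reads sensors-style text on stdin and prints one gmetric command per
# temperature line. Pipe the result to sh to actually publish the metrics.
sensors_to_gmetric() {
    awk -F: '/Temp/ {
        name = $1
        gsub(/ /, "_", name)    # metric names without spaces
        # Take the first number after the colon as the reading.
        if (match($2, /[0-9]+(\.[0-9]+)?/)) {
            value = substr($2, RSTART, RLENGTH)
            printf "gmetric --name %s --value %s --type float --units Celsius\n", name, value
        }
    }'
}
```

On a node with LM_Sensors configured, sensors | sensors_to_gmetric shows the commands that would be run. Printing them first, instead of executing them directly, makes it easy to check the parsed names and values.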

Temperature Monitoring
Maintaining a good computing environment is key to the success and longevity of your Linux Supercomputer. Application runs can require thousands of compute hours, so your Linux Supercomputer needs to be as reliable as possible. CBeST includes LM_Sensors to monitor CPU temperatures and case fans. PSSC Labs Linux Experts configure LM_Sensors to send a warning if a CPU overheats.

System Imager
Linux Supercomputer administrators face many issues including maintaining a consistent OS kernel on all nodes. In addition, Linux administrators may face the daunting task of repairing a corrupted file system, replacing a failed hard drive and adding new nodes to an existing Linux Supercomputer. CBeST eases this process with the use of System Imager. System Imager enables you to manually create an entire slave node kernel image on the head node and then push this image to every slave node of your supercomputer. You can update the slave node systems by syncing them to an updated image on the head node. These updates are extremely fast because only the configuration portions that have changed will be pushed to the slave nodes. Complete details on using System Imager are included in the CBeST user manual.

Custom Scripts
PSSC Labs Supercomputer Technicians include custom scripts to help you better manage the supercomputer. All scripts and shell aliases are included with your Linux Supercomputer to facilitate their use.

1) A node synchronization cron script allows for remote file copy across the cluster, keeping all nodes updated to the latest configuration.

2) Parallel command scripts use the Ganglia Execution Environment (gexec), the Parallel Distributed Shell (pdsh) and ssh/rsh/rlogin commands to launch and execute administrative commands across cluster nodes.

3) Health monitoring scripts are installed to work with Ganglia to record and graphically represent cluster metrics dating back up to one year. These metrics include processor, motherboard and hard drive temperatures. Processor and chassis fan status can also be monitored through Ganglia. You can also add your own scripts to track additional metrics.

4) Power management scripts are included to allow remote power off and power on functions.

Security Settings
Port mapper and ipchains / iptables are configured to help keep your supercomputer secure. Cron scripts are included to keep user accounts and configuration files synchronized. Examples for allowing client machines access to your cluster through NFS resources are provided in your user manual.
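
The exact rules PSSC Labs configures are not listed here. Purely as an illustration of the iptables approach, the sketch below prints a small rule set that trusts one cluster subnet, allows SSH from anywhere and drops all other inbound traffic, which also keeps the port mapper unreachable from outside the subnet. The function name and the subnet are hypothetical, and any real rule set should be reviewed against your own site policy.

```shell
#!/bin/sh
# Hypothetical sketch: print iptables commands for a simple head node policy.
# Review the output before piping it to sh as root.
# Usage: print_firewall_rules SUBNET
print_firewall_rules() {
    local subnet=$1
    cat <<EOF
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -s $subnet -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -P INPUT DROP
EOF
}
```

The default DROP policy is set last so that an existing remote session is already covered by the ESTABLISHED rule when the policy takes effect.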

Compilers

GNU Compilers & Commercial Compilers
CBeST includes the freely available GNU compilers with your Linux Supercomputer. We do offer installation of commercial compilers from the Portland Group and Intel.

PORTLAND GROUP Workstation, Server and Supercomputer Compilers: optimized compilers including HPF, F90, F77, C, C++ and a debugger

INTEL FORTRAN & C++ Compilers: compiler products for the Linux operating system

CBeST Scalability
Designed to take advantage of low cost commodity components, CBeST scales to thousands of nodes. CBeST currently runs on over 5000 nodes across the United States.

CBeST User Manual
The CBeST User Manual has been called "exactly what I want to see" by system administrators. Detailed information on CBeST tools, installation, customization, operation and troubleshooting is included with every PSSC Labs Linux Supercomputer.

Your CBeST user manual is your ultimate supercomputer resource. From the minute your PSSC Labs supercomputer arrives at your door, your user manual will guide you through the initial installation of the supercomputer to troubleshooting MPI and PBS jobs. Developed by the same PSSC Labs Linux Experts that installed the software on your Linux Supercomputer, the user manual is designed to assist and educate any and all levels of supercomputer administrators. If you are a research scientist, focusing on your own work and not administering a supercomputer, CBeST documentation is the perfect first resource guide. If you are a full time system administrator, this manual will walk you through all aspects of the CBeST installation and give you plenty of options to make your own changes.

CBeST Support
All PSSC Labs Linux Supercomputers include complete email and phone support for the lifetime of your Linux Supercomputer. This covers all questions related to CBeST, including MPI and Open PBS. There is never a fee for this support, no matter how many times you contact PSSC Labs. The experience PSSC Labs gains through delivering 300 Linux Supercomputers enables our Linux Experts to support any issues you may have. We can answer your technical questions regarding the Linux operating system, message passing software, security settings and performance. PSSC Labs prides itself on providing the best level of support possible. Put us to the test and contact a PSSC Labs Linux Expert today.