openMosix

The openMosix software packages turn networked computers running GNU/Linux into a cluster. The cluster automatically balances the load between its nodes, and nodes can join or leave the running cluster without disruption. The load is spread among the nodes according to their connection and CPU speeds.

openMosix is a Linux kernel patch that provides full compatibility with standard Linux on IA32-compatible platforms. Its internal load-balancing algorithm transparently migrates processes to other cluster members, which gives better load sharing between the nodes. The cluster tries to optimize utilization at all times, although the sysadmin can override the automatic load balancing through manual configuration at runtime.

Programs such as Flac, Bladeenc, Povray and mjpeg tools work without any modifications, as does MPI; indeed, MPI applications benefit from running in an openMosix environment. Although a process gets started on node 1, the cluster determines, based on several criteria, whether it would be better to run that process on another, less loaded node. openMosix uses an advanced algorithm based on market economics to determine which node suits the application best. Of course, you can always force an application to run on a certain node. This way even applications that have already been parallelized will gain from openMosix.
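The idea of weighing a node's load against its CPU and connection speed can be illustrated with a small user-space sketch. This is not the openMosix algorithm, which lives in the kernel and is based on market economics; it is a simplified stand-in under stated assumptions. The Node fields, the estimated_cost formula and the migration_overhead parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_speed: float   # relative CPU speed (hypothetical unit, 1.0 = baseline)
    link_speed: float  # relative connection speed to this node (hypothetical unit)
    load: int          # number of runnable processes currently on the node


def estimated_cost(node, is_home, migration_overhead=0.5):
    """Toy cost of placing one more process on `node`.

    Assumption: running cost grows with load and shrinks with CPU speed,
    and moving a process away from its home node adds a penalty that is
    larger on slow connections.
    """
    run_cost = (node.load + 1) / node.cpu_speed
    move_cost = 0.0 if is_home else migration_overhead / node.link_speed
    return run_cost + move_cost


def pick_node(nodes, home):
    """Return the node with the lowest estimated cost for a process started on `home`."""
    return min(nodes, key=lambda n: estimated_cost(n, n.name == home))


cluster = [
    Node("node1", cpu_speed=1.0, link_speed=1.0, load=4),  # busy home node
    Node("node2", cpu_speed=2.0, link_speed=1.0, load=1),  # fast, lightly loaded
    Node("node3", cpu_speed=1.0, link_speed=0.1, load=0),  # idle but on a slow link
]
best = pick_node(cluster, home="node1")
print(best.name)
```

In this sketch the process started on node1 would move to node2: the idle node3 loses because its slow connection makes migration expensive, which mirrors the point that both connection and CPU speed matter, not load alone.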

What it comes down to is that openMosix is one of the easiest and fastest ways to set up your own high-performance cluster: no hassle with extra required libraries and no hassle with configuration problems. Add to that an excellent cluster management interface such as openMosixview, and you are on the road to getting your own supercomputer.

Users log on to the cluster, start their application and leave. openMosix looks for the best node to run the application on and migrates it there; users do not have to find a suitable node themselves. Two minutes later another user starts another application and the same process happens, giving your whole cluster some work. To the users it is one giant SMP machine. While in yesterday's environment each job had to wait until the previous one had finished, applications can now run side by side. Because nodes can join and leave the cluster, you might want to set up an environment where your employees do their normal work on their workstations during the day, and at night those workstations join the openMosix cluster and start number crunching.

openMosix is at its best when it has multiple long-running jobs, sometimes the same or similar ones. Examples of this kind of application are numerous, and might include financial institutions running risk analysis or scientific research institutions running DNA comparisons.

However, some features were still missing in yesterday's release of openMosix. Not all applications can migrate to other nodes: some of them require features that prevent a process from being separated into a system and a user context. Some processes are so tightly bound to hardware or network features that migrating them would slow them down rather than give them more resources.

During the past year, several people started working on solutions for these issues. Among them were implementations that support checkpointing, the autodiscovery tools that made openMosix nice and easy to install, the General openMosix Daemon, the Load-Limit patch, the Cluster-Mask patch and Shared Memory Migration.