Shared Memory

My initial contact with openMosix was a couple of years ago, when I was looking for a way to increase the performance of a MySQL database. After numerous tests using different kernels and different file systems I still wasn't satisfied, so I searched for a way to use multiple machines. That is how I ran into the openMosix project.

My initial tests failed due to, among other reasons, the fact that MySQL uses shared memory. With this disappointment in mind, I kept following the openMosix project.

As our Wiki shows, most of the programs that don't migrate to other nodes are locked to their home node because they use shared memory. With the announcement of the migShm patch from the MAASK group, I decided to retry my original tests and to extend them to the applications listed in the Wiki as not migrating due to their use of shared memory.

Maya, Anu, Asmita, Snehal and Krushna, the developers of migShm, are students of Cummins College of Engineering, University of Pune, India. They spent a year working on this project and wrote a report explaining how they implemented it. The following paragraphs try to summarize their work.

The Shared Memory Problem

The words "shared memory" already hint at the problem. When a process that does not use shared memory has to be migrated, openMosix destroys the memory map and the related pages on the home node and recreates them on the remote node. This cannot be done when an application uses shared memory, since other applications are using the same memory regions.

Second, imagine two processes on the same node that share memory, both migrating to the same remote node. One could have implemented a solution where each process creates new instances of its required memory regions, doubling the memory usage. What one actually wants is for both processes to keep sharing the same memory regions on the remote node as well.

The MAASK Solution

Given the scope of this project, the MAASK group decided on a modular approach, with each module being responsible for one of the following aspects: Migration of shared memory processes, Communication Module, Consistency Module, Access logs and Migration Decisions, Migration of shared memory, and Thread migration.

In order to manage shared memory one needs a number of system calls such as shmget(), shmat(), shmdt() and shmctl(). Each of these calls has to be made transparently available on the remote node when a shared memory process migrates there; the calls then have to access the local memory map.
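For readers who have not used these calls, the following is a minimal sketch of the normal System V shared memory life cycle on a single, unpatched node. It is not taken from migShm; the function name shm_roundtrip() is made up for this illustration. It attaches the same segment twice to mimic two cooperating processes.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Illustrative helper (hypothetical name): create a private segment,
 * write a value through one attachment, read it back through a second
 * attachment, then clean up.
 * Returns the value read back, or -1 on any failure. */
int shm_roundtrip(int value)
{
    assert(value >= 0);  /* -1 is reserved for the error return */

    /* shmget(): allocate a new segment large enough for one int. */
    int id = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    /* shmat(): map the segment twice, as two cooperating processes would. */
    int *writer = shmat(id, NULL, 0);
    int *reader = shmat(id, NULL, 0);
    int result = -1;

    if (writer != (void *)-1 && reader != (void *)-1) {
        *writer = value;   /* store through the first mapping    */
        result = *reader;  /* visible through the second mapping */
    }

    /* shmdt(): detach every mapping that succeeded. */
    if (writer != (void *)-1)
        shmdt(writer);
    if (reader != (void *)-1)
        shmdt(reader);

    /* shmctl(IPC_RMID): mark the segment for destruction. */
    shmctl(id, IPC_RMID, NULL);
    return result;
}
```

Calling shm_roundtrip(42) should hand back 42 on a system where System V shared memory is enabled. Each of the four calls in this sequence is what migShm has to keep working once the caller no longer runs on its home node.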

A Communication Module was needed to provide communication between the deputy and the remote stub. The MAASK group used the commlink for this purpose. However, to guarantee communication between two processes on different nodes they also had to implement a daemon called MigSharedMemD, which is active on every node of the openMosix cluster and listens on port 0x3418. A header type is defined to facilitate the communication.
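To give an idea of what such a header type might look like, here is a sketch under loud assumptions: the struct layout, the field names, the message kinds and the helper functions are all invented for this illustration and are not the definitions used by migShm. Only the port number comes from the text above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MIGSHM_PORT 0x3418  /* port from the text; 13336 in decimal */

/* Hypothetical message kinds, chosen for illustration only. */
enum msg_kind { MSG_PAGE_REQUEST = 1, MSG_INVALIDATE = 2, MSG_WRITE_BACK = 3 };

/* Hypothetical header layout, NOT the one defined by migShm. */
struct msg_header {
    uint32_t kind;    /* one of enum msg_kind                */
    uint32_t shm_id;  /* identifier of the shared segment    */
    uint32_t page;    /* page index inside the segment       */
    uint32_t length;  /* number of payload bytes that follow */
};

#define HEADER_WIRE_SIZE 16

/* Serialize the header in network byte order so that nodes with
 * different endianness agree on its contents. */
void header_encode(const struct msg_header *h,
                   unsigned char out[HEADER_WIRE_SIZE])
{
    uint32_t f[4] = { htonl(h->kind), htonl(h->shm_id),
                      htonl(h->page), htonl(h->length) };
    assert(sizeof f == HEADER_WIRE_SIZE);
    memcpy(out, f, sizeof f);
}

/* Inverse of header_encode(). */
void header_decode(const unsigned char in[HEADER_WIRE_SIZE],
                   struct msg_header *h)
{
    uint32_t f[4];
    memcpy(f, in, sizeof f);
    h->kind   = ntohl(f[0]);
    h->shm_id = ntohl(f[1]);
    h->page   = ntohl(f[2]);
    h->length = ntohl(f[3]);
}
```

A daemon built along these lines would read HEADER_WIRE_SIZE bytes from the socket, decode them, and then read as many payload bytes as the length field announces.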

Different processes using the same shared memory regions can migrate to different nodes, where they work on local copies of that shared memory. Consistency has to be maintained between these local copies. The MAASK group opted for the Eager Release consistency model: local copies of modified pages are written back to the original owner only when the lock on that memory segment is released, thus not on every write. This ensures that the owner node of the shared memory has the latest copy once the lock is released. When another remote process reads the memory, it page faults to the owner node to fetch the up-to-date pages.

A mig_shm_daemon() runs on every node. Once the owner sends an invalidate message to the remotes, the page table entries of all remote processes for the modified pages are invalidated. The next time a process tries to access a recently invalidated page, it page faults to the owner node and gets the latest copy from there.

Pages are updated from the remote node to the owner node by a write-back operation: just before the lock is released on the remote node, the dirty pages are sent back to the owner node and restored to their correct position.
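The interplay of write-back, invalidation and page faults described above can be modeled in a few lines of user-space C. This is a toy simulation only: the kernel patch works on real page tables and network messages, whereas here nodes are plain structs and a "page fault" is a memcpy(). All type and function names are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES      4
#define PAGE_SIZE   8
#define MAX_REMOTES 4

struct remote;

/* Owner node: holds the authoritative copy of every page. */
struct owner {
    char pages[NPAGES][PAGE_SIZE];
    struct remote *remotes[MAX_REMOTES];
    int nremotes;
};

/* Remote node: works on local copies of the pages. */
struct remote {
    struct owner *owner;
    char pages[NPAGES][PAGE_SIZE];
    bool valid[NPAGES];  /* local copy may be used without a fault */
    bool dirty[NPAGES];  /* modified locally, not yet written back */
};

void owner_init(struct owner *o)
{
    memset(o, 0, sizeof *o);
}

void remote_attach(struct remote *r, struct owner *o)
{
    assert(o->nremotes < MAX_REMOTES);
    memset(r, 0, sizeof *r);
    r->owner = o;
    o->remotes[o->nremotes++] = r;
}

/* Simulated page fault: fetch the page from the owner if needed. */
static void fault_in(struct remote *r, int page)
{
    if (!r->valid[page]) {
        memcpy(r->pages[page], r->owner->pages[page], PAGE_SIZE);
        r->valid[page] = true;
    }
}

char remote_read(struct remote *r, int page, int off)
{
    fault_in(r, page);
    return r->pages[page][off];
}

/* Writes touch only the local copy; nothing is sent to the owner yet. */
void remote_write(struct remote *r, int page, int off, char c)
{
    fault_in(r, page);
    r->pages[page][off] = c;
    r->dirty[page] = true;
}

/* Lock release: write dirty pages back, then invalidate other copies. */
void remote_release(struct remote *r)
{
    struct owner *o = r->owner;
    for (int p = 0; p < NPAGES; p++) {
        if (!r->dirty[p])
            continue;
        memcpy(o->pages[p], r->pages[p], PAGE_SIZE);
        r->dirty[p] = false;
        for (int i = 0; i < o->nremotes; i++)
            if (o->remotes[i] != r)
                o->remotes[i]->valid[p] = false;
    }
}
```

With two remotes attached, a write on one of them stays invisible to the owner and to the other remote until remote_release() is called. After the release, the other remote's next read faults and picks up the new contents.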

Just as in the original openMosix algorithms, one has to know which processes that use shared memory are actually suitable for migration. To facilitate this, a module has been written that monitors and logs the access to shared memory. The functionality of openMosix's memsorter daemon has been extended to monitor all accesses of different processes to a shared memory region. MAASK assumed that a process attaches to only one shared memory region.

Based on these statistics and a threshold, migShm determines how strongly a process is linked to a certain shared memory region. A weakly linked process is migrated without migrating the memory. For a strongly linked process, whether it is migrated with or without the memory depends on how strongly other processes are linked to that region. If the selected process is the most active one, the memory is migrated with the process.
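The decision rule in this paragraph can be sketched as a small function. The sketch assumes that "how strongly linked" is measured by a per-process access count and that a count below the threshold means weakly linked. It also assumes that a tie for most active still moves the memory. Neither detail is confirmed by the text, and the names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum decision { MIGRATE_PROCESS_ONLY, MIGRATE_WITH_MEMORY };

/* accesses[i] is the logged number of accesses of process i to the
 * shared region; selected is the index of the migration candidate. */
enum decision decide_migration(const unsigned long accesses[], size_t n,
                               size_t selected, unsigned long threshold)
{
    assert(selected < n);

    /* Weakly linked: the process moves, the memory stays. */
    if (accesses[selected] < threshold)
        return MIGRATE_PROCESS_ONLY;

    /* Strongly linked: the memory follows only if no other process
     * uses the region more heavily than the candidate. */
    for (size_t i = 0; i < n; i++)
        if (i != selected && accesses[i] > accesses[selected])
            return MIGRATE_PROCESS_ONLY;

    return MIGRATE_WITH_MEMORY;
}
```

For access counts of 5, 50 and 20 and a threshold of 10, the first process is weakly linked and moves alone, the second is the most active and takes the memory with it, and the third is strongly linked but less active than the second, so it also moves without the memory.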

When a process is found to be suitable for migration and the shared memory is migrated along with the process to a remote node, this remote node becomes the owner node of the shared memory.