SSS-PC is a next-generation operating system that runs on IBM PC compatible personal computers and on Sun Microsystems Ultra 60 and Ultra 2 workstations. SSS-PC is the successor to the general-purpose scalable operating system SSS-CORE.
Like UNIX, Linux, and
Windows NT, it
provides a multitasking environment, in which multiple tasks are
executed concurrently on a single machine.
For multiple machines connected by a LAN, it also provides an
environment in which the machines can be operated as if they constituted
a single high-performance parallel processing system, as well as
a distributed processing environment like the Internet or an intranet.
The SSS-PC system requires neither the
specialized memory controller nor the specialized communication
hardware that a dedicated parallel processing system needs.
SSS-PC can execute multiple parallel applications
concurrently in multitasking environments in the same way as on a
single machine.
SSS-PC can migrate tasks among machines while they are executing. Even parallel tasks can migrate to different machines. SSS-PC enables workstations on a LAN to form a cluster in any combination. It is possible to suspend just one machine in a cluster. It is also possible to add a new machine to a cluster on the fly. Even if one of the machines in a cluster suddenly breaks down, the other machines are not affected.
SSS-PC Ver. 1.0 boots from the network. A machine used for SSS-PC can also run other operating systems from a disk or from the network.
The following sections describe important technical features of SSS-PC.
The functional structure of SSS-PC is shown in the following figure.
In conventional operating systems, the primary cause of inefficient parallel
processing is the huge cost of communication and synchronization
among machines.
We developed a high-performance mechanism for user-level communication
and synchronization, named `Memory-Based Communication Facilities
(MBCF)', for use over Ethernet interfaces.
The MBCF protocol is based on remote memory
accesses for data of medium-grain size (from tens to hundreds
of bytes).
It provides memory protection and guaranteed communication.
Compared with TCP/IP, which is a
very popular communication protocol, MBCF achieves richer functionalities
and much lower communication cost.
This is because MBCF cooperates with advanced memory management
mechanisms of recent microprocessors.
Experiments have shown that the communication overhead imposed on a
processor is two orders of magnitude lower with MBCF on SSS-PC
than with TCP/IP on conventional operating systems.
The communication latency is one order of magnitude lower when measured on
Fast Ethernet.
MBCF can be used with other protocols such as TCP/IP on the same
Ethernet network.
SSS-PC provides standard protocols,
like TCP/IP and IPsec, to communicate
with other operating systems.
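As a rough illustration of communication expressed as protected remote memory access, the following Python sketch models a receiving node that checks every incoming write against the segments its tasks have exported. It is not the MBCF implementation or packet format: the names `Segment`, `Node`, `export`, `remote_write`, and `remote_read`, and the (task id, key) addressing, are assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A memory region that a task has exported for remote access (hypothetical)."""
    data: bytearray
    writable: bool = False


@dataclass
class Node:
    """Hypothetical receiving node: maps (task_id, key) to an exported segment."""
    segments: dict = field(default_factory=dict)

    def export(self, task_id, key, size, writable):
        # The owning task decides which region is visible and whether it is writable.
        self.segments[(task_id, key)] = Segment(bytearray(size), writable)

    def remote_write(self, task_id, key, offset, payload):
        """Apply an incoming write request; return True if it was applied."""
        seg = self.segments.get((task_id, key))
        if seg is None or not seg.writable:
            return False  # protection: unknown or read-only target
        if offset < 0 or offset + len(payload) > len(seg.data):
            return False  # protection: access outside the exported region
        seg.data[offset:offset + len(payload)] = payload
        return True

    def remote_read(self, task_id, key, offset, length):
        """Return the requested bytes, or None if the access is not permitted."""
        seg = self.segments.get((task_id, key))
        if seg is None or offset < 0 or offset + length > len(seg.data):
            return None
        return bytes(seg.data[offset:offset + length])
```

A sender would call `remote_write(1, 7, 8, b"hello")` against a node that exported a 64-byte writable segment for task 1 under key 7; a write that runs past the end of the segment, or that names a segment nobody exported, is rejected instead of touching memory.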
SSS-PC features task migration, a function that lets applications move between machines while they are executing. Even parallel applications can migrate between machines. Users can perform maintenance jobs such as machine replacement, hardware component inspection, and dynamic system reconfiguration without stopping running applications.
SSS-PC features a unique scheduling strategy based on a free market mechanism. Applications refer to the resource usage information provided by SSS-PC and select the nodes that best match their needs, on their own responsibility. Applications that monopolize system resources are penalized to prevent anarchy within the system.
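The idea of applications choosing nodes for themselves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`name`, `load`, `free_cpu`, `free_mem_mb`) and the linear penalty formula are hypothetical, since the text does not specify the data SSS-PC publishes or how monopolizing applications are penalized.

```python
def pick_node(nodes, need_cpu, need_mem_mb):
    """Return the name of the least-loaded node that satisfies the request, or None.

    `nodes` is a list of dicts with hypothetical fields:
    name, load, free_cpu, free_mem_mb.
    """
    candidates = [
        n for n in nodes
        if n["free_cpu"] >= need_cpu and n["free_mem_mb"] >= need_mem_mb
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n["load"])["name"]


def penalized_priority(base_priority, resource_share, fair_share):
    """Scale priority down by the amount the application exceeds its fair share.

    Assumed policy for illustration only; shares are fractions between 0 and 1.
    """
    excess = max(0.0, resource_share - fair_share)
    return base_priority * (1.0 - excess)
```

The point of the design is that the selection logic lives in the application, so a program with unusual needs (for example, a large memory footprint but little CPU) can apply its own criteria instead of a fixed system-wide policy.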
Since SSS-PC is a general-purpose system,
the state of the system load and resource usage dynamically changes
according to the number of users or the types of applications.
The changes of such conditions cannot be predicted beforehand.
Even if a program is unfortunately started on a heavily-loaded
node, however, it can progress quickly
by transferring to a lightly-loaded node.
SSS-PC supports such run-time optimization
with a high-performance run-time library and an optimizing compiler.
For a program to make optimization decisions easily, the cost of
gathering information on the system status must be low.
SSS-PC provides an Information Disclosure
Mechanism (IDM) for that purpose.
The kernel area that holds information on the system status (e.g. assignment
and usage of resources) is mapped into user space in read-only
mode so that it can be referenced at low cost.
IDM supplies information about remote nodes, as well as local information,
by exchanging information with the IDMs on other nodes.
A user application refers to IDM for the information and decides
by itself how to distribute loads, when to run, and so forth.
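The read-only mapping can be imitated in Python with `types.MappingProxyType`, which exposes a live, non-copying, read-only view of a dictionary. The sketch below is an analogy rather than the IDM interface: the `Kernel` class, the per-node run-queue length as the only statistic, and the `should_migrate` rule with its threshold are all assumptions for illustration.

```python
from types import MappingProxyType


class Kernel:
    """Stand-in for a kernel that publishes status through a read-only view."""

    def __init__(self):
        self._status = {}  # node name -> run-queue length (hypothetical statistic)
        # User code reads this view directly; no call into the kernel is needed.
        self.user_view = MappingProxyType(self._status)

    def set_load(self, node, run_queue):
        # Called for local updates and for information received from other nodes.
        self._status[node] = run_queue


def should_migrate(view, here, threshold=2):
    """Return a lighter node if it beats the local node by `threshold`, else None."""
    best = min(view, key=lambda node: view[node])
    if view[here] - view[best] >= threshold:
        return best
    return None
```

Because the view is live, an update made by the kernel is visible to the application on its next read, while an attempt to assign through `user_view` raises `TypeError`, loosely mirroring the protection a read-only page mapping gives.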
Linux or UNIX applications can be ported to SSS-PC without modifying their source code by using a compatible C-language library.
SSS-PC provides a user with a virtual
shared memory space spreading over the whole system through
MBCF.
Parallel applications written for a shared memory system, however, cannot
run efficiently with remote memory operations alone.
Efficient execution requires an additional facility
that caches the contents
of remote memory (i.e. keeps copies of frequently used data)
in local memory.
Generally speaking, distributed shared memory with caching
has been limited to expensive parallel machines that implement it in
dedicated hardware.
Some operating systems implement distributed shared
memory inside the kernel, but they are not practical owing
to their large overhead.
SSS-PC proposes a third approach:
an optimizing compiler realizes distributed
shared memory efficiently.
In our method, named `User-level Distributed Shared Memory (UDSM)',
code that emulates a cache mechanism is inserted into the program.
The code is analyzed so that the amount and the frequency of communication
are thoroughly reduced.
The programmer only has to write the program in the usual
shared memory style.
If emulation code for the cache mechanism were inserted into a
user program naively, communication and cache maintenance
would be performed at a granularity as fine as the individual memory accesses
of the processor.
The processing cost of the communication software becomes too large for
such fine-grain communication, even with MBCF, whose software
cost is 100 times smaller than that of conventional protocols.
Thus the optimizing compiler (1) eliminates unnecessary
code for communication and cache maintenance, (2) merges redundant
code, and (3) replaces a series of memory operations on
a contiguous area with a single medium-grain memory operation.
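Step (3) can be illustrated with a small Python function that merges writes to adjacent addresses into one larger write. The compiler described here performs this transformation statically on program code; the function below only shows the effect on a recorded sequence of writes, and the name `coalesce_writes` and the (address, bytes) representation are assumptions for this sketch.

```python
def coalesce_writes(writes):
    """Merge consecutive writes to adjacent addresses into larger operations.

    `writes` is a list of (address, data) pairs in program order, where data
    is a bytes object. Only neighbours in the list are merged, so program
    order is preserved.
    """
    merged = []
    for addr, data in writes:
        if merged and merged[-1][0] + len(merged[-1][1]) == addr:
            # This write starts exactly where the previous one ended.
            prev_addr, prev_data = merged[-1]
            merged[-1] = (prev_addr, prev_data + data)
        else:
            merged.append((addr, data))
    return merged
```

For example, three 2-byte writes at addresses 100, 102, and 104 become a single 6-byte write at address 100, so one message is sent where three would otherwise be needed.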
`Asymmetric Distributed Shared Memory (ADSM)' is a hybrid
of UDSM and conventional OS-supported distributed shared memory.
In ADSM, cache read misses for remote memory are detected by the memory
management mechanism, as in the conventional OS-supported method.
Write operations to shared memory, on the other hand, are handled
by inserted cache emulation code and accelerated by optimizing
communication and cache maintenance.
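The asymmetry between reads and writes can be sketched as follows. In this Python model, a read miss triggers a page fetch (standing in for the fault that the memory management hardware would raise), while `write` stands in for code the compiler inserts around a shared store. Buffering writes and sending them at a `flush` point, the page size, and all names are assumptions for illustration; the text does not describe the actual ADSM consistency protocol.

```python
class AdsmCache:
    """Toy model of a local cache over a remote 'home' memory (a Python list)."""

    def __init__(self, home, page_size=4):
        self.home = home
        self.page_size = page_size
        self.pages = {}    # page number -> local copy of that page
        self.dirty = {}    # page number -> {offset: value} written locally
        self.fetches = 0   # number of page fetches from the home memory

    def read(self, addr):
        page, off = divmod(addr, self.page_size)
        if page not in self.pages:
            # Read miss: in ADSM this is detected by the memory management unit.
            start = page * self.page_size
            copy = self.home[start:start + self.page_size]
            for d_off, value in self.dirty.get(page, {}).items():
                copy[d_off] = value  # keep our own pending writes visible
            self.pages[page] = copy
            self.fetches += 1
        return self.pages[page][off]

    def write(self, addr, value):
        # Stands in for compiler-inserted code; no fault and no fetch is needed.
        page, off = divmod(addr, self.page_size)
        if page in self.pages:
            self.pages[page][off] = value
        self.dirty.setdefault(page, {})[off] = value

    def flush(self):
        """Send buffered writes home, one message per page; return message count."""
        messages = 0
        for page, changes in self.dirty.items():
            start = page * self.page_size
            for off, value in changes.items():
                self.home[start + off] = value
            messages += 1
        self.dirty.clear()
        return messages
```

Two writes to the same page cost one message at `flush` time in this model, which is the kind of saving that optimizing the inserted write code is meant to achieve.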
RCOP (Remote Communication OPtimizer) is an optimizing compiler
that provides the user with the above-mentioned
UDSM/ADSM-style distributed shared memory on
SSS-PC.
RCOP processes parallel programs written in the C programming
language extended with macro libraries for shared memory operations.
It analyzes a shared memory parallel program and translates it into
a C program containing cache maintenance
code for UDSM/ADSM.
The output C program is compiled by an ordinary C compiler (gcc 2.7.2).
The object code is linked with the UDSM/ADSM run-time library to
generate an executable.
The actual communication with MBCF is handled
by the library.
RCOP optimization, supported by the low cost of MBCF, speeds up
many shared memory parallel programs, which has so far been difficult on
a workstation cluster.
On each node in an SSS-PC parallel processing
system, a program named
`SSS-MC (Micro Core)' is always resident to manage and protect
resources.
SSS-MC is what is called a `compact OS kernel'.
Unlike other microkernels, however, it has been developed with
priority given to processing performance rather than to compactness
or interface consistency.
SSS-MC/SSS-PC adopts a `shared memory
view' throughout its functionality.
It provides OS-related functions to users in the form of
memory operations.
Memory operations for OS functions are protected and virtualized
without additional overhead by the processor's memory management mechanisms.
This enables low-cost functions with high generality.
MBCF is one such function.
MBCF makes communication and synchronization look like memory operations.
Since SSS-PC Ver. 1.0 is a general-purpose operating system, it has various functions and features beyond those described above. Some of them are listed below.