TOP-C stands for Task-Oriented Parallel C Coo96. The ``TOP-C
model'' is the specific master slave model implemented here. That model
has been adapted for use in ParGAP. The implementation is in
masslave.g in ParGAP's
lib directory. Note that the functions and
variables with names
TOPC... are intended as internal functions only,
and should not be used by the GAP programmer.
For the impatient, you may type
MSexample(); in a ParGAP session
now. If you prefer further hands-on learning in a tutorial style, you may
wish to next read Chapter MasterSlave Tutorial. Eventually, if you wish
a deeper understanding of the TOP-C model, you will need to read this
current section and those that follow.
The initial GAP process is the master process, and all others are slave processes. It allows most of the CPU-intensive computations to be carried out on slave processes, which typically reside on remote processors. A well-developed TOP-C application should find that the master process is almost never busy when a slave process is idle, waiting for a new computation to carry out. This provides a natural way of maximizing utilization and load balancing.
The TOP-C model depends on three concepts:
The task input is defined to be the argument of the task (considered as a function), and the task output is the return value of the task.
There is only one core TOP-C command, a utility function, and several constants. A TOP-C command must be evaluated on the master and on all slaves. We shall describe the commands in detail in the following sections, but a short list of the essentials and a small example will be helpful to set the context.
]]] ) F
See Section Other TOP-C Commands for a description of
CONTINUATION_ACTION() is described in Section The GOTO statement of the TOP-C model.
A short example shows one possible implementation of
gap> ParInstallTOPCGlobalFunction( "MyParListWithAgglom", > function( list, fnc ) > local result, i; > result := ; i := 0; > MasterSlave( function() if i >= Length(list) then return NOTASK; > else i := i+1; return i; fi; end, > fnc, > function(input,output) result[input] := output; > return NO_ACTION; end, > Error > ); > return result; > end );
(Of course rather than type such code in a ParGAP session it's
generally more convenient to have it in a file and
Read it in.)
A master-slave computation is invoked when a GAP program issues the
MasterSlave(). As given earlier, the typical form is:
where the first four arguments of
MasterSlave() are also functions, but
they must be defined by the application writer. Their calling syntax is
defined by the following GAP code, which also provides a simplified
description of how a sequential (non-parallel)
invoke these functions if there were only a single process. (A more
sophisticated version of this routine is provided in ParGAP to allow
one to debug within a single process first.) The use of the fifth
argument, taskAgglom, is deferred until section Agglomerating tasks for efficiency (ParSemiEchelonMat revisited again).
In this section, we define
MasterSlave() and describe the use of its
four arguments in a purely sequential environment. The issues of
parallelism and passing of messages between processes is covered in the
next section. The call to
MasterSlave() in ParGAP, above, will have
the same result as if
MasterSlave() were defined equivalently to
SeqMasterSlave() below, and then run in a standard, sequential GAP
(a single process). The next section describes the multi-process
MasterSlave() in ParGAP, in which
computed on the master process and sent as a message to a slave process,
taskOutput is computed on a slave process and sent as a message
to the master process.
SeqMasterSlave := function(SubmitTaskInput, DoTask, CheckTaskResult, UpdateSharedData) local taskInput, taskOutput, action; while true do taskInput := SubmitTaskInput(); if taskInput = NOTASK then break; fi; repeat taskOutput := DoTask( taskInput ); action := CheckTaskResult( taskOutput, taskInput ); until action <> REDO_ACTION; if action = UPDATE_ACTION then # Modify the shared data (global data structures) here # Called on all processes, master and slaves UpdateSharedData( taskOutput, taskInput ); fi; od; end;
One can also follow the life of a single task in a multi-processing environment through the diagram below.
MASTER | SLAVE _________________________________________________________________________ | +---------------------+ | | GenerateTaskInput() | | +---------------------+ | \input | \______________ | | | v | +---------------+ | | DoTask(input) | | +---------------+ |output/ ^ _________/ | | | | v | | +--------------------------------+ | | | CheckTaskResult(input, output) |___________| +--------------------------------+ (if action == REDO) | | | (if action == UPDATE) v | +---------------------------------+ | UpdateSharedData(input, output) | +---------------------------------+ | TOP-C Programmers' Model (Life Cycle of a Task)
Although not explicit in the code, the application writer should add
comments to define the shared data. The shared data is defined as a
global data structure that is treated as ``read-write'' by
(), while being treated as ``read-only'' by
(). Note also
that an application writer may use different names for the four functions
(), etc. It is only a convention within this manual to
give those functions the names, above. Similarly, taskInput,
taskOutput and action are the conventional names used in this manual,
and a given application may use different names.
In a correct ParGAP application, the shared data should be initialized
to the same value on all processes before the application calls
MasterSlave() is then called on all processes. After
that, the shared data can be modified only by a call to
MasterSlave() arranges for each call to
() to be executed on all processes. Further,
() has access only to taskInput, taskOutput, and
the previous value of the shared data. Thus,
the same shared data uniformly on all processes.
This section is concerned with formal definitions for the routines
associated with ParGAP. It is important to keep in mind the
pseudo-code of Chapter Basic Concepts for the TOP-C model (MasterSlave). Since
MasterSlave() uses all the ParGAP processes,
the user must invoke it on all processes. This is typically done through
some function provided by the slave listener layer, such as
(see ParEval). It may be instructive for the reader to run ParGAP
MSexample(); now, or else to look at some examples of
ParGAP applications in the section MasterSlave Tutorial. This
demonstrates the use of
MasterSlave() in a typical session.
The four functions written by the application writer are:
() is executed on a slave.
() are executed on the
master, where a taskInput is generated and a corresponding taskOutput
is received. Finally,
() is executed on all
processes. ParGAP arranges to automatically pass taskInput and
taskOutput between the master and a slave.
Since the single master process is responsible for generating all
taskInputs and receiving all taskOutputs, it is critical that
computation on the master process should not become a bottleneck for a
well-designed ParGAP application. Accordingly, the application writer
should arrange for
execute quickly, even if this means additional computation by
As seen in the examples,
() may use global variables on
the master to ``remember'' the last taskInput or other state
information. Note that such global variables cannot be part of the
shared data, since they are modified outside of
It is instructive to review the logic for the lifetime of a task, as
described by the pseudo-code for
SeqMasterSlave in Section Other TOP-C commands. Initially,
() on the
master, which returns an application-defined GAP object, taskInput.
MasterSlave() then copies taskInput to an arbitrary slave process,
MasterSlave() then calls
) on the slave.
This returns an application-defined GAP object, taskOutput, which
MasterSlave() copies to the master process. On the master,
MasterSlave() then calls
), which returns an action. (Recall that taskInput, taskOutput and
() are defined by the application writer, and so an
application program may give them different names.)
There are four possible actions (ParGAP constants):
). A standard language idiom in ParGAP is to define
() as the ParGAP function
DefaultCheckTaskResult(), whose code is as follows:
DefaultCheckTaskResult := function( taskOutput, taskInput ) if taskOutput = false then return NO_ACTION; elif IsUpToDate() then return UPDATE_ACTION; else return REDO_ACTION; fi; end;
In the simplest case,
NO_ACTION, in which
case there is no further computation related to the original taskInput.
() may record global information on the master
process, based on the taskOutput, but the shared data, and hence the
state of the slave processes, will not be modified.
In the second most common case,
UPDATE_ACTION. This action causes
MasterSlave() to call
) on all processes
(master and slaves). This is the only way in which the shared data can
be modified by a correct ParGAP program.
In the third most common case,
REDO_ACTION action is generated, the value of taskInput is
re-sent to the same slave that executed
) for the
current task. An application will typically invoke
REDO_ACTION if the
shared data has changed, and this changed shared data will produce a new
taskOutput. As before,
() then returns a new value of
taskOutput. Then, taskInput and the new taskOutput are again passed
MasterSlave() guarantees that
REDO_ACTION causes the task
to be re-sent to the same slave process. This allows the application to
cache in a global variable some information computed by the first
(). A second invocation of
() caused by
REDO_ACTION allows the task to test if the taskInput is the same
as the last invocation. In that case, the application-defined
() routine can recognize that this is a
REDO_ACTION, and it
can take advantage of the cached global variable to avoid re-computing
certain quantities that would not be changed by the altered shared data.
In order to make this strategy possible,
MasterSlave() also guarantees
that in the case of
REDO_ACTION, the slave process will not have seen
any intervening calls to
() with values of
than the current value.
In typical usage, the application-defined routine,
will first call
IsUpToDate() tests if the shared data
has been modified since the current taskInput corresponding to
() was originally generated by
The times of the relevant events are recorded as when seen on the master
process. It is an error to call
IsUpToDate() outside of a call to
IsUpToDate() returns a
The last possible action,
provided for unusual cases. As with advice about the use of ``goto'', it
is recommended to avoid
CONTINUATION_ACTION() where possible.
A favorite aphorism of this author is, ``The source code is the ultimate
documentation''. With this in mind, the reader may also wish to read
lib/masslave.g, for which readability of the code was one of the design
At this point, it should be noted that it explicitly is allowed to
modify the input or output of a task from within
This is not recommended in general, but there may be times when
() returns an
UPDATE_ACTION and must also be used to
pass additional information to
(). In order to modify
a previous input or output, it is important that the application has
chosen a representation of the input or output as a list or record, which
can be modified in place, such that the code excerpt succeeds without
oldOutput := taskOutput; # Modify taskOutput here if ( IsIdenticalObj( oldOutput, taskOutput ) ) = false then Error( "MasterSlave() will see only oldOutput, not current taskOutput" ); fi; return UPDATE_ACTION;
In principle, a dirty trick like this would also work in the case of
REDO_ACTION. However, this is not recommended. For that
functionality, the code will be clearer if an explicit
) is returned. See
Section The GOTO statement of the TOP-C model for further discussion on
the use of
CONTINUATION_ACTION(), like the goto statement, is not
recommended for ordinary programs, but it may be useful in unusual
circumstances. This is a parametrized action. When the application
() returns this action,
guarantees to invoke
() on the same slave process as for the
original task. There will have been no intervening calls to
on that slave, although there may have been an intervening call to
() on that slave.
This action allows arbitrary, repeated communication between the master
and a single slave process. The slave process executes
) and communicates with the master by returning a
taskOutput. The master process executes
) and returns a taskContinuation. The
original slave process then receives another call to
), this time with taskInput bound to taskContinuation.
When you are running a long job on a network of workstations, you will often be sharing it with others. Making your parallel job as unintrusive as possible will leave you with a warmer welcome the next time that you want to use that network of workstations. Accordingly, three useful functions are provided.
This is similar to the
nice command of many UNIX shells. UNIX
priorities are in a range from −20 to 20 with −20 being the highest.
Users typically start at priority 0. You can give yourself a lower
priority by specifying a priority of 5, for example. Usually, priorities
19 and 20 are absolute priorities. Any process with a priority higher
than 19 that wishes to run will always have precedence. Other priorities
are relative priorities. Your process will still receive some CPU time
even if other processes with higher priorities are running. You can set
your priority lower, but you cannot raise it back to its original value
after that. The return value is the previous priority of your process.
This causes the process to kill itself after that many seconds. This is a
useful safety measure, since it is unfortunately too easy for a runaway
slave process to continue if the master process is killed without the
quit;. You might consider adding something like
25000 ); (about 6 hours) to your
.gaprc file. Executing
); cancels any previous alarm. The return value is the number of seconds
remaining under the previous setting of the alarm.
) [ = setrlimit(RLIMIT_RSS, ...) ] F
Many dialects of UNIX (and their shells) offer a
command to limit the resources available to the shell. This command
limits the size of the RSS (resident set size), or the amount of physical
RAM used by your process. The size limit is in bytes. Unfortunately, some
UNIX dialects may not allow or even silently ignore this request to limit
the RSS. A UNIX command such as
top can show you if your process RSS is
staying below your requested limit.
The (tutorial contains a section Raw MasterSlave (ParMultMat revisited),
about raw version of
MasterSlave() that is useful for converting
legacy sequential code to the TOP-C model. However, that model is not
recommended for writing new code, for stylistic reasons.
[Up] [Previous] [Next] [Index]