> < ^ From:

> < ^ Subject:

In his e-mail message of 1993/02/01 Ralf Dentzer writes:

And as I am just at it, do the authors of "combinat.tst" know

what is mainly measured by this benchmark? From a gprof it seems

that most of the time is spent in the GAP Interpreter (>50 %)

and only very little for doing arithmetic and memory management (<5 % each).

Does this reflect time usage of typical GAP routines? What about

permutations, AG groups, finite fields, cyclotomic fields?

It is quite difficult to say what a typical use of GAP is. For example,

if one works with permutation groups of degree over 100, one sees an

entire different picture. On the other hand, computations with character

tables that involve only very few irrationalities might actually show a

similar distribution (I never tried this though).

And the answer is, yes, we are aware of the specific distribution of

"combinat.tst". But we have found that the performance reported by

"combinat.tst" is reasonably close to our ``feeling'' of how fast the

machines we have tried it on are.

Actually one can even be a bit more specific. Here is the output of

'pixie' (the most accurate profiling tool, because it actually counts

cycles, unfortunately only available on machines with the MIPS processor)

for GAP 3.1 running "combinat.tst". The first column contains the number

of cycles, the second the percentage of the total number of cycles, the

third the number of calls, and the fourth the name of the function. I

have grouped the functions according to the package they belong to, and

have ommitted functions whose percentage was below 0.1%.

Interpreter (eval 22.08%, function 19.83%, statemen 14.59%)

33595913 7.65 2584301 EvVar (eval.c) 9776625 2.23 196075 Eq (eval.c) 9435970 2.15 190798 Diff (eval.c) 9324288 2.12 181912 Sum (eval.c) 8133406 1.85 71330 FunShallowCopy (eval.c) 6900221 1.57 188588 EvVarAss (eval.c) 6238983 1.42 143309 EvAnd (eval.c) 5070936 1.15 101630 EvOr (eval.c) 2867693 0.65 39239 Prod (eval.c) 2094182 0.48 38040 Ne (eval.c) 1875496 0.43 33166 Le (eval.c) 435759 0.10 6307 Mod (eval.c) 55270007 12.59 250223 EvFunccall (function.c) 28437062 6.48 183380 ChangeEnv (function.c) 3205685 0.73 91591 EvReturn (function.c) 38293896 8.72 266957 EvIf (statemen.c) 17140123 3.90 53408 EvFor (statemen.c) 8393290 1.91 89811 EvStatseq (statemen.c)

Memory Management (gasman 21.51%)

33964602 7.74 308242 NewBag (gasman.c) 32295304 7.36 10 CollectGarb (gasman.c) 11116554 2.53 362625 ExitKernel (gasman.c) 8523620 1.94 57558 Resize (gasman.c) 4714125 1.07 362625 EnterKernel (gasman.c) 2938674 0.67 186597 NrHandles (gasman.c) 873862 0.20 1 InitGasman (gasman.c)

Lists (list 16.99%, vector 0.02%)

31478498 7.17 379801 EvListElm (list.c) 19866656 4.52 162053 EvListAss (list.c) 13102065 2.98 60932 FunAppend (list.c) 5375002 1.22 68101 MakeList (list.c) 1429617 0.33 68077 EvMakeList (list.c) 1249837 0.28 12062 FunAdd (list.c) 946479 0.22 4636 PosList (list.c)

Arithmetic (integer 1.21%, rational 0.17%)

1469833 0.33 24474 QuoInt (integer.c) 1396376 0.32 27243 ProdInt (integer.c) 1194533 0.27 16211 GcdInt (integer.c) 524400 0.12 4930 SumInt (integer.c)423835 0.10 4678 QuoRat (rational.c)

Reading (scanner 1.38%, read 0.54%, indents 0.2%, system 0.54%, libc 0.87%)

2086621 0.48 19636 GetSymbol (scanner.c) 1383089 0.31 7972 GetIdent (scanner.c) 924253 0.21 24385 PutChr (scanner.c) 745285 0.17 5104 Pr (scanner.c) 881700 0.20 5460 FindIdent (idents.c) 1133070 0.26 226614 SyIsIntr (system.c) 855227 0.19 4319 SyFgets (system.c) 2982948 0.68 4317 fgets (../fgets.c)

A total of 56.5% of the running time are spent in the interpreter.

There most of the time is spent evaluating function calls ('EvFunccall'

and 'ChangeEnv'). This is because most of the functions in the

combinatorics package work in a highly recursive manner, and thus the

number of function calls is unusually high.

(Looking at the number of calls to 'ChangeEnv' (which is called twice by

'EvFunccall' for every non-internal GAP function), one sees that 91690

calls are calls to GAP functions, and the remaining 158533 are calls to

internal functions: 71330 'ShallowCopy', 60932 'Append', 12062 'Add',

5439 'Length', 4004 'Position', 2071 'IsList', 1976 'IsFunc', and 719

others. Comparing the numbers for 'ChangeEnv' and 'EvReturn' we see that

there were 99 calls to functions which did not return anything, most

likely they are all calls to 'Sort'. Finally note that both 'EvIf' and

'EvFor' have 'EvStatseq' inlined, i.e., they do not call 'EvStatseq' to

evaluate their bodies. So the number of calls of 'EvStatseq' reflects

the number of calls to GAP functions that have more than a single

statement in their body. So comparing the numbers for 'ChangeEnv' and

'EvStatseq' we see that about 1880 calls where to functions with only a

single statement in their body, e.g., functions of the form '<var> ->

<expr>'.)

'EvVar' contributes a large percentage of the total running time, simply

because it is call so often. Improving the speed of 'EvVar' would

therefor speed up GAP quit a bit. Unfortunately 'EvVar' is already as

minimal as it can be.

All in all the amount of time spent in the interpreter is much higher

than one would expect from other computations. Especially the large

number of function calls is not typical.

The memory management uses 21.5% of the total running time. 288655 of

the 308242 created bags ('NewBag') are easily accounted for: 91690 from

'EvFunccall', 71330 from 'ShallowCopy', 68077 from 'MakeList', and 57558

from 'Resize' calls. Note that the number of calls to 'Resize' is again

unusually high.

(We are thinking about a new memory manager (Ralf has actually provided a

prototype for such a memory manager). This new memory manager will make

'NewBag' a lot faster (probably a factor of 5 or so). 'NewBag' will also

become a macro, so that it will be more difficult to see its influence in

future profiles. 'CollectGarb' will become faster (because it will use

generations), 'ExitKernel' and 'EnterKernel' will no longer be

neccessary, and the number of calls to 'NewBag' will also be reduced

(because it is called most of the time from 'CollectGarb'). So I think

that the cost for memory management for this test could be reduced to

well below 10% with this new memory manager.)

Other computations will usually spent less time in the memory management,

unless most of the objects allocated are rather small, such as in this

computation. And of course, if the workspace is too small, so that too

many garbage collections are neccessary, the percentage of time spent in

'CollectGarb' can become arbitrarily high (I remember a Schreier Sims

computation which I once did on my Atari, where a garbage collection was

neccessary each time after allocating 20 permutations).

Now "combinat.tst" mainly works with lists (of integers), thus it is not

surprising that 17% of the total time is spent in the list package.

Especially the number of assignments to list elements is way above the

norm.

Again all in all one would expect other computations to spent less time

in the list package. But since lists are the main data structure in GAP

almost all computations spent quite a bit of their time in this package.

The amount of time spent in the arithmetic is small. This is because

arithmetic of immediate integers (i.e., -2^28 <= n < 2^28) is done in

'Eq', 'Sum', 'Diff', etc., without calling other functions. What you see

in 'SumInt', etc. are only the few computations that involve large

integers (e.g., Binomial( 400, 50 )).

Other computations that involve a lot of large integers, rationals,

cyclotomics, permutations, words, etc., will probably spent most of their

time in the respective arithmetic package.

And of course the time spent reading is very small. If you only do a

small computation with a group (where the whole library must first be

read in), this would contribute a much higher amount.

Martin.

-- .- .-. - .. -. .-.. --- ...- . ... .- -. -. .. -.- .- Martin Sch"onert, Martin.Schoenert@Math.RWTH-Aachen.DE, +49 241 804551 Lehrstuhl D f"ur Mathematik, Templergraben 64, RWTH, D 51 Aachen, Germany

> < [top]