This chapter describes all the data structures used in this package. We start with a section on finite fields and what is already there in the **GAP** kernel and library. Then we describe compressed vectors and compressed matrices.

Throughout the package, elements in the field GF(p) of p elements are represented by numbers 0,1,...,p-1 and arithmetic is just the standard arithmetic in the integers modulo p.

Bigger finite fields are done by calculating in the polynomial ring GF(p)[x] in one indeterminate x modulo a certain irreducible polynomial. By convention, we use the so-called "Conway polynomials" (see http://www.math.rwth-aachen.de:8001/~Frank.Luebeck/data/ConwayPol/index.html) for this purpose, because they provide a standard way of embedding finite fields into their extension fields. Because Conway polynomials are monic, we can store coset representatives by storing polynomials of degree less than d, where d is the degree of the finite field over its prime field.

As of this writing, **GAP** has two implementation of finite field elements built into its kernel and library, the first of which stores finite field elements by storing the discrete logarithm with respect to a primitive root of the field. Although this is nice and efficient for small finite fields, it becomes unhandy for larger finite fields, because one needs a lookup table of length p^d for doing efficient addition. This implementation thus is limited to fields with less than or equal to 65536 elements. The other implementation using polynomial arithmetic modulo the Conway polynomial is used for fields with more than 65536 elements. For prime fields of characteristic p with more than that elements, there is an implementation using integers modulo p.

For this section, we assume that the base field is GF(p^d), the finite field with p^d elements, where p is a prime and d is a positive integer. This is realized as a field extension of degree d over the prime field GF(p) using the Conway polynomial c ∈ GF(p)[x]. We always represent field elements of GF(p^d) by polynomials a = \sum_{{i=0}}^{{d-1}} a_i x^i where the coefficients a_i are in GF(p). Because c is monic, this gives a one-to-one correspondence between polynomials and finite field elements.

The memory layout for compressed vectors is determined by two important constants, depending only on p and the word length of the machine. The word length is 4 bytes on 32bit machines (for example on the i386 architecture) and 8 bytes on 64bit machines (for example on the AMD64 architecture). More concretely, a "`Word`

" is an `unsigned long int`

in C and the length of a `Word`

is `sizeof(unsigned long int)`

.

Those constants are `bitsperel`

(bits per prime field element) and `elsperword`

(prime field elements per `Word`

). `bitsperel`

is 1 for p=2 and otherwise the smallest integer, such that 2^bitsperel > 2⋅ p-1. This means, that we can store the numbers from 0 to 2⋅ p - 1 in `bitsperel`

bits. `elsperword`

is 32/bitsperel rounded down to the next integer and multiplied by 2 if the length of a `Word`

is 8 bytes. Note that we thus store as many prime field elements as possible into one `Word`

, however, on 64bit machines we store only twice as much as would fit into 32bit, even if we could pack one more into a `Word`

. This has technical reasons to make conversion between different architectures more efficient.

These definitions imply that we can put `elsperword`

prime field elements into one `Word`

. We do this by using the `bitsperel`

least significant bits in the `Word`

for the first prime field element, using the next `bitsperel`

bits for the next prime field element and so on. Here is an example that shows how the 6 finite field elements 0,1,2,3,4,5 of GF(11) are stored in that order in one 32bit `Word`

. `bitsperel`

is here 4, because 2^4 < 2⋅ 11 - 1 = 21 < 2^5. Therefore `elsperword`

is 5 on a 32bit machine.

most significant xx|.....|.....|.....|.....|.....|..... least significant 00|00101|00100|00011|00010|00001|00000 | 5| 4| 3| 2| 1| 0

Here is another example that shows how the 20 finite field elements 0,1,2,0,0,0,1,1,1,2,2,2,0,1,2,2,1,0,2,2 of GF(3) are stored in that order in one 64bit `Word`

. `bitsperel`

is here 3, because 2^2 < 2⋅ 3 - 1 = 5 < 2^3. Therefore `elsperword`

is 20 on a 32bit machine. Remember, that we only put twice as many elements in one 64bit `Word`

than we could in one 32bit `Word`

!

xxxx..!..!..!..!..!..!..!..!..!..!..!..!..!..!..!..!..!..!..!..! 0000010010000001010010001000010010010001001001000000000010001000 2 2 0 1 2 2 1 0 2 2 2 1 1 1 0 0 0 2 1 0

(The exclamation marks mark the right hand side of the prime field elements.)

Note that different architectures store their `Word`

s in a different byte order in memory ("endianess"). We do *not* specify how the data is distributed into bytes here! All access is via `Word`

s and their arithmetic (shifting, addition, multiplication, etc.). See Section 3.4 for a discussion of this with respect to our external representation.

Now that we have seen how prime field elements are packed into `Word`

s, we proceed to the description how compressed vectors are stored over arbitrary extension fields:

Assume a compressed vector has length l with l ≥ 0. If d=1 (prime field), it just uses elsperword/l `Word`

s (division rounded up to the next integer), where the first `Word`

stores the leftmost `elsperword`

field elements in the first `Word`

as described above and so on. This means, that the very first field element is stored in the least significant bits of the first `Word`

.

In the extension field case GF(p^d), a vector of length l uses (elsperword/l)*d `Word`

s (division rounded up to the next integer), where the first d `Word`

s store the leftmost `elsperword`

field elements. The very first word stores all the constant coefficients of the polynomials representing the first `elsperword`

field elements in their order from left to right, the second `Word`

stores the coefficients of x^1 and so on until the d'th `Word`

, which stores the coefficients of x^{d-1}. Unused entries behind the end of the actual vector data within the last `Word`

has to be zero!.

The following example shows, how the 9 field elements x^2+x+1, x^2+2x+2, x^2+3x+3, x^2+4x+4, 2x^2+x, 2x^2+3x+1, 2x^2+4x+2, 3x^2+1, and 4x^2+x+3 of GF(5^3) occurring in a vector of length 9 in that order are stored on a 32bit machine:

^^^ lower memory addresses ^^^ ....|....|....|....|....|....|....|.... (least significant bit) 1st Word: 0001|0010|0001|0000|0100|0011|0010|0001| (first 2nd Word: 0000|0100|0011|0001|0100|0011|0010|0001| 8 field 3rd Word: 0011|0010|0010|0010|0001|0001|0001|0001| elements) --------------------------------------------------- 4th Word: 0000|0000|0000|0000|0000|0000|0000|0011| (second 5th Word: 0000|0000|0000|0000|0000|0000|0000|0001| 8 field 6th Word: 0000|0000|0000|0000|0000|0000|0000|0100| elements) VVV higher memory addresses VVV

A "`cvec`

" (one of our compressed vectors) is a **GAP** "Data object" (that is with `TNUM`

equal to `T_DATOBJ`

). The first machine word in its bag is a pointer to its type, from the second machine word on the `Word`

s containing the above data are stored. The bag is exactly long enough to hold the type pointer and the data `Word`

s.

But how does the system know, over which field the vector is and how long it is? The type of a **GAP** object consists of 3 pieces: A family, a bit list (for the filters), and one **GAP** object for "defining data". The additional information about the vector is stored in the third piece, the defining data, and is called a "`cvecclass`

".

A `cvecclass`

is a positional object with 5 components:

Position | Name | Content |

1 | `IDX_fieldinfo` |
a `fieldinfo` object, see below |

2 | `IDX_len` |
the length of the vector as immediate GAP integer |

3 | `IDX_wordlen` |
the number of `Word` s used as immediate GAP integer |

4 | `IDX_type` |
a GAP type used to create new vectors |

5 | `IDX_GF` |
a GAP object for the base field |

6 | `IDX_lens` |
a list holding lengths of vectors in `cvecclasses` for the same field |

7 | `IDX_classes` |
a list holding `cvecclass` es for the same field with lengths as in entry number 6 |

In position 5 we have just the **GAP** finite field object `GF(p,d)`

. The names appear as symbols in the code.

The field is described by the **GAP** object in position 1, a so-called `fieldinfo`

object, which is described in the following table:

Position |
Name | Content |

1 | `IDX_p` |
p as an immediate GAP integer |

2 | `IDX_d` |
d as an immediate GAP integer |

3 | `IDX_q` |
q=p^d as a GAP integer |

4 | `IDX_conway` |
a GAP string containing the coefficients of the Conway Polynomial as unsigned int [] |

5 | `IDX_bitsperel` |
number of bits per element of the prime field (`bitsperel` ) |

6 | `IDX_elsperword` |
prime field elements per `Word` (`elsperword` ) |

7 | `IDX_wordinfo` |
a GAP string containing C data for internal use |

8 | `IDX_bestgrease` |
the best grease level (see Section 5.8) |

9 | `IDX_greasetabl` |
the length of a grease table using best grease |

10 | `IDX_filts` |
a filter list for the creation of new vectors over this field |

11 | `IDX_tab1` |
a table for GF(q) to `[0..q-1]` conversion if q ≤ 65536 |

12 | `IDX_tab2` |
a table for `[0..q-1]` to GF(q) conversion if q ≤ 65536 |

13 | `IDX_size` |
0 for q ≤ 65536, otherwise 1 if q is a GAP immediate integer and 2 if not |

14 | `IDX_scafam` |
the scalars family |

Note that **GAP** has a family for all finite field elements of a given characteristic p, vectors over a finite field are then in the collections family of that family and matrices are in the collections family of the collections family of the scalars family. We use the same families in the same way for compressed vectors and matrices.

The following limits come from the above mentioned data format or other internal restrictions (an "immediate integer" in **GAP** can take values between 2^{-28} and (2^{28})-1 inclusively on 32bit machines and between 2^{-60} and (2^{60})-1 on 64bit machines):

The prime p must be an immediate integer.

The degree d must be smaller than 1024 (this limit is arbitrarily chosen at the moment and could be increased easily).

The Conway polynomial must be known to

**GAP**.The length of a vector must be an immediate integer.

The implementation of `cmats`

(compressed matrices) is done mainly on the **GAP** level without using too many C parts. Only the time critical parts for some operations for matrices are done in the kernel.

A `cmat`

object is a positional object with at least the following components:

Component name | Content |

`len` |
the number of rows, can be 0 |

`vecclass` |
a `cvecclass` object describing the class of rows |

`scaclass` |
a `cscaclass` object holding a reference to `GF(p,d)` |

`rows` |
a list containing the rows of the matrix, which are `cvec` s |

`greasehint` |
the recommended greasing level |

The `cvecclass`

in the component `vecclass`

determines the number of columns of the matrix by the length of the rows.

The length of the list in component `rows`

is `len+1`

, because the first position is equal to the integer 0 such that the type of the list `rows`

can always be computed efficiently. The rows are then stored in positions 2 to `len+1`

.

The component `greasehint`

contains the greasing level for the next matrix multiplication, in which this matrix occurs as the factor on the right hand side (if greasing is worth the effort, see Section 5.8).

A `cmat`

can be "pre-greased", which just means, that a certain number of linear combinations of its rows is already precomputed (see Section 5.8). In that case, the object is in the additional filter `HasGreaseTab`

and the following components are bound additionally:

Component name | Content |

`greaselev` |
the grease level |

`greasetab` |
the grease table, a list of lists of `cvecs` |

`greaseblo` |
the number of greasing blocks |

`spreadtab` |
a lookup table for indexing the right linear combination |

This section describes the external representation of matrices. A special data format is needed here, because of differences between computer architectures with respect to word length (32bit/64bit) and endianess. The term "endianess" refers to the fact, that different architectures store their long words in a different way in memory, namely they order the bytes that together make up a long word in different orders.

The external representation is the same as the internal format of a 32bit machine with little endianess, which means, that the lower significant bytes of a `Word`

are stored in lower addresses. The reasons for this decision are firstly that 64bit machines can do the bit shifting to convert between internal and external representation easier using their wide registers, and secondly, that the nowadays most popular architectures i386 and AMD64 use both little endianess, such that conversion is only necessary on a minority of machines.

`cmat`

fileThe header of a `cmat`

file consists of 5 words of 64bit each, that are stored in little endian byte order:

the magic value "`GAPCMat1` " as ASCII letters (8 bytes) in this order |

the value of p to describe the base field |

the value of d to describe the base field |

the number of rows of the matrix |

the number of columns of the matrix |

After these 40 bytes follow the data words as described above using little endian 32bit `Word`

s as in the internal representation on 32bit machines.

Note that the decision to put not more than twice as many prime field elements into a 64bit `Word`

than would fit into a 32bit `Word`

makes the conversion between internal and external representation much easier to implement.

generated by GAPDoc2HTML