Goto Chapter: Top 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 Bib Ind

People and computers spend a large amount of time with searching. Dictionaries are an abstract data structure which facilitates searching for objects. Depending on the kind of objects the implementation will use a variety of possible internal storage methods that will aim to provide the fastest possible access to objects. These internal methods include

**Hash Tables**for objects for which a hash function has been defined.

**Direct Indexing**if the domain is small and cheaply enumerable

**Sorted Lists**if a total order can be computed easily

**Plain lists**for objects for which nothing but an equality test is available.

The standard way to use dictionaries is to first create a dictionary (using `NewDictionary`

(28.2-1), and then to store objects (and associated information) in it and look them up.

For the creation of objects the user has to make a few choices: Is the dictionary only to be used to check whether objects are known already, or whether associated information is to be stored with the objects. This second case is called a *lookup dictionary* and is selected by the second parameter of `NewDictionary`

(28.2-1).

The second choice is to indicate which kind of objects are to be stored. This choice will decide the internal storage used. This kind of objects is specified by the first parameter to `NewDictionary`

(28.2-1), which is a "sample" object.

In some cases however such a sample object is not specific enough. For example when storing vectors over a finite field, it would not be clear whether all vectors will be over a prime field or over a field extension. Such an issue can be resolved by indicating in an (optional) third parameter to `NewDictionary`

(28.2-1) a *domain* which has to be a collection that will contain all objects to be used with this dictionary. (Such a domain may also be used internally to decide that direct indexing can be used).

The reason for this choice of giving two parameters is that in some cases no suitable collection of objects has been defined in **GAP** - for example for permutations there is no object representing the symmetric group on infinitely many points.

Once a dictionary has been created, it is possible to use `RepresentationsOfObject`

(13.4-1) to check which representation is used by **GAP**.

In the following example, we create a dictionary to store permutations with associated information.

gap> d:=NewDictionary((1,2,3),true);; gap> AddDictionary(d,(1,2),1); gap> AddDictionary(d,(5,6),9); gap> AddDictionary(d,(4,7),2); gap> LookupDictionary(d,(5,6)); 9 gap> LookupDictionary(d,(5,7)); fail

A typical example of this use would be in an orbit algorithm. The dictionary would be used to store the elements known in the orbit together with their respective orbit positions.

We observe that this dictionary is stored internally by a sorted list. On the other hand, if we have an explicit, sorted element list, direct indexing is to be used.

gap> RepresentationsOfObject(d); [ "IsComponentObjectRep", "IsNonAtomicComponentObjectRep", "IsDictionaryDefaultRep", "IsListDictionary", "IsListLookupDictionary", "IsSortDictionary", "IsSortLookupDictionary" ] gap> d:=NewDictionary((1,2,3),true,Elements(SymmetricGroup(5)));; gap> RepresentationsOfObject(d); [ "IsComponentObjectRep", "IsNonAtomicComponentObjectRep", "IsDictionaryDefaultRep", "IsPositionDictionary", "IsPositionLookupDictionary" ]

(Just indicating `SymmetricGroup(5)`

as a third parameter would still keep the first storage method, as indexing would be too expensive if no explicit element list is known.)

The same effect happens in the following example, in which we work with vectors: Indicating only a vector only enables sorted index, as it cannot be known whether all vectors will be defined over the prime field. On the other hand, providing the vector space (and thus limiting the domain) enables the use of hashing (which will be faster).

gap> v:=GF(2)^7;; gap> d:=NewDictionary(Zero(v),true);; gap> RepresentationsOfObject(d); [ "IsComponentObjectRep", "IsNonAtomicComponentObjectRep", "IsDictionaryDefaultRep", "IsListDictionary", "IsListLookupDictionary", "IsSortDictionary", "IsSortLookupDictionary" ] gap> d:=NewDictionary(Zero(v),true,v);; gap> RepresentationsOfObject(d); [ "IsComponentObjectRep", "IsNonAtomicComponentObjectRep", "IsDictionaryDefaultRep", "IsPositionDictionary", "IsPositionLookupDictionary" ]

This section contains the formal declarations for dictionaries. For information on how to use them, please refer to the previous section 28.1. There are several ways how dictionaries are implemented: As lists, as sorted lists, as hash tables or via binary lists. A user however will just have to call `NewDictionary`

(28.2-1) and obtain a "suitable" dictionary for the kind of objects she wants to create. It is possible however to create hash tables (see 28.4) and dictionaries using binary lists (see `DictionaryByPosition`

(28.3-1)).

The use of two objects, `obj` and `objcoll` to parametrize the objects a dictionary is able to store might look confusing. However there are situations where either of them might be needed:

The first situation is that of objects, for which no formal "collection object" has been defined. A typical example here might be subspaces of a vector space. **GAP** does not formally define a "Grassmannian" or anything else to represent the multitude of all subspaces. So it is only possible to give the dictionary a "sample object".

The other situation is that of an object which might represent quite varied domains. The permutation \((1,10^6)\) might be the nontrivial element of a cyclic group of order 2, it might be a representative of \(S_{{10^6}}\). In the first situation the best approach might be just to have two entries for the two possible objects, in the second situation a much more elaborate approach might be needed.

An algorithm that creates a dictionary will usually know a priori, from what domain all the objects will be, giving this domain permits to use a more efficient dictionary.

This is particularly true for vectors. From a single vector one cannot decide whether a calculation will take place over the smallest field containing all its entries or over a larger field.

`‣ NewDictionary` ( obj, look[, objcoll] ) | ( function ) |

creates a new dictionary for objects such as `obj`. If `objcoll` is given the dictionary will be for objects only from this collection, knowing this can improve the performance. If `objcoll` is given, `obj` may be replaced by `false`

, i.e. no sample object is needed.

The function tries to find the right kind of dictionary for the basic dictionary functions to be quick. If `look` is `true`

, the dictionary will be a lookup dictionary, otherwise it is an ordinary dictionary.

As there are situations where the approach via binary lists is explicitly desired, such dictionaries can be created deliberately.

`‣ DictionaryByPosition` ( list, lookup ) | ( function ) |

creates a new (lookup) dictionary which uses `PositionCanonical`

(21.16-3) in `list` for indexing. The dictionary will have an entry `dict``!.blist`

which is a bit list corresponding to `list` indicating the known values. If `look` is `true`

, the dictionary will be a lookup dictionary, otherwise it is an ordinary dictionary.

`‣ IsDictionary` ( obj ) | ( category ) |

A dictionary is a growable collection of objects that permits to add objects (with associated values) and to check whether an object is already known.

`‣ IsLookupDictionary` ( obj ) | ( category ) |

A *lookup dictionary* is a dictionary, which permits not only to check whether an object is contained, but also to retrieve associated values, using the operation `LookupDictionary`

(28.3-6).

`‣ AddDictionary` ( dict, key[, val] ) | ( operation ) |

adds `key` to the dictionary `dict`, storing the associated value `val` in case `dict` is a lookup dictionary. If `key` is not an object of the kind for which the dictionary was specified, or if `key` is known already to `dict`, the results are unpredictable.

`‣ KnowsDictionary` ( dict, key ) | ( operation ) |

checks, whether `key` is known to the dictionary `dict`, and returns `true`

or `false`

accordingly. `key` *must* be an object of the kind for which the dictionary was specified, otherwise the results are unpredictable.

`‣ LookupDictionary` ( dict, key ) | ( operation ) |

looks up `key` in the lookup dictionary `dict` and returns the associated value. If `key` is not known to the dictionary, `fail`

is returned.

These sections describe some particularities for hash tables. These are intended mainly for extending the implementation - programs requiring hash functionality ought to use the dictionary interface described above.

We hash by keys and also store a value. Keys cannot be removed from the table, but the corresponding value can be changed. Fast access to last hash index allows you to efficiently store more than one array of values –this facility should be used with care.

This code works for any kind of object, provided you have a `DenseIntKey`

(28.5-1) method to convert the key into a positive integer. This method should ideally be implemented efficiently in the core.

Note that, for efficiency, it is currently impossible to create a hash table with non-positive integers.

The crucial step of hashing is to transform key objects into integers such that equal objects produce the same integer.

The actual function used will vary very much on the type of objects. However **GAP** provides already key functions for some commonly encountered objects.

`‣ DenseIntKey` ( objcoll, obj ) | ( operation ) |

returns a function that can be used as hash key function for objects such as `obj` in the collection `objcoll`. Typically, `objcoll` will be a large domain. If the domain is not available, it can be given as `false`

in which case the hash key function will be determined only based on `obj`. (For a further discussion of these two arguments see `NewDictionary`

(28.2-1)).

The function returned by `DenseIntKey`

is guaranteed to give different values for different objects. If no suitable hash key function has been predefined, `fail`

is returned.

`‣ SparseIntKey` ( objcoll, obj ) | ( operation ) |

returns a function that can be used as hash key function for objects such as `obj` in the collection `objcoll`. In contrast to `DenseIntKey`

(28.5-1), the function returned may return the same key value for different objects. If no suitable hash key function has been predefined, `fail`

is returned.

Dense hash tables are used for hashing dense sets without collisions, in particular integers. Keys are stored as an unordered list and values as an array with holes. The position of a value is given by the function returned by `DenseIntKey`

(28.5-1), and so `KeyIntDense`

must be one-to-one.

`‣ DenseHashTable` ( ) | ( function ) |

Construct an empty dense hash table. This is the only correct way to construct such a table.

Sparse hash tables are used for hashing sparse sets. Stores keys as an array with fail denoting an empty position, stores values as an array with holes. Uses the result of calling `SparseIntKey`

(28.5-2)) of the key. `DefaultHashLength`

is the default starting hash table length; the table is doubled when it becomes half full.

In sparse hash tables, the integer obtained from the hash key is then transformed to an index position by taking it modulo the length of the hash array.

`‣ SparseHashTable` ( [intkeyfun[, startlength]] ) | ( function ) |

Construct an empty sparse hash table. This is the only correct way to construct such a table. If the argument `intkeyfun` is given, this function will be used to obtain numbers for the keys passed to it. If also `startlength` is given, the hash table will be initalized at that size.

`‣ DoubleHashArraySize` ( hash ) | ( function ) |

Double the size of the hash array and rehash all the entries. This will also happen automatically when the hash array is half full.

Goto Chapter: Top 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 Bib Ind

generated by GAPDoc2HTML