5 Object serialisation (Pickling)
 5.1 Result objects
 5.2 Pickling and unpickling
 5.3 Extending the pickling framework

5 Object serialisation (Pickling)

The idea of "object serialisation" is to store nearly arbitrary GAP objects on disk or to transfer them over the network. To this end, one converts them to a platform-independent byte stream that can later be converted back into a copy of the same object in memory, be it in the same GAP process or in another one, possibly on another machine. The main difficulties are the vast number of different types occurring in GAP and the possibly highly self-referential structure of GAP objects.

The IO package contains a framework to implement object serialisation and implementations for most of the basic data types in GAP. The framework is easily extendible to other types and takes complete care of self-references and corresponding problems. It builds upon the buffered I/O functions described in Section 4. We start by describing the user interface.

5.1 Result objects

The following static objects are used to report success or failure of the (un-)pickling operations:

5.1-1 IO_Error
‣ IO_Error ( global variable )

This object is returned if an error occurs.

5.1-2 IO_Nothing
‣ IO_Nothing ( global variable )

This object is returned when there is nothing to return, for example if an unpickler (see IO_Unpickle (5.2-2)) encounters the end of a file.

5.1-3 IO_OK
‣ IO_OK ( global variable )

This object is returned if everything went well and there is no other canonical value to return to indicate this.

The only thing you can do with these special values is to compare them to each other and to other objects.
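As a minimal sketch of such comparisons, assuming the IO package is loaded, a caller can branch on the three values as follows. The helper name ResultToString is hypothetical and not part of the package.

```gap
LoadPackage("io");

# Hypothetical helper: describe the outcome of an (un-)pickling call.
ResultToString := function( res )
    if res = IO_Error then
        return "error";
    elif res = IO_Nothing then
        return "nothing";
    elif res = IO_OK then
        return "ok";
    else
        return "ordinary object";   # any other GAP object
    fi;
end;

s1 := ResultToString( IO_OK );      # one of the special values
s2 := ResultToString( 17 );         # comparing with another object
```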

5.2 Pickling and unpickling

5.2-1 IO_Pickle
‣ IO_Pickle( f, ob ) ( operation )

Returns: IO_OK or IO_Error

The argument f must be an open, writable File object. The object ob can be an arbitrary GAP object. The operation "pickles" or "serialises" the object ob and writes the result into the File object f. If everything is OK, the unique value IO_OK is returned, and otherwise the unique value IO_Error. The resulting byte stream can be read again using the operation IO_Unpickle (5.2-2) and is platform- and architecture-independent. In particular, neither the word size of the system (32 bit or 64 bit) nor its endianness matters.

Note that not all of GAP's object types are supported but it is relatively easy to extend the system. This package supports in particular boolean values, integers, permutations, rational numbers, finite field elements, cyclotomics, strings, polynomials, rational functions, lists, records, compressed vectors and matrices over finite fields (objects are uncompressed in the byte stream but recompressed during unpickling), and straight line programs.

Self-referential objects built from records and lists are handled correctly and are restored completely with the same self-references during unpickling.
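The following sketch illustrates a round trip of a self-referential list. It assumes that the IO package is loaded and that IO_File and IO_Close from the buffered I/O chapter are used to open and close the File object; the file name is hypothetical.

```gap
LoadPackage("io");

# Hypothetical scratch file inside a fresh temporary directory.
fname := Filename( DirectoryTemporary(), "selfref.pickle" );

# A list whose third entry is the list itself.
l := [ 1, (1,2,3) ];
l[3] := l;

f := IO_File( fname, "w" );     # open for writing
res := IO_Pickle( f, l );       # IO_OK on success, IO_Error otherwise
IO_Close( f );

f := IO_File( fname, "r" );     # open for reading
m := IO_Unpickle( f );
IO_Close( f );

# m is a copy of l, not l itself; the self-reference should point
# into the copy.
selfref_restored := IsIdenticalObj( m[3], m );
is_copy := not IsIdenticalObj( m, l );
```

The check on m[3] rather than a plain equality test is deliberate: comparing two self-referential lists with = may not terminate.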

5.2-2 IO_Unpickle
‣ IO_Unpickle( f ) ( operation )

Returns: IO_Error, IO_Nothing, or a GAP object

The argument f must be an open, readable File object. The operation reads from f and "unpickles" the next object. If an error occurs, the unique value IO_Error is returned. If the File object is at end of file, the value IO_Nothing is returned. Note that these two values cannot be pickled themselves, because they have a special meaning as return values of this operation.
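Since several objects can be pickled into the same File object one after the other, a reader typically calls IO_Unpickle in a loop until it sees IO_Nothing or IO_Error. The following is a sketch under the assumption that the file was written by IO_Pickle as shown; the file name and variable names are hypothetical.

```gap
LoadPackage("io");

fname := Filename( DirectoryTemporary(), "objects.pickle" );

# Write three objects one after the other.
f := IO_File( fname, "w" );
IO_Pickle( f, 42 );
IO_Pickle( f, "hello" );
IO_Pickle( f, rec( a := [ 1, 2 ] ) );
IO_Close( f );

# Read objects back until end of file or an error.
f := IO_File( fname, "r" );
objs := [];
repeat
    ob := IO_Unpickle( f );
    if ob = IO_Error then
        Print( "unpickling failed\n" );
    elif ob <> IO_Nothing then
        Add( objs, ob );
    fi;
until ob = IO_Nothing or ob = IO_Error;
IO_Close( f );
```

After the loop, objs holds the objects in the order in which they were written, and the last value of ob tells whether reading stopped at end of file or because of an error.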

5.2-3 IO_ClearPickleCache
‣ IO_ClearPickleCache( ) ( function )

Returns: Nothing

This function clears the "pickle cache". This cache stores all objects pickled in the current recursive call to IO_Pickle (5.2-1) and is necessary to handle self-references. Usually it is not necessary to call this function explicitly. Only in the rare case (which should not happen) that a pickling or unpickling operation enters a break loop and the user leaves it does the pickle cache have to be cleared explicitly using this function, so that later calls to IO_Pickle (5.2-1) and IO_Unpickle (5.2-2) work!

5.3 Extending the pickling framework

The framework can be extended for other GAP object types as follows:

For pickling, a method for the operation IO_Pickle (5.2-1) has to be installed which does the work. If the object to be pickled has subobjects, then the first action of the method is to call the function IO_AddToPickled with the object as argument. This will put it into the pickle cache and take care of self-references. Arbitrary subobjects can then be pickled using recursive calls to the operation IO_Pickle (5.2-1) handing down the same File object into the recursion. The method must either return IO_Error in case of an error or IO_OK if everything goes well. Before returning, a method that has called IO_AddToPickled must call the function IO_FinalizePickled without arguments under all circumstances. If this call is missing, global data for the pickling procedure becomes corrupt!

Every pickling method must first write a 4 byte magic value, so that later during unpickling of the byte stream the right unpickling method can be called (see below). Then it can write arbitrary data; however, this data should be platform- and architecture-independent, and it must be possible to unpickle it later without "lookahead".

Pickling methods should usually not go into a break loop, because after leaving it the user has to call IO_ClearPickleCache (5.2-3) explicitly!

Unpickling is implemented as follows: For every 4 byte magic value there must be a function bound to that value in the record IO_Unpicklers. If the unpickling operation IO_Unpickle (5.2-2) encounters that magic value, it calls the corresponding unpickling function. This function just gets one File object as argument. Since the magic value is already read, it can immediately start with reading and rebuilding the serialised object in memory. The method has to take care to restore the object including its type completely.

If an object type has subobjects, the unpickling function has to first create a skeleton of the object without its subobjects, then call IO_AddToUnpickled on this skeleton, before unpickling subobjects. If things are not done in this order, the handling of self-references down in the recursion will not work! An unpickling function that has called IO_AddToUnpickled at the beginning has to call IO_FinalizeUnpickled without arguments before returning under all circumstances! If this call is missing, global data for the unpickling procedure becomes corrupt!

Of course, unpickling functions can recursively call IO_Unpickle (5.2-2) to unpickle subobjects. Apart from this, unpickling functions can use arbitrary reading functions on the File object. However, they should only read sequentially and never move the current file position in any other way. An unpickling function should return the newly created object, or the value IO_Error if an error occurred. It should never go into a break loop, because after leaving it the user has to call IO_ClearPickleCache (5.2-3) explicitly!
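The steps above can be sketched for a small example type as follows. Everything named MyBox, as well as the magic value "MBOX", is hypothetical. The sketch also relies on two details that are not stated in this chapter and are assumptions taken from the package's own methods in pickle.gi: that IO_AddToPickled returns false for an object not yet in the cache and a reference number otherwise, and that such a back-reference is written as the magic value "SREF" followed by the number via IO_WriteSmallInt. Please check both against the pickle.gi of your installed version before relying on them.

```gap
LoadPackage("io");

# Hypothetical example type: a "box" object wrapping one subobject.
DeclareCategory( "IsMyBox", IsObject );
DeclareRepresentation( "IsMyBoxRep", IsMyBox and IsComponentObjectRep,
                       [ "content" ] );
MyBoxType := NewType( NewFamily( "MyBoxFamily" ), IsMyBoxRep );
MyBox := content -> Objectify( MyBoxType, rec( content := content ) );

InstallMethod( IO_Pickle, "for a file and a MyBox (hypothetical type)",
  [ IsFile, IsMyBoxRep ],
  function( f, box )
    local nr, res;
    nr := IO_AddToPickled( box );     # first action: register in the cache
    if nr <> false then
        # Assumption: box was pickled before, so write a back-reference.
        if IO_Write( f, "SREF" ) = fail
           or IO_WriteSmallInt( f, nr ) = IO_Error then
            res := IO_Error;
        else
            res := IO_OK;
        fi;
    elif IO_Write( f, "MBOX" ) = fail then    # 4 byte magic value
        res := IO_Error;
    else
        res := IO_Pickle( f, box!.content );  # recurse with the same File
    fi;
    IO_FinalizePickled();             # reached on every path
    return res;
  end );

IO_Unpicklers.MBOX := function( f )
    local box, content;
    box := Objectify( MyBoxType, rec() );  # skeleton without subobjects
    IO_AddToUnpickled( box );              # register before recursing
    content := IO_Unpickle( f );           # the magic value is already read
    IO_FinalizeUnpickled();                # reached on every path
    if content = IO_Error or content = IO_Nothing then
        return IO_Error;                   # failure or truncated stream
    fi;
    box!.content := content;
    return box;
end;

# Round trip through a hypothetical scratch file.
fname := Filename( DirectoryTemporary(), "box.pickle" );
f := IO_File( fname, "w" );
res := IO_Pickle( f, MyBox( [ 1, 2, 3 ] ) );
IO_Close( f );
f := IO_File( fname, "r" );
b := IO_Unpickle( f );
IO_Close( f );
```

Both functions funnel every outcome through a single exit, so that IO_FinalizePickled and IO_FinalizeUnpickled are called exactly once, whether or not an error occurred.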

Perhaps the best way to learn how to extend the framework is to study the code for the basic GAP objects in the file pkg/io/gap/pickle.gi.
