
Starting with version 4.5, **GAP** has built-in support for floating-point numbers in machine format, and allows packages to implement arbitrary-precision floating-point arithmetic in a uniform manner. For now, one such package, **Float**, exists; it is based on the arbitrary-precision routines in **mpfr**.

A word of caution: **GAP** deals primarily with algebraic objects, which can be represented exactly in a computer. Numerical imprecision means that floating-point numbers do not form a ring in the strict **GAP** sense, because addition is in general not associative (`(1.0e-100+1.0)-1.0` is not the same as `1.0e-100+(1.0-1.0)`, in the default precision setting).
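
A minimal sketch of this associativity failure, using only the default machine floats. The variable names are illustrative, and the final comparison is expected to yield `false`, since the first grouping loses the tiny term and the second keeps it.

```gap
# Left grouping: 1.0e-100 is absorbed when added to 1.0, so nothing is left.
left := (1.0e-100 + 1.0) - 1.0;;
# Right grouping: 1.0 - 1.0 is exactly zero, so the tiny term survives.
right := 1.0e-100 + (1.0 - 1.0);;
# Both operands are floats, so "=" is permitted here.
left = right;
```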

Most algorithms in **GAP** which require ring elements will therefore not be applicable to floating-point elements. In some cases, such a notion would not even make any sense (what is the greatest common divisor of two floating-point numbers?)

Floating-point numbers can be input into **GAP** in the standard floating-point notation:

    gap> 3.14;
    3.14
    gap> last^2/6;
    1.64327
    gap> h := 6.62606896e-34;
    6.62607e-34
    gap> pi := 4*Atan(1.0);
    3.14159
    gap> hbar := h/(2*pi);
    1.05457e-34

Floating-point numbers can also be created using `Float`, from strings or rational numbers, and can be converted back using `String`, `Rat`, `Int`.

**GAP** allows rational and floating-point numbers to be mixed in the elementary operations `+`, `-`, `*`, `/`. However, floating-point numbers and rational numbers may not be compared. Conversions are performed using the creator `Float`:

    gap> Float("3.1416");
    3.1416
    gap> Float(355/113);
    3.14159
    gap> Rat(last);
    355/113
    gap> Rat(0.33333);
    1/3
    gap> Int(1.e10);
    10000000000
    gap> Int(1.e20);
    100000000000000000000
    gap> Int(1.e30);
    1000000000000000019884624838656

Floating-point numbers may be directly input, as in any usual mathematical software or language, with the exception that every floating-point number must contain a decimal digit. Therefore `.1`, `.1e1`, `-.999` etc. are all valid **GAP** inputs.

Floating-point numbers so entered in **GAP** are stored as strings. They are converted to floating-point when they are first used. This means that, if the floating-point precision is increased, the constants are reevaluated to fit the new format.

Floating-point numbers may be followed by an underscore, as in `1._`. This means that they are to be immediately converted to the current floating-point format. The underscore may be followed by a single letter, which specifies which format/precision to use. By default, **GAP** has a single floating-point handler, with fixed (53 bits) precision, and its format specifier is `'l'` as in `1._l`. Higher-precision floating-point computations are available via external packages, for example **float**.

A record, `FLOAT` (19.2-5), contains all relevant constants for the current floating-point format; see its documentation for details. Typical fields are `FLOAT.MANT_DIG=53`, the constant `FLOAT.VIEW_DIG=6` specifying the number of digits to view, and `FLOAT.PI` for the constant \(\pi\). The constants have the same name as their C counterparts, except for the missing initial `DBL_` or `M_`.
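
As a small illustration, the fields named above can be read like any other record component. This is a sketch for the default handler; the values in the comments are those quoted in the text, and the tolerance `1.0e-15` is an arbitrary choice.

```gap
FLOAT.MANT_DIG;    # 53 for the default handler, per the text
FLOAT.VIEW_DIG;    # 6, the number of digits shown when viewing
# FLOAT.PI should agree with 4*Atan(1.0) up to rounding error.
AbsoluteValue(FLOAT.PI - 4*Atan(1.0)) < 1.0e-15;
```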

Floating-point numbers may be created using the single function `Float` (19.2-1), which accepts as arguments rational, string, or floating-point numbers. Floating-point numbers may also be created, in any floating-point representation, using `NewFloat` (19.2-1) as in `NewFloat(IsIEEE754FloatRep,355/113)`, by supplying the category filter of the desired new floating-point number; or using `MakeFloat` (19.2-1) as in `MakeFloat(1.0,355/113)`, by supplying a sample floating-point number.
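
The three creators can be compared side by side. The sketch below assumes the default handler is active, so that all three calls produce machine floats; the variable names are illustrative.

```gap
q := 355/113;;
a := Float(q);;                          # current default floating-point format
b := NewFloat(IsIEEE754FloatRep, q);;    # format chosen by a filter
c := MakeFloat(1.0, q);;                 # format copied from the sample 1.0
# Under the stated assumption all three are expected to be the same number.
EqFloat(a, b) and EqFloat(b, c);
```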

Floating-point numbers may also be converted to other **GAP** formats using the usual commands `Int` (14.2-3), `Rat` (17.2-6), `String` (27.7-6).

Exact conversion to and from floating-point format may be done using external representations. The "external representation" of a floating-point number `x` is a pair `[m,e]` of integers, such that `x=m*2^(-1+e-LogInt(AbsInt(m),2))`. Conversion to and from external representation is performed as usual using `ExtRepOfObj` (79.16-1) and `ObjByExtRep` (79.16-1):

    gap> ExtRepOfObj(3.14);
    [ 7070651414971679, 2 ]
    gap> ObjByExtRep(IEEE754FloatsFamily,last);
    3.14
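
The formula for the external representation can be checked by hand. The following sketch recomputes the exact rational value of `3.14` from its pair `[m,e]` and converts it back; the variable names are illustrative, and the final test is expected to return `true`.

```gap
x := 3.14;;
ext := ExtRepOfObj(x);;       # a pair [ m, e ] of integers
m := ext[1];;  e := ext[2];;
# Exact rational value of x, following the formula in the text.
q := m * 2^(-1 + e - LogInt(AbsInt(m), 2));;
# q has a power of 2 as denominator, so converting it back should give x.
EqFloat(Float(q), x);
```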

Computations with floating-point numbers never raise any error. Division by zero is allowed, and produces a signed infinity. Illegal operations, such as `0./0.`, produce `NaN`s (not-a-number); this is the only floating-point number `x` such that `not EqFloat(x+0.0,x)`.

The IEEE754 standard requires `NaN` to be non-equal to itself. On the other hand, **GAP** requires every object to be equal to itself. To respect the IEEE754 standard, the function `EqFloat` (19.2-6) should be used instead of `=`.
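
The difference between the two notions of equality shows up only for `NaN`. A minimal sketch, assuming the default machine floats; the comments describe the behaviour stated in the text rather than verified output.

```gap
nan := 0.0/0.0;;       # an illegal operation, yielding NaN without an error
IsNaN(nan);            # expected: true
nan = nan;             # GAP's "=": an object is equal to itself
EqFloat(nan, nan);     # IEEE754 semantics: NaN differs from itself
EqFloat(1.5, 1.5);     # ordinary numbers compare as usual
```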

The category a floating-point number belongs to can be checked using the filters `IsFinite` (30.4-2), `IsPInfinity` (19.2-9), `IsNInfinity` (19.2-9), `IsXInfinity` (19.2-9), `IsNaN` (19.2-9).
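
These filters can be tried on values obtained from division by zero. The sketch below assumes the default machine floats; `pinf` and `ninf` are illustrative names.

```gap
pinf := 1.0/0.0;;      # division by zero gives a signed infinity
ninf := -1.0/0.0;;
[ IsPInfinity(pinf), IsNInfinity(ninf) ];
[ IsXInfinity(pinf), IsXInfinity(ninf) ];   # either sign counts
[ IsFinite(1.0), IsFinite(pinf), IsNaN(pinf) ];
```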

Comparisons between floating-point numbers and rationals are explicitly forbidden. The rationale is that objects belonging to different families should in general not be comparable in **GAP**. Floating-point numbers are also approximations of real numbers, and don't follow the same rules; consider for example, using the default **GAP** implementation of floating-point numbers,

    gap> 1.0/3.0 = Float(1/3);
    true
    gap> (1.0/3.0)^5 = Float((1/3)^5);
    false
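
When a float does have to be compared with a rational, one side can be converted first so that both operands lie in the same family. A minimal sketch with a value that is exactly representable, so both comparisons are expected to return `true`:

```gap
x := 0.5;;
# Convert the rational to a float and compare as floats.
x = Float(1/2);
# Or convert the float to a rational and compare as rationals.
Rat(x) = 1/2;
```

Which direction to convert in is a design choice: comparing as floats tolerates nothing beyond exact equality of the rounded values, while `Rat` returns an approximation chosen by its own heuristics.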

`‣ Float` ( obj ) | ( function ) |

`‣ NewFloat` ( filter, obj ) | ( operation ) |

`‣ MakeFloat` ( sample, obj ) | ( operation ) |

Returns: A new floating-point number, based on `obj`

This function creates a new floating-point number.

If `obj` is a rational number, the floating-point number is created with sufficient precision so that it can (usually) be converted back to the original number (see `Rat` (17.2-6)). For an integer, the precision, if unspecified, is chosen large enough that `Int(Float(obj))=obj` always holds, but is at least 64 bits.

`obj` may also be a string, which may be of the form `"3.14e0"` or `".314e1"` or `".314@1"` etc.

An option may be passed to specify, in bits, a desired precision. The format is `Float("3.14":PrecisionFloat:=1000)` to create a 1000-bit approximation of \(3.14\).

In particular, if `obj` is already a floating-point number, then `Float(obj:PrecisionFloat:=prec)` creates a copy of `obj` with the new precision `prec`.

`‣ Rat` ( f ) | ( attribute ) |

Returns: A rational approximation to `f`

This command constructs a rational approximation to the floating-point number `f`. Of course, it is not guaranteed to return the original rational number `f` was created from, though it returns the most "reasonable" one given the precision of `f`.

Two options control the precision of the rational approximation: in the form `Rat(f:maxdenom:=md,maxpartial:=mp)`, the rational returned is such that the denominator is at most `md` and the partials in its continued fraction expansion are at most `mp`. The default values are `maxpartial:=10000` and `maxdenom:=2^(precision/2)`.
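
A short sketch of the `maxdenom` option, assuming the default machine floats. The bound `1000` and the tolerance `1.0e-2` are arbitrary choices for illustration, and no particular fraction is claimed as the result.

```gap
# Ask for an approximation of pi with a small denominator.
r := Rat(FLOAT.PI : maxdenom := 1000);;
# The option bounds the denominator, according to the text.
DenominatorRat(r) <= 1000;
# The approximation should still be close to the original float.
AbsoluteValue(Float(r) - FLOAT.PI) < 1.0e-2;
```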

`‣ Cyc` ( f[, degree] ) | ( attribute ) |

Returns: A cyclotomic approximation to `f`

This command constructs a cyclotomic approximation to the floating-point number `f`. Of course, it is not guaranteed to return the original cyclotomic number `f` was created from, though it returns the most "reasonable" one given the precision of `f`. An optional argument `degree` specifies the maximal degree of the cyclotomic to be constructed.

The method used is LLL lattice reduction.

`‣ SetFloats` ( rec[, bits][, install] ) | ( function ) |

Installs a new interface to floating-point numbers in **GAP**, optionally with a desired precision `bits` in binary digits. The last optional argument `install` is a boolean value; if false, it only installs the eager handler and the precision for the floateans, without making them the default.

`‣ FLOAT` | ( global variable ) |

This record contains useful floating-point constants:

**DECIMAL_DIG** Maximal number of useful digits;

**DIG** Number of significant digits;

**VIEW_DIG** Number of digits to print in short view;

**EPSILON** Smallest number such that \(1\neq1+\epsilon\);

**MANT_DIG** Number of bits in the mantissa;

**MAX** Maximal representable number;

**MAX_10_EXP** Maximal decimal exponent;

**MAX_EXP** Maximal binary exponent;

**MIN** Minimal positive representable number;

**MIN_10_EXP** Minimal decimal exponent;

**MIN_EXP** Minimal exponent;

**INFINITY** Positive infinity;

**NINFINITY** Negative infinity;

**NAN** Not-a-number,

as well as mathematical constants `E`, `LOG2E`, `LOG10E`, `LN2`, `LN10`, `PI`, `PI_2`, `PI_4`, `1_PI`, `2_PI`, `2_SQRTPI`, `SQRT2`, `SQRT1_2`.

`‣ EqFloat` ( x, y ) | ( operation ) |

Returns: Whether the floateans `x` and `y` are equal

This function compares two floating-point numbers, and returns `true` if they are equal, and `false` otherwise; with the exception that `NaN` is always considered to be different from itself.

`‣ PrecisionFloat` ( x ) | ( attribute ) |

Returns: The precision of `x`

This function returns the precision, counted in number of binary digits, of the floating-point number `x`.

`‣ SignBit` ( x ) | ( attribute ) |

`‣ SignFloat` ( x ) | ( attribute ) |

Returns: The sign of `x`.

The first function `SignBit` returns the sign bit of the floating-point number `x`: `true` if `x` is negative (including `-0.`) and `false` otherwise.

The second function `SignFloat` returns the integer `-1` if `x<0`, `0` if `x=0` and `1` if `x>0`.
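
The two functions differ on negative zero. A minimal sketch, assuming that `-0.0` evaluates to a zero with its sign bit set in the default machine format; the comments restate the behaviour described in the text.

```gap
z := -0.0;;        # negative zero
SignBit(z);        # the sign bit is set, so true is expected
SignFloat(z);      # z equals zero, so 0 is expected
SignBit(2.5);      # false for a positive number
SignFloat(-2.5);   # -1 for a negative number
```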

`‣ IsPInfinity` ( x ) | ( property ) |

`‣ IsNInfinity` ( x ) | ( property ) |

`‣ IsXInfinity` ( x ) | ( property ) |

`‣ IsFinite` ( x ) | ( property ) |

`‣ IsNaN` ( x ) | ( property ) |

Returns `true` if the floating-point number `x` is respectively \(+\infty\), \(-\infty\), \(\pm\infty\), finite, or "not a number", such as the result of `0.0/0.0`.

`‣ Cos` ( f ) | ( operation ) |

`‣ Sin` ( f ) | ( operation ) |

`‣ Tan` ( f ) | ( operation ) |

`‣ Sec` ( f ) | ( operation ) |

`‣ Csc` ( f ) | ( operation ) |

`‣ Cot` ( f ) | ( operation ) |

`‣ Asin` ( f ) | ( operation ) |

`‣ Acos` ( f ) | ( operation ) |

`‣ Atan` ( f ) | ( operation ) |

`‣ Cosh` ( f ) | ( operation ) |

`‣ Sinh` ( f ) | ( operation ) |

`‣ Tanh` ( f ) | ( operation ) |

`‣ Sech` ( f ) | ( operation ) |

`‣ Csch` ( f ) | ( operation ) |

`‣ Coth` ( f ) | ( operation ) |

`‣ Asinh` ( f ) | ( operation ) |

`‣ Acosh` ( f ) | ( operation ) |

`‣ Atanh` ( f ) | ( operation ) |

`‣ Log` ( f ) | ( operation ) |

`‣ Log2` ( f ) | ( operation ) |

`‣ Log10` ( f ) | ( operation ) |

`‣ Log1p` ( f ) | ( operation ) |

`‣ Exp` ( f ) | ( operation ) |

`‣ Exp2` ( f ) | ( operation ) |

`‣ Exp10` ( f ) | ( operation ) |

`‣ Expm1` ( f ) | ( operation ) |

`‣ CubeRoot` ( f ) | ( operation ) |

`‣ Square` ( f ) | ( operation ) |

`‣ Ceil` ( f ) | ( operation ) |

`‣ Floor` ( f ) | ( operation ) |

`‣ Round` ( f ) | ( operation ) |

`‣ Trunc` ( f ) | ( operation ) |

`‣ Atan2` ( y, x ) | ( operation ) |

`‣ FrExp` ( f ) | ( operation ) |

`‣ LdExp` ( f, exp ) | ( operation ) |

`‣ AbsoluteValue` ( f ) | ( operation ) |

`‣ Norm` ( f ) | ( operation ) |

`‣ Hypothenuse` ( x, y ) | ( operation ) |

`‣ Frac` ( f ) | ( operation ) |

`‣ SinCos` ( f ) | ( operation ) |

`‣ Erf` ( f ) | ( operation ) |

`‣ Zeta` ( f ) | ( operation ) |

`‣ Gamma` ( f ) | ( operation ) |

Standard math functions.
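
A few of these functions in use, as a sketch with the default machine floats. The sample value `0.75` and the tolerance `1.0e-12` are arbitrary; the checks are stated up to rounding error rather than as exact identities.

```gap
x := 0.75;;
# Pythagorean identity, up to rounding error.
AbsoluteValue(Sin(x)^2 + Cos(x)^2 - 1.0) < 1.0e-12;
# Exp and Log are inverse to each other, again up to rounding.
AbsoluteValue(Log(Exp(x)) - x) < 1.0e-12;
# Length of the hypotenuse of the 3-4-5 right triangle.
Hypothenuse(3.0, 4.0);
# Rounding functions applied to the same kind of input.
[ Floor(2.5), Ceil(2.5), Trunc(-2.5) ];
```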

**GAP** provides a mechanism for packages to implement new floating-point numerical interfaces. The following describes that mechanism; actual examples of packages are documented separately.

A package must create a record with fields (all optional)

**creator** a function converting strings to floating-point;

**eager** a character allowing immediate conversion to floating-point;

**objbyextrep** a function creating a floating-point number out of a list `[mantissa,exponent]`;

**filter** a filter for the new floating-point objects;

**constants** a record containing numerical constants, such as `MANT_DIG`, `MAX`, `MIN`, `NAN`.

The package must install methods `Int`, `Rat`, `String` for its objects, and creators `NewFloat(filter,IsRat)`, `NewFloat(IsString)`.

It must then install methods for all arithmetic and numerical operations: `PLUS`, `Exp`, ...

The user chooses that implementation by calling `SetFloats` (19.2-4) with the record as argument, and with an optional second argument requesting a precision in binary digits.

Complex arithmetic may be implemented in packages, and is present in **float**. Complex numbers are treated as usual numbers; they may be input with an extra "i" as in `-0.5+0.866i`. They may also be created using `NewFloat` (19.2-1) with three arguments: the float filter, the real part, and the imaginary part.

Methods should then be implemented for `Norm`, `RealPart`, `ImaginaryPart`, `ComplexConjugate`, ...

`‣ Argument` ( z ) | ( attribute ) |

Returns the argument of the complex number `z`, namely the value `Atan2(ImaginaryPart(z),RealPart(z))`.
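
The defining formula can be tried without any complex-number package by keeping the real and imaginary parts as two ordinary floats. The sketch below uses the parts of the sample number `-0.5+0.866i` from the text; the names `re`, `im`, `arg` and the tolerance are illustrative.

```gap
# Real and imaginary parts, kept as two ordinary machine floats.
re := -0.5;;  im := 0.866;;
# The same expression that defines Argument, applied to the two parts.
arg := Atan2(im, re);;
# The point lies close to the direction 2*pi/3.
AbsoluteValue(arg - 2*FLOAT.PI/3) < 1.0e-3;
```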

Interval arithmetic may also be implemented in packages. Intervals are in fact efficient implementations of sets of real numbers. The only non-trivial issue is how they should be compared. The standard `EQ` tests if the intervals are equal; however, it is usually more useful to know if intervals overlap, or are disjoint, or are contained in each other.

Note the usual convention that intervals are compared as in \([a,b]\le[c,d]\) if and only if \(a\le c\) and \(b\le d\).
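
This convention defines only a partial order, which a small model makes visible. In the sketch below an interval is modeled as a plain list `[ inf, sup ]`, and `IntervalLE` is a hypothetical helper written for this illustration; neither is part of **GAP** or of any package.

```gap
# Hypothetical helper, not part of GAP: an interval is modeled as [ inf, sup ].
IntervalLE := function(i, j)
    return i[1] <= j[1] and i[2] <= j[2];
end;;
IntervalLE([1.0, 2.0], [1.0, 3.0]);   # true: both endpoints are ordered
IntervalLE([0.0, 3.0], [1.0, 2.0]);   # false: 3.0 > 2.0
IntervalLE([1.0, 2.0], [0.0, 3.0]);   # false: 1.0 > 0.0
```

The last two calls show a pair of intervals, one nested in the other, for which neither direction of the comparison holds.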

`‣ Sup` ( x ) | ( attribute ) |

Returns the supremum of the interval `x`.

`‣ Inf` ( x ) | ( attribute ) |

Returns the infimum of the interval `x`.

`‣ Mid` ( x ) | ( attribute ) |

Returns the midpoint of the interval `x`.

`‣ AbsoluteDiameter` ( x ) | ( attribute ) |

`‣ Diameter` ( x ) | ( attribute ) |

Returns the absolute diameter of the interval `x`, namely the difference `Sup(x)-Inf(x)`.

`‣ RelativeDiameter` ( x ) | ( attribute ) |

Returns the relative diameter of the interval `x`, namely `(Sup(x)-Inf(x))/AbsoluteValue(Mid(x))`.

`‣ IsDisjoint` ( x1, x2 ) | ( operation ) |

Returns `true` if the two intervals `x1`, `x2` are disjoint.

`‣ IsSubset` ( x1, x2 ) | ( operation ) |

Returns `true` if the interval `x1` contains `x2`.

`‣ IncreaseInterval` ( x, delta ) | ( operation ) |

Returns an interval with same midpoint as `x` but absolute diameter increased by `delta`.

`‣ BlowupInterval` ( x, ratio ) | ( operation ) |

Returns an interval with same midpoint as `x` but relative diameter increased by `ratio`.

`‣ BisectInterval` ( x ) | ( operation ) |

Returns a list of two intervals whose union equals the interval `x`.
