See: Description
Interface  Description 

DoubleConstants 
Constants needed for various algorithms for the
double type. 
DoubleModConstants 
Constants needed for various modular arithmetic operations for the
double type. 
DoubleRadixConstants 
Constants related to different radixes for the
double data type. 
FloatConstants 
Constants needed for various algorithms for the
float type. 
FloatModConstants 
Constants needed for various modular arithmetic operations for the
float type. 
FloatRadixConstants 
Constants related to different radixes for the
float data type. 
IntConstants 
Constants needed for various algorithms for the
int type. 
IntModConstants 
Constants needed for various modular arithmetic operations for the
int type. 
IntRadixConstants 
Constants related to different radixes for the
int data type. 
LongConstants 
Constants needed for various algorithms for the
long type. 
LongModConstants 
Constants needed for various modular arithmetic operations for the
long type. 
LongRadixConstants 
Constants related to different radixes for the
long data type. 
Parallelizable 
Any task that can use a
ParallelRunner to execute operations in parallel. 
Class  Description 

AbstractConvolutionBuilder 
Abstract base class for creating convolutions of suitable type for the specified length.

AbstractDataStorageBuilder 
Abstract base class for a data storage creation strategy.

AbstractNTTBuilder 
Abstract base class for creating Number Theoretic Transforms suitable for the
specified length, based on available memory configured in the
ApfloatContext . 
AbstractStepFNTStrategy 
Abstract superclass for stepbased FNT strategies.

DiskDataStorage 
Abstract base class for diskbased data storage, containing the common
functionality independent of the element type.

DoubleAdditionBuilder 
Creates additions for the specified radix and the
double element type. 
DoubleAdditionStrategy 
Basic addition strategy for the
double element type. 
DoubleApfloatBuilder 
Builder class for building
ApfloatImpl implementations with the
double data element type. 
DoubleApfloatImpl 
Immutable apfloat implementation class for the
double data element type. 
DoubleBaseMath 
Mathematical operations on numbers in a base.

DoubleBuilderFactory 
Factory class for getting instances of the various builder classes needed
to build an
ApfloatImpl with the double data element type. 
DoubleCarryCRTBuilder 
Creates carryCRT related objects, for the
double type. 
DoubleCarryCRTStepStrategy 
Class for performing the final steps of a threemodulus
Number Theoretic Transform based convolution.

DoubleConvolutionBuilder 
Creates convolutions of suitable type for the
double type. 
DoubleCRTMath 
Basic arithmetic for calculating the Chinese Remainder
Theorem.

DoubleDataStorageBuilder 
Default data storage creation strategy for the
double data type. 
DoubleDiskDataStorage 
Diskbased data storage for the
double element type. 
DoubleElementaryModMath 
Elementary modulo arithmetic functions for
double data. 
DoubleFactor3NTTStepStrategy 
Steps for the factor3 NTT.

DoubleKaratsubaConvolutionStrategy 
Convolution strategy using the Karatsuba algorithm.

DoubleMatrixBuilder 
Creates matrix operations objects, for the
double type. 
DoubleMatrixStrategy 
Optimized matrix transposition methods for the
double type. 
DoubleMediumConvolutionStrategy 
Mediumlength convolution strategy.

DoubleMemoryArrayAccess 
Array access class based on a
double[] . 
DoubleMemoryDataStorage 
Memory based data storage implementation for the
double
element type. 
DoubleModMath 
Modulo arithmetic functions for
double data. 
DoubleNTTBuilder 
Creates Number Theoretic Transforms for the
double type. 
DoubleNTTConvolutionStepStrategy 
Steps of a threeNTT convolution for the
double type. 
DoubleNTTStepStrategy 
Common methods to calculate Fast Number Theoretic Transforms
in parallel using multiple threads.

DoubleScramble 
Functions to perform bitreverse ordering of
double data. 
DoubleShortConvolutionStrategy 
Short convolution strategy.

DoubleTableFNT 
Fast Number Theoretic Transform that uses lookup tables
for powers of n:th root of unity and permutation indexes.

DoubleTableFNTStrategy 
Fast Number Theoretic Transform strategy that uses lookup tables
for powers of n:th root of unity and permutation indexes.

DoubleWTables 
Helper class for generating and caching tables of powers of the n:th root of unity.

Factor3NTTStrategy 
A transform that implements a 3point transform on
top of another Number Theoretic Transform that does
transforms of length 2^{n}.

FloatAdditionBuilder 
Creates additions for the specified radix and the
float element type. 
FloatAdditionStrategy 
Basic addition strategy for the
float element type. 
FloatApfloatBuilder 
Builder class for building
ApfloatImpl implementations with the
float data element type. 
FloatApfloatImpl 
Immutable apfloat implementation class for the
float data element type. 
FloatBaseMath 
Mathematical operations on numbers in a base.

FloatBuilderFactory 
Factory class for getting instances of the various builder classes needed
to build an
ApfloatImpl with the float data element type. 
FloatCarryCRTBuilder 
Creates carryCRT related objects, for the
float type. 
FloatCarryCRTStepStrategy 
Class for performing the final steps of a threemodulus
Number Theoretic Transform based convolution.

FloatConvolutionBuilder 
Creates convolutions of suitable type for the
float type. 
FloatCRTMath 
Basic arithmetic for calculating the Chinese Remainder
Theorem.

FloatDataStorageBuilder 
Default data storage creation strategy for the
float data type. 
FloatDiskDataStorage 
Diskbased data storage for the
float element type. 
FloatElementaryModMath 
Elementary modulo arithmetic functions for
float data. 
FloatFactor3NTTStepStrategy 
Steps for the factor3 NTT.

FloatKaratsubaConvolutionStrategy 
Convolution strategy using the Karatsuba algorithm.

FloatMatrixBuilder 
Creates matrix operations objects, for the
float type. 
FloatMatrixStrategy 
Optimized matrix transposition methods for the
float type. 
FloatMediumConvolutionStrategy 
Mediumlength convolution strategy.

FloatMemoryArrayAccess 
Array access class based on a
float[] . 
FloatMemoryDataStorage 
Memory based data storage implementation for the
float
element type. 
FloatModMath 
Modulo arithmetic functions for
float data. 
FloatNTTBuilder 
Creates Number Theoretic Transforms for the
float type. 
FloatNTTConvolutionStepStrategy 
Steps of a threeNTT convolution for the
float type. 
FloatNTTStepStrategy 
Common methods to calculate Fast Number Theoretic Transforms
in parallel using multiple threads.

FloatScramble 
Functions to perform bitreverse ordering of
float data. 
FloatShortConvolutionStrategy 
Short convolution strategy.

FloatTableFNT 
Fast Number Theoretic Transform that uses lookup tables
for powers of n:th root of unity and permutation indexes.

FloatTableFNTStrategy 
Fast Number Theoretic Transform strategy that uses lookup tables
for powers of n:th root of unity and permutation indexes.

FloatWTables 
Helper class for generating and caching tables of powers of the n:th root of unity.

IntAdditionBuilder 
Creates additions for the specified radix and the
int element type. 
IntAdditionStrategy 
Basic addition strategy for the
int element type. 
IntApfloatBuilder 
Builder class for building
ApfloatImpl implementations with the
int data element type. 
IntApfloatImpl 
Immutable apfloat implementation class for the
int data element type. 
IntBaseMath 
Mathematical operations on numbers in a base.

IntBuilderFactory 
Factory class for getting instances of the various builder classes needed
to build an
ApfloatImpl with the int data element type. 
IntCarryCRTBuilder 
Creates carryCRT related objects, for the
int type. 
IntCarryCRTStepStrategy 
Class for performing the final steps of a threemodulus
Number Theoretic Transform based convolution.

IntConvolutionBuilder 
Creates convolutions of suitable type for the
int type. 
IntCRTMath 
Basic arithmetic for calculating the Chinese Remainder
Theorem.

IntDataStorageBuilder 
Default data storage creation strategy for the
int data type. 
IntDiskDataStorage 
Diskbased data storage for the
int element type. 
IntElementaryModMath 
Elementary modulo arithmetic functions for
int data. 
IntFactor3NTTStepStrategy 
Steps for the factor3 NTT.

IntKaratsubaConvolutionStrategy 
Convolution strategy using the Karatsuba algorithm.

IntMatrixBuilder 
Creates matrix operations objects, for the
int type. 
IntMatrixStrategy 
Optimized matrix transposition methods for the
int type. 
IntMediumConvolutionStrategy 
Mediumlength convolution strategy.

IntMemoryArrayAccess 
Array access class based on a
int[] . 
IntMemoryDataStorage 
Memory based data storage implementation for the
int
element type. 
IntModMath 
Modulo arithmetic functions for
int data. 
IntNTTBuilder 
Creates Number Theoretic Transforms for the
int type. 
IntNTTConvolutionStepStrategy 
Steps of a threeNTT convolution for the
int type. 
IntNTTStepStrategy 
Common methods to calculate Fast Number Theoretic Transforms
in parallel using multiple threads.

IntScramble 
Functions to perform bitreverse ordering of
int data. 
IntShortConvolutionStrategy 
Short convolution strategy.

IntTableFNT 
Fast Number Theoretic Transform that uses lookup tables
for powers of n:th root of unity and permutation indexes.

IntTableFNTStrategy 
Fast Number Theoretic Transform strategy that uses lookup tables
for powers of n:th root of unity and permutation indexes.

IntWTables 
Helper class for generating and caching tables of powers of the n:th root of unity.

LongAdditionBuilder 
Creates additions for the specified radix and the
long element type. 
LongAdditionStrategy 
Basic addition strategy for the
long element type. 
LongApfloatBuilder 
Builder class for building
ApfloatImpl implementations with the
long data element type. 
LongApfloatImpl 
Immutable apfloat implementation class for the
long data element type. 
LongBaseMath 
Mathematical operations on numbers in a base.

LongBuilderFactory 
Factory class for getting instances of the various builder classes needed
to build an
ApfloatImpl with the long data element type. 
LongCarryCRTBuilder 
Creates carryCRT related objects, for the
long type. 
LongCarryCRTStepStrategy 
Class for performing the final steps of a threemodulus
Number Theoretic Transform based convolution.

LongConvolutionBuilder 
Creates convolutions of suitable type for the
long type. 
LongCRTMath 
Basic arithmetic for calculating the Chinese Remainder
Theorem.

LongDataStorageBuilder 
Default data storage creation strategy for the
long data type. 
LongDiskDataStorage 
Diskbased data storage for the
long element type. 
LongElementaryModMath 
Elementary modulo arithmetic functions for
long data. 
LongFactor3NTTStepStrategy 
Steps for the factor3 NTT.

LongKaratsubaConvolutionStrategy 
Convolution strategy using the Karatsuba algorithm.

LongMatrixBuilder 
Creates matrix operations objects, for the
long type. 
LongMatrixStrategy 
Optimized matrix transposition methods for the
long type. 
LongMediumConvolutionStrategy 
Mediumlength convolution strategy.

LongMemoryArrayAccess 
Array access class based on a
long[] . 
LongMemoryDataStorage 
Memory based data storage implementation for the
long
element type. 
LongModMath 
Modulo arithmetic functions for
long data. 
LongNTTBuilder 
Creates Number Theoretic Transforms for the
long type. 
LongNTTConvolutionStepStrategy 
Steps of a threeNTT convolution for the
long type. 
LongNTTStepStrategy 
Common methods to calculate Fast Number Theoretic Transforms
in parallel using multiple threads.

LongScramble 
Functions to perform bitreverse ordering of
long data. 
LongShortConvolutionStrategy 
Short convolution strategy.

LongTableFNT 
Fast Number Theoretic Transform that uses lookup tables
for powers of n:th root of unity and permutation indexes.

LongTableFNTStrategy 
Fast Number Theoretic Transform strategy that uses lookup tables
for powers of n:th root of unity and permutation indexes.

LongWTables 
Helper class for generating and caching tables of powers of the n:th root of unity.

MessagePasser<K,V> 
Message passing helper class for parallel codes.

ParallelRunnable 
Abstract class for a
Runnable that can be run in parallel by
multiple threads. 
ParallelRunner 
Class for running
ParallelRunnable objects in parallel using
multiple threads. 
ParallelThreeNTTConvolutionStrategy 
Convolution using three Number Theoretic Transforms
and the CRT to get the final result, using multiple threads in parallel.

Scramble 
Functions to perform bitreverse ordering of data.

SixStepFNTStrategy 
Fast Number Theoretic Transform that uses a "sixstep"
algorithm to calculate a long transform more efficiently on
cachebased memory architectures.

StepCarryCRTStrategy 
Class for performing the final step of a threemodulus
Number Theoretic Transform based convolution.

ThreeNTTConvolutionStrategy 
Convolution using three Number Theoretic Transforms
and the Chinese Remainder Theorem to get the final result.

TwoPassFNTStrategy 
Fast Number Theoretic Transform that uses a "twopass"
algorithm to calculate a very long transform on data that
resides on a mass storage device.

Exception  Description 

ApfloatInternalException 
Exception indicating some unexpected apfloat
implementation specific error situation.

BackingStorageException 
Exception indicating a backing storage failure.

ImplementationMismatchException 
Exception indicating a different implementation of the apfloat SPI
being used in two operands of a calculation.

RadixMismatchException 
Exception indicating a different radix being used in two operands
of a calculation.

TransformLengthExceededException 
Exception indicating that the "size" of the numbers used in a
multiplication is too large.

The org.apfloat.internal
package contains four different
implementations of the apfloat SPI, each based on a different primitive
element type:
IntBuilderFactory
, based on element type
int
: This is the default implementation used by apfloat.
It works well for 32bit platforms that perform integer operations fast
(including integer multiplication), and can multiply double
s
and convert between double
and int
with adequate
performance. This applies to most workstations today (Intel x86 processors
and compatibles, in particular processors with SSE2 support, and most RISC
architectures). You can do calculations up to roughly 226 million digits
(in radix 10) with this implementation, which should be enough for most
purposes.LongBuilderFactory
, based on element type
long
: This implementation uses the 64bit long
integer as the elementary type for all data storage and manipulation. It
usually is faster than the int
version on 64bit architectures
if you have a JVM that actually uses the 64bit features of the processor.
In some places it uses also double
arithmetic, so the processor
should be able to perform doubleprecision floating point operations as well
as convert between double
and long
, for decent
performance. For example, on x8664 and SPARC the 64bit long
version is faster than the 32bit int
version. You can use the
long
implementation on 32bit platforms too, however the
performance per element is less than half of the int
version,
even if roughly twice as much data is processed per element. The upside
is that this implementation can do much bigger calculations: up to about
3.5 * 10^{15} digits in radix 10.DoubleBuilderFactory
, based on element type
double
: This implementation exists generally only as a
curiosity. It will typically perform worse than the long
version, and it's only able to do calculations with about 1/20 of its
maximum digit length. The only situation where using the double
version might make sense is on a platform that performs floatingpoint
arithmetic well, but performs integer arithmetic extremely badly. Finding
such a platform today might be difficult, so generally it's advisable to
use the long
version instead, if you have a 64bit platform
or need the most extreme precision.FloatBuilderFactory
, based on element type
float
: This version is also only a curiosity. The main
downside is that it can only perform calculations up to about 1.3
million radix10 digits. The perdigit performance is also typically
less than that of the int
version. Unless you have a
computer that performs floatingpoint arithmetic extraordinarily well
compared to integer arithmetic, it's always advisable to use the
int
version instead.Type  Pentium 4  Athlon XP  Athlon 64 (32bit)  Athlon 64 (64bit)  UltraSPARC II 

Int  100%  100%  100%  100%  100% 
Long  40%  76%  59%  95%  132% 
Double  45%  63%  59%  94%  120% 
Float  40%  43%  46%  42%  82% 
Compared to the java.math.BigInteger
class with different digit
sizes, the apfloat relative performance with the same CPUs is as follows:
(Test was done with apfloat 1.1 using Sun's Java 5.0 server VM calculating 3^{n} and converting the result to decimal.)
This benchmark suggests that for small numbers – less than roughly 200 decimal
digits in size – the BigInteger
/ BigDecimal
classes
are probably faster, even by an order of magnitude. Using apfloats is only beneficial
for numbers that have at least a couple hundred digits, or of course if some
mathematical functions are needed that are not available for BigInteger
s
or BigDecimal
s. The results can be easily explained by the smaller overhead
that BigInteger
s have due to their simpler implementation. When the size
of the mantissa grows, the O(n log n) complexity of apfloat's FFTbased multiplication
makes apfloat considerably faster than the steady O(n^{2}) implementation
of the BigInteger
class. For numbers with millions of digits,
multiplication using BigInteger
s would be simply unfeasible, whereas for
apfloat it would not be a problem at all.
All of the above apfloat implementations have the following features (some of the links
point to the int
version, but all four versions have similar classes):
IntMemoryDataStorage
) or on disk
(IntDiskDataStorage
).IntShortConvolutionStrategy
),
using a simple O(n^{2}) long multiplication algorithm for small numbers,
with low overhead (IntMediumConvolutionStrategy
),
using the Karatsuba multiplication algorithm for slightly larger numbers,
with some more overhead (IntKaratsubaConvolutionStrategy
),
or using a Number Theoretic Transform (NTT) done using three different moduli,
and the final result calculated using the Chinese Remainder Theorem
(ThreeNTTConvolutionStrategy
), for big numbers.IntTableFNTStrategy
) when the entire transform
fits in the processor cache, "sixstep" NTT when the transform fits in the
main memory (SixStepFNTStrategy
),
and a diskbased "twopass" NTT strategy when the whole transform doesn't
fit in the available memory (TwoPassFNTStrategy
).ApfloatInternalException
.
This exception, or various subclasses can be thrown in different situations, for
example:
IOException
can be thrown in any of the disk operations,
if e.g. a file can't be created, or written to if the disk is full.*.ap
and they are by default created in the current working directory. When the objects
are garbage collected, the temporary files are deleted. However, garbage collection
may not work perfectly at all times, and in general there are no guarantees that
it will happen at all. So, depending on the program being executed, it may be
beneficial to explicitly call System.gc()
at some point to ensure
that unused temporary files are deleted. However, VM vendors generally warn
against doing this too often, since it may seriously degrade performance. So,
figuring out how to optimally call it may be difficult. If the file deletion fails
for some reason, some temporary files may be left on disk after the program
exits. These files can be safely removed after the program has terminated.Many parts of the program are parallelized i.e. are processed with multiple threads in parallel. Parallelization is done where it has been easy to implement and where it is efficient. E.g. the "sixstep" NTT is parallelized, because the data is in matrix form in memory and it's easy and highly efficient to process the rows of the matrix in parallel. Other places where parallelization is implemented are the inplace multiplication of transform results and the carryCRT operation. However in both of these algorithms the process is parallelized only if the data is in memory  if the data was stored on disk then the irregular disk seeking could make the parallel algorithm highly inefficient.
Many sections of the code are not parallelized, where it's obvious that parallelization would not bring any benefits. Examples of such cases are addition, subtraction and matrix transposition. While parallel algorithms for these operations could certainly be implemented, they would not bring any performance improvement. The bottleneck in these operations is memory or I/O bandwidth and not CPU processing time. The CPU processing in addition and subtraction is highly trivial; in matrix transposition it's outright nonexistent  the algorithm only moves data from one place to another. Even if all the data was stored in memory, the memory bandwidth would be the bottleneck. E.g. in addition, the algorithm only needs a few CPU cycles per element to be processed. However moving the data from main memory to CPU registers and back to main memory needs likely significantly more CPU cycles than the addition operation itself. Parallelization would therefore not improve efficiency at all  the total CPU load might appear to increase but when measured in wallclock time the execution would not be any faster.
Since the core functionality of the apfloat implementation is based on the
original C++ version of apfloat, no significant new algorithms have been
added (although the architecture has been otherwise greatly beautified e.g. by
separating the different implementations behind a SPI, and applying all kinds
of patterns everywhere). Thus, there are no different implementations for e.g.
using a floatingpoint FFT instead of a NTT, as the SPI (org.apfloat.spi
)
might suggest. However the default implementation does implement all the
patterns suggested by the SPI – in fact the SPI was designed for the
default implementation.
The class diagram for an example apfloat that is stored on disk is shown below.
Note that all the aggregate classes can be shared by multiple objects that point
to the same instance. For example, multiple Apfloats can point to the same
ApfloatImpl, multiple ApfloatImpls can point to the same DataStorage etc. This
sharing happens in various situations, e.g. by calling floor()
,
multiplying by one etc:
The sequence diagram for creating a new apfloat that is stored on disk is as follows. Note that the FileStorage class is a private inner class of the DiskDataStorage class:
The sequence diagram for multiplying two apfloats is as follows. In this case a NTT based convolution is used, and the resulting apfloat is stored in memory:
Most of the files in the apfloat implementations are generated from templates
where a template tag is replaced by int/long/float/double
or
Int/Long/Float/Double
. Also the byte size of the element type is
templatized and replaced by 4/8/4/8. The only files that are individually
implemented for each element type are:
BaseMath.java CRTMath.java ElementaryModMath.java ModConstants.java
org.apfloat.spi