Mercury is a fully declarative functional / logic programming language. Unlike many other functional languages, it supports non-deterministic programming and automatic backtracking (or other search strategies); unlike many other logic languages, it supports static typing and efficient compilation to C, Java, C# and Erlang.
Like many other languages, it offers multi-precision integers, in the form of the integer.m part of its standard library. In another article I described some work on parallelization of the bignum multiplication. Recently, in a thread on the m-users mailing list, a user reported his experience using bignum arithmetic in Mercury, stating that performance was severely lacking behind other languages.
Investigating at bit, it became obvious that Mercury was the only language using a native implementation base on pure, immutable data-structures (lists of 14 bit digits) for bignum arithmetic. The other languages either use proprietary low-level implementations or the GMP library.
While there still is a potential to optimize bignum division in integer.m, it became obvious that in order to achieve comparable performance, one would need the binding to an external library. Being a fan of suckless software, I had read about libtommath, which is an efficient but simple library for multi-precision arithmetic, written in fully portable ISO-C with very clean and understandable code.
The data-structure for libtommath is defined as a simple C structure:
/* the infamous mp_int structure */ typedef struct { int used, alloc, sign; mp_digit *dp; } mp_int;
It contains the number of used digits, the number of allocated digits, the sign
of the number and finally the array containing the digits. mp_digit
is defined
dependent on the system on which the library is compiled.
Using this type in Mercury is surprisingly easy. We can simply declare a new
foreign type mp_int
as being a pointer to a C mp_int
structure:
:- pragma foreign_type("C", mp_int, "mp_int*") where equality is equal, comparison is mp_cmp.
In order to have correct comparison and unification for the new type, we have to declare which predicates to use for equality and comparison.
To communicate in the other direction, i.e., from C to Mercury, in particular to
report the return status of operations on mp_int
, one declares a Mercury
discriminated union type for the different possibilities of return values from
the library.
:- type mp_result_type ---> mp_result_okay ; mp_result_out_of_mem ; mp_result_invalid_input.
The 3 possibilities of this type are then mapped explicitly on the values from the header declaration of the library:
:- pragma foreign_enum("C", mp_result_type/0, [ mp_result_okay - "MP_OKAY", mp_result_out_of_mem - "MP_MEM", mp_result_invalid_input - "MP_VAL" ]).
mp_int
Values and the Garbage CollectorCreating an mp_int
is done by calling the Mercury predicate mp_init/2
which
calls the mp_init/3
predicate, defined as a direct call into the library. If the
library call does not succeed, i.e., the mp_int
could not be created, an error
is thrown.
:- pred mp_init(int::in, mp_int::out) is det.
mp_init(N, Res) :-
mp_init(N, Result, Res0),
( Result = mp_result_okay ->
Res = Res0
;
error("could not initialize mp_int")
).
One problem that arises when connecting a garbage collected language like Mercury to a library written in C is how one treats memory allocation on the different heaps.
On the Mercury side, this is quite easy. It is sufficient to allocate memory on the Mercury heap using the defined macros. In this way, the garbage collector is aware of the allocated variables and will be able to release them if required.
:- pred mp_init(int::in, mp_result_type::out, mp_int::out) is det. :- pragma foreign_proc("C", mp_init(Value::in, Result::out, Mp_Int::out), [will_not_call_mercury, promise_pure, thread_safe], " int initResult, opResult; Mp_Int = MR_GC_NEW_ATTRIB(mp_int, MR_ALLOC_ID); initResult = mp_init(Mp_Int); if (initResult == MP_OKAY) { opResult = mp_set_int(Mp_Int, Value); Result = opResult; } else Result = initResult; ").
Here, one allocates GC-collected memory for Mp_Int
, calls the initialization
function of the library and, if successful, calls the function to set the value
of the mp_int
. The problem is, that memory allocated on the C side is not yet
under control of the garbage collector.
For libtommath, as pointed out in a message on m-users, all memory allocations
and releases are declared as macros and can be redefined at compile time. It is
therefore possible to replace any call to malloc
and free
via calls to the Boehm
garbage collector, i.e., making all allocated memory subject to the GC, once it
is no longer referenced.
Using the same principle as above, any function exported from the library can be called via a Mercury predicate or function. For each declaration, one can specify whether there are side-effects (promise_pure), whether and Mercury predicate will be called from the C function (will_not_call_mercury) and whether the call is thread safe (for more details see the language reference). For libtommath, all calls are declared as pure, thread safe and without calls to Mercury.
An example for such a call is the divide_with_rem
predicate that calculates the
quotient and remainder of two mp_int
values.
divide_with_rem(A, B, Quot, Rem) :- ( is_zero(B) -> throw(math.domain_error("mp_int.quot_with_rem: division by zero")) ; mp_quot_rem(A, B, Result, Quot0, Rem0), ( Result = mp_result_okay, Quot = Quot0, Rem = Rem0 ; Result = mp_result_out_of_mem, error("could not initialize mp_int") ; Result = mp_result_invalid_input, throw(math.domain_error( "mp_int.quot_with_rem: could not compute quotient and remainder")) ) ).
The corresponding foreign predicate is defined as follows:
:- pred mp_quot_rem(mp_int::in, mp_int::in, mp_result_type::out, mp_int::out, mp_int::out) is det. :- pragma foreign_proc("C", mp_quot_rem(A::in, B::in, Result::out, Quot::out, Rem::out), [will_not_call_mercury, promise_pure, thread_safe], " int initResult1, initResult2, opResult; Quot = MR_GC_NEW_ATTRIB(mp_int, MR_ALLOC_ID); Rem = MR_GC_NEW_ATTRIB(mp_int, MR_ALLOC_ID); initResult1 = mp_init(Quot); initResult2 = mp_init(Rem); if (initResult1 == MP_OKAY && initResult2 == MP_OKAY) { opResult = mp_div(A, B, Quot, Rem); Result = opResult; } else Result = initResult1 != MP_OKAY ? initResult1 : initResult2; ").
Of course, this could be simplified a bit, but as the Mercury compiler creates C code that is given to a C compiler, there is an additional optimization step and superfluous variables will be optimized away.
Simple Boolean foreign predicates are even easier to declare:
:- pred is_zero(mp_int::in) is semidet. :- pragma foreign_proc("C", is_zero(A::in), [will_not_call_mercury, promise_pure, thread_safe], " SUCCESS_INDICATOR = mp_iszero(A) ? MR_TRUE : MR_FALSE; ").
Where MR_TRUE
/ MR_FALSE
indicate whether the foreign predicate succeeded or
not.
A program that uses the mp_int
type needs to be linked with a libtommath library
that is compiled using the redefined MALLOC/FREE macros. The command line for
mmc is as follows:
$ mmc --link-object libtommath.a $ADDITIONAL_PARAMETERS $PROGRAM
The OP of the original thread posted a program for the probabilistic Miller-Rabin primality test using integer types. For a number with ca. 2000 decimal digits, the integer version required 464s while the mp_int version required 0.81s, i.e., more than 2 orders of magnitude faster. For a more elaborate program for integer factorization based on Pollard's rho algorithm, the factor was around 14, for numbers around 60-80 decimal digits.
To summarize, creating a binding to an external C library to use from within Mercury (using C grades) is relatively easy. The most challenging part is probably the correct incorporation of the memory management, i.e., having both the Mercury heap as well as the external heap under control of the Boehm-Demers-Weiser garbage collector.
Further code reading on github.