Mathematics of Information Technology and Complex Systems


Homepage
 
Project Highlights
 
Milestones
 
Research
 
Sample Papers
 
Team Members
 
Partner Organizations
 
Students
 
Publications
 
Presentations
 
Events
 
Seminar Series
 
MITACS Home
 

Research

State of Knowledge in the Field

Symbolic algebra systems such as Maple and Mathematica have reached a remarkable degree of sophistication over the last twenty years. Difficult problems, such as exact integration of elementary functions and polynomial factorization, have been attacked with considerable success. These systems have incorporated many of the most important algorithms of the twentieth century, including the Fast Fourier Transform, lattice reduction with the LLL algorithm, Groebner bases, and the Risch decision procedure for elementary function integration. As a result, they can now effectively deal with large parts of the standard mathematics curriculum (and can outperform most of our undergraduates).

There is a strong argument that these mathematical software systems are the most significant part in a paradigm shift in how mathematics is done. Certainly they have become a central research tool in many subareas of mathematics both from an exploratory and from a formal point of view. Initially, the objective of symbolic algebra systems was to capture as much symbolic and exact mathematics as possible. As a basic algorithmic understanding was achieved, a second objective arose, namely, that of efficiency; how to avoid "intermediate expression swell". Most recently, the numerical computing environment has been extended to encompass arbitrary precision arithmetic and algebraic problems for approximate mathematical objects, and to deal in such an environment with the more usual questions of analysis. Basically one would like to be able to incorporate the usual methods of numerical analysis into an exact or infinite precision environment.

The problems are readily apparent and difficult. For example, how does one usefully define and implement arbitrary precision numerical quadrature? How does one deal with branch cuts of analytic functions? How does one deal consistently with logarithm? (Even this isn't completely worked out.) More ambitiously, how does one create a similar analysis for differential equations? The goal is to marry the algorithms of analysis and algebra with symbolic and exact computation, and have them execute efficiently. Sometimes this means we must first go back and speed up the core algebraic calculations.

One of the most important of these core calculations is symbolic linear algebra. While there has been tremendous progress in numerical linear algebra, i.e., with machine floating point numbers, over the past four decades, progress in symbolic linear algebra has been comparatively recent. Systems such as Maple are being asked to handle ever larger matrices over a multitude of domains. Naive algorithms fail dramatically through an inability to control expression swell, and through their inability to exploit structure in the input system. This project addresses these problems on a number of fronts. The LinBox library is a multi-national project (France, USA, Canada) and the first major attempt to address symbolic linear algebra for sparse matrices in a consistent way. Through generic programming techniques and sophisticated black-box algorithms it provides a flexible and important tool in dealing with these problems. A related concentration is on employing standard, highly optimized, numerical libraries (such as BLAS and ATLAS) for dense symbolic linear algebra. When dealing with hybrid symbolic- numeric problems, specifically for multivariate polynomials with approximate coefficients, we often encounter so- called "structured least squares" problems. We will investigate the application of new numerical technology for structured least squares problems to the domain of symbolic-numeric algorithms for polynomials.

Computer algebra systems rely on (literally) hundreds of individual developments of two varieties: mathematical development of new and faster algorithms and software design improvements. The great success of the symbolic algebra systems has been their mathematical generality and wealth of algorithms. Some of the areas to which the principals on this grant have contributed include:

  • modular algorithms for polynomial GCDs and resultants
  • polynomial factorization and Groebner basis computation
  • simplification of elementary functions (with branch cuts)
  • simplification of various special functions
  • intelligent interactive database of special functions
  • exact solution of ODEs
  • an algebraic solver for differential (ordinary and partial) equations
  • fast arbitrary precision arithmetic
  • high precision numerical quadrature
  • structured total least squares methods for approximate polynomials
  • integer relation algorithms and inverse symbolic computation (PSLQ)
  • finding all zeros of arbitrary analytic functions in arbitrary regions
  • solving systems of sparse linear diophantine equations
  • methods for accurate arbitrary-precision linear algebra
  • algorithms for linear algebra over finite fields
  • normal forms of linear systems of differential and difference operators

Each of these areas continues to offer many challenges.

Project Objectives

The emphasis of the various projects is the development and implementation of software for finding exact as opposed to numerical solutions to mathematical problems. This is an overlap of computer science with pure mathematics, requiring very deep analysis, and consequently is of later development. Many pure researchers develop software tools for their specific needs. These efforts often overlap because of the narrowness of applicability of these tools and the education required for their specific use. The investment required to adapt these tools for wider applicability is huge; however, the importance and benefits of providing such good working tools to the general mathematical community is difficult to overestimate. This is an expanding area of expertise, offering considerable opportunity in both academic and industrial environments.

Objectives of the projects in the MOCAA consortium may be summarized as follows:

  • new algorithm development for the various problems
  • software development in our various projects
  • testing of products in actual research
  • interaction with industrial suppliers of software
  • delivery of products to general users
  • educating users on availability and function of products
  • providing a forum for interaction of researchers

Methodology

Canada's position as a major participant in the development of mathematical software stems from the early eighties and the very successful creation of Maple at the University of Waterloo. Systems such as Maple, and its main competitor Mathematica, and the related numerically-oriented system Matlab, are now the primary research and development tools in mathematics. These are now substantial sized businesses, with Maple employing over one hundred personnel. Even at its current size, Maple interacts closely with the research groups at the the three principal sites of this proposal. Indeed these labs are Maple's primary source of both leading-edge research and prototype development. This unique synergistic relationship continues to foster a thriving computer algebra community in Canada.

The principal methodology of this proposal is a greater emphasis on the analytic properties of symbolic mathematical objects. While current systems deal most successfully with algebraic objects, many serious applications require the solution of analytic problems connected with definite integrals, series and differential equations. All the elementary notions of analysis, such as continuity and differentiability, have to be given precise computational meaning. The first challenge involves representing the mathematical objects and their analytic properties in a computationally feasible way. This must be followed by algorithmic developments to allow the handling of problems such as the analysis of functions given by programs, the automatic simplification of complicated analytic formulae and the recognition of when two very different mathematical expressions represent the same object. In recent years, a great deal of success has also come by mixing numeric and symbolic (exact and inexact) methods. This can arise both in the context of approximate components on mathematical objects (such as polynomials with approximate coefficients), and in the use of arbitrary (but ultimately inexact) precision to establish exact results.

One of the debates that has played out in symbolic computation since its inception is that of compiled library and kernel code versus code written within the computer algebra system's (generally interpreted) language. The balance chosen between these two approaches varies dramatically between different systems. Recent advances in programming languages for computing with generic and late-binding types, along with new external interfaces into the Maple object space open another possibility of writing external libraries which tightly couple with the Maple system. A dynamically instantiated library such as LinBox could directly work within the Maple memory model, and apply its generic algorithms to the sophisticated types Maple provides. The approach offers a tremendous opportunity both for the high-performance it will provide, and as an examination of this approach to library implementation.

The problems under investigation are determined in conjunction with Maplesoft Inc. on the basis of our common interests. As a corporation, Maple is comfortable with and understands that it will get the best product from people doing what most interests them. In this sense it is a very satisfactory partnership.

At each of the three major sites of this consortium, we run weekly meetings and seminars. As well, symbolic computation is well established in the curricula at the University of Waterloo, Simon Fraser University, and the University of Western Ontario, with regular graduate and undergraduate course offerings.

Differential Equations Project

Leaders:Dr. Robert Corless and Dr. Michael Monagan
Participants: E. Cheb-Terab, Greg Reid

Algebraic methods for finding exact solutions of differential equations

This project aims at developing new algorithms for finding exact solutions to ODEs, PDEs and systems of them, and secondly, tools for manipulating, simplifying, decomposing system of ODEs and PDEs (with algebraic constraints). We have published a number of papers relating to new methods for solving ODEs and PDEs and systems of them and have implemented and installed in Maple significant pieces of code for their treatment.

The determination of exact solutions to differential equations is a challenge. Few algorithms are known, and they work mostly for linear equations and provided the solution is Liouvillian and the equation has rational coefficients. For the linear case not admitting Liouvillian solutions, few algorithms are known. For the non-linear case, the methods based on finding point symmetries or restricted forms of integration factors cover a quite small portion of the problem (albeit, this portion is the one we see being used in modelling, most probably because the related differential equations are actually the only ones we know how to solve exactly). The main motivation here is that, having more general algorithms for finding exact solutions, implemented in a user-friendly CAS environment, will remove some of the present barriers researchers have when modelling real and theoretical problems.

The main techniques involved in this project are: symmetries, integrating factors, formulation of equivalence problems for the non-linear case and determination of (as general as possible) hypergeometric and Heun type solutions for the linear case. All these methods are considered in the framework of both ODEs and PDEs.

Simplification and Special Functions Project

Leader: Dr. Michael Monagan
Participants: J. Carette, R. Corless, D. Jeffrey, E. Cheb-Terrab

Simplification is perhaps the most important capability of a general purpose computer algebra system, and is often accomplished by recognizing and manipulating special functions. The goal of this project is to extend the database of known special functions in Maple and to develop tools for reliably simplifying formulae containing the elementary functions, special functions, sums and integrals.

Simplification involves identifying algebraic relationships between constants and functions present in the input. This problem is partly organizational: how do we represent all known algebraic relationships between elementary and special functions such that that all these relationships will be found? Some of these identities may only be valid only on a certain domain, and hence we must deal with branch cuts. Ultimately, application of one identity may lead to a more compact result than another identity, and this must be more carefully quantified and effective techniques developed to determine this. These points are considered more specifically as follows.

i) Knowledge database of elementary and special functions.

We are developing support for new special functions in Maple, e.g., the family of Heun functions, and new special functions in mathematics, e.g., the Lambert W function. This requires being able to evaluate these functions numerically in the complex plane, generate series expansions about all points, including infinity, and to simplify formulae involving related functions. Thus, our goal is to develop an algorithmically accessible knowledge database which supersedes the standard tables of Abramowitz & Stegun (1964) and Gradshteyn & Rhyzik (1980) and is in a sense complementary to the new Digital Library of Mathematical functions.

We need to build the database and access algorithms in such a way that the algorithms can determine algebraic relationships between different functions present in the user input from the database. Information about series expansions, differential equation representations, etc. should also be available in the knowledge database. This leads naturally to the idea of an advisor program which will assist the user in moving from one representation of a function to an equivalent one and accessing the knowledge database.

ii) Simplification of expressions involving cases and branch cuts.

The identity is not valid for all x and y in the complex plane. The reason for this is that in all Scientific Computation Software, the function denotes the Principal-Branch logarithm. A discussion of why this is so has been published recently by Jeffrey & Norman, SIGSAM Bulletin, Vol 38, p 57-67 (2004). Thus, a problem arises in the algorithmic simplification of expressions containing , and . The approach we have taken in previous work is to correct the identity by adding a correction term, in this example, introducing the unwinding number K(z) defined so that is valid for all complex x and y.

The problem has several aspects: (i) identifying and fixing previously implemented algebraic identities that are valid on certain domains only, (ii) providing mechanisms for the user to tell the computer algebra system information about x and y so that the system can decide, e.g., if , and thus remove the unwinding number from the output, and (iii) developing algorithms in analysis to determine where the identity is valid. Recent work on this, and its connection to the OpenMath standard for representing elementary functions, has been published in the Annals of Mathematics and Artificial Intelligence (2002).

iii) Algorithmic simplification of expressions with multiple representations.

As an example, consider the problem of representing a polynomial using a monomial basis 1, x, x2,..., or using a Chebyshev basis T0(x)=1, T1(x)=x, T2(x)=2x2-1, etc. Now consider the polynomials (a) x10000-2x500+1 and (b) T10000(x)-2T500(x)+1. Clearly, the Chebyshev basis is preferred for this polynomial because of the demonstrated sparsity in that basis while the representation in the monomial basis would be dense, with 10,000 terms.

Based on ideas from Kolmogorov complexity, we have developed a bi-modal theory for measuring the complexity of a representation for a formula which gives a clear cutoff point so that a simplifier can algorithmically decide which of two or more representations to use. The goal now is to compute some of these cutoff points and build a simplifier based on this theory and to design efficient algorithms which find a compact representations when such a representation exists. Many problems of this kind arise in the daily usage of computer algebra systems.

Number Theory

Leaders: Dr. Peter Borwein, Dr. Ron Ferguson

a) One component of the project (in conjunction with Apple Canada interns) involves very large scale grid based computational research. The tools and problems live at the interface of computational number theory and combinatorics. We are in the process of commissioning, in the research centre IRMACS, a 150 node G5 based Xgrid cluster (again with Apple Canada support).

A long term goal is to marry the grid technology with symbolic manipulation packages seamlessly. In the short run we are using both technologies to attack some classical mathematics problems like the "Merit Factor" problem of Golay.

b) We have also been involved in significant pieces of number theory related development in Maple and elsewhere (specifically LLL, PSLQ and IDENTIFY). As a second component we are interested in potentially completely rebuilding and redesigning the now somewhat obsolete extant "numtheory" package.

Some of the most important and widely used Maple functions either live in this package or are essential to the integer arithmetic that underlies it. While the full scope of this large project remains to be mapped out some individual components are clear.

  • Implement a modern primality tester
  • Implement a state of art factoring package.
  • Improve code for discrete logarithms, using an index-calculus method or recent "Kangaroo" version of the Pollard rho algorithm.

Symbolic Linear Algebra

Leader: Dr. Mark Giesbrecht
Participants: Robert Corless, George Labahn, Arne Storjohann and Stephen Watt

i) The LinBox Generic Library for Symbolic Linear Algebra:

The goal of the LinBox project is to address the many challenges in symbolic, exact and arbitrary precision linear algebra which arise in modern symbolic computation algorithms. Our approach is a high-performance component library, targeted at solving large, sparse and structured systems over symbolic and exact domains. This group of more than thirteen researchers (and many more postdocs and graduate students), including

  • B. Caviness, E. Kaltofen, A. Lobo, C. Meyer, B.D. Saunders, Q. Xiang (USA)
  • J.-G. Dumas, T. Gautier, J.-L. Roch, G. Villard (France)
  • W. Eberly, M. Giesbrecht, A. Storjohann (Canada)

is implementing this library in C++, with a specific eye towards generic data types, and flexible sparse and black- box matrix representations. Some of the current work is described at http://www.linalg.org. This work is supported by the NSF under their Information Technology Research (ITR) program, and by the CNRS in France. One of our goals under this current proposal is to provide an interface to LinBox from the Maple Computer Algebra system. LinBox's ability to work effectively over generic domains, and in particular the rich algebraic expression domains of Maple, will make it ideal in this role.

Software libraries for symbolic linear algebra have not been developed and integrated to the same degree as for numerical computations. Only in the past decade have algorithmic techniques and (mainstream) programming language developments allowed us to even consider such a project. The LinBox approach is to parameterize algorithms both by the underlying domain and by the matrix and vector classes. Matrices are generally black boxes, opaque functions evaluating matrix-vector products, and the algorithms are descendants of the Krylov-type iterative numerical algorithms (albeit quite radically modified to be effective over finite fields). A guiding principle of the design of LinBox is that this genericity should not come at the expense of performance. Our initial experience has supported this goal.

Many of the contexts in which large, sparse symbolic matrices occur involve matrices over very small finite fields, such as the integers modulo two, or modulo a word-sized prime. Applications include algorithms for discrete logarithm computations and integer factorization, but more generally matrices over small finite fields form the basis for many lifting algorithms for basic linear algebra in symbolic domains. Small fields are in fact a difficult case for many black-box algorithms. Eberly, Giesbrecht and Hovinen are considering a variant of the classical Lanczos algorithm, known as the bi-orthogonalizing block Lanczos method. The analysis and robustness of this has been explored by Eberly and in the Masters thesis of Hovinen. A full implementation of this method is being completed in LinBox, as well as the more heuristic (but very fast) block Lanczos variant of Peter Montgomery.

ii) Efficient Linear Algebra with (Sparse) Integer Matrices:

The goal of this sub-project is to design and implement new algorithms specifically for the exact solution of linear algebra problems on integer matrices. The focus is on input matrices with large dimension and possibly with some additional features such as sparsity or small entry size. Very recently invented techniques such as the shifted number system (for obtaining certified exact results using approximate arithmetic) and high-order lifting and integrality certification (for incorporating matrix multiplication) have led to reductions to integer matrix multiplication (and hence theoretically nearly optimal) algorithms for integer linear system solving and determinant computation. Our goal is twofold. First, we wish to extend these techniques to computing other invariants such as the rank, diophantine solutions and the Hermite and Smith canonical form. Second, we wish to develop practical variants of these algorithms that are suitable candidates for implementation. The main technique to achieve the second goal is to exploit the highly optimized and portable software libraries GMP (for large integer arithmetic) and ATLAS/BLAS (for numerical linear algebra). Our approach is to develop variants of the integer matrix algorithm whose dominant cost can be expressed in terms of small-integer matrix multiplications, which can be implemented using Level 3 BLAS.

Recently we have also found algorithms for sparse integer matrices which require a sub-cubic number of machine operators for solving sparse linear systems over the integers. These algorithms use standard matrix and integer arithmetic, and are relatively space efficient. We expect that these algorithm will perform well in practice and will pursue their implementation as well other improvements. Similar techniques may also apply to polynomial matrices.

iii) Structured Total Least Squares for Approximate Polynomial Computation:

The structured total least squares (STLS) problem finds the nearest singular matrix to a given matrix, which also has the same "structure" as the input. This is important for our purposes, as it is straightforward to formulate many problems on polynomials with approximate (floating-point) coefficients as STLS problems. While there is no known algorithm which guarantees a globally best solution except in special cases, there are recent effective heuristics, which often achieve very good results in practice.

Consider for example, the problem of finding the "approximate GCD" of two (possibly multivariate) polynomials, that is, finding nearby polynomials which have a non-trivial GCD. This can be recast as finding the nearest singular resultant matrix. The structure of such resultant matrices can be defined as any linear combination of a fixed set of basis matrices. Recent approaches have involved using the singular value decomposition (SVD) on the resultant matrix (sometimes followed by some form of iterative improvement), but this does not preserve structure, which must somehow be recovered.

Many other approximate polynomial problems can be recast as STLS problems: approximate factorization (of multivariate polynomials: Gao, Kaltofen, May, Yang & Zhi, 2004), approximate decomposition of polynomials (Giesbrecht, Kaltofen & May 2004) and approximate division. We have also begun considering applications to approximate ordinary differential operators (sometimes called Ore operators). Some of these problems have been considered in the M.Math. thesis of Botting (2004) at the University of Waterloo. Structured linear algebraic algorithms in support of solutions of these problems are being considered in the Ph.D. thesis of Amiraslani at Western. Botting, Giesbrecht and May are developing a Structured Total Least Squares package in Maple, based on the Riemannian SVD and perhaps other algorithms (such as the Structured Total Least Norm), which can easily be employed for approximate polynomial problems. While we do not anticipate, at least initially, that this will lead to the fastest algorithms available for approximate polynomials, such a package will provide a general purpose tool for quickly building numerically robust algorithms for structured problems. We are also exploring the robustness, performance and limits of this approach, and how its specialization to particular structures can in fact yield very fast and possibly provably stable algorithms.

iv) Systems of linear differential and recurrence equations:

A central component in many computational approaches to solving systems of linear differential and recurrence equations, as well as differential-algebraic systems of equations, is to reduce the system to a canonical form. Ore (or skew) polynomials effectively capture both differential and difference operators as a ring of non-commutative polynomials. Differential polynomials are defined over a so-called differential field K such that for all a in K, , where a' results from the application of a "derivation" operator to a (e.g., if K=C(t), this derivation might be standard differentiation with respect to t). Such polynomials act as linear differential operators on K. Similarly, difference polynomials are such that , for a in K, where is an automorphism of K (e.g., when K=C(t), the shift is one such automorphism).

Reducing systems of Ore operators into a normal form is a common first step in their solution by a closed form or special function. Many classical normal forms for commutative linear algebra, such as the Smith, Hermite and Frobenius forms, have well-established analogues over the Ore polynomials. However, these canonical forms can also lead to much higher degrees within the system, corresponding to much higher orders in the scalar differential equations which must be solved. Giesbrecht, Labahn and Zhang have recently defined the Popov normal form for Ore operators as a low-degree alternative. This is an analogue of the weak-Popov form of a polynomial matrix, and in some sense is equivalent to a reduced (integer) lattice in the sense of the LLL algorithm. Prototype algorithms to reduce to the differential-Popov form have been developed and implemented, though considerable work needs to be done.

Symbolic and Numeric Integration and Summation

Leaders: Dr. Keith Geddes, Dr. George Labahn, and Dr. Michael Monagan

i) Definite Integration and Summation:

Significant progress has been made in the past 10 years in finding closed form solutions of definite integrals, particularly those found in Handbooks of Integrals. The primary method used in these case is based on conversion to Meijer G functions. However more work remains to be done in order to solve all such problems. In addition, there is a more general approach recently proposed by Salvy (2000), based on convolutions of holonomic functions combined with Mellin transforms which shows great promise to extend the type of closed form solutions that can be determined. We propose to study and extend Salvy's method using different integral transforms. In addition, we plan to look closely to the conditions under which answers are valid. This detail will be particularly important when integrands have symbolic parameters, in which case the validity conditions will be functions of these parameters. Similar issues exist in the context of definite summation.

ii) Indefinite Integration:

Computer algebra systems have had some success in finding closed form solutions for integrals of elliptic functions. However there remain many open questions, particularly in the case of integrands expressed as elliptic functions in Jacobi form. For example, there are often multiple representations of integrands when these are expressed using more than one Jacobi function, with each representation providing a different closed form solution. In this case there is the issue of finding the simplest closed form solution, in particular for use in further computations (e.g. definite integration). Similarly, we propose to consider those cases where answers involve a minimal set of algebraic extensions, again making results more useful for further work.

iii) Symbolic-numeric algorithms for multidimensional integration:

Based on the recent PhD thesis of Chapman (2003) we propose to design and implement a computer algebra library for multidimensional integration, applicable to a wide variety of integrands. We expect that our novel techniques applied to structured families of problems are capable of breaking the "curse of dimensionality" that afflicts numerical multiple integration. For some specific families of integrands with special structure over hyper- rectangular regions, the thesis of Carvajal (2004) has already demonstrated the success of these techniques. We believe that it will be possible to extend the applicability of these methods to a much wider class of integrands by employing techniques such as transformation of variables (to map the region of integration onto a hyper-rectangle) and symmetrization of the integrand.

iv) The method of creative telescoping and hypergeometric terms:

Zeilberger's algorithm, also known as the method of creative telescoping, has been shown to be a very useful tool in a wide range of applications. These include finding closed forms of definite sums of hypergeometric terms, certifying large classes of identities in combinatorics and in the theory of special functions. One problem with the algorithm is that it might not terminate even though the expected output exists and can be computed. In this work, we try to fill in this gap of the algorithm to make sure that it returns the expected output in all cases.

We and Ha Le will also consider computing bormal and canonical forms of hypergeometric terms. This work is useful in various computer algebra algorithms operating on (q-) hypergeometric terms. Previously, we proposed four different canonical forms of a hypergeometric term. These forms allow one to represent a hypergeometric term efficiently, as each of them is optimal in some sense. We currently try to find a unifying approach to construct these four canonical forms, and to extend the result to the q-hypergeometric case.

We will also study algorithms which solve the additive decomposition problem of (q-) hypergeometric terms, and will try to propose a particular canonical form of rational functions which might provide efficiency improvement to these algorithms, and which might also prove the minimality of the decomposition for the q-hypergeometric case.