'Experimental Mathematics and High-Performance Computing',
Recent developments in ``experimental mathematics'' have underscored the value of high-performance computing in modern mathematical research. The most frequent computations that arise here are high-precision (typically several-hundred-digit accuracy) evaluations of integrals and series, together with integer relation detections using the ``PSLQ'' algorithm. Some recent highlights in this arena include: (2) the discovery of ``BBP'-type formulas for various mathematical constants, including pi and log(2); (3) the discovery of analytic evaluations for several classes of multivariate zeta sums; (4) the discovery of Apery-like formulas for the Riemann zeta function at integer arguments; and (5) the discovery of analytic evaluations and linear relations among certain classes of definite integrals that arise in mathematical physics. The talk will include a live demo of the ``experimental mathematician's toolkit''.
Many parallel computational algorithms involve dividing the problem into several smaller tasks and running each task in isolation in parallel. Often these tasks are the same procedure over a set of varying parameters. Inter-process communication might not be needed, but the results of one task may influence what subsequent tasks need to be performed. I will discuss the concept of job generators, or custom-written tasks that generate other tasks and process their feedback. I would discuss this specifically in the context of integer factorization.
'Parallel Computation Tools for Research: A Wishlist',
'Disk-Based Parallel Computing: A New Paradigm',
One observes that 100 local commodity disks of an array have approximately the same streaming bandwidth as a single RAM subsystem. Hence, it is proposed to treat a cluster as if it were a single computer with tens of terabytes of data, and with RAM serving as cache for disk. This makes feasible the solution of truly large problems that are currently space-limited. We also briefly summarize other recent activities of our working group: lessons from supporting ParGAP and ParGCL; progress toward showing that 20 moves suffice to solve Rubik's cube; lessons about marshalling from support of ParGeant4 (parallelization of a million-line program at CERN); and experiences at the SCIEnce workshop (symbolic-computing.org), part of a 5-year, 3.2 million euro, European Union project. Our new distributed checkpointing package now provides a distributed analog of a SAVE-WORKSPACE command, for use in component-based symbolic software, such as SAGE."""),
'Interactive Parallel Supercomputing: Today: MATLAB(r) and Python coming Cutting Edge: Symbolic Parallelism with Mathematica(r) and MAPLE(r)',
"""Star-P is a unique technology offered by Interactive Supercomputing after
nurturing at MIT. Star-P through its abstractions is solving the ease of use
problem that has plagued supercomputing. Some of the innovative features of
Star-P are the ability to program in MATLAB, hook in task parallel codes
written using a processor free abstraction, hook in existing parallel codes,
and obtain the performance that represents the HPC promise. All this is
through a client/server interface. Other clients such as Python or R could
be possible. The MATLAB, Python, or R becomes the "browser." Parallel
computing remains challenging, compared to serial coding but it is now that
much easier compared to solutions such as MPI. Users of MPI can plug in
their previously written codes and libraries and continue forward in Star-P.
Numerical computing is challenging enough in a parallel environment,
symbolic computing will require even more research and more challenging
problems to be solved. In this talk we will demonstrate the possibilities
and the pitfalls.
'Tech X Corp.',
'Interactive Parallel Computing using Python and IPython',
Interactive computing environments, such as Matlab, IDL and
Mathematica are popular among researchers because their
interactive nature is well matched to the exploratory nature of
research. However, these systems have one critical weakness:
they are not designed to take advantage of parallel computing
hardware such as multi-core CPUs, clusters and supercomputers.
Thus, researchers usually turn to non-interactive compiled
languages, such as C/C++/Fortran when parallelism is needed.
In this talk I will describe recent work on the IPython project
to implement a software architecture that allows parallel
applications to be developed, debugged, tested, executed and
monitored in a fully interactive manner using the Python
programming language. This system is fully functional and allows
many types of parallelism to be expressed, including message
passing (using MPI), task farming, shared memory, and custom user
defined approaches. I will describe the architecture, provide an
overview of its basic usage and then provide more sophisticated
examples of how it can be used in the development of new parallel
algorithms. Because IPython is one of the components of the SAGE
system, I will also discuss how IPython's parallel computing
'Parallel computation of Grobner bases in the Weyl algebra',
The usual machinery of Grobner bases can be applied to non-commutative algebras
of the so-called solvable type. One of them, the Weyl algebra, plays the
central role in the computations with $D$-modules. The practical complexity of
the Grobner bases computation in the Weyl algebra is much higher than in the
(commutative) polynomial rings, therefore, calling naturally for parallel
computation. We have developed an algorithm to perform such computation
employing the master-slave paradigm. Our implementation, which has been carried
out in C++ using MPI, draws ideas from both Buchberger algorithm and
Faugere's $F_4$. It exhibits better speedups for the Weyl algebra in
comparison to polynomial problems of the similar size.
'James Madison University',
'MPMPLAPACK: The Massively Parallel Multi-Precision Linear Algebra Package',
For several decades, researchers in the applied fields have had access
to powerful linear algebra packages designed to run on massively
parallel systems. Libraries such as ScaLAPACK and PLAPACK provide a
rich set of functions (usually based on BLAS) for performing linear
algebra over single or double precision real or complex data.
However, such libraries are of limited use to researchers in discrete
mathematics who often need to compute with multi-precision data types.
This talk will cover a massively parallel multi-precision linear
algebra package that I am attempting to write. The goal of this C/MPI
library is to provide drop-in parallel functionality to existing
number theory and algebraic geometry programs (such as Pari, Sage, and
Macaulay2) while preserving enough flexibility to eventually become a
full multi-precision version of PLAPACK. I will describe some
architectural assumptions, design descisions, and benchmarks made so
far and actively solicit input from the audience (I'll buy coffee for
the person who suggests the best alternative to the current name).
Speaker('Marc Moreno Maza',
'Component-level Parallelization of Triangular Decompositions',
We discuss the parallelization of algorithms for solving polynomial systems symbolically by way of triangular decompositions. We introduce a component-level parallelism for which the number of processors in use depends on the geometry of the solution set of the input system. Our long term goal is to achieve an efficient multi-level parallelism: coarse grained (component) level for tasks computing geometric objects in the solution sets, and medium/fine grained level for polynomial arithmetic such as GCD/resultant computation within each task.
Component-level parallelism belongs to the class of dynamic irregular parallel applications, which leads us to address the following questions: How to discover and use geometrical information, at an early stage of the solving process, that would be favorable to component-level parallel execution and load balancing? How to use this level of parallel execution to effectively eliminate unnecessary computations? What implementation mechanisms are feasible?
We report on the effectiveness of the approaches that we have applied, including ``modular methods'', ``solving by decreasing order of dimension'', ``task cost estimation for guided scheduling''. We have realized a preliminary implementation on a SMP using multiprocessed parallelism in Aldor and shared memory segments for data communication. Our experimentation shows promising speedups for some well-know problems. We expect that this speedup would add a multiple factor to the speedup of medium/fine grained level parallelization as parallel GCD/resultant computations.
'UMass Boston / MIT',
'Structure and Representations of Real Reductive Lie Groups: A Computational Approach',
I work with David Vogan (MIT) on the Atlas of Lie Groups and Representations. This is a project to make available information about representations of semi-simple Lie groups over real and p-adic fields. Of particular importance is the problem of the unitary dual: classifying all of the irreducible unitary representations of a given Lie group.
I will present some of the main ideas behind the current and very preliminary version of the software. I will provide some examples also. Currently, we are developing sequential algorithms that are implemented in C++. However, because of time and space complexity we are slowly moving in the direction of parallel computation. For example, David Vogan is experimenting with multi-threads in the K-L polynomials computation module.
This talk is in memory of Fokko du Cloux, the French mathematician who, until a few months ago, was the lead developer. He died this past November.
'Parallel sparsening and simplification of systems of equations',
In a Groebner Basis computation the guiding principle for pairing and
`reducing' equations is a total ordering of monomials or of derivatives for
differential Groebner Bases. If reduction based on an ordering is replaced by
reduction to minimize the number of terms of an equation through another
equation then on the downside the resulting (shorter) system does depend on the
order of pairing of equations for shortening but on the upside there are number
of advantages that makes this procedure a perfect addition/companion to the
Groebner Basis computation. Such features are:
* In contrast to Groebner Basis computations, this algorithm is safe in the sense that it does not need any significant amount of memory, even not temporarily.
* It is self-enforcing, i.e. the shorter equations become, the more useful for shortening other equations they potentially get.
* Equations in a sparse system are less coupled and a cost effective elimination strategy (ordering) is much easier to spot (for humans and computers) than for a dense system.
* Statistical tests show that the probability of random polynomials to factorize increases drastically the fewer terms a polynomial has.
* By experience the shortening of partial differential equations increases their chance to become ordinary differential equations which are usually easier to solve explicitly.
* The likelihood of shortenings to be possible is especially high for large overdetermined systems. This is because the number of pairings goes quadratically with the number of equations but for overdetermined systems, more equations does not automatically mean more unknowns to occur which potentially obstruct shortening by introducing terms that can not cancel.
* The algorithm offers a fine grain parallelization in the computation to shorten one equation with another one and a coarse grain parallelization in that any pair of two equations of a larger system can be processed in parallel. In the talk we will present the algorithm, show examples supporting the above statements and give a short demo.