Bohemian Eigenvalues

14 minute read

A repository with some of the code described here can be found on GitHub: BHIME-Project.

Bohemian eigenvalues are the distribution of the eigenvalues of a random matrix. More specifically, this article explores matrices of low dimension (typically no larger than $10 \times 10$) whose entries are integers of bounded height. The name “Bohemian” is intended as a mnemonic and is derived from “Bounded Height Integer Matrix Eigenvalues” (BHIME). Here I present an overview of some preliminary results arising from this project.

The Bohemian Eigenvalue Project originated in 2014 while I was enrolled in a graduate course on experimental mathematics led by my PhD supervisor Rob Corless. One of the initial goals was to visualize the distributions of eigenvalues of random structured matrices. As the project evolved, many different types of distributions were explored, as I discuss here.

Before I jump into the details of the project, I present one of my favourite examples, and one of many interesting images that have come from this project. Consider a random $5 \times 5$ matrix where the entries are drawn uniformly from the set $\{-1, 0, 1\}$. This distribution contains $3^{25} = 847,\!288,\!609,\!443$ different matrices. This will be used as a running example throughout the remainder of this article. Figure 1 is a density plot of the $7,963,249$ distinct Bohemian eigenvalues from this set of matrices. Details on how this image was produced, and how I know the exact number of eigenvalues, can be found in the computation section.

All eigenvalues of 5x5 matrices with entries from \{-1, 0, 1\} — **Figure 1:** Density plot in the complex plane of all Bohemian eigenvalues from a random $5 \times 5$ matrix with entries drawn uniformly from the set $\{-1, 0, 1\}$. This image is viewed on $-3.3-3i$ to $3.3 + 3i$. A high resolution version can be found here.

Figure 1 leads to a few obvious questions about this particular Bohemian eigenvalue distribution:

What is the size of the regions where no eigenvalues appear? We call these eigenvalue exclusion zones.
How does this change if we look at $6 \times 6$ matrices instead?
Why is there a diffraction pattern appearing around the origin?
Why are some eigenvalue exclusion zones star shaped?
Why is the boundary around the distribution of eigenvalues not circular? This may be hard to see but it does flatten out at the top and bottom.

Unfortunately, I am currently unable to answer any of these questions.

Roots of Polynomials

Before I discuss Bohemian eigenvalues further, there is some closely related work on the distributions of the roots of polynomials. Consider all degree 19 monic polynomials with coefficients in $\{-1, 1\}$, these are known as Littlewood polynomials. $2^{19} = 524,\!288$ such polynomials exist. Figure 2 shows a density plot in the complex plane of the set of all roots.

All roots of monic degree 19 polynomials with coefficients sampled from \{-1, +1\}. — **Figure 2:** Density plot in the complex plane of the set of roots of monic degree 19 polynomials with coefficients in $\{-1, 1\}$. This image is viewed on $-1.7-1.7i$ to $1.7 + 1.7i$. A high resolution version can be found here.

Exploring the roots of polynomials can be restated as a Bohemian eigenvalue problem. Consider a $19 \times 19$ Frobenius companion matrix such that the values in the final column (or row depending on your preference) are sampled uniformly from the set $\{-1, 1\}$. Clearly, $2^{19}$ such matrices exist, and the set of characteristic polynomials of these matrices is equivalent to the set of monic degree 19 Littlewood polynomials mentioned previously. The Bohemian eigenvalues from this distribution of matrices is equivalent to the distribution of the roots of the degree 19 Littlewood polynomials. Hence, we can restate any problem that explores the roots of polynomials as a Bohemian eigenvalue problem by simply using the companion matrix.

Some of the key people who have studied the roots of such polynomials, and related polynomials include: Dan Christensen, John Baez, and Peter Borwein. Numerous references include:

Computation

Exploring Bohemian eigenvalues can be difficult as the eigenvalues of a large number of matrices (typically more than 1 million) must be computed before the distributions of these problems begin to converge. Numerous techniques were used in the exploration of the distributions and their properties.

Make Your Own Bohemian Eigenvalue Plot

Writing a Matlab script to sample random matrices and compute their eigenvalues can be done in a few simple lines. Using the same example as in Figure 1, the following Matlab script will compute the eigenvalues of 1 million random $5 \times 5$ matrices where the entries are sampled from $\{-1, 0, 1\}$:

% A matrix for storing the eigenvalues
eigVals = zeros(1e6, 5);

for i = 1:1e6
    % Generate a random matrix
    A = randsample([-1, 0, 1], 25, true);
    A = reshape(A, [5, 5]);

    % Compute and store the eigenvalues
    eigVals(i, :) = eig(A);
end

% Convert the eigVals matrix to a vector
eigVals = reshape(eigVals, [5e6, 1]);

The eigVals vector now contains 5 million complex values representing eigenvalues from this class. Creating a density plot in the complex plane of these 5 million complex numbers can be done in 2 lines of code thanks to the bivariate histogram function (hist3) in Matlab.

d = hist3([imag(eigVals), real(eigVals)], [1000, 1000]);
imagesc(log(d));

And just like that, in only 9 lines of Matlab code, you have your very own Bohemian eigenvalue plot (almost as nice as Figure 1). Here is an example of what the output will look like:

If you are feeling adventurous (and you have the Parallel Computing Toolbox installed in Matlab), you can change that for loop to a parfor loop for a nice speedup. Or you could try changing the colormap (try colormap hot).

A General Purpose Matlab Program for Producing Bohemian Eigenvalue Images

If you can produce a Bohemian eigenvalue plot in 9 lines, why does my implementation contain thousands of lines? There are a few reasons:

The code I have written has been interfaced such that it is easy use when quickly exploring various classes of matrices.
I have optimized the code to be (probably) as fast as you can get pure Matlab code to perform this computation.
My implementation automatically produces a readme file with information about the image produced (this can be extremely useful when quickly exploring lots of classes of matrices and forgetting to keep track of what they are).
What if you want to explore more eigenvalues than what you can hold in memory? My implementation can take care of that.

Achieving all of that in a user-friendly package is by no means easy.

When 847 Billion Matrices Is Too Much for Matlab

You may have noticed that Figure 1 is the result of computing the eigenvalues of 847 billion matrices. This corresponds to 1,839,102 unique characteristic polynomials and 7,963,249 unique eigenvalues. I will discuss how I determined the exact number of unique characteristic polynomials and eigenvalues in a moment.

If you take the small script above, and modify it slightly to ensure all possible matrices are explored, and try to run it, you will very quickly run out of memory. In fact, it would require over 60TB of memory to store all the eigenvalues. Even if you broke the problem into chunks that are small enough to fit into memory, the time needed to finish this computation would be enormous (estimated to be over 3 months on my computer).

Symmetries

There are of course symmetries within the set of matrices that can be used to reduce the number of matrices one must explore. Some of the obvious symmetries include:

Both $A$ and $-A$ are in the set of matrices, and the eigenvalues of $A$ are the negative (reflection across the imaginary axis) of the eigenvalues of $-A$.
$A$ and $A^T$ have the same eigenvalues.
Any upper or lower triangular matrices have their eigenvalues along the diagonal; the remaining values in the matrix are irrelevant.
Let $P$ be a permutation matrix. $PAP^{-1}$ will have the same eigenvalues of $A$ and also be in the original set.
For any class of matrices where the entries are real-valued, the set of eigenvalues will be symmetric across the real axis as a consequence on the complex conjugate root theorem. This symmetry could reduce the number of eigenvalues stored by (approximately) half.
For any class of matrices where the entries are sampled from a symmetric set (if the set contains $x$, it must also contain $-x$), the set of eigenvalues will be symmetric across the imaginary axis. This symmetry could also reduce the number of eigenvalues stored by (approximately) half.

These symmetries, and the many more that likely exist, are interesting in themselves, but provide very little advantage computationally.

Distributions of Characteristic Polynomials

You may have noticed that I quoted the exact number of characteristic polynomials as 1,839,102, which you may also notice is much less than the 847 billion matrices we started with. In fact, computing the roots of 1.8 million polynomials can be done in less than a minute on (most) modern computers. One other important piece of information is the exact distribution of these characteristic polynomials within the original set of matrices. That is, how many times each characteristic polynomial appears. From this distribution, Figure 1 can be reproduced exactly!

The next obvious question is how did I find the distribution of characteristic polynomials? The answer is somewhat disappointing. I did an exhaustive computation of the characteristic polynomial for each of the 847 billion matrices, and stored them in a nice data structure. With the power of OpenMP with C++, gcc’s maximum optimization level, and a nice (unique) encoding of the characteristic polynomials as unsigned 64-bit integers, I was able to complete this task in under 6 hours on a desktop computer. The result is a 44.5 MB file of all characteristic polynomials and their densities. ~~This file is available on the GitHub repository for the project in the Data directory here.~~ The Characteristic Polynomial Database archives characteristic polynomial files.

Counting the Eigenvalues

Now that we know the exact distribution of the characteristic polynomials, we can count the exact number of distinct eigenvalues using symbolic computation. Clearly, with 1.8 million characteristic polynomials of degree 5, an upper bound on the number of eigenvalues is about 9 million. The exact number turns out to be a little smaller at 7,963,249. This is the result of some characteristic polynomials sharing roots, and some having multiple roots.

Counting the exact number of eigenvalues can be done as follows:

Let $S$ be the set of all characteristic polynomials from the class of matrices for which we want to count all distinct eigenvalues.
Let $S_{\text{sqrfree}}$ be the set of square-free factors from computing the square-free factorization of each element in $S$.
Let $G$ be a gcd-free basis of $S_{\text{sqrfree}}$.
The number of distinct roots in $S$ is the sum of the degrees of the polynomials in $G$.

Sounds simple enough, and luckily Maple has an implementation of the gcd-free basis algorithm. In practice, this works well when the number of polynomials in $S$ is small. Unfortunately, 1.8 million polynomials, as in the $5 \times 5$ matrix example, is not considered small when working in Maple.

In this case I resorted to using modular computation with my own gcd-free basis implementation in Maple. This implementation is by no means optimal; in fact, it could use a lot of improvement, but it was capable of determining (with high probability) that there are 7,963,249 distinct eigenvalues in the class of $5 \times 5$ matrices with entries sampled from $\{-1, 0, 1\}$. This computation did take slightly over a month to run, and was run modulo a number of different prime numbers including one on the order of $10^{18}$.

Symbolic methods are great, except that they are horribly slow. For counting distinct roots we are unfortunately forced to use symbolic methods as the numerical error in the eigenvalues makes it difficult to decide if eigenvalues are actually equal, or only nearly so.

Some Properties

Consider $n \times n$ matrices where the entries are drawn uniformly from the set $\{-1, 0, 1\}$. Table 1 contains a number of properties of this class of matrices for various values of $n$.

Matrix Size	Number of Matrices	Number of Characteristic Polynomials	Number of Eigenvalues
OEIS	A060722	A272658	A271570
$1 \times 1$	$3^1 = 3$	$3$	$3$
$2 \times 2$	$3^{2^2} = 81$	$16$	$21$
$3 \times 3$	$3^{3^2} = 19,\!683$	$209$	$375$
$4 \times 4$	$3^{4^2} = 43,\!046,\!721$	$8,\!739$	$24,\!823$
$5 \times 5$	$3^{5^2} = 847,\!288,\!609,\!443$	$1,\!839,\!102$	$7,\!963,\!249$
$6 \times 6$	$3^{6^2} = 150,\!094,\!635,\!296,\!999,\!121$	?	?

Table 1: Properties of the class of $n \times n$ matrices where the entries are sampled from $\{-1, 0, 1\}$. Links to the Online Encyclopedia of Integer Sequences (OEIS) are given.

Some Fun Examples

Diffraction Patterns

This is another one of my favourite examples (okay…they’re all my favourite) that leads to some interesting diffraction patterns in the resulting plot. This example looks at $5 \times 5$ matrices where the entries are drawn uniformly from the set $\{-20, -1, 0, 1, 20\}$. Figure 4 shows the resulting density plot of the Bohemian eigenvalues in the complex plane.

Eigenvalues of a sample of 100 million 5x5 matrices with entries drawn uniformly from the set \{-20, -1, 0, 1, 20\}. — **Figure 4:** Density plot in the complex plane of the Bohemian eigenvalues from a sample of 100 million random $5 \times 5$ matrices with entries drawn uniformly from the set $\{-20, -1, 0, 1, 20\}$. This image is viewed on $-55-45i$ to $55 + 45i$. A high resolution version can be found here.

Fibonacci Numbers

This example explores $5 \times 5$ matrices with entries sampled from the set of Fibonacci numbers up to (and including) 514,229. A sample of 500 million matrices was used. The resulting Bohemian eigenvalue image can be seen in Figure 5. One thing you may notice about this plot is the lack of symmetry across the imaginary axis. This is a result of the original set of values that the entries are sampled from not being symmetric. That is, given a matrix $A$, $-A$ is not in the set of possible matrices for this distribution.

Eigenfish

This is by far one of the most visually interesting classes that I have explored in this project. Here I will only present some of my favourite images from this class of matrices. I have decided to call these particular distributions “eigenfish” as many of the images resemble sea creatures (see Figure 7).

As an example, consider a random matrix of the form

\[\begin{bmatrix} 0 & 0 & 0 & U\\ -1 & -1 & 1 & 0\\ V & 0 & 0 & 0\\ -1 & -1 & -1 & -1 \end{bmatrix}\]

where $U$ and $V$ are drawn independently from a continuous uniform distribution on $(-5, 5)$. This is of course an infinite set of matrices, yet the eigenvalues from this set are bounded by the region seen in Figure 6. This region is the projection of the semi-algebraic set defined by the solutions of $f(\lambda, U, V) = \mathrm{det}(A - \lambda I) = 0$ onto the complex plane in $\lambda$.

Eigenvalues from a sample of 30 million matrices with 2 fixed, uniformly varying parameters. — **Figure 6:** Plot on the complex plane of the eigenvalues from a sample of 30 million matrices, viewed on $-3-3.7i$ to $3 + 3.7i$. This image is shaded based on the density of eigenvalues at each point (pixel) such that darker regions are of lower density.

Endless numbers of these images can be produced and Figure 7 presents 12 such images. Each image is the eigenvalues from the set of matrices where two entries (fixed location in matrix) vary uniformly between -1 and 1.

Several plots of the sets of eigenvalues from classes of 3x3 matrices where 2 entries are fixed and vary uniformly. — **Figure 7:** 12 Bohemian eigenvalue distributions from classes of $3 \times 3$ matrices where 2 entries (fixed) are sampled independently from a continuous uniform distribution on $[-1, 1]$.

Acknowledgements

There are a few people who have helped with the progress of this project so far. First, we would like to thank Daniel Lichtblau for verifying our findings for the number of unique characteristic polynomials for several classes of matrices, as well as for his suggestions for improving our method of counting characteristic polynomials. We also thank Erich Kaltofen for pointing out references to Tao and Vu for the asymptotic distributions of related problems. Finally, we would like to thank Sonia Gupta, Jonathan Brino-Tarasoff and Venkat Balasubramanian for their contributions to this project.

Steven Thornton