Neural Network for solving these linear algebra problems

Intro

There are several questions on this site about whether or not machine learning can solve specific problems. The answer (in my words) seems to be:

Yes, trivially, if you choose a model tailored to your specific problem, but you may also choose a model that cannot represent or approximate the correct hypothesis.

I would like to choose a neural network model where, a priori, all I know is that the target function is a linear-algebra kind of function.

The Problem

I have an unknown function $f:M(n) \to \mathbb{R}$, where $M(n)$ is the set of $n \times n$ matrices. This function is a linear-algebra kind of function rather than one that views $M(n)$ as an image. I would like to build a neural network to approximate $f$, and to increase my confidence that I have chosen a good architecture, I first want to verify that the architecture can solve a collection of known linear-algebra problems.

The Input

To start, there is the question of how I should handle arbitrarily sized matrix inputs.

My first idea is that the first $n^2$ inputs should store the matrix entries $m_{i, j}$ with $i, j \leq n$, so a $3 \times 3$ matrix fits within the first $9$ inputs. I would then drop out the remaining inputs.

The main problem with this approach is that, unlike in image recognition, entries equal to $0$ have important implications in linear algebra. A row of $0$s forces the matrix's determinant to be $0$, for example, so the network cannot treat padded zeros as neutral.

Another approach is to drop out input entries and pre-specified subsets of the hidden layers as well. If the maximum input size is a $p \times p$ matrix, then each hidden layer would have $kp^2$ neurons for some $k \in \mathbb{N}$, and for an $n \times n$ input only $kn^2$ of these neurons would be active.
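One way to sidestep the padded-zeros-vs-genuine-zeros ambiguity is to add an explicit mask channel alongside the padded matrix. This is a minimal sketch of that encoding, assuming a hypothetical maximum size `P = 5`; the names `encode` and `P` are my own, not from the question.

```python
import numpy as np

P = 5  # assumed maximum matrix size (p x p)

def encode(m):
    """Pad an n x n matrix to P x P and stack a 0/1 mask channel on top,
    so the network can distinguish padded zeros from genuine zero entries."""
    n = m.shape[0]
    padded = np.zeros((P, P))
    padded[:n, :n] = m
    mask = np.zeros((P, P))
    mask[:n, :n] = 1.0
    return np.stack([padded, mask])  # shape (2, P, P)

x = encode(np.eye(3))  # a 3x3 identity embedded in the 5x5 grid
```

With this encoding, a row of real zeros and a padded row look different to the network, because the mask channel is $1$ over the real entries and $0$ over the padding.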

The Linear Algebra Problems

I want to design a neural network architecture that can represent the following hypotheses:

  1. determinant
  2. max entry: $\max_{i, j \leq n}(m_{i,j})$
  3. Frobenius norm: $\sqrt{ \sum_{i, j \leq n} m_{i, j}^2 }$
  4. maximum row or column sum, e.g. over rows: $\max_{i \leq n}\left(\sum_{j \leq n} m_{i, j}\right)$
  5. largest singular value/eigenvalue
  6. trace
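For benchmarking an architecture against these six problems, the ground-truth labels are all one-liners in NumPy. A sketch of a label-generating function (the name `targets` is mine):

```python
import numpy as np

def targets(m):
    """Reference values for the six scalar problems, usable as
    supervised training labels for a candidate architecture."""
    return {
        "determinant": np.linalg.det(m),
        "max_entry": m.max(),
        "frobenius": np.linalg.norm(m, "fro"),
        "max_row_sum": m.sum(axis=1).max(),
        "largest_singular_value": np.linalg.svd(m, compute_uv=False)[0],
        "trace": np.trace(m),
    }

t = targets(np.diag([3.0, 1.0, 2.0]))
```

On a diagonal matrix the values are easy to check by hand, which makes such matrices convenient sanity tests for the trained model as well.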

I don't think these are too difficult, but it would be nice if the model could be easily adapted to include non-scalar outputs such as

  1. Matrix inverse
  2. Singular value decomposition
  3. LU Decomposition
  4. QR decomposition
  5. Eigenvectors

And other kinds of matrix decompositions.
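The non-scalar targets are likewise available off the shelf for label generation; a sketch using `numpy.linalg` (the test matrix is my own example, chosen to have eigenvalues $5$ and $2$):

```python
import numpy as np

# A small test matrix; its eigenvalues are 5 and 2.
m = np.array([[4.0, 2.0], [1.0, 3.0]])

inv = np.linalg.inv(m)               # matrix inverse
u, s, vt = np.linalg.svd(m)          # singular value decomposition
q, r = np.linalg.qr(m)               # QR decomposition
eigvals, eigvecs = np.linalg.eig(m)  # eigenvalues and eigenvectors
# LU is not in NumPy itself; scipy.linalg.lu(m) provides it if SciPy is installed.
```

Note that decompositions are only defined up to conventions (sign of $Q$'s columns, ordering of eigenvectors), so a training loss should compare reconstructions like $QR$ against $m$ rather than the factors directly.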

The Question

The weirdness of this problem, for me, comes in two parts. The first is how to handle arbitrarily sized inputs. The second is how to handle max-type problems and linear-algebra problems simultaneously. Max-pooling might help with the max-entry problem, but it might interfere with QR decomposition, and the true hypothesis might even be a composition of a max-type problem and a linear-algebra problem.
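To make the tension concrete: both kinds of target are exactly representable over the same flattened input, just with different read-out heads. A minimal NumPy sketch (the function name `heads` is mine):

```python
import numpy as np

def heads(m):
    """Two read-out heads over the same flattened input: a max-pool head
    computes the max entry exactly, while a fixed linear head whose
    weights select the diagonal positions computes the trace."""
    n = m.shape[0]
    flat = m.reshape(-1)
    max_head = flat.max()          # max-pooling over all entries
    w = np.eye(n).reshape(-1)      # linear weights: 1 on diagonal slots, 0 elsewhere
    linear_head = float(w @ flat)  # equals trace(m)
    return max_head, linear_head

mx, tr = heads(np.array([[1.0, 5.0], [2.0, 3.0]]))
```

This suggests one rule of thumb: keep a shared trunk and attach separate heads (pooling for max-type targets, linear/structured for algebraic ones), rather than forcing one pooling scheme on everything. Whether that trunk can also serve compositions of the two is exactly the open question here.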

Are there any rules-of-thumb or neural network architecture design tricks that can help me deal with this collection of problems simultaneously?

Tags: linear-algebra, machine-learning-model, model-selection, neural-network

Category Data Science
