## Chapter 5: Data Repositories: Matrices and Arrays

### Of Arrays and Vectors

These are just two basic data structures in computer science, both of which are intended to be essentially data repositories.
Arrays include one-dimensional collections of data – also called Vectors – and
Matrices, which expand the first to two or more dimensions; whatever number of dimensions your applications might need.

Starting with Arrays, the key concept upon which arrays have a physical order is that of an indexing value. Languages vary, but here we will be referring to the index of the first array element as being 0, then next as 1, and so on as in the following figure; each of the boxes so indexed contain whatever you want them to contain, integers, structures, pointers to objects, pointers to other arrays, what have you: It is just an ordered set, ordered only by the layout of the set in the array via indexing, about the simplest way of organizing the contents of those sets for you to access.

Because of that indexing, if we named the whole thing Array in some programming language, we often refer to the first of the elements in that set as Array, the next as Array, and so on. If you have an integer value that you want to use to work though this array that you happen to called Index, any of them can be referred to as Array[Index] where Index could take on any of the integer values from 0 through N-1, where N is the number of elements the array.

As 2-dimensional array – a Matrix – is conceptually like the following: conceptually laid out as rows (horizontal) and as columns (vertically), here N rows by M columns. In a language like Java, if you intend to work down through the rows of a given column, it is the first of the two index values that gets advanced faster. So, as in the above figure, the first index value corresponds to the row, the second to the column. Still, it is just a 2-dimensional set of boxes, each containing some type of data of your choosing, no matter how you perceive it.

As an aside, there is a physical mapping concerning how these are really mapped into memory that you might want to be aware of. Some languages map adjacent elements of each column into contiguous memory. Others map the adjacent elements of individual rows into contiguous memory. It can matter to performance, so please look in Wikipedia under row- and column-major order for a more complete definition.

As one more quirk relating to Java &ndash a strictly object-oriented language – and a few other languages, the layout of memory is not exactly as in the above figure and is neither strictly row- or column-major. For those, a 2-dimensional matrix is really an array (1-dimensonal) of arrays (also 1-dimensional) as in the following figure: The leftmost array only contains the addresses of the first entry of the arrays to the right. To get to the element at – say – [1,2] of the matrix, the compiler generates code to find a pointer at index 1 of the left-most array, thereby finding the horizontal 1-dimensional array, and from there finds the entry at index 2. From your point of view, no matter how your matrix is physically laid out, this too is just 2-dimensional array. So, why does this matter? With the leftmost array representing pointers to the columns, it is the adjacent elements of a given row that are contiguous in memory. So, it can be faster to advance the index representing the column at the faster rate (while holding the row index constant) than the index representing the rows.

As mentioned earlier, each of those elements – those memory blocks – can contain pretty much anything that you want each to hold, physically as well as conceptually. They can, of course, hold primitive data types such as integers and floating-point values, but you can also place data structures and objects there as well. Each element is just contiguous blocks of bytes after all. Or you can have those array entries also contain pointers to those same structures and objects – with these objects containing whatever you like as in the following: Some languages – C/C++ comes to mind – allow you to do it either way. The data can be in the matrix elements themselves or can be in a separate object in memory addressed by the matrix elements. You decide. Others, like JAVA, even though this might be perceived in the languages as just a 2-dimensional array of objects, in reality what you get is what you see above. Fine, they are all matrices. And every single element of that array/matrix has the same datatype as every other.

### Matrix Operations: Addition, Subtraction, Multiplication

Let's suppose that we have two 2-dimensional arrays (A and B) both holding simple integers and we want to add them to produce a third (C) as in the following: What are we really doing? Nothing more than choosing one integer from corresopnding location in each – A and B – and adding those, placing the result into a corresonding element of the third C. Normally that means that for a given indexed element (row, column) into each matrix, we add the integers found there and put the result into the corresponding location in C. Same thing for subtraction. Pseudo-code for doing this follows: Matrix multiplication takes some additional thought, and you will be seeing code for doing so in a moment. The essence of it though is to multiply the elements of a row in matrix A by corresponding elements of a column in matrix B, add all of those together, and place the result of that into one element of matrix C. Each single element of C happens to allows be the intersection point of A's row and B's column. In our figure below, we will do that 16 (4 X 4) times.  For example, to find the result for element C[0,0], we work left through the columns of row 0 of A (in blue) and the rows of column 0 of B (also in blue), doing our multiply-adds. Follow the same procedure for the red columns and element. Of course, ultimately, every element of Matrix C must have been produced.

The two source matrices, though, do not need to be square matrices. A requirement is that the number of columns in A (the left-hand matrix) is equal to the number of rows in B (the right-hand matrix). So, for example, we could multiply a 2 X 3 matrix by a 3 X 2 matrix and get a 2 X 2 matrix. In general, given the multiplication two source matrices we can determine the size of the target matrix as in the following:

M by N * N by P → M by P

where the first represents the matrix size of the rows and the second the columns; row by columns.

Not mentioned yet is that you can multiply any size matrix by a single integer – a scalar. The effect is much like repeatedly adding the matrix to itself. Perhaps also of interest, we can multitple two matrices together and produce what amounts to a scalar, a single item matrix. 