
Chapter 6
Linear Transformations

Recall that Volume D discussed the linear array and its algebraic properties, and then applied arrays to geometry, producing position vectors. The common notations (x,y) and (x,y,z) are horizontal linear arrays that label an arbitrary point in the two-dimensional plane and an arbitrary point in three-dimensional space respectively. However, many of the following discussions use vertical linear arrays, which locate the same points in the plane and in space.
These vertical arrays may be denoted by the transposes (x,y)' and (x,y,z)' respectively (using the apostrophe in place of the superscript t).

Section 1:   Matrices as Functions

[1.1] (Linear expressions) A linear (homogeneous) expression in x and y is an algebraic expression of the form   αx + βy,   where α and β are any real numbers. Similarly, a linear (homogeneous) expression in x, y and z is an algebraic expression of the form   αx + βy + γz,   where α, β and γ are real numbers. The real numbers α, β and γ are called coefficients in those expressions.

The term "homogeneous" more accurately describes the linear expressions that will be discussed. But it will most often be omitted.

It is easy to extend this definition to include linear expressions in more variables, like x,y,z,w,... . However, this will not be done. Instead, discussions will involve only algebraic ideas that relate directly to the familiar geometric objects of point, line, plane and (3-dimensional) space. These four objects will be called linear objects in these discussions. They are to be distinguished from other geometric objects like circles, spheres, sine curves, helices,... .

In most discussions the coefficients of expressions will be integers, or, occasionally, rational numbers.

It is simple to find values for the linear expressions

(*)      5x + 6y    and    8x - 7y
if numerical values are given to x and y. If x and y are given values 3 and 4 respectively, then the expressions become
5(3) + 6(4) = 39    and    8(3) - 7(4) = - 4
Giving other values to x and y produces other pairs of numerical values for the two expressions. Form the horizontal arrays (3,4) and (39,-4). To the first array the linear expressions associate the second array. This action can be written
(3,4) --> (39,-4)
The expressions can also be applied to vertical arrays:
(3,4)' --> (39,-4)'
The reader can easily verify:
(1,-1)' --> (-1,15)',   (4,1)' --> (26,25)',   (α,β)' --> (5α + 6β, 8α - 7β)'
Consider these arrays as locating points in a plane. For each point (x,y)' in the plane the expressions assign a unique point (5x+6y,8x-7y)'. This action satisfies the definition of a function F, carrying points in a plane onto points in the same plane:
(**)      F(x,y) = (5x+6y,8x-7y)
Here,
F(3,4) = (39,-4),   F(1,-1) = (-1,15),   F(4,1) = (26,25)
It is convenient to use the same function F for vertical arrays:
F(3,4)' = (39,-4)',   F(1,-1)' = (-1,15)',   F(4,1)' = (26,25)'
Using the coefficients of the linear expressions (*), form a 2x2 matrix:

M = [ 5   6 ]
    [ 8  -7 ]

Think of M(x,y)' as the matrix product of a 2x2 matrix and a 2x1 matrix:

M(x,y)' = (5x+6y, 8x-7y)'

Compare this matrix equation with the definition (**) of the function F. Intuitively speaking, M and F do the same thing to points in the plane. More exactly, M and F perform the same action on the coordinates of every point in the plane. The only difference is notational: F involves horizontal arrays, but M involves vertical arrays.
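
The agreement between F and M can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper name mat_vec, the use of tuples for arrays, and the storage of M as a list of rows are all assumptions made for illustration.

```python
# M holds the coefficients of the linear expressions (*), one row per expression.
M = [[5, 6],
     [8, -7]]

def F(x, y):
    """The function (**): F(x,y) = (5x+6y, 8x-7y)."""
    return (5 * x + 6 * y, 8 * x - 7 * y)

def mat_vec(matrix, column):
    """Product of a matrix with a column array (the column is given as a tuple)."""
    return tuple(sum(a * v for a, v in zip(row, column)) for row in matrix)

for point in [(3, 4), (1, -1), (4, 1)]:
    # F acts on the horizontal array, M on the vertical one; the coordinates agree.
    assert F(*point) == mat_vec(M, point)
    print(point, "-->", F(*point))
```

The tuple does not record whether an array is horizontal or vertical, which is the sense in which the two notations differ only in appearance.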


There is a similar discussion for 3x3 matrices and points in space. Given the linear expressions,
x + 2y + 3z,   4x + 5y + 6z,   7x + 8y + 9z
the function F can be formed: F(x,y,z) = (x+2y+3z, 4x+5y+6z, 7x+8y+9z) and the matrix M that corresponds to F is:

M = [ 1  2  3 ]
    [ 4  5  6 ]
    [ 7  8  9 ]

The reader can verify that the product   M(x,y,z)'   =   (x+2y+3z, 4x+5y+6z, 7x+8y+9z)'.


The linear expressions
2x + 3y,   5x - 2y,   -4x + y
receive values for x and y and return values for the three expressions. Therefore, they carry points from a plane into space:
(x,y)' --> (2x+3y, 5x-2y, -4x+y)'
The function for this action is F defined by F(x,y) = (2x+3y, 5x-2y, -4x+y).   The matrix is

M = [  2   3 ]
    [  5  -2 ]
    [ -4   1 ]

For example, let M act on point (6,7)' and let F act on the same point (6,7):
M(6,7)' = (2(6)+3(7), 5(6)-2(7), -4(6)+1(7))' = (33, 16, -17)'
F(6,7) = (33, 16, -17).
By the matrix product a non-square matrix "changes" a column array of some length into a column array of a different length. The corresponding function carries an array of some length onto an array of a different length. Intuitively speaking, they both carry points in some dimension onto points in another dimension.

[1.2] (Corresponding functions) The corresponding function F for a matrix M is defined by F(x,y) = (linear expression in x and y with coefficients from row 1 of M, linear expression in x and y with coefficients from row 2 of M, ...). If three variables are involved, F(x,y,z) = (linear expression in x, y and z with coefficients from row 1 of M, linear expression in x, y and z with coefficients from row 2 of M, ...).
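
The row-by-row recipe in [1.2] can be sketched in Python. This is an illustration under stated assumptions: the name corresponding_function is hypothetical, matrices are stored as lists of rows, and the second function is called G in the code only to avoid reusing the name F.

```python
def corresponding_function(matrix):
    """Return F whose k-th coordinate is the linear expression with
    coefficients taken from row k of the matrix."""
    def F(*coords):
        if any(len(row) != len(coords) for row in matrix):
            raise ValueError("number of variables must equal number of columns")
        return tuple(sum(c * v for c, v in zip(row, coords)) for row in matrix)
    return F

# 3x2 matrix of the plane-into-space example: rows give 2x+3y, 5x-2y, -4x+y.
F = corresponding_function([[2, 3], [5, -2], [-4, 1]])
print(F(6, 7))

# 3x3 matrix of the space example (called F in the text, G here).
G = corresponding_function([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(G(1, 1, 1))
```

The number of columns fixes how many variables F accepts, and the number of rows fixes the length of the array it returns.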



Section 2:   Linear Transformations

Subsets of geometric points can form recognizable objects: lines, circles, squares, cones,... . Some of these objects are more fundamental to discussions here: points, lines, planes, and the entire 3-dimensional space. They may be called basic affine objects. They have dimensions 0, 1, 2 and 3 respectively. They are fundamental parts of the subjects of plane and solid geometry. If they contain special points called origins, then they are basic linear objects. The term "basic" will often be omitted in these discussions.

Since linear objects have origins, position vectors exist. These have been denoted by p,q,r, which locate points P,Q,R and can be expressed as   p = OP,   q = OQ,   r = OR.

[2.1] (Linear transformations) A function F from a linear object to a linear object is a linear transformation if and only if it satisfies the following two conditions:
  (a) F(p + q) = F(p) + F(q);     (homomorphism)
  (b) F(λp) = λF(p)     (homogeneous)
where p and q are any (position) vectors in the first linear object and λ is any real number.

Example:
Consider the function F from any linear object onto itself, defined by F(p) = 2p for every point P in the linear object. Then F(q) = 2q and F(r) = 2r for position vectors q, r, and so on. Intuitively speaking, F stretches any position vector to another vector pointing in the same direction but with twice the length. F satisfies both conditions (a) and (b) of [2.1]:
  (a) F(p + q) = 2(p + q) = 2p + 2q = F(p) + F(q).
  (b) F(λp) = 2(λp) = λ(2p) = λF(p).

The following is a generalization of the definition [2.1] of a linear transformation.

[2.2] Every linear transformation carries any linear combination of position vectors onto a linear combination of position vectors with the same coefficients.
Notation: If F is a linear transformation then F(λp + σq + ... + ωr) = λF(p) + σF(q) + ... + ωF(r).


In a linear object position vectors locate points, and to those points are "attached" linear arrays. Therefore, linear arrays can be represented by position vectors. If a position vector p in a plane locates a point P(x,y), then p = (x,y). Similarly, if p in space locates point P(x,y,z) then p = (x,y,z). As a result, definition [2.1] may be stated for linear arrays by replacing p and q by their equals (x1,y1) and (x2,y2):   F(p) = F(x1,y1) and F(q) = F(x2,y2). Similarly, if F is from space: F(p) = F(x1,y1,z1) and F(q) = F(x2,y2,z2).   (Here the two F's denote different functions because they involve different linear objects, namely plane and space.)

Example: The function F defined by F(x,y) = (x + y, x - y) is a linear transformation from a plane onto itself. It carries point (1,1) onto point (2,0) and point (5,3) onto point (8,2). It satisfies both conditions

(a)      F((x1,y1) + (x2,y2)) = F(x1,y1) + F(x2,y2)
and
(b)      F(λ(x,y)) = λF(x,y)                                  
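
Conditions (a) and (b) can be spot-checked on sample points. The sketch below is an illustration only: a check on finitely many samples is not a proof. The helper names, the sample points and the function shift (a deliberately non-linear comparison) are assumptions and do not come from the text.

```python
import itertools

def add(p, q):
    """Coordinate-wise sum of two arrays."""
    return tuple(a + b for a, b in zip(p, q))

def scale(lam, p):
    """Scalar multiple of an array."""
    return tuple(lam * a for a in p)

def satisfies_conditions(F, points, scalars):
    """Check conditions (a) and (b) of [2.1] on the given samples only."""
    additive = all(F(add(p, q)) == add(F(p), F(q))
                   for p, q in itertools.product(points, repeat=2))
    homogeneous = all(F(scale(lam, p)) == scale(lam, F(p))
                      for lam in scalars for p in points)
    return additive and homogeneous

def F(p):
    x, y = p
    return (x + y, x - y)

def shift(p):
    # Hypothetical non-example: x + 1 is not a homogeneous linear expression.
    x, y = p
    return (x + 1, y)

samples = [(1, 1), (5, 3), (-2, 7), (0, 0)]
scalars = [-3, 0, 2, 10]
print(satisfies_conditions(F, samples, scalars))      # True
print(satisfies_conditions(shift, samples, scalars))  # False
```

The function shift already fails condition (a) at p = q = (0,0), since shift(0,0) + shift(0,0) = (2,0) while shift((0,0) + (0,0)) = (1,0).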

The following functions F,G,H are also linear transformations:

F(x,y,z) = (2x + 3y + 4z, 4x - y - 3z, x + 6y + z),      G(x,y,z) = (4x - y + z, 2x + 7y + 5z),      H(x,y) = (x + y, x - y, 3x + 2y)
From the lengths of the arrays involved, it is easy to see that
F carries space into space;         G carries space into a plane;         H carries a plane into space.

[2.3] A function from a plane into a linear object is a linear transformation if and only if it carries an arbitrary point (x,y) onto a point whose coordinates are linear expressions in x and y. A function from space is a linear transformation if and only if it carries an arbitrary point (x,y,z) onto a point whose coordinates are linear expressions in x, y and z.



The pair of special unit arrays in a plane   (1,0) = i and (0,1) = j   as well as the triple of special unit arrays in space   (1,0,0) = i, (0,1,0) = j and (0,0,1) = k   play special roles with linear transformations.

Example: Suppose F is a linear transformation with F(1,0) = (4,5) and F(0,1) = (6,7). Onto what point does F carry the arbitrary point (x,y)? Notice that array (x,y) = x(1,0) + y(0,1). Then, by [2.2], F(x,y) = xF(1,0) + yF(0,1) = x(4,5) + y(6,7) = (4x + 6y, 5x + 7y). Therefore, F carries the arbitrary point (x,y) onto the point (4x + 6y, 5x + 7y). This fact completely defines and determines F.

[2.4a] (Linear transformation from the plane is determined by images of two special points) From the identity

F(x,y) = xF(1,0) + yF(0,1)
a linear transformation F from the plane into a linear object is completely determined by the two image points F(1,0) and F(0,1).

[2.4b] (Linear transformation from space is determined by images of three special points) From the identity

F(x,y,z) = xF(1,0,0) + yF(0,1,0) + zF(0,0,1)
a linear transformation F from space into a linear object is completely determined by the three image points F(1,0,0), F(0,1,0), and F(0,0,1).
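
The identities in [2.4a] and [2.4b] can be turned into a small Python sketch that rebuilds F from the images of the special unit arrays. The name from_images is hypothetical, and arrays are represented as tuples for illustration.

```python
def from_images(*images):
    """Build F from the images of the special unit arrays, using
    F(x,y) = xF(1,0) + yF(0,1), or the three-term identity in space."""
    def F(*coords):
        if len(coords) != len(images):
            raise ValueError("one coordinate is needed per unit array")
        length = len(images[0])
        # k-th coordinate of the sum x*F(1,0) + y*F(0,1) (+ z*F(0,0,1)).
        return tuple(sum(c * img[k] for c, img in zip(coords, images))
                     for k in range(length))
    return F

# The example above: F(1,0) = (4,5) and F(0,1) = (6,7).
F = from_images((4, 5), (6, 7))
print(F(1, 0), F(0, 1), F(2, 3))
```

Here F(2,3) = 2(4,5) + 3(6,7) = (26, 31), which agrees with the formula (4x + 6y, 5x + 7y).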

The expression xF(1,0) + yF(0,1) is actually an array whose coordinates are linear expressions in x and y. If F(1,0) = (α1, α2) and F(0,1) = (β1, β2) then

(#)        xF(1,0) + yF(0,1) = (α1x + β1y, α2x + β2y)
In a similar way, it is easy to show that if F(1,0,0) = (α1, α2, α3), F(0,1,0) = (β1, β2, β3) and F(0,0,1) = (γ1, γ2, γ3) then
(##)        xF(1,0,0) + yF(0,1,0) + zF(0,0,1) = (α1x + β1y + γ1z, α2x + β2y + γ2z, α3x + β3y + γ3z)
By eliminating or adding coordinates it is possible to produce further arguments supporting theorem [2.3].

The coefficients of the linear expressions in (#) and (##) can be collected into matrices M2 and M3:

M2 = [ α1  β1 ]
     [ α2  β2 ]

M3 = [ α1  β1  γ1 ]
     [ α2  β2  γ2 ]
     [ α3  β3  γ3 ]

[2.5a] (Linear transformations from a plane into some linear object) If F is any linear transformation from a plane into some linear object then F(x,y)' = M(x,y)' where M is the matrix whose columns are the vertical arrays F(1,0)' and F(0,1)':

M = [ F(1,0)'   F(0,1)' ]

Example. In the discussion just before [1.2] F was defined by F(x,y) = (2x+3y, 5x-2y, -4x+y). Then F(1,0) = (2, 5, -4) and F(0,1) = (3, -2, 1). These become the column arrays (2, 5, -4)' and (3, -2, 1)' in the corresponding matrix M:

M = [  2   3 ]
    [  5  -2 ]
    [ -4   1 ]

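[2.5b] (Linear transformations from space into some linear object) If F is any linear transformation from space into some linear object then F(x,y,z)' = M(x,y,z)' where M is the matrix whose columns are the vertical arrays F(1,0,0)', F(0,1,0)' and F(0,0,1)':

M = [ F(1,0,0)'   F(0,1,0)'   F(0,0,1)' ]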

Example. In the discussion before [1.2] F was defined by F(x,y,z) = (x+2y+3z, 4x+5y+6z, 7x+8y+9z) and the matrix M that corresponds to F is:

M = [ 1  2  3 ]
    [ 4  5  6 ]
    [ 7  8  9 ]

Then F(1,0,0) = (1, 4, 7), F(0,1,0) = (2, 5, 8), F(0,0,1) = (3, 6, 9), from which the column arrays (1, 4, 7)', (2, 5, 8)', (3, 6, 9)' in matrix M are formed.
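
The column construction in [2.5a] and [2.5b] can be sketched in Python. The name matrix_of is hypothetical, matrices are returned as lists of rows, and the space example is called G in the code only to avoid reusing the name F.

```python
def matrix_of(F, n):
    """Matrix whose k-th column is the image under F of the k-th special
    unit array; n is the number of variables of F."""
    units = [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]
    columns = [F(*u) for u in units]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def F(x, y):
    return (2 * x + 3 * y, 5 * x - 2 * y, -4 * x + y)

def G(x, y, z):
    return (x + 2 * y + 3 * z, 4 * x + 5 * y + 6 * z, 7 * x + 8 * y + 9 * z)

print(matrix_of(F, 2))   # rows of the 3x2 matrix
print(matrix_of(G, 3))   # rows of the 3x3 matrix
```

Reading the result row by row recovers the coefficients of the linear expressions, so the construction by columns and the construction by rows in [1.2] give the same matrix.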

Recall that if a function F carries a point P onto a point Q, and a function G carries point Q onto a point R, then the composite GF carries P onto R.

[2.6] (Composition of linear transformations) The composite of two linear transformations is a linear transformation.
Notation: if F and G are linear transformations, then GF is a linear transformation.

F carries a position vector p + q onto a position vector F(p + q). Then G carries F(p + q) onto GF(p + q). But GF(p + q) = G(F(p) + F(q)) = GF(p) + GF(q). Also GF(λp) = G(λF(p)) = λGF(p). Therefore GF is a linear transformation. As was mentioned earlier, linear transformations and their corresponding matrices behave the same way, except for notation. Therefore, the following statement comes as no surprise.

[2.7] (Composition and matrix product) The matrix that corresponds to the composite of two linear transformations is equal to the product in the same order of the matrices that correspond to the individual linear transformations.
Notation: If M is the matrix that corresponds to linear transformation F, and N is the matrix that corresponds to linear transformation G, then the (matrix) product NM of the two matrices is the matrix that corresponds to the composite GF of the two transformations, since G(F(p)) = N(Mp) = (NM)p for every vertical array p.
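
The correspondence between composition and matrix product can be checked on an example. In this Python sketch F is the plane-to-plane function of Section 1 and G has the formula of the function H given earlier. The helper names mat_vec and mat_mul are assumptions, and matrices are stored as lists of rows. With vertical arrays, F acts first (multiply by M) and G acts second (multiply by N), so the product formed is N times M.

```python
def mat_vec(A, v):
    """Product of matrix A with a column array v (given as a tuple)."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def mat_mul(A, B):
    """Matrix product AB; the number of columns of A must equal the
    number of rows of B."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("shapes do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M = [[5, 6], [8, -7]]            # matrix of F(x,y) = (5x+6y, 8x-7y)
N = [[1, 1], [1, -1], [3, 2]]    # matrix of G(x,y) = (x+y, x-y, 3x+2y)

NM = mat_mul(N, M)
p = (3, 4)
composite = mat_vec(N, mat_vec(M, p))   # GF(p): F acts first, then G
assert composite == mat_vec(NM, p)
print(NM, composite)

# mat_mul(M, N) would raise ValueError: a 2x2 matrix cannot multiply a 3x2 matrix.
```

In this example only one order of multiplication is even defined, which makes the role of the order easy to see.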