Introduction to Computational Engineering with MATLAB®
Introduction to Computational Engineering with MATLAB® aims to teach readers how to use
MATLAB® programming to solve numerical engineering problems. The book focuses on computa-
tional engineering with the objective of helping engineering students improve their numerical prob-
lem-solving skills. The book cuts a middle path between undergraduate texts that simply focus on
programming and advanced mathematical texts that skip over foundational concepts, feature cryptic
mathematical expressions, and do not provide sufficient support for novices.
Although this book covers some advanced topics, readers do not need prior computer programming
experience or an advanced mathematical background. Instead, the focus is on learning how to lever-
age the computer and software environment to do the hard work. The problem areas discussed are
related to data-driven engineering, statistics, linear algebra, and numerical methods. Some example
problems discussed touch on robotics, control systems, and machine learning.
Features:
• Demonstrates through algorithms and code segments how numeric problems are solved with
only a few lines of MATLAB® code.
• Quickly teaches the basics and gets readers started programming interesting problems as soon
as possible.
• No prior computer programming experience or advanced math skills required.
• Suitable for undergraduate students who have prior knowledge of college algebra and trigonometry
and are enrolled in Calculus I.
• MATLAB® script files, functions, and datasets used in examples are available for download from
http://www.routledge.com/9781032221410.
Tim Bower is an Associate Professor of Robotics and Automation Engineering Technology and
Computer Systems Technology at Kansas State University Salina. He received the B.S. Electrical Engineering
degree from Kansas State University (K-State) in 1987 and the M.S. Electrical Engineering degree from
the University of Kansas in 1990. He was a Senior Member of the Technical Staff at Sprint’s Local
Telephone Division from 1989 to 1998. From 1998 to 2003, he was a systems administration manager and
instructor at Kansas State University in Manhattan, Kansas, while taking graduate course work in
Computer Science. He joined the faculty of K-State’s campus in Salina, Kansas, in 2004. He teaches undergraduate
courses related to programming in C, Python, and MATLAB®, robotics programming, machine vision,
numerical computation, operating systems, data structures and algorithms, and systems administration.
Away from teaching, he enjoys spending time with his wife, three grown children, and five grandchildren.
Numerical Analysis and Scientific Computing Series
Series Editors:
Frederic Magoules, Choi-Hong Lai
Timothy Bower
Kansas State University Salina, USA
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks
does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The
MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
Contents
Preface
1 MATLAB Programming
  1.1 The MATLAB Development Environment
    1.1.1 Using the IDE
    1.1.2 How to get help
  1.2 Variables and Values
    1.2.1 Command Window Calculator
    1.2.2 Identifier Names
    1.2.3 Calling Functions
    1.2.4 Numeric Data Types
    1.2.5 Simple Arrays
    1.2.6 Clearing Variables
    1.2.7 Some Pre-defined Constants
  1.3 MATLAB Scripts
    1.3.1 Displaying Results
    1.3.2 Adding Sections
    1.3.3 Comments
  1.4 Input and Output
    1.4.1 Input Function
    1.4.2 Output Functions
      1.4.2.1 disp
      1.4.2.2 fprintf
  1.5 For Loops
    1.5.1 Code Blocks
    1.5.2 For Loop Syntax
    1.5.3 Colon Sequences
    1.5.4 Application of For Loops in MATLAB
    1.5.5 Fibonacci Sequence
    1.5.6 First Plot
    1.5.7 A Multi-line Plot
  1.6 Control Constructs
    1.6.1 Selection Statements
      1.6.1.1 If Construct
      1.6.1.2 Else
      1.6.1.3 Elseif
      1.6.1.4 Switch-Case Construct
      1.6.1.5 Example Selection Statements
    1.6.2 While Loop
    1.6.3 Example Control Constructs: sinc
    1.6.4 Continue and Break
      1.6.4.1 Continue
      1.6.4.2 Break
      1.6.4.3 Continue and Break Example
  1.7 Vectors and Matrices in MATLAB
    1.7.1 Matrix Generating Functions
    1.7.2 Scalar-Vector Arithmetic
    1.7.3 Element-wise Arithmetic
    1.7.4 Vector and Matrix Indices
      1.7.4.1 Ranges of Indices
      1.7.4.2 Accessing Data in Matrices
    1.7.5 Delete Vector or Matrix Data
    1.7.6 Linear and Logarithmic Spaced Vectors
  1.8 MATLAB Functions
    1.8.1 Syntax of a Function
    1.8.2 Calling a Function
    1.8.3 Example Function
    1.8.4 Function Handles
  1.9 Functions Operating on Vectors
    1.9.1 Replacing Loops with Vectorized Code
    1.9.2 Vectors as Input Variables
    1.9.3 Logical Vectors
    1.9.4 Sinc Revisited
  1.10 Importing Data Into MATLAB
    1.10.1 Saving and Loading Workspace Data
    1.10.2 Import Tool
    1.10.3 Reading Tables
    1.10.4 Dealing with Missing Data
    1.10.5 Exporting Table Data
  1.11 Text Strings in MATLAB
  1.12 Exercises
Bibliography
Index
Preface
Frank W. Boreham
1919
This book is for people that are curious about technology. It is for people
that must always ask, “How does that work?”, “What is under the hood?”,
and “Can I come up with a better solution?”. Such people are naturally drawn
to the fields of mathematics, science, and engineering. When the mind of these
people is in hot pursuit of a fresh idea, time is of the essence. Efficient tools,
proficiency, and knowledge are needed to prototype, simulate, visualize, and
analyze. Perhaps, the most obvious tool that they will deploy is the com-
puter. With the speed and storage capacity of modern computers along with
advanced computational software, your personal computer is a powerful en-
gineering tool. My hope in writing this book is that readers will press their
computers into the service of problem solving. We call this computational
engineering.
Computational engineering uses numerical computing, data analysis, visu-
alization, and software tools to model, analyze, and solve a variety of science
and engineering problems. Writing computer programs is merely a component
of the process toward the goals of computational engineering. Mathematics,
scientific knowledge, abstract reasoning, logic, and common sense are also
components of the process.
Successful professionals in all branches of science and engineering use
computers to their best advantage. Computers have strict rules about how they
are programmed, but that does not mean that effective engineers also need
to be computer scientists. A few software development environments that allow
users with limited programming experience to quickly solve a variety of
numerical problems and plot data have become particularly popular with
engineers. Modern software tools for engineers are not yet able to relax the
strict rules of programming, but they can accomplish some amazing things
with just a few lines of code. The primary software tool that we will use is
Matlab® from The MathWorks, Inc.
Think of the material in this book in the same way that you think about
long division. I’m glad that I learned how to do long division in elementary
school. But when presented with a division problem, I either do a rough
estimate in my head or I reach for a calculator. I seldom use pencil and paper
to do long division. In the same manner, engineers look to computers as a
tool for solving engineering problems as much as or more than pencil, paper, or
even handheld calculators. Computers allow us to work quickly and accurately
and to produce impressive results. Just as it was important to learn about long
division in elementary school, we need to discuss some mathematics concepts
that are used in computational engineering. But the mathematics that we
cover will directly apply to problems and will be solved by computers.
Some of the algorithms that we will take advantage of are quite advanced,
but are pre-developed and available for our use. To effectively apply these
algorithms, we need to understand the basic concepts of what they do and
the relationship of the input variables to the output. So we strive for a balance.
Some basic knowledge about the algorithms is needed. If I can write a simple
implementation of an algorithm, then I really understand it and can properly
apply it to problems. However, it is not necessary to develop implementations
of algorithms at the level of robustness, numerical accuracy, and speed as the
functions found in Matlab.
problems. The books that I found that address the material I wanted to cover
are at the graduate school level, which would frustrate undergraduate stu-
dents. So I decided to write notes for the course that would cut a middle
path. Although the course covers some advanced topics, students do not need
prior computer programming experience or advanced math skills. Instead, the
focus is on learning how to leverage the computer and software environment
to do the hard work.
Since the book originated from my course notes, the style of writing is
more casual than formal. I try to explain what is needed while cutting through
advanced details that undergraduate level readers might find confusing. The
secret to achieving this objective is the source code. In graduate school, one of
my professors often told his students, “If you want to know how things really
work, read the source code”. That is the model that I have tried to apply.
Matlab code examples illustrate nearly every observation. In some cases,
the examples are a few commands typed in the Matlab Command Window.
In many cases, the examples are complete scripts and functions that feature
plots showing the data relationships. I find that examples with real numbers
always help when trying to understand an algorithm or math equation. I feel
that source code examples often teach concepts best because details cannot
be omitted or assumed.
Content Comments
Determining what content to include is perhaps the question that has re-
quired the most thought in my course development. The question that I asked
myself was, “What do the students need?”. What do they need to learn to get
through more advanced courses and before they graduate? An introduction to
programming was, of course, considered obligatory. Then visually displaying
data with plots and graphs was quickly deemed essential. Beyond the prereq-
uisites, I tried to pick topics where a computer and a bit of computational
engineering knowledge might pay the best dividends.
Chapter 1 offers a fast-paced but sufficient introduction to Matlab. The
goal is to quickly get students programming with some basic knowledge. As
a continuation of chapter 1, chapter 2 covers how to make effective two- and
three-dimensional data plots.
Chapter 3 provides an introduction to computational statistics. Although
coverage is given to basic statistical metrics, calculations, and common prob-
ability distributions, the focus is on computation and graphical display of
statistical information. The statistics chapter could have easily grown to be
much larger. The intent, however, is to provide a supplement to what students
learn in a course on statistics and probability.
Chapter 6 continues the discussion of linear algebra, but shifts the focus
from systems of equations to applications that take advantage of the eigen-
values and eigenvectors of a matrix. Since this is a new concept to many
undergraduate students, particular attention is given to what eigenvalues and
eigenvectors are and the significance of the relationship Ax = λx.
Computational numerical methods are discussed in chapter 7. This chapter
started as a short chapter focused entirely on understanding and using the
efficient functions provided by Matlab. However, the limited coverage did
not give an adequate appreciation of the algorithmic considerations. So the
coverage was extended to include more detailed algorithmic descriptions and
algorithm alternatives.
The appendices describe supporting material that is more complex and
is not strictly required for applying the material in the previous chapters,
but will provide curious readers with proofs, more detailed definitions, and
descriptions of some algorithms used by the Matlab functions. Most of the
material in the appendices relates to the linear algebra topics.
Acknowledgments
I wish to thank the administration, faculty, staff, and students at Kansas
State University Salina for their support and encouragement. Thank you to my
colleague Kaleen Knoop and the Writing Center for making the Grammarly
program available. It helped me numerous times. I owe a particular debt of
gratitude to my colleague Gayan Samarasekara for reviewing and offering
corrections to chapter 3 on statistical data analysis.
I wish to thank the Taylor & Francis Group for giving me the opportunity
to fulfill a lifelong goal in publishing this work. Special thanks to editors Mansi
Kabra and Callum Fraser for their encouragement, patience, and advice.
I am deeply blessed and thankful for the emotional support and encourage-
ment from my wife, Pam. I would not have been able to complete the writing
without her. I’m also grateful for the encouragement and joy that comes from
our son, daughters, their spouses, and our grandchildren.
Tim Bower
Salina, Kansas
List of Program Files
DOI: 10.1201/9781003271437-1
>> 2 + 1
ans =
3
>> x = 2 + 1
x =
3
Tip: The keyboard arrow keys may be used to retrieve previously entered
commands.
Variable
When a value is given a name, it becomes a variable. Variables are used
to store data for display or future computations. The name given to a
variable is called an identifier, which just means that we choose the name.
The rules for what is a valid identifier are covered later.
Assignment Statement
Storing data in a variable with an equal sign (=) is formally called an
assignment statement in computer science terminology. A data value is
assigned to be held in a variable.
ans
Matlab knows that users might perform a calculation without saving
the result to a variable but later want to use the result. So any result not
saved to a variable is automatically stored in a variable called ans.
NOTE: Only the most recent unsaved result is saved in ans.
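To make the behavior of ans concrete, here is a minimal sketch. The numbers and the variable names first, second, and third are made up for illustration.

```matlab
% Sketch of how ans behaves (values chosen only for illustration).
3 * 4;          % no variable given, so the result goes into ans
first = ans;    % first is 12
x = 10;         % assigning to x leaves ans unchanged
second = ans;   % second is still 12
x + 1;          % another unsaved result overwrites ans
third = ans;    % third is 11
```

Because ans is overwritten so easily, it is usually safer to assign any result you intend to reuse to a named variable.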
The mathematics operators used with simple variables and constant num-
bers are:
+ Addition
- Subtraction
* Multiplication
/ Division
^ Exponent
The order of operator precedence is what you would expect from basic
algebra.
1. parentheses, brackets
2. exponents, roots
3. multiplication, division
4. addition, subtraction
When operations are at the same level, the order of execution is left to
right.
>> 2 * (3-4) / 5
ans =
-0.4000
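A few more expressions may help to check your understanding of the precedence rules. This is only a sketch; the variable names a through e are arbitrary.

```matlab
a = 2 + 3 * 4;      % multiplication first: 14
b = (2 + 3) * 4;    % parentheses first: 20
c = 2 * 3^2;        % exponent first: 18
d = 8 / 4 / 2;      % same level, left to right: 1
e = -2^2;           % exponent is applied before the unary minus: -4
```

When in doubt, add parentheses. They cost nothing and make the intent obvious.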
>> sin(1)
ans =
0.8415
What is a keyword?
>> atan2(1,1)
ans =
0.7854
Caution: A common error is to leave out the commas between the argu-
ments of a function.
Other numeric data types, such as signed integers and unsigned integers,
are available if needed. Look up the documentation for the cast function for
a detailed list of data types.
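As a rough illustration of working with other numeric types, the following sketch uses the class function to report a variable's type and shows two ways to create integers. The variable names and values are arbitrary choices.

```matlab
x = 5;                   % an ordinary number is stored as a double
cx = class(x);           % 'double'
n = int8(100);           % 8-bit signed integer
cn = class(n);           % 'int8'
u = cast(x, 'uint16');   % convert a double to an unsigned 16-bit integer
cu = class(u);           % 'uint16'
big = int8(200);         % integer types saturate: int8 stops at 127
```

Note the last line. Integer types do not wrap around or raise an error when a value is out of range; they clamp to the nearest value the type can hold.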
Matlab treats all numeric data as a matrix (discussed later). Scalar values
are thus reported as having a size of 1×1.
>> a = 5
a =
5
>> whos
Name Size Bytes Class Attributes
a 1x1 8 double
format Command
>> x = zeros(1, 5)
x =
0 0 0 0 0
>> x(1) = 3
x =
3 0 0 0 0
>> x(4) = 6
x =
3 0 0 6 0
>> x(1)
ans =
3
>> clear x
Name   Meaning
i      Imaginary number
j      Imaginary number
Inf    Infinity
pi     Ratio of circle’s circumference to its diameter
NaN    Not-a-Number
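The constants can be used like any other value. The short sketch below is only an illustration; the variable names are arbitrary, and isinf and isnan are the built-in functions for testing the special values.

```matlab
r = 2;
area = pi * r^2;     % pi used in an ordinary expression
z = 3 + 4i;          % complex number written with the imaginary unit
m = abs(z);          % magnitude of z: 5
p = 1 / 0;           % Inf
q = 0 / 0;           % NaN
t1 = isinf(p);       % true
t2 = isnan(q);       % true
t3 = (q == q);       % false: NaN is not equal to anything, even itself
```

Because NaN never compares equal to itself, use isnan rather than == to test for it.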
• Like the Command Window, scripts access variables from the global
Workspace.
• Create a script with the command edit myscript.m, or from the Home
tab, select either New Script, or New -> Script.
• It is possible to run just one or two lines of code at a time to verify the
expected results.
– Use the mouse to highlight a few lines of code, then press the F9
key on the keyboard to execute just those lines of code.
– Another way to execute a few lines is to make those lines a sec-
tion. The percent sign (%) is the beginning of a comment line. Two
percent signs and a space character (%%) at the beginning of a line
mark a section boundary. Select a section by clicking anywhere in
the section and then, in the editor tab ribbon, click on run section.
• If the output would be too much to display on the screen, or you just
don’t want to see it, then add a semicolon after the command.
• Many commands, such as plotting commands, do not produce output
to the Command Window, so there is no difference between using a
semicolon or not.
>> b
b =
5
In a script, you may want to use disp or fprintf to show results. We discuss
them in more detail in section 1.4.2. You may also enter a variable name
without a semicolon to see its value in a script, but Matlab will display a
warning telling you that you should terminate a statement with a semicolon
in a script.
>> disp(b)
5
%% Title of a Section
Tip: An optional window in the IDE called Panel Titles displays the section
titles and allows a quick way to move to and select a section.
1.3.3 Comments
A few words of explanation can make it much easier to reuse a program
after you have forgotten the details. Adding text descriptions that are not
executed, called comments, allows you to explain the functionality of your
code. Text following a percent sign, %, is a comment. Comments can appear
on lines by themselves, or they can be appended to the end of a line.
The first contiguous block of comments after a function declaration (sec-
tion 1.8) is reserved for help information which is displayed when you use the
help or doc commands. Comments after the first blank line are not treated
as part of the help.
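The comment rules can be seen in a small function file. This is a sketch only; hypotenuse is a hypothetical function name used for illustration, and the file would be saved as hypotenuse.m.

```matlab
function c = hypotenuse(a, b)
% HYPOTENUSE  Length of the hypotenuse of a right triangle.
%   c = hypotenuse(a, b) returns sqrt(a^2 + b^2).

% This comment follows a blank line, so help does not display it.
c = sqrt(a^2 + b^2);   % a comment appended to the end of a line
end
```

Typing help hypotenuse in the Command Window would then display the two comment lines that directly follow the function declaration.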
The input function evaluates what the user enters, so more complex data
such as a vector and matrix can also be entered.
1.4.2.1 disp
The disp function takes only one argument and generally formats the
output in an acceptable manner. If you want to display multiple outputs, such
as a text string and a number, there are two options. You can use separate
calls to disp, or pass one item to disp that is an array (row vector) where
each item in the array has the same data type. In the following example, the
num2str function converts a number to a string, thus both items in the array
(notice the square brackets) are strings.
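A minimal sketch of both options follows. The value of n and the message text are made up for illustration.

```matlab
n = 42;
disp('The answer is:')                   % option 1: separate calls to disp
disp(n)
msg = ['The answer is: ', num2str(n)];   % option 2: build one character array
disp(msg)                                % displays: The answer is: 42
```

The second option keeps the text and the number on the same line, which is usually what you want.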
1.4.2.2 fprintf
The fprintf function doesn’t provide as much help in formatting the
output, but offers the programmer complete flexibility to customize the ap-
pearance of the output. When the output is to be displayed in the Command
Window, the first argument to fprintf is a string called a format specifier.
The string may contain text to display and also information about the data
type and location for variables to be displayed in the output. Additional spe-
cial characters (called escape characters) are often used in the format specifier.
A new line is achieved with \n. A tab character is inserted in the output with
\t. To display an actual backslash, use two backslashes \\.
Any number of variables may be passed to fprintf, but they should match
the variable references in the format specifier. Each variable reference begins
with a percent sign (%) and uses a letter code to indicate the data type. In
addition, the variable reference can also specify information such as the num-
ber of characters to use when displaying the variables (called the field width)
and the justification (left or right) within the character field. Programming
beginners may need practice using fprintf. The examples below can serve
as a guide for how to use fprintf. Be sure to read Matlab’s documentation
for fprintf.
The data type codes in the format specifier are listed in table 1.3 followed
by some examples. Data type codes 's', 'd', 'g', 'f', and 'e' are the most
frequently used codes.
Letter Meaning
s string
c character
d integer
u unsigned integer
f floating point
g floating point, but fewer digits
e scientific notation
o octal
x hexadecimal
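The following sketch exercises several of the codes along with field widths and justification. The variable names and values are made up. The last line uses sprintf, a companion function that takes the same format specifier but returns the text instead of printing it.

```matlab
name = 'beam';
count = 3;
len = 12.3456;
fprintf('%s\n', name);                   % string
fprintf('%d items\n', count);            % integer
fprintf('%8.2f|\n', len);                % width 8, 2 decimals, right justified
fprintf('%-8.2f|\n', len);               % same field, left justified
fprintf('%e\n', len);                    % scientific notation
fprintf('%g\t%x\t%o\n', len, 255, 8);    % shorter float, hexadecimal, octal
s = sprintf('%5d|%-5d|', count, count);  % s is '    3|3    |'
```

The vertical bars are there only to make the padding visible. A minus sign after the percent sign requests left justification; the default is right justification.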
Here, sequence is a series of values (usually numbers). The sequence could also
be called a row vector. The variable idx sequentially gets the next value of
the sequence each time the code block runs.
Here is an example for loop. The variable k is 1 the first time through the
loop. On the second iteration, k is 2. During the third and final execution of the
loop, k is 3.
for k = 1:3
    disp(['Iteration: ',num2str(k)])
    disp(2*k)
end
Iteration: 1
2
Iteration: 2
4
Iteration: 3
6
>> 1:5
ans =
1 2 3 4 5
>> 3:7
ans =
3 4 5 6 7
>> 1.5:4.5
ans =
1.500 2.500 3.500 4.500
With three arguments, the first and third argument specify the range as
before, while the second argument gives the step size between items of the
sequence.
>> 1:2:5
ans =
1 3 5
>> -12:4:12
ans =
-12 -8 -4 0 4 8 12
>> 0:0.5:3
ans =
0 0.500 1.000 1.500 2.000 2.500 3.000
>> 3:-1:0
ans =
3 2 1 0
Fibonacci sequence
The Fibonacci sequence is interesting because it is a series of numbers
that occurs in nature. It is also a sequence that the computer science
education world has latched onto because it can be implemented in various
ways to teach programming concepts and to illustrate programming
strategies. A simple recursive function runs very slowly because the program
recalculates values many times. Using dynamic programming greatly improves
the performance, and we will see in section 6.6.1 that there is also
a closed form (not iterative) equation for calculating Fibonacci sequence
values.
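As a sketch of the iterative approach, a for loop can fill in a vector where each new value is the sum of the two before it. The number of terms and the choice to start the sequence with 1, 1 are assumptions made for this illustration.

```matlab
n = 10;                 % how many terms to generate (arbitrary choice)
fib = zeros(1, n);      % preallocate the row vector
fib(1) = 1;
fib(2) = 1;
for k = 3:n
    fib(k) = fib(k-1) + fib(k-2);   % reuse stored values, no recursion
end
disp(fib)
```

Because every value is stored once and then reused, the loop does only n - 2 additions, in contrast to the repeated work of a simple recursive function.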
You should see a plot with a small circle at point (x = 1, y = 2). If you
plot another point, the first plot is replaced by the new one.
A peek ahead
A better way to code figure 1.2 is to pass a sequence of points to one plot
command as follows. We will discuss how to find the sequence of y axis data
points in section 1.7.3.
k = 0:5;
plot(k, k.^2, 'r*');
% File: firstPlot.m
%% Plot k^2 for k = 0 to 5
hold on
for k = 0:5
    plot(k, k^2, 'r*');
end
hold off
%% title and axis labels
title('Y = k^2')
xlabel('k')
ylabel('k^2')
% File: multiline.m
% Multiline plot with a for loop
x = -1.5:0.1:1.5;
style = ["-", "--", "-."];
hold on
for k = 1:3
    plot(x, x.^k, style(k), 'LineWidth', 2)
end
axis tight
legend('y = x', 'y = x^2', 'y = x^3', 'Location', 'North')
hold off
FIGURE 1.3: Plot where each iteration of a for loop adds a curve to the
plot.
(Boolean) expression determines which code block to execute and, in the case
of while loops, the number of executions. A logical expression is a statement
that evaluates to either true (1) or false (0). Each logical value is displayed in
Matlab as 1 or 0. To write logical expressions, we need relational and logical
operators.
Relational operators compare the relationship between two items and re-
turn either a true or false verdict. For example, using variables x and y we
might write a logical expression to see if x is greater than y as x > y. We
usually think of relational operators as comparing numeric values, but this is
not always the case. For example, the alphabetical order of text strings could
be compared as in "Bob" > "Bill". The relational operators are listed below.
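Matlab's six relational operators are >, <, >=, <=, == (equal), and ~= (not equal). The sketch below applies each one; the values of x and y and the result names are arbitrary.

```matlab
x = 3;
y = 5;
r1 = x > y;            % greater than: false
r2 = x < y;            % less than: true
r3 = x >= 3;           % greater than or equal to: true
r4 = x <= 2;           % less than or equal to: false
r5 = x == y;           % equal to: false
r6 = x ~= y;           % not equal to: true
r7 = "Bob" > "Bill";   % strings compare in character order: true
```

Take care to use two equal signs for a comparison. A single equal sign is an assignment statement.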
1.6.1.1 If Construct
If the logical condition of an if statement is true, the code block runs. If
the condition is false, the code block is skipped.
if condition
code block
end
1.6.1.2 Else
Add an else statement to the if construct when there is an alternate
code block that should run when the condition is false.
if condition
code block 1
else
code block 2
end
1.6.1.3 Elseif
Selection between multiple code blocks is achieved with any number of
elseif statements.
The final else statement is optional. Its code block runs when all of the
other logical condition statements are false.
if condition1
code block 1
elseif condition2
code block 2
elseif condition3
code block 3
else
code block 4
end
Note: Only the code block for the first true condition will run, even if
multiple conditions are true.
What is the output of the following code?
a = 1;
if a < 2
disp(2)
elseif a < 3
disp(3)
elseif a < 4
disp(4)
else
disp(1)
end
switch x
case value_1 % x == value_1
code block 1
case value_2 % x == value_2
code block 2
case {value_3, value_4} % x == value_3, or value_4
code block 3
otherwise % x is none of the above
code block 4
end
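Here is a concrete instance of the switch-case pattern, offered as a sketch. The variable names day and kind and the text values are made up for illustration.

```matlab
day = 'Sat';                 % hypothetical input value
switch day
    case 'Mon'
        kind = 'start of the week';
    case {'Sat', 'Sun'}      % a cell array matches any of the listed values
        kind = 'weekend';
    otherwise
        kind = 'weekday';
end
disp(kind)
```

Unlike in some other languages, only the matching code block runs. There is no need for a break statement at the end of each case.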
% File: ifElse.m
loan = input('Enter the loan amount: ');
The loop will evaluate the condition and run the code block if it is true. Each
time after running the code block, the condition is re-evaluated for a possible
additional run. The loop stops when the condition is false.
while condition
code block
end
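A while loop is the natural choice when the number of iterations is not known in advance. In this sketch, with made-up numbers, the loop counts how many years of 5 percent growth are needed to double a balance.

```matlab
balance = 1000;      % starting amount (made-up value)
rate = 0.05;         % 5 percent growth per year
years = 0;
while balance < 2000
    balance = balance * (1 + rate);
    years = years + 1;
end
fprintf('Doubled after %d years\n', years);
```

Make sure that something inside the code block eventually makes the condition false. Otherwise, the loop never ends.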
% File: sinc1.m
%% Manually find eps (myeps) using a while loop,
% then use myeps to prevent divide by zero and plot
% the sinc function.
epsilon = 1;
while (1 + epsilon) ~= 1
    myeps = epsilon;
    epsilon = epsilon / 2;
end
fprintf('1 + %9.5g is the same as 1\n', epsilon)
fprintf('myeps = %9.5g\n', myeps);
t = -10:0.1:10;
y = t; % preallocate y with the same size as t for efficiency
% This could be vectorized, but this code illustrates a for
% loop and a selection statement.
for k = 1:length(t)
    if t(k) == 0
        x = myeps; % prevent a divide by zero error
    else
        x = t(k);
    end
    y(k) = sin(x)/x;
end
plot(t, y)
title('Sinc Function')
CODE 1.3: Manual find of eps and sinc (sin(x)/x) function plot.
1.6.4.1 Continue
The continue keyword causes execution of the current loop iteration to
skip past the remainder of the loop’s code block. Control returns to the begin-
ning of the loop where the loop condition is evaluated again to either advance
to the next loop iteration or exit the loop.
In the following pseudocode example, if the special_condition is true,
code block 2 is skipped and control moves back to evaluating loop_condition.
while loop_condition
code block 1
if special_condition
continue
end
code block 2
end
Round-off Errors
The topic of eps, also called machine epsilon, brings up the important
topic of round-off errors and numerical stability. We saw here that if a
number smaller than eps is added to 1, the result is 1. We can think of
the difference between 1 and 1 + eps as the round-off error of the digital
numbers between 1 and 2. Because floating point numbers are stored in the
computer using a binary scientific notation, the round-off error doubles
for each increment of the exponent.
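The doubling can be checked directly. When called with an argument, the built-in eps function returns the spacing between that number and the next larger floating point number. The variable names in this sketch are arbitrary.

```matlab
e1 = eps(1);     % spacing just above 1, the same value as eps
e2 = eps(2);     % spacing just above 2
e4 = eps(4);     % spacing just above 4
same = (1 + eps/2 == 1);     % true: half of eps is lost to rounding
moved = (1 + eps ~= 1);      % true: a full eps is enough to change 1
fprintf('%g  %g  %g\n', e1, e2, e4);
```

Each time the number crosses a power of two, the spacing, and therefore the round-off error, doubles.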
1.6.4.2 Break
The break keyword causes execution of the current loop to stop. Control
advances to the code after the loop.
In the following pseudocode example, if the special_condition is true,
code block 2 is skipped and the loop is finished.
while loop_condition
code block 1
if special_condition
break
end
code block 2
end
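The two keywords can be combined in one loop, and they work in for loops as well as while loops. The following is a non-interactive sketch with made-up data, where a zero is assumed to mark the end of the valid data and negative values are ignored.

```matlab
data = [4 -1 7 0 9 2];    % made-up values; 0 marks the end of the data
total = 0;
count = 0;
for k = 1:length(data)
    if data(k) == 0
        break             % stop the loop at the end marker
    end
    if data(k) < 0
        continue          % skip negative entries
    end
    total = total + data(k);
    count = count + 1;
end
avg = total / count;      % average of 4 and 7
```

Only 4 and 7 are added to the total. The -1 is skipped by continue, and the values after the 0 are never reached because of break.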
% File: continueBreakDemo.m
% Prompt the user for the length of items and
% calculate the average.
scalar
A scalar is a variable with size 1×1. That is, it has only one value. The
isscalar function tests if a variable is a scalar.
row vector
A row vector is a variable of size 1×n, where n > 1. It has one row contain-
ing n values. Row vectors can be created with the colon operator, manually
entered, or created with one of several Matlab functions. The isvector
function tests if a variable is a vector, while the isrow function more
specifically tests if a variable is a row vector.
column vector
matrix
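The test functions named above can be tried on one variable of each kind. This sketch also uses iscolumn and ismatrix, the companion built-in tests for column vectors and two-dimensional arrays. The variable names are arbitrary.

```matlab
s = 5;              % 1x1 scalar
r = 1:4;            % 1x4 row vector
c = [1; 2; 3];      % 3x1 column vector
M = ones(2, 3);     % 2x3 matrix
t1 = isscalar(s);   % true
t2 = isvector(r);   % true
t3 = isrow(r);      % true
t4 = isrow(c);      % false
t5 = iscolumn(c);   % true
t6 = isvector(M);   % false
t7 = ismatrix(M);   % true
```

The size function reports the dimensions themselves, for example size(c) returns 3 and 1.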
FIGURE 1.5: Matrices concatenated, D = [[A B]; C]
>> A = ones(4, 5)
A =
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
>> B = 2*ones(4, 1)
B =
2
2
2
2
>> C = 3*ones(2, 6)
C =
3 3 3 3 3 3
3 3 3 3 3 3
>> D = [[A B];C]
D =
1 1 1 1 1 2
1 1 1 1 1 2
1 1 1 1 1 2
1 1 1 1 1 2
3 3 3 3 3 3
3 3 3 3 3 3
Data arrays containing more than two dimensions are also possible. Color
images have size m×n×3, as they contain three monochrome images for red,
green, and blue.
for k=1:5
A(k) = k;
end
MATLAB Programming 31
>> A = zeros(1,5)
A =
0 0 0 0 0
>> A = A + 2
A =
2 2 2 2 2
>> A = A*3
A =
6 6 6 6 6
>> A = A/2
A =
3 3 3 3 3
>> A = A - 1
A =
2 2 2 2 2
The element-wise exponent can be used between two vectors of the same size
and between a scalar and a vector.
Note: Matrix and vector multiplication (A*B) and division-like operations
(A\B and A/B) are covered in chapter 5.
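A minimal sketch of the element-wise exponent operator (.^) in each of these combinations follows. The variable names are illustrative.

```matlab
v = [1 2 3];

sq = v.^2;    % vector raised to a scalar power
pw = 2.^v;    % scalar raised to each element of a vector
vv = v.^v;    % element-by-element powers of two vectors

disp([sq; pw; vv])
```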
>> A = [2 4 6 8]
A =
2 4 6 8
>> A(1)
ans =
2
>> A(4)
ans =
8
>> A(5)
Index exceeds the number of array elements (4).
>> A(5) = 10
A =
2 4 6 8 10
>> A = 1:10;
>> A(1:5)
ans =
1 2 3 4 5
>> A(6:end) = A(1:5)
A =
1 2 3 4 5 1 2 3 4 5
>> A(2:2:end-2) = [10 9 8 7]
A =
1 10 3 9 5 8 2 7 4 5
>> x = A(rowNum,colNum);
You can use the Matlab keyword end as either a row or column index to
reference the last element. The following line of code accesses the last element
in A.
The following reads the data value that is in the row before the last row
and the column that is two columns before the last column.
>> x = A(end-1,end-2)
x =
72
To extract an entire row or column from an array, you can use a colon (:).
Think of the colon as saying all rows or all columns. This extracts a column
from A:
>> x = A(:,colNum);
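The end and colon indexing forms described above can be tried on any matrix. The sketch below assumes a 4×4 matrix from the magic function, which is not the matrix used in the text, so the values differ from those shown earlier.

```matlab
A = magic(4);             % illustrative 4x4 matrix

lastElem = A(end, end);   % last row, last column
x = A(end-1, end-2);      % row before the last, two columns before the last
col2 = A(:, 2);           % all rows of column 2
row1 = A(1, :);           % all columns of row 1

disp(lastElem), disp(x)
```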
>> w = []
w =
[]
>> isempty(w)
ans =
logical
1
>> v = 1:6
v =
1.00 2.00 3.00 4.00 5.00 6.00
>> v(2:4) = []
v =
1.00 5.00 6.00
>> x = linspace(0,10,5)
x =
0 2.5000 5.0000 7.5000 10.0000
>> y = linspace(-pi, pi);
>> whos
Name Size Bytes Class Attributes
x 1x5 40 double
y 1x100 800 double
>> logspace(0,2,5)
ans =
1.0000 3.1623 10.0000 31.6228 100.0000
1. Does the code need to be executed more than once, especially with
different values?
2. Does the code perform a specific task? If so, putting it in a function
may help to draw attention to the task and thus improve the clarity
of the program.
• Functions may be placed either inside a script file or in their own file.
Functions inside a script are good for functions that would only be used
in that script. Place a function in its own file to make it generally avail-
able for other scripts and functions that you might develop.
This function has two return values and two input arguments.
The values of the input arguments and outputs are copied between the
Command Window, calling script, or function and the called function. The
variable names between the two are completely independent. They may be the
same name with no name space conflict or they may be completely different
names. The only important thing about the values and variables is the order
in the list of arguments and return values.
Self Quiz
(Select all that apply) Which of the following are valid ways to call a
function called myFunction that takes three input arguments and returns
two output variables?
1. [out1, out2] = myFunction(in1, in2, in3)
2. out1 = myFunction(in1, in2, in3)
3. myFunction(in1, in2, in3)
Answer: All of the syntaxes shown are valid. The function call returns
only outputs that are requested. If no outputs are requested, the function
still runs and the first output is saved to ans.
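The same behavior can be seen with a built-in function. In this sketch, max stands in for myFunction because it can return two outputs (the largest value and its index); the variable names are illustrative.

```matlab
v = [3 9 4];

[m, idx] = max(v);   % both outputs requested
m1 = max(v);         % only the first output requested
max(v);              % no outputs requested; the first output goes to ans

disp([m idx m1])
```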
We can test the function including the help information from the Command
Window. We can use the sum function to verify the results.
>> sequenceSum(50:3:100)
ans =
1258
>> sum(50:3:100)
ans =
1258
t = linspace(0, 3*pi);
plot(t, f(t));
FYI, calculating π
The equation in the next example is the Maclaurin series for $\tan^{-1} x$.
$$\tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots, \qquad -1 \le x \le 1$$
When we let x = 1, we get π/4, which is a series that could be used to
estimate the value of π.
$$\frac{\pi}{4} = \tan^{-1} 1 = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$$
The three code sections in code 1.6 find the same estimate for π, but the
vectorized code is faster.
% File: piEstimate.m
% Series calculation of pi via pi/4 (0.7849)
N = 1003;
%% For Loop Code
tic
mysum = 1;
sgn = 1; % alternating sign; named sgn to avoid shadowing the sign function
for n = 3:2:N
sgn = -sgn;
mysum = mysum + sgn/n;
end
toc
disp(['For Loop Code: ', num2str(4*mysum)])
%% Vectorized Code 1
tic
denom1 = 1:4:N; % denominators of the positive terms
denom2 = 3:4:N; % denominators of the negative terms
mysum = sum(1./denom1) - sum(1./denom2);
toc
disp(['Vectorized Code 1: ', num2str(4*mysum)])
%% Vectorized Code 2
tic
n = 1:4:N;
mysum = sum(1./n - 1./(n+2));
toc
disp(['Vectorized Code 2: ', num2str(4*mysum)])
>> A = 1:10
A =
1 2 3 4 5 6 7 8 9 10
>> A > 3 & A <8
ans =
1x10 logical array
0 0 0 1 1 1 1 0 0 0
Next we use a logical vector to modify the vector A. The mod function gives
the result of modulus division (the remainder after integer division). The first
expression below shows a 1 where the vector is odd and a 0 where it is even.
The second expression sets A to 0 for each odd value. It does so by first creating
a temporary logical vector. Then, each element in A where the logical vector
is true is given the new value of 0.
>> mod(A,2)
ans =
1 0 1 0 1 0 1 0 1 0
>> A(mod(A,2) == 1) = 0
A =
0 2 0 4 0 6 0 8 0 10
find
The find function takes a logical vector as input and returns a vector
holding the indices where the logical vector is true.
The following example uses an integer random number generator to
make the sequence of numbers.
>> A
A =
8 4 6 2 7 3 7 7 8 5
>> find(A > 6)
ans =
1 5 7 8 9
nnz
Use the function nnz to count the number of nonzero or true values in a
logical array.
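For instance, a minimal sketch that reuses the vector from the find example above shows that nnz counts the same elements that find locates.

```matlab
A = [8 4 6 2 7 3 7 7 8 5];

count = nnz(A > 6);           % number of elements greater than 6
sameCount = numel(find(A > 6));

disp([count sameCount])
```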
% File: sinc2.m
% Sinc function using a logical vector to prevent divide by 0.
t = linspace(-4*pi,4*pi,101);
t(t == 0) = eps;
y = sin(t)./t;
figure, plot(t, y)
hold on
plot([-4*pi 4*pi],[0 0], 'r')
xticks(linspace(-4*pi,4*pi, 9))
xticklabels({'-4\pi','-3\pi','-2\pi','-\pi','0',...
'\pi','2\pi','3\pi','4\pi'});
title('Sinc Function')
axis tight
hold off
CODE 1.7: The sinc (sin(x)/x) function implemented with logical vector
indexing.
Note: In code 1.7, we took care to set and label the x axis tick marks
appropriately for a trigonometry function. See chapter 2 for more
details about controlling the appearance of plots.
FIGURE 1.7: The Import Data Tool is used to import desired data columns.
Notice also how missing data will be handled. A pull–down menu offers
choices. Replacing missing data with NaN (not a number) is often a good
choice.
The following tutorial shows some example code for dealing with
missing data. It uses the file auto-mpg.txt, which is a public domain
dataset available from the UCI Machine Learning Repository [20] as file
auto-mpg.data-original. This file is used because it represents real data
and because some data is missing, which lets us demonstrate strategies
for dealing with missing data. The file contains six data values about 398 car
models from 1970 to 1982. For our investigation, we will import only the data
columns for miles per gallon (mpg) in column 1, horsepower in column 4, and
weight in column 5. The most convenient way to import the data into Mat-
lab is to use the Import Data Tool. Figure 1.7 shows how column names are
entered at the top of each column and the desired columns are selected for
import. The data will be imported into a table and any missing values will be
replaced with NaN (not a number).
After the data is imported into a table, we verify the names of the table
fields. We will also save the data so that it is easier to load later. The saved
file is autompg.mat.
>> autompg.Properties.VariableNames
ans =
1x3 cell array
’mpg’ ’horsepower’ ’weight’
>> save autompg
We will put our commands into a script, but use the “Run Section” button
from the Edit tab so that we can take things one step at a time.
First, we will check the three data fields to see if any data is missing. The
ismissing function will return a logical array and the nnz function will count
how many values are missing.
% File: carTable.m
%% Dealing with missing data
% Run this code one section at a time.
missing MPG
ans =
0
missing Weight
ans =
0
missing Horsepower
ans =
6
If we just want to calculate statistics of the data, we can ignore the miss-
ing data with the ’omitnan’ option. Most of the statistics functions have
this option. For more on statistics functions, see section 3.2 and check the
documentation of functions that you want to use.
disp('mean Horsepower')
mean(autompg.horsepower, 'omitnan')
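The missing-data check and the 'omitnan' option described above can be tried without the data file. The sketch below assumes a small made-up table with the same three variable names as the imported data; the numbers are illustrative, not values from auto-mpg.txt.

```matlab
% Small stand-in table with one missing horsepower value.
T = table([18; 15; 16], [130; NaN; 150], [3504; 3693; 3436], ...
    'VariableNames', {'mpg', 'horsepower', 'weight'});

nMissing = nnz(ismissing(T.horsepower));   % count the missing values
meanHp = mean(T.horsepower, 'omitnan');    % mean of the values present

disp([nMissing meanHp])
```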
missing Horsepower
ans =
6
scatter(autompg.horsepower, autompg.mpg)
Next, after reloading the data, we will use linear data interpolation to fill
in the missing data. Since we know that the horsepower data is correlated
to the mpg data, we can get reasonable results if we sort the data based on
the mpg field. The rather long output (not shown) of this step shows the
values before and after the filled in values and verifies that the estimates are
reasonable.
Whether it is better to delete rows with missing values or use interpolation
depends on the application, the data available, and the preferences of
individuals.
disp('interpolated data')
for i = idx'
disp(autompg.horsepower(i-2:i+2))
disp(' ')
end
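The sort-then-interpolate approach described above can be sketched on a small made-up table. This is not the book's script: it assumes the sortrows and fillmissing functions are used for the two steps, and the data values are illustrative.

```matlab
% Stand-in table with one missing horsepower value.
T = table([30; 10; 20; 40], [70; 150; NaN; 50], ...
    'VariableNames', {'mpg', 'horsepower'});

T = sortrows(T, 'mpg');                              % order rows by mpg
T.horsepower = fillmissing(T.horsepower, 'linear');  % linear interpolation

disp(T)
```

After sorting, the missing value sits between 150 and 70, so linear interpolation fills it with 110.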
>> writetable(autompg, 'myTable.csv');
By default, writetable uses the file name extension to determine the appro-
priate output format.
Note: There is a lot more information about Matlab tables that is not
included here. The tables tutorial in exercise 1.7 shows examples of
additional Matlab commands related to working with tables.
For more information about character arrays and strings, see MathWorks’
online documentation [45].
Cell Arrays
One solution for passing textual data to functions is to use a cell array
to store the data. Cell arrays are created with curly brackets { } instead of
the square brackets [ ] used for numeric arrays.
A good application for using cell arrays is to set the tick labels on the axis
of a plot, which is demonstrated in section 2.2.9.
Cell arrays can store references to many types of data besides strings. Cell
arrays hold references to data rather than the actual data. A cell array does
not require homogeneous data types. So, for example, a string, character array,
data structure, numeric vector, and a Matlab object may reside together in
the same cell array.
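A minimal sketch of a cell array holding mixed data types follows. The contents and variable names are illustrative. Curly brackets return the content of a cell, while parentheses return a smaller cell array.

```matlab
s.name = 'bolt';                             % a small data structure
c = {'char array', "a string", [1 2 3], s};  % four different data types

n = numel(c);           % number of cells
secondNum = c{3}(2);    % content of cell 3, then its second element
firstCell = c(1);       % parentheses give a 1x1 cell array
classes = cellfun(@class, c, 'UniformOutput', false);

disp(classes)
```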
1.12 Exercises
variable called pi. Write equations for the area and circumference of a
circle. Store the results in variables named area and circ.
r = 4.5; % radius
% sides of a rectangle
a = 10.7;
b = 8.2;
% trapezoid dimensions
a = 7.4;
b = 4.8;
h = 5;
>> T = 5;
>> t = 0:15/100:15;
>> y = sin(2*pi*t/T);
>> plot(t, y);
2. Copy the code for creating the variables t and T into the script.
3. Add the line: y = zeros(1,101);. This line gives the y array initial
values of zero.
4. Write a for loop where a variable called n has the values of 1, 3, 5, and
7 during the successive iterations of the loop.
5. In the code block of the for loop enter the Fourier series sum from above.
Hint: Add a new sine term to the existing y array each time through
the loop.
6. After the loop code, multiply y by 4/π.
7. Plot the y array against t. What could you change in your code to make
the plot look more like a square wave?
Since you will be well paid after graduating from college, let’s say that you
invest $1,000 at the beginning of each month. At the end of each month, you
are paid interest on your investment. The interest rate paid depends on the
account balance. The annual rate is
0 < balance ≤ 10,000 : 1 %
10,000 < balance ≤ 15,000 : 2 %
15,000 < balance ≤ 30,000 : 3 %
30,000 < balance ≤ 100,000 : 5 %
100,000 < balance : 7%
Note that these are annual interest rates, not monthly rates. Write a program
that tracks the balance of the savings until it is at least 1 million dollars. Make
a plot of the savings with years on the x axis and dollars on the y axis.
1. Write a function that has one input argument, a row vector, v and one
output argument, a row vector w that is of the same length as v. The
vector w contains all the elements of v, but in the exact opposite order.
For example, if v is equal to [1 2 3] then w must be equal to [3 2 1]. You
are not allowed to use the built-in function flip.
2. Write a function that takes as input a row vector of distances in kilo-
meters and returns two row vectors of the same length. Each element of
the first output argument is the time in minutes that light would take
to travel the distance specified by the corresponding element of the in-
put vector. To check your math, it takes a little more than 8 minutes
for sunlight to reach Earth which is 150 million kilometers away. The
second output contains the input distances converted to miles. Assume
that the speed of light is 300,000 km/s and that one mile equals 1.609
km.
$$p = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$$
Hints:
1. The functions isempty, isscalar, iscolumn, and length will tell you
everything you need to know about the vector c.
2. When c is a vector, use an element-wise exponent to determine the
vector $[x \;\; x^2 \;\; x^3 \;\; \cdots \;\; x^n]$.
3. When c is a vector, use the sum function with element-wise multipli-
cation. Matrix multiplication could also be used, but we have not yet
covered that.
>> p = poly_val(-17,[],5000)
p =
-17
>> p = poly_val(3.2,[3,-4,10],2.2)
p =
96.9200
>> p = poly_val(1,[1;1;1;1],10)
p =
11111
>> p = poly_val(8,5,4)
p =
28
1. From the command window, import the data in the file tallest_bldgs.txt
and save it to a table named buildings.
Using the max and min functions with the height_feet array, what is
the difference in feet between the tallest and shortest buildings listed in
the table?
3. Modify the existing buildings table to include an additional variable
called height_feet containing the height data you just calculated. No-
tice the use of curly brackets, {} as one way to add a variable to a table.
The curly brackets are also a good way to work with a subset of a table.
>> buildings{:,'height_feet'} = height_feet;
4. The dot notation is simplest when working with all of the data from a
table variable. Remove the height_m table variable.
5. The sorting capability is a good reason for using tables to hold data.
Sort the values in the buildings table in order of decreasing height.
9. Find the number of buildings that are over 1000 feet tall. Store the result
in n1k.
11. Sort the tallest buildings by age. The default sorting order is ascending.
Which country has the oldest building that is over 1000 feet tall?
12. Using the table dot notation, table variables may be used like column
vectors with results saved to a new table variable. Determine which
buildings have the most and least head room on each floor (story).
In which country is the building with the most feet per story?
Chapter 2
Graphical Data Analysis
DOI: 10.1201/9781003271437-2 59
FIGURE 2.1: The Plot Tool lets the user select the data and plot type with a
click of the mouse.
FIGURE 2.2: The Plot Tool pull down menus provide a graphical interface
to annotate a plot.
Graphical Data Analysis 61
Video Tutorial
MathWorks has made a tutorial video that demonstrates how to use
the Plot Tool [31].
Line style codes are used if you want something other than a solid line.
The following command will plot y versus x with a dotted line.
In addition to color and line style, you can specify a marker style. The following
command will plot y versus x using asterisks.
If the only plot specification is a marker, the default line style is none. Add
a line style, such as a dash (-), to the marker style to also plot a line.
The grid command adds or removes a grid to your plot:
>> grid on
If you need more than one line in an axis label or title, use a cell array created
with a pair of curly brackets {}.
LaTeX markup formatting may be used in the annotations. Thus, x^2 will
be displayed as x² (superscript) and x_2 as x₂ (subscript). LaTeX markup is a
default standard for mathematical equations. You can search the Internet and
find good documentation on formatting LaTeX math equations. Other symbols,
such as Greek letters, can also be added, such as ∆, γ, and π. See MathWorks
documentation on Greek Letters and Special Characters [39]. Try the following
code to see what happens.
If you need to use a symbol such as an underscore (_) or a backslash (\), then
preface it with the escape character \. Thus, to display \, you must type \\.
The Matlab text function adds a text annotation to a specified position
on the plot. See the Matlab documentation for text.
The nexttile command is used to advance to the next plot. Several new
features were added beyond the capabilities of subplots.
• Figures may now have a global title, xlabel, and ylabel for the set
of plots.
• The size and spacing between tiles is adjustable. The following com-
mands reduce the spacing between tiles and the padding between the
edges of the figure and the grid of tiles. The net effect is to increase the
size of the individual plots.
t = tiledlayout(3, 3);
t.TileSpacing = 'compact';
t.Padding = 'compact';
• Specifying the size and location of tiles can create a custom layout. This
is done by passing a tile number and an optional range of tiles to be
used to nexttile. The example in figure 2.4 shows a 3×3 grid of tiles.
Three plots are added across the first row. Starting at tile 4, the whole
second row is filled with the next plot, which spans a 1×3 set of tiles.
The third row has a single tile plot and a 1×2 tile plot.
>> x = linspace(-2,2);
>> tiledlayout(3,3)
>> nexttile, plot(x,x)
>> nexttile, plot(x,x)
>> nexttile, plot(x,x)
>> nexttile(4, [1 3])
>> plot(x,x)
>> nexttile(7), plot(x,x)
>> nexttile(8, [1 2])
>> plot(x,x)
• The layout of the plots may be adjusted with each nexttile command.
This is done with the ’flow’ option to tiledlayout. The first plot fills
the whole figure. After the second plot, two stacked plots are shown.
As each tile is added, the layout of tiles is adjusted to accommodate
the new tile. This can make for a nice demonstration showing successive
changes to data. Run the code from the sawTooth.m file (code 2.1) to
see this. Successive plots show the Fourier series of a sawtooth wave as
each new sine wave is added. The user presses the Enter key when they
are ready to see the next plot. The final tiled figure with six plots is
shown in figure 2.5.
The following simple examples illustrate both methods. See figure 2.6 for the
plot.
% File: sawTooth.m
% Plots of the Fourier series of a sawtooth function
% displayed using a tiledlayout. Each time through
% the loop, a new plot is added.
FIGURE 2.7: Common types of 2-D plots other than line plots.
The legend function also accepts two optional arguments: the keyword
’Location’ and a text description of the location that is given by a com-
pass direction, such as ’north’ (top center) or ’southwest’ (lower left). The
default location for the legend is ’northeast’ (top right).
legend('Trucks','Cars','Location','west')
Scatter Plot
scatter(x, y): Scatter plot with variable marker size and color. Example
shown in figure 2.7 (a).
% File: Plots2D.m
% Some 2-D plotting examples
x = linspace(0, 2, 10);
y = 5 + x.^2;
y2 = 5 + x;
plts = tiledlayout(3, 3);
plts.TileSpacing = 'compact';
plts.Padding = 'compact';
title(plts, '2-D Plots');
nexttile, scatter(x, y, 'ko'), title('(a) Scatter Plot');
nexttile, bar(x, y, 'k'), title('(b) Bar Chart');
nexttile, stem(x, y, 'k'), title('(c) Stem Plot');
nexttile, stairs(x, y, 'k'), title('(d) Stairs Plot');
labels = {'excellent', 'good', 'fair', 'poor'};
nexttile(5, [2 2]), pie([0.4 0.3 0.2 0.1], labels)
title('(e) Pie Chart')
newcolors = [0.7 0.7 0.7; 0.2 0.2 0.2]; % make grayscale
nexttile(7), area(x, [y2; y]'), title('(f) Area Plot');
colororder(newcolors)
CODE 2.2: Code to make common types of 2-D plots other than line plots.
Bar graph
bar(x, y): Bar graph (vertical and horizontal). Example shown in fig-
ure 2.7 (b).
Stem plot
stem(x, y): A discrete sequence plot—each point is plotted with a marker
and a vertical line to the x axis. Example shown in figure 2.7 (c).
Pie chart
pie([0.4 0.3 0.2 0.1], labels): The values of the data are passed
as a vector that often sums to 1; otherwise, Matlab will calculate the
percentages. The labels of the pie slices should be in a cell array. Example
shown in figure 2.7 (e).
>> v = axis
v =
10 20 1 5
Use the functions xlim and ylim to set the x and y axis limits. Both
functions take a vector of two elements as input. For example, the following
command sets the lower y axis limit to 2 and the upper y axis limit to 4.
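The command itself is not reproduced in this excerpt. A minimal sketch, assuming a simple illustrative plot, passes the two limits to ylim as a vector:

```matlab
x = linspace(0, 10);
figure, plot(x, 3 + sin(x))   % illustrative data between 2 and 4

ylim([2 4])                   % lower y axis limit of 2, upper limit of 4
xlim([0 10])                  % x axis limits are set the same way

yl = ylim;                    % with no input, ylim returns the current limits
disp(yl)
```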
When the lower or upper range of the data falls between the limits that Mat-
lab picked, the data may not fill the x axis. Use the axis tight command
to change the limits based on the range of the data.
For plots depicting geometry, we would like the x, y, and z axes to have the
same scale. There are two ways to accomplish this. The axis command has
an equal option that sets the scale of each axis to be the same. We can also
set the aspect ratio to be the same for each axis with the daspect function,
which takes a vector with three values, even for 2-D plots.
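Both approaches can be sketched on an ellipse, which looks distorted unless the axes share a scale. The data and variable names are illustrative.

```matlab
theta = linspace(0, 2*pi);
xe = 3*cos(theta);           % ellipse, 6 units wide
ye = 2*sin(theta);           % and 4 units tall

figure, plot(xe, ye)
axis equal                   % option 1: same scale on each axis

figure, plot(xe, ye)
daspect([1 1 1])             % option 2: three values, even for a 2-D plot

ratio = daspect;             % with no input, daspect returns the current ratio
disp(ratio)
```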
x = linspace(pi/4, 5*pi/4);
figure, plot(x, 4 - 4*sin(2*x - pi/2))
txt = texlabel('f(x) = 4 - 4*sin(2*x - pi/2)');
FIGURE 2.8: Sine wave with TeX formatted x axis tick marks and title.
2.2.10 Fplots
Matlab includes another 2-D plotting function that comes in handy when
working with either function handles, anonymous functions, or symbolic math
functions (sections 1.8.4 and 4.4). The plot, annotations, and options are the
same as the plot function. However, instead of passing vectors for the x and y
data, we give fplot a function and a range of values for the x axis, or accept
the default x axis range of −5 to 5.
In addition to being convenient when working with function handles or
symbolic math functions, fplot also correctly handles plotting difficulties,
such as data discontinuities. A good example of this, as shown in figure 2.9,
comes from the trigonometric tangent function.
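A minimal sketch of such a call follows. It is an assumption about how a plot like the one in the figure could be produced, not the book's code; the x axis range and annotations are illustrative.

```matlab
figure
h = fplot(@tan, [-pi pi]);   % function handle and a range for the x axis
grid on
title('tan(x)')
xlabel('x')
```

Because fplot is given the function rather than sampled data, it chooses the evaluation points itself, which is what lets it treat the jumps at odd multiples of π/2 sensibly.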
FIGURE 2.9: The fplot function handles the data discontinuities of the
tangent function.
>> m = membrane;
>> surf(m)
>> xlabel('x')
>> ylabel('y')
>> zlabel('z')
FIGURE 2.10: The surf plot is the first plot to try for plotting 3-D surfaces.
Given only a matrix as input, the 3-D surface plotting functions will show
the matrix values as the z axis values and use the matrix indices for the x and
y axis.
The meshgrid function is used to create matrices of the x and y axis values
that cover the range of data. Using data created from meshgrid, the z axis
values are simple to create from an equation.
>> x = -3:3;
>> y = -3:3;
>> [X, Y] = meshgrid(x,y);
>> X
X =
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
-3 -2 -1 0 1 2 3
>> Y
Y =
-3 -3 -3 -3 -3 -3 -3
-2 -2 -2 -2 -2 -2 -2
-1 -1 -1 -1 -1 -1 -1
0 0 0 0 0 0 0
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
>> Z = X.^2 + Y.^2
Z =
18 13 10 9 10 13 18
13 8 5 4 5 8 13
10 5 2 1 2 5 10
9 4 1 0 1 4 9
10 5 2 1 2 5 10
13 8 5 4 5 8 13
18 13 10 9 10 13 18
x = -8:0.25:8;
y = -8:0.25:8;
y1 = 8:-0.25:-8;
t = x.^2 + y1.^2;
z = (y1-x).*exp(-0.12*t); % for plot3
[X, Y] = meshgrid(x,y);
T = X.^2 + Y.^2;
Z = (Y-X).*exp(-0.12*T); % for surface plots
surf
surf(X, Y, Z): A 3-D surface plot, which is probably the most often
used 3-D plot function. The surface is color coded to the z axis height.
The surface is covered with grid lines parallel to the x and y axis. Examples
are shown in figure 2.10 and figure 2.11.
surfc
surfc(X, Y, Z): A contour plot under a surface plot. Example is shown
in figure 2.12.
mesh
mesh(X, Y, Z): A wireframe mesh with color determined by Z, so color
is proportional to surface height. Example is shown in figure 2.13.
plot3
plot3(x, y, z): A line plot, like the plot function, but in 3 dimensions.
Example is shown in figure 2.14.
contour
contour(X, Y, Z): A contour plot displays isolines to indicate lines of
equal z axis values, like found on a topographic map. Example is shown
in figure 2.15 (e).
contour3
contour3(X, Y, Z): A 3-D contour plot. Example is shown in figure 2.15
(f).
meshz
meshz(X, Y, Z): Like a mesh plot, but with a curtain around the wire-
frame mesh. Example is shown in figure 2.15 (g).
FIGURE 2.15: Three dimensional plots: (e) contour, (f) contour3, (g) meshz,
(h) waterfall.
waterfall
waterfall(X, Y, Z): A mesh similar to the meshz function, but it does
not generate lines from the columns of the matrices. Example is shown in
figure 2.15 (h).
surfl
surfl(X, Y, Z): A shaded surface based on a combination of ambient,
diffuse, and specular lighting models. Color is required for this plot to look
right and the effect will vary depending on the data. An example plot is
not shown.
Tip: Before concluding that you made a mistake if your 3-D plot looks like
a 2-D plot, use the rotate tool to move the plot around. The view(3)
command ensures that a plot displays in 3-dimensions.
FIGURE 2.16: 3-D coordinate frame and the right hand rule.
right index finger in the direction of the x axis. Hold your middle finger at
90 degrees, which will be in the direction of the y axis. Your thumb will be
in the direction of the z axis. See figure 2.16 for the correct 3-D coordinate
frame layout.
A physical model is also helpful to visualize the 3-D coordinate frame axes.
One can be printed on a 3-D printer. For a paper model, Peter Corke’s website
has a PDF file that can be printed, cut, folded, and glued. The PDF file is
available at http://www.petercorke.com/axes.pdf
2.4 Exercises
(b) Use Matlab’s plot3 function to plot a 3-dimensional spiral in the shape
of a sphere of radius 3 inches centered at (0, 0, 0).
(c) Use Matlab’s plot3 function to plot stacked circles in the shape of a
sphere of radius 3 inches centered at (0, 0, 0).
(d) Look up the documentation for the Matlab sphere function and use
it along with the surf function to plot a sphere of radius 3 inches centered
at (0, 0, 0).
The horizontal and vertical paths of the ball may be calculated indepen-
dently. The following diagram shows the geometry of how the initial velocity,
V0 , and projection angle, θ, relate to the initial horizontal and vertical veloc-
ities.
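[Diagram: initial velocity $V_0$ at projection angle θ from initial height $y_0$,
with components $V_{0,x} = V_0 \cos\theta$ and $V_{0,y} = V_0 \sin\theta$.]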
$$y = y_0 + V_{0,y}\, t + \frac{1}{2}\, g\, t^2$$
Using the $t_f$ value measured with the stopwatch, which is the time when
y = 0, calculate $V_{0,y}$. The y path of the ball may be calculated now.
3. Next use the final horizontal displacement, xf , to calculate the initial
horizontal velocity. Air resistance will slightly slow the horizontal dis-
placement according to the differential equation [1],
$$\frac{d v_x(t)}{dt} = -0.2\, v_x(t).$$
The solution to the differential equation is
The constant value of 0.2 is an estimated, typical value. The real value
is dependent on other factors such as the wind speed and direction that
we don’t know.
From simple trigonometry, V0 , and θ can be expressed in terms of V0,y
and V0,x . Hint: Use the atan2d function to find θ.
4. Plot the path of the ball with a ’LineWidth’ of 2.
Show the home run wall by plotting a green line with a ’LineWidth’ of
3. Make the wall 10 feet tall at 410 feet from home plate.
Remember to annotate the plot and print in the Command Window the
maximum height, initial velocity, and initial projection angle of the ball.
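For step 3 above, the book's own solution of the differential equation is not reproduced in this excerpt. The following derivation is a sketch under two stated assumptions: the initial condition is $v_x(0) = V_{0,x}$, and the horizontal position starts at $x(0) = 0$.

```latex
% Separable first-order equation with v_x(0) = V_{0,x} (assumed)
\frac{d v_x(t)}{dt} = -0.2\, v_x(t)
\quad\Longrightarrow\quad
v_x(t) = V_{0,x}\, e^{-0.2\, t}

% Integrate the velocity from 0 to t, with x(0) = 0 (assumed)
x(t) = \int_0^t V_{0,x}\, e^{-0.2\, \tau}\, d\tau
     = \frac{V_{0,x}}{0.2} \left( 1 - e^{-0.2\, t} \right)

% Solve for the initial horizontal velocity at the landing time t_f
V_{0,x} = \frac{0.2\, x_f}{1 - e^{-0.2\, t_f}}
```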
Chapter 3
Statistical Data Analysis
Matlab contains many functions for statistical data analysis. The coverage
here focuses on statistical calculations with Matlab.
We will also touch on an important statistics topic later when we discuss
least squares regression in section 5.10.
DOI: 10.1201/9781003271437-3 81
and returns the mean of the data. Outliers (a few values significantly
different than the rest of the data) can move the mean. So it can be a
poor estimator of the center of the data. The median is less affected by
outliers. The symbol for the sample mean of a random variable, X, is
x̄. The symbol for a population mean is µ, which is the expected value,
E(X), of the random variable.
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
$$\mu = E(X)$$
>> v = [31;12;8;29;36];
>> sum(v)/length(v)
ans =
23.2000
>> mean(v)
ans =
23.2000
Standard Deviation
The standard deviation, which is the square root of the variance, is a
measure of how widely distributed a random variable is about its mean.
As shown in figure 3.1, a small standard deviation means that the numbers
are close to the mean. A larger standard deviation means that the numbers
vary quite a bit. For a normal (Gaussian) distributed variable, 68% of the
values are within one standard deviation of the mean, 95% are within
two standard deviations and 99.7% are within three standard deviations.
The symbol for a sample standard deviation is s and the symbol for a
population standard deviation is σ.
We define the difference between a random variable and its mean as
$Y = (X - \bar{x})$. Note that owing to the definition of the mean, $\bar{y} = 0$. To
account equally for the variability of values less than or greater than the
mean, we will use $Y^2$. The maximum likelihood estimator (MLE) of the
variance computes the mean value of $Y^2$.
$$s^2_{MLE} = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right) - \bar{x}^2$$
Median
The median of a data set is the number which is greater than half of the
numbers and less than the other half of the numbers. Although the mean
is a more commonly used metric for the center value of a data set, the
median is often a better indicator of the center. This is because values
that are outliers to the main body of data can skew a mean, but will not
shift the median value.
When the number of items in the vector is odd, the median is just
the center value of the sorted data. When the array has an even number
of items, the median is the average of the two center items of the sorted
data.
Mode
The mode of a data set is the value that appears most frequently. Matlab
has a function called mode. If there is a tie for the number of occurrences
of values, the smaller value is returned. If no values repeat, then the mode
function returns the minimum value of the data set.
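The median and mode rules described above can be confirmed with a few small vectors. The data values are illustrative.

```matlab
v = [31 12 8 29 36];

medOdd = median(v);                 % odd count: center of the sorted data
medEven = median([8 12 29 31]);     % even count: average of the two center items

% One outlier moves the mean far more than the median.
meanOut = mean([v 1000]);
medOut = median([v 1000]);

modeTie = mode([3 1 3 1 2]);        % tie between 1 and 3: smaller value returned
modeNone = mode([5 7 9]);           % no repeats: minimum value returned

disp([medOdd medEven meanOut medOut modeTie modeNone])
```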
Diff
The diff function calculates the difference between successive values
of a vector. Note that the length of the returned vector is one less than
the length of the original vector.
>> diff(v)
ans =
-19
-4
21
7
>> y = movmean(x, k)
Function Description
movmin Moving minimum
movmax Moving maximum
movsum Moving sum
movmean Moving mean
movmedian Moving median
movstd Moving standard deviation
movvar Moving variance
The commands below show how the values are calculated using the
movmean function with a window size of 7. Notice that the first three val-
ues are calculated with a shortened window size. The full window size is used
beginning with the fourth term. The window advances for the first time to
not include the first data value at the fifth term. A shortened window size
is similarly used at the end of the vector. You can change this behavior by
specifying the optional ’Endpoints’ argument. A plot of the original data
and the filtered data from the movmean function is shown in figure 3.2.
>> sum(y(1:4))/4
ans =
2.7774 % y1(1) - startup, short window
>> sum(y(1:5))/5
ans =
3.2687 % y1(2) - startup, short window
>> sum(y(1:7))/7
ans =
4.3319 % y1(4) - first full window mean
>> sum(y(2:8))/7
ans =
5.7502 % y1(5) - window advanced
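The window behavior described above can be checked on a small vector. This is a minimal sketch with made-up data; the names y and y1 follow the listing above.

```matlab
y  = [2 4 1 5 3 8 6 9 7 10];     % hypothetical data
y1 = movmean(y, 7);              % centered window of 7, shortened at the ends

first  = sum(y(1:4))/4;          % y1(1): only 4 values are available
fourth = sum(y(1:7))/7;          % y1(4): first full window
fifth  = sum(y(2:8))/7;          % y1(5): window has advanced by one
```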
FIGURE 3.2: The moving window mean smooths the data fluctuations.
\[ p(0) = P\{X = 0\} = 1 - p, \quad (0 \le p \le 1) \]
\[ p(1) = P\{X = 1\} = p \]
\[ E[X] = p \]
Since X has only two values, zero and one, \( E[X^2] = E[X] = p \).
\[ \mathrm{Var}[X] = p - p^2 = p(1 - p) \]
\[ \mathrm{Var}[X] = n\,p\,(1 - p) \]
For our example of three fair (p = 0.5) coin tosses:
% Expected value
>> E_x = sum((0:3).*p)
E_x =
1.5000
Some experiments may have multiple discrete outcomes that are each
equally likely. The most obvious example is the throw of a six sided die, which
has six possible outcomes each with probability of 1/6.
\[ E[X] = \frac{a + b}{2} \]
\[ \mathrm{Var}[X] = \frac{n^2 - 1}{12} \]
The variables a and b represent minimum and maximum values. The number
n represents how many different outcomes are available.
The Poisson distribution is for random variables that count the number
of events that might occur within a fixed unit, such as a quantity, a spatial
region, or a time span. A special property of Poisson random variables is that
their values do not have a fixed upper limit. Some examples might be the
number of defects per 1,000 items produced, the number of customers that a
bank teller helps per hour, or the number of telephone calls processed by a
switching system per hour. It is denoted as X ∼ Poi(λ). An example Poisson
distribution probability mass plot is shown in figure 3.3 for λ = 3. Note that
0! = 1, so k = 0 is defined in the PMF.
\[ p(k) = P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots, \quad \lambda > 0 \]
\[ E[X] = \lambda \]
\[ \mathrm{Var}[X] = \lambda \]
>> k = 0:10;
>> lambda = 3;
>> p = poi(lambda, k);
>> stem(k, p)
>> title('Poisson Distribution, \lambda = 3')
>> xlabel('k')
>> ylabel('p(k)')
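The poi function used above is not a built-in Matlab function. As a minimal sketch, assuming only base Matlab, the same probabilities can be computed directly from the PMF equation.

```matlab
% Poisson PMF evaluated directly from p(k) = lambda^k * exp(-lambda) / k!
lambda = 3;
k = 0:10;
p = lambda.^k .* exp(-lambda) ./ factorial(k);

total = sum(p);        % slightly less than 1 because the tail k > 10 is omitted
E_x   = sum(k .* p);   % close to lambda
```

Because the values of a Poisson random variable have no fixed upper limit, any finite range of k leaves out a small amount of probability.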
The area under a region of the PDF, a definite integral, defines a proba-
bility. The total area under a PDF must be one. A useful tool for computing
probabilities is a cumulative distribution function (CDF), which tells us the
probability F (a) = P (X < a). The CDF is a function giving the result of
integrating the PDF. A CDF plot is always zero on the left side of the plot
and one to the right.
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1 \]
\[ F(a) = \int_{-\infty}^{a} f(x)\,dx \]
if length(varargin) == 0
a = 0;
b = 1;
else
a = varargin{1};
b = varargin{2};
end
F = zeros(1, length(c));
inRange = (c >= a) & (c <= b);
F(inRange) = (c(inRange) - a)/(b - a);
F(c > b) = 1;
end
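\[ F(c) = \begin{cases} 0, & c < a \\ \dfrac{c - a}{b - a}, & a \le c \le b \\ 1, & c > b \end{cases} \]
\[ E[X] = \frac{a + b}{2} \]
\[ \mathrm{Var}[X] = \frac{(b - a)^2}{12} \]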
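\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & x \le 0 \end{cases} \]
\[ F(a) = \begin{cases} 1 - e^{-\lambda a}, & a > 0 \\ 0, & a \le 0 \end{cases} \]
\[ E[X] = \frac{1}{\lambda} \]
\[ \mathrm{Var}[X] = \frac{1}{\lambda^2} \]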
The normal distribution, also called the Gaussian distribution, models random
variables that occur frequently in nature and society. Random variables
with a normal distribution include measurements (length, width, height, volume,
weight, etc.) of plants and animals, scores on standardized tests, income
levels, air pollution levels, etc. Its probability density function has the familiar
bell-shaped curve. We denote the distribution as \( X \sim N(\mu, \sigma^2) \).
\[ f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}}\; e^{\frac{-(x - \mu)^2}{2 \sigma^2}} \]
To simplify things, we will map the random variable to a standard normal
distribution with zero mean and unit variance, \( Y \sim N(0, 1) \).
\[ y = \frac{x - \mu}{\sigma} \]
\[ f(y) = \frac{1}{\sqrt{2 \pi}}\; e^{\frac{-y^2}{2}} \]
Computing the CDF requires numerical integration because there is not a
closed form integral solution to the PDF. Fortunately, there is a built-in Mat-
lab function that we can use to compute the integral. See the documentation
for functions erf and erfc. The Statistics and Machine Learning Toolbox in-
cludes a function to compute the CDF, but implementing our own function
using erfc is not difficult.
We desire a function called normcdf that behaves as follows for distribution
Y ∼ N (0, 1). A plot of the needed area calculation is shown in figure 3.6. A
plot of the CDF is shown in figure 3.7.
\[ \mathrm{normcdf}(a) = P(Y < a) = F(a) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{a} e^{\frac{-y^2}{2}}\,dy \]
The erfc(b) function gives us the following definite integral.
\[ \mathrm{erfc}(b) = \frac{2}{\sqrt{\pi}} \int_{b}^{\infty} e^{-y^2}\,dy. \]
So, if we multiply by 1/2, change the sign of the definite integral boundary
value to reflect that we are integrating from the boundary to infinity rather
than from negative infinity to the boundary, and divide the boundary value
by \( \sqrt{2} \) because the squared integration variable in the erfc calculation is not
divided by 2 as we need, then the integration will compute an equivalent area
to what we need. The code for a normal distribution CDF is listed in code 3.2.
>> a = linspace(-3,3);
>> plot(a, normcdf(a))
Note that because of the symmetry of the PDF curve about the mean,
P (Y < a) is the same as P (Y > −a).
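\[ \mathrm{normcdf}(a) = F(a) = \frac{1}{2} \left( \frac{2}{\sqrt{\pi}} \int_{-a/\sqrt{2}}^{\infty} e^{-y^2}\,dy \right) = \frac{1}{2}\,\mathrm{erfc}\left(-a/\sqrt{2}\right) \]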
Probability Example:
You learn that the lot of calves to be auctioned at the local livestock
sale have a mean weight of 500 pounds with a standard deviation of 150
pounds. What fraction of the calves likely weigh:
if length(varargin) ~= 0
mu = varargin{1};
sigma = varargin{2};
a = (a - mu)./sigma;
end
F = erfc(-a/sqrt(2))/2;
end
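One way to gain confidence in the erfc form of the CDF is to compare it with a direct numerical integration of the standard normal PDF. The sketch below uses anonymous functions with hypothetical names (F, phi) and the base Matlab integral function; it is a check under those assumptions, not the book's listing.

```matlab
% Standard normal CDF from erfc, following the derivation above.
F = @(a) erfc(-a/sqrt(2))/2;

% Independent check: integrate the standard normal PDF numerically.
phi = @(y) exp(-y.^2/2)/sqrt(2*pi);
a = 1.5;
F_erfc  = F(a);
F_numer = integral(phi, -Inf, a);

% Symmetry noted in the text: P(Y < a) equals P(Y > -a).
P_upper = 1 - F(-a);
```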
rand
The rand(n) function creates an n×n matrix of uniformly distributed random
numbers in the interval (0, 1). rand(m, n) creates an m×n matrix
or vector. To generate n random numbers in the interval (a, b) use the
formula r = a + (b-a)*rand(1, n). A uniform distribution means that
each number is equally likely.
randn
The randn function creates random numbers with a normal (Gaussian)
distribution. The size arguments to randn are the same as for rand.
The values returned from randn have a
mean of zero and variance of one. To generate data with a mean of a and
standard deviation of b, use the equation y = a + b*randn(N,1).
randi
The randi(Max, m, n) function creates an m×n matrix of uniformly distributed
random integers between 1 and a maximum value. Provide the
maximum value as the first input, followed by the size of the array.
randperm
The randperm function returns a row vector containing a random per-
mutation of the integers from 1 to n without repeating elements. One
application of this is to achieve a shuffle operation, like with a deck of
playing cards.
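The four generators can be tried together. This is a minimal sketch; the seed, interval limits, and sizes are arbitrary choices for illustration.

```matlab
rng(1);                          % seed so the run is repeatable (seed value is arbitrary)
r  = 5 + (9 - 5)*rand(1, 1000);  % uniform on (5, 9)
g  = 10 + 3*randn(1000, 1);      % normal, mean 10, standard deviation 3
d  = randi(6, 1, 1000);          % integers 1..6, like rolling a die
deck = randperm(52);             % shuffled card indices 1..52
```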
% File: randGenPlots.m
t = tiledlayout(3, 1);
title(t, 'Random Number Generators')
nexttile;
histogram(20*rand(1, 1000), 40)
title('rand \sim U(0, 20)')
nexttile;
histogram(randi(20, 1, 1000), 20)
title('randi \sim U(1, 20) integers')
nexttile;
histogram(10 + 3*randn(1, 1000), 40)
title('randn \sim N(10, 3^2)')   % mean 10, standard deviation 3
\[ F(a) = \begin{cases} 1 - e^{-\lambda a}, & a > 0 \\ 0, & a \le 0 \end{cases} \tag{3.1} \]
Equation (3.2) shows the inverse of the CDF found by letting y = F (a)
and solving for a. To make an exponential random number generator we use
equation (3.2) and the rand random number generator that has a uniform
distribution in the interval (0, 1). Code 3.4 lists an exponential random number
generator.
function X = randexp(lambda, m, n)
% RANDEXP - Simple exponential distribution random
% number generator. If you have the Statistics
% and Machine Learning Toolbox, use exprnd.
%
% lambda - exponential distribution parameter
% mean(X) ~ 1/lambda, var(X) ~ 1/lambda^2
% Output is size m x n
y = rand(m, n); % y ~ U(0, 1)
y(y == 1) = 1 - eps; % protect from divide by 0
X = log(1./(1-y))./lambda; % inverse of CDF
end
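Letting y = F(a) in equation (3.1) and solving for a gives a = −ln(1 − y)/λ, which is algebraically the same expression used in the function above. The following sketch applies that inverse directly and compares the sample statistics with the expected mean and variance. The seed, λ, and sample count are arbitrary assumptions.

```matlab
rng(7);                         % arbitrary seed for repeatability
lambda = 2;
y = rand(1, 1e5);               % y ~ U(0, 1)
X = -log(1 - y)/lambda;         % inverse of F(a) = 1 - exp(-lambda*a)

m = mean(X);                    % should be near 1/lambda   = 0.5
v = var(X);                     % should be near 1/lambda^2 = 0.25
```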
of the square is 4, while the area of the unit circle is π. A plot of the points
inside and outside of the circle is shown in figure 3.10.
% File: randpi.m
% Script to use a random number generator to estimate the
% value of PI. This is an example of a Monte Carlo simulation.
N = 1000000;
FIGURE 3.10: The ratio of points inside the circle to total points is close
to π/4.
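The script listing is cut short in this text, so the following is only a sketch of how such a Monte Carlo estimate might be written, not the book's randpi.m. The variable names other than N are assumptions.

```matlab
rng(3);                          % arbitrary seed for repeatability
N = 1000000;
pts = 2*rand(N, 2) - 1;          % uniform points in the square [-1, 1] x [-1, 1]
inside = sum(pts.^2, 2) <= 1;    % true where x^2 + y^2 <= 1
piEst = 4*nnz(inside)/N;         % (points inside)/(total points) approximates pi/4
```

With one million points the estimate typically agrees with π to two or three decimal places.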
% File: casinoWalk.m
% Random walk simulation of playing slot machines.
% 100 people each start with 80 quarters ($20).
% They play all 80 quarters and see how much money they
% have in winnings. The slot machines pay out 80% of
% what they take in. Half of the time they use 1/5 odds
% to pay 4 quarters. The other times they use 1/10 odds
% to pay 8 quarters. You should find the average winning
% to be about -4 dollars.
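Only the header comments of the script appear here. The sketch below is one possible vectorized reading of those comments, with all variable names assumed. Each play costs one quarter; half of the plays pay 4 quarters with probability 1/5 and the other half pay 8 quarters with probability 1/10, so the expected return is 0.8 quarters per play.

```matlab
rng(11);                               % arbitrary seed
people = 100;  plays = 80;             % 80 quarters each
ruleA = rand(plays, people) < 0.5;     % which payout rule applies to each play
u = rand(plays, people);               % one draw per play decides win or lose

payout = zeros(plays, people);         % quarters returned on each play
payout(ruleA  & u < 1/5)  = 4;         % 1/5 odds pays 4 quarters
payout(~ruleA & u < 1/10) = 8;         % 1/10 odds pays 8 quarters

net = sum(payout - 1);                 % net quarters per person (each play costs 1)
avgDollars = mean(net)/4;              % expected value is -4 dollars
```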
>> A = randi(25,10,20);
>> mC = mean(A);
>> mr = mean(A,2);
>> whos
Name Size Bytes Class Attributes
Some functions, such as min, max, and diff, use the second argument
for other purposes, which makes dimension the third argument. To skip the
second argument, use a pair of empty square brackets for an empty vector,
[].
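A small example of the dimension argument in the third position. The matrix is made up. For diff, the second argument is the difference order, so 1 is passed here in place of the empty brackets.

```matlab
A = [3 9 1; 7 2 8];          % 2 x 3 example matrix

colMax = max(A);             % max of each column  -> [7 9 8]
rowMax = max(A, [], 2);      % max of each row     -> [9; 8]
rowMin = min(A, [], 2);      % min of each row     -> [1; 2]
rowDiff = diff(A, 1, 2);     % first difference along each row -> [6 -8; -5 6]
```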
\[ \sigma_x^2 = E\left[ (X - \mu_x)^2 \right] \]
\[ s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
Similarly, the covariance between two variables is computed from the products
of the differences between samples and their respective means.
\[ s_{xy} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]
Thus, the covariance between a variable and itself is its variance, \( s_{xx} = s_x^2 \).
Covariance is represented with a symmetric matrix because \( s_{xy} = s_{yx} \). The
variances of each variable will be on the diagonal of the matrix.
For example, consider taking a sampling of the age, height, and weight of
n children. We could construct a covariance matrix as follows.
\[ \mathrm{Covariance}(a, h, w) = \begin{bmatrix} s_{aa} & s_{ah} & s_{aw} \\ s_{ha} & s_{hh} & s_{hw} \\ s_{wa} & s_{wh} & s_{ww} \end{bmatrix} \]
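In Matlab, the cov function builds this matrix when each variable is a column of the input. The data below are synthetic and exist only to make the sketch runnable.

```matlab
rng(5);                                  % arbitrary seed
n = 200;
age    = 4 + 8*rand(n, 1);               % years (made-up data)
height = 80 + 6*age + 4*randn(n, 1);     % cm, grows with age plus noise
weight = 10 + 2.5*age + 3*randn(n, 1);   % kg, grows with age plus noise

C = cov([age, height, weight]);          % 3 x 3 covariance matrix, columns are variables
```

The diagonal of C holds the variances of age, height, and weight, and C equals its own transpose.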
>> boxplot(d')
The box plot function from MathWorks is part of the Statistics and Ma-
chine Learning Toolbox, which is an extra purchase. However, there are a few
free box plot functions available on the MathWorks File Exchange. Some of
those use functions from extra toolboxes. The free Boxplot function [52] is a
simple function that uses only standard Matlab functions.
The boxplot function wants the data to be in a column vector because it
can make several box plots in a figure if each data set is a column of a matrix.
3.7.2 Histogram
A histogram plot divides the data into regions (called bins) and shows how
many values fall into each region. If the size of the data is large, a histogram
plot will begin to take the shape of the PDF.
The histogram function has several possible parameters, but the most com-
mon usage is to pass two arguments—the data and the number of bins to use.
Another useful pair of options is ’Normalization’, ’probability’, which
scales the height of each bin to its probability level. This is a good option if
you want to overlay another histogram or a PDF plot. An example histogram
is shown in figure 3.13.
histogram(d, 40)
random variable X, let X̄i be the sample mean, and let Yi be the sample sum.
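\[ \bar{X}_i = \frac{1}{n} \sum_{j=1}^{n} x_{i,j} \]
\[ Y_i = \sum_{j=1}^{n} x_{i,j} \]
We also define variable Z as follows.
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{Y - n\,\mu}{\sqrt{n}\,\sigma} \]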
>> n = 100;
>> X = randi(6, n); % 100 x 100
>> X_bar = mean(X); % 1 x 100
>> mu = mean(X(:))
mu =
3.4960 % 3.5 expected
>> sigma = std(X(:))
sigma =
1.7025 % sqrt(35/12) = 1.7078 expected
% Make Z ~ N(0, 1)
>> Z = (X_bar - mu)/(sigma/sqrt(n));
>> mean(Z)
ans =
-1.9895e-15
>> var(Z) % Z ~ N(0,1)
ans =
0.9791
have a different mean. However, we can calculate a range of values for which
we have confidence that the actual population mean is within. The central
limit theorem and what we know about the standard normal distribution give
us the tools to do this.
Based on the mapping of the random variable to a standard normal dis-
tribution by the central limit theorem, we establish the Z variable.
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
Then for the estimated mean of the population, we use the sample mean,
population standard deviation, and the size of the sample to find the following
probability relationship, which establishes a confidence interval.
\[ Pr\left( \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu_x < \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = C \]
TABLE 3.2: Z-critical values
We refer to these Z-critical values as \( Z_{\alpha/2} \) because there are regions greater
than and less than the confidence interval where the population mean might
be found. The combination of those two regions is called α. A standard normal
PDF plot marking the confidence interval is shown in figure 3.15.
% File: testConfidence.m
% Find a 95% confidence interval and see how many
% sample means are inside and outside the interval.
n = 100;
X = 50 + 20*randn(50, n); % 50 x 100
X_bar = mean(X);
Za2 = 1.96; % 95% Confidence
mu = mean(X(:)) % population mean
sigma = std(X(:));
% look at one sample
x_bar = X_bar(1);
Zlow = x_bar - Za2*sigma/sqrt(n)
Zhigh = x_bar + Za2*sigma/sqrt(n)
% test all samples
inAlpha = nnz(X_bar < Zlow | X_bar > Zhigh)
inC = nnz(X_bar > Zlow & X_bar < Zhigh)
>> testConfidence
mu =
49.5914
Zlow =
45.2956
Zhigh =
53.5387
inAlpha =
8
inC =
92 % About as expected for a 95% confidence interval
Counterfeit Competition
3.10.1 Z-Test
For the Z-test, we only need to calculate the sample mean of the Ha data
and then calculate the Z variable that maps the sample mean of the treatment
data set to a standard normal distribution.
\[ Z = \frac{\bar{x}_a - \mu_0}{\sigma / \sqrt{n}}, \quad \text{where } H_0 : \mu = \mu_0 \]
FIGURE 3.16: One and two tailed 95% confidence intervals, α = 0.05,
Zα = 1.645, Zα/2 = 1.96.
For the sake of our example, let’s say that the metric of concern is a
Bernoulli event. We obtain 500 each of our competitor’s product and our
product for testing. We split the 500 products into 10 sampling sets of 50
items.
We can use the Z-test for our product since we have a well established
success rate for it. In several tests, our company’s product has consistently
shown a success rate of at least 90%. We just want to verify that our latest
product matches up to our established quality level. We put our successes and
failures in a matrix and compute the mean of each column. According to the
central limit theorem, the means of the sample sets give us a normal distribution
that we can use with both Z-tests and t-tests.
You may wonder about the binomial distribution, which comes from the
sum of successes of Bernoulli events. A binomial distribution is a discrete
facsimile of a normal distribution. The central limit theorem equations can
scale and shift a binomial distribution into a standard normal distribution.
In the simulation test of our product, we find a Z-test score that is well
below any Z-critical value. So we accept the null hypothesis that the tested
samples match the established quality of our product.
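A minimal sketch of such a simulated Z-test follows. It assumes the pass/fail results are Bernoulli trials with the established rate of 0.9, so the standard deviation of one set mean under the null hypothesis is sqrt(0.9 · 0.1/50). The variable names, the seed, and the one-tailed decision rule are assumptions, not the book's code.

```matlab
rng(21);                                 % arbitrary seed
mu0 = 0.9;                               % established success rate (H0: mu = mu0)
X = rand(50, 10) < mu0;                  % simulated pass/fail results, 10 sets of 50
X_bar = mean(X);                         % success rate of each sample set (1 x 10)

sigma = sqrt(mu0*(1 - mu0)/50);          % std. dev. of one set mean under H0
n = numel(X_bar);                        % number of set means
Z = (mean(X_bar) - mu0)/(sigma/sqrt(n));

Zcrit = 1.645;                           % one-tailed 95% critical value
keepH0 = Z > -Zcrit;                     % lower-tail test: reject only if Z is too low
```

Because every set has the same size, this Z is the same value obtained by treating all 500 items as one sample.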
3.10.2 t-Test
In practice, the t-test is much like the Z-test except the sample’s cal-
culated standard deviation (s) is used rather than the population standard
deviation (σ). Because each sample from the population will have a slightly
different standard deviation, we use the Student’s t-distribution, which has a
slightly wider bell-shaped PDF than the standard normal PDF. The distri-
bution becomes wider for smaller sample sizes (degrees of freedom). As the
sample size approaches infinity, the t-distribution approaches the standard
normal distribution. The degrees of freedom (df ) is one less than the sample
size (df = n − 1). The t-distribution PDF plot for three df values is shown in
figure 3.17.
FIGURE 3.17: t-Distribution PDF plots showing two tailed 95% critical
values for infinity, 9, and 3 degrees of freedom.
The T statistic is calculated and compared to critical values to test for
acceptance of a null hypothesis as was done with the Z-test. The critical values
are a function of both the desired confidence level and the degrees of freedom
(sample size). There are several ways to find the critical values. Tables found
on websites or in statistics books list t-distribution critical values. Interactive
websites as well as several software environments can calculate critical values.
Unfortunately, Matlab requires the Statistics and Machine Learning Toolbox
to calculate critical values. However, as with the Z-test, the t-critical values are
constants, so they may be hard-coded into a program to test a null hypothesis.
\[ T = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}, \quad \text{where } H_0 : \mu = \mu_0 \]
>> s = std(X_bar)
s =
0.0382
>> T = (x_bar - 0.9)/(s/sqrt(n))
T =
-3.1425
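The comparison against a hard-coded critical value might look like the sketch below. The ten success rates are made-up numbers, not the book's data, and the critical value 1.833 is the one-tailed 95% table value for 9 degrees of freedom.

```matlab
% Made-up success rates of 10 sample sets (not the book's data)
X_bar = [0.86 0.90 0.84 0.88 0.82 0.90 0.86 0.80 0.88 0.88];
mu0 = 0.9;                          % H0: mu = mu0
n = numel(X_bar);                   % 10 samples, so df = n - 1 = 9

x_bar = mean(X_bar);
s = std(X_bar);                     % sample standard deviation
T = (x_bar - mu0)/(s/sqrt(n));

tcrit = 1.833;                      % one-tailed 95% critical value for df = 9 (t-table)
rejectH0 = T < -tcrit;              % lower-tail test
```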
3.11 Exercises
>> rng(2022)
>> v = 20 + 3*randn(1, 200);
• Search the Internet for hourly temperature data. The National Climatic
Data Center of NOAA has data taken hourly from several weather re-
porting stations. https://www.ncdc.noaa.gov/crn/qcdatasets.html
• The Hourly02 directory has what we want. The folders and files are
organized by year and reporting station. Download the data for the
location and year of your choice.
• The meanings of the fields are given in the documentation. Field 4 is Local
Standard Time (LST) date. Field 5 is the Local Standard Time (LST)
time of the observation. Field 10 is average air temperature in degrees
C for the entire hour.
• Note that data starts based on UTC time, so it includes the last hours
of the previous year.
• Use the Import Data tool. Rename and import fields 4, 5, and 10 to
Date, Hour, and Temp.
• Check for missing data with the ismissing function. If there is missing
data, use the fillmissing function to fill in those values using linear
interpolation.
• Check the minimum and maximum temperatures. You will likely find a
few clusters of -9999 temps from when no reading was taken. We can
replace those values with NaN and again use the fillmissing function
to fill in the missing data.
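The last two steps can be sketched on a short made-up temperature vector. The variable name Temp follows the import step above; the readings themselves are invented.

```matlab
Temp = [10; 11; -9999; -9999; 14; NaN; 16];   % made-up hourly readings, deg C

Temp(Temp == -9999) = NaN;           % mark the sentinel readings as missing
nMissing = nnz(ismissing(Temp));     % 3 missing values
Temp = fillmissing(Temp, 'linear');  % linear interpolation between neighbors
```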
Now for the date and time. The following function returns a datetime
value from the dates and times given:
year = floor(dateNum/10000);
month = floor((dateNum - year*10000)/100);
day = dateNum - year*10000 - month*100;
hour = timeNum/100;
date = datetime(year,month,day,hour,0,0);
end
\[ v(t) = \frac{dy(t)}{dt}. \tag{4.1} \]
The derivative of the velocity is the acceleration, which in this case is a con-
stant from gravity in the negative direction. The acceleration is the second
derivative of the position.
\[ a(t) = \frac{dv(t)}{dt} = \frac{d^2 y(t)}{dt^2} = g \tag{4.2} \]
FIGURE 4.1: A ball is thrown upward from height y0 with an initial velocity
of V0, v(0) = V0. When the ball reaches its maximum height, ymax at time tmax,
its velocity is zero, v(tmax) = 0. The ball hits the ground at time tfinal,
y(tfinal) = 0.
diff(y(t), t, 2) == g
>> v(t) = diff(y(t),t)
v(t) =
diff(y(t), t)
The diff function takes the derivative of the position to find the velocity.
The maximum height of the ball occurs when the velocity is zero. The solve
function finds solutions to algebraic equations.
The ball lands on the ground when y(t) is zero. Two answers are returned
because y(t) is a quadratic equation.
We can give the variables numerical values and substitute (subs) the values
into the equations, and see a decimal version of the number with vpa.
1.0204081632653061224489795918367 % seconds
>> subs(y_max)
ans =
299/49
>> vpa(subs(y_max))
ans =
6.1020408163265306122448979591837 % meters
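The printed results can be cross-checked numerically without the Symbolic Math Toolbox. The values g = −9.8, V0 = 10, and y0 = 1 are not shown in the listing; they are inferred here from the printed answers and should be treated as assumptions.

```matlab
g = -9.8;  V0 = 10;  y0 = 1;        % assumed values (inferred, not stated above)

% Integrating a(t) = g twice gives y(t) = y0 + V0*t + (g/2)*t^2.
t_max = -V0/g;                       % v(t) = V0 + g*t = 0
y_max = y0 + V0*t_max + (g/2)*t_max^2;

t_roots = roots([g/2, V0, y0]);      % y(t) = 0 is quadratic: two answers
t_final = max(t_roots);              % keep the positive (physical) root
```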
>> syms x y
>> f = (2*x^2 + 3*y) * (x^3 + x^2 + x*y^2 + 2*y);
>> collect(f,x)
ans =
2*x^5 + 2*x^4 + (2*y^2 + 3*y)*x^3 + 7*y*x^2 + 3*y^3*x
+ 6*y^2
>> collect(f,y)
ans =
3*x*y^3 + (2*x^3 + 6)*y^2 + (3*x^3 + 7*x^2)*y
+ 2*x^2*(x^3 + x^2)
4.2.2 Factor
The factor function finds the set of irreducible polynomials whose product
forms the higher order polynomial argument to factor. The factoring
operation is useful for finding the roots of the polynomial.
ans =
[ x - 2, x - 3, x + 5, x + 3]
4.2.3 Expand
The expand function multiplies polynomial products to get a single poly-
nomial.
4.2.4 Simplify
The simplify function generates a simpler form of an expression.
>> g
g =
x^4 + 3*x^3 - 19*x^2 - 27*x + 90
>> h = expand((x-2)*(x+5))
h =
x^2 + 3*x - 10
>> k = g/h
k =
(x^4 + 3*x^3 - 19*x^2 - 27*x + 90)/(x^2 + 3*x - 10)
>> simplify(k)
ans =
x^2 - 9
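The same factor, expand, and simplify results can be reproduced numerically with coefficient vectors. This sketch uses the base Matlab functions conv, deconv, and roots rather than the Symbolic Math Toolbox; the variable names are chosen for illustration.

```matlab
% Coefficient vectors, highest power first
g = [1 3 -19 -27 90];             % x^4 + 3x^3 - 19x^2 - 27x + 90
h = conv([1 -2], [1 5]);          % (x - 2)(x + 5) = x^2 + 3x - 10

[k, remainder] = deconv(g, h);    % polynomial division g/h
r = sort(roots(g));               % numeric roots, matching the factors of g
```

Here k holds the coefficients of x^2 − 9 and the remainder is zero, which agrees with the simplify result.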
4.2.5 Solve
Finding the roots of equations, i.e., x, where f (x) = 0, is addressed here
and also in the context of two other topics.
• The roots function described in section 6.3.2 finds the numerical roots
of a polynomial using a linear algebra algorithm.
• The fzero function described in section 7.1.1 uses numerical methods
to find the roots of any equation of real numbers.
>> solve(g(x) == 4, x)
ans =
5^(1/2)
-5^(1/2)
>> h(x) = x + 2;
>> solve(g(x) == h(x), x)
ans =
1/2 - 13^(1/2)/2
13^(1/2)/2 + 1/2
4.2.6 Subs
The subs(s) function returns a copy of s after replacing the symbolic vari-
ables in s with their values from the Matlab workspace. The subs function
was demonstrated in section 4.1.
4.2.7 Vpa
The Symbolic Math Toolbox tries to keep variables as exact symbolic values
rather than as decimal approximations. The vpa function converts symbolic
expressions to decimal values. The name vpa is an acronym for variable-precision
floating-point arithmetic. As the name implies, the vpa(x, d) function evaluates
x to at least d digits. The default number of digits is 32. The maximum number
of digits is 2^29.
>> syms x
>> diff(x^2 + 3*x + 2, x)
ans =
2*x + 3
>> diff(x^2 + 3*x + 2, x, 2) % 2nd derivative
ans =
2
>> diff(x*sin(3*x))
ans =
sin(3*x) + 3*x*cos(3*x)
>> diff(exp(-x/2))
ans =
-exp(-x/2)/2
>> syms x t
>> int(x^2 + 3*x + 2, x)
ans =
(x*(2*x^2 + 9*x + 12))/6
>> expand(int(x^2 + 3*x + 2, x))
ans =
x^3/3 + (3*x^2)/2 + 2*x
>> int(exp(-x/2),x,0,t)
ans =
2 - 2*exp(-t/2)
When the SMT is not able to find an analytic solution, it may return a
result that makes use of numerical integration methods. We also encountered
the erf and erfc functions in section 3.4.2.
>> limit(sin(x)/x, x, 0)
ans =
1
>> limit(x*exp(-x), x, Inf)
ans =
0
>> syms h s x
>> s = ((x + h)^2 - x^2)/h;
>> limit(s, h, 0)
ans =
2*x
The example above shows two important points about differential equa-
tions. First, the solution contains an exponential function of the irrational
number e. This is because \( e^{a t} \) is the only function for which the derivative is
a multiple of the original function. Specifically,
\[ \frac{d\, e^{a t}}{dt} = a\, e^{a t}. \tag{4.4} \]
Thus, exponential functions of e show up as the solution to any differential
equation of growth or decay where there is a linear relationship between the
rate of change and the function’s value.
Secondly, the solution to equation (4.4) contains a constant, C4. This is
normal when an initial condition or boundary condition was not specified.
With an initial condition, we can either solve for C4 algebraically, or use the
condition as an input to dsolve. For example, let’s say that Y (0) = 5. From
our solution, Y (0) = C4, thus C4 = 5. The following code uses the SMT to
solve a first order differential equation like equation (4.4) with exponential
decay. The solution is plotted in figure 4.2.
Initial or boundary conditions may also be passed to dsolve as an extra
argument. When there is more than one condition, as is common for second
or higher order differential equations, the conditions are passed as an array.
See the section 4.1 example for a second order differential equation example
with two initial conditions.
Note: As with function handles, the results computed by the SMT that are
functions may be plotted with the fplot function (section 2.2.10).
>> a = 0.005;
>> y(t) = subs(Y(t))
y(t) =
5*exp(-t/200)
>> fplot(y(t), [0 1000])
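As a numerical cross-check of the symbolic solution, the same decay problem can be integrated with ode45. The sketch assumes the differential equation has the form dY/dt = −a Y with Y(0) = 5, which is consistent with the printed solution 5*exp(-t/200).

```matlab
a = 0.005;  Y0 = 5;                       % decay rate and initial condition from the example
dYdt = @(t, Y) -a*Y;                      % assumed form of the ODE: dY/dt = -a*Y
[t, Y] = ode45(dYdt, [0 1000], Y0);       % numerical solution

Y_exact = Y0*exp(-a*t);                   % same as 5*exp(-t/200)
maxErr = max(abs(Y - Y_exact));           % small if the two solutions agree
```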
4.5 Exercises
6. Simplify the equation \( \dfrac{x^4 - 5x^3 + 20x - 16}{x^2 - 4} \)
12. Find the limit \( \lim_{x \to 2} \dfrac{x^2 - 4}{x - 2} \)
values of points are always relative to the origin. Vectors may be defined as
spanning between two points (v = p1 − p2 ); and one of the points may be the
origin. Figure 5.1 shows the relationship of points and vectors relative to the
coordinate axes.
p1 = (2, 6)
v = p1 − p2 = (−9, 5)
p2 = (11, 1)
FIGURE 5.1: The elements of points are the coordinates relative to the
axes. The elements of vectors tell the length of the vector for each dimension.
Note that vectors are usually stored as column vectors in Matlab. One
might accurately observe that the data of a vector is a set of numbers, which
is neither a row vector nor a column vector [48]. However, vectors in Matlab
are stored as either row vectors or column vectors. The need to multiply a
vector by a matrix using inner products (defined in section 5.1.4) motivates
toward using column vectors. Because column vectors take more space when
displayed in written material, they are often displayed in one of two alternate
ways: either as the transpose of a row vector [a, b, c]^T or with parentheses
(a, b, c).
Note: The word vector comes from the Latin word meaning “carrier”.
[Figure: example vectors v = (2, 3), u = (8, 2), w = (4, 2), and the scaled vector 2 w = (8, 4).]
The first three vectors of the set {w, x, and y} are not independent because
y = 2 w + x. Likewise, w = (1/2)(y − x) and x = y − 2 w. However, z is
independent of the other three vectors.
Sometimes we can observe dependent relationships, but it is often difficult
to see the relationships. Matlab has a few functions that will help one test for
independence and even see what the dependent relationships are. Tests for in-
dependence include the rank and det functions discussed in section 5.2.6. The
rref function (section 5.5.3) and cr function show the relationships between
dependent vectors. The cr function is not part of Matlab, but is described
in section 5.4.3.
Appendix A.2.2 gives a more mathematically formal definition of linearly
independent vectors.
5.1.3 Transpose
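The transpose of either a vector or matrix swaps its rows and columns.
\[ \text{Let } \mathbf{a} = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \text{ then, } \mathbf{a}^T = \begin{bmatrix} a & b & c \end{bmatrix} \]
\[ \text{Let } \mathbf{b} = \begin{bmatrix} a & b & c \end{bmatrix} \text{ then, } \mathbf{b}^T = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]
\[ \text{Let } \mathbf{C} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \text{ then, } \mathbf{C}^T = \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix} \]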
Inner Product
An inner product is the sum of products between a row vector and a
column vector. The multiply operator (*) in Matlab performs an inner
product. Thus to calculate a dot product, we only need to multiply the
transpose of the first vector by the second vector. The resulting scalar is
the dot product of the vectors.
For example,
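\[ a = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \qquad b = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \]
\[ a \cdot b = a^T b = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 1 \cdot 3 + 2 \cdot 4 = 11 \]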
The inner product is also used to calculate each value of the prod-
uct between matrices. Each element of the product is an inner product
between a row of the left matrix and a column of the right matrix.
Matlab has a dot function that takes two vector arguments and returns
the scalar dot product. However, it is often just as easy to implement a dot
product using inner product multiplication.
Note: If the vectors are both row vectors (nonstandard), then the dot product
becomes \( a \cdot b = a\, b^T \).
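The worked example above can be evaluated three equivalent ways in Matlab, plus the row-vector variant from the note. The names d1 through d4 are just labels for this sketch.

```matlab
a = [1; 2];
b = [3; 4];

d1 = a'*b;          % inner product: row vector times column vector
d2 = dot(a, b);     % built-in dot product
d3 = sum(a.*b);     % element-wise products, then sum

ar = a';  br = b';  % row-vector versions (nonstandard)
d4 = ar*br';        % transpose the second vector instead
```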
Matlab has a function called norm that takes a vector as an argument and
returns the length. The Euclidean length of a vector is called the \( l_2 \)-norm,
which is the default for the norm function. See appendix A.1 for information
on other vector length measurements.
A normalized or unit vector is a vector of length one. A vector may be
normalized by dividing it by its length.
\[ \hat{v} = \frac{v}{\| v \|} \]
FIGURE 5.3: The dot product between two unit vectors is the cosine of the
angle between the vectors. [Figure labels: u = (1, 0), v = (cos θ, sin θ),
w = (cos α, sin α), z = (cos β, sin β).]
When the vectors are not unit vectors, the vector lengths factor out as con-
stants. A unit vector is obtained by dividing the vector by its length.
\[ \frac{v}{\| v \|} \cdot \frac{w}{\| w \|} = \cos \theta \]
\[ | v \cdot w | \le \| v \| \, \| w \| \]
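Both relationships can be checked with a pair of example vectors. The vectors and names below are chosen for illustration; acosd returns the angle in degrees.

```matlab
v = [3; 0];
w = [2; 2];

v_hat = v/norm(v);                  % unit vectors (length one)
w_hat = w/norm(w);
theta = acosd(v_hat'*w_hat);        % angle in degrees, 45 for these vectors

holdsCS = abs(v'*w) <= norm(v)*norm(w);   % |v . w| <= ||v|| ||w||
```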
Perpendicular Vectors
FIGURE 5.5: A rhombus is a parallelogram whose sides are the same length.
The dot product can show that the diagonals of a rhombus are perpendicular.
[Figure labels: sides u and v, diagonals u + v and u − v.]
Section 5.2.3.2 lists transpose properties including the transpose with respect
to addition, \( (a + b)^T = a^T + b^T \). We also need to use the fact that the
sides of a rhombus are the same length. We begin by setting the dot product
of the diagonals equal to zero.
\[ \begin{aligned} (u + v)^T (u - v) &= 0 \\ (u^T + v^T)(u - v) &= 0 \\ u^T u - u^T v + v^T u - v^T v &= 0 \\ u^T u &= v^T v \end{aligned} \tag{5.3} \]
Each term of equation (5.3) is a scalar and dot products are commutative,
so the middle two terms cancel each other and we are left with the requirement
that the lengths of the sides of the rhombus be the same, which is in the
definition of a rhombus.
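A quick numeric instance of this result, using two made-up side vectors of equal length:

```matlab
u = [2; 1];
v = [1; 2];                       % same length as u, so u and v form a rhombus

sameLength = abs(norm(u) - norm(v)) < 1e-12;
d = (u + v)'*(u - v);             % dot product of the diagonals, 0 when perpendicular
```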
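[Figure: wall-following geometry with points p1 and p2 on the wall, vectors a, b, and c,
the unit vector b̂ scaled by x, and the short-term goal.]
\[ \text{goal} = p_2 - k\, \frac{c}{\| c \|} \]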
We want to identify the vector from the terminal point of a to the terminal
point of c as b̂ x where x is a scalar and b̂ is a unit vector in the direction
of b.
>> p1 = [0; 50];
>> p2 = [40; 40];
>> b = p2 - p1;
>> bhat = b/norm(b);
>> a = p1;
>> c = a - bhat*(bhat'*a);
>> k = 50;
>> goal = p2 - k*c/norm(c)
goal =
27.873
-8.507
FIGURE 5.7: Wall-following example. [Figure labels: Wall, p1 = (0, 50),
p2 = (40, 40), goal = (27.9, −8.5).]
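\[ \hat{b} = \frac{b}{\| b \|} \]
\[ c = a + \hat{b}\, x \tag{5.4} \]
We find the answer by starting with the orthogonality requirement between
vectors c and b̂.
\[ \begin{aligned} \hat{b}^T (a + \hat{b}\, x) &= 0 \\ \hat{b}^T a + \hat{b}^T \hat{b}\, x &= 0 \\ x &= -\hat{b}^T a \end{aligned} \]
Since b̂ is a unit vector, \( \hat{b}^T \hat{b} = 1 \) and drops out of the equation. We return to
equation (5.4) to find the equation for c.
\[ c = a - \hat{b}\, \hat{b}^T a \tag{5.5} \]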
Figure 5.7 shows an example starting with the sensor measurements for
points p1 and p2. Finding the vector c perpendicular to the wall with
equation (5.5) makes finding the short-term goal location quite simple. We are
using sensors on the robot at ±90° and ±45°, and the coordinate frame is
that of the robot, so p1 is on the y axis and p2 is on a line at ±45° from the
robot. The robot begins 50 cm from the wall.
outer products.
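\[ u \otimes v = u\, v^T = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_m \end{bmatrix} \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} u_1 v_1 & u_1 v_2 & \cdots & u_1 v_n \\ u_2 v_1 & u_2 v_2 & \cdots & u_2 v_n \\ \vdots & \vdots & \ddots & \vdots \\ u_m v_1 & u_m v_2 & \cdots & u_m v_n \end{bmatrix} \]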
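In Matlab, an outer product is a column vector multiplied by a row vector. A minimal sketch with made-up vectors:

```matlab
u = [1; 2; 3];        % m = 3
v = [4; 5];           % n = 2

P = u*v';             % outer product, 3 x 2: P(i,j) = u(i)*v(j)
r = rank(P);          % an outer product of nonzero vectors has rank 1
```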
the number of basis vectors. In appendix A.2, we give a more precise definition
of dimension as the number of basis vectors in the vector space.
In figure 5.9, a custom \( \mathbb{R}^2 \) vector space has basis vectors as the columns of
the W matrix. Even though the basis vectors are in \( \mathbb{R}^3 \) Cartesian coordinates,
the vector space is 2-dimensional since there are two basis vectors and the span
of the vector space is a plane.
Notice that the two basis vectors are orthogonal to each other. Since the
basis vectors are in the columns of W, \( W^T W = I \) shows us that the columns
are unitary (length of one and orthogonal to each other).
Other vector spaces may also be used for applications not relating to geometry
and may have higher dimension than 3. Generally, we call this \( \mathbb{R}^n \).
For some applications, the coefficients of the vectors and scalars may also be
complex numbers, which is a vector space denoted as \( \mathbb{C}^n \).
2 The meaning of the SVD factors is not our concern at this point. In this application, it
is only an example factoring. However, the SVD was chosen because the three factors have
the interesting property of rotating, scaling, and then rotating points or vectors again.
3 The factors of the SVD are A = U Σ V^T. In this example, V is symmetric, so V = V^T.
p = [1;0];
R = [0.7071 -0.7071;
0.7071 0.7071];
[U,S,V] = svd(2*R);
p1 = V'*p; % (-1, 0)
p2 = S*p1; % (-2, 0)
p3 = U*p2; % (1.4, 1.4)
FIGURE 5.10: The first matrix multiplication rotates the vector. The second
multiplication scales the vector. The third multiplication rotates the vector
again. The three successive multiplications achieve the same result as multi-
plying the vector by 2 R.
\[ \underset{m \times p}{A} \times \underset{p \times n}{B} = \underset{m \times n}{AB} \]
Diagonal Matrix
A diagonal matrix has zeros everywhere except on the main diagonal,
which consists of the elements where the row index and column index are the
same. Diagonal matrices are usually square (same number of rows and
columns), but they may be rectangular with extra rows or columns of
zeros.
\[ \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix} \]
The main diagonal, also called the forward diagonal, or the major diago-
nal, is the set of diagonal matrix elements from upper left to lower right.
The set of diagonal elements from lower left to upper right is of signifi-
cantly less interest to us, but has several names including the antidiagonal,
back diagonal, secondary diagonal, or the minor diagonal.
Identity Matrix
An identity matrix (I) is a square, diagonal matrix where all of the ele-
ments on the main diagonal are one. Identity matrices are like a one in
scalar math. That is, the product of any matrix with the identity matrix
yields itself.
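\[ A\, I = A = I\, A \]
\[ I_{2 \times 2} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
\[ I_{3 \times 3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
\[ I_{4 \times 4} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]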
Matlab has a function called eye that takes one argument for the matrix
size and returns an identity matrix.
Symmetric Matrix
A symmetric matrix (S) has symmetry relative to the main diagonal. If
the matrix were written on a piece of paper and you folded the paper
along the main diagonal, then the off-diagonal elements with the same
value would lie on top of each other. For symmetric matrices, $S^T = S$.
Note that for any matrix, A, the products $A^T A$ and $A A^T$ both yield
symmetric matrices. Here are a couple of examples of symmetric matrices.
$$
\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} \qquad
\begin{bmatrix} 2 & 3 & 6 \\ 3 & 1 & 5 \\ 6 & 5 & 2 \end{bmatrix}
$$
Orthonormal Matrix
A matrix, Q, is called orthonormal if the columns are unit vectors (length
of 1) and the dot product between the columns is zero (cos(θ) = 0). That
is to say, the columns are all orthogonal to each other.
$$
\begin{bmatrix}
(0.5^2 + 0.866^2) & (0.5 \cdot 0.866 - 0.866 \cdot 0.5) \\
(0.866 \cdot 0.5 - 0.5 \cdot 0.866) & (0.866^2 + 0.5^2)
\end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$
Note: Orthogonal matrices may seem a bit abstract at this point because
we have not yet encountered an application for them. We need an
orthogonal matrix to provide the basis vectors for vector projections,
which is described in section 5.9.3.2. Rotation matrices discussed in
section 5.3.1 are orthogonal. We can obtain orthogonal columns from
an existing matrix by using the modified Gram–Schmidt algorithm
(appendix A.4.3), QR factorization (appendix A.5), or from the SVD
(appendix A.4 and section 6.8.5.2). Orthogonal matrices are also the
result from finding the eigenvectors of symmetric matrices, which is
proven in appendix A.9.
$$ A^{-1} A = A A^{-1} = I $$
Not all square matrices have an inverse and calculating the inverse for larger
matrices is slow, requiring significant calculations.
In Matlab, the function inv(A) returns the inverse of matrix A. Mat-
lab’s left-divide operator provides a more efficient and numerically stable
alternative to using a matrix inverse to solve a system of equations. So while
matrix inverses frequently appear in written documentation, we seldom cal-
culate a matrix inverse.
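The comparison between inv and the left-divide operator can be sketched as follows. The system used here is an illustrative assumption. Both approaches should agree to within round-off, but the left-divide form avoids forming the inverse.

```matlab
% Solve A x = b two ways: with an explicit inverse and with left-divide.
A = [2 -3 0; 4 -5 1; 2 -1 -3];     % illustrative system (an assumption)
b = [3; 7; 5];
x_inv = inv(A)*b;                  % forms the inverse first (slower)
x_div = A\b;                       % left-divide, the preferred form
difference = norm(x_inv - x_div);  % expected to be near round-off level
```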
$$ Q^T = Q^{-1} $$
Here is an example of an orthogonal rotation matrix.
>> Q
Q =
0.5000 -0.8660
0.8660 0.5000
>> Q'
ans =
0.5000 0.8660
-0.8660 0.5000
5.2.4 Determinant
The determinant of a square matrix is a number. At one time, determinants
were a significant component of linear algebra. But that is not so much the
case today [69]. The importance of determinants is now more for analytical
than computational purposes. Thus, while it is important to know what a
determinant is, we will avoid using them when possible.
number of calculations needed relative to the size of the data operated on, n. If we say
that an algorithm has run time complexity of $O(n^3)$, we are not saying that $n^3$ is the exact
number of calculations needed to run the algorithm. Rather, we are saying that the order
of magnitude of the number of calculations needed has a cubic relationship to n. So if the
number of data elements is doubled, then the number of calculations needed grows by a
factor of 8, $O((2n)^3)$.
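$$
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}
= a \begin{vmatrix} e & f \\ h & i \end{vmatrix}
- b \begin{vmatrix} d & f \\ g & i \end{vmatrix}
+ c \begin{vmatrix} d & e \\ g & h \end{vmatrix}
$$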
Another approach to remembering the sum of products needed to compute
a 3×3 determinant is to compute diagonal product terms. The terms going
from left to right are added while the right to left terms are subtracted. Note:
This only works for 3×3 determinants.
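Write the first two columns again to the right of the matrix to see the diagonals:

$$
\begin{array}{ccc|cc}
a & b & c & a & b \\
d & e & f & d & e \\
g & h & i & g & h
\end{array}
$$

$$
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}
= aei + bfg + cdh - ceg - afh - bdi
$$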
For a 4×4 determinant, the pattern continues with each cofactor term
now being a 3×3 determinant. The pattern likewise continues for higher order
matrices. The number of computations needed with this method of computing
a determinant is on the order of n!, which is prohibitive for large matrices.
Notice that the signs of the cofactor additions alternate according to the
pattern:

$$
\text{sign}_{i,j} = (-1)^{i+j} =
\begin{bmatrix}
+ & - & + & - \\
- & + & - & + \\
+ & - & + & - \\
- & + & - & +
\end{bmatrix}
$$
It is not necessary to always use the top row for Laplace expansion. Any row
or column will work, just make note of the signs of the cofactors. The best
strategy is to apply Laplace expansion along the row or column with the most
zeros.
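A minimal sketch of Laplace expansion along the top row of a 3×3 matrix follows. The matrix is an illustrative assumption, and det is used only for the 2×2 minors and as a cross-check.

```matlab
% Laplace (cofactor) expansion along row 1 of a 3x3 matrix.
A = [2 -3 0; 4 -5 1; 2 -1 -3];            % illustrative matrix (an assumption)
d = 0;
for j = 1:3
    cols = [1:j-1, j+1:3];                % columns kept for the minor
    minor = A(2:3, cols);                 % delete row 1 and column j
    d = d + (-1)^(1 + j) * A(1, j) * det(minor);   % signed cofactor term
end
d_check = det(A);                         % built-in result for comparison
```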
Here is a bigger determinant problem with lots of zeros, so it is not too bad.
At each stage of the expansion there is a row or column with only one nonzero
element, so it is not necessary to find the sum of cofactor determinants. Trace
If you work much with Matlab, you will occasionally run across a warn-
ing message saying that a matrix is close to singular. The message may also
reference a rcondition number. The condition number and rcondition number
are metrics indicating if a matrix is invertible, which we will describe in sec-
tion 5.5.4.3. For our discussion here, we will focus on the rank as our invertible
test.
FIGURE 5.11: Right Hand Rule: u × v points in the direction of your right
thumb when the fingers curl from u to v.
• The length ∥v × u∥ = ∥u∥ ∥v∥ | sin θ|, where θ is the angle between u
and v.
$$
u \times v = \begin{vmatrix} \vec{i} & \vec{j} & \vec{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}
$$
Vectors ⃗i, ⃗j, and ⃗k are the unit vectors in the directions of x, y, and z of
the 3-D axis.
$$ [u]_\times^T = -[u]_\times $$
$$ u \times v = [u]_\times \, v $$
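The two relations above can be checked numerically. This sketch assumes the usual skew-symmetric form for $[u]_\times$ and uses illustrative vectors; the variable name Ux is hypothetical.

```matlab
% Cross product as a matrix-vector product with a skew-symmetric matrix.
u = [1; 2; 3];                     % illustrative vectors (assumptions)
v = [4; 5; 6];
Ux = [  0    -u(3)   u(2);
       u(3)    0    -u(1);
      -u(2)   u(1)    0  ];        % [u]x, the skew-symmetric matrix of u
w_matrix = Ux*v;                   % [u]x v
w_cross  = cross(u, v);            % built-in cross product
is_skew  = isequal(Ux', -Ux);      % transpose equals the negative
```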
FIGURE 5.12: Point p = (a, b) is rotated by θ to p′ = (a′, b′).
FIGURE 5.13: Rotating the axis allows us to see the coordinates of the
rotated point as scalar multiples of the rotated unit vectors x̂′ = (cos θ, sin θ)
and ŷ′ = (− sin θ, cos θ).
$$ p = a\,\hat{x} + b\,\hat{y} $$
$$ p' = a\,\hat{x}' + b\,\hat{y}' $$
1. The columns define basis vectors for the rotated coordinate frame.
2. The matrix is orthogonal (square with unit length, orthogonal columns).
3. $R(\theta)^{-1} = R(-\theta) = R^T(\theta)$.
4. The determinant, det(R(θ)) = 1, ∀θ, thus R is never singular.
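The listed properties can be verified numerically. This is a small sketch for one illustrative angle, assuming the standard 2-D rotation matrix form.

```matlab
% Numerical check of the rotation matrix properties for one angle.
theta = pi/3;                                      % illustrative angle (an assumption)
R    = [cos(theta) -sin(theta); sin(theta) cos(theta)];
Rneg = [cos(-theta) -sin(-theta); sin(-theta) cos(-theta)];
err_orth = norm(R'*R - eye(2));   % orthogonal: R'R = I
err_inv  = norm(inv(R) - R');     % inverse equals transpose
err_neg  = norm(Rneg - R');       % R(-theta) equals R(theta)'
d = det(R);                       % determinant should be 1
```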
The left to right reading of equation (5.7) might lead one to say that the
translation occurs before the rotation. We say that rotation occurs first
because the rotation matrix is immediately left of the point location. So
it is the multiplication of the point with the rotation matrix that moves
the point first. Since multiplication of matrices is not commutative, the
rotation matrix must be immediately left of the point location to get the
desired result.
$$
p' = \underbrace{T_t(x_t, y_t)\; \overbrace{R(\theta)\, p}^{\text{first product}}}_{\text{second product}}
$$
the rotation and translation matrices are constructed with a sequence of com-
mands. Peter Corke’s Spatial Math Toolbox provides functions that expedite
the effort [16].5
5 You might also be interested in his toolboxes for robotics and machine vision.
>> % transformation of p
>> p2 = T*p1
p2 =
1.6340
2.3660
1.0000
>> p_new = p2(1:2) % Cartesian coordinates
p_new =
1.6340
2.3660
FIGURE 5.16: The pose of the end effector is represented by the coordinate
frame {E}, which is defined relative to the world frame {W} by the homogeneous
coordinate transformation ${}^{W}T_E = R(\theta)\,T_x(a)$.
translation is only along the x axis. If a rigid body also has a displacement in
the direction of the y axis, we can either use a second translation matrix, or
specify the translation matrix as having both x and y components.
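$$
{}^{W}T_E =
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} \cos\theta & -\sin\theta & a\cos\theta \\ \sin\theta & \cos\theta & a\sin\theta \\ 0 & 0 & 1 \end{bmatrix}
$$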
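As a quick check of the product above, this sketch builds the rotation and translation matrices for illustrative values of θ and a (both assumptions) and compares the product with the closed form.

```matlab
% Homogeneous transform: rotation by theta followed by translation along x.
theta = pi/6;  a = 2;                     % illustrative values (assumptions)
R  = [cos(theta) -sin(theta) 0;
      sin(theta)  cos(theta) 0;
      0           0          1];
Tx = [1 0 a; 0 1 0; 0 0 1];
T_WE = R*Tx;                              % pose of frame {E} in the world frame
expected = [cos(theta) -sin(theta) a*cos(theta);
            sin(theta)  cos(theta) a*sin(theta);
            0           0          1];
origin_E = T_WE*[0; 0; 1];                % origin of {E} in world coordinates
```

The origin of {E} lands at (a cos θ, a sin θ), which is the last column of the product.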
FIGURE 5.17: A coordinate frame for each joint is created by multiplication
of transform matrices.
For serial-link robot arms, we define a coordinate frame for each joint of the
robot. As shown in figure 5.17, the composite coordinate frames are products
of the transformation matrices from the world coordinate frame to each joint.
If we know the position of a point relative to a transformed coordinate
frame, then the point’s location in the world coordinate frame is the prod-
uct of the transformation matrices and the point location in the transformed
coordinate frame. However, if we know a point location in the world coordi-
nate frame and want its location in another coordinate frame, we multiply the
point location by the inverse of the transformation.
% point in B to World
>> p1_B = [1;1;1]; % homogeneous
>> p1_W = B_frame*p1_B
p1_W =
11.0161
6.1306
1.0000
% point in World to B
>> p2_W = [12;7;1];
>> p2_B = inv(B_frame)*p2_W
p2_B =
2.1754
1.5851
1.0000
5.4.1 An Example
The following system of equations has three unknown variables.
$$
\begin{aligned}
2x_1 - 3x_2 \phantom{{}+ x_3} &= 3 \\
4x_1 - 5x_2 + x_3 &= 7 \\
2x_1 - x_2 - 3x_3 &= 5
\end{aligned}
$$
We can solve equation (5.8) with the inv function, but Matlab provides
a more efficient way to solve it.
>> A = [2 -3 0; 4 -5 1; 2 -1 -3];
>> b = [3 7 5]';
>> x = A\b
x =
3.0000
1.0000
0.0000
6 Dr. Gilbert Strang is widely considered to be a leading authority and advocate for
[R,j] = rref(A);
r = length(j); % r = rank.
R = R(1:r,:);
C = A(:,j);
end
Rank
The rank, r, of a matrix is the number of independent rows and columns of
the matrix. Rank is never more than the smaller dimension of the matrix.
Sometimes we can look at a matrix and observe if some rows or columns
are linear combinations of other rows or columns. The CR factoring helps
us to see those relationships. The rank function tells us the rank for
comparison to the size of the matrix. A square matrix is full rank when
the rank is equal to the matrix size. Code 5.1 shows one way to compute
the rank of a matrix, which is the number of nonzero pivot variables
from the elimination algorithm, which are the diagonal elements of R. See
appendix A.2.3 and section 6.8.5.3 for more information on the calculation
of a matrix’s rank.
Now we change the A matrix making the third column a linear combination
of the first two columns. The rank of the matrix is then 2. We can see the rank
from the CR factorization as the number of columns in C and the number
of rows in R. The last column of R shows how the first two columns of C
combine to make the third column of A.
When a square matrix is not full rank, we say that the matrix is singular.
Singular matrices do not have a matrix inverse, and the left-divide operator
will not find a solution to the system of equations.
When a matrix has fewer rows than columns, we say that the matrix is
under-determined. The rank of under-determined matrices is less than or equal
to the number of rows, r ≤ m. The CR factorization will show the columns
beyond the rank as linear combinations of the first r columns.
R =
1.0000 0 0 -2.6456
0 1.0000 0 3.0127
0 0 1.0000 1.2278
When a matrix has fewer columns than rows, we say that the matrix is
over-determined. If the columns of A are independent, the CR factorization
looks the same as that of the full rank, square matrix—all of the columns of
A are in C.
>> A = randi(10, 4, 3)
A =
    10     5     7
     5    10     1
     9     8     9
     2    10    10
>> [C, R] = cr(A)
C =
    10     5     7
     5    10     1
     9     8     9
     2    10    10
R =
     1     0     0
     0     1     0
     0     0     1
As shown in figure 5.18, the point where the lines intersect is the solution to
the system of equations. Two parallel lines represent a singular system that
does not have a solution. If we have only one equation, we have an under-
determined system with no unique solution.
If another row (equation) is added to the system making it over-
determined, there can still be an exact solution if the new row is consistent
with the first two rows. For example, we could add the equation 3x + 2y = 7
to the system and the line plot for the new equation will intersect the other
lines at the same point, so x = 1; y = 2 is still valid. We could still say that b
is in the column space of A. Of course, most rows that might be added will
not be consistent with the existing rows, so then we can only approximate a
solution, which we will cover in section 5.9.
Square matrix systems that are not full rank are called singular and
do not have a solution.
This example does not have a solution.
>> A = [1 2 3; 0 3 1; 1 14 7]
A =
1 2 3
0 3 1
1 14 7
Under-determined case:
When m < n, there are not enough equations, and no unique solution
exists. In this case, the A matrix is called under-determined. Although
a unique solution does not exist, it is possible to find an infinite set of
solutions, such as all points on a line or a plane.
Over-determined case:
When m > n, there are more equations than unknowns. In this case, the
A matrix is called over-determined. It is required that rank(A) = n to
find a solution.
When b is in the column space of A, then an exact solution exists.
That is, when b can be written as a linear combination of the columns of
A. We can test for this by comparing the rank of the augmented matrix
of both A and b to the rank of A, rank([A b]) = rank(A). If the rank of
the augmented matrix and A are the same, then b is in the column space
of A.
When b is not in the column space of A, the only solution available
is an approximation. We will discuss the over-determined case in more
detail in section 5.9.
In the following example, we use the rank test and see that b is in the
column space of A. Then after a change to the A matrix, we see from the
rank test that b is no longer in the column space of A.
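A minimal sketch of the rank test might look like the following. The matrices here are illustrative assumptions, not the values from the original example.

```matlab
% Rank test: b is in the column space of A when rank([A b]) == rank(A).
A = [1 2; 3 4; 5 6];              % over-determined, illustrative (an assumption)
b = A*[1; 2];                     % b built as a combination of A's columns
in_space = (rank([A b]) == rank(A));      % true: an exact solution exists
x = A\b;                          % recovers the weights [1; 2]

A2 = A;  A2(3,2) = 7;             % change one element of A
in_space2 = (rank([A2 b]) == rank(A2));   % false: b left the column space
```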
5.5 Elimination
The elimination procedure is used to find solutions to matrix equations.
This section reviews the elimination procedure as it is usually performed
with pencil and paper. A basic understanding of how to manually implement
the elimination procedure is a prerequisite to understanding and applying
the functions available in Matlab for solving systems of equations. In sec-
tion 5.5.4, we consider round-off errors and poorly conditioned matrices that
complicate finding accurate solutions with elimination based algorithms on
a computer. Two elimination based Matlab functions are introduced. The
rref function performs elimination as it is described here and is a useful ed-
ucational and analysis tool. The lu function implements a variation of elimi-
nation known as LU decomposition (section 5.6) that is the preferred method
for solving square matrix systems of equations.
1. Reorder the rows, which is called row exchanges or partial pivoting. This
step is critical to finding an accurate solution to a system of equations. In
section 5.5.4, we will address row exchanges in the context of numerical
accuracy. Initially, we will use systems of equations where row exchanges
are not needed.
2. Add a multiple of one row to another row, replacing that row.
3. Multiply a row by a nonzero constant.
When a row has all zeros to the left of the main diagonal, that nonzero
element on the diagonal is called the pivot. The pivot is used to determine a
multiple of the row to add to another row to produce a needed zero.
We begin with our classic matrix equation Ax = b. Any operations done
on the A matrix must also be applied to the b vector. After elimination, we
7 Before row operations are performed using a pivot variable, the Gauss–Jordan elimi-
nation algorithm divides the pivot row by the pivot value. Thus the pivot values used are
always one. This step is needed when the objective is RREF. The Gaussian elimination
algorithm, which is used in the examples presented here, does not take this step.
x1 = 3, x2 = 1, x3 = 0
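The row operations and back substitution can be sketched in a few lines. This assumes the three-equation system from section 5.4.1 and applies Gaussian elimination without row exchanges; the variable names are hypothetical.

```matlab
% Gaussian elimination on the augmented matrix [A b], then back substitution.
Ab = [2 -3  0 3;
      4 -5  1 7;
      2 -1 -3 5];
Ab(2,:) = Ab(2,:) - 2*Ab(1,:);    % add -2 of row 1 to row 2
Ab(3,:) = Ab(3,:) - 1*Ab(1,:);    % add -1 of row 1 to row 3
Ab(3,:) = Ab(3,:) - 2*Ab(2,:);    % add -2 of row 2 to row 3
U = Ab(:, 1:3);                   % upper triangular coefficients
c = Ab(:, 4);                     % modified right-hand side
x = zeros(3, 1);
for i = 3:-1:1                    % back substitution, last row first
    x(i) = (c(i) - U(i, i+1:3)*x(i+1:3)) / U(i, i);
end
```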
Practice Problem
Here is another system of equations with an integer solution. Use elimi-
nation to solve for variables x1 , x2 , and x3 . Then use Matlab to verify
your answer.
−3x1 + 2x2 − x3 = −1
6x1 − 6x2 + 7x3 = −7
3x1 − 4x2 + 4x3 = −6
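$$
\left[\begin{array}{rrr|rrr}
2 & -3 & 0 & 1 & 0 & 0 \\
4 & -5 & 1 & 0 & 1 & 0 \\
0 & 2 & -3 & -1 & 0 & 1
\end{array}\right]
$$

$$
\left[\begin{array}{rrr|rrr}
2 & -3 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 2 & -3 & -1 & 0 & 1
\end{array}\right]
$$

$$
\left[\begin{array}{rrr|rrr}
2 & -3 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 0 & -5 & 3 & -2 & 1
\end{array}\right]
$$

$$
\left[\begin{array}{rrr|rrr}
2 & 0 & 3 & -5 & 3 & 0 \\
0 & 1 & 1 & -2 & 1 & 0 \\
0 & 0 & -5 & 3 & -2 & 1
\end{array}\right]
$$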
Add 3/5 of row 3 to row 1, and add 1/5 of row 3 to row 2. Then a pivot is
no longer needed.
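$$
\left[\begin{array}{rrr|rrr}
2 & 0 & 0 & -16/5 & 9/5 & 3/5 \\
0 & 1 & 0 & -7/5 & 3/5 & 1/5 \\
0 & 0 & -5 & 3 & -2 & 1
\end{array}\right]
$$

$$
\left[\begin{array}{rrr|rrr}
1 & 0 & 0 & -16/10 & 9/10 & 3/10 \\
0 & 1 & 0 & -7/5 & 3/5 & 1/5 \\
0 & 0 & 1 & -3/5 & 2/5 & -1/5
\end{array}\right]
$$

$$
A^{-1} = \begin{bmatrix} -1.6 & 0.9 & 0.3 \\ -1.4 & 0.6 & 0.2 \\ -0.6 & 0.4 & -0.2 \end{bmatrix}
$$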
Note: This result matches what Matlab found in section 5.2.5. All of the
tedious row operations certainly make one appreciate that Matlab
can perform the calculations for us.
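The hand calculation can be cross-checked with a short sketch. It assumes the same A matrix and compares the right half of rref([A I]) and the output of inv against the result found by hand.

```matlab
% Gauss-Jordan on [A I] leaves [I inv(A)]; compare with inv(A).
A = [2 -3 0; 4 -5 1; 2 -1 -3];
Ainv_hand = [-1.6 0.9 0.3; -1.4 0.6 0.2; -0.6 0.4 -0.2];   % result found by hand
GJ = rref([A eye(3)]);            % reduced row echelon form of the augmented matrix
Ainv_rref = GJ(:, 4:6);           % right half holds the inverse
Ainv = inv(A);                    % built-in inverse
err_rref = norm(Ainv_rref - Ainv_hand);
err_inv  = norm(Ainv - Ainv_hand);
```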
A full rank matrix system in augmented form has the identity matrix in
the first m columns when in RREF. The last column is the solution to the
system of equations.
A larger exponent increases the value of the smallest number that can
be added to that number. A value of one can only be added to numbers
smaller than $2^{53}$.
>> x = 2^53
x =
9.0072e+15
>> isequal(x, x+1) % x+1 is the same as x
ans =
logical
1
>> y = 2^53 - 1;
>> isequal(y, y+1)
ans =
logical
0
remaining rows are sorted to maximize the absolute value of the pivot variable.
The new row order is documented with a permutation matrix to associate
results with their unknown variable. Reordering rows, or row exchanges, is
also called partial pivoting. Full pivoting is when columns are also reordered.
Here is an example showing why partial pivoting is needed for numerical
stability. This example uses a 2×2 matrix where we could see some round-off
error. The ϵ value used in the example is taken to be a very small number, such
as the machine epsilon value (eps) that we used in section 1.6.3. Keep in mind
that round-off errors will propagate and build when solving a larger system
of equations. Without pivoting, we end up dividing potentially large numbers
by ϵ producing even larger numbers. In the algebra, we can avoid the large
numbers, but get very small numbers in places. In either case, during the row
operation phase of elimination, we add or subtract numbers of very different
value resulting in round-off errors. The accumulated round-off errors become
especially problematic in the substitution phase where we might divide two
numbers that are both close to zero.
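$$
\begin{aligned}
\epsilon x_1 + 2x_2 &= 4 \\
3x_1 - x_2 &= 7
\end{aligned}
$$

$$
\begin{bmatrix} \epsilon & 2 \\ 3 & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 4 \\ 7 \end{bmatrix}
$$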
Regarding ϵ as nearly zero relative to the other coefficients, we can use sub-
stitution to quickly see the approximate values of x1 and x2 .
$$ x_2 \approx \frac{4}{2} = 2 $$
$$ x_1 \approx \frac{7 + 2}{3} = 3 $$
$$ \epsilon x_1 = 4 - 2x_2 = 0 $$
$$ x_1 = \frac{4 - 2x_2}{\epsilon} \approx \frac{0}{\epsilon} = 0 $$

$$ 3x_1 - x_2 = 7 $$
$$ x_1 \approx \frac{7 + 2}{3} = 3 $$
Matlab uses partial pivoting in both RREF and LU decomposition to min-
imize the impact of the round-off error. It will use the small values in the
calculation that we rounded to zero, so the results will be more accurate than
our approximations.
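The effect of pivoting can be sketched by solving the 2×2 system both ways. This sketch assumes ϵ is the machine epsilon (eps). The first calculation uses ϵ as the pivot with no row exchange, and the second lets the left-divide operator apply partial pivoting.

```matlab
% Elimination with a tiny pivot versus Matlab's pivoted solver.
A = [eps 2; 3 -1];
b = [4; 7];

% Naive elimination: use A(1,1) = eps as the pivot (no row exchange).
m   = A(2,1)/A(1,1);              % very large multiplier, 3/eps
u22 = A(2,2) - m*A(1,2);          % small terms are lost to round-off
c2  = b(2) - m*b(1);
x2_naive = c2/u22;
x1_naive = (b(1) - A(1,2)*x2_naive)/A(1,1);   % divides a tiny number by eps

% Left-divide applies partial pivoting internally.
x = A\b;

err_naive = abs(x1_naive - 3);    % noticeably wrong for x1
err_pivot = norm(x - [3; 2]);     % near round-off level
```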
>> x = A\b
x =
3.0000
2.0000
>> isequal(x, [3;2]) % not completely rounded
ans =
logical
0
% RREF fails
>> rref([A b])
ans =
1.0000 0.0000 0
0 0 1.0000
The cond function in Matlab uses the ∥·∥2 matrix norm by default, but
may also compute the condition number using other matrix norms defined
in appendix A.1.3. However, to avoid calculating a matrix inverse, Matlab
calculates the condition number using the singular values of the matrix as de-
scribed in section 6.8.5.5. Using the singular values, there is a concern about
division by zero, so Matlab calculates the reciprocal, which it calls RCOND.
Matlab’s left-divide operator checks the condition of a matrix before at-
tempting to find a solution. When using the left-divide operator, one may
occasionally see a warning message such as follows.
5.6 LU Decomposition
We present here a variant of Gaussian elimination called LU decomposition
(for Lower–Upper). It is used internally by Matlab for computing inverses,
the left- and right-divide operators, and determinants. Matlab has two func-
tions that implement a form of the elimination procedure to solve systems of
equations—lu and rref. The lu function is the faster and more numerically
accurate of the two.
Two features of the algorithm lend to its numerical accuracy. First, the LU
algorithm uses partial pivoting. As we saw in section 5.5.4.1, round-off errors
are less likely when the pivot variable is the largest element in its column.
Secondly, LU decomposition uses a different elimination algorithm that is less
likely to incur round-off errors than the traditional Gauss–Jordan algorithm
used by rref.
Recall that Gaussian elimination uses an augmented matrix so that
changes made to matrix A are also applied to vector b. A sequence of row
operations changes the matrix equation.
$$ A x = b \;\mapsto\; U x = c $$
triangular. Thus the substitution phase for solving Ax = b has two steps—
solving Ly = b and then solving Ux = y.
Note: LU decomposition is also called LU factorization because it is one of
the ways that a matrix can be factored into a product of sub-matrices.
LU decomposition changes the equations used to solve systems of equations
as follows. In addition to L and U, we have an elimination matrix, E. Its
inverse is L. We also introduce the P matrix, which is the permutation matrix
that orders the rows to match the row exchanges.8
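$$
\begin{aligned}
A &\;\mapsto\; E P A = U \\
P A = E^{-1} U &\;\mapsto\; P A = L U \\
A x = b &\;\mapsto\; P^T L U x = b \\
x = A^{-1} b &\;\mapsto\; x = U^{-1} L^{-1} P b
\end{aligned}
$$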
5.6.1 LU Example
Before demonstrating how to use the lu function in Matlab, we will first
work part of a problem manually and then turn to Matlab to help with the
computation. We will use the same example that was solved in section 5.5.
For purposes of understanding how the U and L matrices are factors of the
A matrix, we will follow a traditional elimination strategy to find U and will
form L from a record of the row operations needed to convert A into U. We
will focus on how the U and L matrices are found, and will not make row
exchanges in this example, but will later work an example with row exchanges.
$$
\begin{aligned}
2x - 3y \phantom{{}+ z} &= 3 \\
4x - 5y + z &= 7 \\
2x - y - 3z &= 5
\end{aligned}
$$
The needed row operations are the same as noted in section 5.5. Each row
operation is represented by a matrix $E_{i,j}$ that can multiply with A to effect
the change. These matrices start with an identity matrix and then change one
element to represent each operation. We call these matrices Elementary
Matrices. Then the product of the elementary matrices is called the elimination
matrix.
1. Add −1 of row 1 to row 3, called E1 .
2. Add −2 of row 1 to row 2, called E2 .
3. Add −2 of row 2 to row 3, called E3 .
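The three row operations listed above can be written as elementary matrices and multiplied together. This is a sketch using the A matrix from this example. It forms L with inv only to make the relationship $L = E^{-1}$ visible.

```matlab
% Elementary matrices for the three row operations, then U and L.
A = [2 -3 0; 4 -5 1; 2 -1 -3];
E1 = eye(3);  E1(3,1) = -1;       % add -1 of row 1 to row 3
E2 = eye(3);  E2(2,1) = -2;       % add -2 of row 1 to row 2
E3 = eye(3);  E3(3,2) = -2;       % add -2 of row 2 to row 3
E = E3*E2*E1;                     % elimination matrix
U = E*A;                          % upper triangular factor
L = inv(E);                       % lower triangular factor, reverses E
err = norm(L*U - A);              % A = L U (no row exchanges here)
```

Note that the entries of L below the diagonal are the negatives of the multipliers used in the row operations.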
8 Equation (5.11) is expressed with a pure mathematics notation using the inverse of
matrices. The solution expression that has better value is the Matlab expression that
directly uses back substitution on triangular matrices (x = U\(L\(P*b))).
>> inv(E3)
ans =
1 0 0
0 1 0
0 2 1
Also note that the L matrix does the reverse of E, so L could be found
directly from the list of row operations.
Now, with L and U found, we can find the solution to L U x = b. Since
these are lower and upper triangular matrices, the left-divide operator will
quickly solve them. If solutions are needed for several b vectors, the one-time
calculation of L and U will substantially expedite the calculations.
>> b
b =
     3
     7
     5
>> x = U\(L\b)
x =
    3.0000
    1.0000
   -0.0000

% Returning to
% our previous example ...
>> [L, U, P] = lu(A)
L =
    1.0000         0         0
    0.5000    1.0000         0
    0.5000   -0.3333    1.0000
U =
    4.0000   -5.0000    1.0000
         0    1.5000   -3.5000
         0         0   -1.6667
P =
     0     1     0
     0     0     1
     1     0     0
>> x = U\(L\(P*b))
x =
    3.0000
    1.0000
   -0.0000
LU for the algorithm presented in code 5.2. Golub and Van Loan give a proof showing that
potential round-off errors are minimized by this algorithm [28].
[m, n] = size(A);
if m ~= n
error('Matrix is not square')
end
P = eye(n);
% Turing's nested elimination loop
for k = 1:(n - 1)
[A(k:n,:), idx] = sortrows(A(k:n,:), k, 'descend', ...
'ComparisonMethod','abs');
I = P(k:n,:);
P(k:n,:) = I(idx,:); % Permutation matrix
for i = k + 1:n
A(i, k) = A(i, k)/A(k, k);
for j = (k + 1):n
A(i, j) = A(i, j) - A(i, k) * A(k, j);
end
end
end
% L is now in the lower triangular of A.
% U is now in the upper triangular of A.
L = tril(A, -1) + eye(n); % extract lower
U = triu(A); % and upper triangular
Because of the zeros in all three of these matrices, their determinants are quick
to compute. It is not necessary to take the transpose of P before finding its
determinant because the determinant of a square matrix is the same as the
determinant of its transpose, det(P) = det(P'). The determinant of P is
either 1 or -1. Since each row has only a single one, the determinant is quick
to calculate. The determinant of L is always 1, which is the product of the
ones along the diagonal. The determinant of U is the product of the values
on its diagonal.
Refer to the normal method for calculating the determinant in section 5.2.4
and take a moment to see why the determinant of an upper or lower triangular
matrix is the product of the diagonal values.
The Matlab det function computes the determinant of a matrix as fol-
lows.
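The idea described above can be sketched from the outputs of lu. This is an illustration of the relationship det(A) = det(P) det(L) det(U) and is not a copy of the internal implementation of det. The matrix is an illustrative assumption.

```matlab
% Determinant from the LU factors: det(L) = 1, det(P) = +1 or -1.
A = [2 -3 0; 4 -5 1; 2 -1 -3];    % illustrative matrix (an assumption)
[L, U, P] = lu(A);                % P*A = L*U
d_lu = det(P) * prod(diag(U));    % sign from P, magnitude from U's diagonal
d_builtin = det(A);               % built-in result for comparison
```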
FIGURE 5.20: DC circuit for solving with the KVL method. We need to
find the current through and voltage drop across each resistor.
FIGURE 5.21: DC circuit for solving with the KCL method. We need to
find the current through and voltage drop across each resistor.
To remove the fractions, multiply each equation by the products of the de-
nominators. Thus we multiply the terms of the first equation by −R1 R2 R3 .
The second equation is multiplied by R2 R4 R5 . Then we combine terms by
the unknown node voltages to make the matrix equations.
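$$
\begin{bmatrix}
(R_2 R_3 + R_1 R_3 + R_1 R_2) & -R_1 R_3 \\
R_4 R_5 & (-R_4 R_5 - R_2 R_5 - R_2 R_4)
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
=
\begin{bmatrix} R_2 R_3 V \\ 0 \end{bmatrix}
$$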
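A minimal sketch of solving the node-voltage matrix equation follows. It assumes the component values used in circuitSys.m and cross-checks the node voltages against the KVL mesh solution. The variable names (Akcl, bkcl, Rm, and so on) are hypothetical.

```matlab
% KCL node-voltage solution, cross-checked against the KVL mesh solution.
R1 = 1000; R2 = 1500; R3 = 2000; R4 = 1000; R5 = 1500;
V = 12;

% Node equations after clearing the fractions.
Akcl = [(R2*R3 + R1*R3 + R1*R2), -R1*R3;
        R4*R5, (-R4*R5 - R2*R5 - R2*R4)];
bkcl = [R2*R3*V; 0];
vn = Akcl \ bkcl;
v1 = vn(1);  v2 = vn(2);          % node voltages

% Branch currents from Ohm's law.
iR1 = (V - v1)/R1;                % current into node 1
iR2 = (v1 - v2)/R2;               % current from node 1 to node 2
iR3 = v1/R3;                      % current from node 1 to ground
kcl_residual = iR1 - iR2 - iR3;   % should be zero at node 1

% KVL mesh solution for comparison.
Rm = [(R1+R3) -R3 0; -R3 (R2+R3+R4) -R4; 0 -R4 (R4+R5)];
I = Rm \ [V; 0; 0];
v1_kvl = R3*(I(1) - I(2));        % voltage across R3
v2_kvl = R5*I(3);                 % voltage across R5
```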
% File: circuitSys.m
%% Constants
R1 = 1000; R2 = 1500; R3 = 2000; R4 = 1000; R5 = 1500;
v = 12;
%% KVL Solution
R = [(R1+R3) -R3 0;
-R3 (R2+R3+R4) -R4;
0 -R4 (R4+R5)];
V = [v; 0; 0];
I = R \ V; % V = RI
i1 = I(1); i2 = I(2); i3 = I(3);
V1 = R1*i1; V2 = R2*i2; V3 = R3*(i1-i2); V4 = R4*(i2-i3); V5 = R5*i3;
fprintf('\n\tKVL Solution:\n')
fprintf('Current:\n i1 = %f\n i2 = %f\n i3 = %f\n',i1,i2,i3);
fprintf('Voltages:\n')
fprintf(...
' V1 = %f\n V2 = %f\n V3 = %f\n V4 = %f\n V5 = %f\n', ...
V1,V2,V3,V4,V5);
disp('Verify: ')
fprintf(' %f == %f ?\n',V4,V5);
fprintf(' %f == %f ?\n',V3,(V2+V4));
fprintf(' %f == %f ?\n',v,(V3+V1));
%% KCL Solution
Secondly, the method of joints is used to determine the force on each mem-
ber. We examine each joint and write equations for the forces in the x and
the y directions. These forces must also sum to zero for each joint.
[Diagram: truss with joints A, B, C, and D, 5 m members, 60° angles, and a 25 kN load.]
% Forces on each member
>> truss
AB: 14.4338 kN C
AC: 28.8675 kN T
BC: 28.8675 kN C
CD: 28.8675 kN T
FIGURE 5.22: A static truss based support. We determine the tension or
compression force on each member of the truss in code 5.4 and report the
results above. Each member is marked with a T or C as either bearing a
tension or compression force.
As you can see from figure 5.22, this is a fairly simple problem: 4 members
between 3 joints and the external supports. As a full system of equations,
there are 9 equations: 3 external force equations and 2 equations each
for the 3 joints: A, B, and C. When taken in the desired order, each algebra
equation reduces to having only one unknown variable. We'll take a middle
path between an over-determined system of 9 equations with 4 unknowns and
a carefully ordered sequence of algebra problems. We list the 9 equations and
then pick 4 equations for a 4×4 matrix equation. Three of the picked equations
have only one unknown variable. The fourth equation has two unknown
variables. The 9 equations are listed below. The numbered equations are used
in the matrix solution.
4 – B: $\sum F_y = 0$: $BC \cos(60^\circ) = B_y = 25$
C: $\sum F_x = 0$: $AC \cos(60^\circ) + BC \cos(60^\circ) = CD$
C: $\sum F_y = 0$: $BC \cos(30^\circ) + BC \cos(30^\circ) = 0$
The rows of the A matrix are ordered such that it is upper triangular,
which the left-divide operator will take advantage of.
% File: truss.m
% Find the forces on truss members in figure 5.22.
c60 = cosd(60);
c30 = cosd(30);
s30 = sind(30);
A = [1 -c60 0 0; % rows ordered for triangular structure
0 c30 0 0;
0 0 c30 0;
0 0 0 c30];
b = [0; 25; 25; 25];
x = A\b;
fprintf('AB: %g kN C\n', x(1));
fprintf('AC: %g kN T\n', x(2));
fprintf('BC: %g kN C\n', x(3));
fprintf('CD: %g kN T\n', x(4));
CODE 5.4: Code to solve the system of equations for the forces on a truss
based support. The diagram and results are in figure 5.22.
Notice that each of the first three columns has only a single one. But
the fourth column, for x4, has values in each row. Matlab found the fourth
column to be a linear combination of the first three columns by the weights
indicated. So the values of the first three variables, which correspond to pivot
columns, are fixed by the equation. We call the variables associated with the
pivot columns basic variables. Any variables, such as x4, that are not associated
with pivot columns are called free variables, meaning that they can take any
value. We don't have an equation for x4 since it is a free variable, so we can
just say that x4 = x4. We can also replace it with an independent scalar
variable, a.
The new system of equations is:
$$
\begin{aligned}
x_1 - 0.6091\, x_4 &= 3.3909 \\
x_2 + 0.3864\, x_4 &= 2.3864 \\
x_3 + 0.9136\, x_4 &= 5.9136 \\
x_4 &= x_4
\end{aligned}
$$
The solution is a set of lines. We can see this if we write the line equations in
what is called parametric form.
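$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \begin{bmatrix} 3.3909 \\ 2.3864 \\ 5.9136 \\ 0 \end{bmatrix}
+ a \begin{bmatrix} 0.6091 \\ -0.3864 \\ -0.9136 \\ 1 \end{bmatrix}
= u + a\,v
$$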
Part of the solution that we found by elimination with the RREF is called
the particular solution, which is any vector that solves Au = b when a = 0.
The u vector is the last column of the augmented matrix in RREF. We also
found a general solution, which are the vectors that can be multiplied by any
scalar, a. In addition to finding the general solution from RREF, it is also
found from the null space solution of A, a v = Null(A). If x = u + a v, then
Ax = A (u + a v) = b + 0 = b.
Note that the general solutions found from the rref and null Matlab
functions are likely not the same, but may be scalar multiples of each other.
In the previous example, if a = 0.6516 then a v = Null(A). However, the
general solutions of rref and null will likely not have such a relationship if
n − m > 1 and the general solution has more than one column. If the general
solution has two columns that we call v and w, then we still have the same
basic relationship.
Ax = A (u + a v + b w) = b + 0 + 0 = b
See appendix A.3.2 for more information about the null space of a matrix.
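The relationship Ax = A(u + a v + b w) = b can be sketched numerically. This sketch assumes a 2×4 under-determined system and takes the null space columns from null rather than from rref. The weights applied to the null space columns are arbitrary.

```matlab
% Any particular solution plus any combination of null space vectors solves Ax = b.
A = [8 -6 4 -3; 10 10 -7 2];      % 2x4 under-determined system
b = [14; 58];
u = A\b;                          % one particular solution
N = null(A);                      % orthonormal basis of the null space (4x2)
weights = [2; -3];                % arbitrary scalars (an assumption)
x = u + N*weights;                % another solution of the system
residual = norm(A*x - b);         % should be near zero
null_cols = size(N, 2);           % number of general solution columns
```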
Here is a quick example showing a general solution with two columns.
Notice that as we put the equation into parametric form that we change the
sign of the general solution columns to move them from the left to right side
of the equal sign. We also add ones in the appropriate rows on v and w so
that x3 = a and x4 = b.
>> A
A =
8 -6 4 -3
10 10 -7 2
>> b
b =
14
58
>> z = rref([[A b];zeros(2,5)])
z =
Under-determined Pseudo-inverse
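$$
\begin{aligned}
&= A^T \left[ \left( A A^T \right)^{-1} \right]^T = A^T \left[ \left( A A^T \right)^T \right]^{-1} \\
&= A^T \left( A A^T \right)^{-1}
\end{aligned}
$$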
As with the over-determined case, the direct matrix equation can be slow
and prone to inaccuracy for large and poorly conditioned matrices, so or-
thogonal techniques such as the SVD or QR factorization algorithms are
used instead. Matlab’s pinv function uses the SVD as described in sec-
tion 6.8.5.1, while the left-divide operator uses QR factoring as described
in section 5.11.3.
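For a small, well-conditioned matrix with independent rows, the direct equation and pinv should agree. The following sketch assumes an illustrative 2×4 system and uses right-divide in place of an explicit inverse.

```matlab
% Under-determined pseudo-inverse: A' * inv(A*A') compared with pinv(A).
A = [8 -6 4 -3; 10 10 -7 2];      % illustrative full row rank matrix (an assumption)
b = [14; 58];
Aplus = A'/(A*A');                % A^T (A A^T)^{-1} without calling inv
x_direct = Aplus*b;               % solution from the direct equation
x_pinv = pinv(A)*b;               % SVD based pseudo-inverse
err = norm(x_direct - x_pinv);    % the two should agree
residual = norm(A*x_direct - b);  % and both satisfy A x = b
```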
>> A\b
ans =
3.3909
2.3864
5.9136
0
>> pinv(A)*b
ans =
3.3909
2.3864
5.9136
0
FIGURE 5.23: Vectors a = (4, 2) and b = (2, 1) are consistent.
Here x is a scalar, not a vector, and we can quickly see that x = 1/2
satisfies both rows of equation (5.12). But since a and b are both vectors, we
want to use a strategy that will extend to matrices. We can multiply both
sides of equation (5.12) by aT so that the vectors turn to scalar dot products.
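\[
\begin{aligned}
a^T a\,x &= a^T b \qquad (5.13) \\
\begin{bmatrix} 4 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix} x &= \begin{bmatrix} 4 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} \\
x &= \frac{10}{20} = \frac{1}{2}
\end{aligned}
\]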
FIGURE 5.24: Vector p is a projection of vector b onto vector a. (Diagram labels: p = a x and e = b − a x.)
Now, let’s extend the problem to a more general case as shown in figure 5.24
where vector b is not necessarily in-line with a. We will find a geometric reason
to multiply both sides of equation (5.12) by a^T. We wish to project a vector b
onto vector a, such that the error between b and the projection is minimized.
The projection, p, is a scalar multiple of a. That is, p = a x, where x is the
scalar value that we want to find. The error is the vector e = b − a x.
The geometry of the problem provides a simple solution for minimizing
the error. The length of the error vector is minimized when it is perpendicular
to a. Recall that two vectors are perpendicular (orthogonal) when their dot
product is equal to zero.
\[
\begin{aligned}
a^T e = a^T (b - a\,x) &= 0 \\
a^T a\,x &= a^T b \\
x &= \frac{a^T b}{a^T a}
\end{aligned}
\]
Since x is a fraction of two dot products, we can think of the projection in
terms of the angle, θ, between a and b.
\[
x = \frac{\|a\|\,\|b\| \cos\theta}{\|a\|^2} = \frac{\|b\|}{\|a\|} \cos\theta
\]
\[
p = a\,x = \frac{a}{\|a\|}\,\|b\| \cos\theta = a\,\frac{a^T b}{a^T a}
\]
We can also make a projection matrix, P, so that any vector may be
projected onto a by multiplying it by P.
\[
P = \frac{a\,a^T}{a^T a}, \qquad p = P\,b
\]
Note that a a^T is here a 2×2 matrix and a^T a is a scalar. Here is an example.
>> a = [3;1];
>> b = [2;2];
>> P = a*a'/(a'*a) % Projection Matrix to a
P =
0.9000 0.3000
0.3000 0.1000
>> p = P*b % Projection of b onto a
p =
2.4000
0.8000
>> x = a'*b/(a'*a) % scalar multiple, p = a*x
x =
0.8000
>> e = b - p % error vector
e =
-0.4000
1.2000
>> p'*e % near zero dot product to
ans = % confirm e is perpendicular to p.
2.2204e-16
Matlab can help us see this. First, let us consider the case where b is in
the column space of A and an exact solution exists. Notice that rank(A) is
equal to rank([A b]).
Now, let’s add some random noise to the b vector so that it is not in the
column space of A and the solution is an approximation. Notice that rank(A)
is less than rank([A b]).
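A minimal sketch of the rank comparison follows. The matrix and vectors are assumed values, and a fixed perturbation stands in for the random noise so that the result is repeatable.

```matlab
% Sketch: compare rank(A) with rank([A b]) to test whether b is in
% the column space of A.
A = [1 0; 0 1; 1 1];                   % assumed 3-by-2 example
b_exact = A*[2; 3];                    % lies in the column space of A
b_noisy = b_exact + [0.1; -0.2; 0.3];  % fixed perturbation in place of noise
r_A = rank(A);
r_exact = rank([A b_exact]);
r_noisy = rank([A b_noisy]);
fprintf('rank(A) = %d, exact = %d, noisy = %d\n', r_A, r_exact, r_noisy);
```

Equal ranks indicate that an exact solution exists, while a larger rank for the augmented matrix indicates that only an approximation is available.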
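\[
A = \begin{bmatrix} a_1 & a_2 \end{bmatrix}
\]
\[
\begin{aligned}
a_1^T (b - A\hat{x}) &= 0 \\
a_2^T (b - A\hat{x}) &= 0
\end{aligned}
\]
\[
\begin{bmatrix} a_1^T \\ a_2^T \end{bmatrix} (b - A\hat{x}) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]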
The left matrix is just AT , which is size 2×3. The size of x̂ is 2×1, so the size
of Ax̂, like b, is 3×1.
\[
\begin{aligned}
A^T (b - A\hat{x}) &= 0 \\
A^T A\,\hat{x} &= A^T b \\
\hat{x} &= \left( A^T A \right)^{-1} A^T b
\end{aligned}
\]
Over-determined Pseudo-inverse
The matrix $\left( A^T A \right)^{-1} A^T$ has a special name and symbol. It is called
the Moore–Penrose pseudo-inverse of an over-determined matrix. It is used
when the simpler A−1 can not be used. A superscript plus sign (A+ ) is
used as a short-hand math symbol for the pseudo-inverse. Matlab has a
function called pinv(A) that returns the pseudo-inverse of A.
As with the pseudo-inverse of under-determined systems described in
section 5.8.2, Matlab uses the SVD to calculate the pseudo-inverse of
over-determined systems, which is more accurate and usually faster than
the direct matrix calculation (section 6.8.5.1).
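The three ways of finding the least squares solution of an over-determined system can be compared with a short script. This is a sketch with assumed data; the names x_normal, x_pinv, and x_ldiv are illustrative.

```matlab
% Sketch: least squares solution of an over-determined system three ways.
A = [1 0; 0 1; 1 1];            % assumed 3-by-2 example
b = [2.1; 2.8; 5.3];            % not in the column space of A
x_normal = (A'*A) \ (A'*b);     % projection (normal) equation
x_pinv = pinv(A) * b;           % pseudo-inverse from the SVD
x_ldiv = A \ b;                 % left-divide operator
e = b - A*x_normal;             % error vector
ortho = norm(A'*e);             % error is orthogonal to the columns of A
fprintf('orthogonality check = %.2e\n', ortho);
```

All three solutions should match for this small example, and the error vector should be orthogonal to the column space of A.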
FIGURE 5.25: The projection of a vector onto the column space of A, which
spans a plane in R3 .
Two vectors, a1 and a2 define a plane. To find the equation for a plane,
we first determine a vector, n, which is orthogonal to both vectors. The cross
product provides this. The vector n is defined as n = [a b c]T . Then given one
point on the plane (x0 , y0 , z0 ), we can calculate the equation for the plane.
In this case, we know that the point (0, 0, 0) is on the plane, so the plane
equation is simplified.
The points of the plane are determined by first defining a region for x and
y and then using the plane equation to calculate the corresponding points for
z. A simple helper function called vector3 was used to plot the vectors. A
plot of the projection is shown in figure 5.25.
% File: projectR3.m
% Projection of the vector B onto the column space of A.
% Note that with three sample points, we can visualize
% the projection in R^3.
figure;
surf(x,y,z); % Plot the plane
daspect([1 1 1]); % consistent aspect ratio
hold on;
z0 = [0 0 0]'; % vector starting point
vector3(z0, a1, 'k'); % column 1 of A
vector3(z0, a2, 'k'); % column 2 of A
vector3(z0, b, 'g'); % target vector
vector3(z0, p, 'b'); % projection
vector3(p, b, 'r'); % error from p to b
hold off;
xlabel('X');
ylabel('Y');
zlabel('Z');
title('Projection of a vector onto a plane');
% helper function
function vector3( v1, v2, clr )
% VECTOR3 Plot a vector from v1 to v2 in R^3.
plot3([v1(1) v2(1)], [v1(2) v2(2)], ...
[v1(3) v2(3)], 'Color', clr,'LineWidth',2);
end
basis vector; thus, each term in the sum is the projection of b onto a
basis vector.
The example in code 5.7 finds the projection using the two methods. The
mod_gram_schmidt function from appendix A.4.3 is used to find the orthog-
onal basis vectors.
% File: alt_project.m
%% Comparison of two projection equations
% Define the column space of A, which forms a plane in R^3.
a1 = [1 1 0]'; % vector from (0, 0, 0) to (1, 1, 0).
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
%% Projection onto the column space of matrix A.
x_hat = (A'*A)\(A'*b);
p = A*x_hat; % projection vector
disp('Projection onto column space: ')
disp(p)
%% Alternate projection
% The projection is the vector sum of projections onto
% the orthonormal basis vectors of the column space of A.
[U, ~] = mod_gram_schmidt(A); % Could use the orth function
u1 = U(:,1);
u2 = U(:,2);
p_alt = b'*u1*u1 + b'*u2*u2;
disp('Projection onto basis vectors: ')
disp(p_alt)
The output from the example of two methods of vector projection is:
>> alt_project
Projection onto column space:
0.1818
1.8182
0.5455
Projection onto basis vectors:
0.1818
1.8182
0.5455
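The same comparison can be sketched with the built-in orth function in place of mod_gram_schmidt, which may be convenient when the appendix function is not on the path. The names Q, p_orth, and p_ls are assumptions for this illustration.

```matlab
% Sketch: projection onto the column space using orth for the basis.
a1 = [1 1 0]';
a2 = [-1 2 1]';
A = [a1 a2];
b = [1 1 3]';
Q = orth(A);                   % orthonormal basis for the column space of A
p_orth = Q*(Q'*b);             % sum of projections onto the basis vectors
p_ls = A*((A'*A)\(A'*b));      % projection from the least squares solution
disp(p_orth)
```

The basis returned by orth may differ from the Gram-Schmidt basis, but the projection onto the column space should be the same.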
FIGURE 5.26: A wagon on an incline.
FIGURE 5.27: Free-body diagram of forces on the wagon.
NOTE: All vectors and forces are calculated as upward and to the right.
The force from gravity, fg , will be negative. Then, the important statics equa-
tion is that the sum of forces equals zero. In the calculations, projection ma-
trices are denoted as P. For the sake of clarity in the code, vector variables
begin with capital letters, while scalar variables are lower case.
The force toward the left (down the incline) is the projection of the gravity
force onto the incline, A.
\[ F_l = P_a F_g \]
The sum of the forces to the left and right is zero.
\[ F_r + F_l = 0 \;\longmapsto\; F_r = -F_l \]
The force to the right is a projection of the handle force vector onto the incline.
The handle force vector is the product of the scalar force magnitude and the unit
vector in the direction of the handle, Fh = fh H. We need to calculate the
inverse of the projection, so we separate the scalar from the unit vector to
%% File: wagon.m
% Statics problem of a wagon on an incline solved with vector
% projections
P = @(a) a*a'/(a'*a); % Projection matrix equation
V = [0 1]'; % Vertical vector
A = [8 3]'; % vector of the incline slope
H = [1 1]'; % vector of handle direction
H = H/norm(H); % unit vector of handle direction
% Fh = fh*H
fg = 23 * -9.81; % Gravity force down (F = m*a)
% 23 kg is about 50 pounds
Fg = fg * V; % Gravity force vector
Pa = P(A); % Projection matrix to incline A
Fl = Pa * Fg; % force down hill to left
Fr = -Fl; % force to right, Fr + Fl = 0
fh = (Pa * H) \ Fr; % Fr = Pa*Fh = Pa*H*fh
fprintf('Force on handle = %.3f newtons or %.3f kg\n', ...
fh, fh/9.81);
Challenge Question
By changing the vectors of the incline and the handle, one can verify
the correctness of the solution. What are the boundary conditions? Try
to determine the relationship of the slope of the incline to the mechanical
advantage for raising a weight vertically using a rolling wagon or cart on
an incline.
\[
\begin{aligned}
C + D\,t_1 &= b(t_1) \\
C + D\,t_2 &= b(t_2) \\
C + D\,t_3 &= b(t_3) \\
C + D\,t_4 &= b(t_4) \\
&\;\;\vdots \\
C + D\,t_n &= b(t_n)
\end{aligned}
\]
= bT b − bT Au − uT AT b + uT AT Au
The minimum occurs where the derivative is zero. The derivative is taken with
respect to the vector u, which gives the gradient of E. Appendix C of [78] and the
vector calculus chapter of [17] list the derivative results that we need.
\[
\nabla E = \frac{\partial E}{\partial u} = -A^T b - A^T b + 2 A^T A\,u = 0
\]
\[
A^T A\,u = A^T b
\]
Calculus gives us the same equation that we get from linear algebra.
\[
\hat{u} = \left( A^T A \right)^{-1} A^T b
\]
C = b̄ − D t̄
We write the matrix and vector terms as sums of the variables t and b to show
that the equations for linear regression from the study of statistics and linear
algebra are the same.
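\[
A^T A = \begin{bmatrix} n & \sum t \\ \sum t & \sum t^2 \end{bmatrix}, \qquad
A^T b = \begin{bmatrix} \sum b \\ \sum t b \end{bmatrix}
\]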
Using elimination to solve, we get:
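\[
\left[ \begin{array}{cc|c} n & \sum t & \sum b \\ \sum t & \sum t^2 & \sum t b \end{array} \right]
\]
\[
\left[ \begin{array}{cc|c} n & \sum t & \sum b \\ 0 & \sum t^2 - \frac{1}{n} \left( \sum t \right)^2 & \sum t b - \frac{1}{n} \left( \sum b \right) \left( \sum t \right) \end{array} \right]
\]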
We get a match for the D coefficient by identifying statistical equations in the
equation from elimination.
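\[
D = \frac{\sum t b - \frac{1}{n} \left( \sum b \right) \left( \sum t \right)}{\sum t^2 - \frac{1}{n} \left( \sum t \right)^2}
= \frac{r\,\sigma_t\,\sigma_b}{\sigma_t^2} = \frac{r\,\sigma_b}{\sigma_t}
\]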
The C coefficient comes from back substitution of the elimination matrix.
\[
C = \frac{\sum b - D \sum t}{n} = \bar{b} - D\,\bar{t}
\]
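The agreement between the statistical and linear algebra equations can be checked numerically. The following is a sketch with assumed sample data; fixed offsets are used in place of random noise so that the result is repeatable.

```matlab
% Sketch: linear regression coefficients from statistics and from
% linear algebra should match.
t = (0:9)';
offsets = [0.3; -0.5; 0.2; 0.4; -0.1; -0.3; 0.6; -0.2; 0.1; -0.4];
b = 2 + 4*t + offsets;          % assumed data near the line 2 + 4 t
A = [ones(size(t)) t];          % design matrix
u_hat = A \ b;                  % [C; D] from least squares
R = corrcoef(t, b);
r = R(1,2);                     % correlation coefficient
D = r*std(b)/std(t);            % slope from the statistical equation
C = mean(b) - D*mean(t);        % intercept from the statistical equation
fprintf('C = %.4f, D = %.4f\n', C, D);
```

Both routes should give the same C and D to within round-off error.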
5.10.1.4 Linear Regression Example
Code 5.9 shows a simple linear regression example. The exact output is
different each time the script is run because of the random noise. A plot
showing the output is shown in figure 5.28.
% File: linRegress.m
% Least Squares linear regression example showing the
% projection of the test data onto the design matrix A.
%
% Using control variable t, an experiment is conducted.
% Model the data as: b(t) = C + Dt.
f = @(t) 2 + 4.*t;
C = u_hat(1);
D = u_hat(2);
fprintf('Equation estimate = %.2f + %.2f*t\n', C, D);
FIGURE 5.28: Simple linear least squares regression. Output from the script
is Equation estimate = 1.90 + 3.89*t, which is close to the data model
of b = 2 + 4 t.
\[
A = \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ \vdots & \vdots & \vdots \\ 1 & t_n & t_n^2 \end{bmatrix}
\]
Higher order polynomials would likewise require additional columns in the A
matrix.
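A short sketch of building the quadratic design matrix is shown here. The sample points and the noise-free quadratic are assumptions chosen so that the expected coefficients are known; polyfit is used only as a cross-check.

```matlab
% Sketch: quadratic regression with a three column design matrix.
t = linspace(0, 10, 40)';
b = 1 - 4*t + t.^2;             % noise-free samples of an assumed quadratic
A = [ones(size(t)) t t.^2];     % columns for 1, t, and t^2
u_hat = A \ b;                  % coefficients ordered constant, t, t^2
p = polyfit(t, b, 2);           % polyfit orders coefficients highest power first
disp(u_hat')
```

A cubic fit would add a fourth column, t.^3, in the same manner.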
Code 5.13 on page 233 lists a quadratic regression example. The plot of
the data is shown in figure 5.29.
FIGURE 5.29: Least squares regression fit of quadratic data. The output
of the script is Equation estimate = 1.79 + -4.06*t + 0.99*t^2, which
is a fairly close match to the quadratic equation without the noise.
\[
R^2 = 1 - \frac{\sum_i (y_i - p_i)^2}{\sum_i (y_i - \mu)^2}
\]
The range of R2 is from 0 to 1. If the regression fit does not align well with the
data, the R2 value will be close to 0. If the model is an exact fit to the data,
then R2 will be 1. In reality, it will be somewhere in between, and hopefully
closer to 1 than 0.
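The R^2 equation translates to one line of Matlab code. This sketch uses assumed data and a straight line fit; the variable names are illustrative.

```matlab
% Sketch: coefficient of determination for a regression fit.
t = (1:5)';
y = [1.1; 1.9; 3.2; 3.9; 5.1];  % assumed measurements
A = [ones(5,1) t];
p = A*(A\y);                    % values predicted by the linear fit
R2 = 1 - sum((y - p).^2)/sum((y - mean(y)).^2);
R2_exact = 1 - sum((y - y).^2)/sum((y - mean(y)).^2); % a perfect model
fprintf('R^2 = %.4f\n', R2);
```

Here the fit is close to the data, so R2 is slightly less than one, and the perfect model gives exactly one.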
\[
\begin{aligned}
f_0(t) &= 1, \text{ a constant} \\
f_1(t) &= t \\
f_2(t) &= t^2 \\
&\;\;\vdots \\
f_n(t) &= t^n
\end{aligned}
\]
The regression algorithm then uses a simple linear algebra vector projection
algorithm to compute the coefficients (α1 , α2 , ...αn ) to fit the data to the
polynomial.
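\[
\begin{aligned}
\hat{y}(t_0) &= \alpha_0 f_0(t_0) + \alpha_1 f_1(t_0) + \ldots + \alpha_n f_n(t_0) \\
\hat{y}(t_1) &= \alpha_0 f_0(t_1) + \alpha_1 f_1(t_1) + \ldots + \alpha_n f_n(t_1) \\
&\;\;\vdots \\
\hat{y}(t_m) &= \alpha_0 f_0(t_m) + \alpha_1 f_1(t_m) + \ldots + \alpha_n f_n(t_m)
\end{aligned}
\]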
The basis functions need not be polynomials, however. They can be any func-
tion that offers a reasonable model for the data. Consider a set of basis func-
tions consisting of a constant offset and oscillating trigonometry functions.
f0 (t) = 1, a constant
f1 (t) = sin(t)
f2 (t) = cos(t)
Code 5.14 on page 234 lists a function that implements generalized least
squares regression. The basis functions are given as an input in the form
of a cell array containing function handles.
Code 5.10 is a script to test the genregression function. A plot of the
output is shown in figure 5.30.
A fun application of generalized least squares is to find the Fourier series
coefficients of a periodic waveform. It turns out that the regression calculation
can also find Fourier series coefficients when the basis functions are sine waves
of the appropriate frequency and phase. The fourierRegress script listed in
code 5.11 demonstrates this for a square wave. The plot of the data is shown
in figure 5.31.
% File: trigRegress.m
f = {@(x) 1, @sin, @cos };
x = linspace(-pi, pi, 1000)';
y = 0.5 + 2*sin(x) - 3*cos(x) + randn(size(x));
[alpha, curve] = genregression(f, x, y);
disp(alpha)
figure, plot(x, y, 'b.')
hold on
plot(x, curve,'r', 'LineWidth', 2);
xlabel('x')
ylabel('y')
title('Regression Fit of Trig Functions')
legend('data', 'regression', 'Location', 'north')
hold off
axis tight
FIGURE 5.31: A few terms of the Fourier series of a square wave found by
least squares regression.
% File: fourierRegress.m
% Use least squares regression to find Fourier series coefficients.
CODE 5.11: Script for least squares regression of a square wave by the
Fourier series.
% File: expRegress.m
x = linspace(0,5);
y = 25*exp(-0.5*x) + randn(1, 100);
Y = log(y);
A = ones(100, 2); A(:,2) = x';
y_hat = exp(A*(A\Y'));
figure, hold on
plot(x,y, 'o')
plot(x, y_hat', 'r')
hold off
legend('data', 'regression')
title('Regression Fit of Exponential Data')
xlabel('x'), ylabel('y')
a flow chart showing how the A matrix is evaluated in terms of its shape
and special properties to determine the most efficient algorithm. Note that
mldivide often acts as a recursive function. It will use factorization to break
a problem down to simpler problems and then call itself to solve the simpler
problems. In particular, lower and upper triangular matrices are sought so
that in turn the efficient triangular solver can be used to find the solution by
forward and back substitution.
Right Division
As noted in section 5.2.1.4, Matlab also has a right-divide operator
(/). It provides the solution to xA = b as x = b/A;. We have seldom
mentioned it for two reasons. First, the need for the right-divide operator
is much less than that of the left-divide operator, especially in regard
to systems of equations. Secondly, right division is implemented using
the left-divide operator with the relationship b/A = (A’\b’)’. So the
algorithms reviewed here for the left-divide operator also apply to the
right-divide operator.
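The stated relationship between the two division operators can be confirmed with a small sketch. The matrix and row vector are assumed values.

```matlab
% Sketch: right division expressed with the left-divide operator.
A = [4 1; 2 3];                 % assumed square, nonsingular matrix
b = [10 11];                    % row vector so that x*A = b is defined
x_right = b / A;                % solves x*A = b
x_left = (A' \ b')';            % same solution by way of left division
fprintf('x = [%.4f %.4f]\n', x_right);
```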
requirement for being positive definite. If the first test passes, then
a second test is required to see if the Cholesky factorization may be
used. A Hermitian matrix is positive definite if all of its eigenvalues
are positive.
Here is an example of the triangular Cholesky and LDLT factorization of
a positive definite Hermitian matrix.
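A minimal sketch of the positive definite test and the Cholesky factorization follows. The matrix is an assumed example, and only the Cholesky step is shown. The second output of chol is zero when the factorization succeeds, which indicates that the matrix is positive definite.

```matlab
% Sketch: test for a positive definite Hermitian matrix and factor it.
A = [4 2 2; 2 5 3; 2 3 6];      % assumed symmetric example
is_hermitian = isequal(A, A');  % first test: equal to its own transpose
[R, flag] = chol(A);            % flag == 0 when A is positive definite
err_chol = norm(R'*R - A);      % R is upper triangular with A = R'*R
min_eig = min(eig(A));          % all eigenvalues should be positive
fprintf('flag = %d, smallest eigenvalue = %.4f\n', flag, min_eig);
```

With the factor in hand, A x = b can be solved with two triangular solves, x = R\(R'\b).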
We have shown how the least squares solution is found from the vector
projection equation $x = \left( A^T A \right)^{-1} A^T b$ or from the pseudo-inverse ($x = A^{+} b$),
which Matlab finds from the economy SVD. However, the mldivide function
uses QR factorization to find the least squares solution. The QR algorithm is
described in appendix A.5. It is a matrix factoring returning an orthogonal
matrix, Q, and an upper triangular matrix, R, such that A = Q R. It is an
efficient and accurate algorithm. Using the result to find the solution to our
system of equations is also quick because Q is orthogonal and R is upper
triangular. So finding the solution only requires a matrix transpose, a matrix–
vector multiplication, and back substitution with the triangular solver.
\[
\begin{aligned}
A &= Q R \\
A x = Q R x &= b \\
x &= R^{-1} Q^T b
\end{aligned}
\]
Note that for an over-determined matrix, the R matrix will end with rows of
all zeros, so to remove those rows we use the economy QR by adding an extra
zero argument to qr. Here is a quick example:
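The economy QR solution can be sketched as follows with assumed data. The triangular system is solved with the left-divide operator rather than by forming the inverse of R.

```matlab
% Sketch: least squares solution from the economy QR factorization.
A = [1 0; 0 1; 1 1];            % assumed over-determined example
b = [2.1; 2.8; 5.3];
[Q, R] = qr(A, 0);              % economy QR: Q is 3-by-2, R is 2-by-2
x_qr = R \ (Q'*b);              % back substitution, x = R^{-1} Q^T b
x_ldiv = A \ b;                 % reference solution
disp(x_qr')
```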
\[
\begin{aligned}
A\,x &= R_t^T Q_t^T x \\
x &= Q_t \left( R_t^T \right)^{-1} b
\end{aligned}
\]
In the following example, the solution using the left-divide operator is found
first to see which columns should be replaced with zeros.
>> A
A =
2 4 20 3 16
5 17 2 20 17
19 11 9 1 18
>> b = [5; 20; 13];
>> y = A\b
y =
0.5508
0
0
0.7794
0.0975
To get the same result as the left-divide operator for this example, we
replace the second and third columns with zeros.
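The column replacement can be sketched as shown below. The right-hand side b = [5; 20; 13] is an assumption, chosen because the listed solution y satisfies A*y ≈ [5; 20; 13]. The unused columns are detected from the left-divide result rather than hard coded, and the names Az, Qt, and Rt are illustrative.

```matlab
% Sketch: reproduce the left-divide result with QR of the transpose.
A = [2 4 20 3 16; 5 17 2 20 17; 19 11 9 1 18];
b = [5; 20; 13];                % assumed right-hand side
y = A \ b;                      % solution from the left-divide operator
unused = abs(y) < 1e-12;        % columns that the solution does not use
Az = A;
Az(:, unused) = 0;              % replace those columns with zeros
[Qt, Rt] = qr(Az', 0);          % economy QR of the transpose
x = Qt * (Rt' \ b);             % x = Q_t (R_t^T)^{-1} b
disp([y x])
```

Without the column replacement, the same equation gives the minimum length solution, which matches pinv(A)*b.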
% File: quadRegress.m
%% Quadratic Least Squares Regression Example
%
% Generate data per a quadratic equation, add noise to it, use
% least squares regression to find the equation.
%% Generate data
f = @(t) 1 - 4*t + t.^2; % Quadratic function as a handle
Max = 10;
N = 40; % number of samples
std_dev = 5; % standard deviation of the noise
% A will be m-by-n
% alpha and b will be n-by-1
if length(x) ~= length(y)
error('Input vectors not the same size');
end
m = length(x);
n = length(f);
A = ones(m,n);
if size(y, 2) == 1 % a column vector
b = y;
cv = true;
else
b = y';
cv = false;
end
if size(x, 2) == 1 % column vector
c = x;
else
c = x';
end
for k = 1:n
A(:,k) = f{k}(c);
end
alpha = (A'*A) \ (A'*b);
if nargout > 1
yHat = A * alpha;
if ~cv
yHat = yHat';
end
varargout{1} = yHat;
end
end
5.12 Exercises
(a) Use the vecnorm function to find the length of each column.
(b) Is A an orthogonal matrix?
(c) What is its inverse?
piece welded to its top. Our robot needs to make two weld lines. To help
the robot weld straight lines, we need to find a set of points along the weld
lines that are 1/8 inch apart. The location of the weld lines in the work-piece
coordinate frame are shown in the code in the change_coords.m file.
We need to transform the points from work-piece coordinates to world
coordinates. To do this, we add the base of the work-piece coordinate frame to
a transformation of the points. A set of points parallel to the work-piece along
the x axis (fixed Z coordinate) are at a 30◦ angle in the world coordinate frame.
We can find the transformed points by setting the work-piece coordinates in a
3×1 column vector and multiplying by a 3×3 transform matrix. The diagram
below shows a side view of the work-piece and a vertical line perpendicular
to the work-piece. Gray lines make right triangles that will help us determine
the transformation matrix based on the trigonometry of the right triangles.
[Diagram: side view of the work-piece showing the world z axis and the work-piece axes wpx and wpz.]
% File: change_coords.m
% Demonstration of changing the coordinate frame between a robotic
% work-piece coordinate frame and the world coordinate frame. A
% work-piece surface is defined as starting at [10; -20; 30] in the
% world coordinate frame and tilting up from there at 30 degrees along
% the x-axis.
% The robot needs to make two 30-inch welds from points [5; 30] to
% [23; 6] on the work-piece surface and 10 inches above and parallel
% to the work-piece. To make straight line welds, the robot needs
% a set of points 1/8 apart in the world coordinate frame.
% pre-allocate storage
line1_rw = zeros(3, points);
line2_rw = zeros(3, points);
T = [ ]; % Orthogonal transform
%%
for p = 1:points
line1_rw(:,p) = Base_wp + T*line1_wp(:,p);
line2_rw(:,p) = Base_wp + T*line2_wp(:,p);
end
% Make a surface plot of the work-piece and plot the two weld lines.
figure
hold on
surf(X, Y, Z, 'FaceAlpha',0.5)
plot3(x1, y1, z1, x2, y2, z2, 'LineWidth', 3)
hold off
grid on
xlabel('X');
ylabel('Y');
zlabel('Z');
(a) Make matrices for A and b in Matlab such that the system
of equations can be expressed as a matrix equation in the form
Ax = b.
(b) Is matrix A full rank? Based on the rank, does this matrix equa-
tion have a solution for x = (x1 , x2 , x3 )?
(c) If a solution exists, use Matlab’s left-divide operator to find it.
2. Consider the system of linear equations:
−8 x1 + 7 x2 − 7 x3 = 10
9 x1 − 10 x2 + 5 x3 = −9
5 x1 + 10 x2 − x3 = 6
(a) Use the orth function to find a matrix, Q, whose columns form
orthogonal basis vectors for the vector space (plane) spanned by
the two vectors.
50 kg
(a) Write a Matlab function to solve systems of equations. For square ma-
trix systems, use LU decomposition. For rectangular systems, use the
QR factorization equations from sections 5.11.2 and 5.11.3. Use Mat-
lab’s left divide operator to solve triangular systems.
(b) Write Matlab functions to solve lower and upper triangular systems of
equations with back-substitution and forward-substitution. Change your
function from the previous problem to use your triangular solvers.
Chapter 6
Application of Eigenvalues and
Eigenvectors
Eigenvalue / Eigenvector problems are one of the more important linear alge-
bra topics. Eigenvalues and eigenvectors are used to solve systems of differen-
tial equations, but more generally they are used for data analysis, where the
matrix A represents data rather than coefficients of a system of equations.
They introduce a simple, yet very powerful relationship between a matrix and
a set of special vectors and scalar values. This simple relationship provides
elegant solutions to some otherwise difficult problems.
Eigenvalues and eigenvectors are used to find the singular value decom-
position (SVD) matrix factoring, which is used by Matlab in many linear
algebra functions that we used in chapter 5 to solve systems of equations.
[Figure: vectors x1, x2, and x3 shown with the products A x1, A x2, and A x3.]
file is part of the downloadable code from the book’s website. The code defines
a function that plots an animation when a 2×2 matrix is passed to it. The
matrix must have real eigenvectors and eigenvalues, which is always achieved
when the matrix is symmetric (equal to its own transpose). The animation
shows the eigenvectors scaled by their eigenvalues, λi xi, a
rotating unit vector, and the rotating product of the matrix and vector. When
the unit vector is in line with an eigenvector, the product of the matrix
and vector overlays the eigenvector.
Ax = λ x
Ax = λI x
Then we rearrange the equation to find what is called the characteristic eigen-
value equation.
Ax − λI x = 0
(A − λI) x = 0
The case where x = 0 is a trivial solution that is not of interest to us. Eigen-
vectors are defined to be nonzero vectors. The solution only exists when the
columns of matrix A − λ I form a linear combination with x yielding the zero
vector. This linear dependence of the columns of the characteristic equation
means that it is singular—having a zero determinant.
det(A − λ I) = 0
Application of Eigenvalues and Eigenvectors 247
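\[
\begin{vmatrix}
a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda
\end{vmatrix} = 0
\]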
Note: We will use the determinant here on small matrices because it keeps
things simple. But as noted in section 5.2.4, calculating determinants
is computationally slow. So it is not used for large matrices. Matlab
uses an iterative algorithm based on the QR factorization. The QR
algorithm for finding eigenvalues is described in appendix A.7.
The determinant yields a degree-n polynomial, which can be factored to
find the eigenvalue roots.
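\[
A = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}
\]
\[
\begin{vmatrix} (2 - \lambda) & 2 \\ 2 & (-1 - \lambda) \end{vmatrix} = 0
\]
\[
\begin{aligned}
(2 - \lambda)(-1 - \lambda) - 4 &= 0 \\
\lambda^2 - \lambda - 6 &= 0 \\
(\lambda - 3)(\lambda + 2) &= 0
\end{aligned}
\]
\[
\lambda_1 = -2, \quad \lambda_2 = 3
\]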
For the generalized 2×2 matrix, the coefficient of the λ term in the quadratic
equation is the negative of the sum of the matrix diagonal (the trace), while
(a − λ)(d − λ) − bc = 0
λ2 − (a + d)λ + (ad − bc) = 0
This result for 2×2 matrices is further simplified by the quadratic equation.
We will define variables m as the mean of the diagonal elements and p as
the determinant of the matrix. Then we have a simple equation for the two
eigenvalues.
\[
\begin{aligned}
m &= \frac{a + d}{2} \\
p &= ad - bc \\
\lambda_1, \lambda_2 &= m \pm \sqrt{m^2 - p}
\end{aligned}
\]
Here is a quick example, which is verified with Matlab’s eig function.
>> A
A =
-6 -1
-8 -4
>> eig(A)
ans =
-8
-2
>> m = -5;
>> p = 24 - 8; % p = 16
>> l1 = m - sqrt(m^2 - p) % l1 = -5 - 3
l1 =
-8
>> l2 = m + sqrt(25 - 16)
l2 =
-2
parts will cancel leaving only real numbers. In other cases, the presence
of complex eigenvalues implies oscillation in a system (appendix B.1.2).
>> poly(r)
ans =
1.0000 -4.0000 1.0000 6.0000
The algorithm used by the roots function is short and quite clever.
FIGURE 6.2: We first found the eigenvalues of a 2×2 matrix using the de-
terminant of the characteristic matrix to find a polynomial which was factored
to find the polynomial roots, which are the matrix eigenvalues. Such a strat-
egy is slow for larger matrices. The iterative QR algorithm is faster for finding
eigenvalues. In fact, a better strategy for finding the roots of a polynomial is
to first change the polynomial to a matrix that has the same eigenvalues as the
polynomial roots. Then the QR algorithm may be used to find the polynomial
roots.
>> A = diag(ones(n-1,1),-1)
A =
0 0 0
1 0 0
0 1 0
>> A(1,:) = -p(2:n+1)./p(1)
A =
4 -1 -6
1 0 0
0 1 0
>> r = eig(A)
r =
3.0000
2.0000
-1.0000
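For reference, the matrix construction can be run as a complete script. The sketch below assumes the polynomial λ^3 − 4λ^2 + λ + 6 and names the matrix C_mat to avoid confusion with other variables; the roots function is called only as a cross-check.

```matlab
% Sketch: polynomial roots from the eigenvalues of a companion matrix.
p = [1 -4 1 6];                    % lambda^3 - 4*lambda^2 + lambda + 6
n = length(p) - 1;                 % degree of the polynomial
C_mat = diag(ones(n-1, 1), -1);    % ones on the first sub-diagonal
C_mat(1, :) = -p(2:n+1) ./ p(1);   % first row from the scaled coefficients
r_eig = sort(eig(C_mat));          % eigenvalues are the polynomial roots
r_roots = sort(roots(p));          % built-in function for comparison
disp(r_eig')
```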
\[ \lambda^3 - 4\lambda^2 + \lambda + 6 = 0 \]
• If analytic roots are needed, then the Symbolic Math Toolbox can help
(section 4.2.5).
Matlab has a function called eig that calculates both the eigenvalues
and eigenvectors of a matrix. When called as [X, L] = eig(A), the results are
returned as matrices. The eigenvectors are the columns of X. As with the null
function, the eig function always normalizes the eigenvectors. The eigenvalues
are on the diagonal of L; the Matlab function diag(L) will return the
eigenvalues as a column vector.
Passing the matrix as a symbolic math variable to eig will show eigenvec-
tors that are not normalized.
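A short sketch of the two-output form of eig follows. The matrix is an assumed example, and the checks confirm the unit length columns and the defining relationship A X = X L.

```matlab
% Sketch: eigenvalues and eigenvectors from the eig function.
A = [2 2; 2 -1];                  % assumed example matrix
[X, L] = eig(A);                  % eigenvectors in X, eigenvalues on diag of L
lambda = diag(L);                 % eigenvalues as a column vector
col_norms = sqrt(sum(X.^2, 1));   % length of each eigenvector column
err = norm(A*X - X*L);            % near zero when A*x = lambda*x holds
disp(lambda')
```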
A (c x) = λ (c x)
• The trace of a matrix (sum of the diagonal elements) is the sum of the
eigenvalues.
• The determinant of a matrix is the product of the eigenvalues.
• The eig function:
6.5.1 Diagonalization
The eigenpairs of the n×n matrix A give us n equations.
A x1 = λ1 x1
A x2 = λ2 x2
..
.
A xn = λn xn
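\[
\begin{aligned}
A X &= A \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \\
&= \begin{bmatrix} \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_n x_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \\
&= X \Lambda
\end{aligned}
\]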
The matrix Λ is a diagonal eigenvalue matrix. If the matrix has linearly inde-
pendent eigenvectors, which will be the case when all eigenvalues are different
(no repeating λs), then X is invertible.
AX = XΛ
X−1 A X = Λ
A = X Λ X−1
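The diagonalization equations can be verified with a brief sketch. The matrix is an assumed example with distinct eigenvalues, and the division operators are used in place of an explicit inverse.

```matlab
% Sketch: diagonalization of a matrix with distinct eigenvalues.
A = [4 1; 2 3];                   % assumed example, eigenvalues 2 and 5
[X, L] = eig(A);
A_rebuilt = X*L/X;                % X * Lambda * inv(X)
L_check = X\(A*X);                % inv(X) * A * X
fprintf('rebuild error = %.2e\n', norm(A_rebuilt - A));
```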
>> A = [5 7; 0 5]
A =
5 7
0 5
>> [X, L] = eig(A)
X =
1.0000 -1.0000
0 0.0000
L =
5 0
0 5
>> rank(X)
ans =
1
6.5.2 Powers of A
How does one compute matrix A raised to the power k (Ak )?
\[
A^k = \underbrace{\underbrace{\underbrace{A\,A}_{A^2}\,A^2}_{A^4}\,A^4 \cdots A^{k/2}}_{A^k}
\]
Ak = X Λk X−1
Because the Λ matrix is diagonal, only the individual λ values need be raised
to the k power (element-wise exponent).
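\[
\Lambda^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}
\]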
Example Power of A
>> S = gallery('moler', 4)
S =
1 -1 -1 -1
-1 2 0 0
-1 0 3 1
-1 0 1 4
>> [Q, L] = eig(S)
Q =
-0.8550 0.1910 0.3406 -0.3412
-0.4348 -0.6421 -0.6219 0.1093
-0.2356 0.6838 -0.4490 0.5248
-0.1562 -0.2894 0.5437 0.7722
L =
0.0334 0 0 0
0 2.2974 0 0
0 0 2.5477 0
0 0 0 5.1215
>> Q * L.^5 * Q'
ans =
425 -162 -639 -912
-162 110 204 273
-639 204 1022 1389
-912 273 1389 2138
>> S^5
ans =
FIGURE 6.3: Vector v is changed from coordinates (a1 , a2 ) using the stan-
dard basis vectors to coordinates (c1 , c2 ) using the eigenvectors of A as the
basis vectors.
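\[
\begin{aligned}
A v &= A X c \\
&= X \Lambda c \\
&= \lambda_1 c_1 x_1 + \lambda_2 c_2 x_2 + \cdots + \lambda_n c_n x_n \\
A^2 v &= A X \Lambda c \\
&= X \Lambda^2 c \\
&= \lambda_1^2 c_1 x_1 + \lambda_2^2 c_2 x_2 + \cdots + \lambda_n^2 c_n x_n \\
A^k v &= A X \Lambda^{k-1} c \\
&= X \Lambda^k c \\
&= \lambda_1^k c_1 x_1 + \lambda_2^k c_2 x_2 + \cdots + \lambda_n^k c_n x_n
\end{aligned}
\]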
The Λ matrix is the same diagonal eigenvalue matrix from the diagonalization
factoring.
Let’s illustrate this with an example. Consider the following matrix and
its eigenvectors and eigenvalues. We will find A2 v using the change of basis
strategy, where v = (3, 11) is changed to a linear combination of the eigen-
vectors.
\[ v = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n, \]
then
\[ A^k v = \lambda_1^k c_1 x_1 + \lambda_2^k c_2 x_2 + \cdots + \lambda_n^k c_n x_n. \qquad (6.2) \]
Note: Eigenvalues and eigenvectors are often complex, but when the terms
are added together, the result will be real (zero imaginary part).
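Equation (6.2) can be sketched in a few lines of code. The matrix here is an assumed example, not necessarily the one used in the text, and v = (3, 11) is changed to eigenvector coordinates with the left-divide operator.

```matlab
% Sketch: compute A^k * v with the change of basis strategy.
A = [2 2; 2 -1];                  % assumed example matrix
v = [3; 11];
k = 2;
[X, L] = eig(A);
lambda = diag(L);                 % eigenvalues as a column vector
c = X \ v;                        % coordinates of v in the eigenvector basis
Akv = X * (lambda.^k .* c);       % X * Lambda^k * c
direct = A^k * v;                 % direct calculation for comparison
disp([Akv direct])
```

Only the scalar eigenvalues are raised to the power k, so the cost does not grow with k as it does with repeated matrix multiplication.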
uk = A uk−1 .
The state of the system at the first observation as given by its initial condition
is
u1 = A u0 .
After the second observation, it is
\[ u_2 = A u_1 = A A u_0 = A^2 u_0. \]
After k observations, it is
\[ u_k = A^k u_0. \]
• If any 0 ≤ λi < 1.0, then that term goes to zero when k goes to
infinity.
• If any λi > 1.0, the equation goes to infinity.
Applying equation (6.2) gives us an equation for the state of a system defined
by a difference equation after k observations.
Let us use the change of basis strategy to compute F (k) for any value of k.
Using the quadratic equation, one can quickly find the two eigenvalues
from the determinant of the characteristic matrix.
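\[
m = \frac{1}{2}, \quad p = -1, \quad \lambda_{1,2} = \frac{1}{2} \pm \sqrt{\frac{1}{4} + 1} = \frac{1 \pm \sqrt{5}}{2}
\]
\[
\lambda_1 = \frac{1 - \sqrt{5}}{2}, \text{ and } \lambda_2 = \frac{1 + \sqrt{5}}{2}.
\]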
The algebra to find the symbolic equation for the eigenvectors and
the change of basis coefficients gets a little messy, but Chamberlain shows the
derivation in a nice blog post [10].
We can also let Matlab compute the eigenvalues and eigenvectors using
the Symbolic Math Toolbox to get the exact equations so that they can be
used in the solution.
Then each Fibonacci number is given by the first value of the difference equa-
tion vector.
\[ \text{Fibonacci}(k) = c_1 \lambda_1^k x_{1,1} + c_2 \lambda_2^k x_{1,2} \]
With this solution, Fibonacci(0) = 1, but we observe that the eigenvector
terms used are the same as the eigenvalues, so we are actually raising the
eigenvalues to the (k + 1) power. Thus we can shift the terms and get a result
with Fibonacci(0) = 0.
Code 6.1 shows a closed form (constant run time) Fibonacci sequence func-
tion. The function calculates integer values. The round function removes the
small floating point errors so that whole numbers are returned; the values are
still stored as double precision numbers. The commented lines show code that
yields the equations used in the code.
% A = [1 1; 1 0];
% [X, L] = eig(sym(A));
% X = [(1 - sqrt(5))/2 (1 + sqrt(5))/2; 1 1];
L = [(1 - sqrt(5))/2 0; 0 (1 + sqrt(5))/2];
% u0 = [1; 0];
% c = X\u0;
c = [-1/sqrt(5); 1/sqrt(5)];
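A compact sketch of the closed form calculation is given below. It is written as an anonymous function rather than as the function file of code 6.1, and the name fib is an assumption. The result is compared against the difference equation with the matrix [1 1; 1 0].

```matlab
% Sketch: closed form Fibonacci numbers from the eigenvalues.
lambda1 = (1 - sqrt(5))/2;
lambda2 = (1 + sqrt(5))/2;
% c1*lambda1^k + c2*lambda2^k with c = [-1/sqrt(5); 1/sqrt(5)]
fib = @(k) round((lambda2.^k - lambda1.^k) ./ sqrt(5));
first_terms = fib(0:10);
u = [1 1; 1 0]^10 * [1; 0];       % difference equation, u = [F(11); F(10)]
disp(first_terms)
```

For large k the floating point result eventually loses integer accuracy, so this sketch is only reliable for moderate values of k.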
[Figure: state transition diagram for the weather states mostly sunny, mostly cloudy, and rain, with a transition probability on each arrow.]
The columns and rows are here ordered as sunny, cloudy, and rain.
\[
M = \begin{bmatrix} 0.7 & 0.4 & 0.3 \\ 0.2 & 0.4 & 0.5 \\ 0.1 & 0.2 & 0.2 \end{bmatrix}
\]
Matrix multiplication tells us the weather forecast for tomorrow if it is cloudy
today.
>> M = [0.7, 0.4, 0.3; 0.2, 0.4, 0.5; 0.1, 0.2, 0.2]
M =
0.7000 0.4000 0.3000
0.2000 0.4000 0.5000
0.1000 0.2000 0.2000
>> M * [0; 1; 0]
ans =
0.4000 % probability sunny
0.4000 % probability cloudy
0.2000 % probability rain
The steady state of the system only depends on the eigenvector corre-
sponding to the eigenvalue of value one. The other terms will go to zero in
future observations.
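The steady state can be sketched directly from the eigenvector whose eigenvalue is one. The scaling step, which makes the probabilities sum to one, and the 50-day comparison are assumptions made for this illustration.

```matlab
% Sketch: steady state of the weather Markov matrix.
M = [0.7 0.4 0.3; 0.2 0.4 0.5; 0.1 0.2 0.2];
[X, L] = eig(M);
[~, idx] = min(abs(diag(L) - 1));       % locate the eigenvalue equal to one
steady = X(:, idx) / sum(X(:, idx));    % scale so the probabilities sum to one
forecast = M^50 * [0; 1; 0];            % cloudy today, 50 days ahead
disp([steady forecast])
```

Starting from sunny or rain instead of cloudy should give the same distant forecast.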
We can plot the forecast to see the convergence. Code 6.2 shows the code for
the change of basis calculation and plot of the weather forecast. The markers
in figure 6.5 show the starting and ending of the forecast period. Each point
in the 3-D plot is a probability of sun, clouds, or rain. We see from the plot
that the distant forecast converges to the same probability regardless of the
weather condition today.
FIGURE 6.5: A 3-D plot showing the Markov matrix forecast probabilities
given that today is each of sunny, cloudy, or raining.
Markov Eigenvalues
>> M = [0.7, 0.4, 0.3; 0.2, 0.4, 0.5; 0.1, 0.2, 0.2];
>> rank(M)
ans =
3
>> d = M - eye(3)
d =
-0.3000 0.4000 0.3000
0.2000 -0.6000 0.5000
0.1000 0.2000 -0.8000
>> rank(d)
ans =
2
% File: MarkovWeather.m
% Script showing change of basis calculation for
% a Markov matrix.
N = 20;
F_sun = zeros(3, N);
F_clouds = zeros(3, N);
F_rain = zeros(3, N);
% build forecast for next N days
for k = 1:N
F_sun(:,k) = c_sun(1)*X(:,1) + c_sun(2)*l(2)^k*X(:,2) ...
+ c_sun(3)*l(3)^k*X(:,3);
F_clouds(:,k) = c_clouds(1)*X(:,1) + c_clouds(2)*l(2)^k*X(:,2) ...
+ c_clouds(3)*l(3)^k*X(:,3);
F_rain(:,k) = c_rain(1)*X(:,1) + c_rain(2)*l(2)^k*X(:,2) ...
+ c_rain(3)*l(3)^k*X(:,3);
end
CODE 6.2: Script showing the change of basis calculation and plot of the
weather forecast based on a Markov model.
greater than one. A more rigorous proof is easy to show for a 2×2 matrix,
and is intuitive but difficult to formally verify for larger matrices. Recall two
eigenvalue properties from section 6.3:
1. The trace of a matrix (sum of the diagonal values) is the sum of the
eigenvalues.
2. The determinant of a matrix is the product of the eigenvalues.
Since all entries of a Markov matrix are probabilities, every entry must
be between zero and one. The identity matrix has the highest trace of any
valid Markov matrix. For an n×n identity matrix, the trace is n, and all of
its eigenvalues are one. All other valid Markov matrices
have a trace less than n; therefore, the sum of the eigenvalues is ≤ n. For
a 2×2 matrix, it is easy to see that both eigenvalues are ≤ 1 because at least one
eigenvalue is equal to one.
\[
\begin{bmatrix} a & 1-b \\ 1-a & b \end{bmatrix}, \qquad \text{trace} = a + b \le 2
\]
\[ 1 + \lambda_2 \le 2 \]
\[ \lambda_2 \le 1 \]
The maximum determinant of a Markov matrix is one, which is the case for
the identity matrix. All other Markov matrices have a determinant less than
one. Therefore, the product of the eigenvalues is ≤ 1.
\[
\begin{vmatrix} a & 1-b \\ 1-a & b \end{vmatrix} = a + b - 1 \le 1
\]
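These eigenvalue properties can be checked numerically. The sketch below uses the weather Markov matrix from earlier in the chapter; the variable name `lambda` is chosen only for this illustration.

```matlab
% Sketch: check the eigenvalue properties of a Markov matrix.
M = [0.7, 0.4, 0.3; 0.2, 0.4, 0.5; 0.1, 0.2, 0.2];
lambda = eig(M);

disp(sum(M))                      % each column sums to one
disp(max(abs(lambda)))            % largest eigenvalue magnitude
disp([trace(M), sum(lambda)])     % trace equals the sum of the eigenvalues
disp([det(M), prod(lambda)])      % determinant equals their product
```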
\[ y = c\, e^{A t}. \qquad (6.6) \]
Steady state
The steady state of a system refers to the state of the system when
the control variable, which is often t for time, goes to infinity. Because
the eigenvalues are in an exponent, they determine the steady state.
For the steady state to be stable, meaning that it does not go to infinity
and does not oscillate, all eigenvalues must be real and less than or
equal to zero. Some ODE systems have complex eigenvalues. When this
occurs, the solution will have oscillating terms because of Euler’s complex
exponential equation, $e^{i x} = \cos(x) + i \sin(x)$ (appendices B.1 and B.1.2).
In matrix notation,
\[
y' = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix} y, \qquad
y(0) = \begin{bmatrix} 6 \\ 2 \end{bmatrix}.
\]
We first use Matlab to find the eigenvalues and eigenvectors. Matlab always
returns normalized eigenvectors, which can be multiplied by a constant to get
simpler numbers if desired.
The columns of the X matrix are the eigenvectors. The eigenvalues are on the
diagonal of L for Lambda (Λ). Our solution has the form
\[
y(t) = c_1\, e^{-3t} \begin{bmatrix} 1 \\ -1 \end{bmatrix}
     + c_2\, e^{-t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
>> y0 = [6;2];
>> c = X\y0
c =
2.0000
4.0000
\[ y_1(t) = 2\, e^{-3t} + 4\, e^{-t} \]
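The whole calculation can be collected into a few lines. This is a sketch that uses the normalized eigenvectors exactly as `eig` returns them, so the coefficients in `c` differ from the scaled values shown above, although the product `X*c` and the solution are the same. The function handle `y` is a name assumed for this illustration.

```matlab
% Sketch: solve y' = A*y, y(0) = y0 with eigenvalues and eigenvectors.
A = [-2 1; 1 -2];
y0 = [6; 2];
[X, L] = eig(A);                  % columns of X are normalized eigenvectors
lambda = diag(L);
c = X \ y0;                       % coefficients for the normalized eigenvectors

% y(t) = sum of c_i * exp(lambda_i * t) * x_i, for a scalar t
y = @(t) real(X * (c .* exp(lambda * t)));

disp(y(0)')                       % reproduces the initial condition
disp(y(1)')                       % state at t = 1
```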
[Block diagram with the signals r, e, u, y, and x connecting the Controller, System Plant, and Sensors blocks.]
FIGURE 6.6: The input to the controller may be either the estimate of
the system’s state, x, or the difference, e, between the state and a reference
set-point, r. The system plant includes the models of the system constraints,
physical environment, and the mechanisms whereby the control signal, u can
guide the state of the system. The observed output of the system is y. The
sensors attempt to measure the state of the system.
controllers look at the error between the desired set-point and the measured
system output. The control variable, u, is the sum of three values. A term that
is proportional to the error adjusts the control variable to minimize the error.
An integral term guards against long-term differences between the output
and set-point. The derivative term suppresses rapid fluctuations of the control
variable that the proportional control alone might produce.
The state space alternative to the PID controller models changes to the
state of the system. The state of the system is stored in a vector, x, having
terms for any variables that are significant and can be measured. The sys-
tem and its control mechanisms use differential equations that track changes
(derivatives) to the state. For a linear time-invariant (LTI) system, the state
space equations are first-order differential equations expressed with vectors
and matrices. Figure 6.7 shows how the state space equations (6.8) to (6.10)
relate in the control system.
[Figure 6.7, block diagram. System: ẋ = A x + B u with input u and output y. Sensors: x = C y. Controller: u = −K x.]
In the system plant, the A matrix models the physics of the system. B,
which may be a matrix or a column vector, defines the ability of the control
variable, u, to guide the system toward the desired behavior. A and B rep-
resent the given model of the system. The observable output of the system
may define the state of the system (x = Cy). But in some cases, it may be
challenging to determine the system’s state from the sensors. Kalman filters
are sometimes used when the sensors alone cannot determine the system’s
state. The variable K, which may be a scalar, a row vector, or a matrix, is
the tunable variable that we use to ensure a stable steady state condition.
FIGURE 6.8: The state of the unstable system quickly goes to infinity.
FIGURE 6.9: The state of the stable system has a steady state of (0, 0).
ẋ = (A − B K C)x
% File: closedLoopCtrl.m
% Simulation and analytic solution of a stable
% and unstable closed-loop control system.
N = 200;
A = [1 1.8; -0.7 2.5];
B = [1 0.5; 0.5 1];
T = 15;
t = linspace(0, T, N);
dt = t(2) - t(1); % time step that matches the spacing of t
X = zeros(2, N);
X(:,1) = [2; 1];
K_unstable = eye(2);
Atarget = [0.5 -1; 1.5 -2];
K_stable = B\(A - Atarget);
% K = K_unstable;
K = K_stable;
% simulation
for k = 2:N
x = X(:,k-1);
u = -K*x;
x_dot = A*x + B*u;
X(:,k) = x + x_dot*dt;
end
% analytic solution
x0 = X(:,1);
[ev, L] = eig(A - B*K);
c = ev\x0;
y = real(c(1)*exp(L(1,1)*t).*ev(:,1) + ...
c(2)*exp(L(2,2)*t).*ev(:,2));
% plot results
hold on
plot(t, X(1,:))
plot(t, X(2,:))
plot(t, y(1,:))
plot(t, y(2,:))
hold off
legend('Simulation y1', 'Simulation y2', ...
'Analytic y1', 'Analytic y2', 'Location', 'north')
xlabel('t')
title('Stable Control System')
Equation (6.11) is sufficient for our application here, but it assumes that ẋ
is constant between xk−1 and xk . More precise algorithms for simulating the
response of ODEs are presented in section 7.5.
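The stability of each choice of K in closedLoopCtrl.m can be confirmed from the eigenvalues of the closed-loop matrix A − B K without running the simulation. The sketch below reuses the matrices from the script; the names `lam_stable` and `lam_unstable` are assumptions for this illustration.

```matlab
% Sketch: check closed-loop stability from the eigenvalues of (A - B*K).
A = [1 1.8; -0.7 2.5];
B = [1 0.5; 0.5 1];
Atarget = [0.5 -1; 1.5 -2];

K_stable = B \ (A - Atarget);     % makes A - B*K equal to Atarget
K_unstable = eye(2);

lam_stable = eig(A - B*K_stable);
lam_unstable = eig(A - B*K_unstable);

disp(lam_stable')                 % all real parts negative: state decays to (0, 0)
disp(lam_unstable')               % a positive real part: state grows without bound
```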
1. The SVD can factor any matrix, even singular and rectangular matrices.
2. The SVD is used to solve many linear algebra problems and has applica-
tion to many science and engineering fields of study including artificial
intelligence, data analytics, and image processing. The linear algebra ap-
plications of the SVD include low-rank matrix approximations, finding
the rank of a matrix, finding the pseudo-inverse of rectangular matrices,
finding orthogonal basis vectors for the columns of a matrix, finding the
condition number of a matrix, and finding the null space and left null
space of a matrix.
3. The SVD is now fast and numerically accurate. That was not always
the case. An improved algorithm for finding the SVD became available
in 1970. The history of the SVD is briefly described below.
This fast algorithm and the vast application domain of the decomposition
have resulted in the SVD being a capital algorithm in software such as Matlab.
For several applications, it has replaced elimination-based algorithms in the
internal workings of Matlab and other software.
Note: Our discussion of the SVD algorithm focuses on its interpretation
and application rather than its newer, more efficient implementation
in software such as Matlab. The complexity of that algorithm is
beyond the scope of this text and is actually less important to un-
derstanding and applying it than the classic algorithm.
want to factor out the stretching and rotating components of the matrix. For
an n×n square matrix A and an n×1 unit vector v, we have
Av = σ u, (6.12)
where σ is a scalar that is the length of the resulting vector and u is a unit
vector that shows the direction of the resulting vector. There are n vector–
matrix relationships like equation (6.12) that make the SVD factorization of an
n×n square matrix. We will soon see that the SVD can also be used to factor
rectangular matrices.
Consider the following matrix and two orthogonal unit vectors.
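\[
A = \begin{bmatrix} -0.1895 & 2.6390 \\ 3.1566 & -1.7424 \end{bmatrix}, \qquad
v_1 = \frac{\sqrt{2}}{2} \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \qquad
v_2 = \frac{\sqrt{2}}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]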
In figure 6.10, we show a plot of the two vectors and the product of each with
the A matrix. Both vectors are rotated and scaled by what are called the
singular values: σ1 = 4, and σ2 = 2.
\[ A v_1 = \sigma_1 u_1 = (2,\ -2\sqrt{3}) \]
matrix equation.
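\[
\begin{aligned}
A V &= A \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}
     = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 & \cdots & \sigma_n u_n \end{bmatrix} \\
    &= \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}
       \begin{bmatrix}
         \sigma_1 & 0 & \cdots & 0 \\
         0 & \sigma_2 & \cdots & 0 \\
         \vdots & \vdots & \ddots & \vdots \\
         0 & 0 & \cdots & \sigma_n
       \end{bmatrix} \\
    &= U \Sigma
\end{aligned}
\]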
Note: This matrix equation is written as for a square matrix (m = n). For
rectangular matrices, the Σ matrix needs to be padded with either
rows or columns of zeros to accommodate the matrix multiplication
requirements. We will see this in the Matlab examples. The number
of nonzero σs is r, the rank of A.
Then a factoring of A is $A = U \Sigma V^{-1}$, but a requirement of the v vectors
is that they are orthogonal unit vectors such that $V^{-1} = V^T$, so the SVD
factoring of A is
\[ A = U \Sigma V^T. \qquad (6.13) \]
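The construction can be tried with the 2×2 matrix and the two orthogonal unit vectors given earlier in this section. This is a sketch; because the entries of A are rounded to four decimal places, the singular values come out only approximately equal to 4 and 2.

```matlab
% Sketch: build U and Sigma from A*v = sigma*u and confirm A = U*Sigma*V'.
A = [-0.1895 2.6390; 3.1566 -1.7424];
v1 = sqrt(2)/2 * [-1; 1];
v2 = sqrt(2)/2 * [1; 1];

sigma1 = norm(A*v1);   u1 = A*v1 / sigma1;   % length and direction of A*v1
sigma2 = norm(A*v2);   u2 = A*v2 / sigma2;   % length and direction of A*v2

U = [u1 u2];
Sigma = diag([sigma1 sigma2]);
V = [v1 v2];

disp([sigma1 sigma2])             % approximately 4 and 2
disp(norm(A - U*Sigma*V'))        % near zero
disp(svd(A)')                     % Matlab's singular values for comparison
```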
\[ U = A V \Sigma^{-1} \]
U = A*V*pinv(Sigma);
or
U = (A*V)/Sigma;
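\[
A_{m \times n} = U_{m \times m}\, \Sigma_{m \times n}\, V^T_{n \times n}, \qquad
\Sigma_{m \times n} =
\begin{bmatrix}
  \sigma_1 & & 0 \\
  & \ddots & \\
  0 & & \sigma_n \\
  & 0_{(m-n) \times n} &
\end{bmatrix}
\]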
Note that in the multiplication of the factors to yield the original matrix,
(m − n) columns of U for an over-determined matrix and (n − m) rows of $V^T$
for an under-determined matrix are multiplied by zeros from Σ. They are not
needed to recover A from its factors. Many applications of the SVD do not
require the unused columns of U or the unused rows of $V^T$, so the economy
SVD is often used instead of the full SVD. The economy SVD removes the
unused elements. Figure 6.13 shows the related sizes of the economy
sub-matrices for over-determined matrices. The primary difference to be aware of
when applying the economy SVD is a degradation of the unitary properties of U
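\[
A_{m \times n} = U_{m \times m}
\begin{bmatrix}
  \begin{matrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_m \end{matrix}
  & 0_{m \times (n-m)}
\end{bmatrix}
V^T_{n \times n}
\]
\[
A_{m \times n} = \tilde{U}_{m \times n}
\begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix}
V^T_{n \times n}
\]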
Another difficulty arising from the padded zeros in the full SVD Σ is that
the full U matrix for an over-determined matrix cannot be directly found
from A, V, and Σ as may be done for square matrices or using the economy
SVD of a rectangular matrix. The same problem exists for finding the full V
matrix for an under-determined matrix. However, finding U and V from both
$A A^T$ and $A^T A$ is fraught with problems.
Of course, Matlab’s svd function and similar software do not use the clas-
sic algorithm to find the SVD, so Matlab does not have the same challenges
when computing the full SVD factors.
>> A = [4 4; -3 3]
A =
     4     4
    -3     3
>> [U,S,V] = svd(A)
U =
   -1.0000   -0.0000
    0.0000    1.0000
S =
    5.6569         0
         0    4.2426
V =
   -0.7071   -0.7071
   -0.7071    0.7071
>> U*S*V'
ans =
    4.0000    4.0000
   -3.0000    3.0000
>> A = [2 3; 4 6]
A =
     2     3
     4     6
>> rank(A)
ans =
     1
>> [U,S,V] = svd(A)
U =
   -0.4472   -0.8944
   -0.8944    0.4472
S =
    8.0623         0
         0    0.0000
V =
   -0.5547   -0.8321
   -0.8321    0.5547
>> U*S*V'
ans =
    2.0000    3.0000
    4.0000    6.0000
>> A = [2 3; 4 10; 5 12]
A =
     2     3
     4    10
     5    12
>> [U, Sigma, V] = svd(A)
U =
   -0.2053    0.9649   -0.1638
   -0.6243   -0.2580   -0.7373
   -0.7537   -0.0490    0.6554
Sigma =
   17.2482         0
         0    0.7077
         0         0
V =
   -0.3871    0.9220
   -0.9220   -0.3871
>> U * Sigma * V'
ans =
    2.0000    3.0000
    4.0000   10.0000
    5.0000   12.0000
>> [U, Sigma, V] = svd(A, 'econ')
U =
   -0.2053    0.9649
   -0.6243   -0.2580
   -0.7537   -0.0490
Sigma =
   17.2482         0
         0    0.7077
V =
   -0.3871    0.9220
   -0.9220   -0.3871
>> U * Sigma * V'
ans =
    2.0000    3.0000
    4.0000   10.0000
    5.0000   12.0000
function show_SVD(A)
% SHOW_SVD - A demonstration of how the U, S, and V matrices
% from SVD rotate, stretch, and rotate vectors making a circle.
% Try several 2x2 matrices to see how each behaves.
% Assumed input points: 30 points on the unit circle, which matches
% the loop limit used below.
theta = linspace(0, 2*pi, 30);
x = [cos(theta); sin(theta)];
[U,S,V] = svd(A);
Vx = V'*x; % rotate the points by V'
figure, plot(x(1,:), x(2,:), '*')
hold on
scatter(Vx(1,:), Vx(2,:))
for z = 1:30
line([x(1,z), Vx(1,z)], [x(2,z), Vx(2,z)], 'Color', 'k')
end
title('Rotation by V^T Matrix')
hold off
svx = S*Vx; % stretch by the singular values
figure, scatter(svx(1,:), svx(2,:), '*')
daspect([1 1 1])
usvx = U*svx; % rotate by U
hold on
scatter(usvx(1,:), usvx(2,:), 'o')
title('Stretch by \Sigma and rotation by U')
hold off
CODE 6.5: Demonstration function that plots points starting on a circle and
being rotated by $V^T$, stretched by Σ, and rotated again by U.
The first few σ values are larger than later values, which may be zero or
near zero. The Eckart–Young theorem, developed in 1936, says that the closest
rank k < r matrix to the original comes from the SVD [7, 21]. The choice of
k should be based on the singular values. It might be chosen such that the
sum of the first k singular values constitutes 95%, or another percentage, of
the sum of all singular values. Gavish and Donoho [26] suggest that if the
noise can be modeled independently of the data, then k should be the number
of singular values from the data that are larger than the largest singular value
of the noise. They also suggest algorithms and equations for picking k when
the noise cannot be modeled. The rank-k approximation of A is given by the
following sum of rank-1 matrices.
\[ A_k = \sum_{i=1}^{k} \sigma_i\, u_i\, v_i^T \]
Eckart–Young Theorem
Consider two different matrices, A and B, of the same size. A has rank
r > k. Define $A_k$ as the sum of the first k rank-1 outer product matrices
from the SVD. Both $A_k$ and B have rank k. Then the Eckart–Young
theorem tells us that $A_k$ is at least as close a match to A as any other rank k
matrix, B [21]. We express this with the norm (magnitude) of the differences
between matrices: $\|A - B\| \ge \|A - A_k\|$. This relationship holds for all
matrix norms listed in appendix A.1.3.
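A small numerical experiment illustrates the theorem. The sketch below builds a rank-k approximation from the SVD outer products and compares it to another rank-k matrix. The random test matrices, the fixed seed, and the names `Ak` and `B` are assumptions made only for this illustration.

```matlab
% Sketch: rank-k approximation from the SVD and an Eckart-Young check.
rng(0);                                   % fixed seed for repeatable results
A = randn(6, 8);
[U, S, V] = svd(A);

k = 2;
Ak = zeros(size(A));
for i = 1:k
    Ak = Ak + S(i,i) * U(:,i) * V(:,i)';  % add one rank-1 outer product
end

B = randn(6, k) * randn(k, 8);            % some other rank-k matrix

disp(rank(Ak))
disp([norm(A - Ak, 'fro'), norm(A - B, 'fro')])  % Ak is at least as close
disp([norm(A - Ak), S(k+1,k+1)])          % 2-norm error equals sigma_(k+1)
```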
The example in code 6.6 builds a simple pattern in the matrix and then
adds noise to the matrix. We see that the best representation of the origi-
nal matrix is found in the first three terms of the SVD outer product. The
remaining two terms restore the noise, which we would just as well do without.
Code 6.7 shows another dimensionality reduction example. This one ex-
amines the SVD of an image. With the sum of a few rank-1 matrices from
the SVD, the major parts of the image start to take shape. The higher terms
add fine details. Something to think about: how is a low rank image from
SVD different or the same as a low-pass filtered image? Figure 6.16 shows the
singular values of the image, and the images with progressive rank are shown
in figure 6.17.
% File: bestKsvd.m
% Demonstrate the Eckart-Young theorem that says that the
% closest rank k matrix to the original comes from the SVD.
% Because of the random noise added, you may want to run this
% more than once. You should see that after about 3 SVD terms,
% the SVD matrix is closest (lowest mean-squared error) to the
% clean matrix before the noise was added. The later terms
% mostly add the noise back into the matrix. We can see why a
% rank k (k < r) matrix from SVD may be preferred to the
% full rank r matrix.
A1 = 5*eye(5) + 5*rot90(eye(5));
Aclean = [A1 A1]; % A test pattern
A = [A1 A1] + 0.5*randn(5, 10); % add some noise
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
Ak = zeros(5, 10);
figure
% NOTE:
% >> rank(A1)
% ans =
% 3
% >> rank(A)
% ans =
% 5
CODE 6.6: Demonstration showing that a rank-k SVD can retain the essence
of data with some noise removed.
FIGURE 6.16: The singular values of the cameraman image. Most of the
important data to the image is contained in a small number of rank 1 images.
% File: bestKsvdImage.m
% Use an image to demonstrate the Eckart-Young theorem that says
% that the closest rank k matrix to the original comes from the SVD.
[U, S, V] = svd(A);
figure, plot(diag(S), '*'), title('Singular Values')
%%
Ak = zeros(m, n);
figure
FIGURE 6.17: Images with progressive rank from the sum of SVD outer
products.
unlike the matrix equations, the same equation of the SVD sub-matrices finds
the pseudo-inverse for both under-determined and over-determined matrices.
Recall that since U and V are orthogonal, their inverses are just their transposes.
\[ A^+ = \left( U \Sigma V^T \right)^+ = V \Sigma^+ U^T \]
Since Σ is a diagonal matrix, its pseudo-inverse is found by taking the reciprocal
of the nonzero elements on the diagonal and transposing the result. The zero
elements remain zero.
Matlab uses the SVD method for computing the pseudo-inverse for both
over-determined and under-determined matrices.
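The equation above can be checked against Matlab's `pinv` function. This is a sketch, not Matlab's internal implementation; the cutoff `tol` used to decide which singular values count as nonzero and the name `Splus` are assumptions for this illustration.

```matlab
% Sketch: pseudo-inverse from the SVD sub-matrices, compared to pinv.
A = [2 3; 4 10; 5 12];                    % over-determined example
[U, S, V] = svd(A);

tol = max(size(A)) * eps(max(diag(S)));   % assumed cutoff for "nonzero"
Splus = zeros(size(A'));                  % same size as the transpose of S
for i = 1:min(size(A))
    if S(i,i) > tol
        Splus(i,i) = 1 / S(i,i);          % reciprocal of nonzero singular values
    end
end

Aplus = V * Splus * U';
disp(norm(Aplus - pinv(A)))               % near zero
```

The same lines work for an under-determined matrix, such as the transpose of A, because `Splus` is sized from the transpose of the input.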
Economy SVD
The economy SVD removes the unused columns of U and the zero rows
of Σ.
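\[ A = \tilde{U} \tilde{\Sigma} V^T \]
\[ \hat{x} = A^+ b = V \tilde{\Sigma}^+ \tilde{U}^T b \]
\[
\begin{aligned}
p &= A \hat{x} \\
  &= \tilde{U} \tilde{\Sigma} V^T V \tilde{\Sigma}^+ \tilde{U}^T b \\
  &= \tilde{U} \tilde{U}^T b
\end{aligned}
\]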
FIGURE 6.18: Four vector projection alternatives. The projection lines ap-
pear as one line because they are on top of each other.
% File: four_projections.m
% Comparison of 4 ways to compute vector projections of an
% over-determined system.
%
%% Over-determined System with noise
t = linspace(0,20);
y = 10 - 0.75.*t + 5*randn(1,100);
b = y';
scatter(t,b)
A = ones(100,2); % A is the design matrix
A(:,2) = t';
%% Alternate Gram-Schmidt
G = mod_gram_schmidt(A);
% u1 = G(:,1); % could use vectors for projection
% u2 = G(:,2);
% p2 = b'*u1*u1 + b'*u2*u2;
p2 = G*G'*b; % or matrix multiplication accomplishes the same
%% Plot
figure, hold on, scatter(t, b)
plot(t, p1), plot(t, p2), plot(t, p3), plot(t, p4)
hold off
legend('Noisy Data', 'Onto Column Space', 'Gram-Schmidt', ...
'SVD', 'Orth function')
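The listing refers to projections named p1, p3, and p4 in the plot legend. The following self-contained sketch shows one plausible way to compute those three projections and confirms that they agree. The formulas used for p1, p3, and p4 are assumptions based on the legend labels, and the fixed random seed is added only for repeatability.

```matlab
% Sketch: three equivalent projections of b onto the column space of A.
rng(1);                                   % fixed seed for repeatable results
t = linspace(0, 20)';                     % 100 sample points
b = 10 - 0.75*t + 5*randn(100, 1);        % noisy line
A = [ones(100, 1), t];                    % design matrix

p1 = A * ((A'*A) \ (A'*b));               % onto the column space of A
[U, ~, ~] = svd(A, 'econ');
p3 = U * (U' * b);                        % economy SVD: p = U*U'*b
Q = orth(A);
p4 = Q * (Q' * b);                        % orth function

disp([norm(p1 - p3), norm(p1 - p4)])      % both near zero
```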
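\[
A_{m \times n} =
\begin{bmatrix} \tilde{U}_{m \times r} & U_{null} \end{bmatrix}
\begin{bmatrix}
  \begin{matrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_r \end{matrix} & 0 \\
  0 & 0
\end{bmatrix}
\begin{bmatrix} \tilde{V}^T_{r \times n} \\ V^T_{null} \end{bmatrix}
\]
FIGURE 6.19: The SVD has r (matrix rank) nonzero singular values on the
diagonal of Σ. The columns of U and rows of $V^T$ that are multiplied by either
nonzero or zero valued singular values define the column space, row space, left
null space, and the null space of the matrix.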
Column Space
The first r columns of U ($\tilde{U}$) are the basis vectors for the vector space
spanned by the columns of A.
Row Space
The first r rows of $V^T$ ($\tilde{V}^T$) are the basis vectors for the space spanned
by the rows of A.
Null Space
Rows r + 1 to n of $V^T$ ($V^T_{null}$) are the transpose of the basis vectors for
the space spanned by the column vectors v such that Av = 0. Matlab’s null
function returns these vectors. Note that square, full-rank matrices have
only the trivial null space (the zero vector).
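The sub-matrices can be extracted with simple indexing once the rank is known. The sketch below uses a small rank 2 matrix chosen for this illustration; the variable names `colSpace`, `rowSpace`, `nullSpace`, and `leftNull` are likewise assumptions.

```matlab
% Sketch: basis vectors of the four subspaces from the full SVD.
A = [1 2 3; 2 4 6; 1 0 1];                % row 2 is twice row 1, so rank is 2
[U, S, V] = svd(A);
r = rank(A);

colSpace  = U(:, 1:r);                    % first r columns of U
rowSpace  = V(:, 1:r);                    % first r rows of V', stored as columns
nullSpace = V(:, r+1:end);                % remaining rows of V', as columns
leftNull  = U(:, r+1:end);                % remaining columns of U

disp(norm(A * nullSpace))                 % near zero: A*v = 0
disp(norm(leftNull' * A))                 % near zero: u'*A = 0
```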
calculation yields one singular value of zero, while one singular value from the
second set will be close to zero.
>> A = randi(10, 4); % Start with a random, usually full rank matrix
>> A(:,4) = sum(A(:,1:3), 2)/3; % make singular
>> svd(A) % zero singular value
ans =
24.5089
6.1966
4.3492
0.0000
\[ R = U \Sigma U^T \]
\[ Q = U V^T \]
Recall from section 3.6.3 that a covariance matrix is symmetric with the vari-
ance of each feature on the diagonal and the covariance between features in
the off-diagonal positions.
\[ Cov_x = \frac{D^T D}{m - 1} \]
\[
Cov_x =
\begin{bmatrix}
  Var(x_1, x_1) & Cov(x_1, x_2) & \cdots & Cov(x_1, x_n) \\
  Cov(x_2, x_1) & Var(x_2, x_2) & \cdots & Cov(x_2, x_n) \\
  \vdots & \vdots & \ddots & \vdots \\
  Cov(x_n, x_1) & Cov(x_n, x_2) & \cdots & Var(x_n, x_n)
\end{bmatrix}
\]
Next, the eigenvectors, V, and eigenvalues of the covariance matrix are found.
The eigenvalues are used to sort the eigenvectors. Because we wish to achieve
dimensionality reduction, only the eigenvectors corresponding to the k largest
eigenvalues are used to map the data onto the PCA space.
\[ W = \begin{bmatrix} v_1 & \cdots & v_k \end{bmatrix} \]
Note: The V sub-matrix from the SVD of the D matrix could be used as
the eigenvectors ([~, ~, V] = svd(D)).
W gives us the linear combinations of the data features to project the data
onto the PCA space, which is found by multiplying the W matrix by each
sample vector.
Y = D W.
For purposes of classification and object recognition, the data in the PCA
space (Y) is used to display or compare the data.
To reconstruct the original data with reduced dimensionality, we just re-
verse the steps.
\[
\begin{aligned}
\tilde{X} &= \tilde{D} + \mu \\
          &= (D\, W)\, W^T + \mu \\
          &= Y\, W^T + \mu
\end{aligned}
\]
Transposed data
If the measured features are held in the rows with the samples in the
columns, then the adjusted equations are as follows.
\[
D = \begin{bmatrix} (x_1 - \mu_1) \\ (x_2 - \mu_2) \\ \vdots \\ (x_n - \mu_n) \end{bmatrix}
\]
\[ Cov_x = \frac{D\, D^T}{n - 1}. \]
\[ Y = W^T D. \]
% File: PCA_demo.m
% PCA - 2D Example
n = 2;
m = 50;
% two columns random but correlated data
X = zeros(m, n);
X(:,1) = 5*randn(m, 1) - 2.5;
X(:,2) = -1 + 1.5 * X(:,1) + 5*randn(m,1);
X = sortrows(X,1);
function [eigvec,eigval]=eig_decomp(C)
[eigvec,eigval]=eig(C);
eigval=abs(diag(eigval)');
[eigval, order]=sort(eigval, 'descend');
eigval=diag(eigval);
eigvec=eigvec(:, order);
end
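The PCA steps can be followed end to end with a short script. This is a minimal sketch that assumes samples are in rows and features are in columns, and that generates data in the same way as PCA_demo.m. The fixed seed, the choice of k = 1, and the names `mu`, `C`, `evals`, and `Xr` are assumptions for this illustration.

```matlab
% Sketch: PCA of 2-D correlated data, samples in rows.
rng(2);                                   % fixed seed for repeatable results
m = 50;
X = zeros(m, 2);
X(:,1) = 5*randn(m, 1) - 2.5;
X(:,2) = -1 + 1.5*X(:,1) + 5*randn(m, 1);

mu = mean(X);                             % mean of each feature (column)
D = X - mu;                               % subtract the mean from each column
C = (D' * D) / (m - 1);                   % covariance matrix

[V, E] = eig(C);
[evals, order] = sort(diag(E), 'descend');
W = V(:, order(1));                       % keep k = 1 principal direction

Y = D * W;                                % data in the PCA space
Xr = Y * W' + mu;                         % reconstruction with reduced dimension

disp(norm(C - cov(X)))                    % matches Matlab's cov function
disp([var(Y), evals(1)])                  % variance of Y is the largest eigenvalue
```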
attributes of items of the same type are similar and differ from items of other
types. In this example, we will look at a public domain dataset and produce a
scatter plot showing how groups of the data are clustered in the PCA space.
The public domain data is in a file named nndb_flat.csv, which is part
of the downloadable files from the book’s website. The file can also be
downloaded from the Data World website. The data comes from the USDA
National Nutrient Database and was made available by Craig Kelly. It lists many
different foods and identifies food groups, nutrition, and vitamin data for each
food. We will use the nutrition information in the analysis. Code listings 6.10
and 6.11 show the script for the PCA algorithm and plotting of the results.
The scatter plot in figure 6.21 shows the first two principal components for
8,618 foods in seven groupings. The plot shows clusters of the foods within
the groups and which food groups are nutritionally similar or different. Some
of the food groups that are contained in the data are not plotted.
To see a list of all food groups in the data set, type the following in the
command window after the FoodGroups variable is created by the script.
>> categories(FoodGroups)
FIGURE 6.21: Scatter plot of the first two principal components from nu-
trition data for 8,618 foods from seven food groups. The data clusters are
easier to see with color, so be sure to run the script.
% File: PCA_food.m
% PCA classification and visualization of nutrition in various
% food groups. Food types are in rows.
% Nutrition attributes (features) are in columns, which are:
% Energy_kcal, Protein_g, Fat_g, Carb_g, Sugar_g, Fiber_g
food = readtable('nndb_flat.csv');
X = food{:,8:13}; % nutrition columns
k = 2;
CODE 6.11: Part 2: Script for PCA classification of food groups by nutri-
tional content.
function eigdemo(A)
%% Eigenvector, eigenvalue Demo for input 2x2 matrix.
%
% This function shows an animation demonstrating that the product
% (A*x) is inline with the vector x when, and only when, x is an
% eigenvector. The A (input) matrix must be a 2x2 matrix with real
% eigenvalues.
%
% The red rotating line shows possible unit vectors, x.
% The blue moving line is v = A*x.
% The green fixed arrows are the products lambda*eigenvector (equal to A*ex).
%
% Notice that when the red rotating vector is in line with an
% eigenvector, the blue line matches the green arrows. The blue
% line is also (-1) of the green arrow when the red rotating line is
% (-1) of an eigenvector.
%
if size(A,1) ~= 2 || size(A,2) ~= 2
disp('Matrix A must be 2x2')
return;
end
theta = linspace(2*pi,0,100);
xv = cos(theta);
yv = sin(theta);
v = A*[xv; yv];
[ex, lambda] = eig(A);
if ~isreal(ex) || ~isreal(lambda)
disp('Complex Eigenvector results, use another A matrix');
return;
end
ep = A*ex; % same as ex*lambda
fprintf('eigenvector 1: %f %f\n', ex(1,1), ex(2,1));
fprintf('eigenvector 2: %f %f\n', ex(1,2), ex(2,2));
fprintf('A*eigenvector 1: %f %f\n', ep(1,1), ep(2,1));
fprintf('A*eigenvector 2: %f %f\n', ep(1,2), ep(2,2));
fprintf('lambda: %f %f\n', lambda(1,1), lambda(2,2));
M = ceil(max(vecnorm(v)));
6.11 Exercises
(b)
\[
S = \begin{bmatrix} 25 & 4 & -4 \\ 4 & -9 & 8 \\ -4 & 8 & 18 \end{bmatrix}
\]
(a) Using the change of basis method, find a general equation for the
difference equation, $u_k$.
(b) What is the steady state of the system?
(c) Make a plot of the state of the system from $u_0$ to $u_{20}$.
1. Residential.
2. Office.
3. Commercial.
4. Parking.
5. Vacant.
\[
M = \begin{bmatrix}
  0.85 & 0.05 & 0.05 & 0.05 & 0.10 \\
  0.05 & 0.55 & 0.20 & 0.10 & 0.10 \\
  0.05 & 0.15 & 0.65 & 0.25 & 0.20 \\
  0.05 & 0.20 & 0.05 & 0.55 & 0.20 \\
  0.05 & 0.05 & 0.05 & 0.05 & 0.40
\end{bmatrix}
\]
The columns represent the transition probabilities for the initial usage of
the land in 2015. The rows represent the probabilities for the land transitioning
to a particular usage in 2020. For example, land that was in residential use in
2015 (column 1) had a 0.85 probability of still being in residential use in 2020
(row 1) and a 0.05 probability of being vacant in 2020 (row 5).
Let the ratios of land use in 2015 be as follows.
\[
\text{start} = \begin{bmatrix} 0.25 \\ 0.3 \\ 0.2 \\ 0.1 \\ 0.15 \end{bmatrix}
\]
For convenience, the file LandUse.mat may be loaded to create the M matrix
and start vector.
We will assume that the conditions driving the changes are consistent. Use
Matlab to determine the following. See section 6.6.3 as needed.
(a) Calculate the vectors showing the expected land use in years 2025, 2030,
and 2035. Just find powers of M to determine this.
(b) Find the eigenvectors and eigenvalues of the Markov matrix.
(c) Calculate the change of basis coefficients needed to express $M^k\,\text{start}$ as
a sum of products of the eigenvectors and eigenvalues.
(d) Using the eigenvectors, eigenvalues, and coefficients calculate (matrix
multiplication) the expected land usage in year 2040 (5 sample periods
from 2015).
(e) Using the eigenvectors, eigenvalues, and coefficients find the steady state
land usage.
(a) Express the system of ODEs in matrix form. Use Matlab to find the
eigenvalues, eigenvectors, and the equation coefficients. What is the solution
to the system of ODEs?
(b) What is the steady state of the system?
2. Find the singular values of the following matrix. Find the condition
number of A from the ratio of the largest to smallest singular values
(σ1 /σ3 ). Verify your answer with the cond function.
\[
A = \begin{bmatrix} -8 & 13 & 3 \\ -1 & 11 & 5 \\ -7 & 14 & 4 \end{bmatrix}
\]
Refer to the example in section 6.9.2 as a guide to see how the PCA algorithm
is applied.
The data for this project is the Iris Plant Dataset, which is a quite old
dataset. It contains four attributes of three types of Iris flower plants. The data
was collected by R.A. Fisher for a paper published in 1936 [23]. Information
about the dataset is found on the UCI Machine Learning Repository [20]. The
dataset is in the file iris_data.txt, which is in the downloadable files from
the book’s website.
Use PCA to reduce the dimensionality of the data from four attributes to
two principal components. Make a scatter plot of the samples in two dimen-
sional PCA space.
Chapter 7
Computational Numerical Methods
7.1 Optimization
In this section we consider numerical algorithms that find specific points
that are either a root of a function ($f(x_r) = 0$) or the location of a minimum
value of the function ($x_m = \arg\min_x f(x)$). We are primarily concerned with
the simpler case where x is a scalar variable, but will use vector variables with some
algorithms.
[Plot of $f(x) = x^2 - K$ versus x, with points $x_a$ and $x_b$ on the x axis and the root at $x_r = \sqrt{K}$.]
function x = newtSqrt(K)
% NEWTSQRT - Square root using the Newton-Raphson method
% x = newtSqrt(K) returns the square root of K
% when K is a real number, K > 0.
if ~isreal(K) || K < 0
error('K must be a real, positive number.')
end
tol = 1e-7; % reduce tolerance for more accuracy
x = K; % starting point
while abs(x^2 - K) > tol
x = (x + K/x)/2;
end
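The update line inside the loop follows from the Newton-Raphson step applied to $f(x) = x^2 - K$. A short derivation, using the derivative $f'(x) = 2x$:

```latex
x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}
        = x_k - \frac{x_k^2 - K}{2 x_k}
        = \frac{1}{2} \left( x_k + \frac{K}{x_k} \right)
```

Each pass through the loop therefore replaces x with the average of x and K/x.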
When the function is a polynomial, such as $f(x) = x^2 - K$ used to find a square
root, one can also use the roots function described in section 6.3.2. For example, the
square root of 64 could be found as follows.
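A minimal sketch of such a call, assuming the polynomial x² − 64 is written as the coefficient vector `[1 0 -64]`:

```matlab
% Sketch: roots of x^2 + 0*x - 64 give the positive and negative square roots.
r = roots([1 0 -64]);
disp(r')
disp(max(r))                 % the positive root is the square root of 64
```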
points. As shown in figure 7.2, a new point is found that is halfway between
the points.
\[ c = \frac{a + b}{2} \]
If c is not within the allowed tolerance of the root, then one of the end points
is changed to c such that we have a new, smaller range spanning the x axis.
Code 7.3 lists a function implementing the bisection root finding algorithm.
[Plot of f(x) with the points a, c, and b marked on the x axis.]
FIGURE 7.2: The bisection algorithm cuts the range between a and b in
half until the root is found midway between a and b.
function x = bisectRoot(f, a, b)
% BISECTROOT find root of function f(x), where a <= x <= b.
% x = bisectRoot(f, a, b)
% f - a function handle
% sign(f(a)) not equal to sign(f(b))
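The bisection steps can be sketched as a short script. This is an illustration under stated assumptions rather than the body of bisectRoot: the test function, the starting range, and the tolerance `tol` are chosen only for the example.

```matlab
% Sketch: bisection search for a root of f between a and b.
f = @(x) x.^2 - 64;
a = 0;  b = 20;                   % f(a) and f(b) have opposite signs
tol = 1e-7;                       % assumed tolerance on abs(f(c))

c = (a + b)/2;
while abs(f(c)) > tol
    if sign(f(c)) == sign(f(a))
        a = c;                    % root lies between c and b
    else
        b = c;                    % root lies between a and c
    end
    c = (a + b)/2;                % midpoint of the smaller range
end
disp(c)
```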
[Plot of f(x) with the points a, c, and b marked on the x axis.]
FIGURE 7.3: The secant algorithm cuts the range between a and b by
finding a point c where a line between f (a) and f (b) crosses the x axis. The
range is repeatedly reduced until f (c) is within an accepted tolerance of zero.
function x = secantRoot(f, a, b)
% SECANTROOT find root of function f(x), where a <= x <= b.
% x = secantRoot(f, a, b)
% f - a function handle
% sign(f(a)) not equal to sign(f(b))
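The range-reducing secant search described in figure 7.3 can be sketched in the same way. This is an illustration rather than the body of secantRoot; the test function, starting range, and tolerance are assumptions. The only change from bisection is how the point c is chosen.

```matlab
% Sketch: secant-line search that keeps the root bracketed between a and b.
f = @(x) x.^2 - 64;
a = 0;  b = 20;                   % f(a) and f(b) have opposite signs
tol = 1e-7;                       % assumed tolerance on abs(f(c))

% c is where the line through (a, f(a)) and (b, f(b)) crosses the x axis
c = a - f(a)*(b - a)/(f(b) - f(a));
while abs(f(c)) > tol
    if sign(f(c)) == sign(f(a))
        a = c;                    % root lies between c and b
    else
        b = c;                    % root lies between a and c
    end
    c = a - f(a)*(b - a)/(f(b) - f(a));
end
disp(c)
```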
7.1.1.4 Fzero
Matlab has a function called fzero that uses numerical methods to find
a root of a function. It uses an algorithm that is a combination of the bisection
and secant methods.
The arguments passed to fzero are a function handle and a row vector
with the range of values to search. The function should be positive at one
of the vector values and negative at the other value. It may help to plot the
function to find good values to use. The example in figure 7.4 uses the fplot
function to display a plot of the values of a function over a range.
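A minimal sketch of a call to `fzero`, using a test function chosen for this illustration. Evaluating the function at the two ends of the range confirms the sign change before the search.

```matlab
% Sketch: find the root of cos(x) - x with fzero.
f = @(x) cos(x) - x;
disp([f(0), f(1)])                % opposite signs, so a root lies in [0, 1]
x = fzero(f, [0 1]);
disp(x)
disp(f(x))                        % near zero at the root
```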
FIGURE 7.4: The plot from fplot helps one select the search range to pass
to fzero.
7.1.2.1 Fminbnd
The Matlab function x = fminbnd(f, x1, x2) returns the location of the minimum
value of function f (x) in the range x1 < x < x2 . The algorithm is based
on golden section search and parabolic interpolation and is described in a
book co-authored by MathWorks’ chief mathematician and co-founder, Cleve
Moler [24]. We use fminbnd to find the minimum value of a function in figure 7.5.
FIGURE 7.5: The plot from fplot shows that the minimum value of the
function f (x) = x2 − sin(x) − 2 x is between 1 and 1.5, fminbnd then searched
the region to find the location of the minimum value.
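The search described in figure 7.5 can be sketched as follows. The second output of `fminbnd` is the function value at the minimum, and the derivative check on the last line is an addition made for this illustration.

```matlab
% Sketch: minimum of f(x) = x^2 - sin(x) - 2x between 1 and 1.5.
f = @(x) x.^2 - sin(x) - 2*x;
[xmin, fval] = fminbnd(f, 1, 1.5);
disp([xmin, fval])                % location and value of the minimum
disp(2*xmin - cos(xmin) - 2)      % derivative at xmin, near zero
```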
7.1.2.2 Fminsearch
The Matlab function fminsearch uses the Nelder-Mead simplex search
method [50] to find the minimum value of a function of one or more variables.
Unlike the other searching functions, fminsearch takes a starting point for
the search rather than a search range. The starting point may be a scalar
or an array (vector). If it is an array, then the function needs to use array
indexing to access the array elements. Best results are achieved when the
supplied starting point is close to the minimum location.
% File: fminsearch1.m
% fminsearch example
f = @(x) x(1).^2 + x(2).^2 + 10*cos(x(1).*x(2));
m = fminsearch(f, [1 1]);
disp(['min at: ', num2str(m)]);
disp(['min value: ', num2str(f(m))]);
% plot it now
g = @(x, y) x.^2 + y.^2 + 10*cos(x.*y);
[X, Y] = meshgrid(linspace(-5, 5, 50), linspace(-5, 5, 50));
Z = g(X, Y);
surf(X, Y, Z)
CODE 7.5: Script to find the minimum value and its location of a 3-
dimensional surface with fminsearch.
point to fminsearch such that the nearest set of correct joint angles puts the
robot in the desired configuration.
A simple example for a serial-link robot arm like the one shown in figure 5.17
on page 167 with two joints and two rigid arm segments (a and b) is listed
in code 7.6. The minimized function returns the length of the error vector
between the forward kinematic calculation and the desired location of the end
effector. The forward kinematics function uses functions from Peter Corke’s
Spatial Math Toolbox [16] to produce two dimensional homogeneous coordinate
rotation and translation matrices as used in section 5.3.4.
% File: fminsearch2.m
% Script using fminsearch for inverse kinematic
% calculation of a 2 joint robot arm.
E = [5.4 4.3]';
% named posError to avoid shadowing the built-in error function
posError = @(Q) norm(E - fwdKin(Q));
Qmin = fminsearch(posError, [0 0]);
disp(['Q = ', num2str(Qmin)]);
disp(['Position = ', num2str(fwdKin(Qmin)')]);
function P = fwdKin(Q)
a = 5;
b = 2;
PB = trot2(Q(1)) * transl2(a,0) * trot2(Q(2)) * transl2(b,0);
P = PB(1:2,3);
end
CODE 7.6: Script to find joint angles for a two joint robot arm. The inverse
kinematic calculation uses Matlab’s fminsearch function.
The output of the script is the joint angles for the robot and forward
kinematic calculation of the end effector position with those joint angles.
>> fminsearch2
Q = 0.56764 0.3695
Position = 5.4 4.3
problem. Most applications of CVX will include a call to either the minimize
or maximize function [29].
An example can best illustrate how to use CVX. Recall from section 5.8
that the preferred solution to an under-determined system of equations is
one that minimizes the length (norm) of x. Matlab’s left-divide operator finds the least squares
solution using the QR algorithm as discussed in section 5.11, which minimizes
the l2 -norm of x. The left-divide operator returns a solution with (n − r)
zeros by zeroing out columns of the matrix before finding the least squares
solution. But it is also known that a minimization of the l1 -norm tends to yield
a solution that is sparse (has several zeros). We turn to CVX to replace the
inherent l2 -norm that we get from our projection equations with an l1 -norm.
>> b
b =
     5
    20
    13
>> cvx_begin
>> variable x(5);
>> minimize( norm(x, 1) );
>> subject to
>> A*x == b;
>> cvx_end
>> x
x =
   -0.0000
    1.1384
    0.0000
    0.0103
    0.0260
>> A*x
ans =
    5.0000
   20.0000
   13.0000
Sparse Matrices
Very large but sparse matrices are common in some application domains.
Matlab supports a sparse storage class to efficiently hold large
matrices that have mostly zeros. A sparse matrix contains the indices and
values of only the nonzero elements. Most functions in Matlab work with
sparse matrices. The sparse command converts a standard matrix to a
sparse matrix. The full command restores a sparse matrix to a standard
matrix [55].
S = sparse(A);
...
A = full(S);
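As a small illustration of the conversion described above, the following sketch builds a mostly zero matrix, converts it to sparse storage, and converts it back. The matrix size and the values are arbitrary choices made for this example.

```matlab
% Sketch: convert between full and sparse storage (example values assumed).
A = zeros(1000);                 % 1000x1000 standard matrix of zeros
A(1,1) = 4; A(500,20) = -2; A(1000,1000) = 7;
S = sparse(A);                   % stores only the three nonzero entries
numNonzero = nnz(S);             % number of stored nonzero values
isSparse = issparse(S);          % true for the sparse storage class
y = S * ones(1000, 1);           % ordinary operators accept sparse inputs
B = full(S);                     % back to standard storage
sameValues = isequal(A, B);      % the round trip preserves the values
```

The whos command can be used to compare the memory used by A and S.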
Matlab also has functions that directly invoke the data interpolation algo-
rithms rather than invoking them via fillmissing or interp1. These func-
tions include pchip, spline, and makima.
The default interpolation algorithm is 'linear', which does not usually
provide smooth transitions between points but is good when the given data
changes slowly between sample points. The methods 'pchip', 'makima', and
'spline' yield good results. The 'spline' method requires the most computation
because it uses a matrix computation to determine
polynomial coefficients that not only match the given data, but maintain
continuous first and second order derivatives at each data point. However,
the 'spline' method may overshoot data points, causing an oscillation.
The results from the 'pchip' and 'makima' algorithms are similar. They
maintain continuous first order derivatives at each point but not continuous
second order derivatives. They avoid overshoots and accurately connect the
x = -3:3;
y = [-1 -1 -1 0 1 1 1];
t = -3:.01:3;
hold on
plot(x, y, 'o')
plot(t, interp1(x, y, t, 'pchip'), 'g')
plot(t, interp1(x, y, t, 'spline'), '-.r')
legend('data','pchip','spline', 'Location', 'NorthWest')
hold off
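The overshoot behavior can also be checked numerically rather than visually. The sketch below reuses the same data as the plot script and compares the largest interpolated value from each method with the largest data value, which is 1. The variable names are chosen for this example.

```matlab
% Sketch: compare the peak values of pchip and spline interpolation.
x = -3:3;
y = [-1 -1 -1 0 1 1 1];
t = -3:.01:3;
yPchip = interp1(x, y, t, 'pchip');
ySpline = interp1(x, y, t, 'spline');
pchipMax = max(yPchip);      % pchip is shape preserving, stays within the data
splineMax = max(ySpline);    % spline rises above the largest data value
fprintf('pchip max = %.4f, spline max = %.4f\n', pchipMax, splineMax);
```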
is indistinguishable from a surface plot where all of the points come from an
equation rather than having some interpolated points.
FIGURE 7.9: A surface plot where half of the data points are found from
linear interpolation.
$$\frac{dy(x)}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y(x)}{\Delta x}.$$
% File: deriv_cos.m
% Derivative example, Compute derivative of cos(x)
CODE 7.7: Script implementing the Euler derivative of the cosine function.
FIGURE 7.10: Even though the h value is fairly large, the Euler numerical
derivative of the cosine function is still close to the analytic derivative.
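The listing for code 7.7 is not reproduced here. As a stand-in, the following is a minimal sketch of a forward difference derivative of the cosine function. It is not the book's deriv_cos.m script, and the step size h = 0.1 is an assumption made for illustration.

```matlab
% Sketch: forward difference derivative of cos(x), h assumed to be 0.1.
h = 0.1;
x = 0:h:2*pi;
y = cos(x);
dydx = (y(2:end) - y(1:end-1))/h;   % (y(x + h) - y(x))/h at each sample
analytic = -sin(x(1:end-1));        % exact derivative at the same points
maxErr = max(abs(dydx - analytic)); % bounded by about h/2 for the cosine
```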
resources provide detailed descriptions of the Fourier transform and the FFT.
A rigorous, but still understandable coverage can be found in [7], pages 47 to
63.
The FFT is of interest to us here because of the simple relationship between
the Fourier transform of a function and the Fourier transform of the function’s
derivative. We can easily find the derivative in the frequency domain and then
use the inverse FFT to put the derivative into its original domain.
$$\mathcal{F}\left(\frac{d}{dt} f(t)\right) = i\,\omega\,\mathcal{F}(f(t))$$
Here, i is the imaginary constant $\sqrt{-1}$ and ω is the frequency in radians per
second for time domain data or just radians for spatial data. We can use
$\omega = 2\pi f_s k/n$ for time domain data and $\omega = 2\pi k/T$ for spatial domain data,
where k is the sample number in the range {−n/2 to (n/2 − 1)}, and n is the
number of data samples (usually a power of 2). For time domain data, $f_s$ is
the sampling frequency (samples/second). For spatial domain data, T is the
length of the data.
Code 7.8 gives an example of computing the spectral derivative using time
domain data. The FFT treats frequency data as having both positive and
negative values, so our ω needs to do the same. We can use our k variable
to build both the time, t, values and the frequency, w, values. Both the FFT
and inverse FFT algorithms reorder the data, so we need to use the fftshift
function to align the frequency values with the frequency domain data. Fig-
ure 7.11 shows a plot of the time domain spectral derivative, which aligns with
the analytic derivative.
%% SpectralTimeDerivative.m
% Numerical derivative using the spectral derivative
% (FFT based) algorithm -- time domain data
n = 256; fs = 8000; % samples/second
T = n/fs; % time span of samples (seconds)
k = -n/2:(n/2 - 1);
t = T*k/n; % time values
f1 = 200; f2 = 500; % Two frequency components (Hz)
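The listing above is cut short in this copy. The following self-contained sketch shows one way the steps described in the text could fit together. It is not the remainder of the book's script. The test signal and its two frequencies (250 Hz and 500 Hz, chosen so that the signal is periodic over the sample window) are assumptions made for this example.

```matlab
% Sketch: spectral (FFT based) derivative of time domain data.
n = 256; fs = 8000;                 % samples and sampling frequency
T = n/fs;                           % time span of the samples (seconds)
k = -n/2:(n/2 - 1);
t = T*k/n;                          % time values
w = 2*pi*fs*k/n;                    % frequencies (radians/second)
fa = 250; fb = 500;                 % assumed signal frequencies (Hz)
x = sin(2*pi*fa*t) + 0.5*cos(2*pi*fb*t);
X = fftshift(fft(x));               % align the spectrum with w
dX = 1i*w.*X;                       % derivative in the frequency domain
dx = real(ifft(ifftshift(dX)));     % back to the time domain
analytic = 2*pi*fa*cos(2*pi*fa*t) - 0.5*2*pi*fb*sin(2*pi*fb*t);
maxErr = max(abs(dx - analytic));
```

For a signal that is not periodic over the window, the match with the analytic derivative is only approximate.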
Figure 7.12 illustrates that the value of a definite integral is the area
bounded by the function, the x axis, and the lower and upper bounds of the
integral. Numerical techniques estimate the area by summing the areas of a
[Figure: the area under the curve y = f(x) from x = 1 to x = 4, Area = $\int_1^4 f(x)\,dx$.]
FIGURE 7.12: A definite integral is an area calculation.
[Figure: one trapezoid of width h between $x_i$ and $x_{i+1}$, with side heights $f(x_i)$ and $f(x_{i+1})$.]
As shown in figure 7.14, the area of each trapezoid is sometimes more, less, or
nearly the same as the true area of each subinterval. Using more subintervals,
which reduces the size of h, will improve the accuracy. With narrower subin-
tervals, the spanned region of f (x) from xi to xi+1 will more closely resemble
a straight line. Of course, we need to be sensitive to the size of h to avoid
excessive round-off errors that can occur when h is very small.
function I = trapIntegral(f, a, b, n)
% TRAPINTEGRAL - Definite integral by trapezoid area summation.
%
% f - vector aware function
% a - lower integration boundary
% b - upper integration boundary, a < b
% n - number of subinterval trapezoids
if a >= b
    error('Variable b should be greater than a.')
end
h = (b - a)/n;
x = linspace(a, b, n+1);
y = f(x);
I = (h/2)*(y(1) + 2*sum(y(2:end-1)) + y(end));
integration function shows that it finds an answer that is somewhat close with
a modest number of grid points, but struggles to find the true definite integral
value even as the number of grid points is increased.
>> fun = @(x) x.^2;
>> trapIntegral(fun, 0, 4, 10) % correct = 21.3333
ans =
21.4400
>> trapIntegral(fun, 0, 4, 50)
ans =
21.3376
>> trapIntegral(fun, 0, 4, 100)
ans =
21.3344
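The results above suggest how the trapezoid rule error shrinks with the step size. The sketch below uses Matlab's built-in trapz function in place of trapIntegral so that it is self-contained. For this integrand, each doubling of the number of subintervals reduces the error by a factor of about four, which is consistent with an error proportional to the square of h.

```matlab
% Sketch: trapezoid rule error as the number of subintervals doubles.
f = @(x) x.^2;
exact = 64/3;                         % analytic integral of x^2 on [0, 4]
nValues = [10 20 40 80];
err = zeros(size(nValues));
for j = 1:numel(nValues)
    x = linspace(0, 4, nValues(j) + 1);
    err(j) = abs(trapz(x, f(x)) - exact);
end
ratios = err(1:end-1) ./ err(2:end);  % each ratio is close to 4
```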
$$P(x) = \sum_{i=1}^{n} y_i\,P_i(x)$$
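[Figure: one integration region of width 2h spanning the grid points $x_{i-1}$, $x_i$, and $x_{i+1}$, which are spaced h apart.]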
and f (x). Using integration by substitution and some algebra, the definite
integral of each integration region is given by
$$I_i = \int_{x_{i-1}}^{x_{i+1}} P(x)\,dx = \frac{h}{3}\left(f(x_{i-1}) + 4\,f(x_i) + f(x_{i+1})\right).$$
FIGURE 7.16: Left: Simpson’s rule integration with 5 grid points, 4 subin-
tervals, and 2 integration regions. Right: Simpson’s rule integration with 7
grid points, 6 subintervals, and 3 integration regions.
Figure 7.16 shows that Simpson’s rule can find accurate definite integral
results if the subintervals are small enough to accurately model the function
with quadratic equations.
Note: Because Simpson’s rule computes integrals over two subintervals, the
number of subintervals must be an even number. So the number of
grid points must be an odd number. To satisfy the indexing require-
ments of the sum functions in the Matlab code, the minimum num-
ber of grid points is 5 (4-panel integration).
The simpsonIntegral function in code 7.10 implements Simpson’s rule
integration. Testing results show that Simpson’s rule integration is more ac-
curate than trapezoid rule integration even with fewer subintervals.
>> fun = @(x) x.^2
>> simpsonIntegral(fun, 0, 4, 8)
ans = % correct = 21.3333
21.3333
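Because each integration region is fit with a quadratic, Simpson's rule reproduces the integral of low order polynomials without error. The following sketch applies the same summation formula used by simpsonIntegral, written inline so that it runs on its own, to a cubic function with only four subintervals. The choice of test function is an assumption made for this example.

```matlab
% Sketch: composite Simpson's rule applied to x^3 on [0, 4].
f = @(x) x.^3;
a = 0; b = 4; n = 4;              % n must be even and >= 4
h = (b - a)/n;
x = linspace(a, b, n+1);
y = f(x);
I = (h/3)*(y(1) + 4*sum(y(2:2:end-1)) ...
    + 2*sum(y(3:2:end-2)) + y(end));
exact = 4^4/4;                    % analytic integral, 64
```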
function I = simpsonIntegral(f, a, b, n)
% SIMPSONINTEGRAL - Definite integral by Simpson's rule.
%
% f - vector aware function
% a - lower integration boundary
% b - upper integration boundary, a < b
% n - number of subintervals, must be an even number and >= 4
if a >= b
    error('Variable b should be greater than a.')
end
if mod(n, 2) ~= 0 || n < 4
    error('Variable n should be an even number, n >= 4.')
end
h = (b - a)/n;
x = linspace(a, b, n+1);
y = f(x);
I = (h/3)*(y(1) + 4*sum(y(2:2:end-1)) ...
+ 2*sum(y(3:2:end-2)) + y(end));
% MATLAB indexing starts at 1, so even and odd are reversed.
if a >= b
    error('Variable b should be greater than a.')
end
tol = 1e-12;
c = (a + b)/2;
% Three integrals
if intFun == 't'
    Iab = trapIntegral(f, a, b, 2);
    Iac = trapIntegral(f, a, c, 2);
    Icb = trapIntegral(f, c, b, 2);
elseif intFun == 's'
    Iab = simpsonIntegral(f, a, b, 4);
    Iac = simpsonIntegral(f, a, c, 4);
    Icb = simpsonIntegral(f, c, b, 4);
else
    error("Specify 's' or 't' for intFun")
end
some ODEs using analytic methods is less than satisfying [70]. So numerical
methods for solving ODEs are an important tool in the engineer’s
computational toolbox.
This is our third encounter with ODEs. In section 4.1, we used Matlab’s
Symbolic Math Toolbox to find the algebraic equations from the differential
equations associated with the effect of gravity on a ball thrown upwards. Then
in section 6.7 we considered systems of first order linear ODEs that may be
solved analytically with eigenvalues and eigenvectors. Now, we will consider
numerical methods to find and plot the solutions to ODEs. We will cover the
basic algorithmic concepts of how numerical ODE solvers work and review the
efficient suite of ODE solvers provided by Matlab.
Recall the definition of a derivative:
$$\frac{dy}{dt} = \lim_{h \to 0} \frac{y(t + h) - y(t)}{h}.$$
Considered in terms of discrete samples of y, the derivative becomes
$$y_k' = \frac{y_{k+1} - y_k}{h},$$
where h is the distance between sample points, $h = t_{k+1} - t_k$. Rearranged to
find $y_{k+1}$ from the previous value, $y_k$, and the slope of the function at $t_k$, we
have
$$y_{k+1} = y_k + h\,y_k'. \tag{7.1}$$
This equation is the basis for the simplest ODE solver, Euler’s method. How-
ever, due to concerns about excessive computation and round-off errors, we
are limited on how small h can be. So the accuracy of equation (7.1) is not
acceptable for most applications.
Since differential equations are given in terms of an equation for the derivative
of y, $y' = f(t, y)$, and the initial value, $y_0$, the general discrete solution
comes from the fundamental theorem of calculus, which applies the antiderivative
of a function to a definite integral.
$$y_{k+1} = y_k + \int_{t_k}^{t_{k+1}} f(t, y)\,dt \tag{7.2}$$
Then the numerical approximation comes from the algorithm used to cal-
culate the definite integral.
$$I_k = \int_{t_k}^{t_{k+1}} f(t, y)\,dt \tag{7.3}$$
At each step, the ODE solver uses a straight line with a slope of $I_k/h$ to
advance from $y_k$ to $y_{k+1}$.
the step size is small enough relative to the rate of change of the f (t, y)
function.
$$\int_{t_k}^{t_{k+1}} f(t, y)\,dt \approx h\,f(t_k, y_k)$$
$$y_{k+1} = y_k + h\,f(t_k, y_k)$$
$$y_{k+1} = y_k + h\,a.$$
Then by induction,
$$\begin{aligned}
y_1 &= y_0 + h\,a \\
y_2 &= y_1 + h\,a = y_0 + 2\,h\,a \\
&\;\;\vdots \\
y_k &= y_0 + k\,h\,a
\end{aligned}$$
The eulerODE function listed in code 7.12 is an implementation of Euler’s
method. The first three arguments passed to eulerODE are the same as those used
with the Matlab ODE solvers. An additional argument for the number of
values to find is also passed to eulerODE. First, define a function or function
handle. The function defines the derivative of the solution. It should take two
arguments, t and y, although in many cases the function only operates on
one variable, usually the second variable, y. The function should accept and
return vector variables.
$$\frac{dy(t)}{dt} = f(t, y)$$
Next, specify a tspan argument, which is a row vector of the initial and final
t values, [t_0 t_final] (see footnote 4). Then specify the value of y at the initial time,
$y_0 = y(t_0)$. The final argument is n, the number of values to find. The returned
data from all of the ODE solvers are the t evaluation points and the y solution.
>> f = @(t, y) 5
>> [t, y] = eulerODE(f, [0 10], 5, 4)
t =
0 3.3333 6.6667 10.0000
y =
5.0000 21.6667 38.3333 55.0000
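Code 7.12 is not reproduced in this copy. As an illustration of what the call above computes, the following inline sketch applies the Euler update with the same inputs. It is written as a script rather than a function and is an assumption about the structure, not the book's eulerODE listing.

```matlab
% Sketch: Euler's method for dy/dt = 5, y(0) = 5, with 4 values on [0, 10].
f = @(t, y) 5;
tspan = [0 10]; y0 = 5; n = 4;
t = linspace(tspan(1), tspan(2), n);   % n evaluation points
h = t(2) - t(1);                       % fixed step size
y = zeros(1, n);
y(1) = y0;
for k = 1:n-1
    y(k+1) = y(k) + h*f(t(k), y(k));   % y_{k+1} = y_k + h f(t_k, y_k)
end
```

Because the slope is constant, the result matches the exact solution y = 5 + 5t at every point.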
Figure 7.17 on page 343 shows a plot comparing the relative accuracy of
Euler’s method to other fixed step size algorithms.
4 We use t for time as a variable of the ODE function, but a spatial variable, such as x,
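$$\begin{aligned}
s_2 &= f\left(t_k + \tfrac{h}{2},\; y_k + \tfrac{h}{2}\,s_1\right) \\
s_3 &= f\left(t_k + \tfrac{h}{2},\; y_k + \tfrac{h}{2}\,s_2\right) \\
s_4 &= f\left(t_k + h,\; y_k + h\,s_3\right) \\
y_{k+1} &= y_k + \tfrac{h}{6}\left(s_1 + 2 s_2 + 2 s_3 + s_4\right)
\end{aligned}$$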
The rk4ODE function listed in code 7.14 is called with the same arguments
as the eulerODE and heunODE functions.
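Code 7.14 is not reproduced in this copy. The following inline sketch shows the classic Runge-Kutta update applied to a simple decaying exponential, taking the first slope to be the derivative at the start of the step, as in the classic method. The test equation and step size are assumptions made for this example, and the script is not the book's rk4ODE function.

```matlab
% Sketch: classic fourth order Runge-Kutta steps for y' = -0.5 y, y(0) = 1.
f = @(t, y) -0.5*y;
h = 0.5;                               % assumed fixed step size
t = 0:h:20;
y = zeros(size(t));
y(1) = 1;
for k = 1:numel(t)-1
    s1 = f(t(k), y(k));
    s2 = f(t(k) + h/2, y(k) + (h/2)*s1);
    s3 = f(t(k) + h/2, y(k) + (h/2)*s2);
    s4 = f(t(k) + h, y(k) + h*s3);
    y(k+1) = y(k) + (h/6)*(s1 + 2*s2 + 2*s3 + s4);
end
maxErr = max(abs(y - exp(-0.5*t)));    % compare with the exact solution
```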
Note: All of the algorithms shown in figure 7.17 have good performance if
h is significantly reduced. Finding noticeable differences in the plot
is only achieved by using a larger step size.
$$y_k = (1 - h\,a)^k\,y_0. \tag{7.6}$$
$$\begin{aligned}
|1 - h\,a| &< 1 \\
-1 < (1 - h\,a) &< 1 \\
-2 < -h\,a &< 0 \\
h &< \frac{2}{a}
\end{aligned}$$
In the following example 2/a = 2/0.5 = 4, but h = 5 (see footnote 6), so the results are
unstable.
>> fexp = @(t, y) -0.5*y;
>> [t, y_unstable] = eulerODE(fexp, [0 20], 1, 5)
t =
0 5 10 15 20
y_unstable =
1.0000 -1.5000 2.2500 -3.3750 5.0625
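The stability limit can be explored by repeating the calculation with a step size on each side of 2/a. The sketch below applies the Euler update inline with h = 5 and with h = 1. The second step size is an assumption chosen to satisfy the constraint.

```matlab
% Sketch: explicit Euler on y' = -0.5 y with an unstable and a stable step.
a = 0.5; y0 = 1; tEnd = 20;
hValues = [5 1];                  % 5 violates h < 2/a = 4, 1 satisfies it
yFinal = zeros(size(hValues));
for j = 1:numel(hValues)
    h = hValues(j);
    y = y0;
    for k = 1:round(tEnd/h)
        y = y + h*(-a*y);         % same as y*(1 - h*a)
    end
    yFinal(j) = y;
end
growth = abs(1 - hValues*a);      % per step factor, 1.5 and 0.5
```

A per step factor larger than one means the magnitude grows with each step, which is the growth seen in y_unstable.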
The stability constraint for Euler’s method is sufficient for other fixed step
size algorithms.
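$$\begin{aligned}
y_k &= y_{k+1} + h\,a\,y_{k+1} \\
y_k &= (1 + h\,a)\,y_{k+1} \\
y_{k+1} &= \frac{1}{1 + h\,a}\,y_k \\
y_{k+1} &= \left(\frac{1}{1 + h\,a}\right)^{k+1} y_0
\end{aligned} \tag{7.8}$$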
which always holds because both h and a are positive numbers. So the implicit
backward Euler method is unconditionally stable.
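To see the unconditional stability in numbers, the sketch below applies the implicit backward Euler update to the same decaying exponential with the step size h = 5 that was unstable for the explicit method. The script form and variable names are choices made for this example.

```matlab
% Sketch: implicit backward Euler on y' = -0.5 y with h = 5.
a = 0.5; h = 5; n = 5;
t = 0:h:(n-1)*h;
y = zeros(1, n);
y(1) = 1;
for k = 1:n-1
    y(k+1) = y(k)/(1 + h*a);      % y_{k+1} = y_k/(1 + h a)
end
```

The values decay toward zero without changing sign, although with such a large step they are not an accurate model of the exact solution.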
One of two strategies can be used to determine the step size adjustments.
As was done with adaptive numerical methods for computing integrals in sec-
tion 7.4.3, the solution from two half steps could be compared to the solution
from a full step. Based on the comparison, the step size can be reduced, left
the same, or increased. However, points of an ODE solution are found se-
quentially, so the strategy of splitting a wide region into successively smaller
regions would be computationally slow. Instead, the current step size is used
to find the next solution (yk+1 ) using two algorithms where one algorithm is
regarded as more accurate than the other. The guiding principle is that if the
solutions computed by the two algorithms are close to the same, then the step
size is small enough or could even be increased. But if the difference between
the solutions is larger than a threshold, then the step size should be reduced.
If the solutions differ significantly, then the current evaluations are recalculated
with a smaller step size. The algorithm advances to the next evaluation
step only when the difference between the solutions is less than a specified
tolerance.
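$$z_{k+1} = y_k + \frac{25}{216}s_1 + \frac{1408}{2565}s_3 + \frac{2197}{4104}s_4 - \frac{1}{5}s_5$$
$$y_{k+1} = y_k + \frac{16}{135}s_1 + \frac{6656}{12825}s_3 + \frac{28561}{56430}s_4 - \frac{9}{50}s_5 + \frac{2}{55}s_6$$
$$\begin{aligned}
s_1 &= h\,f(t_k,\; y_k) \\
s_2 &= h\,f\left(t_k + \tfrac{1}{4}h,\; y_k + \tfrac{1}{4}s_1\right) \\
s_3 &= h\,f\left(t_k + \tfrac{3}{8}h,\; y_k + \tfrac{3}{32}s_1 + \tfrac{9}{32}s_2\right) \\
s_4 &= h\,f\left(t_k + \tfrac{12}{13}h,\; y_k + \tfrac{1932}{2197}s_1 - \tfrac{7200}{2197}s_2 + \tfrac{7296}{2197}s_3\right) \\
s_5 &= h\,f\left(t_k + h,\; y_k + \tfrac{439}{216}s_1 - 8 s_2 + \tfrac{3680}{513}s_3 - \tfrac{845}{4104}s_4\right) \\
s_6 &= h\,f\left(t_k + \tfrac{1}{2}h,\; y_k - \tfrac{8}{27}s_1 + 2 s_2 - \tfrac{3544}{2565}s_3 + \tfrac{1859}{4104}s_4 - \tfrac{11}{40}s_5\right)
\end{aligned}$$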
Some adaptive ODE algorithms make coarse changes to the step size such
as either cutting it in half or doubling it. The rkf45ODE function first finds a
scalar value, s, from a tolerance value and the difference between the solutions
from the two algorithms. In the implementation of rkf45ODE in code 7.17, the
new step size comes from multiplying the current step size by the scalar.
The scalar will be one if the difference between the solutions is half of the
tolerance. The algorithm advances to the next step if the difference between
the two solutions is less than the tolerance value, which occurs when the scalar
is greater than 0.84. Otherwise, the current evaluation is recalculated with a
smaller step size. The multiplier is restricted to not be more than two. So the
step size will never increase to more than twice its current size. The tolerance
and the threshold for advancing to the next step are tunable parameters.
$$s = \left(\frac{\text{tolerance}}{2\,\lVert z_{k+1} - y_{k+1}\rVert}\right)^{1/4}$$
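The behavior of the step size scalar can be checked with a few sample differences. The sketch below evaluates the formula for differences that are a quarter of, half of, equal to, and four times the tolerance, and then applies the limit of two. The tolerance value and the sample differences are assumptions made for illustration.

```matlab
% Sketch: step size scalar for several differences between the two solutions.
tol = 1e-6;                           % assumed tolerance
diffNorm = tol * [0.25 0.5 1 4];      % sample values of norm(z - y)
s = (tol ./ (2*diffNorm)).^(1/4);     % about 1.19, 1, 0.84, 0.59
s = min(s, 2);                        % multiplier restricted to at most two
accept = diffNorm < tol;              % advance only if below the tolerance
```

A difference equal to the tolerance gives a scalar of about 0.84, which is the threshold mentioned in the text.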
The initial value (y0 ) passed to the rkf45ODE function can be either a scalar
for a single equation ODE or a column vector for a system of ODEs. Since
the number of steps to be taken is not known in advance, the initial step size
is a function argument.
In the example plotted in Figure 7.18 the adaptive Runge-Kutta-Fehlberg
(RKF45) method is more accurate than the classic Runge-Kutta method
(RK4) with the initial step size of RKF45 being the same as the fixed step
size used by RK4. For most ODEs, the ability of RKF45 to reduce the step
size where needed likely contributes more to its improved accuracy than can
be attributed to using a higher order equation.
FIGURE 7.18: The adjustable step size of the RKF45 algorithm gives ac-
curate results with good performance.
The Matlab ODE solvers have several additional options and tunable pa-
rameters. Rather than passing those individually to the solver, a data structure
holding the options is passed as an optional argument. The data structure is
created, and modified using the odeset command with name–value pairs. Re-
fer to the Matlab documentation for odeset for a full list of the options and
their descriptions.
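As a brief illustration, the following sketch creates an options structure with tighter error tolerances and passes it to ode45. The test equation and the tolerance values are assumptions made for this example. RelTol and AbsTol are two of the options listed in the odeset documentation.

```matlab
% Sketch: pass solver options to ode45 with odeset.
f = @(t, y) -0.5*y;                            % example ODE
opts = odeset('RelTol', 1e-6, 'AbsTol', 1e-8); % tighter than the defaults
[t, y] = ode45(f, [0 10], 1, opts);
finalErr = abs(y(end) - exp(-5));              % compare with exact solution
```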
$$\frac{dy}{dt} = a\,y - b\,y^2.$$
The solution is modeled with positive and negative values for t. The analytic
solution is ([70], page 55):
$$y(t) = \frac{a}{d\,e^{-a t} + b},$$
where $d = \frac{a}{y(0)} - b$.
In the distant past ($t = -\infty$), the population size is modeled as zero. The
steady state ($t = \infty$) population size is $y(t) = K = a/b$, which is the sustainable
population size.
The halfway point can be modeled as $y(0) = a/(2b)$.
In code 7.15, we will compute the positive half of the solution and then
reflect the values to get the negative time values. The solution plot is shown
in figure 7.19.
% File: logistic_diffEq.m
% Logistic Equation: a = 1, b = 1
f = @(t, y) y - y.^2;
tspan = [0 5];
y0 = 1/2;
[t, y] = ode45(f, tspan, y0);
figure, plot(t,y)
t1 = flip(-t);
y1 = flip(1-y);
t2 = [t1;t];
y2 = [y1;y];
analytic = 1./(1 + exp(-t2));
figure, hold on
plot(t2, y2, 'o')
plot(t2, analytic)
hold off, title('Logistic Equation')
xlabel('t'), ylabel('y')
legend('ode45 Numeric Solution','Analytic Solution', ...
    'Location', 'northwest');
The eigenvectors, eigenvalues, and c coefficients are found with some help from
Matlab.
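$$x_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad
x_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
\begin{aligned} \lambda_1 &= -49 \\ \lambda_2 &= -1 \end{aligned} \quad
\begin{aligned} c_1 &= -\sqrt{2}\cdot 0.5 \\ c_2 &= \sqrt{2}\cdot 1.5 \end{aligned}$$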
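The values above can be reproduced with the eig function. The sketch below uses the stiff system's matrix A = [-25 24; 24 -25] and initial value u0 = [1; 2], finds the coefficients from the eigenvectors, and evaluates the solution at one time value. The time t = 0.1 is an arbitrary choice for this example. Note that eig may return eigenvectors with the opposite sign, which changes the sign of the matching coefficient but not the solution.

```matlab
% Sketch: eigenvalues, eigenvectors, and coefficients for the stiff system.
A = [-25 24; 24 -25];
u0 = [1; 2];
[X, L] = eig(A);                  % columns of X are the eigenvectors
lambda = diag(L);                 % eigenvalues -49 and -1
c = X \ u0;                       % coefficients such that X*c = u0
t = 0.1;
u = X * (c .* exp(lambda*t));     % sum of c_i exp(lambda_i t) x_i
uAnalytic = [-0.5*exp(-49*t) + 1.5*exp(-t);
              0.5*exp(-49*t) + 1.5*exp(-t)];
```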
The solution to the system of ODEs then comes from the general solution for
systems of first order, linear ODEs as given by equation (6.7) in section 6.7.
The solution is $u(t) = e^{A t}\,u_0$. Refer to section 6.7 and appendix B.2 for more
information about how we get from a solution with a matrix in the exponent
to equation (7.9).
Notice that we do not have a negative sign in front of the A matrix as
we previously saw with a in the scalar ODE from section 7.5.5. But it is a
decaying exponential that goes to zero when t goes to infinity because the
eigenvalues of A are negative.
We get the starting equation for the numerical solution by applying
equation (7.7).
$$u_k = u_{k+1} - h\,A\,u_{k+1} = (I - h\,A)\,u_{k+1}$$
$$u_{k+1} = (I - h\,A)^{-1}\,u_k \tag{7.10}$$
Equation (7.10) is an implicit equation for solving systems of ODEs numeri-
cally, and is implemented in the IBEsysODE function listed in code 7.16. No-
tice that the implicit approach requires solving an equation at every step. In
this case, the equation is a linear system of equations with a fixed matrix,
so the IBEsysODE function uses LU decomposition once and calls on Mat-
lab’s triangular solver via the left-divide operator to quickly find a solution
in each loop iteration. If this were a nonlinear problem, then numerical meth-
ods such as Newton’s method or the secant method described in section 7.1.1
would be needed, which could add significant computation. For this reason,
implicit methods are not recommended for problems that are either not stiff or
nonlinear [65].
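Code 7.16 is not reproduced in this copy. The following inline sketch shows one way the steps described above could be arranged: factor the fixed matrix once with lu, then solve two triangular systems with the left-divide operator in each iteration. It is an assumption about the structure and is not the book's IBEsysODE function.

```matlab
% Sketch: implicit backward Euler for u' = A u with one LU factorization.
A = [-25 24; 24 -25];
u0 = [1; 2];
n = 50;                               % number of values to find
t = linspace(0, 1, n);
h = t(2) - t(1);                      % fixed step size, about 0.02
[L, U, P] = lu(eye(2) - h*A);         % P*(I - h*A) = L*U, computed once
u = zeros(2, n);
u(:,1) = u0;
for k = 1:n-1
    u(:,k+1) = U \ (L \ (P*u(:,k)));  % solve (I - h*A) u_{k+1} = u_k
end
exact = [-0.5*exp(-49) + 1.5*exp(-1); 0.5*exp(-49) + 1.5*exp(-1)];
finalErr = norm(u(:,end) - exact);    % first order method, modest accuracy
```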
Figure 7.22 shows the numerical solution that the IBEsysODE function
found to our stiff system of ODEs. Since this is a fixed step size solution, the
step size was decreased to 0.02 to track the quickly changing $e^{-49t}$ terms in
the solution. Of course, the small step is not needed once the $e^{-49t}$ term is
sufficiently close to zero. Remember that there is a difference between accuracy
and stability. A lack of stability causes the solution to be wrong as k → ∞.
A lack of accuracy just means that the exact solution is not well modeled by
the numerical solution, which is often solved by reducing the step size.
>> A = [-25 24; 24 -25];
>> u0 = [1;2];
>> [t, u] = IBEsysODE(...
A, [0, 1], u0, 50);
>> x=@(t)-0.5*exp(-49*t) ...
+ 1.5*exp(-t);
>> y=@(t)0.5*exp(-49*t) ...
+ 1.5*exp(-t);
>> figure, hold on
>> plot(t, u(1,:), 'o')
>> plot(t, u(2,:), '+')
>> plot(t, x(t), ':')
>> plot(t, y(t), '--')
>> hold off
FIGURE 7.22: With enough data points, the IBEsysODE function is able to
track the rapidly changing $e^{-49t}$ terms of the solution reasonably well.
Since our A matrix is symmetric, we can use the property of $\lVert\cdot\rVert_2$ matrix
norms that $\lVert A\rVert_2 = \max_i |\lambda_i(A)|$, where $\lambda_i(A)$ means the ith eigenvalue of A
(appendix A.1.3).
Now we can make use of the eigenvalue property for symmetric matrices that
$\lambda_i(A^{-1}) = \frac{1}{\lambda_i(A)}$
(equation (6.1) in section 6.4).
$$\max_i \frac{1}{\lambda_i(I - h\,A)} < 1$$
Equation (7.11) holds for any symmetric A matrix with negative eigenvalues.
So the implicit backward Euler method is unconditionally stable for systems
of ODEs.
Figure 7.23 shows how ode15s handles the stiff system of ODEs used in
the previous examples.
FIGURE 7.23: The ode15s ODE solver is designed for stiff systems, so the
less significant, but rapidly changing terms do not significantly reduce the step
size.
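A comparison along these lines can be tried with the sketch below, which solves the same stiff system with ode15s and with ode45 using default options and reports how many evaluation points each solver returned. The comparison with ode45 is an addition made for this example. The point counts depend on the Matlab release, so none are quoted here.

```matlab
% Sketch: solve the stiff system with ode15s and compare with ode45.
A = [-25 24; 24 -25];
u0 = [1; 2];
odefun = @(t, u) A*u;
[t15, u15] = ode15s(odefun, [0 1], u0);
[t45, u45] = ode45(odefun, [0 1], u0);
exact = [-0.5*exp(-49) + 1.5*exp(-1); 0.5*exp(-49) + 1.5*exp(-1)];
err15 = norm(u15(end,:)' - exact);    % solution rows correspond to times
fprintf('ode15s: %d points, ode45: %d points\n', numel(t15), numel(t45));
```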
Online Resources
• The MathWorks website [a] has some good tutorial videos about
numerical solutions to differential equations. Although the ODE
solvers in Matlab are more complex than the examples in the first
three videos, the videos explain the basic concepts.
• Matlab’s ODE solvers are reviewed on the MathWorks website [b].
[a] https://www.mathworks.com/videos/series/solving-odes-in-matlab-117658.html
[b] https://www.mathworks.com/help/matlab/math/choose-an-ode-solver.html
7.6 Exercises
A.1 Norms
Matlab includes a function called norm for the purpose of finding the length
of vectors or matrices. The most frequent usage is to find the Euclidean length
of a vector, which is called an l2-norm. It comes directly from the Pythagorean
theorem: the square root of the sum of the squares. The length of a vector is
a familiar concept, but the length of a matrix feels somewhat mysterious. As
always, the norm function is well documented in the Matlab documentation.
However, there are different measures of length and properties of norms that
should be reviewed. Moreover, the names, symbols, and application of the
various norms could use some clarification.
In technical literature, the most common mathematics symbol for a norm
is a pair of double bars around the variable name with a subscript for the
type of norm, ∥v∥2 . If the subscript is left off, then it is assumed to be 2.
The generic name given to the type of norm for vectors is the italics letter l
with a subscript of the type, l2 . You may sometimes see the type given as a
super-script instead of a subscript.
Cardinality
SYMBOLS ∥v∥0 , ∥A∥0 , l0
DESCRIPTION The l0 norm is the number of nonzero elements
of either a vector or a matrix. The l0 norm does not fit the
properties that are normally expected of a norm, so it is not
always classified as being a norm. It has utility for sparse (a
lot of zeros) vectors and matrices.
MATLAB EXAMPLE
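The original example is not reproduced in this copy. As a stand-in, the sketch below counts nonzero elements with the nnz function, which is one way to find the cardinality. The vector and matrix values are assumptions made for illustration.

```matlab
% Sketch: cardinality (l0) of a vector and of a matrix using nnz.
v = [0 3 0 -2 0];
A = [1 0 0; 0 0 5; 0 0 0];
v_l0 = nnz(v);        % number of nonzero elements of v
A_l0 = nnz(A);        % number of nonzero elements of A
```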
Figure A.1 shows plots of the x and y values that satisfy ∥[x y]∥p = 1 for
various values of p. Keep these plots in mind as you read the description of
the l1 , l2 , and l∞ norms. We see straight lines in the p = 1 plot because the
value of the norm is the sum of the absolute values of the elements. The plot
of a circle for p = 2 relates to the l2 norm being the length of a vector by the
Pythagorean theorem. As the p values get larger, the plots approach the shape
of a square, which models the l∞ norm where the norm takes the value of the
largest absolute value of the elements.
Euclidean Norm
SYMBOLS ∥v∥, ∥v∥2 , l2
DESCRIPTION The l2 norm of a vector is by far the most fre-
quently used norm calculation. It finds the length of a vector
by the same means that the Pythagorean theorem finds the
length of the hypotenuse of a right triangle.
$$\lVert v\rVert_2 = \sqrt{\sum_{k=1}^{N} v_k^2} = \sqrt{v_1^2 + v_2^2 + \cdots + v_N^2}$$
MATLAB EXAMPLE
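The original example is not reproduced in this copy. The sketch below computes the Euclidean norm three equivalent ways for an example vector chosen for illustration.

```matlab
% Sketch: l2 norm of a row vector computed three ways.
v = [3 4 12];
n1 = norm(v);              % default norm is the l2 norm
n2 = sqrt(sum(v.^2));      % square root of the sum of the squares
n3 = sqrt(v*v');           % square root of the inner product
```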
Infinity Norm
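SYMBOLS $\lVert v\rVert_{\infty}$, $\lVert v\rVert_{-\infty}$, $l_{\infty}$, $l_{-\infty}$
DESCRIPTION The $l_{\infty}$ norm is the maximum absolute value of the
vector elements, and the $l_{-\infty}$ norm is the minimum absolute value.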
MATLAB EXAMPLE
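The original example is not reproduced in this copy. The sketch below evaluates both infinity norms with the norm function and with the equivalent max and min expressions. The vector values are chosen for illustration.

```matlab
% Sketch: infinity norms of a vector.
v = [3 -7 2];
nInf = norm(v, Inf);       % largest absolute value
nNegInf = norm(v, -Inf);   % smallest absolute value
sameMax = nInf == max(abs(v));
sameMin = nNegInf == min(abs(v));
```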
2-Norm of a Matrix
SYMBOLS ∥A∥2 , ∥·∥2
DESCRIPTION The focus of the ∥·∥2 matrix norm is on the
ability of a matrix to stretch a vector rather than on
the values of the elements in the matrix. The calculation is
the maximum ratio of l2 vector norms.
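$$\lVert A\rVert_2 = \max_{v \neq 0} \frac{\lVert A\,v\rVert}{\lVert v\rVert}$$
$$\lVert A\rVert_2 = \frac{\lVert A\,v_1\rVert}{\lVert v_1\rVert} = \frac{\sigma_1\,\lVert u_1\rVert}{\lVert v_1\rVert} = \sigma_1$$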
The final simplification stems from the fact that both u and
v are unit vectors. Thus, the ∥·∥2 matrix norm is the largest
(first) singular value from the SVD.
We learned in section 6.8 that the singular values are the
square root of the eigenvalues of AT A. If we use the notation
MATLAB EXAMPLE
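The original example is not reproduced in this copy. The sketch below uses the same matrix as the other examples in this appendix and compares the result of the norm function with the largest singular value and with the square root of the largest eigenvalue of the product of the transpose of A with A.

```matlab
% Sketch: 2-norm of a matrix from norm, svd, and eig.
A = [-1 7 -5; 4 -3 -5; -5 0 7];
n2 = norm(A);                     % default matrix norm is the 2-norm
s = svd(A);                       % singular values, largest first
n2svd = s(1);
n2eig = sqrt(max(eig(A'*A)));     % from the eigenvalues of A'*A
```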
Frobenius Norm
SYMBOLS ∥A∥F , ∥·∥F
DESCRIPTION The Frobenius norm is similar to the l2 vector
norm in that it is the square root of the sum of the squared
elements. It can also be found from the singular values.
$$\lVert A\rVert_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2} = \sqrt{\operatorname{trace}(A^T A)}$$
$$\lVert A\rVert_F = \sqrt{\sum_{i=1}^{\min(m,n)} \sigma_i^2}$$
MATLAB EXAMPLE
>> A
A =
-1 7 -5
4 -3 -5
-5 0 7
>> A_f = norm(A, 'fro')
A_f =
14.1067
>> A_f = sqrt(sum(A(:).^2))
A_f =
14.1067
>> A_f = sqrt(trace(A'*A))
A_f =
14.1067
>> A_f = sqrt(sum(svd(A).^2))
A_f =
14.1067
>> A
A =
-1 7 -5
4 -3 -5
-5 0 7
>> norm(A, Inf)
ans =
13
>> max(sum(abs(A')))
ans =
13
Nuclear Norm
SYMBOLS ∥A∥N , ∥·∥N
DESCRIPTION The nuclear norm is the sum of the singular
values from the SVD, which is related to the rank because
most higher order singular values of matrices holding data,
such as an image, are close to zero. It has been shown to give
MATLAB EXAMPLE
>> A
A =
-1 7 -5
4 -3 -5
-5 0 7
>> A_nuc_norm = sum(svd(A))
A_nuc_norm =
20.4799
Span
The set of all linear combinations of a collection of vectors is called the
span of the vectors. If a vector can be expressed as a linear combination
of a set of vectors, then the vector is in the span of the set.
Basis
The smallest set of vectors needed to span a vector space forms a basis for
that vector space. The vectors in the vector space are linear combinations
of the basis vectors. For example, the basis vectors used for a Cartesian
coordinate frame in the vector space R3 are:
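$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$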
Many different basis vectors could be used as long as they are a linearly
independent set of vectors, but we have a preference for unit length or-
thogonal vectors. In addition to the Cartesian coordinates, other sets of
basis vectors are sometimes used. For example, some robotics applications
may use basis vectors that span a plane corresponding to a work piece.
Other applications, such as difference equations, use the set of eigenvectors
of a matrix as the basis vectors.
Dimension
The number of vectors in the basis gives the dimension of the vector space.
For the Cartesian basis vectors, R3 and R2 , this is also the number of
elements in the vectors. However, other vector spaces may have a smaller
dimension. For example, the subspace where the z axis is the same as
the x axis (plotted in figure A.2) forms a plane with dimension 2. The
orthonormal basis vectors for this subspace might be
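$$\{v_1, v_2\} = \left\{ \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}.$$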
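The properties of this basis can be checked numerically. The sketch below places the two basis vectors in the columns of a matrix, confirms that they are orthonormal, and builds a vector in the subspace from an arbitrary pair of coefficients, which are assumptions made for this example.

```matlab
% Sketch: check the orthonormal basis of the plane where z equals x.
V = [1/sqrt(2) 0; 0 1; 1/sqrt(2) 0];  % basis vectors as columns
G = V'*V;              % identity matrix when the columns are orthonormal
d = rank(V);           % dimension of the subspace
p = V*[3; -1];         % a linear combination of the basis vectors
```

The first and third elements of p are equal, so p lies in the plane where the z axis is the same as the x axis.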
This means that no vector in the set is a linear combination of other vectors
in the set. If one or more vectors in the set can be found by a linear combination
of other vectors in the set, then the vectors are not linearly independent and
there are nonzero coefficients, ci , that will satisfy the equation c1 u1 + c2 u2 +
. . . + cn un = 0.
For example, the basis vectors used for a Cartesian coordinate frame in
the vector space R3 are linearly independent.
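$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$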
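Whereas, the following set of vectors is not linearly independent because the
last vector is the sum of the first two vectors.
$$\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$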
Some phrases that are used to describe a vector that is linearly dependent
on a set of vectors are “in the span of” or “in the space”. For example, when
we evaluate the columns of a matrix, A, with regard to another vector, b, we
might say that vector b is in the column space of A, which means that b is a
linear combination of the vectors that define the columns of matrix A.
A.2.3 Rank
The rank function can be used to test the linear independence of vectors
that make up the columns of a matrix.
The rank of a matrix can be determined by two different methods. The
rank of a matrix is the number of nonzero pivots in its row-echelon form,
which is achieved by Gaussian elimination. Here is an example illustrating
how elimination on a singular matrix results in a pivot value being equal to
zero. Because of dependent relationships between the rows and columns, the
row operations to change elements below the diagonal to zeros also move the
later pivots to zero. In this example, the third column is the sum of the first
column and two times the second column.
3 2 7
0 3 6
1 0 1
1 0 -2
0 1 -1
0 0 0
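The rank test can be tried on the matrix with rows [3 2 7], [0 3 6], and [1 0 1], whose third column is the sum of the first column and two times the second column. The sketch below is an illustration added here and is not the elimination sequence from the original text.

```matlab
% Sketch: rank and RREF of a singular matrix with a dependent column.
A = [3 2 7; 0 3 6; 1 0 1];
r = rank(A);           % two independent columns
R = rref(A);           % the zero row shows the missing pivot
d = det(A);            % zero, to within round-off, for a singular matrix
```

The third column of R holds the coefficients 1 and 2 that express the third column of A in terms of the first two columns.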
Notice from the output of rref that the first two columns are pivot
columns. Thus we can see, as was designed from constructing the singular
matrix, that the first two columns span the column space.
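$$\mathrm{Col}(A) = \mathrm{Span}\left\{ \begin{bmatrix} -1 \\ 3 \\ 6 \end{bmatrix}, \; \begin{bmatrix} 4 \\ 0 \\ 2 \end{bmatrix} \right\}$$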
Then from the SVD, we get a column space from the first two columns of U,
which are orthogonal vectors.
6 -2 20
>> rank(A)
ans =
2
>> rref(A)
ans =
1 0 4
0 1 2
0 0 0
The third column from the RREF shows the dependent relationship of the
columns of A. The third column is four of the first column plus two of the
second column.
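$$4\,a_1 + 2\,a_2 - a_3 = 0$$
$$\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$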
>> A*x
ans =
0
0
0
>> x/norm(x)
ans =
0.8729
0.4364
-0.2182
>> null(A)
ans =
0.8729
0.4364
-0.2182
V =
-0.1583 0.4616 0.8729
-0.1697 -0.8836 0.4364
-0.9727 0.0791 -0.2182
>> rref(sym(A))
ans =
[ 1, 0, 0, 10/3, 50/3]
[ 0, 1, 0, -11/3, -61/3]
[ 0, 0, 1, 4/3, 41/3]
Each row from the RREF can be used as an equation of the null vector. The
equations can either leave the terms to the left of the equal sign with zeros on
the right, or for under-determined equations they can be expressed in para-
metric form as follows. The difference in the null solution is a multiplication
by −1, which is also a correct solution.
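$$\begin{aligned}
x_1 &= -\tfrac{10}{3}\,x_4 - \tfrac{50}{3}\,x_5 \\
x_2 &= \tfrac{11}{3}\,x_4 + \tfrac{61}{3}\,x_5 \\
x_3 &= -\tfrac{4}{3}\,x_4 - \tfrac{41}{3}\,x_5 \\
x_4 &= x_4 \\
x_5 &= x_5
\end{aligned}$$
$$\mathrm{Null}(A) = \mathrm{span}\left\{ \begin{bmatrix} -10/3 \\ 11/3 \\ -4/3 \\ 1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} -50/3 \\ 61/3 \\ -41/3 \\ 0 \\ 1 \end{bmatrix} \right\}$$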
Notice that both basis vectors of the null space satisfy Ax = 0. They span
the null vector space.
0
0
>> A*x2
ans =
0
0
0
The normalized null space vectors found from the SVD (same result as the
null command) are the last two columns of V. Remember from section 5.8
that under-determined systems have an infinite set of solutions. The RREF ex-
presses the solution in terms of the general and particular solutions, while the
null space from the SVD is the least squares solution. Although the solutions
are different, they are both correct.
>> A
A =
-4 2 -3 4 0
1 8 5 3 2
1 6 4 5 -4
>> rank(A)
ans =
3
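For the matrix A shown above, the null space can be found with the null command or from the last columns of V from the SVD. The sketch below is an illustration added here that checks that both sets of vectors satisfy Ax = 0 and span the same space. The variable names are chosen for this example.

```matlab
% Sketch: null space of A from null and from the SVD.
A = [-4 2 -3 4 0; 1 8 5 3 2; 1 6 4 5 -4];
r = rank(A);                  % 3, so the null space has dimension 5 - 3 = 2
N = null(A);                  % orthonormal basis for the null space
[~, ~, V] = svd(A);
Nsvd = V(:, r+1:end);         % last n - r columns of V
residual = norm(A*N);         % near zero
sameSpace = rank([N Nsvd]) == size(N, 2);
```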
% Check that the null space vectors are orthogonal to the row space
% vectors. Zero dot product shows orthogonal vectors, which we can
% see from matrix multiplication inner products.
>> nullSpace'*rowSpace
ans =
1.0e-14 *
0 0 0
0 -0.0444 0.1110
% The left null space is empty because the column space vectors
$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$$
We will first find a matrix, B, whose columns are orthogonal and are formed
from projections of the columns of A. For each column after the first, we
subtract the projection of the column onto the previous columns. Thus we are
subtracting away the projection leaving a vector that is orthogonal to all of
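\[
\begin{aligned}
b_1 &= a_1 \\
b_2 &= a_2 - \frac{b_1^T a_2}{b_1^T b_1}\, b_1 \\
b_3 &= a_3 - \frac{b_1^T a_3}{b_1^T b_1}\, b_1 - \frac{b_2^T a_3}{b_2^T b_2}\, b_2 \\
&\;\;\vdots \\
b_n &= a_n - \frac{b_1^T a_n}{b_1^T b_1}\, b_1 - \frac{b_2^T a_n}{b_2^T b_2}\, b_2 - \cdots
      - \frac{b_{n-1}^T a_n}{b_{n-1}^T b_{n-1}}\, b_{n-1}
\end{aligned}
\]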
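The projection equations above can be sketched directly in code. This is only an illustration under stated assumptions: the 3-by-3 matrix A is an arbitrary example with independent columns, and the columns of B are left unnormalized, exactly as in the equations.

```matlab
A = [1 1 0; 1 0 1; 0 1 1];      % assumed example with independent columns
[m, n] = size(A);
B = zeros(m, n);
for k = 1:n
    B(:,k) = A(:,k);            % start from the original column a_k
    for i = 1:k-1
        % subtract the projection of a_k onto the earlier column b_i
        coef = (B(:,i)'*A(:,k)) / (B(:,i)'*B(:,i));
        B(:,k) = B(:,k) - coef*B(:,i);
    end
end

% If the columns of B are orthogonal, B'*B is a diagonal matrix.
G = B'*B;
offDiag = norm(G - diag(diag(G)));
fprintf('largest off-diagonal part of B''*B: %g\n', offDiag);
```

Dividing each column of B by its norm would then give orthonormal columns.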
A.5 QR Factorization
QR factorization finds matrix factors Q and R, where Q contains orthonormal
column vectors, and R is an upper triangular matrix, such that
\[ A = Q\,R. \]
function Q = gram_schmidt(A)
% GRAM_SCHMIDT - Classic Gram-Schmidt Process (CGS)
% Input - A matrix. The algorithm operates on the columns.
% Output - unitary matrix - columns are basis for A.
[m, n] = size(A);
Q = zeros(m, n);
Q(:,1) = A(:,1)/norm(A(:,1)); % the first column
For all but poorly conditioned matrices, the Q and R matrices from QR
are nearly the same as found from the modified Gram–Schmidt process of
appendix A.4.3. However, the QR algorithm implemented in Matlab’s qr
function and the qrFactor function that follows are faster, especially for larger
matrices, and give more numerically accurate results [79, 71].
The QR factorization uses an algorithm based on Householder transformation
matrices.¹ The algorithm feels similar to LU decomposition, where
products of elementary matrices are used to change matrix elements to zero,
resulting in an upper triangular matrix. The difference is that the Householder
matrices are orthogonal transformations acting on the matrix columns rather
than row operations. Each Householder matrix multiplication sets the elements
below the main diagonal of one column to zero. The algorithm finds R first, and
then Q is the product of the Householder matrices [18, 28].
In the following example, the × symbols represent potentially nonzero
matrix elements.
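\[
H_1 A = \begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & \times & \times & \times \\
0 & \times & \times & \times
\end{bmatrix}, \qquad
H_2 H_1 A = \begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & \times & \times
\end{bmatrix},
\]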
¹ The QR algorithm may be accomplished with Givens rotation matrices instead of
Householder matrices.
[m, n] = size(A);
Q = zeros(m, n);
R = zeros(m, n);
for k = 1:n
R(k,k) = norm(A(1:m,k)); % start with normalized
Q(1:m,k) = A(1:m,k)/R(k,k); % column vector
for j = k + 1:n
R(k,j) = Q(1:m,k)'*A(1:m, j);
A(1:m,j) = A(1:m,j) - Q(1:m,k)*R(k,j);
end
end
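\[
H_3 H_2 H_1 A = \begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{bmatrix}
\]
\[ R = (H_{n-1} \cdots H_2 H_1)\, A \]
\[ Q = (H_{n-1} \cdots H_2 H_1)^{-1} \]
\[ Q = H_1 H_2 \cdots H_{n-1} \]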
[Figure: a vector x, the vector u, the subspace span(u)⊥, and the reflection H x.]
The Householder reflector matrix is found from the following equation [35].
\[
H = I - \frac{2}{u^T u}\, u\, u^T, \qquad u \neq 0, \quad u \in \mathbb{R}^n
\tag{A.1}
\]
To use reflector matrices in the QR algorithm, we first find the u vector
such that all elements of H x, except for the first, are equal to zero.
\[
H x = \begin{bmatrix} c & 0 & \cdots & 0 \end{bmatrix}^T = c\, e_1,
\quad \text{where } e_1 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T
\]
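From the geometry and since $\|x\|_2 = \|H x\|_2 = |c|$, u must be parallel to
$\tilde{u} = x \pm \|x\|_2\, e_1$, and $u = \tilde{u}/\|\tilde{u}\|_2$. So we can
find u simply from x with $\|x\|_2$ added to $x_1$ with the same sign as $x_1$
and then making it a unit vector.
\[
\tilde{u} = \begin{bmatrix} x_1 + \text{sign}(x_1)\, \|x\|_2 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},
\qquad u = \frac{\tilde{u}}{\|\tilde{u}\|_2}
\]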
FIGURE A.5: The u vector is set so that only the first element of the
Householder reflection H x has a nonzero value.
\[ H = I - 2\, u\, u^T \]
function u = House(x)
% HOUSE - Find u for Householder reflector matrix
% x - Real vector
% u - Unit length real vector for finding the Householder
% reflector matrix for QR factorization.
s = sign(x(1));
if s == 0 % sign(0) is 0, so treat a zero first element as positive
    s = 1;
end
x(1) = x(1) + s*norm(x);
u = x/norm(x);
end
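To see how the reflector vectors fit into the factorization, here is a minimal, self-contained sketch of a Householder QR loop. It is not the book's qrFactor function. The test matrix A and the variable names are assumptions made for illustration, and the u vector is computed inline with the same steps as House so that the script runs on its own.

```matlab
A = [4 1 2; 2 3 1; 1 2 5; 3 1 1];   % assumed 4-by-3 test matrix
[m, n] = size(A);
R = A;
Q = eye(m);
for k = 1:min(m-1, n)
    x = R(k:m, k);                  % column k, on and below the diagonal
    s = sign(x(1));
    if s == 0, s = 1; end           % guard against sign(0) = 0
    x(1) = x(1) + s*norm(x);        % this is the u-tilde vector
    if norm(x) == 0, continue; end  % nothing to eliminate in a zero column
    u = x/norm(x);
    Hk = eye(m);                    % embed the reflector in an m-by-m identity
    Hk(k:m, k:m) = eye(m-k+1) - 2*(u*u');
    R = Hk*R;                       % zeros below the diagonal of column k
    Q = Q*Hk;                       % accumulates Q = H1*H2*...
end

errA    = norm(Q*R - A);            % Q*R should reproduce A
errOrth = norm(Q'*Q - eye(m));      % Q should be orthogonal
errTri  = norm(tril(R, -1));        % R should be upper triangular
fprintf('%g  %g  %g\n', errA, errOrth, errTri);
```

Forming each full m-by-m reflector is done here only for clarity. Production code normally applies u directly to the trailing columns without building Hk.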
\[ B = M^{-1} A\, M \]
To show this, we will start with this relationship and the eigenvalue equation
for matrix A and work toward the eigenvalue equation for the similar matrix
B.
Ax = λ x
We can insert an identity matrix (M M−1 = I) into the eigenvalue equation.
\[ A\, (M\, M^{-1})\, x = \lambda\, x \]
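Pre-multiply both sides by $M^{-1}$ and substitute $B = M^{-1} A\, M$.
\[ M^{-1} A\, M\, (M^{-1} x) = \lambda\, (M^{-1} x) \]
\[ B\, (M^{-1} x) = \lambda\, (M^{-1} x) \]
Thus, an eigenvalue of B is λ, the same as A, and the corresponding
eigenvector of B is $M^{-1} x$.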
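A quick numerical check of this result is sketched below. The matrices A and M are arbitrary examples chosen here (M only needs to be invertible), so treat the script as an illustration rather than part of the proof.

```matlab
A = [2 1 0; 1 3 1; 0 1 4];          % assumed example matrix
M = [1 2 0; 0 1 1; 1 0 1];          % assumed invertible matrix (det = 3)
B = M\A*M;                          % B = inv(M)*A*M

[X, L] = eig(A);
lamA = sort(diag(L));
lamB = sort(real(eig(B)));          % drop round-off-sized imaginary parts, if any
eigDiff = norm(lamA - lamB);        % eigenvalues of A and B should agree

% The eigenvector of B for the first eigenvalue of A should be inv(M)*x.
x = X(:,1);
y = M\x;
resid = norm(B*y - L(1,1)*y);
fprintf('eigenvalue difference: %g, residual: %g\n', eigDiff, resid);
```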
Some examples of similar matrices are:
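\[ \lambda_2\, (c_1 x_1 + c_2 x_2) = c_1 \lambda_2 x_1 + c_2 \lambda_2 x_2 = 0 \]
Now subtract.
\[
\begin{array}{rl}
  & c_1 \lambda_1 x_1 + c_2 \lambda_2 x_2 = 0 \\
- & c_1 \lambda_2 x_1 + c_2 \lambda_2 x_2 = 0 \\ \hline
  & c_1\, (\lambda_1 - \lambda_2)\, x_1 = 0
\end{array}
\]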
Since the λ’s are different and x1 ̸= 0, we conclude that c1 = 0, and similarly
c2 = 0. Thus c1 x1 + c2 x2 = 0 only when c1 = c2 = 0. So the eigenvectors x1
and x2 are linearly independent of each other.
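Since $A x_3 = \lambda_3 x_3$,
\[ A x_3 = \lambda_3\, (k_1 x_1 + k_2 x_2) = \lambda_3 k_1 x_1 + \lambda_3 k_2 x_2. \]
Vectors $x_1$ and $x_2$ can be substituted using $x_1 = \frac{1}{\lambda_1} A x_1$
and $x_2 = \frac{1}{\lambda_2} A x_2$.
\[ A x_3 = \frac{\lambda_3 k_1}{\lambda_1}\, A x_1 + \frac{\lambda_3 k_2}{\lambda_2}\, A x_2 \tag{A.3} \]
\[ x_3 = \frac{\lambda_3 k_1}{\lambda_1}\, x_1 + \frac{\lambda_3 k_2}{\lambda_2}\, x_2 \tag{A.4} \]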
Pre-multiplying both sides of equation (A.3) by $A^{-1}$ removes the A
matrices, leaving equations (A.2) and (A.4) as equivalent equations. If
equations (A.2) and (A.4) are true, then it must be that $\lambda_3 = \lambda_1$
and $\lambda_3 = \lambda_2$, so $\lambda_1 = \lambda_2 = \lambda_3$, which is a
contradiction of the initial statement that each eigenvalue is distinct.
Therefore, equation (A.2) is false. If each eigenvalue is distinct, then all
eigenvectors are linearly independent.
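The conclusion can be illustrated numerically. In this sketch the triangular matrix A is an assumed example whose eigenvalues (4, 2, and −1) can be read from its diagonal. Independence of the eigenvectors is checked through the rank of the eigenvector matrix.

```matlab
A = [4 1 0; 0 2 1; 0 0 -1];     % assumed example; eigenvalues 4, 2, -1
[X, L] = eig(A);
lambda = diag(L);

% Distinct eigenvalues: no repeated values after rounding away round-off.
distinct = (numel(unique(round(lambda, 10))) == numel(lambda));

% Independent eigenvectors: the eigenvector matrix has full rank.
independent = (rank(X) == size(A, 1));
fprintf('distinct: %d, independent: %d\n', distinct, independent);
```

Note that the implication only runs one way. A matrix with repeated eigenvalues may or may not have a full set of independent eigenvectors.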
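\[
\begin{array}{rl}
  & \bar{x}^T S\, x = \lambda\, \bar{x}^T x \\
- & x^T S\, \bar{x} = \bar{\lambda}\, x^T \bar{x} \\ \hline
  & 0 = (\lambda - \bar{\lambda})\, x^T \bar{x}
\end{array}
\]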
Because of the symmetry of S, the scalar values on the left-hand sides are
the same (subtracting to zero). On the right-hand side, the dot product
$x^T \bar{x}$ is the sum of the squared magnitudes of the eigenvector elements,
$\|x\|^2$, and cannot be zero for a nonzero vector. Thus, it must be that
$\lambda - \bar{\lambda} = 0$, which is true only when λ is real.
\[ x_i^T S\, x_j = x_j^T S\, x_i \]
Starting with two eigenvector equations, we can pre-multiply the first
equation by the transpose of the eigenvector from the second equation, then
pre-multiply the second equation by the transpose of the eigenvector from the
first equation and subtract the two equations.
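\[
\begin{array}{rl}
  & x_j^T S\, x_i = \lambda_i\, x_j^T x_i \\
- & x_i^T S\, x_j = \lambda_j\, x_i^T x_j \\ \hline
  & 0 = (\lambda_i - \lambda_j)\, x_i^T x_j
\end{array}
\]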
>> S = [1 2; 2 2]
S =
1 2
2 2
>> [Q, Lambda] = eig(S)
Q =
-0.7882 0.6154
0.6154 0.7882
Lambda =
-0.5616 0
0 3.5616
>> Q'*Q
ans =
1.0000 -0.0000
-0.0000 1.0000
Appendix B
The Number e
\[ y(x) = e^{a x}, \]
then
\[ \frac{dy(x)}{dx} = a\, e^{a x} = a\, y(x). \]
Let us now regard exponential equations of e as special cases of a more
general class of exponential equations. In doing so, we will see some interesting
properties of e and also get a start toward finding a definition of the value of
e. In the following equation, k is any positive, real number.
\[ y(x) = k^{a x} \]
A strategy for finding the derivative of y(x) is to first take the natural loga-
rithm of y(x) (base e logarithm, denoted as ln y). Although we have not yet
found the value of e, we know that it is a number and can abstractly use it as
the base for a logarithm.
\[ \ln y(x) = a\, x \ln k \]
\[ \frac{d\,[\ln y(x)]}{dx} = \frac{d\,[a\, x \ln k]}{dx} \]
The derivative of $e^{a t}$
To find the derivative of $e^{a t}$ we can either take the derivative of its
Maclaurin series (see note a) or use its numeric definition in terms of a limit.
The latter is used here.
\[ e^{a t} = \lim_{N \to \infty} \left(1 + \frac{a t}{N}\right)^N \]
The chain rule is used to find the derivative. If
$f(t) = \left(1 + \frac{a t}{N}\right)^N$, then
$f'(t) = a \left(1 + \frac{a t}{N}\right)^{N-1}$. We see the desired equality
then in the limit.
\[ e^{a t} = \lim_{N \to \infty} f(t) \]
\[ \frac{d}{dt}\left(e^{a t}\right) = \lim_{N \to \infty} f'(t) = a \lim_{N \to \infty} f(t) = a\, e^{a t} \]
Note a: The Maclaurin series for a function f(t) is the Taylor series for f(t)
about the point t = 0.
The left side of the above equation is the more difficult to find. Implicit
differentiation and the chain rule show that
\[ \frac{d\,[\ln y]}{dx} = \frac{1}{y} \frac{dy}{dx}. \]
\[ \frac{1}{y} \frac{dy}{dx} = a\, (\ln k) \]
Thus, after multiplying by y, we have
\[ \frac{dy}{dx} = a\, (\ln k)\, y = a\, (\ln k)\, k^{a x} \tag{B.1} \]
If k = 2, $\frac{dy}{dx} = a\, (0.693147)\, 2^{a x}$.
If k = e, $\frac{dy}{dx} = a\, e^{a x}$.
If k = 3, $\frac{dy}{dx} = a\, (1.0986)\, 3^{a x}$.
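Equation (B.1) can be spot-checked against a numerical derivative. The sketch below uses the built-in value of e and a central difference. The values of a, x, and h are arbitrary choices made for this illustration.

```matlab
a = 0.5;  x = 1.2;  h = 1e-6;            % assumed test values
k = [2, exp(1), 3];                      % the three bases used above

y = @(x) k.^(a*x);                       % y(x) = k^(a x), one entry per base
numeric  = (y(x + h) - y(x - h))/(2*h);  % central difference estimate
analytic = a*log(k).*k.^(a*x);           % equation (B.1)

relErr = max(abs(numeric - analytic)./abs(analytic));
fprintf('largest relative error: %g\n', relErr);
```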
Equation (B.1) is useful, but it assumes that the value of e is already
known. We need to use the definition of a derivative to find an equation for
the value of e,
\[ \frac{dy}{dx} = \lim_{h \to 0} \frac{y(x + h) - y(x)}{h}. \]
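\[
\begin{aligned}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{k^{(x+h)} - k^x}{h} \\
              &= \lim_{h \to 0} \frac{k^x\, k^h - k^x}{h} \\
              &= \lim_{h \to 0} k^x\, \frac{(k^h - 1)}{h}
\end{aligned}
\]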
Relating the last equation to equation (B.1) with a = 1, we find a limit
equation for ln k.
\[ \ln(k) = \lim_{h \to 0} \frac{k^h - 1}{h} \]
Let us test this with the natural log of 2 and 3. We need a very small value
for h to get a reasonably accurate result.
>> h = 0.00000001;
>> ln2 = (2^h - 1)/h
ln2 =
0.693147184094300
>> log(2)
ans =
0.693147180559945
>> ln3 = (3^h - 1)/h
ln3 =
1.098612290029166
>> log(3)
ans =
1.098612288668110
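How small h should be is a trade-off: a larger h leaves more truncation error, while a very small h loses digits when 1 is subtracted from a number that is barely larger than 1. The following sketch, with an assumed set of h values, tabulates the error of the ln 2 estimate so the trend can be seen.

```matlab
h = 10.^(-(2:2:14));              % assumed step sizes: 1e-2, 1e-4, ..., 1e-14
approx = (2.^h - 1)./h;           % limit formula evaluated at each h
err = abs(approx - log(2));       % error relative to the built-in log

for i = 1:numel(h)
    fprintf('h = %8.1e   error = %8.1e\n', h(i), err(i));
end
```

The error first shrinks as h decreases and then grows again once round-off dominates.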
Let us test it with Matlab. Here, the limited digital resolution of the
computer can limit the accuracy if N is too large.
>> N = 1E10;
>> (1 + 1/N)^N
ans =
2.718282053234788
>> exp(1)
ans =
2.718281828459046
>> (1 + 3/N)^N
ans =
20.085541899804120
>> exp(3)
ans =
20.085536923187668
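The effect of making N too large can be seen with a short sweep. This is a sketch with an assumed set of N values. Once 1/N falls below half the spacing of double precision numbers near 1, the sum 1 + 1/N rounds to exactly 1 and the estimate collapses.

```matlab
N = 10.^(2:2:16);                 % assumed values: 1e2, 1e4, ..., 1e16
approx = (1 + 1./N).^N;           % limit formula for e at each N
err = abs(approx - exp(1));

for i = 1:numel(N)
    fprintf('N = %8.1e   estimate = %.12f   error = %8.1e\n', ...
            N(i), approx(i), err(i));
end
```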
Now replace x in the equations for $e^x$ with $jx$. Remember that $j^2 = -1$.
% File: cmplxEuler.m
% let n = some big number
N = 100000;
z = linspace(0,2*pi); % test 100 numbers between 0 and 2*pi
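The rest of the script is not shown here. As a self-contained sketch of one way such a test could be completed (an assumption, not the contents of cmplxEuler.m), the limit definition with jz in place of x can be compared with cos z + j sin z.

```matlab
N = 100000;                           % some big number
z = linspace(0, 2*pi);                % 100 test values between 0 and 2*pi

limitForm = (1 + 1j*z/N).^N;          % limit definition of e^(jz), finite N
eulerForm = cos(z) + 1j*sin(z);       % Euler's formula

maxErr = max(abs(limitForm - eulerForm));
fprintf('largest difference: %g\n', maxErr);
```

With a finite N the two sides agree only approximately. The difference shrinks as N grows, until round-off takes over.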
where I is the annual interest rate and N is the number of times that interest
is compounded each year. If interest were compounded continuously, then we
have
\[ \text{Balance} = P\, e^{I t}. \]
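A short comparison shows how the balance approaches the continuous limit as compounding becomes more frequent. The sketch assumes the standard discrete compounding formula $P (1 + I/N)^{N t}$ and the simple interest formula $P (1 + I t)$. The principal, rate, and term are made-up values.

```matlab
P = 1000;  I = 0.05;  t = 10;         % assumed principal, annual rate, years
N = [1 4 12 365];                     % compounding periods per year

compounded = P*(1 + I./N).^(N*t);     % assumed discrete compounding formula
continuous = P*exp(I*t);              % continuous compounding
simple     = P*(1 + I*t);             % assumed simple interest formula

for i = 1:numel(N)
    fprintf('N = %3d   balance = %.2f\n', N(i), compounded(i));
end
fprintf('continuous = %.2f, simple = %.2f\n', continuous, simple);
```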
If a loan uses compound interest, then the principal can grow multiple times
between each payment, depending on how often it is compounded. This results
in the borrower paying much more over the life of the loan. Fortunately,
automobile loans and home mortgages use simple interest. However, credit card
debt and student loans are more likely to use compound interest.
The difference between simple and compound interest made for a humorous
exchange between George Washington, first president of the United States, and
his stepson. Washington’s wife, Martha, was previously the widow of a wealthy
man. One of Martha’s sons, John Parke Custis, was not very knowledgeable of
business matters when he borrowed money at compound interest to purchase
a plantation. When Washington learned of the terms of the loan, he wrote
the following to his stepson, “No Virginia estate (except a few under the best
management) can stand simple interest. How then can they bear compound
Interest?” [25].
From section 6.5, we know that $A^k = X \Lambda^k X^{-1}$, where X and Λ are the
eigenvector and diagonal eigenvalue matrices of A.
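Thus,
\[
\begin{aligned}
e^{A} &= I + X \Lambda X^{-1} + \frac{X \Lambda^2 X^{-1}}{2!} + \frac{X \Lambda^3 X^{-1}}{3!} + \cdots \\
      &= X \left( I + \Lambda + \frac{\Lambda^2}{2!} + \frac{\Lambda^3}{3!} + \cdots \right) X^{-1} \\
      &= X\, e^{\Lambda}\, X^{-1}
\end{aligned}
\]
\[
\Gamma = e^{\Lambda} = I + \Lambda + \frac{\Lambda^2}{2!} + \frac{\Lambda^3}{3!} + \cdots
= \begin{bmatrix}
e^{\lambda_1} & 0 & \cdots & 0 \\
0 & e^{\lambda_2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & e^{\lambda_n}
\end{bmatrix}
\]
\[ e^{A} = X\, \Gamma\, X^{-1} \]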
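The factored form can be compared with MATLAB's expm function. In this sketch the matrix A is an assumed symmetric example (eigenvalues −3 and −1), chosen so that it is certain to be diagonalizable.

```matlab
A = [-2 1; 1 -2];                 % assumed example; eigenvalues -3 and -1
[X, L] = eig(A);                  % eigenvectors and diagonal eigenvalue matrix

Gamma = diag(exp(diag(L)));       % e^Lambda: exponentials on the diagonal
E1 = X*Gamma/X;                   % X * Gamma * inv(X)
E2 = expm(A);                     % built-in matrix exponential

err = norm(E1 - E2);
fprintf('difference from expm: %g\n', err);
```

Note that exp(A) would instead take the exponential of each element, which is a different operation from the matrix exponential.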
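where
\[
\Gamma^t = \begin{bmatrix}
e^{\lambda_1 t} & 0 & \cdots & 0 \\
0 & e^{\lambda_2 t} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & e^{\lambda_n t}
\end{bmatrix}.
\]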
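Note:
\[
\begin{aligned}
e^{A} &= X\, \Gamma\, X^{-1} \\
e^{A\,2} &= (X\, \Gamma\, X^{-1})\, (X\, \Gamma\, X^{-1}) = X\, \Gamma^2\, X^{-1} \\
e^{A\,3} &= (X\, \Gamma^2\, X^{-1})\, (X\, \Gamma\, X^{-1}) = X\, \Gamma^3\, X^{-1} \\
&\;\;\vdots \\
e^{A\,t} &= X\, \Gamma^t\, X^{-1}
\end{aligned}
\]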
Using equation (B.4), we find k from y(t) when t = 0.
\[ y(0) = X\, \Gamma^0\, X^{-1}\, k = X\, I\, X^{-1}\, k = k \]
Letting $c = X^{-1} k = X^{-1} y(0)$ gives the solution used in section 6.7
(equation (6.7) on page 269).
Here we keep all of the variables as matrices and vectors in Matlab and find
the same solution.
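As an illustration of this kind of matrix computation (a sketch only: the system matrix A, the initial condition y0, and the time t are assumed values, and the system is assumed to be y' = A y), the eigenvalue form $y(t) = X\, \Gamma^t\, c$ can be compared with the matrix exponential form $y(t) = e^{A t}\, y(0)$.

```matlab
A  = [-2 1; 1 -2];                % assumed system matrix for y' = A*y
y0 = [1; 0];                      % assumed initial condition y(0)
t  = 0.75;                        % assumed time at which to evaluate y(t)

[X, L] = eig(A);
c = X\y0;                         % c = inv(X)*y(0)
GammaT = diag(exp(diag(L)*t));    % Gamma^t: e^(lambda_i t) on the diagonal

y_eig  = X*GammaT*c;              % eigenvalue form of the solution
y_expm = expm(A*t)*y0;            % matrix exponential form

err = norm(y_eig - y_expm);
fprintf('difference between the two forms: %g\n', err);
```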
[1] Robert K. Adair. The Physics of Baseball. Harper Perennial, New York, NY, 1990.
[2] Falah Alsaqre. Two-dimensional PCA for face recognition. https://www.mathworks.com/matlabcentral/fileexchange/69377-two-dimensional-pca-for-face-recognition, 2019. MATLAB Central File Exchange, Retrieved July 10, 2019.
[3] ANSI/IEEE. IEEE standard for binary floating-point arithmetic. ANSI/IEEE Std 754-1985, pages 1–20, 1985.
[4] Stormy Attaway. MATLAB, A Practical Introduction to Programming and Problem Solving. Butterworth-Heinemann/Elsevier, Amsterdam, fourth edition, 2017.
[5] A. Azzalini and A. W. Bowman. A look at some data on the Old Faithful geyser. Journal of the Royal Statistical Society. Series C (Applied Statistics), 39(3):357–365, 1990.
[6] Major League Baseball. Mustc: Gordon’s clutch home run. https://www.mlb.com/video/must-c-gordon-s-clutch-home-run-c526415283.
[7] Steven Brunton and Nathan Kutz. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press, 2019.
[8] Garrett Buffington. Polar decomposition of a matrix. http://buzzard.ups.edu/courses/2014spring/420projects/math420-UPS-spring-2014-buffington-polar-decomposition.pdf, 2014.
[9] Rizwan Butt. Introduction to Numerical Analysis using MATLAB. Jones and Bartlett, 2010.
[10] Andrew Chamberlain. The linear algebra view of the Fibonacci sequence. https://medium.com/@andrew.chamberlain/the-linear-algebra-view-of-the-fibonacci-sequence-4e81f78935a3, 2016.
[11] Stephen J. Chapman. MATLAB Programming for Engineers. Cengage Learning, Boston, MA, fifth edition, 2016.
[12] Steven C. Chapra and Raymond L. Canale. Numerical Methods for Engineers. McGraw-Hill, seventh edition, 2015.
[13] Mei-Qin Chen. A brief history of linear algebra and matrix theory. http://www.macs.citadel.edu/chenm/240.dir/12fal.dir/history2.pdf, 2012.
[14] R.E. Cline and R.J. Plemmons. $l_2$-solutions to undetermined linear systems. SIAM Review, 18(1):92–106, Jan., 1976.
[15] Peter Corke. Robotics, Vision and Control–Fundamental Algorithms in MATLAB. Springer, New York, NY, second edition, 2017.
[16] Peter Corke. Spatial math toolbox. https://petercorke.com/toolboxes/spatial-math-toolbox/, 2020.
[17] Marc Deisenroth, Aldo Faisal, and Cheng Ong. Mathematics for Machine Learning. Cambridge University Press, 02 2020.
[18] James W. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, PA, 1997.
[19] Froilan Dopico. Alan Turing and the origins of modern Gaussian elimination. Arbor, 189:a084, 12 2014.
[20] Dheeru Dua and Casey Graff. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml, 2017.
[21] C. Eckart and G. Young. The approximation of one matrix by another of lower rank. Psychometrika, 1(3):211–218, 1936.
[22] Elsie Eigerman. How to import data from spreadsheets and text files without coding. https://www.mathworks.com/videos/importing-data-from-text-files-interactively-71076.html.
[23] R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2):179–188, 1936.
[24] G. E. Forsythe, M. A. Malcolm, and C. B. Moler. Computer Methods for Mathematical Computations. Prentice Hall, Englewood Cliffs, NJ, 1976.
[25] National Archives Founders Online. From George Washington to John Parke Custis, 3 August 1778. https://founders.archives.gov/documents/Washington/03-16-02-0249, 2006.
[26] Matan Gavish and David L. Donoho. The optimal hard threshold for singular values is $4/\sqrt{3}$. IEEE Transactions on Information Theory, 60(8):5040–5053, 2014.
[27] G. Golub and C. Reinsch. Singular value decomposition and least squares solution. Numerische Mathematik, 14:403–420, 1970.
[28] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013.
[29] Michael C. Grant and Stephen P. Boyd. CVX: MATLAB software for disciplined convex programming. http://cvxr.com/cvx/, 2020. CVX Research, Inc.
[30] John V. Guttag. Introduction to Computation and Programming Using Python: With Application to Understanding Data. The MIT Press, second edition, 2016.
[31] Gabriel Ha. Creating a basic plot interactively. https://www.mathworks.com/videos/creating-a-basic-plot-interactively-68978.html.
[32] Brian D. Hahn and Daniel T. Valentine. Essential MATLAB for Engineers and Scientists. Academic Press/Elsevier, Amsterdam, sixth edition, 2017.
[33] E. Cuyler Hammond and Daniel Horn. The relationship between human smoking habits and death rates: A follow-up study of 187,766 men. Journal of the American Medical Association, 155(15):1316–1328, Aug. 1954.
[34] Desmond J. Higham and Nicholas J. Higham. MATLAB Guide. SIAM, Philadelphia, PA, third edition, 2017.
[35] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA, second edition, 2002.
[36] Nicholas J. Higham. Gaussian elimination. Wiley Interdisciplinary Reviews: Computational Statistics, 3:23–238, 2011.
[37] Robert V. Hogg and Allen T. Craig. Introduction to Mathematical Statistics. Macmillan, London, fourth edition, 1978.
[38] The MathWorks Inc. Create 2-d line plots. https://www.mathworks.com/help/matlab/creating_plots/using-high-level-plotting-functions.html.
[39] The MathWorks Inc. Greek letters and special characters in chart text. https://www.mathworks.com/help/matlab/creating_plots/greek-letters-and-special-characters-in-graph-text.html.
[40] The MathWorks Inc. Line properties. https://www.mathworks.com/help/matlab/ref/matlab.graphics.chart.primitive.line-properties.html.
[41] The MathWorks Inc. MATLAB for new users. https://www.mathworks.com/videos/matlab-for-new-users-1487714181074.html.
[42] The MathWorks Inc. Supported file formats for import and export. https://www.mathworks.com/help/matlab/import_export/supported-file-formats-for-import-and-export.html.
[43] The MathWorks Inc. Two-D and Three-D plots. https://www.mathworks.com/help/matlab/learn_matlab/plots.html.
[44] The MathWorks Inc. MATLAB fundamentals. https://matlabacademy.mathworks.com/, 2017.
[45] The MathWorks Inc. Characters and strings. https://www.mathworks.com/help/matlab/characters-and-strings.html, 2021.
[46] The MathWorks Inc. Systems of linear equations. https://www.mathworks.com/help/matlab/math/systems-of-linear-equations.html, 2021.
[47] J. Kautsky, N.K. Nichols, and P. Van Dooren. Robust pole assignment in linear state feedback. International Journal of Control, 41(5):1129–1155, 1985.
[48] Philip N. Klein. Coding the Matrix: Linear Algebra through Applications to Computer Science. Newtonian Press, 2013.
[49] Jose Nathan Kutz. Data-Driven Modeling & Scientific Computation: Methods for Complex Systems & Big Data. Oxford University Press, 2013.
[50] J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright. Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM Journal of Optimization, 9(1):112–147, 1998.
[51] James Lambers. CME 335 lecture 6 notes. https://web.stanford.edu/class/cme335/lecture6.pdf, 2010.
[52] Cris Luengo. Boxplot. https://www.mathworks.com/matlabcentral/fileexchange/51134-boxplot, 2015. MATLAB Central File Exchange. Retrieved September 3, 2020.
[53] John H. Mathews and Kurtis D. Fink. Numerical Methods Using Matlab. Pearson Prentice Hall, fourth edition, 2005.
[54] J.L. Meriam. Engineering Mechanics Statics and Dynamics. John Wiley & Sons, Inc. New York, NY, 1978.
[55] Cleve Moler. Numerical Computing with MATLAB. SIAM, Philadelphia, PA, 2004.
[56] Cleve Moler. Professor svd. https://www.mathworks.com/company/newsletters/articles/professor-svd.html, 2006. A blog post in the MathWorks’ Technical Articles and Newsletters.
[57] Cleve Moler. Gil Strang and the cr matrix factorization. https://blogs.mathworks.com/cleve/2020/10/23/gil-strang-and-the-cr-matrix-factorization/, 2020. Blog: Cleve’s Corner: Cleve Moler on Mathematics and Computing.
[58] David S. Moore, William Notz, and Michael Fligner. The Basic Practice
of Statistics. W.H. Freeman and Co., New York, 2018.