A Quantitative Methodology for Mapping Project Costs to
Engineering Decisions in Naval Ship Design and Procurement
by
Kristopher David Netemeyer
Submitted to the Department of Mechanical Engineering and Engineering Systems Division on May 7,
2010 in Partial Fulfillment of the Requirements for the Degrees of
Naval Engineer
and
Master of Science in Engineering and Management
Abstract
Alternative methods for cost estimation are important in the early conceptual stages of a design when
there is not enough detail to allow for a traditional quantity takeoff estimate to be performed. Much of
the budgeting process takes place during the early stages of a design and it is important to be able to
develop a budget quality estimate so a design is allocated the necessary resources to meet stakeholder
requirements. Accurate project cost estimates early in the planning and design processes can also serve
as a cost-control measure to assist in managing the design process. With an understanding of the most
significant engineering decisions that affect project costs, project team members and stakeholders can
proactively make cost-effective decisions during the design process rather than after construction
begins and it is too late to prevent going over budget.
This research examines the potential of Artificial Neural Networks (ANNs) as a tool to support the tasks
of cost prediction, mapping costs to engineering decisions, and risk management during the early stages
of a design’s life-cycle. ANNs are a modeling tool based on the computational paradigm of the human
brain and have proved to be a robust and reliable method for prediction, ranking, classification, and
interpretation or processing of data.
Acknowledgements
The author would like to thank the following organizations and individuals for their assistance. This
thesis would not have been possible without them.
• Dr. Annie Pearce for introducing me to the wonderful world of artificial neural networks.
• Geoffrey Pawlowski and Kelly Meyers of the Naval Surface Warfare Center, Carderock Division, for providing me real-world data to ensure that my effort meant something.
• Erik Strasel for providing me the modeling tool I needed to get my research jumpstarted.
• Captain (Retired) Jeffrey Reed for guiding a “lost” junior officer toward the right path, both
literally and figuratively.
• Santiago Balestrini-Robinson for taking the time to share your vast knowledge of neural networks and statistical modeling techniques with a student over 1,000 miles away.
• Commander Trent Gooding for being patient enough to allow me to have enough rope to find
what I was looking for, but not too much to hang myself with.
• Pat Hale for showing me there is more to the engineering duty community than hardcore math.
• To my wonderful boys who never gave your dad a difficult time when he had to stay indoors to work instead of participating in snowball fights or lacing up for some roller hockey.
• To Heather, the love of my life for putting up with the bizarre weather patterns and the
deranged driving practices in the New England area and supporting me through this entire
endeavor. I could not have done it without your support!
• Finally, to my not yet born daughter Abigail. Thank you for providing some much needed joy
during my last year at MIT and for giving me the motivation to finish this task ahead of schedule.
Daddy loves you!
Table of Contents
1 Introduction ........................................................................................................................................... 9
1.1 The Role of Cost Estimating in Naval Ship Design......................................................................... 9
1.2 Motivation................................................................................................................................... 10
1.3 Objectives.................................................................................................................................... 13
1.4 Related Research ........................................................................................................................ 13
2 Evaluation Framework ........................................................................................................................ 15
2.1 Approach ..................................................................................................................................... 15
2.2 Framework Architecture and Components ................................................................................ 15
3 Overview ............................................................................................................................................. 24
3.1 Problem Scope ............................................................................................................................ 24
3.2 Generation of a Representative Sample Set ............................................................................... 27
3.3 Development of the ANN model................................................................................................. 29
3.4 Generation of Project Range Estimates and Cost-Probability Functions.................................... 42
3.5 Mapping Cost to Engineering Decisions...................................................................................... 43
4 Case Studies ........................................................................................................................................ 44
4.1 Overview ..................................................................................................................................... 44
4.2 Design Space Definitions ............................................................................................................. 45
4.3 Monohull Case ............................................................................................................................ 54
4.4 Catamaran Case .......................................................................................................................... 70
5 Conclusion ........................................................................................................................................... 85
5.1 Summary of Work ....................................................................................................................... 85
5.2 Application and Future Work ...................................................................................................... 88
6 Bibliography ................................................................................................................................... lxxxix
List of Figures
Figure 1-1 Iron Triangle .............................................................................................................................. 10
Figure 1-2 Program Timeline (Cost Estimating Handbook, 2005).............................................................. 10
Figure 2-1 Estimation Techniques (Gates & Greenberg, 2006) .................................................................. 16
Figure 2-2 Full Factorial (Proust, 2008) ....................................................................................................... 19
Figure 2-3 Biological Neurons (Hagan, Demuth, & Beale, 2004) ................................................................ 21
Figure 2-4 Sample Neural Network Architecture....................................................................................... 22
Figure 3-1 Methodology............................................................................................................................. 24
Figure 3-2 Example Merged Design of Experiments ................................................................................... 29
Figure 3-3 Single-Layer Network (Hagan, Demuth, & Beale, 2004) ............................................................ 33
Figure 3-4 Log-Sigmoid Transfer Function (Hagan, Demuth, & Beale, 2004) ............................................. 33
Figure 3-5 MATLAB Neural Network Training Tool (Demuth, Beale, & Hagan, 2009) ................................ 38
Figure 3-6 Performance Plot ....................................................................................................................... 39
Figure 3-7 Regression Plot .......................................................................................................................... 40
Figure 3-8 Design Space .............................................................................................................................. 42
Figure 4-1 Structural Material versus Total Direct Cost.............................................................................. 48
Figure 4-2 Armament versus Total Direct Cost .......................................................................................... 49
Figure 4-3 Length to Beam Ratio versus Total Direct Cost ........................................................................ 50
Figure 4-4 C4I versus Total Direct Cost ...................................................................................................... 51
Figure 4-5 Installed Horsepower versus Total Direct Cost......................................................................... 51
Figure 4-6 Troop Capacity versus Total Direct Cost ................................................................................... 52
Figure 4-7 MONOHULL Training Cases ....................................................................................................... 56
Figure 4-8 MONOHULL Trained Network Performance Plot ..................................................................... 60
Figure 4-9 MONOHULL Trained Network Regression Check ..................................................................... 60
Figure 4-10 MONOHULL Simulated Cases ................................................................................................. 62
Figure 4-11 MONOHULL Total Direct Cost PDF .......................................................................................... 63
Figure 4-12 MONOHULL Total Direct Cost CDF.......................................................................................... 63
Figure 4-13 MONOHULL Correlation Results ............................................................................................. 64
Figure 4-14 MONOHULL Single-Factor Sensitivity Results ......................................................................... 65
Figure 4-15 MONOHULL Prediction Profiler .............................................................................................. 66
Figure 4-16 MONOHULL Scatter Plots (100 Variant Sample)..................................................................... 68
Figure 4-17 CATAMARAN Training Cases ................................................................................................... 71
Figure 4-18 CATAMARAN Trained Network Performance Plot.................................................................. 75
Figure 4-19 CATAMARAN Trained Network Regression Check.................................................................. 75
Figure 4-20 CATAMARAN Simulated Cases ................................................................................................ 77
Figure 4-21 CATAMARAN Total Direct Cost PDF ........................................................................................ 78
Figure 4-22 CATAMARAN Total Direct Cost CDF ........................................................................................ 78
Figure 4-23 CATAMARAN Correlation Results ........................................................................................... 79
Figure 4-24 CATAMARAN Single-Factor Sensitivity Results ....................................................................... 80
Figure 4-25 CATAMARAN Prediction Profiler............................................................................................. 81
Figure 4-26 CATAMARAN Scatter Plots (100 Variant Sample) ................................................................... 83
Figure 5-1 CDF ............................................................................................................................................. 85
List of Tables
Table 3-1 LHSV Cost and Weight Estimation Model Independent Variables ............................................. 25
Table 4-1 Fixed Independent Parameters .................................................................................................. 45
Table 4-2 Variable Baseline Values ............................................................................................................ 46
Table 4-3 Material Yield Strength .............................................................................................................. 47
Table 4-4 Armament Configurations.......................................................................................................... 49
Table 4-5 Ship's Work Breakdown Structure ............................................................................................. 54
Table 4-6 MONOHULL Independent Variable Range ................................................................................. 55
Table 4-7 MONOHULL Stage One ANN Test Results .................................................................................. 58
Table 4-8 MONOHULL Stage Two ANN Test Results .................................................................................. 59
Table 4-9 MONOHULL Simulation Input Parameter Range ....................................................................... 61
Table 4-10 CATAMARAN Independent Variable Range ............................................................................. 70
Table 4-11 CATAMARAN Stage One ANN Test Results .............................................................................. 73
Table 4-12 CATAMARAN Stage Two ANN Test Results .............................................................................. 74
Table 4-13 CATAMARAN Simulation Input Parameter Range ................................................................... 76
List of Appendices
1 Introduction
As budgets for the construction of new ships and the maintenance, decommissioning, or refurbishing of
existing ships become more limited, stakeholders responsible for funding these pursuits are seeking
better methods to increase the accuracy of project cost estimates; specifically, estimations formulated
during early concept stages of a design. The ability to accurately predict how much a project could cost
is important not only to procure sufficient funding for the project at hand, but to also ensure that other
projects / designs do not suffer due to lack of funding caused by the over budgeting of other designs.
Alternative methods for cost estimation are important in the early conceptual stages of a design when
there is not enough detail to allow for a traditional quantity takeoff estimate to be performed. Much of
the budgeting process takes place during the early stages of a design and it is important to be able to
develop a budget quality estimate so a design is allocated the necessary resources to meet stakeholder
requirements.
Accurate project cost estimates early in the planning and design processes can also serve as a cost-
control measure to assist in managing the design process. With an understanding of the most significant
engineering decisions that affect project costs, project team members and stakeholders can proactively
make cost-effective decisions during the design process rather than after construction begins and it is
too late to prevent going over budget.
This research examines the potential of Artificial Neural Networks (ANNs) as a tool to support the tasks
of cost prediction, mapping costs to engineering decisions, and risk management during the early stages
of a design’s life-cycle. ANNs are a modeling tool based on the computational paradigm of the human
brain and have proved to be a robust and reliable method for prediction, ranking, classification, and
interpretation or processing of data. (Pearce, 1997)
1.1 The Role of Cost Estimating in Naval Ship Design
Cost estimating puts a program or project on a solid foundation, as does the consistent and continuous application of system engineering and program management (Cost Estimating Handbook, 2005). Cost estimating is important to Naval Sea Systems Command (NAVSEA): it serves a vital function in determining costs at the onset of a program while also providing useful information for managing and controlling cost throughout the program’s life cycle.
1.2 Motivation
[Figure 1-1: Iron Triangle, with vertices Scope, Cost, and Schedule.]
In today’s cost-conscious world, project stakeholders and planners need a better way to predict how
early engineering decisions will impact design costs. While this need has traditionally been addressed by
heuristic knowledge (e.g., the larger the beam of the ship, the greater the SWBS 100 group cost will be),
no robust quantitative method exists for understanding how early engineering decisions affect final
project costs.
In practice, models are exercised only to accommodate changes as a program proceeds through the normal acquisition process; therefore, a point estimate captures only a minuscule portion of the model design space. Capturing a larger sample of the design space is more beneficial because it allows the project team to see more variant possibilities long before significant engineering decisions are made and design parameters are locked in. By seeing the “big picture” of how engineering decisions will affect the design, the project team can give the stakeholders more information about the design space and could prevent unnecessary costs (i.e., change orders) later in the design’s life cycle.
ANNs also have the ability to model multivariate non-linear problems. While linear regression techniques will capture a good portion of the interactions occurring in the design space, they will not tell the entire story because they cannot discern nonlinear relationships between input and output variables.
A concern with ANNs that does not arise with traditional regression techniques is that the individual relationships between the input variables and output variables are not developed by engineering judgment, so the model tends to be a “black box”: the input / output pairs are derived without an analytical basis. ANN cases have no equation that can be checked against common sense, and the weights, architecture, and transfer functions that can be extracted from the final trained model offer little insight into how the final product was generated. Explaining to stakeholders how the ANN arrived at its output would be similar to explaining how the Officer of the Deck manages a busy bridge by dissecting his or her brain tissue.
In 2005, the U.S. Army commissioned the RAND Corporation to oversee an Analysis of Alternatives (AoA) study of the joint high speed vessel (JHSV)¹, a new class of surface ships that the Army, Navy, and Marine Corps plan to acquire and operate over the next several decades. Based on commercial automobile and passenger ferry designs, these ships will expand the services’ abilities to transport significant cargo and personnel loads over long distances at high speeds, to reconfigure loads as missions dictate, to operate in shallow waters, and to work in and out of harsh environments. Defense planners expect that the JHSV will enable the services to deploy and engage forces faster than they can with today’s deployment assets by operating in locations and conditions where larger, deep-draft ships cannot function easily and by providing the ability to move large amounts of cargo within a theater more efficiently than aircraft.
¹ The JHSV is sometimes referred to as a high-speed connector (HSC).
1.3 Objectives
The objectives of this research are to:
• Develop a quantitative methodology for identifying and ranking engineering decisions based on
correlations to fluctuations in design costs
• Develop a cost prediction model useful for generating range estimates of final project direct
costs with limited knowledge of project details.
These objectives were addressed to meet the needs of project stakeholders during the concept stage of ship design and procurement projects for guidance on which factors should be closely managed to result in designs that meet functional requirements while remaining within budget constraints. The range estimating capability of the model enables project planners to identify the potential for cost variation before construction begins.
1.4 Related Research
Performance-Based Cost Models (PBCMs) were developed by Robert Jones and Marc Greenberg of NSWCCD and Michael Jeffers of NAVSEA 05C. PBCMs provide a quick way to evaluate cost versus capability by filling the analysis gaps between discrete early stage concepts, which enables exploration of a more complete trade space (Jones, Jeffers, & Greenberg). The key to why they work is that they
acknowledge that the link between performance and cost is typically not direct. For instance, programs
often wish to know the cost of requiring additional speed. Graphing cost against speed, however,
produces poor results. In reality, setting the performance parameters necessitates physical design
modifications that drive the cost. Therefore, the PBCM accepts performance inputs, uses the inputs to
estimate physical characteristics, then uses the physical characteristics to estimate cost. The model’s
design results do not guarantee a producible design, but the quick iteration represents a “plastic”
concept—where the whole ship could potentially be resized to advance the design to within required
parameters. (Jones, Jeffers, & Greenberg) The models also include some standard cost adjustments for
programmatic decisions such as the year and dollar type for reporting cost, the number of ships to be
built, and learning curves.
This PBCM process is modeled by creating a system of iterating equations that converge to a specific
design and rough-order-magnitude (ROM) cost. The equations are developed from historical data and
concept designs. The equation set is physics and empirically based with linear and non-linear
components, and often multivariate. Dummy variables are introduced as necessary to capture design
trend changes or discrete system options. Care is taken to minimize loop formation that would cause
variables to be regressed against themselves, but the model broadly loops on some parameters such as full
load displacement. (Jones, Jeffers, & Greenberg) In the instance where the concept designs
insufficiently address key tradeoffs, selective workarounds may be introduced into the loop.
The PBCM process was developed in the early 1990s and has been used for submarine, surface ship, auxiliary vessel, and small boat analysis. Use waned when many programs shifted to steady state production, but it has resurged as new early stage acquisition programs have emerged. As the design community embraces more extensive design-of-experiments techniques, adaptation of PBCM techniques is under investigation (Jones, Jeffers, & Greenberg).
More detailed information, including formulas and derivations, on Performance-Based Cost Models can
be found in an article titled, “Performance-Based Cost Models” by Robert R. Jones, Michael F. Jeffers,
and Marc W. Greenberg.
A related approach develops candidate architecture solutions through modeling and simulation. The outputs of this modeling and simulation are used to
create ANNs, which are fed into a decision support environment that aids in the data reduction and
provides a way to use the data to perform real-time dynamic trade studies between possible
architecture configurations. The interface of the environment includes all significant inputs and outputs
of the engagement model, as well as data from a Monte Carlo simulation using the ANN models. This
helps to provide traceability when making proposals and acquisition decisions. (Griendling, Balestrini-
Robinson, & Mavris, 2008)
2 Evaluation Framework
2.1 Approach
The approach of this thesis research is a simple process of capturing the entirety of a given model design
space by using a relatively small, but robust sample of that space to train an ANN. The trained network
can then yield an effectively unlimited number of defined design variants along with cost estimates that can be used by the project team to accurately forecast specific costs associated with design decisions. This information can be further used to extrapolate specific cost correlations within the design space and allows project managers to better understand cause and effect earlier in the process.
2.2 Framework Architecture and Components
2.2.1 Large High Speed Vessel (LHSV) Cost and Weight Estimation Model
The LHSV Cost and Weight Estimation Model is an internal design tool developed by the Naval Surface
Warfare Center, Combatant Craft Division (NSWC-CCD)² for use in supporting the JHSV AoA. The Excel model comprises algorithms and heuristic data gathered from previous combatant craft models held in-house and modified as required to meet the needs of the JHSV AoA. The model has only been verified and validated for use in support of JHSV high-level design feasibility studies.
² NSWC-CCD provides full spectrum, full life cycle engineering for combatant craft, boats, watercraft and associated hull, mechanical, electrical and electronic systems. CCD exercises Technical Authority for combatant craft and conducts total craft systems engineering and integration.
There are four major analytical methods or cost estimating techniques used to develop cost estimates for acquisition programs: Analogy; Parametric (Statistical); Engineering (Bottoms Up); and Actual Costs.
Generally, cost estimating techniques used for an acquisition program will progress from analogies
generated based on historical data to actual costs as the program becomes more mature and more
information is known (Figure 2-1). The analogy method is most appropriate early in the program life
cycle when the system is not yet fully defined. This assumes there are analogous systems available for
comparative evaluation. For the Joint High Speed Vessel program, several vessel types were evaluated
for information, including previous combatant craft designs along with commercially designed high-speed vessels like the Swift and WestPac Express. As systems become more defined, estimators are
able to apply parametric (statistical) methods. Estimating by engineering tends to occur in the latter
stages of the process when the design is fixed and more detailed technical and cost data are available.
Once the system is being produced or constructed, the actual cost method can be applied. The LHSV
cost and weight estimation model was developed, verified, and validated for use in the early concept
design stages and incorporates analogy, additive / multiplicative factors, along with simple statistical
and engineering methods to produce cost and weight estimations for the Joint High Speed Vessel
concept. The following sections give an overview of the specific techniques used during the
development of the model.
To produce valid data, there must be reasonable correlation between the proposed design and the historical systems. The estimator makes a subjective evaluation of the differences between the new design of interest and the historical designs; for the LHSV model, this evaluation was performed adequately for feasibility studies using the following methods.
Now that the lift fan cost has been adjusted for inflation³, one more adjustment needs to be made to account for technical differences between the new lift fan and the current lift fan. In this step, it is necessary to interview experts, such as engineers, asking for a technical evaluation of the differences between a lift fan installed onboard an LCAC versus a lift fan proposed for a new design like the Joint High Speed Vessel (Gates & Greenberg, 2006). Based on the evaluation, the developer of the model, with help from cost experts, must assess the cost impact of the technical difference(s). Upon evaluation of the technical data, engineers estimate that the new lift fan is 10% more complex than the current lift fan onboard an LCAC, primarily due to added control and electronic systems. Based on the experts providing a 10% complexity factor, the current lift fan cost is increased by 10% to account for the increase in complexity. The 10% complexity increase equals 10% x $122,000 = $12,200 in constant 2009 dollars. Adding the cost of the added complexity to the current lift fan cost gives an estimated new lift fan cost of $134,200. Adjustment factors were used extensively during the development of the LHSV Cost and Weight Estimation Model in order to produce data that could be used in early design concept studies.
³ $100,000 is a fictional value used for illustration purposes only.
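The same adjustment, written out as a short calculation (a sketch using the example’s values; the base cost is fictional per the footnote):

% Complexity-factor adjustment from the lift fan example above.
currentCost      = 122000;                          % inflation-adjusted cost, constant 2009 dollars
complexityFactor = 0.10;                            % engineers' 10% complexity assessment
adjustment       = complexityFactor * currentCost;  % = $12,200
newLiftFanCost   = currentCost + adjustment         % = $134,200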
There are multiple available methods to perform DOEs and their effectiveness depends on the type of
process or model being experimented on. Two specific methods are used in this thesis and will be
discussed in detail in the following sections.
In full factorial designs, an experimental run is performed at every combination of the factor levels given (Proust, 2008). The sample size is the product of the numbers of levels of the factors. For example, a factorial experiment with a two-level factor, a three-level factor, and a four-level factor has 2 x 3 x 4 = 24 runs.
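As a quick check, such a design can be enumerated with the Statistics Toolbox function fullfact (a minimal sketch; the thesis itself used the MATLAB Model-Based Calibration Toolbox to construct its designs):

% Full factorial over a two-level, a three-level, and a four-level factor.
runs = fullfact([2 3 4]);   % one row per factor-level combination
nRuns = size(runs, 1)       % 24 = 2 x 3 x 4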
Figure 2-2 Full Factorial (Proust, 2008)
Factorial designs with only two-level factors have a sample size that is a power of two (specifically 2^f, where f is the number of factors) (Proust, 2008). When there are three factors, the factorial design
points are at the vertices of the cube as shown in Figure 2-2 above. For more factors, the design points
are the vertices of the hypercube.
Full factorial designs are the most conservative of all the experimental design types. There is little room
for ambiguity if you are able to explore all available factor combinations. Unfortunately, the sample size
grows exponentially with the number of factors, so full factorial designs are very expensive to run for
most practical purposes.
Space-filling designs have two objectives:
1. Prevent replicate points by spreading the design points out to the maximum distance possible
between any two points.
2. Space the points uniformly.
There are multiple space-filling designs that can be utilized. For this thesis, several variations of the
Latin hypercube method are used.
The Latin hypercube is a space-filling sampling technique that takes continuous variable inputs and generates combinations of those variables to “fill” the design space. The technique was first described by McKay in a Technometrics article entitled “A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code”.
In the context of statistical sampling, a square grid containing sample positions is a Latin Square (Figure
2-3) if and only if there is only one sample in each row and each column. A Latin hypercube is the
generalization of this concept to an arbitrary number of dimensions, whereby each sample is the only
one in each axis-aligned hyperplane containing it.
1 2 3
2 3 1
3 1 2
Figure 2-3 Latin Square
When sampling a function of N variables, the range of each variable is divided into M equally probable intervals. M sample points are then placed to satisfy the Latin hypercube requirements; note that this forces the number of divisions, M, to be equal for each variable. Also note that this sampling scheme does not require more samples for additional dimensions (variables); this independence is one of the main advantages of the scheme. Another advantage is that random samples can be taken one at a time, remembering which samples were taken so far.
The maximum number of combinations for a Latin hypercube of M divisions and N variables (i.e., dimensions) can be computed using the following formula:

$\left(\prod_{n=0}^{M-1}(M - n)\right)^{N-1} = (M!)^{N-1}.$
For example, a Latin hypercube of M = 4 divisions with N = 2 variables (i.e., a square) will have 24
possible combinations. A Latin hypercube of M = 4 divisions with N = 3 variables (i.e., a cube) will have
576 possible combinations.
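A minimal sketch verifying these counts and drawing one such sample (lhsdesign is a Statistics Toolbox function; any Latin hypercube generator would serve):

M = 4;                                 % divisions per variable
nSquare = factorial(M)^(2 - 1)         % 24 combinations for N = 2 (a square)
nCube   = factorial(M)^(3 - 1)         % 576 combinations for N = 3 (a cube)

% One M-point Latin hypercube sample in three dimensions: each column
% places exactly one point in each of the M equally probable intervals.
X = lhsdesign(M, 3);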
A biological neuron has three principal components: the dendrites, the cell body, and the axon. The dendrites are receptive networks of nerve fibers that carry
electrical signals into the cell body. The cell body takes these signals and effectively sums and
thresholds these incoming signals. The axon is a single long fiber that carries the signal from the cell
body out to other neurons. The point of contact between an axon of one cell and a dendrite of another
cell is called a synapse. It is the arrangement of neurons and the strengths of the individual synapses,
determined by a complex chemical process, that establishes the function of the neural network (Hagan,
Demuth, & Beale, 2004). Figure 2-4 is a simplified schematic diagram of two biological neurons.
A naturally occurring nervous system is highly plastic. It is this plasticity that helps the network to
organize itself according to the surrounding environment (Araokar). The brain, for example, can detect
a familiar face in milliseconds, whereas even a very fast computer would require hours to accomplish
the same task. In more specific language, a brain or biological nervous system performs perceptual
recognition, which even extremely complex computer networks perform with very little success. Of the
aforesaid qualities of the nervous system, perceptual learning is the most prominent one.
Artificial neural networks began as a means of testing natural neural networks on an experimental level. It was soon realized that the networks could actually be used as alternatives to the classical computational methods. In the book The Organization of Behavior, Hebb postulated that neurons are appropriately interlinked by self-organization and that "an existing pathway strengthens the connections between the neurons". The memory-storage capacity of the human brain, for example, has been estimated at as much as 4 terabytes, the capacity of more than 200 hard disks of 20 GB each. This superiority is believed to be due to the efficient networking and learning in natural neurons, which has not been accomplished artificially. Current research is therefore aimed at bringing artificial networks toward comparably high efficiency in memory storage and processing ability.
⁴ Hidden layers are layers other than the input and output layers.
One of the most important characteristics of neural networks is that they learn from their training
experience. Learning provides an adaptive capability that can extract nonlinear parametric relationships
from the input and output vectors without the need for a mathematical theory or explicit modeling
(Thomas Lamb, 2003). Network learning occurs during the process of weight and bias adjustment such
that the outputs of the neural network for the selected training inputs match the corresponding training
outputs in a minimum root mean square (RMS) error sense. After training, neural networks have the
capability to generalize; i.e., produce useful outputs from inputs not previously seen by the model. This
generalization is achieved by utilizing the information stored in the adjusted neuron weights and biases
to decode the new input patterns. Theoretically, a neural network can approximate a complicated
nonlinear relationship between input and output provided there are enough hidden layers containing
enough nonlinear neurons. A more detailed explanation of artificial neural network architectures and
algorithms is given in Appendix A.
Training cases (design variants) are then developed through the use of design of experiments techniques
and processed through the LHSV Cost and Weight Estimation Model to create a representation of the
design space to train the artificial neural network. The modeled design variants are then used to train
the neural network by integrating the LHSV option variables (either independent or dependent) into the
network as inputs and the resulting model direct costs as the network target values.
Once the model is trained to within acceptable error ranges, an effectively unlimited number of novel variants can be simulated by the trained neural network, producing model values and associated costs across the entire design space.
3 Process Overview
The methodology undertaken to complete the thesis objectives is given in Figure 3-1. The following
sections describe each of the principal stages of the research with enough detail to allow for a complete
reproduction of the study.
[Figure 3-1: Methodology flowchart: Problem Scope; Generation of a Representative Sample Set; Development of an ANN Model; Simulate the Network Response to New Inputs; Generation of Cost Range Estimates and Probability Distributions; Cost Driver Extraction.]

3.1 Problem Scope
Table 3-1 LHSV Cost and Weight Estimation Model Independent Variables
For this research, several means to help narrow the scope of the problem have been identified. The first
technique is grouping sets of independent variables into essentially one independent variable that is
varied over a range of categories. For example, there are multiple armament types ranging from 50
caliber machine guns to nine millimeter handguns. If the design called for varying each of these
variables separately, it would produce an unmanageable number of variants based on armament alone. Categorizing armament independent variables into a handful of armament levels like low,
medium, and high would narrow the number of choices while still effectively capturing this part of the
design space.
A second method is to leverage past decisions of the design project. This depends on what stage of the
design the project is in, but often, even during concept refinement, some of the significant aspects of
the design are agreed upon by all of the stakeholders and are essentially fixed.
A third method of narrowing the scope of the model relies on experience. Some of the available independent variables must be included simply for the model to be considered robust. Historical data, however, show that other independent variables are insignificant to final project costs. Some independent variables are also known not to affect, and not to be affected by, other design decisions. And finally, some of the variables are binary go / no-go choices: the system is either onboard the design or it is not.
Independent variables chosen to be varied or fixed for artificial neural network training cases for this
type of project should be developed using the three techniques above along with a general guideline,
which is to maintain a sufficient degree of problem complexity in order to keep the relationships
between inputs and outputs unpredictable using traditional linear regression techniques (Pearce, 1997).
In the case of the LHSV Cost and Weight Estimation Model, there are several given dependent variables to observe, including speed and displacement. The benefit of using dependent variables is that the project design team can choose which other design aspects will be monitored. Parameters like the ship’s internal density⁵ can be surveyed to investigate how they change with cost. The ability to map changes in design cost to changes in model dependent variables is important when developing an artificial neural network: rather than only capturing model trends based on independent variables, the project team can express in great detail how model outputs vary with changes to dependent variables like speed and density. Essentially, the possibilities are endless when it comes to choosing model parameters to scrutinize in order to make design decisions.
⁵ A ship’s density is defined as the total added weight of SWBS groups 200 through 700 divided by the ship’s volume.
3.1.3 Cost Outputs
In most designs, cost is a significant factor that often drives the design toward less desirable outcomes in terms of other attributes like performance. The methodology of using ANNs to capture a design space was developed to give project managers a better understanding of how cost will fluctuate with design decisions and how to mitigate those costs while still delivering a design that meets the needs of the stakeholder. This is accomplished by training the ANN to accept an input and to produce an output in terms of cost, specifically direct costs.
Direct costs are costs that are obviously and physically related to a project at the time they are incurred
and are subject to influence of the project manager. Examples of direct costs include contractor-
supplied hardware and project labor, whether provided by civil service or contractor employees. (Cost
Estimating Handbook, 2005)
3.2 Generation of a Representative Sample Set
The key to effectively capturing the design space with a small sample is using proven statistical
techniques. For this project two methods are used and recommended when working with computer-
based modeling tools. These methods are a full factorial design and a Latin hypercube space-filling
design. Both techniques are discussed in detail in Chapter 2. MATLAB’s Model-Based Calibration
Toolbox is used to set up the experiments for this work, but there are other tools available that can
produce the same results.
Once the independent and dependent variables have been sorted through and selected for variation
and observation, a design of experiments (DOE) must be chosen to effectively capture the design space
and produce training data for the next step of artificial neural network development. For this type of
problem where the experimenter is trying to approximate a complex computer-based modeling tool, a
combination of a full factorial and Latin hypercube design has been deemed the most robust method at
capturing the design space.
The reason for the combination is to ensure that the extreme minimum and maximum values available
in the design space are captured so there is less of a chance that some of the design space is missed
because the sample set is relatively small when compared to the entire model spectrum. A full factorial
is used to capture the outer extremities of the model while the Latin hypercube is deployed to map
what is going on in the rest of the design space.
Something unique about the LHSV Cost and Weight Estimation Model is that some of the independent variables are set up for categorical or discrete inputs while others are set up for continuous inputs. This can
present problems when transferring training data from the LHSV Cost and Weight Estimation Model to
the ANN, but there are ways to circumvent this issue. One approach is to document a categorical variable’s output based on a continuous parameter associated with the categorical variable’s input. For example, structural material type for the LHSV model is broken up into six categories ranging from mild steel to Eglass / Kevlar sandwich. To produce an adequate input for the ANN, instead of documenting cost changes based on a categorical change, the change can be documented as a continuous change in the material’s yield strength, density, and so on. This way each independent variable is modeled as continuous, ensuring that there are no discontinuities in the sample data.
Once the DOE methodology has been established, value ranges for each independent variable must be
chosen to produce model variants. After the ranges have been sorted out, the next step is to design a
full factorial experiment only looking at the chosen range minimum and maximum values. This will
ensure that the outermost regions of the design space are modeled and available to train the network.
Once that is complete, a Latin hypercube is used to come up with a set of variants distributed evenly
throughout the design space in order to adequately encapsulate the heart of the space. Since some of the LHSV Cost and Weight Estimation Model’s independent variables have to be discrete inputs, a stratified Latin hypercube is used so the DOE output will provide a valid mix of discrete and continuous inputs to the model. The two experiments are then combined to capture the design space (Figure 3-2).
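A minimal sketch of this merge, using plain Statistics Toolbox functions in place of the Model-Based Calibration Toolbox and hypothetical ranges for three continuous variables:

lo = [80 10 5000];      % hypothetical lower bounds (e.g., length, beam, horsepower)
hi = [120 18 40000];    % hypothetical upper bounds
nVars = numel(lo);

% Full factorial on the range extremes captures the corners of the space.
ff = fullfact(2 * ones(1, nVars)) - 1;                 % rows of 0s and 1s
corners = repmat(lo, size(ff, 1), 1) + ff .* repmat(hi - lo, size(ff, 1), 1);

% A Latin hypercube fills the interior of the space.
nInterior = 50;
lh = lhsdesign(nInterior, nVars);                      % samples in [0, 1]
interior = repmat(lo, nInterior, 1) + lh .* repmat(hi - lo, nInterior, 1);

% Merged design of experiments: one row per variant to run through the model.
doe = [corners; interior];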
Figure 3-2 Example Merged Design of Experiments
Once the experimenter is content with the chosen experiments, the MATLAB Model-Based Calibration
Toolbox will output the number of variants that must be modeled back in the LHSV Cost and Weight
Estimation Model. Each JHSV variant is now modeled and the direct costs related to each variant along
with any other chosen dependent variables are documented and are used as the representative sample
set of input / target pairs that will be used to train the artificial neural network.
3.3 Development of the ANN Model
3.3.1 Overview
Now that the input and matching target data have been documented, it is time to develop the neural network model that will be trained and used to simulate LHSV Cost and Weight Estimation Model inputs and outputs over the entire chosen design space. The chosen network type for this problem is a function approximation or regression network, where the goal is not only to determine cost outputs over a large range of inputs, but also to observe and correlate changes in cost outputs with changes in network inputs. This type of information can provide valuable insight into relationships in the design space that a project manager might not otherwise discover using other regression techniques or methods.
Neural networks for this project were designed, developed, and modeled using the Neural Network
Toolbox and code available through MATLAB. Other neural network tools and programs are available;
however, the MATLAB Neural Network Toolbox proved to be the most flexible and robust tool to use for
this problem.
3.3.2 Assemble the Training Data
Before the network can be developed and trained, the training data must be formatted correctly to be fed into the network. The first step involves taking the training data obtained from the LHSV Cost and Weight Estimation Model and determining which variables (either independent or dependent) will be mapped to the targets, in this case the direct cost outputs.
Once the input / target pairs have been established, then it is time to determine how the training data
will be fed into the network. There are several options to accomplish this step. The first is to have the
training data in an Excel spreadsheet and to set up the MATLAB code to read the input / target data
from there. Because the LHSV Cost and Weight Estimation Model is itself a Microsoft Excel spreadsheet,
chances are the output data was also stored in one and therefore it is convenient to leave the training
data there and have MATLAB bring it into the network as necessary. Another way is to import the data
into MATLAB and set up the code to read the data from there. There are multiple ways to import the
training data and using the MATLAB help feature will guide the user.
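A minimal sketch of the spreadsheet route, using xlsread with hypothetical file and sheet names:

% Read input / target pairs produced by the cost model (names are hypothetical).
inputs  = xlsread('LHSV_training_data.xls', 'Inputs');    % one row per variant
targets = xlsread('LHSV_training_data.xls', 'Targets');   % direct cost per variant

% The Neural Network Toolbox expects variables as rows and cases as columns.
inputs  = inputs';
targets = targets';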
3.3.3.1 Inputs and Targets
To define a function approximation problem for the MATLAB toolbox, arrange a set of input vectors as
columns in a matrix. Then, arrange another set of target vectors (the correct output vectors for each of
the input vectors) into a second matrix. For example, define a function approximation problem for a
Boolean AND gate with four sets of two-element input vectors and one-element targets as follows:
Inputs = [0 1 0 1; 0 0 1 1];
Targets = [0 0 0 1];
The number of network inputs and the size of a network input are not the same thing. The number of inputs defines how many sets of vectors the network receives as input. The size of each input is the number of elements in its input vector. Most networks have only one input, whose size is determined by the problem.
What would happen if the input / target data were not preconditioned before entering the network? In the case of the LHSV Cost and Weight Estimation Model, the output data (direct cost) would be quite large when compared with the input: over $100,000,000 on average. To map such large targets to much smaller inputs, the weight from input to output must become very small, about 0.000001. Now assume that this weight is 10% away from its optimal value. This would cause an error of 0.0000001 * 100,000,000 = 10 at the output. At learning rate α, the weight change resulting from this error would be α * 100,000,000 * 10 = 1,000,000,000α. For stable convergence, this should be smaller than the distance to the weight’s optimal value: 1,000,000,000α < 0.0000001, giving α < 10^-16, a very small learning rate (Orr, 1999).

This is a significant problem: if each neuron bias has a constant output of 1, a bias weight that is 0.1 away from its optimal value would have a gradient of 0.1. At a learning rate of 10^-16, it would take on the order of 1,000,000,000,000,000 steps to move the bias weight by this distance. This is a clear case of inappropriate conditioning caused by the vastly different scales of the input and target values. One solution is to normalize the values before introducing them into the network. There are several methods to choose from when normalizing the data. For this discussion it is assumed that each value is “squashed” using a linear compression formula such as

$\tilde{x} = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$

which maps each value into the interval [0, 1]. This squashing or normalization brings all of the values to similar orders of magnitude and allows smoother processing of the data by the neural network.
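A sketch of this step using the toolbox function mapminmax, which scales each row to [-1, 1] rather than [0, 1] (recent toolbox versions apply mapminmax automatically as a default processing function for newff-created networks):

% Synthetic stand-in data: 5 input variables, 100 variants, large-dollar targets.
inputs  = rand(5, 100);
targets = 1e8 + 5e7 * rand(1, 100);

[pn, ps] = mapminmax(inputs);    % normalized inputs and the transform settings
[tn, ts] = mapminmax(targets);   % normalized targets

an = tn;                                % stand-in for normalized network outputs
costs = mapminmax('reverse', an, ts);   % reverse-processed back into dollars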
Similarly, network outputs can also have associated processing functions. Output processing functions
are used to transform user-provided target vectors for network use. Then, network outputs are reverse-
processed using the same functions to produce output data with the same characteristics as the original
user-provided targets.
Note that it is common for the number of inputs to a layer to be different from the number of neurons. A layer is not constrained to have the number of its inputs equal to the number of its neurons. The only exception is the output layer: the number of neurons in the output layer must equal the number of outputs in the target vector.
Figure 3-3 Single-Layer Network (Hagan, Demuth, & Beale, 2004)
A network can have several layers. Each layer has a weight matrix W, a bias vector b, and an output
vector a. Each layer of a multilayer network can also have a different role within a network. A layer that
produces the network output is called an output layer. All other layers are called hidden layers.
It is important to remember when developing network architectures that more neurons require more computation, but they allow the network to solve more complicated problems.
Figure 3-4 Log-Sigmoid Transfer Function (Hagan, Demuth, & Beale, 2004)
The log-sigmoid transfer function takes an input, which can have a value anywhere between plus and minus infinity, and squashes it into an output between 0 and 1. Multiple layers of neurons with nonlinear transfer functions allow the network to learn both linear and nonlinear relationships between input and output vectors (Demuth, Beale, & Hagan, 2009).
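Formally, the log-sigmoid transfer function is

$a = \mathrm{logsig}(n) = \frac{1}{1 + e^{-n}},$

which maps any net input n onto the open interval (0, 1).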
3.3.3.3 Network Initialization
Neural networks can be highly versatile and efficient in adapting to data when approximating nonlinear functions; however, these qualities can be achieved only if neural networks are initialized properly. In
Chapter 2, it was said that in a typical situation, initial weights and biases should be chosen as small,
random values. That way the network stays away from a possible saddle point at the origin and does
not drift out to the flat regions of the performance surface. To accomplish this task, it is recommended
that the Nguyen-Widrow initialization function be used.
The Nguyen-Widrow algorithm generates initial weight and bias values for a layer so that the active
regions of the layer’s neurons are distributed approximately evenly over the input space. The values
contain a degree of randomness, so they are not the same each time the function is called (Demuth,
Beale, & Hagan, 2009). Advantages over purely random weights and biases are:
1. Few neurons are wasted (because all of the neurons are in the input space).
2. Training works faster (because each area of the input space has neurons).
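A minimal sketch with synthetic data (for newff-created networks Nguyen-Widrow is already the default, so this only makes the choice explicit):

p = rand(3, 50);  t = rand(1, 50);    % synthetic input / target pairs
net = newff(p, t, 10);                % one hidden layer of ten neurons

% Nguyen-Widrow initialization for each layer; init() generates new
% randomized weights and biases on every call.
net.layers{1}.initFcn = 'initnw';
net.layers{2}.initFcn = 'initnw';
net = init(net);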
The network performance index is the mean square error:

$\mathrm{mse} = \frac{1}{Q}\sum_{k=1}^{Q} e(k)^2 = \frac{1}{Q}\sum_{k=1}^{Q}\big(t(k) - a(k)\big)^2.$
As each input is applied to the network, the network output is compared to the target. The error is
calculated as the difference between the target output and the network output. The goal of the
algorithm is to minimize the average of the sum of these errors. The LMS algorithm adjusts the weights
and biases of the ANN so as to minimize this mean square error.
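As a tiny worked instance of this error calculation, with made-up targets and outputs:

t = [1 2 3 4];                 % targets
a = [0.9 2.2 2.7 4.1];         % network outputs
mseValue = mean((t - a).^2)    % (1/Q) * sum of squared errors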
The mean square error performance index for the ANN is a quadratic function. Therefore, the performance index will either have one global minimum, a weak minimum, or no minimum, depending
on the characteristics of the input vectors. Specifically, the characteristics of the input vectors
determine whether or not a unique solution exists (Demuth, Beale, & Hagan, 2009).
3.3.3.4.1 Generalization
In most cases the multilayer network is trained with a finite number of examples of proper network
behavior (i.e., the input and target data). This training set is normally representative of a larger class of
possible input / output pairs. Therefore, it is important that the network successfully generalize what it
has learned for the whole population.
For a network to be able to generalize, it should have fewer parameters than there are data points in
the training set (Hagan, Demuth, & Beale, 2004). If the number of parameters in the network is much
smaller than the total number of points in the training set, then there is little to no chance of overfitting.
If it is possible to collect more data and increase the size of the training set, then there is no need to
worry about using techniques like early stopping and regularization, which are explained in more detail
in Appendix A, to prevent overfitting.
3.3.4.1.4 Levenberg-Marquardt
This algorithm is a hybrid between quasi-Newton and gradient-based methods, designed for minimizing functions that are sums of squares of other non-linear functions. It requires the
calculation of the Jacobian and some matrix multiplication operations. This algorithm is well suited to
neural network training where the performance index is the mean squared error. This is also the fastest
training method available in the MATLAB neural network toolbox, but it requires a large amount of
computer memory when compared with other available algorithms.
⁶ The Bayesian regularization approach involves modifying the usual objective function, such as the mean sum of squared network errors.
3.3.4.3 Performance Goal
During training the weights and biases of the ANN are iteratively adjusted to minimize the network
performance function (mean square error). The performance goal is the least mean square error that
the network should train to. This value can be set to whatever value the user wants and only depends
on the level of accuracy needed for simulated values extracted from the network later on.
When generating and testing network architecture, run the network multiple times at each
configuration to ensure that the network is stable and produces similar outputs from each previous
training cycle.
The network is trained using the Levenberg-Marquardt algorithm. The application randomly divides
input vectors and target vectors into three sets as follows:
• 60 percent are used for training.
• 20 percent are used to validate that the network is generalizing and to stop training
before overfitting.
• The final 20 percent are used as a completely independent test of network
generalization.
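A minimal sketch of this training setup with synthetic stand-in data (the hidden layer size and performance goal are illustrative assumptions):

% Synthetic stand-ins: 6 design variables, 200 variants, cost-like targets.
p = rand(6, 200);
t = 1e8 * (1 + sum(p));                  % row vector of pseudo direct costs

net = newff(p, t, 15);                   % one hidden layer of 15 neurons
net.trainFcn = 'trainlm';                % Levenberg-Marquardt
net.divideFcn = 'dividerand';            % random 60 / 20 / 20 division
net.divideParam.trainRatio = 0.60;
net.divideParam.valRatio   = 0.20;
net.divideParam.testRatio  = 0.20;
net.trainParam.goal = 1e-3;              % performance goal (mse)

[net, tr] = train(net, p, t);            % opens the training GUI of Figure 3-5
a = sim(net, p);                         % network outputs for inspection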
During training, the Neural Network Training GUI (Figure 3-5) will open. This window displays training
progress and allows training to be interrupted at any point by clicking the Stop Training button.
Figure 3-5 MATLAB Neural Network Training Tool (Demuth, Beale, & Hagan, 2009)
This training stopped when the validation error increased for six iterations, which occurred at iteration
17. If the Performance button is clicked in the training window, a plot of the training errors, validation
errors, and test errors appears, as shown in Figure 3-6. In this example, the result is reasonable because the final mean-square error is small, the test set error and the validation set error have similar characteristics, and no significant overfitting has occurred by the iteration of best validation performance.
Figure 3-6 Performance plot (mean squared error versus epochs): best validation performance is 14.7575 at epoch 11; training stopped at 17 epochs.
A simple analysis of the network response can be done by clicking the Regression button in the training
window. This performs a linear regression analysis between the network outputs and the corresponding
targets. Figure 3-7 shows the results of the example network response.
Figure 3-7 Regression plots of network outputs versus targets for the training, validation, test, and complete data sets; the linear fits range from Output ~= 0.84*Target + 2.9 to Output ~= 0.99*Target + 0.25, each plotted against the Y = T reference line.
The R-value is over 0.95 for the total response. If even more accurate results are required, the following
approaches can be used:
• Reset the initial network weights and biases to new values and train again.
• Increase the number of hidden neurons.
• Increase the number of training vectors.
• Increase the number of input values.
• Try a different training algorithm.
applications. These two methods are only used when incremental training is preferred. Levenberg-
Marquardt training is normally used for small to medium size networks, if the computer being used is
powerful enough.
The error surface of a nonlinear network is complex. The problem is that nonlinear transfer functions in
multilayer networks introduce many local minima in the error surface. As gradient descent is performed
on the error surface it is possible for the network solution to become trapped in one of these local
minima. This can happen, depending on the initial starting conditions. Settling in a local minimum can be
good or bad depending on how close the local minimum is to the global minimum and how low an error
is required. In any case, be cautioned that although a multilayer backpropagation network with enough
neurons can implement just about any function, backpropagation does not always find the correct
weights for the optimum solution. You might want to reinitialize the network and retrain several times
to guarantee that you have the best solution. (Demuth, Beale, & Hagan, 2009)
Picking a learning rate for a nonlinear network is a challenge. A learning rate that is too large leads to
unstable learning; conversely, a learning rate that is too small results in incredibly long training times.
Networks are also sensitive to the number of neurons in their hidden layers. Too few neurons can lead to underfitting. Too many neurons can contribute to overfitting, in which all training points are well fitted, but the fitting curve oscillates wildly between these points. Ways of dealing with these issues are
discussed in Appendix A.
3.3.5.1 Design Space Selection
A space-filling design, similar to the one used to develop the training set, is recommended to effectively
capture the entire model design space. This time, however, since all of the input variables introduced into the ANN are continuous, the experimental design can be a pure Latin hypercube (Figure 3-8).
The output from the space-filling design includes all of the values for the chosen input variables that
make up the chosen number of design variants in the trade-space. These continuous model input values
are now fed into the trained ANN resulting in a robust output of design direct costs, which can be
further analyzed using various statistical techniques. From this data, cost correlations can be discovered
using other statistical techniques, which will be discussed at the end of this chapter.
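As an illustration, a pure Latin hypercube over continuous inputs can be generated with the MATLAB Statistics Toolbox and scaled to the variable ranges; the bounds shown here are placeholders, not the actual JHSV ranges:

nVariants = 16384;                          % number of design variants
lb = [0.10 5 1000];                         % hypothetical lower bounds
ub = [0.40 45 12000];                       % hypothetical upper bounds
U = lhsdesign(nVariants, numel(lb));        % samples on the unit cube
X = bsxfun(@plus, lb, bsxfun(@times, U, ub - lb));  % scale to real ranges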
7 The probability density function of a continuous random variable describes the relative likelihood of the variable taking a given value in the observation space.
8 The cumulative distribution function completely describes the probability distribution of a real-valued random variable.
To generate the cost-probability functions, the comprehensive set of input variables that was generated
using the Latin hypercube is normalized and fed into the trained ANN model. The resulting output of
direct costs is then extracted and “reversed” from normalized values back to real values and can be
plotted as histograms to generate PDFs and CDFs.
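A hedged sketch of that normalize / simulate / reverse workflow using mapminmax from the toolbox (variable names are assumptions; the settings structures come from normalizing the original training data):

[~, ps] = mapminmax(trainInputs);       % input normalization settings
[~, ts] = mapminmax(trainTargets);      % target normalization settings
Xn = mapminmax('apply', X', ps);        % normalize the Latin hypercube cases
Yn = sim(net, Xn);                      % simulate the trained ANN
Y = mapminmax('reverse', Yn, ts);       % "reverse" to real direct-cost values
hist(Y(end,:), 50);                     % histogram (PDF), assuming the last
                                        % target row is total direct cost
cdfplot(Y(end,:));                      % corresponding empirical CDF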
The sensitivity analysis is a simple and effective way of determining the cost correlations associated with
a ship design and provides valuable information to the project team. However, the validity of the
findings comes into question if there are significant interactions or correlations between the input
variables. Generally, it is better not to place highly correlated independent variables in the same model
for two reasons. First, it does not make scientific sense to place into a model two or three independent
variables which the modeler knows manipulate the same aspect of outcome. Second, correlation
between independent variables can reduce the power of the sensitivity analysis. If interactions between independent variables are possible or in question, statistical techniques can be used to determine the correlation and whether independent variables need to be changed or dropped from the analysis.
3.5.1.2 Design Parameter to Cost Covariance Assessment
To validate findings from the single-factor sensitivity analysis and to get a clearer picture of how changes to direct cost values correlate to changes in input parameters, a multivariate analysis of covariance is performed.
Covariance analysis examines each pair of measurement variables to determine whether the two
measurement variables tend to move together – that is, whether large values of one variable tend to be
associated with large values of the other (positive covariance), whether small values of one variable
tend to be associated with large values of the other (negative covariance), or whether values of both
variables tend to be unrelated (covariance near zero).
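A minimal sketch of such a pairwise check on the simulated data (variable names are assumptions):

D = [X totalDirectCost(:)];             % columns: input parameters, then cost
C = cov(D);                             % sample covariance matrix
inputCostCov = C(1:end-1, end)          % covariance of each input with cost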
3.5.1.3 Scatterplots
While the covariance analysis is a quantifiable method for understanding these relationships, scatterplots take the same data and place it in a form that visually shows relationships between two quantitative variables.
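For example, a full scatterplot matrix of the simulated inputs against total direct cost can be generated in MATLAB (a sketch; variable names are assumptions):

plotmatrix(X, totalDirectCost(:));      % one scatterplot per input parameter
xlabel('Design parameter values');
ylabel('Total direct cost ($)');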
The combination of these analysis techniques will generate links between design decisions and project
costs which can be used to better understand how decisions affect the overall design and to eliminate
some of the risk that is inherent in ship design.
4 Case Studies
4.1 Overview
Two case studies were formulated and executed using the evaluation process described in Chapter 3 to demonstrate the ability of backpropagation neural networks to generate information about design costs and their relationships to engineering decisions made during the design process, and to show the process's robustness and reproducibility.
Case number one focused on a monohull variant of the Joint High Speed Vessel (JHSV). The monohull is
the most commonly used design by the United States Navy and serves as an initial test of the
methodology developed for this thesis to ensure that the output results of the artificial neural network
model make sense and to serve as a baseline to compare cost differences between the more familiar
monohull designs and novel vessel concepts. In case number two, the methodology is applied to a
catamaran variant of the JHSV. This hull form is selected because it is the Navy’s and Army’s preferred
design type for the high speed vessel application, and it is useful to compare results from the catamaran to the traditional monohull.
4.2.1.1 Fixed
Table 4-1 lists the independent variables that were fixed for both case studies. One significant
difference between the monohull and catamaran fixed independent variables is ship’s length. The
monohull is fixed at 425 feet while the catamaran is fixed at 340 feet. Fixed variables were determined
based upon known delineated requirements, surface ship historical data, and a JHSV program overview
brief given by Major Chris Frey to the SNAME SD-5 / HIS Joint Dinner Meeting on 22 July 2009.
Table 4-1 Fixed Independent Parameters
4.2.1.2 Varied
Six of the available independent variables in the LHSV Cost and Weight Estimation Model were
manipulated to generate a representative population of possible design variants able to be modeled.
These variables are Structural Material Type, Ship Armament, Ship Length to Beam Ratio, Command and
Control, Installed Ship Horsepower, and Troop Capacity. These parameters were selected for three
reasons: (1) they were thought to have the most effect on total direct cost when compared to the other independent variables available in the model, (2) the mission profile of the JHSV, and
(3) to be used as proxy variables 9 for some dependent variables like ship’s internal density that are not a
direct input to or output from the model, but are interesting to observe in terms of the relationships
between them and design costs.
9 In statistics, a proxy variable is a variable that is probably not in itself of any real interest, but from which a variable of interest can be obtained. For a proxy variable to be valid, it must have a close correlation with the inferred value.
to ensure that the design can withstand both onboard and seaborne loads while underway; therefore, it
is important to be able to model and understand how material decisions affect overall design costs.
There are six structural material choices available in the LHSV Cost and Weight Estimation Model: (1)
Mild Steel, (2) High Strength Steel, (3) High Yield (HY) 80 Steel, (4) Aluminum (Al) 5454, (5) Aluminum
(Al) 5456, and (6) Eglass-Kevlar Sandwich. Each of these materials was modeled along with the other
five parameters as fixed and plotted versus total direct cost for each design type.
An important note when dealing with hull material type in the LHSV Cost and Weight Estimation Model
is that it is a categorical variable. In order to develop a valid ANN, it was necessary to capture model
parameters as continuous variables. For structural material type, the property yield strength, Table 4-3,
was observed as the continuous variable for changes in material type and those values were plotted
versus total direct cost, Figure 4-1.
Figure 4-1 Structural Material Type: total direct cost versus yield strength for the monohull and catamaran designs.
Changes to the yield strength cause nonlinear changes to direct cost. The spike in cost at a yield strength of 8 TSI is possibly due to the Eglass-Kevlar sandwich being difficult to fabricate and assemble
for ship applications. If this hull type was eliminated as a possible selection, then the data would show a
more linear trend with changes in yield strength. However, because composite materials and sandwich
panels are becoming more common in ship structural design, it was deemed important to include those
inflections and it will be interesting to see how the design space will be affected by the inclusion of the
Eglass-Kevlar sandwich material.
The catamaran design appears to cost slightly more at each value of yield strength. It is hypothesized that this difference results from the catamaran's baseline beam being larger than the monohull's, which offsets the catamaran's smaller length overall.
Table 4-4 Armament Configurations
Figure 4-2 shows the plot of ship armament weight for the monohull and catamaran designs expressed
in long tons versus total direct cost. The graph shows that an increase in ship armament weight
produces a linear to curvilinear change in total direct cost. It is difficult to infer what the direct cost
trend is from only three data points, but because armament is the only parameter that has any effect on
SWBS 700 direct cost, ship’s armament was included in the ANN training cases.
Figure 4-2 Ship's Armament: total direct cost ($74M to $86M) versus armament weight (0 to 5 LT) for the monohull and catamaran designs.
catamarans tend to be more rectangular or cube-shaped in appearance. The length to beam ratio range examined during the study is 3.23 to 3.77 with a fixed length of 340 feet.
Figure 4-3 shows a plot of total direct cost versus ship length to beam ratio for the monohull and
catamaran designs. The resulting trend is curvilinear and as the length to beam ratio decreases (i.e., the
beam increases), the slope of the curve seems to increase.
Using information provided in the JHSV AoA, five C4I variants were modeled for both the monohull and
catamaran designs and the total direct cost output was observed to see the resulting trend. Figure 4-4
shows how total direct cost varies with increases to command and control complexity. As command and
control component weight increases, the total direct cost increases in a non-linear fashion.
Figure 4-4 Command and Control Integration (C4I): total direct cost versus C4I complexity for the monohull and catamaran designs.
For the JHSV, the number of propulsion engines onboard has been fixed to four. The parameters engine
type and size are varied to determine effects on total direct cost. The two engine types available are
diesel and gas turbine. The engine size ranges from 12,200 to 33,800 horsepower. The LHSV Cost and
Weight Estimation Model is set up to switch from diesel to gas turbine at 13,000 horsepower. Figure 4-5
shows how total direct cost varies over four different engine configurations.
Figure 4-5 Total direct cost versus total installed horsepower (40,000 to 140,000 hp) for the monohull and catamaran designs.
The plot shows an other-than-linear trend in total direct cost as horsepower is increased for both designs. This reflects a combination of engine size and the change from diesel to gas turbine at 13,000 horsepower.
4.2.1.3.6 Troop Capacity
The primary mission of the JHSV is to transfer payload (i.e., troops and cargo) from one point to another.
To capture this parameter, the ability of the JHSV to carry troops was modeled over a range of 0 to 1000
personnel. The LHSV Cost and Weight Estimation Model assumes that there is a total of 20,000 square
feet of payload area available to a combination of troops and other cargo. When there are zero troops
onboard, the JHSV can carry 520 long tons of cargo. If the JHSV is required to carry between 0 and 500
troops, the ship can sustain itself underway for 15 days without returning to port or underway
replenishment. If the troop number is greater than 500, the ship converts to a short range operation
and can only sustain itself underway for a maximum of one-half day. Figure 4-6 shows the effect on
total direct cost versus number of troops onboard.
Figure 4-6 Troop Capacity: total direct cost ($72M to $88M) versus number of troops onboard (0 to 1,200) for the monohull and catamaran designs.
From 0 to 500 troops there is a linear increase in the total direct cost of the JHSV. There is also a smaller
linear increase in total direct cost when troop capacity ranges from greater than 500 to 1000 troops.
The larger increase from 0 to 500 occurs because the ship is required to sustain itself for 15 days. This requires increased levels of accommodations that are not required for a one-half day mission. Each
curve exhibits linear behavior but with a large step change when troop size increases to levels above
500.
be able to capture a complex design space is a valid way to map these types of parameters and their
relationships to design costs fairly easily and accurately.
The following dependent parameters were selected as inputs for training the ANN because they are often thought to be significant to overall design costs, yet it is hard to prove or determine how they affect final design costs until the design is completed and at the waterfront.
4.2.2.3 Displacement
The displacement is equal to the JHSV’s full load weight expressed in Long Tons.
4.2.2.4 Speed
The JHSV’s speed is calculated at full load using 90% of ship’s installed horsepower and is expressed in
knots.
Installed electrical power is generated by two diesel engines and depends on multiple parameters (e.g., HVAC, lighting) that vary with changes to the independent variable inputs.
10 A space-filling design that allows for a mixture of discrete and continuous variables.
Table 4-6 gives the six independent parameters that were discussed earlier in the chapter along with the variable ranges for each. It is important to note that although some of the independent variables were discrete inputs to the LHSV Cost and Weight Estimation Model, all outputs from the model were documented as continuous, and the continuous data was used to train the ANN.
For the monohull design, 304 test cases were used to capture the model. Capturing the design space
was an arbitrary process and was performed based on previous modeling experience and ANN
requirements. 64 of the cases were generated using a full factorial design while 240 of the cases were
generated through the stratified Latin hypercube. Figure 4-7 illustrates a four-dimensional plot of the
LHSV Cost and Weight Estimation Model design space with Command and Control Integration (C4I),
Length to Beam Ratio, and Armament on the principal axes. Each point in the design space represents a
model variant color coded based on the fourth dimension, installed horsepower. Other variable
combinations using the same plotting techniques show similar illustrations of the design space.
Figure 4-7 MONOHULL Training Cases
Each of the 304 training cases was modeled using the LHSV Cost and Weight Estimation Model, ensuring that all convergence criteria were met.
documented and formatted to be used to train the ANN.
4.3.2.2 Create the Network Object
The network object was generated using the selected neural network software package. The MATLAB neural network toolbox documentation provides methods showing how ANNs are developed and tested. Examples of the ANN code developed for the monohull study are given in Appendix B.
Several performance factors were checked to look for network robustness and suitability. The first was
the performance or mean square error between the input and target values. The network code for the
JHSV was set up to quit training if the mean square error reached 1e-5 (0.00001). Time to train was
also monitored and set at a maximum of 90 seconds to ensure that the network was not stuck in a local
minimum. The number of epochs was monitored and set to a maximum of 500 to ensure that the network did not overlearn or sit in a local minimum. Table 4-7 shows the average values for
each of the five network training runs over each of the different learning rates in stage one. Case
number four was chosen as the best because it exhibited the best performance overall and was refined
further in stage two.
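A hedged sketch of those stopping criteria as they would be set in the MATLAB neural network toolbox (the actual thesis code is given in Appendix B):

net.trainParam.goal = 1e-5;             % quit when mse reaches 1e-5
net.trainParam.time = 90;               % maximum training time, seconds
net.trainParam.epochs = 500;            % maximum number of epochs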
Table 4-7 MONOHULL Stage One ANN Test Results
Case # #HL # neuron Initialization Fcn Divide Fcn Perform Fcn Train Fcn Learning Rate Performance Time Epochs Max Error Stop Cause Regression
1 1 5 Nguyen-Widrow random mse trainlm 0.01 0.007392 1.6 29.2 0.5248 Validation 0.962
2 1 5 Nguyen-Widrow random mse trainlm 0.02 0.007388 1.8 35.8 0.641 Validation 0.96364
3 1 5 Nguyen-Widrow random mse trainlm 0.03 0.007596 2.8 52.4 0.5678 Validation 0.96144
4 1 5 Nguyen-Widrow random mse trainlm 0.04 0.007118 2.2 41.8 0.558 Validation 0.9643
5 1 5 Nguyen-Widrow random mse trainlm 0.05 0.00735 2 40 0.557 Validation 0.9597
6 1 5 Nguyen-Widrow random msereg trainbr 0.01 10.585 2 30 0.504589 Validation 0.9613
7 1 5 Nguyen-Widrow random msereg trainbr 0.02 10.784 1.6 26.6 0.5556555 Validation 0.960174
8 1 5 Nguyen-Widrow random msereg trainbr 0.03 10.142 2 31 0.6483978 Validation 0.96258
9 1 5 Nguyen-Widrow random msereg trainbr 0.04 11.048 2.2 41.8 0.5609586 Validation 0.961948
10 1 5 Nguyen-Widrow random msereg trainbr 0.05 11.308 1.4 28.8 0.5700494 Validation 0.961116
11 1 5 Nguyen-Widrow random mse traingdx variable 0.05818 4 237.2 0.8837896 Validation 0.70836
12 1 5 Nguyen-Widrow random mse traingda variable 0.0894 4.8 180.2 0.9931774 Validation 0.496474
13 1 5 Nguyen-Widrow random mse trainrp 0.01 0.01674 5 276.8 0.6643697 Validation 0.91928
14 1 5 Nguyen-Widrow random mse trainrp 0.02 0.01346 4.6 246 0.6844488 Validation 0.93231
15 1 5 Nguyen-Widrow random mse trainrp 0.03 0.018692 4.4 254 0.632178 Validation 0.90758
16 1 5 Nguyen-Widrow random mse trainrp 0.04 0.01772 3.4 201.2 0.6729779 Validation 0.917794
17 1 5 Nguyen-Widrow random mse trainrp 0.05 1.68E-02 3.2 184.4 0.5944972 Validation 0.916474
Table 4-8 MONOHULL Stage Two ANN Test Results
Case # #HL # neuron Initialization Fcn Divide Fcn Perform Fcn Train Fcn Learning Rate Performance Time Epochs Max Error Stop Cause Regression
1 1 5 Nguyen-Widrow random mse trainlm 0.04 2.37E-02 8 63 1 Validation 0.90214
2 1 6 Nguyen-Widrow random mse trainlm 0.04 6.52E-02 10 117 1 Validation 0.737
3 1 7 Nguyen-Widrow random mse trainlm 0.04 1.21E-03 4 37 0.1154709 Validation 0.99431
4 1 8 Nguyen-Widrow random mse trainlm 0.04 5.34E-04 9 77 0.1467501 Validation 0.99743
5 1 9 Nguyen-Widrow random mse trainlm 0.04 2.96E-02 12 94 1 Validation 0.894
6 1 10 Nguyen-Widrow random mse trainlm 0.04 1.60E-04 10 82 0.1035394 Validation 0.99899
7 1 11 Nguyen-Widrow random mse trainlm 0.04 4.61E-04 11 71 0.09505 Validation 0.99752
8 1 12 Nguyen-Widrow random mse trainlm 0.04 5.54E-02 1 11 0.9926527 Validation 0.817
9 1 13 Nguyen-Widrow random mse trainlm 0.04 6.03E-04 10 57 0.1034695 Validation 0.99705
10 1 14 Nguyen-Widrow random mse trainlm 0.04 1.12E-04 34 184 0.050714 Validation 0.99931
11 1 15 Nguyen-Widrow random mse trainlm 0.04 2.26E-04 20 103 0.0748757 Validation 0.99864
12 1 16 Nguyen-Widrow random mse trainlm 0.04 6.82E-05 69 222 0.0668017 Validation 0.99943
13 1 17 Nguyen-Widrow random mse trainlm 0.04 3.90E-04 12 58 0.0712554 Validation 0.9978
14 1 18 Nguyen-Widrow random mse trainlm 0.04 4.84E-05 40 97 0.0430514 Validation 0.9996
15 1 19 Nguyen-Widrow random mse trainlm 0.04 1.03E-04 20 78 0.0611152 Validation 0.99891
16 1 20 Nguyen-Widrow random mse trainlm 0.04 6.66E-05 41 141 0.0370584 Validation 0.99952
17 2 20-5 Nguyen-Widrow random mse trainlm 0.04 1.37E-02 8 37 0.9999019 Validation 0.93341
18 2 20-6 Nguyen-Widrow random mse trainlm 0.04 1.09E-03 7 23 0.1278056 Validation 0.99413
19 2 20-7 Nguyen-Widrow random mse trainlm 0.04 4.33E-05 90 231 0.0388927 Time 0.99939
20 2 20-8 Nguyen-Widrow random mse trainlm 0.04 3.50E-05 29 103 0.0424703 Validation 0.99931
21 2 20-9 Nguyen-Widrow random mse trainlm 0.04 2.56E-02 70 156 0.9867108 Validation 0.89664
22 2 20-10 Nguyen-Widrow random mse trainlm 0.04 1.61E-04 7 83 0.0927004 Validation 0.99894
23 2 20-11 Nguyen-Widrow random mse trainlm 0.04 9.41E-05 48 100 0.052265 Validation 0.99909
24 2 20-12 Nguyen-Widrow random mse trainlm 0.04 2.84E-04 49 65 0.1003609 Validation 0.99794
25 2 20-13 Nguyen-Widrow random mse trainlm 0.04 8.43E-05 56 109 0.0463584 Validation 0.99914
26 2 20-14 Nguyen-Widrow random mse trainlm 0.04 1.72E-04 20 31 0.0600308 Validation 0.99829
27 2 20-15 Nguyen-Widrow random mse trainlm 0.04 1.68E-04 36 31 0.0569424 Validation 0.99839
At this stage, performance and regression plots available in the MATLAB neural network toolbox were
used to allow the experimenter to visually understand how the network behaves during training. The
best performing network architecture in stage two was case 16. This network exhibits the best overall
performance and has the lowest maximum error. The performance plot for this network is given in
Figure 4-8. The training was relatively smooth. At no time during training does it appear that the network was caught in any local minima. Also, the test line shows that the network generalized the
LHSV Cost and Weight Estimation model well and there was no sign of over learning. The regression
plot, given in Figure 4-9, is another visual representation of how well the network generalized the
model. The overall R value, which includes the training, validation, and test cases, is 0.99959. This is a
very strong indication that the network is robust and is an adequate representation of the LHSV Cost
and Weight Estimation model.
Figure 4-8 MONOHULL performance plot (mean squared error versus epochs): best validation performance is 0.00016973 at epoch 71; training stopped at 77 epochs.
4.3.4 Simulate the Network Response to New Inputs
Now that the network has been trained, it can be used to simulate the entire design space. To ensure
that the entire space was encompassed, another experimental design was developed. Specifically, a
Latin hypercube was used to generate cases to be used to simulate the network response to new inputs.
Table 4-9 gives the input parameters with the chosen variable range for each. It should be noted that
the input parameters used during simulation must be the same as those that were used to train the model; otherwise, the results will not be valid.
Figure 4-10 shows a four-dimensional (Density, Displacement, Payload Fraction, and Speed)
representation of the monohull vessel design space. The number of simulated variants was 16,384,
which includes all 9 varied input parameters. The selection of this specific number of variants was somewhat arbitrary and based on only two factors: (1) the four-dimensional representation of the design space and (2) the processing abilities of software programs like Microsoft Excel.
Figure 4-10 MONOHULL Simulated Cases
The resulting variants, based on the input parameter ranges, are fed into the trained ANN, resulting in the project range estimates and density functions discussed in the following section.
(Figure: probability density function and cumulative distribution function of total direct cost for the simulated monohull design space; total direct cost spans approximately $1.0x10^8 to $4.5x10^8.)
4.3.6 Cost Mapping
Mapping costs to engineering decisions involved performing three statistical analysis techniques. These
are (1) single-factor sensitivity analysis on the trained neural network, (2) analysis of covariance and (3)
scatterplots of the simulated design space data. An explanation of the process is given in Chapter 3.
Before performing these analyses, a simple correlation analysis using the Spearman’s rank correlation
algorithm was performed on the design inputs using a statistical software package to ensure that there
was only minimal correlation between the input variables.
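A minimal sketch of that check using the MATLAB Statistics Toolbox (variable names are assumptions):

rho = corr(X, 'type', 'Spearman');      % rank correlations between inputs
maxPairwise = max(abs(rho(~eye(size(rho)))))  % largest off-diagonal value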
4.3.6.2 Single-factor Sensitivity Analysis
A single-factor sensitivity analysis was performed by running a nine-by-nine identity matrix through the trained monohull ANN. The identity matrix turns on each input parameter individually, and the output for each parameter is a weighted value that ranks the relative influence each input parameter has on each of the SWBS group direct costs. Figure 4-14 gives a stacked bar graph which shows each input parameter's normalized 11 effect on each of the SWBS 100-700 output costs.
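A sketch of this identity-matrix probe, assuming the trained network net maps nine normalized inputs to the seven SWBS group cost outputs:

S = sim(net, eye(9));                   % turn on one input at a time
Sn = mapminmax(S, 0, 1);                % scale each SWBS output row to [0,1]
bar(Sn', 'stacked');                    % stacked bar graph, one group per input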
According to the single-factor sensitivity results, ship’s internal density has an effect on more SWBS
group costs than any of the other input parameters. Several of the other parameters do have a
significant effect on some, but not all of the SWBS group costs. These include displacement, speed,
power density, armament, and command and control.
Figure 4-14 MONOHULL single-factor sensitivity results: each input parameter's normalized effect on SWBS 100 through SWBS 700 direct costs.
11 Each input weight value was normalized based on the minimum and maximum values of each SWBS output and ranked against the other parameter weight values.
4.3.6.3 Design Parameter to Cost Covariance Assessment
Besides the single-factor sensitivity analysis, which uses the trained model to map cost to engineering
decisions, the simulated variant input and output data can be used to provide a more quantifiable
determination of relationships using multivariate analysis of covariance techniques.
The monohull simulated design space data was analyzed using a simple covariance analysis to compare
the changes in the input variables with changes to total direct cost outputs to determine where
relationships exist and if these relationships are similar to what the single-factors sensitivity analysis
results state. The results of this analysis are given in Figure 4-15.
The results of the covariance analysis between total direct cost and each of the design parameters (Column #1) do not correspond with the results from the single-factor sensitivity analysis. As can be seen in Figure 4-15, total direct cost fluctuates the most with changes in the design's displacement. Displacement is followed by yield strength, command and control integration, L/B ratio, and speed. Internal density follows those parameters and shows itself to be more influential than only payload fraction, power density, and armament.
4.3.6.4 Scatterplots
Scatterplots are another analysis technique used to look at the simulated monohull design space; they accomplish two things when looking at the continuous data. First, the scatterplots help validate the previous analysis techniques by giving a visual representation of how total direct cost changes with changes in design parameters; second, they can help the design team understand how each design variant's parameters come together in the final design and where the cost is going. This information can be funneled back into the design spiral at the concept stage, where most decisions have not been determined and the design is still relatively fluid, allowing changes to be made without resulting in cost increases.
Figure 4-16 is a set of scatterplots which maps changes to each input parameter and its influence on
total direct cost for all 16,384 design variants. While it is very difficult to infer information from this plot
for each individual variant, the data point density allows the observer to see patterns that emerge for
each design parameter. For example, the command and control integration (C4I) plot shows a
significant increase in total direct cost as C4I complexity increases. For ship’s internal density (Dens), it
is not as obvious as C4I, but the point density increases in the upper ranges of total direct cost as
internal density increases.
Another way to examine the relationships is to use a smaller-sample scatterplot; an example, given in Figure 4-17, shows a sample of 100 variants from the original 16,384-variant population. The entire design space is not shown because the plot becomes densely filled to the point where individual variants cannot be read. From these plots, individual variants can be highlighted, and the nine input parameters can be studied to see how they differ in terms of each variant's total direct cost. For example,
Variant 47 is highlighted in Figure 4-17 with a red asterisk. This variant has a total direct cost of
approximately $400,000,000. It has low payload fraction, high internal density, low displacement, low
speed, low power density, high armament, low end yield strength, medium command and control
integration, and a low length to beam ratio. For the JHSV mission, the ship will most likely have to carry
maximum payload (high payload fraction) and be faster than the typical surface combatant. Based on
the results of the simulated design space, Variant 47 does not meet those requirements and could be
thrown out as a feasible design. Variant number 90 (Green X) from the scatter plot matrix seems to be
a more desirable design for the JHSV. It exhibits a high payload fraction, high speed, high power density,
and high command and control integration and would have a direct cost of approximately $180,000,000,
much lower than Variant 47.
going on in the model, unlike parametric model types, where assumptions (i.e., distribution, etc...) need
to be made in order to develop density functions. This is an important point to make because often,
design teams will budget projects based on a CDF which is formulated by using a point design with an
assumption that the distribution is a normal or bell-shaped curve. Assumptions like these introduce risk
that does not occur when using ANNs to model the design space.
Unlike the density functions, mapping design parameter choices to cost is not as clear and requires more
effort to determine relationships or correlations. The single-factor sensitivity results alone are abstract
in nature. They are difficult to analyze in a quantitative way because of the nature of the outputs. Each
parameter value from the trained ANN is a normalized weighting based on the maximum and minimum
values at each SWBS direct cost. What the sensitivity results are good for in this form is a visual,
qualitative way to see how each design parameter ranks versus the other chosen design parameters in
terms of influence on each SWBS group direct cost. The information gained from this analysis is
valuable, but cannot be used alone. Other quantitative analyses, like the covariance checks and the scatterplots, are needed to validate or refute those findings.
Performing the covariance analysis on the simulated data gave a more solid representation of how total
direct costs are influenced by the design parameters because it shows quantitatively how changes in
each design parameter affect the cost outputs. For the monohull, displacement seems to have the most
influence on direct cost in a negative way. In other words, as displacement decreases, cost increases.
This is followed by command and control, yield strength, and L/B ratio. For the monohull, those results
do not imitate the single-factor sensitivity ranking results. Another problem with the results is the
correlation strength between displacement, yield strength, and L/B ratio. Intuitively, costs should
increase as displacement increases. But because the Eglass-Kevlar sandwich composite material causes
a significant cost increase for less weight and yield strength, total costs increase when displacement
decreases for this design. The inclusion of the composite material skewed the results, and it would have been better to leave it out of the design.
Of all of the analysis methods, the scatterplots are the most robust and unambiguous. The data point
density displayed in all of the plots clearly shows what happens to total direct cost when the design
parameters are changed. The scatterplots are also an excellent tool to perform side by side
comparisons of different design variants and what the associated costs are so the design team can make
more informed decisions.
4.4 Catamaran Case
For the catamaran design, 304 test cases were used to capture the model design space. This number
was selected for the catamaran to have commonality with the monohull design and to validate the ANN
methodology’s reproducibility. 64 of those cases were generated using a full factorial design while 240
of the cases were generated through the stratified Latin hypercube. Figure 4-18 illustrates a four-
dimensional plot of the LHSV Cost and Weight Estimation Model design space with Command and
Control Integration (C4I), Length to Beam Ratio, and Armament on the principal axes. Each point in the
design space represents a model variant color coded based on the fourth dimension, installed
horsepower.
Figure 4-18 CATAMARAN Training Cases
Each of the 304 training cases was modeled using the LHSV Cost and Weight Estimation Model, ensuring all convergence criteria were met. The resulting variant costs and other parameter outputs were then documented and formatted to be used to train the ANN.
4.4.2.2 Create the Network Object
The network object was generated using the selected neural network software package. The MATLAB neural network toolbox documentation provides methods showing how ANNs are developed and tested. Examples of the ANN code developed for the catamaran study are given in Appendix B.
Several performance factors were checked to look for network robustness and suitability. The first was
the performance or mean square error between the input and target values. The network code for the JHSV was set up to quit training if the mean square error reached 1e-5 (0.00001). Time to train was also
monitored and set at a maximum of 90 seconds to ensure that the network was not stuck in a local
minimum. The number of epochs was monitored and set to a maximum of 500 to ensure that the network did not overlearn or sit in a local minimum. Table 4-11 shows the average values
for each of the five network training runs over each of the different learning rates in stage one. Case
number five was chosen as the best because it exhibited the best performance overall and was refined further in stage two.
Table 4-11 CATAMARAN Stage One ANN Test Results
Case # #HL # neuron Initialization Fcn Divide Fcn Perform Fcn Train Fcn Learning Rate Performance Time Epochs Max Error Stop Cause Regression
1 1 5 Nguyen-Widrow random mse trainlm 0.01 1.06E-02 2 52 0.524658 Validation 0.959046
2 1 5 Nguyen-Widrow random mse trainlm 0.02 6.64E-03 6 142.2 0.635 Validation 0.96951
3 1 5 Nguyen-Widrow random mse trainlm 0.03 7.96E-02 5.8 139 0.56556 Validation 0.705632
4 1 5 Nguyen-Widrow random mse trainlm 0.04 3.02E-02 3.4 78.6 0.565292 Validation 0.872582
5 1 5 Nguyen-Widrow random mse trainlm 0.05 2.15E-03 1.4 32.8 0.601117 Validation 0.989158
6 1 5 Nguyen-Widrow random msereg trainbr 0.01 2.75E+00 2 43.6 0.559656 Validation 0.988738
7 1 5 Nguyen-Widrow random msereg trainbr 0.02 2.81E+00 1.4 30.2 0.655373 Validation 0.988766
8 1 5 Nguyen-Widrow random msereg trainbr 0.03 2.80E+00 1.4 36 0.034022 Validation 0.988568
9 1 5 Nguyen-Widrow random msereg trainbr 0.04 2.85E+00 2 49.4 0.559639 Validation 0.9889
10 1 5 Nguyen-Widrow random msereg trainbr 0.05 2.76E+00 2.8 62.8 0.578442 Validation 0.988894
11 1 5 Nguyen-Widrow random mse traingdx variable 3.87E-02 3.2 247.2 0.814899 Validation 0.78874
12 1 5 Nguyen-Widrow random mse traingda variable 5.65E-02 2.6 210 0.888613 Validation 0.639008
13 1 5 Nguyen-Widrow random mse trainrp 0.01 4.68E-03 16.6 470.8 0.248525 Validation 0.97656
14 1 5 Nguyen-Widrow random mse trainrp 0.02 5.31E-03 6 391.4 0.280501 Validation 0.972676
15 1 5 Nguyen-Widrow random mse trainrp 0.03 6.82E-03 4.6 319.6 0.499852 Validation 0.945378
16 1 5 Nguyen-Widrow random mse trainrp 0.04 8.74E-03 4.4 308.8 0.534069 Validation 0.950898
17 1 5 Nguyen-Widrow random mse trainrp 0.05 3.86E-03 5.6 446.2 0.259903 Validation 0.980592
Table 4-12 CATAMARAN Stage Two ANN Test Results
Case # #HL # neuron Initialization Fcn Divide Fcn Perform Fcn Train Fcn Learning Rate Performance Time Epochs Max Error Stop Cause Regression
1 1 5 Nguyen-Widrow random mse trainlm 0.05 2.07E-03 4 38 0.210707 Validation 0.98857
2 1 6 Nguyen-Widrow random mse trainlm 0.05 5.88E-04 4 65 0.08456 Validation 0.99708
3 1 7 Nguyen-Widrow random mse trainlm 0.05 4.24E-04 11 79 0.081326 Validation 0.99795
4 1 8 Nguyen-Widrow random mse trainlm 0.05 2.50E-04 9 122 0.085013 Validation 0.99863
5 1 9 Nguyen-Widrow random mse trainlm 0.05 5.47E-04 2 29 0.090408 Validation 0.9973
6 1 10 Nguyen-Widrow random mse trainlm 0.05 5.23E-04 2 25 0.104345 Validation 0.99709
7 1 11 Nguyen-Widrow random mse trainlm 0.05 1.01E-04 12 159 0.063507 Validation 0.99944
8 1 12 Nguyen-Widrow random mse trainlm 0.05 4.82E-05 65 310 0.036351 Validation 0.99973
9 1 13 Nguyen-Widrow random mse trainlm 0.05 5.43E-05 4 45 0.048216 Validation 0.99969
10 1 14 Nguyen-Widrow random mse trainlm 0.05 4.01E-05 32 189 0.031786 Validation 0.99975
11 1 15 Nguyen-Widrow random mse trainlm 0.05 3.77E-04 6 45 0.084094 Validation 0.99811
12 1 16 Nguyen-Widrow random mse trainlm 0.05 2.51E-05 10 73 0.030089 Validation 0.99983
13 1 17 Nguyen-Widrow random mse trainlm 0.05 1.49E-05 62 190 0.020498 Validation 0.99989
14 1 18 Nguyen-Widrow random mse trainlm 0.05 3.20E-05 45 125 0.039104 Validation 0.99976
15 1 19 Nguyen-Widrow random mse trainlm 0.05 9.85E-06 60 140 0.019392 Performance 0.99994
16 1 20 Nguyen-Widrow random mse trainlm 0.05 1.05E-05 47 103 0.019174 Performance 0.99993
17 2 20-5 Nguyen-Widrow random mse trainlm 0.05 1.30E-03 20 44 0.186766 Validation 0.99333
18 2 20-6 Nguyen-Widrow random mse trainlm 0.05 9.86E-05 58 106 0.051668 Validation 0.99943
19 2 20-7 Nguyen-Widrow random mse trainlm 0.05 2.79E-05 49 61 0.023106 Validation 0.9998
20 2 20-8 Nguyen-Widrow random mse trainlm 0.05 7.42E-05 61 106 0.035246 Validation 0.99947
21 2 20-9 Nguyen-Widrow random mse trainlm 0.05 1.60E-03 20 33 0.23764 Validation 0.97069
22 2 20-10 Nguyen-Widrow random mse trainlm 0.05 1.83E-05 90 125 0.028602 Time 0.99979
23 2 20-11 Nguyen-Widrow random mse trainlm 0.05 3.51E-05 66 81 0.028478 Validation 0.9997
24 2 20-12 Nguyen-Widrow random mse trainlm 0.05 9.66E-06 62 98 0.02105 Performance 0.99991
25 2 20-13 Nguyen-Widrow random mse trainlm 0.05 7.35E-05 43 37 0.058187 Validation 0.99933
26 2 20-14 Nguyen-Widrow random mse trainlm 0.05 9.80E-06 82 104 0.020971 Performance 0.99979
27 2 20-15 Nguyen-Widrow random mse trainlm 0.05 1.99E-04 27 27 0.062105 Validation 0.99846
At this stage, performance and regression plots available in the MATLAB neural network toolbox were
used to allow the experimenter to visually understand how the network behaves during training. The
best performing network architecture in stage two was Case 15. This network exhibits the best overall
performance and has the lowest maximum error. The performance plot for this network is given in Figure 4-19. The training was very smooth. At no time during training does it appear that the network was caught in any local minima. The test line shows that the network generalized the LHSV Cost and Weight Estimation model well and there was no sign of overlearning. The regression plot, given in Figure 4-20, is another visual representation of how well the network generalized the model. The
overall R value, which includes the training, validation, and test cases, was 0.99994. This was a very
strong indication that the network was robust and an adequate representation of the LHSV Cost and
Weight Estimation model.
Figure 4-19 CATAMARAN performance plot (mean squared error versus epochs): best validation performance is 0.00010406 at epoch 70; training stopped at 76 epochs.
Table 4-13 gives the input parameters with the chosen variable range for each. It should be noted that
the input parameters used during simulation must be the same as those that were used to train the model; otherwise, the results will not be valid.
Figure 4-21 shows a four-dimensional representation of the catamaran vessel design space. The number of simulated variants was 16,384. This specific number of variants was selected because it adequately captures the model design space and does not exceed the processing abilities of some computer software packages.
Figure 4-21 CATAMARAN Simulated Cases
The resulting variants, based on the input parameter ranges, were fed into the trained ANN, resulting in the project range estimates and density functions discussed in the following section.
(Figure: probability density function and cumulative distribution function of total direct cost for the simulated catamaran design space; total direct cost spans approximately $1.0x10^8 to $4.0x10^8.)
4.4.6 Cost Mapping
Mapping costs to engineering decisions involved performing three statistical analysis techniques. These
are (1) single-factor sensitivity analysis on the trained neural network, (2) analysis of covariance and (3)
scatterplots of the simulated design space data. An explanation of the process is given in Chapter 3.
Before performing these analyses, a simple correlation analysis using the Spearman’s rank correlation
algorithm was performed on the design inputs using a statistical software package to ensure that there
was only minimal correlation between the input variables.
4.4.6.2 Single-factor Sensitivity Analysis
A single-factor sensitivity analysis was performed by running a nine-by-nine identity matrix through the trained catamaran ANN. The identity matrix turns on each input parameter individually, and the output for each parameter is a weighted value that ranks the relative influence each input parameter has on each of the SWBS group direct costs. Figure 4-25 gives a stacked bar graph which shows each input parameter's normalized 12 effect on each of the SWBS 100-700 output costs.
According to the single-factor sensitivity results, ship’s displacement has an effect on more SWBS group
costs than any of the other input parameters. Several of the other parameters do have a significant
effect on some, but not all of the SWBS group costs. These include internal density, speed, power
density, armament, and command and control.
Figure 4-25 CATAMARAN single-factor sensitivity results: each input parameter's normalized effect on SWBS 100 through SWBS 700 direct costs.
12 Each input weight value was normalized based on the minimum and maximum values of each SWBS output and ranked against the other parameter weight values.
4.4.6.3 Design Parameter to Cost Covariance Assessment
Besides the single-factor sensitivity analysis, which uses the trained model to map cost to engineering
decisions, the simulated variant input and output data can be used to provide a more quantifiable
determination of relationships using multivariate analysis of covariance techniques.
The catamaran simulated design space data was analyzed using a simple covariance analysis to compare the changes in the input variables with changes to total direct cost outputs, to determine where relationships exist and whether these relationships are similar to what the single-factor sensitivity analysis results state. The results of this analysis are given in Figure 4-26.
The results of the covariance analysis between total direct cost and each of the design parameters (Column #1) correspond better with the results from the single-factor sensitivity analysis for the catamaran than they did for the monohull design. As can be seen in Figure 4-26, total direct cost fluctuates the most with changes in the design's displacement. Displacement is followed by yield strength, command and control integration, L/B ratio, and speed. Internal density follows those parameters and shows itself to be more influential than only payload fraction, power density, and armament.
4.4.6.4 Scatterplots
Scatterplots are another analysis technique used to look at the simulated catamaran design space; they accomplish two things when looking at the continuous data. First, the scatterplots help validate the previous analysis techniques by giving a visual representation of how total direct cost changes with changes in design parameters; second, they can help the design team understand how each design variant's parameters come together in the final design and where the cost is going. This information can be funneled back into the design spiral at the concept stage, where most decisions have not been determined and the design is still relatively fluid, allowing changes to be made without resulting in cost increases.
Figure 4-27 is a set of scatterplots which maps changes to each input parameter and its influence on
total direct cost for all 16,384 design variants. While it is very difficult to infer information from this plot
for each individual variant, the data point density allows the observer to see patterns that emerge for
each design parameter. For example, the command and control integration (C4I) plot shows a
significant increase in total direct cost as C4I complexity increases. For ship’s internal density (Dens), it
is not as obvious as C4I, but the point density increases in the upper ranges of total direct cost as
internal density increases.
Scatter plots can also be used to analyze and compare a smaller sample of variants. An example, given
in Figure 4-28, shows a sample of 100 variants from the original 16,384 variant population produced
from the simulated catamaran design space. The entire design space is not shown because the plot
becomes densely filled to the point where it cannot be read. From these plots, individual variants can be highlighted, and the nine input parameters can be studied to see how they differ in terms of each variant's total direct cost. Variant 16,001 is highlighted in Figure 4-28 with a red asterisk. This variant, with a
total direct cost of approximately $260,000,000, has low payload fraction, high internal density, low
displacement, low speed, medium power density, high armament, low end yield strength, medium
command and control integration, and a low length to beam ratio. For the JHSV mission, the design will
most likely have to carry maximum payload (high payload fraction) and be faster than the typical surface
combatant. Based on the results of the simulated design space, variant 16,001 does not meet those
requirements and could be thrown out as a feasible design. Variant number 4,251 (Green X) from the
scatter plot matrix seems to be a more feasible design for the JHSV. It exhibits a high payload fraction, high speed, high power density, and high command and control integration, and would have a direct cost of approximately $210,000,000, much lower than variant 16,001.
Unlike the density functions, mapping design parameter choices to cost is not as clear and requires more
effort to determine relationships or correlations. The single-factor sensitivity results alone are abstract
in nature. They are difficult to analyze in a quantitative way because of the nature of the outputs. Each
parameter value from the trained ANN is a normalized weighting based on the maximum and minimum
values at each SWBS direct cost. What the sensitivity results are good for in this form is a visual,
qualitative way to see how each design parameter ranks versus the other chosen design parameters in
terms of influence on each SWBS group direct cost. The information gained from this analysis is valuable, but cannot be used alone. Other quantitative analyses, like the covariance checks and the scatterplots, are needed to validate or refute those findings.
Performing the covariance analysis on the simulated data gave a more solid representation of how total
direct costs are influenced by the design parameters because it shows quantitatively how changes in
each design parameter affect the cost outputs. For the catamaran, displacement seems to have the
most influence on direct cost in a negative way. In other words, as displacement decreases, cost
increases. This is followed by command and control, yield strength, and L/B ratio. For the monohull, those results did not mirror the single-factor sensitivity ranking results. A problem with the results is the correlation strength between displacement, yield strength, and L/B ratio. Intuitively, costs should increase as displacement increases. But because the Eglass-Kevlar sandwich composite material causes a significant cost increase for less weight and yield strength, total costs increase when displacement decreases for this design. The inclusion of the composite material skewed the results, and it would have been better to leave it out of the design.
Of all of the analysis methods, the scatterplots are the most robust and unambiguous. The data point
density displayed in all of the plots clearly shows what happens to total direct cost when the design
parameters are changed. The scatterplots are also an excellent tool to perform side by side
comparisons of different design variants and what the associated costs are so the design team can make
more informed decisions.
Unlike the monohull design, all three analysis results show that the design parameter displacement has
the strongest correlation to changes in direct costs.
5 Conclusion
Figure 5-1 Cumulative distribution functions of total direct cost for the monohull and catamaran designs (total direct cost spanning approximately $1.0x10^8 to $4.5x10^8).
Figure 5-1 shows the CDFs for both designs and it can be seen that both curves exhibit the same
behaviors with a change in slopes at a direct cost of approximately $200 million. However, the “knee” in
the curve occurs for each design at a different cumulative probability with the catamaran occurring at 80
percent while the monohull’s occurs at 70 percent.
Based on the cost data extracted from the simulated design space, the catamaran is the best design for
the JHSV based on cost and risk factors. The catamaran variant will accomplish the same mission
functions as the monohull but has the possibility to incur $35 million less in direct costs than the
monohull design. Also, there is a 10 percent greater chance that the direct cost of the catamaran will be
less than or equal to $200 million than for the monohull design.
5.1.2 ANNs Enable Robust, Efficient Modeling of the Design Space
For decision-makers, a range estimate with an understanding of the certainty of how likely it is to occur
within that range is generally more useful than a point estimate (Cost Estimating Handbook, 2005). The
ANNs developed for this thesis effectively capture the entire model design space by generating a
comprehensive set of model variants which can be used to make more informed design decisions as
opposed to point estimates which are often used in the ship design process. The number of variants
that can be modeled is limited only by design team requirements.
The design’s PDF and CDF give and accurate representation of how direct costs will vary with changes to
input parameters and because of the number of simulated cases, the smoothness of the generated
curves allows for a more accurate understanding of the design space when small iterations are made to
the design. Also, because the curves are non-parametric in nature, the ANN output data and curves are
a more realistic representation of what is going on in the model, unlike parametric model types, where
assumptions (i.e., distribution, etc...) need to be made in order to develop density functions. This is an
important point to make because often, design teams will budget projects based on a CDF which is
formulated by using a point design with an assumption that the distribution is a normal or bell-shaped
curve. Assumptions like these introduce risk that does not occur when using ANNs to model the design
space.
5.1.3 Potential Impacts of Cost Mapping and Risk Information on the Ship Design Process
The ability to extract relationships between engineering decisions and project direct costs holds promise
for improved management of the planning and design processes in ship design. It will enable project
stakeholders to make better informed decisions when specifying initial design parameters, and prevent
design decisions from being made without merit when those decisions have the potential to significantly
impact design costs. Also, with the newfound awareness of the potential impacts of engineering
decisions early in the design process, information from future designs can be compiled in a manner to
provide real case data to develop more robust ANN models.
In addition, knowledge of the cost / probability functions will enable project stakeholders to better
estimate the risk of cost variance from the initial conceptual estimate, facilitating the budgeting process
and helping promote "tighter" design control when the risk of cost variance is unacceptably high.
Together, these project cost control tools can help improve project performance in terms of cost by
providing quantitative data previously available to project stakeholders only through heuristics.
5.1.4 Applicability and Validity of Cost Mapping Methodology
Mapping costs to engineering inputs is a valid approach as long as several shortfalls, which were not avoided during this research, are addressed. Based on both case studies, the three data analyses performed did an adequate job of showing the positive and negative influences each design parameter had on direct costs. Having five of the six analyses (monohull and catamaran combined) show that changes in displacement have the largest influence on direct cost is a fair indication that the methodology is valid when all three analyses are used.
A shortfall demonstrated during this research is the inclusion of input variables that have strong correlations with one another; such variables should be identified and avoided. Displacement had the most
influence on direct costs, but the results were counter-intuitive because as displacement decreased, the
cost increased. This occurred because of the relationships between Displacement, Yield Strength, and
Length to Beam Ratio. Because the composite material is much more expensive than the other material
options for less weight and lower yield strength, the model results show that positive changes in
displacement cause strong negative changes in direct costs. This can be misleading because in real
practice, increases in a ship’s displacement usually cause increases to a design’s cost.
There were other correlations between some of the other variables that could negatively influence the
final analysis results; these are not discussed in detail. They include the relationship between speed and
internal density for both designs. This correlation makes sense because as speed increases, one of two
things is likely happening: either the ship is getting smaller, which allows smaller propulsion equipment
but causes internal density to increase, or more power is being installed to reach the required speeds for
a larger vessel, which also causes internal density to increase.
In conclusion, even though the mapping results displayed in this research are not as clear as they need
to be for a design team to use effectively, the techniques themselves are valid and will provide better
insight if the modelers have a better understanding of the correlations between the chosen design
parameter inputs. It is recommended that this be accomplished by analyzing a smaller number of
design parameters at a time, such as pairs, to see how they compare; a sketch of such a screening
appears below. For instance, the design team could look at internal density versus power density, which
have a very small correlation between them, to see which has more influence on direct costs. Using that
information, the design team can break the most influential parameter down into its parts and perform
another mapping to see which of those smaller components has the most influence on cost. These
iterations can be performed until the designers determine that enough information has been gathered
and valid decisions can be made.
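A minimal sketch of this pairwise correlation screening (an illustration, not part of the thesis tooling), assuming Pmono is the 9 x 304 design-parameter matrix read in by the Appendix B scripts and using a hypothetical screening threshold of 0.7:

R = corrcoef(Pmono');
%9 x 9 matrix of pairwise correlations between the design parameters
[i,j] = find(triu(abs(R) > 0.7, 1));
%Hypothetical 0.7 threshold; lists strongly correlated parameter pairs
disp([i j]);
%Pairs to examine, decouple, or compare two at a time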
5.2 Application and Future Work
5.2.1 Continued Refinement of ANN Performance
The first area of future research is to continue to experiment with additional configurations of ANN
parameters and architectures, along with additional research into developing representative sample
sets using clean, simulated data. While the purpose of this research was to demonstrate the concept of
mapping costs to engineering decisions, additional refinement and development of theory could lend
substantial benefit to this type of work. Future research should also include testing and validation of
ANN models and cost mapping techniques using real data.
6 Bibliography
Araokar, S. (n.d.). A Technical Overview of Artificial Neural Networks.
Bishop, C. (1995). Neural Networks for Pattern Recognition. Oxford University Press.
Choudhury, A. (2009). Statistical Correlation. Retrieved February 14, 2010, from Experiment-Resources:
http://www.experiment-resources.com/statistical-correlation.html
Demuth, H., Beale, M., & Hagan, M. (2009, December 11). Neural Network Toolbox 6 User's Guide.
Natick, Massachusetts, USA.
Foster, J. J., Barkus, E., & Yavorsky, C. (2006). Understanding And Using Advanced Statistics. Thousand
Oaks: SAGE Publications.
Gates, J., & Greenberg, M. (2006, April). Defense Acquisition University Teaching Notes. Retrieved
September 23, 2009, from Defense Acquisition University:
https://acc.dau.mil/CommunityBrowser.aspx?id=30373
Griendling, K. A., Balestrini-Robinson, S., & Mavris, D. N. (2008). DoDAF Based System Architecture
Selection using a Comprehensive Modeling Process and Multi-Criteria Decision Making. Multidisciplinary
Analysis and Optimization Conference.
Hagan, M. T., Demuth, H. B., & Beale, M. H. (2004). Neural Network Design. Boulder: University of
Colorado.
Hornik, K. M., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal
approximators. Neural Networks, 2(5), 359-366.
Jones, R. R., Jeffers, M. F., & Greenberg, M. W. (n.d.). Performance-Based Cost Models. Carderock
Technical Digest.
Linear Neural Networks. (2009, December). Retrieved December 14, 2009, from Willamette University:
www.willamette.edu/~gorr/classes/cs449/linear2.html
Naval Surface Warfare Carderock Division, Combatant Craft Division. (2009, December 13). Retrieved
December 13, 2009, from NAVSEA Warfare Centers: www.boats.dt.navy.mil
Orr, G. (1999). Neural Networks. Retrieved March 06, 2010, from Willamette University:
http://www.willamette.edu/~gorr/classes/cs449/intro.html
Pearce, A. R. (1997). Cost-Based Risk Prediction and Identification of Project Cost Drivers Using Artificial
Neural Networks.
Proust, M. (2008). Design of Experiments. Cary: SAS Institute Incorporated.
Lamb, T. (Ed.). (2003). Ship Design and Construction. Jersey City: The Society of Naval Architects and
Marine Engineers.
A Artificial Neural Network Theory
A.1 Notation
Standard mathematical notation and architectural representations for artificial neural networks have
not yet been established. In addition, papers and books on neural networks have come from many
diverse fields, including engineering, physics, psychology, and mathematics, and many authors tend to
use jargon specific to their disciplines, making it difficult to read and decipher otherwise simple
concepts. For this thesis, the notation was adopted from the textbook Neural Network Design by Hagan,
Demuth, and Beale (2004).
Relating this simple model back to the biological neuron discussed in Chapter 2, the weight w
corresponds to synapse strength, the cell body is characterized by the summation and the transfer
function, and the neuron output a represents the signal on the axon.
Figure A-1 Single-Input Neuron (Hagan, Demuth, & Beale, 2004)
The bias is much like the weight and is added to the product of the weight and input. The bias can be
given an arbitrary value initially and altered as necessary while training the network.
Note that w and b are both adjustable scalar parameters of the neuron. Typically the transfer function is
chosen by the network designer, and the parameters w and b are adjusted by a learning rule1 so that the
neuron input / output relationship meets some specific objective (Hagan, Demuth, & Beale, 2004).
1 A learning rule is a procedure for modifying the weights and biases of a network.
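As a worked illustration, the following sketch evaluates the single-input neuron of Figure A-1 with an assumed log-sigmoid transfer function and arbitrary parameter values:

w = 3; b = -1.5; p = 2;
%Arbitrary scalar weight, bias, and input
n = w*p + b;
%Net input: product of weight and input, plus the bias
a = logsig(n);
%Neuron output with a log-sigmoid transfer function, a = 1/(1+exp(-n))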
The Hard Limit transfer function sets the output of the neuron to 0 if the function argument is less than
0, or 1 if the argument is greater than or equal to 0. This function is often used to create neurons that
classify inputs into two distinct categories.
The Linear transfer function has an output that is equal to its input: a = n. Neurons with this transfer
function are commonly used in adaptive linear neuron (ADALINE) networks.
The Log-Sigmoid transfer function takes the input (which may have any value between plus and minus
infinity) and squashes the output into the range 0 to 1. The log-sigmoid transfer function is commonly
used in multilayer networks that are trained using the backpropagation2 algorithm, in part because the
function is differentiable.
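The following sketch evaluates these three transfer functions over a range of net inputs, using their Neural Network Toolbox names:

n = -5:0.1:5;
%Range of net inputs
a_hard = hardlim(n);
%Hard limit: 0 if n < 0, 1 if n >= 0
a_lin = purelin(n);
%Linear: a = n
a_log = logsig(n);
%Log-sigmoid: squashes n into the range 0 to 1
plot(n,a_hard,n,a_lin,n,a_log);
legend('hardlim','purelin','logsig');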
Figure A-3 Multiple Input Neuron (Hagan, Demuth, & Beale, 2004)
The neuron has a bias b, which is summed with the weighted inputs to form the net input n:
2 Backpropagation is a generalization of the least mean squares (LMS) algorithm.
n = Wp + b,
where the matrix W for the single neuron case has only one row. Now the neuron can be written as:
a = f(Wp + b).
Artificial neural networks tend to be made up of several neurons, each with several inputs, and often
more than one layer of neurons, which can become quite complex if all of the necessary connections are
drawn. To prevent unnecessary ambiguity, the abbreviated notation of Figure A-4 will be used from this
point forward.
The input vector p is represented by the solid vertical bar at the left. The dimensions of p are displayed
below the variable as R x 1, indicating that the input is a single vector of R elements. These inputs go to
the weight matrix W, which has R columns but only one row in this single neuron case. A constant 1
enters the neuron as an input and is multiplied by a scalar bias b. The net input to the transfer function f
is n, which is the sum of the bias b and the product Wp. The neuron’s output a is a scalar in this case;
if there were more than one neuron, the network output would be a vector. Also, note that the number
of inputs to a network is set by the external specifications of the problem.
Figure A-5 Layer of S Neurons (Hagan, Demuth, & Beale, 2004)
It is common for the number of inputs to a layer to be different from the number of neurons (i.e. R ≠ S).
In addition, each neuron in the layer can have a transfer function that is different from the other
neurons in the layer.
Now consider a network with several layers. Each layer will have its own weight matrix W, its own bias
vector b, a net input vector n and an output vector a. Superscripts will be used to identify each layer.
Specifically, each variable will have a superscript identifying the number of the layer it is associated with
in the network. Thus, the weight matrix for the first layer is W1, and the weight matrix for the second
layer is W2. This notation is used in the three-layer network shown in Figure A-6.
As shown, there are R inputs, S1 neurons in the first layer, S2 neurons in the second layer, etc. Different
layers can have different numbers of neurons.
The outputs of layers one and two are the inputs for layers two and three. Thus, layer two can be
viewed as a one-layer network with R = S1 inputs, S2 neurons, and an S2 x S1 weight matrix W2. The
input to layer two is a1, and its output is a2.
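A minimal sketch of this layered notation, with dimensions assumed purely for illustration (R = 4 inputs, S1 = 3 and S2 = 2 neurons):

p = rand(4,1);
%Input vector, R x 1 with R = 4
W1 = rand(3,4); b1 = rand(3,1);
%First layer: S1 x R weight matrix and S1 x 1 bias vector
W2 = rand(2,3); b2 = rand(2,1);
%Second layer: S2 x S1 weight matrix and S2 x 1 bias vector
a1 = logsig(W1*p + b1);
%Output of layer one is the input to layer two
a2 = purelin(W2*a1 + b2);
%Network output, a vector because S2 > 1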
A layer whose output is the network output is called an output layer. The other layers in the network
are called hidden layers.
The problem specifications help define the proper network architecture: the number of network inputs
equals the number of problem inputs; the number of neurons in the output layer equals the number of
problem outputs; and the choice of output layer transfer function is at least partly determined by the
problem specification for the outputs (Hagan, Demuth, & Beale, 2004).
A.7 Backpropagation
Backpropagation is the generalization of the Widrow-Hoff or Delta learning rule to multiple layer
networks and nonlinear differentiable transfer functions. Input vectors and the corresponding target
vectors are used to train a network until it can approximate a function.
Standard backpropagation is a gradient descent algorithm, in which the network weights are moved
along the negative of the gradient of the performance function. The term backpropagation refers to the
way in which the gradient is computed for nonlinear multilayer networks.
Properly trained backpropagation networks exhibit the ability to give reasonable answers when
presented with inputs not previously seen by the model. In other words, a new input leads to an output
similar to the target outputs with which the network was previously trained. This ability to generalize
complex analysis models makes it possible to train an artificial neural network on a representative set of
input / target pairs and get valid results without training the network on all possible input / output pairs.
The multilayer perceptron, trained by the backpropagation algorithm or learning rule, is currently the
most widely used neural network and variations of this algorithm were explored during this study.
To identify the structure of a multilayer network, it is convenient to use the following shorthand
notation, where the inputs are followed by the number of neurons in each layer:
R – S1 − S2 − S3
Further explanation of these concepts, including examples of multilayer perceptrons used for pattern
classification and function approximation, is available in Chapter 11 of Neural Network Design by
Hagan, Demuth, and Beale.
3 Signals flow from the inputs, forward through any hidden units, eventually reaching the output units.
A.7.2 Function Approximation
The traditional method for creating a simplified model of a complex analysis tool is Response Surface
Methodology (RSM). In this method, a polynomial is regressed through a set of data determined through
Design of Experiments (DoE) techniques. For most problems a second-order form of the equation,
Equation 1, is sufficient. If this does not provide an acceptable regression it is possible to add higher-
order terms and make dependent variable transformations to improve the quality of the fit.
$$R = b_o + \sum_{i=1}^{k} b_i x_i + \sum_{i=1}^{k} b_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} b_{ij} x_i x_j + \varepsilon \qquad \text{(Equation 1)}$$
Neural Networks are a different form of regression for highly non-linear or discrete problems.
Fundamentally, Neural Networks are different only in form from Response Surface Methods. Neural
Networks are an alternative to Response Surface Methods in the creation of regression models for
problems where the polynomial representation of the Response Surface Equation does not perform
well.
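As a sketch of the Response Surface approach in Equation 1 (sample data here are hypothetical), a full quadratic polynomial can be regressed through a set of design points:

X = rand(50,3);
%Hypothetical design points: 50 cases, k = 3 factors
y = 2 + X*[1;-2;0.5] + 0.1*randn(50,1);
%Hypothetical responses
D = x2fx(X,'quadratic');
%Builds the constant, linear, interaction, and squared terms of Equation 1
b = regress(y,D);
%Least-squares estimates of the coefficients bo, bi, bij, and bii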
Consider the two-layer, 1-2-1 network shown in Figure A-8. The transfer function for the first layer is a
log-sigmoid and the transfer function for the second layer is linear:
$$f^1(n) = \frac{1}{1+e^{-n}} \qquad \text{and} \qquad f^2(n) = n.$$
The nominal values of the weights and biases for this network are:

$$w^1_{1,1} = 10, \quad w^1_{2,1} = 10, \quad b^1_1 = -10, \quad b^1_2 = 10, \quad w^2_{1,1} = 1, \quad w^2_{1,2} = 1, \quad b^2 = 0.$$
The network response for these parameters is shown in Figure A-9, which plots the network output a2
as the input p is varied over the range [-2, 2].
Figure A-9 Nominal Network Response (Hagan, Demuth, & Beale, 2004)
The response consists of two steps, one for each of the log-sigmoid neurons in the first layer. By
adjusting the network parameters, the shape and the location of each step can be modified, as shown in Figure A-10.
Figure A-10 Effect of Parameter Changes on Network Response (Hagan, Demuth, & Beale, 2004)
This example illustrates the flexibility of a multilayer network. Similar networks can be used
to approximate almost any function, as long as there is a sufficient number of neurons in the hidden
layer(s). In fact, it has been shown that two-layer networks, with sigmoid transfer functions in the
hidden layer and linear transfer functions in the output layer, can approximate virtually any function of
interest to any degree of accuracy, provided sufficiently many hidden units are available (Hornik,
Stinchcombe, & White, 1989).
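The response in Figure A-9 can be reproduced directly from the nominal parameter values assumed above; the following sketch sweeps the input over [-2, 2]:

w1 = [10; 10]; b1 = [-10; 10];
%First-layer weights and biases (nominal values)
w2 = [1 1]; b2 = 0;
%Second-layer weights and bias (nominal values)
p = -2:0.01:2;
%Input swept over the range [-2, 2]
a1 = logsig(w1*p + repmat(b1,1,numel(p)));
%Log-sigmoid first layer
a2 = purelin(w2*a1 + b2);
%Linear second layer; plot(p,a2) reproduces the two-step response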
A.7.3 Backpropagation Algorithm
For multilayer networks the output of one layer becomes the input to the following layer. The
equations that describe this operation are:

$$a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1}) \qquad \text{for } m = 0, 1, \ldots, M-1,$$

where M is the number of layers in the network. The neurons in the first layer receive external inputs:

$$a^0 = p,$$

which provides the starting point. The outputs of the neurons in the last layer are considered the
network outputs:

$$a = a^M.$$
The backpropagation algorithm is given a set of examples of proper network behavior,
$\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}$, where $p_q$ is an input to the network and $t_q$ is the
corresponding target output. As each input is applied to the network, the network output is compared
to the target value. The backpropagation algorithm then adjusts the network parameters in order to
minimize the mean square error:

$$F(x) = E[e^2] = E[(t-a)^2],$$

where x is the vector of network weights and biases. If the network has multiple outputs this
generalizes to

$$F(x) = E[e^T e] = E[(t-a)^T (t-a)].$$

The algorithm approximates the mean square error by

$$\hat{F}(x) = (t(k)-a(k))^T (t(k)-a(k)) = e^T(k)\, e(k),$$

where the expectation of the squared error has been replaced by the squared error at iteration k.
The steepest descent (gradient descent) algorithm4 for the approximate mean square error is:

$$w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha \frac{\partial \hat{F}}{\partial w^m_{i,j}}, \qquad b^m_i(k+1) = b^m_i(k) - \alpha \frac{\partial \hat{F}}{\partial b^m_i},$$

where $\alpha$ is the learning rate.
If the learning rate $\alpha$ is too small, convergence to the optimal solution is slow. Conversely, if
$\alpha$ is too large, the algorithm may end up bouncing around the surface and actually diverge from
the optimal solution, Figure A-12.
4 Steepest descent takes steps proportional to the negative of the gradient of the function at the current point.
Figure A-12 Fast Learning Rate (Linear Neural Networks, 2009)
For steepest descent there are two general methods for determining the learning rate. One approach is
to minimize the performance index F(x) with respect to $\alpha$ at each iteration, called an adaptive
learning rate. The other method is to select a fixed value and decrease or increase it as necessary
during network optimization.
Because the error is an indirect function of the weights in the hidden layers, it is necessary to use the
chain rule of calculus to calculate the derivatives. To review the chain rule, suppose the function f is an
explicit function of the variable n, and n is in turn a function of a third variable w. The goal is to take the
derivative of f with respect to w. The chain rule is:

$$\frac{d f(n(w))}{d w} = \frac{d f(n)}{d n} \times \frac{d n(w)}{d w}.$$

For example, if

$$f(n) = e^n \quad \text{and} \quad n = 2w, \ \text{so that} \ f(n(w)) = e^{2w},$$

then

$$\frac{d f(n(w))}{d w} = \frac{d f(n)}{d n} \times \frac{d n(w)}{d w} = (e^n)(2).$$

The chain rule concept is used to find the derivatives of $\hat{F}$ with respect to the weights and biases:
$$\frac{\partial \hat{F}}{\partial w^m_{i,j}} = \frac{\partial \hat{F}}{\partial n^m_i} \times \frac{\partial n^m_i}{\partial w^m_{i,j}}, \qquad \frac{\partial \hat{F}}{\partial b^m_i} = \frac{\partial \hat{F}}{\partial n^m_i} \times \frac{\partial n^m_i}{\partial b^m_i}.$$

The second term in each of these equations can be easily computed, since the net input to layer m is an
explicit function of the weights and bias in that layer:

$$n^m_i = \sum_{j=1}^{S^{m-1}} w^m_{i,j} a^{m-1}_j + b^m_i,$$
therefore

$$\frac{\partial n^m_i}{\partial w^m_{i,j}} = a^{m-1}_j, \qquad \frac{\partial n^m_i}{\partial b^m_i} = 1.$$

Now define

$$s^m_i \equiv \frac{\partial \hat{F}}{\partial n^m_i}$$

as the sensitivity of $\hat{F}$ to changes in the ith element of the net input at layer m. The above
equations can then be simplified to:

$$\frac{\partial \hat{F}}{\partial w^m_{i,j}} = s^m_i\, a^{m-1}_j, \qquad \frac{\partial \hat{F}}{\partial b^m_i} = s^m_i,$$

and the approximate steepest descent updates become:

$$w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha\, s^m_i\, a^{m-1}_j, \qquad b^m_i(k+1) = b^m_i(k) - \alpha\, s^m_i,$$
or in matrix form:

$$W^m(k+1) = W^m(k) - \alpha\, s^m (a^{m-1})^T, \qquad b^m(k+1) = b^m(k) - \alpha\, s^m,$$

where

$$s^m \equiv \frac{\partial \hat{F}}{\partial n^m} = \begin{bmatrix} \partial \hat{F} / \partial n^m_1 \\ \partial \hat{F} / \partial n^m_2 \\ \vdots \\ \partial \hat{F} / \partial n^m_{S^m} \end{bmatrix}.$$
A Jacobian matrix5 is used to derive the recurrence relationship for the sensitivities:

$$\frac{\partial n^{m+1}}{\partial n^m}.$$

The next step involves finding an expression for this matrix. Consider the i, j element of the matrix:
$$\frac{\partial n^{m+1}_i}{\partial n^m_j} = \frac{\partial \left( \sum_{l=1}^{S^m} w^{m+1}_{i,l} a^m_l + b^{m+1}_i \right)}{\partial n^m_j} = w^{m+1}_{i,j} \frac{\partial a^m_j}{\partial n^m_j} = w^{m+1}_{i,j} \frac{\partial f^m(n^m_j)}{\partial n^m_j} = w^{m+1}_{i,j}\, \dot{f}^m(n^m_j),$$

where $\dot{f}^m(n^m_j) = \partial f^m(n^m_j) / \partial n^m_j$. Therefore the Jacobian matrix can be written as:

$$\frac{\partial n^{m+1}}{\partial n^m} = W^{m+1} \dot{F}^m(n^m),$$

where

$$\dot{F}^m(n^m) = \begin{bmatrix} \dot{f}^m(n^m_1) & 0 & \cdots & 0 \\ 0 & \dot{f}^m(n^m_2) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \dot{f}^m(n^m_{S^m}) \end{bmatrix}.$$
5 In vector calculus, the Jacobian matrix is the matrix of all first-order partial derivatives of a vector-valued function.
Now the recurrence relation for the sensitivity can be written out by using the chain rule in matrix form:

$$s^m = \frac{\partial \hat{F}}{\partial n^m} = \left( \frac{\partial n^{m+1}}{\partial n^m} \right)^T \frac{\partial \hat{F}}{\partial n^{m+1}} = \dot{F}^m(n^m)\,(W^{m+1})^T \frac{\partial \hat{F}}{\partial n^{m+1}} = \dot{F}^m(n^m)\,(W^{m+1})^T s^{m+1}.$$
Now it can be seen where the backpropagation algorithm derives its name. The sensitivities are
propagated backward through the network from the last layer to the first layer:

$$s^M \rightarrow s^{M-1} \rightarrow \cdots \rightarrow s^2 \rightarrow s^1.$$
There is one more step in order to complete the backpropagation algorithm. There needs to be an
initial value, $s^M$, for the recurrence relation. This value is obtained from the final layer in the network:

$$s^M_i = \frac{\partial \hat{F}}{\partial n^M_i} = \frac{\partial (t-a)^T (t-a)}{\partial n^M_i} = \frac{\partial \sum_{j=1}^{S^M} (t_j - a_j)^2}{\partial n^M_i} = -2(t_i - a_i) \frac{\partial a_i}{\partial n^M_i}.$$
Now, since

$$\frac{\partial a_i}{\partial n^M_i} = \frac{\partial f^M(n^M_i)}{\partial n^M_i} = \dot{f}^M(n^M_i),$$

it can be written as

$$s^M_i = -2(t_i - a_i)\, \dot{f}^M(n^M_i).$$
A.7.7 Summary
1. Propagate the input forward through the network:

$$a^0 = p,$$
$$a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1}) \qquad \text{for } m = 0, 1, \ldots, M-1,$$
$$a = a^M.$$
2. Propagate the sensitivities backward through the network:

$$s^M = -2\, \dot{F}^M(n^M)\,(t - a),$$
$$s^m = \dot{F}^m(n^m)\,(W^{m+1})^T s^{m+1} \qquad \text{for } m = M-1, \ldots, 2, 1.$$
3. Weights and biases are updated using the approximate steepest descent rule:

$$W^m(k+1) = W^m(k) - \alpha\, s^m (a^{m-1})^T,$$
$$b^m(k+1) = b^m(k) - \alpha\, s^m.$$
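A minimal sketch of one such iteration for a 1-2-1 network with a log-sigmoid hidden layer and a linear output layer (the scalar input, target, and learning rate are assumed values):

alpha = 0.1; p = 1; t = 1.5;
%Assumed learning rate, input, and target
W1 = randn(2,1); b1 = randn(2,1);
W2 = randn(1,2); b2 = randn;
%Small random initial weights and biases
a1 = logsig(W1*p + b1);
a2 = W2*a1 + b2;
%Step 1: propagate the input forward
s2 = -2*(t - a2);
%Step 2: output-layer sensitivity; the linear layer has derivative 1
s1 = (a1.*(1 - a1)).*(W2'*s2);
%Hidden-layer sensitivity; the log-sigmoid derivative is a(1 - a)
W2 = W2 - alpha*s2*a1'; b2 = b2 - alpha*s2;
W1 = W1 - alpha*s1*p'; b1 = b1 - alpha*s1;
%Step 3: approximate steepest descent updates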
A.7.8 Generalization
One major problem with the approach outlined above is that it doesn't actually minimize the error that
we are really interested in - which is the expected error the network will make when new cases are
submitted to it. In other words, the most desirable property of a network is its ability to generalize to
new cases. In reality, the network is trained to minimize the error on the training set, and short of
having a perfect and infinitely large training set, this is not the same thing as minimizing the error on the
real error surface - the error surface of the underlying and unknown model (Bishop, 1995).
For a network to be able to generalize, it should have fewer parameters than there are data points in
the training set (Hagan, Demuth, & Beale, 2004). Also, if the number of parameters in the network is
much smaller than the total number of points in the training set, then there is little or no chance of
overfitting. If you can easily collect more data and increase the size of the training set, then there is no
need to worry about techniques which can prevent overfitting.
In neural networks, as in all modeling problems, the simplest network that can adequately represent the
training set should be used. In other words, don’t use a larger network when a smaller network will
suffice (a concept often referred to as Ockham’s razor).
Figure A-13 shows the response of a 1-20-1 neural network that has been trained to approximate a
noisy sine function (Demuth, Beale, & Hagan, 2009). The underlying sine function is shown by the
dotted line, the noisy measurements are given by the ‘+’ symbols, and the neural network response is
given by the solid line. Clearly this network has overfitted the data and will not generalize well.
Figure A-13 Network Response (Demuth, Beale, & Hagan, 2009)
There are two techniques that can be used to prevent overfitting: (1) early stopping and
(2) regularization.
When using the early stopping technique the available training data is divided into three subsets. The
first subset is the training set, which is used for computing the gradient and updating the network
weights and biases. The second subset is the validation set. The error on the validation set is monitored
during the training process. The validation error normally decreases during the initial phase of training,
as does the training set error. However, when the network begins to overfit the data, the error on the
validation set typically begins to increase. When the validation error increases for a specified number of
iterations, the training is stopped, and the weights and biases at the minimum of the validation error are
returned to the network. The final subset (test set) error is not used during training, but it is used to
compare different models. It is also useful to plot the test set error during the training process. If the
error in the test set reaches a minimum at a significantly different iteration number than the validation
set error, this might indicate a poor division of the data set (Demuth, Beale, & Hagan, 2009).
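In the Neural Network Toolbox used in this thesis, early stopping corresponds to the configuration sketched below; 'dividerand' and max_fail appear in the Appendix B scripts, while the split ratios shown here are assumed for illustration:

net = network;
%Bare network object, as created in the Appendix B scripts
net.divideFcn = 'dividerand';
%Random division of the data into three subsets
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio = 0.15;
net.divideParam.testRatio = 0.15;
%Assumed split ratios for the training, validation, and test sets
net.trainParam.max_fail = 6;
%Consecutive validation-error increases allowed before training stops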
The regularization technique involves modifying the performance function, which is normally chosen to
be the sum of squares of the network errors on the training set.
$$F = \text{mse} = \frac{1}{N} \sum_{i=1}^{N} (e_i)^2 = \frac{1}{N} \sum_{i=1}^{N} (t_i - a_i)^2.$$
It is possible to improve generalization if the performance function is modified by adding a term that
consists of the mean of the sum of squares of the network weights and biases
$$\text{msw} = \frac{1}{n} \sum_{j=1}^{n} w_j^2.$$
Using this performance function causes the network to have smaller weights and biases, and forces the
network response to be smoother and less likely to overfit the data.
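In the toolbox this corresponds to the 'msereg' performance function, which blends the two terms as msereg = g*mse + (1-g)*msw; a sketch, with an assumed performance ratio:

net = network;
%Bare network object, as created in the Appendix B scripts
net.performFcn = 'msereg';
%Regularized performance function: msereg = g*mse + (1-g)*msw
net.performParam.ratio = 0.5;
%Assumed performance ratio g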
Figure A-14 Squared Error Surface (Hagan, Demuth, & Beale, 2004)
Figure A-14 illustrates an important property of multilayer networks: they are symmetric. In the
figure there are two local minimum points, and both have the same value of squared error. The second
solution corresponds to the same network turned upside down (i.e., the top neuron in the first layer
exchanged with the bottom neuron). It is because of this characteristic of neural networks that the
initial weight and bias values should be numbers other than zero: the symmetry causes zero to be a
saddle point on the performance surface. The figure also shows that the initial parameters should not
be set to large values, because the performance surface tends to have flat regions far from the
optimum point.
Typically, initial weights and biases should be chosen as small, random values. That way the network
stays away from a possible saddle point at the origin and does not drift out to the flat regions of the
performance surface.
A.7.11 Momentum
Momentum is a technique that can be used to propel a network past local minima. With momentum,
m, the weight update at a given time, t, becomes

$$\Delta w(t) = -\alpha\, \nabla E(t) + m\, \Delta w(t-1),$$
where 0 < m < 1 is a new global parameter which can be determined by trial and error. Momentum
simply adds a fraction m of the previous weight update to the current one. When the gradient
continues in the same direction, the momentum term will increase the size of the steps taken towards
the minimum. It is often necessary to reduce the global learning rate $\alpha$ when using a large
momentum term (m close to 1). The combination of a high learning rate and a large momentum
coefficient will cause the network to dash past the minimum in huge steps.
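A toy sketch of the momentum update on a one-dimensional quadratic error surface (all values assumed):

W = 0; dW_prev = 0;
%Scalar weight and previous update for a toy one-dimensional problem
alpha = 0.05; m = 0.9;
%Learning rate and momentum constant
for t = 1:100
    g = 2*(W - 3);
    %Gradient of the toy error surface (W - 3)^2
    dW = -alpha*g + m*dW_prev;
    %Current gradient step plus a fraction m of the previous update
    W = W + dW; dW_prev = dW;
end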
When the gradient keeps changing direction, momentum will smooth out the variations. This is valuable
when the network is not well-trained. In such cases, the error surface has significantly different
curvature along different directions, leading to the formation of long, narrow valleys. For most points
on the surface, the gradient does not point towards the minimum, and successive steps of gradient
descent can oscillate from one side to the other, progressing only very slowly to the minimum (Figure A-
17). Figure A-18 shows how the addition of momentum helps speed up convergence to the minimum by
damping these oscillations.
The error surface for the multilayer network is not a quadratic function. The shape of the surface can be
very different in different regions of the design space. It is possible to speed up the convergence by
adjusting the learning rate during the course of network training.
There are multiple approaches for varying the learning rate. For this thesis, a simple batching
procedure is used. The rules for the adaptive learning rate backpropagation algorithm (ALBP) are:
1. If the mean squared error (over the entire training set) increases by more than a certain
percentage $\zeta$ (usually one to five percent) after a weight update, then the weight update is
discarded, the learning rate is multiplied by some factor $0 < \rho < 1$, and the momentum
coefficient (if used) is set to zero.
2. If the mean squared error decreases after a weight update, then the weight update is accepted
and the learning rate is multiplied by some factor $\eta > 1$. If momentum has previously been set
to zero, it is reset to its original value.
3. If the mean squared error increases by less than $\zeta$, then the weight update is accepted but the
learning rate is unchanged. If the momentum has previously been set to zero, it is reset to its
original value.
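These rules map onto the 'traingda' parameters used in the Appendix B scripts; in the sketch below, lr_dec plays the role of $\rho$, lr_inc the role of $\eta$, and max_perf_inc corresponds to $1 + \zeta$:

net = network;
%Bare network object, as created in the Appendix B scripts
net.trainFcn = 'traingda';
%Gradient descent with an adaptive learning rate
net.trainParam.lr = 0.01;
%Initial learning rate
net.trainParam.lr_dec = 0.7;
%Factor rho: learning rate is multiplied by 0.7 after a rejected update
net.trainParam.lr_inc = 1.05;
%Factor eta: learning rate is multiplied by 1.05 after an accepted update
net.trainParam.max_perf_inc = 1.04;
%Corresponds to 1 + zeta: error increases under 4% are tolerated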
The Levenberg-Marquardt (LM) algorithm is a very simple, but robust, method for approximating a
function. It consists of solving the equation:

$$(J^T J + \lambda I)\, \delta = J^T E,$$

where $J$ is the Jacobian matrix for the system, $\lambda$ is the Levenberg damping factor, $\delta$ is
the weight update vector found during the optimization process, and $E$ is the error vector containing
the output errors for each input vector used in training the network.
6 Newton’s method is used for finding successively better approximations to the zeroes (or roots) of a real-valued function.
The magnitude of $\delta$ indicates by how much the network weights should be varied to achieve a
better solution. The matrix $J^T J$ is known as the approximated Hessian7.
The damping factor is adjusted at each iteration and guides the optimization process. If the error
reduction is rapid, a smaller value can be used, bringing the algorithm closer to Newton’s method6,
whereas if an iteration gives insufficient reduction in the residual, $\lambda$ can be increased, giving a
step closer to the gradient descent direction. A more detailed explanation of the LM algorithm is
available in the textbook Neural Network Design.
The Levenberg-Marquardt algorithm is very sensitive to the initial network weights. Also, it does not
consider outliers in the data, which may lead to overfitting. To avoid these situations, regularization can
be used. In addition, the LM algorithm requires large amounts of computing power because of the
Hessian matrix.
7 The Hessian is a matrix of second-order partial derivatives.
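A minimal sketch of the LM update equation, with an assumed Jacobian and error vector:

J = randn(20,5);
%Hypothetical Jacobian: 20 training errors by 5 adjustable weights
E = randn(20,1);
%Hypothetical error vector
lambda = 0.01;
%Levenberg damping factor
delta = (J'*J + lambda*eye(5))\(J'*E);
%Weight update vector solved from (J'J + lambda*I)*delta = J'E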
net.outputConnect = [0 1];
net.inputs{1}.exampleInput = pn;
net.inputs{1}.processFcns = {'removeconstantrows'};
net.inputWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.size = 19;
net.layers{1}.transferFcn = 'logsig';
net.layers{1}.initFcn = 'initnw';
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
net.layers{2}.initFcn = 'initnw';
net.outputs{2}.exampleOutput = tn;
net.outputs{2}.processFcns = {'removeconstantrows'};
%%
%Network Functions
net.performFcn = 'mse';
net.initFcn = 'initlay';
net.divideFcn = 'dividerand';
net.trainFcn = 'trainlm';
net.adaptFcn = 'trains';
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Training the Network
net = init(net);
net.trainParam.epochs = 500;
net.trainParam.goal = 1e-5;
net.trainParam.lr = 0.04;
%
net.trainParam.min_grad = 1e-10;
net.trainParam.show = 25;
net.trainParam.time = 90;
net.trainParam.showWindow = 1;
[net,tr,Y,E] = train(net,pn,tn);
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'msereg';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'trainbr';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = inf;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
%Export Training Data
W = net.IW{1,1};
%Final Input Layer Weight Matrix
B = net.b{1};
%Final Input Bias Vector
%Ymono = mapminmax('reverse',Y,ps);
%returns X, given Y and settings PS
%print -dmeta (copy figures to clipboard)
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layerWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingda';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.lr_inc = 1.05;
%Ratio to increase learning rate
net.trainParam.lr_dec = 0.7;
%Ratio to decrease learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.max_perf_inc = 1.04;
%Maximum performance increase
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
%Export Training Data
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingdm';
%Network training function (gradient descent with momentum)
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 5000;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.mc = 0.9;
%Momentum Constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
%Export Training Data
W = net.IW{1,1};
%Final Input Layer Weight Matrix
B = net.b{1};
%Final Input Bias Vector
%Ymono = mapminmax('reverse',Y,ps);
%returns X, given Y and settings PS
%print -dmeta (copy figures to clipboard)
%A Quantitative Methodology for Mapping Project Costs to
%Engineering Decisions in Naval Ship Design and Procurement
%Copyright 2010
%%
%Definitions
%%
%Assemble the Training Data
%%
%Import model data from Excel
%num = xlsread(filename, sheet, 'range') reads data from a specific sheet and range of an Excel workbook
Pmono = (xlsread('TRAINING.xlsx', 'MONO', 'A1:I304'))';
Tmono = (xlsread('TRAINING.xlsx', 'MONO', 'J1:P304'))';
[pn,ps] = mapminmax(Pmono,0,1);
%R X Q1 matrix of Q1 sample R-element input vectors
[tn,ts] = mapminmax(Tmono,0,1);
%SN X Q2 matrix of Q2 sample SN-element target vectors
%%
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layerWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingdx';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.lr_inc = 1.05;
%Ratio to increase learning rate
net.trainParam.lr_dec = 0.7;
%Ratio to decrease learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.max_perf_inc = 1.04;
%Maximum performance increase
net.trainParam.mc = 0.9;
%Momentum constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
%Export Training Data
Tmono = (xlsread('TRAINING.xlsx', 'MONO', 'J1:P304'))';
[pn,ps] = mapminmax(Pmono,0,1);
%R X Q1 matrix of Q1 sample R-element input vectors
[tn,ts] = mapminmax(Tmono,0,1);
%SN X Q2 matrix of Q2 sample SN-element target vectors
%%
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'trainrp';
%Network training function (resilient backpropagation, matching the delt/delta parameters below)
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.delt_inc = 1.2;
%Increment to weight change
net.trainParam.delt_dec = 0.5;
%Decrement to weight change
net.trainParam.delta0 = 0.07;
%Initial weight change
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.deltamax = 50;
%Maximum weight change
net.trainParam.mc = 0.9;
%Momentum Constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Emono = xlswrite('TRAINING.xlsx',abs(E'),'MONO','A307');
%Export Training Data
%A Quantitative Methodology for Mapping Project Costs to
%Engineering Decisions in Naval Ship Design and Procurement
%Copyright 2010
%%
%Trained Network Input
Psim = (xlsread('TRAINING.xlsx', 'MONOSIM', 'A2:I16385'))';
%Read in input data from excel
[px,pz] = mapminmax(Psim,0,1);
%R X Q1 matrix of Q1 R-element input vectors
%%
%Simulate Monohull model data
MONOSIMSQUASHED = sim(net,px);
%Squashed monohull SWBS cost output
%%
MONOSIMACTUAL = mapminmax('reverse',MONOSIMSQUASHED,ts);
%Transform network outputs into actual SWBS cost values
B.2 Catamaran
B.2.1 Levenberg - Marquardt
%LT Kristopher Netemeyer
%Naval Engineer / SDM Masters Thesis
%A Quantitative Methodology for Mapping Project Costs to
%Engineering Decisions in Naval Ship Design and Procurement
%Copyright 2010
%%
%Definitions
%%
%Assemble the Training Data
%%
%Import model data from Excel
%num = xlsread(filename, sheet, 'range') reads data from a specific sheet and range of an Excel workbook
Pcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'A1:I304'))';
Tcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'J1:P304'))';
[pn,ps] = mapminmax(Pcat,0,1);
%R X Q1 matrix of Q1 sample R-element input vectors
[tn,ts] = mapminmax(Tcat,0,1);
%SN X Q2 matrix of Q2 sample SN-element target vectors
%%
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'trainlm';
%Network training function
net.adaptFcn = 'trains';
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.02;
%Learning rate
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = inf;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A1');
%Export Training Data
B.2.2 Levenberg - Marquardt with Bayesian Regularization
%LT Kristopher Netemeyer
%Naval Engineer / SDM Masters Thesis
%A Quantitative Methodology for Mapping Project Costs to
%Engineering Decisions in Naval Ship Design and Procurement
%Copyright 2010
%%
%Definitions
%%
%Assemble the Training Data
%%
%Import model data from Excel
%num = xlsread(filename, sheet, 'range') reads data from a specific sheet and range of an Excel workbook
Pcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'A1:I304'))';
Tcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'J1:P304'))';
[pn,ps] = mapminmax(Pcat,0,1);
%R X Q1 matrix of Q1 sample R-element input vectors
[tn,ts] = mapminmax(Tcat,0,1);
%SN X Q2 matrix of Q2 sample SN-element target vectors
%%
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layerWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'msereg';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'trainbr';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.05;
%Learning rate
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A1');
%Export Training Data
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layerWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingda';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.lr_inc = 1.05;
%Ratio to increase learning rate
net.trainParam.lr_dec = 0.7;
%Ratio to decrease learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.max_perf_inc = 1.04;
%Maximum performance increase
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A1');
%Export Training Data
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingdm';
%Network training function (gradient descent with momentum)
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 5000;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.mc = 0.9;
%Momentum Constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A307');
%Export Training Data
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'traingdx';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.01;
%Learning rate
net.trainParam.lr_inc = 1.05;
%Ratio to increase learning rate
net.trainParam.lr_dec = 0.7;
%Ratio to decrease learning rate
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.max_perf_inc = 1.04;
%Maximum performance increase
net.trainParam.mc = 0.9;
%Momentum constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A1');
%Export Training Data
%Copyright 2010
%%
%Definitions
%%
%Assemble the Training Data
%%
%Import model data from Excel
%num = xlsread(filename, sheet, 'range') reads data from a specific sheet and range of an Excel workbook
Pcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'A1:I304'))';
Tcat = (xlsread('TRAINING.xlsx', 'CATTRAIN', 'J1:P304'))';
[pn,ps] = mapminmax(Pcat,0,1);
%R X Q1 matrix of Q1 sample R-element input vectors
[tn,ts] = mapminmax(Tcat,0,1);
%SN X Q2 matrix of Q2 sample SN-element target vectors
%%
%Create the Network Object
net = network;
%Neural Network object
net.numInputs = 1;
%Number of input sources
net.numLayers = 2;
%Number of network layers
net.biasConnect = [1;1];
%Layer Bias Connections
net.inputConnect = [1;0];
%net.inputConnect(i,j): Represents the presence of an input weight going to the ith layer from the jth input
net.layerConnect = [0 0;1 0];
%Layer weight connection going from the ith layer to the jth layer
net.outputConnect = [0 1];
%Connect to one destination (External World) from the network layers
net.inputs{1}.exampleInput = pn;
%Example Inputs
net.inputs{1}.processFcns = {'removeconstantrows'};
%Chosen Network Process Functions
net.layers{1}.size = 5;
%Number of Neurons
net.layerWeights{1,1}.learnfcn = 'learnwh';
net.layers{1}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{1}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.layerWeights{2,1}.learnfcn = 'learnwh';
net.layers{2}.transferFcn = 'logsig';
%Layer Transfer Function
net.layers{2}.initFcn = 'initnw';
%Weights and Bias initialization Function
net.outputs{2}.exampleOutput = tn;
%Example output
net.outputs{2}.processFcns = {'removeconstantrows'};
%Output process functions
%%
%Network Functions
net.performFcn = 'mse';
%Network performance function
net.initFcn = 'initlay';
%Network initialization function. Calls layer weight and bias initialization functions
net.divideFcn = 'dividerand';
%Separates inputs into training, validation, and test values
net.trainFcn = 'trainrp';
%Network training function
net.plotFcns = {'plotperform','plottrainstate','plotregression'};
%Network plot function
%%
%Training the Network
net = init(net);
%Initialize the Network
net.trainParam.epochs = 500;
%Maximum number of epochs to train
net.trainParam.goal = 1e-5;
%Performance goal
net.trainParam.lr = 0.05;
%Learning rate
net.trainParam.delt_inc = 1.2;
%Increment to weight change
net.trainParam.delt_dec = 0.5;
%Decrement to weight change
net.trainParam.delta0 = 0.07;
%Initial weight change
net.trainParam.max_fail = 6;
%Maximum validation failure
net.trainParam.deltamax = 50;
%Maximum weight change
net.trainParam.mc = 0.9;
%Momentum Constant
net.trainParam.min_grad = 1e-10;
%Minimum performance gradient
net.trainParam.show = 25;
%Epochs between displays
net.trainParam.time = 90;
%Maximum time to train in seconds
net.trainParam.showWindow = 1;
%Show training GUI
[net,tr,Y,E] = train(net,pn,tn);
%Call Training Function
Ecat = xlswrite('TRAINING.xlsx',abs(E'),'CAT','A1');
%Export Training Data
%%
%Simulate Catamaran model data
CATDRIVESQUASHED = sim(net,pa);
%Squashed catamaran SWBS cost output
%%
CATDRIVEACTUAL = mapminmax('reverse',CATDRIVESQUASHED,ts);
%Transform network outputs into actual SWBS cost values
C Performance Plots
Each architecture is labeled as Input Neurons – Hidden Layer Neurons – Output Neurons.
C.1 MONOHULL
C.1.1 9-5-7
C.1.2 9-6-7
C.1.3 9-7-7
C.1.4 9-8-7
C.1.5 9-9-7
C.1.6 9-10-7
C.1.7 9-11-7
C.1.8 9-12-7
C.1.9 9-13-7
C.1.10 9-14-7
C.1.11 9-15-7
C.1.12 9-16-7
C.1.13 9-17-7
C.1.14 9-18-7
C.1.15 9-19-7
C.1.16 9-20-7
C.1.17 9-20-5-7
C.1.18 9-20-6-7
C.1.19 9-20-7-7
C.1.20 9-20-8-7
C.1.21 9-20-9-7
C.1.22 9-20-10-7
C.1.23 9-20-11-7
C.1.24 9-20-12-7
C.1.25 9-20-13-7
C.1.26 9-20-14-7
C.1.27 9-20-15-7
C.2 CATAMARAN
C.2.1 9-5-7
C.2.2 9-6-7
C.2.3 9-7-7
C.2.4 9-8-7
C.2.5 9-9-7
C.2.6 9-10-7
C.2.7 9-11-7
C.2.8 9-12-7
C.2.9 9-13-7
C.2.10 9-14-7
C.2.11 9-15-7
C.2.12 9-16-7
C.2.13 9-17-7
C.2.14 9-18-7
C.2.15 9-19-7
C.2.16 9-20-7
C.2.17 9-20-5-7
C.2.18 9-20-6-7
C.2.19 9-20-7-7
C.2.20 9-20-8-7
C.2.21 9-20-9-7
C.2.22 9-20-10-7
C.2.23 9-20-11-7
C.2.24 9-20-12-7
C.2.25 9-20-13-7
C.2.26 9-20-14-7
C.2.27 9-20-15-7
D Cost Plots
D.1 MONOHULL
D.1.1 SWBS Group 100 Cost PDF
[Plot: probability density vs. SWBS Group 100 direct cost ($), approximately 0.5 to 2.5 × 10^8]
D.1.2 SWBS Group 100 Cost CDF
[Plot: cumulative probability vs. SWBS Group 100 direct cost ($)]
D.1.3 SWBS Group 200 Cost PDF
[Plot: probability density vs. SWBS Group 200 direct cost ($), approximately 1.5 to 4 × 10^7]
D.1.4 SWBS Group 200 Cost CDF
[Plot: cumulative probability vs. SWBS Group 200 direct cost ($)]
D.1.5 SWBS Group 300 Cost PDF
[Plot: probability density vs. SWBS Group 300 direct cost ($), approximately 5.5 to 8.5 × 10^6]
D.1.6 SWBS Group 300 Cost CDF
[Plot: cumulative probability vs. SWBS Group 300 direct cost ($)]
D.1.7 SWBS Group 400 Cost PDF
[Plot: probability density vs. SWBS Group 400 direct cost ($), approximately 2 to 10 × 10^7]
D.1.8 SWBS Group 400 Cost CDF
[Plot: cumulative probability vs. SWBS Group 400 direct cost ($)]
D.1.9 SWBS Group 500 Cost PDF
[Plot: probability density vs. SWBS Group 500 direct cost ($), approximately 0.9 to 1.5 × 10^7]
D.1.10 SWBS Group 500 Cost CDF
[Plot: cumulative probability vs. SWBS Group 500 direct cost ($)]
D.1.11 SWBS Group 600 Cost PDF
[Plot: probability density vs. SWBS Group 600 direct cost ($), approximately 0.6 to 1.6 × 10^7]
D.1.12 SWBS Group 600 Cost CDF
[Plot: cumulative probability vs. SWBS Group 600 direct cost ($)]
D.1.13 SWBS Group 700 Cost PDF
[Plot: probability density vs. SWBS Group 700 direct cost ($), approximately 3 to 9 × 10^5]
D.1.14 SWBS Group 700 Cost CDF
[Plot: cumulative probability vs. SWBS Group 700 direct cost ($)]
D.1.15 Total Direct Cost PDF
[Plot: probability density vs. total direct cost ($), approximately 1 to 4.5 × 10^8]
D.1.16 Total Direct Cost CDF
[Plot: cumulative probability vs. total direct cost ($)]
D.2 CATAMARAN
D.2.1 SWBS Group 100 Cost PDF
[Plot: probability density vs. SWBS Group 100 direct cost ($), approximately 0.5 to 2.5 × 10^8]
D.2.2 SWBS Group 100 Cost CDF
[Plot: cumulative probability vs. SWBS Group 100 direct cost ($)]
D.2.3 SWBS Group 200 Cost PDF
[Plot: probability density vs. SWBS Group 200 direct cost ($), approximately 1.5 to 4 × 10^7]
D.2.4 SWBS Group 200 Cost CDF
[Plot: cumulative probability vs. SWBS Group 200 direct cost ($)]
D.2.5 SWBS Group 300 Cost PDF
[Plot: probability density vs. SWBS Group 300 direct cost ($), approximately 5.2 to 6.6 × 10^6]
D.2.6 SWBS Group 300 Cost CDF
[Plot: cumulative probability vs. SWBS Group 300 direct cost ($)]
D.2.7 SWBS Group 400 Cost PDF
[Plot: probability density vs. SWBS Group 400 direct cost ($), approximately 2 to 9 × 10^7]
D.2.8 SWBS Group 400 Cost CDF
[Plot: cumulative probability vs. SWBS Group 400 direct cost ($)]
D.2.9 SWBS Group 500 Cost PDF
[Plot: probability density vs. SWBS Group 500 direct cost ($), approximately 0.85 to 1.3 × 10^7]
D.2.10 SWBS Group 500 Cost CDF
[Plot: cumulative probability vs. SWBS Group 500 direct cost ($)]
D.2.11 SWBS Group 600 Cost PDF
[Plot: probability density vs. SWBS Group 600 direct cost ($), approximately 0.7 to 1.5 × 10^7]
D.2.12 SWBS Group 600 Cost CDF
[Plot: cumulative probability vs. SWBS Group 600 direct cost ($)]
D.2.13 SWBS Group 700 Cost PDF
[Plot: probability density vs. SWBS Group 700 direct cost ($), approximately 3 to 9 × 10^5]
D.2.14 SWBS Group 700 Cost CDF
[Plot: cumulative probability vs. SWBS Group 700 direct cost ($)]
D.2.15 Total Direct Cost PDF
[Plot: probability density vs. total direct cost ($), approximately 1 to 4 × 10^8]
D.2.16 Total Direct Cost CDF
[Plot: cumulative probability vs. total direct cost ($)]