Predicting Cost Overrun in Construction Projects: A. M. El-Kholy


International Journal of Construction Engineering and Management 2015, 4(4): 95-105

DOI: 10.5923/j.ijcem.20150404.01

Predicting Cost Overrun in Construction Projects


A. M. El-Kholy

Civil Engineering Dept., Faculty of Engineering, Beni-Suef University, Beni-Suef, Egypt

Abstract Two models for predicting cost overrun percentage in construction projects are presented. The first model is based on regression analysis. Forty-four factors that impact cost performance in construction projects were gathered from the literature. A questionnaire survey of construction contractors in Egypt was conducted to evaluate the relative importance of these causes from the contractors' perspective. Eleven factors were identified as the most significant causes of cost overrun, and these are the independent variables of the proposed model. Data were collected on the occurrence of these factors on a yes/no basis, together with the corresponding cost overrun percentage (dependent variable), for 30 construction projects, and were divided into two sets. The first set contains 20 projects for model building. The results revealed a strong linear relationship between cost overrun percentage and the 11 causes that significantly affect the cost overrun of projects. These causes are: financial condition of the owner, cash flow of contractor, method of procurement (open tender or selective tender), material cost increase due to inflation, competition at tender stage (aggressive or not), fluctuations in the currency that the payment will be made, project size (small or large), delay in design and approval, risk retained by client for quantity variations, drawings (detailed or not), and inaccurate material estimating. The second set contains 10 projects for validation purposes. The second model is a case based reasoning (CBR) model. The CBR method can be an effective means of utilizing knowledge gained from past experience to estimate percentage cost overrun in construction works. Validation of the two models using the projects of the second set revealed that the regression model has higher prediction capability than the CBR model. Applying the absolute value of the standardized coefficient (β) as the attribute weight method provides the highest prediction accuracy of cost overrun percentage. Also, feature counting gives better results than the original value of (β).
Keywords Regression Analysis, Questionnaire Survey, Construction Projects, Case Based Reasoning Model

* Corresponding author:
amrelkholy_2012@yahoo.com (A. M. El-Kholy)
Published online at http://journal.sapub.org/ijcem
Copyright © 2015 Scientific & Academic Publishing. All Rights Reserved

1. Introduction

The accuracy of early cost estimates in engineering and construction projects is extremely important to both owners and project teams [1]. Decision making in the early stage of a project has a significant impact on the project. To evaluate alternatives, quick and accurate decision making is needed under a limited definition of scope and constraints in available information and time [2]. However, limited and uncertain information on the project and a complex correlation among the various factors that affect the project's construction cost make it difficult to predict and manage the pertinent tasks [3].

Several studies have attempted to determine the factors creating risk for construction projects. [4] conducted a survey to study the risk attitudes of large U.S. construction firms. Among the 23 risk factors included in this survey, labor, equipment and material availability, labor and equipment productivity, defective design, changes in work, differing site conditions, safety, delayed payment on the contract, and quality of work were presented as risks with high importance. [5] focused on the contract related factors which play an important role in the allocation of risks between the owner and the contractor. [6] explained that country risk rating, material availability, type of contract, and advance payment were the major factors impacting the contingency decisions of contractors. [7] developed a multivariate regression model to predict cost estimate accuracy for capital projects.

The previous studies used various methodologies to solve the problem of predicting construction cost, cost contingency, and cost overrun for construction projects. Some of the methods used in the previous studies include:
• Statistical methods such as multiple regression analysis (MRA) for predicting construction cost [8-10]. [11] presented a regression model for predicting cost overrun of reconstruction projects. [7, 12, 13] presented models for predicting cost contingency.
• Repetitive learning methods such as artificial neural networks (ANN) for predicting construction cost [12, 13]. [11] presented an ANN model for predicting cost overrun of reconstruction projects in addition to the regression analysis mentioned above.
• Stochastic methods such as Monte Carlo simulation (MCS). [14] conducted a simulation model for predicting the construction cost.
• Analogical methods such as case-based reasoning (CBR) for predicting the construction cost [2, 12, 15-17].

[3] stated that such methodologies have distinct characteristics in terms of applied fields, analysis of data, methods of system establishment, and types of results. Multiple regression analysis arrives at the result through statistical analysis, but its result is too linear to be used as a standardized model. Artificial neural networks are more accurate than MRA, but they act as a black box that cannot explain the structure of the model. Monte Carlo simulation has the function of analyzing the outlier using the probability approach [3]. In their work, [18] conducted an analysis of time and cost overruns for a sample of 102 educational projects. They showed that about 32.35% of the selected projects were exposed to cost overrun. On the other side, time overrun was only noticed on about 28.43% of the projects. The average percentage of the actual cost overrun was found to be inversely proportional to the project size. They developed regression models for cost and time overruns. They tested the validity of these models, which assess expected cost and time overruns for any future projects at confidence levels of 96.67% and 94.88%, respectively. [19] conducted research to determine the influence ranks of 52 factors causing cost variation in constructing wastewater projects in Egypt, based on the quantified relative importance indices. The factors were classified under four primary classifications: (1) Owner originated category; (2) Designer originated category; (3) Contractor originated category; and (4) Miscellaneous category. The results were grouped under experience-based group and professional cadre of respondents. This study revealed the importance of the owner originated category's effect on causes of cost variation for constructing wastewater projects over the other three categories. The most predictable and significant factor was "Lowest bidding procurement method", related to the "Owner Originated Category". The author of [19] also declared that most cost variation can be caused by the owner due to additional work and bureaucracy in the bidding/tendering method. On the other hand, the least influential factor was "Domination of construction industry by foreign firms and aids", related to the "Miscellaneous Category".

On the other hand, CBR has characteristics that are similar to humans' heuristic approach, in which decisions are made on experience.

In this paper, an MRA model is developed for predicting cost overrun percentage for construction projects. A second model is developed based on CBR for the purpose of comparison.

The major objectives of this paper are as follows: (1) Investigate the causes that significantly affect cost overrun of construction projects; (2) Propose two models, based on regression analysis and the case based reasoning method, to predict the percentage of cost overrun for construction projects. Hereinafter, the two words factor and cause are synonymous.

2. Research Scope and Methodology

In the current research, two proposed predictive models are intended to be applicable for predicting cost overrun percentage of construction projects. These models are based on regression analysis and case based reasoning. A standard methodology will be adopted. As an initial step to meet the objectives, previous research papers that deal with causes of cost overrun in construction projects were reviewed in the previous section to investigate causes of cost overrun in these projects. The CBR technique is explained. A list of causes of cost overrun in construction projects is prepared to collect data about the significance of these causes through a questionnaire survey. The next step is to analyze the survey results to obtain the most significant causes of cost overrun to be incorporated into the predictive models. Building the regression based model is then demonstrated, and a numerical example is prepared to show how the model predicts the cost overrun percentage of a project. The next step is to apply the case based model to an example project to show how the model performs step by step. The last step of this research is to validate the proposed models. Based on the validation results, the prediction accuracy of the two models is compared and conclusions are drawn.

3. Case Based Reasoning

CBR is the process of retrieving previous cases similar to a new problem, solving the new problem by adapting previously determined solutions of similar previous cases, and storing the new successful solution for future use [20]. [15] stated that CBR utilizes knowledge gained from past experiences and can be viewed as an effective method for estimation in construction. It has been observed that CBR methods can increase the accuracy of construction cost estimates [21-24]. [2] reported that CBR usually requires four steps: case representation, case retrieval, case adaptation, and case retaining. Cases are represented by attributes describing the circumstance of the problem and its solution. Similar previous cases best matching the new problem are retrieved. The solution(s) of the retrieved cases are adapted to fit the new problem. New solution(s) are retained for future use once they have been approved. [15] explained that there are two challenges related to the retrieval process that still need to be addressed. One issue is the computation of attribute similarity, which is particularly important during the retrieval process. For calculating attribute similarity: if an attribute is of nominal scale, and its value in a previous case is the same as in a new case, then the attribute is rated as one, otherwise it is rated zero [3]. On the other hand, if an attribute is either of interval scale or ratio scale, it is scored by Eq. 1. The second challenge is how to assign the attribute
weight values that enable the most similar case to be identified by an index of corresponding features. [3] declared that more than one methodology can be used for calculating attribute weight, as follows:
• Feature counting: this method applies the same weight to all the attributes.
• MRA: this method uses the original standardized value of the coefficient (β) as the attribute weight. On the other hand, [25] used the absolute value of the standardized coefficient (β).
• ANN: this method uses the sensitivity coefficient as the attribute weight.

In the current research, the attributes are of nominal scale, since the data gathered depends on a yes/no basis. Thus, any attribute will be rated one if its value in the case base is the same as in the test case, otherwise it will be rated zero. Also, for calculating the attribute weights, feature counting, the original value of the standardized coefficient (β), and its absolute value will be used for the purpose of comparison to improve the prediction capacity of the CBR model. To calculate the case similarity, the attribute similarity is multiplied by its weight of importance and summed up to obtain the total similarity score of each case, as given in Eq. 2 [2].

\[
F_{AS} =
\begin{cases}
\dfrac{\min(Av_{testcase},\, Av_{retricase})}{\max(Av_{testcase},\, Av_{retricase})} & \text{if } F_{AS} \ge MCAS \\
0 & \text{if } F_{AS} < MCAS
\end{cases}
\tag{1}
\]

Where F_AS is the function of the attribute similarity, Av_testcase is the attribute value of the test case, Av_retricase is the attribute value of the retrieved case, and MCAS is the minimum criterion for scoring the attribute similarity.

\[
S_i = \sum_{j=1}^{n} I_{ij} \, W_j
\tag{2}
\]

Where i = case identification number; j = attribute identification number; S_i = similarity value of case i; I_ij = similarity value of attribute j; and W_j = weight of attribute j.

4. Factors that Impact Cost Performance in Construction Projects

In this study, 44 factors identified as causes of cost overrun in construction projects were gathered from the literature [1, 6, 11, 26-33], as shown in Table 1. These factors serve as the independent variables in the predictive model of cost overrun percentage for construction projects.

5. Questionnaire Survey

A questionnaire was developed to collect data about the significance of the causes that impact the cost of construction projects compiled in Table 1. The questionnaire was divided into two main parts. The first part gathered basic information about the experience of the respondent, the experience of the company, and the volume of work of the company. In the second part, the factors compiled in Table 1 were organized using two priority scales, one for occurrence frequency and the other for severity. The priority scale for occurrence frequency was as follows: 5=always, 4=often, 3=usually, 2=sometimes, and 1=scarcely, while the severity scale was: 5=very severe, 4=severe, 3=somewhat severe, 2=little effect, 1=very little effect. The participants were asked to assign a number from 1 to 5 to each cause for both occurrence frequency and severity according to its significance. Besides, the questionnaire included collection of data for actual past construction projects. The data included the occurrence, on a yes/no basis, of the factors presented in Table 1 that impact the cost performance of construction projects. In other words, if one of these causes occurred in a past project, the respondent assigns yes to this cause; otherwise, the respondent assigns no. Also, the actual cost overrun percentages of these projects were gathered.

To determine the sample size of the questionnaire, three criteria usually need to be specified: the level of precision, the level of confidence or risk, and the degree of variability in the attributes being measured [34]. [35] reported that the level of precision is the range in which the true value of the population is estimated to be. This range is often expressed in percentage points (e.g., ±10 percent). Thus, if a researcher finds that 60% of respondents in the sample have adopted a recommended practice with a precision rate of ±10%, then he or she can conclude that between 50% and 70% of respondents in the population have adopted the practice. For the confidence or risk level, if a 95% confidence level is selected, then 95 out of 100 samples will have the true population value within the range of precision specified earlier. The third criterion, the degree of variability in the attributes being measured, refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. A proportion of 50% indicates a greater level of variability than either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively, have the attribute of interest. Because a proportion of 0.5 indicates the maximum variability in a population, it is often used in determining a more conservative sample size.

For populations that are large, [36] developed Eq. 3 to yield a representative sample size for a large population (n0). If the population is small, one can use Eq. 4, where (n) is the sample size for a small population.
\[
n_0 = \frac{Z^2 \, p \, q}{e^2}
\tag{3}
\]

\[
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
\tag{4}
\]

Where Z is the abscissa of the normal curve that cuts off an area α at the tails (1 − α equals the desired confidence level, e.g., 90%), e is the desired level of precision, p is the estimated proportion of an attribute that is present in the population, and q is 1 − p. The value for Z is found in statistical tables which contain the area under the normal curve. N is the population size.
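As a quick check of Eqs. 3 and 4, the following minimal sketch evaluates both formulas with the values the paper reports for its own survey (Z = 1.65, p = 0.5, e = 0.15, N = 465). The function names are illustrative assumptions and do not come from the paper.

```python
def initial_sample_size(z: float, p: float, e: float) -> float:
    """Eq. 3: initial sample size n0 for a large population."""
    q = 1.0 - p  # q is defined as 1 - p
    return (z ** 2) * p * q / (e ** 2)


def adjusted_sample_size(n0: float, population: int) -> float:
    """Eq. 4: sample size n corrected for a population of size N."""
    return n0 / (1.0 + (n0 - 1.0) / population)


# Values stated in the paper for the Egyptian contractor survey.
n0 = initial_sample_size(z=1.65, p=0.5, e=0.15)
n = adjusted_sample_size(n0, population=465)

print(f"n0 = {n0:.2f}, n = {n:.2f}")  # n0 = 30.25, n = 28.46
```

The computed n of about 28.46 agrees with the value of 28.5 reported in the paper after rounding.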
Table 1. Factors that Impact Cost in Construction Projects

No Factor Identification RIW Rank


F1 Financial condition of the owner 17.4 1
F2 Cash flow of contractor 14.3 2
F3 Method of procurement (open tender or selective tender) 14.2 3
F4 Material cost increase due to inflation 13.9 4
F5 Competition at tender stage (aggressive or not) 12.6 5
F6 Fluctuations in the currency that the payment will be made 11.8 6
F7 Project size (small or large) 11.6 7
F8 Delay in design and approval 11.4 8
F9 Risk retained by client for quantity variations 11.3 9
F10 Drawings (detailed or not) 10.3 10
F11 Inaccurate material estimating 10.3 11
F12 Estimated cost 10 12
F13 Adequacy of quality requirements 9.8 13
F14 Design change 9.5 14
F15 Location of project 9.1 15
F16 How the estimate is prepared? (detailed or not) 9.0 16
F17 Reluctance in timely decision 9.0 17
F18 Difference between low bid and owner's estimate 9.0 18
F19 What is known about the project at the tender stage? 8.7 19
F20 Client characteristics 8.6 20
F21 Unknown geological conditions 8.5 21
F22 Ignorance and lack of knowledge 8.5 22
F23 Liquidated damages 8.4 23
F24 Adequacy of schedule requirements 8.2 24
F25 Conflict among project participants 8.1 25
F26 Quality standards and specifications 7.9 26
F27 Design complexity 7.8 27
F28 Scope change by owner 7.8 28
F29 Time variance 7.8 29
F30 Advanced payment amount 7.5 30
F31 Prequalification of contractors 7.4 31
F32 Level of construction complexity related to new technology 7.4 32
F33 Equipment percentage 7.4 33
F34 Site layout 7.0 34
F35 Time allowed for preparation of estimate 6.9 35
F36 Workload 6.6 36
F37 Contract Type (unit price or lump sum) 6.4 37
F38 Adequacy of dispute settlement procedure 6.4 38
F39 Inspection and testing 6.4 39
F40 Adequacy of safety and environmental requirements 5.8 40
F41 Similar project experience 5.5 41
F42 Weather conditions 5.4 42
F43 Site access 5.4 43
F44 Site congestion 4.4 44
The questionnaire survey was performed in Egypt. The population is 465, which represents the number of contractors working on construction projects of LE 2.5 million or more; this number was obtained from the Egyptian Federation for Construction & Building Contractors. The population is large, thus Eq. 3 is applied first to determine an initial sample size (n0). A confidence level of 90% is assumed, thus Z = 1.65 from normality tables; p is assumed to be 0.5 and e is assumed to be ±15%. Substituting Z, p, q, and e into Eq. 3 results in an initial sample size n0 = 30.25. Substituting n0 and N into Eq. 4 results in a sample size n = 28.5.

Logically, the anticipated response rate will not be 100%. Accordingly, the questionnaire was sent to 43 contracting companies specialized in construction projects. Some of the questionnaires were sent via mail after contacting the participants by telephone, whereas the others were completed through individual meetings. Most of the participants were at the level of general manager.

6. Survey Results and Analysis

A total of 30 questionnaires were completed and returned. The response rate was 69.8%. This response rate is considered acceptable for a survey focusing on gaining responses from industry practitioners [37]. The respondents included general managers, technical office managers, and construction managers. All the participants are involved in building projects in addition to other specializations. 82% of them are involved in public water and sewage projects. Also, 42% of them are involved in civil works (bridges, roads, and airports). The author believes that the variations in positions, besides the variations in specialization among the participants, enrich this study to a great extent. This is because data reliability is related to the data source and the identification of the position held by the person who completed the questionnaire [38].

To give additional credibility to the findings of this survey, the participants were asked about their length of experience and the length of experience of their companies. 89% of the respondents have more than 10 years of experience, whereas 57% have more than 20 years. 92% of the companies have more than 10 years of experience, whereas 50% have 25 years or more. 78% of the companies have an annual volume of work of more than LE 25 million, whereas 42% have an annual volume of work equal to or more than LE 250 million.

In order to assess the significance of the identified causes, an importance index for each factor was calculated, as illustrated in Eq. (5), by multiplying the frequency of occurrence by the degree of severity or impact. Frequency of occurrence refers to the probability that any cause given in Table 1 occurs in a project and contributes to its cost overrun, whereas degree of severity refers to the negative impact that the cause contributes to the project cost overrun. The importance indices were used to measure the relative weight of each factor. The relative importance weight (RIW) was computed using Eq. 6. For example, if the cause financial condition of the owner is assigned (4=often) for frequency of occurrence, this means that the respondent assigns an 80% probability to the occurrence of this factor's effect in previous projects according to his experience, and that in these projects this cause contributed to cost overrun. On the other hand, if this factor is assigned (4=severe) for the degree of severity, this means that the impact of this factor on these projects' cost overrun was severe. Table 1 shows the factors arranged in descending order according to their corresponding RIW, such that the factor that received the highest RIW is assigned a rank equal to one.

\[
\text{Importance Index (II)} = \text{Occurrence frequency} \times \text{Degree of severity}
\tag{5}
\]

\[
\text{Relative Importance Weight (RIW)} = \frac{\sum \left( \text{II} \times \text{corresponding no. of respondents} \right)}{\text{Total no. of respondents}}
\tag{6}
\]

Financial condition of the owner comes out as the most important factor contributing to cost overrun in construction projects; it was ranked first. Cash flow of contractor received the second rank. It seems that, if the contractor suffers from negative cash flow for most or all other projects, he fails to finance the project under consideration, thus the project is extended, which leads to cost overrun. The third ranked factor was the method of procurement (open tender or selective tender). It seems that, when open tender is applied as the method of project procurement, contractors decrease cost contingency in projects or neglect it completely. Material cost increase due to inflation was ranked 4. Since the trend of inflation is probably due to demand exceeding supply, this creates scarcity of goods and hence the prices of materials increase, which results in cost overrun for the construction project. On the other hand, [32] found that this factor is among three main causes of cost overrun. Competition at tender stage (aggressive or not) received the fifth rank (see Table 1). It seems that when the competition is aggressive, contractors decrease cost contingency or neglect it completely. Fluctuations in the currency that the payment will be made ranked 6. Project size (small or large) ranked 7. It seems that cost overrun is more predominant among smaller projects than larger ones. Delay in design and approval, risk retained by client for quantity variations, drawings (detailed or not), and inaccurate material estimating were ranked 8, 9, 10, and 11, respectively. Causes that received an RIW of less than 10 will not be considered in the predictive model, to reduce the number of variables to a manageable number. Table 2 lists the final 11 factors (independent variables) used to develop the regression model.
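The importance index and relative importance weight of Eqs. 5 and 6 can be sketched as follows. This is an illustrative reading of the two equations rather than the author's own procedure: the function names are assumptions, and the five (frequency, severity) responses are hypothetical values invented for the example, not survey data from the paper.

```python
from collections import Counter


def importance_index(frequency: int, severity: int) -> int:
    """Eq. 5: importance index = occurrence frequency * degree of severity."""
    return frequency * severity


def relative_importance_weight(responses: list[tuple[int, int]]) -> float:
    """Eq. 6: sum of (II * no. of respondents giving that II) / total respondents."""
    counts = Counter(importance_index(f, s) for f, s in responses)
    total_respondents = sum(counts.values())
    return sum(ii * n for ii, n in counts.items()) / total_respondents


# Hypothetical (frequency, severity) scores on the 1-5 scales for one factor.
responses = [(5, 4), (4, 4), (4, 4), (3, 5), (5, 5)]
riw = relative_importance_weight(responses)
print(f"RIW = {riw:.1f}")  # RIW = 18.4
```

Under this reading, the RIW is the mean importance index over all respondents, so it ranges from 1 to 25, which is consistent with the magnitudes listed in Table 1.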
7. Regression Based Model

Data for 30 construction projects were collected. These data include the occurrence of the factors presented in Table 1 on a yes/no basis, and the corresponding actual cost overrun percentage. The data were divided into two sets. The first set contains 20 projects for the purpose of model building. The second set contains 10 projects for validation purposes. An initial experimentation with a regression model that includes all 11 variables was performed using SPSS 13 software. Forward-stepping and backward-stepping methods were used. Forward stepping begins with entering the most significant variable at the first step, and continues adding and deleting variables until none can significantly improve the fit. Backward stepping, on the other hand, begins with all candidate variables, then removes the least significant variable at the first step and continues until no insignificant variable remains. The forward-stepping and backward-stepping techniques gave the same model for predicting the percentage of cost overrun for construction projects depending on 11 variables (see Table 3), with a squared multiple R = 0.83. This indicates that the model is able to explain 83% of the variability in the data, which is an excellent indicator of the model's expected performance. The underlying formula of the model is as follows:

Percentage cost overrun
= 0.214
+ 0.046 (Financial condition of the owner)
+ 0.201 (Cash flow of contractor)
+ 0.345 (Method of procurement (open tender or selective tender))
- 0.177 (Material cost increase due to inflation)
- 0.197 (Competition at tender stage (aggressive or not))
- 0.108 (Fluctuations in the currency that the payment will be made)
- 0.078 (Project size (small or large))
- 0.284 (Delay in design and approval)
+ 0.08 (Risk retained by client for quantity variations)
+ 0.184 (Drawings (detailed or not))
+ 0.08 (Inaccurate material estimating)     (7)

Each of the 11 variables can have a value of 0 (unused) or 1 (used). To show how the model predicts the cost overrun percentage, an example project was obtained from the works of the Arab Contractors Company in Egypt. This project is the construction of a hotel in Ismailia city in Egypt with a lump sum contract. The contract value is LE 6 million and the duration is 3 years. The project characteristics are as follows:
Table 2. Candidate Independent Variable Final List

No. Variable (RIW)

1 Financial condition of the owner 17.4


2 Cash flow of contractor 14.3
3 Method of procurement (open tender or selective tender) 14.2
4 Material cost increase due to inflation 13.9
5 Competition at tender stage (aggressive or not) 12.6
6 Fluctuations in the currency that the payment will be made 11.8
7 Project size (small or large) 11.6
8 Delay in design and approval 11.4
9 Risk retained by client for quantity variations 11.3
10 Drawings (detailed or not) 10.3
11 Inaccurate material estimating 10.3

Table 3. Regression Model

Constant and Variables Coefficient

Constant 0.214
Financial condition of the owner 0.046
Cash flow of contractor 0.201
Method of procurement (open tender or selective tender) 0.345
Material cost increase due to inflation -0.177
Competition at tender stage (aggressive or not) -0.197
Fluctuations in the currency that the payment will be made -0.108
Project size (small or large) -0.078
Delay in design and approval -0.284
Risk retained by client for quantity variations 0.080
Drawings (detailed or not) 0.184
Inaccurate material estimating 0.080

Squared Multiple R =0.83
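The way the coefficients in Table 3 turn a set of yes/no factors into a predicted overrun can be sketched in a few lines. The coefficients and the 0/1 values below are the ones the paper gives for its Section 7 example project; the dictionary layout, the short factor keys (F1 to F11, following the numbering of Table 1), and the function name are illustrative assumptions.

```python
# Constant and coefficients as listed in Table 3.
CONSTANT = 0.214
COEFFICIENTS = {
    "F1_financial_condition_of_owner": 0.046,
    "F2_cash_flow_of_contractor": 0.201,
    "F3_method_of_procurement": 0.345,
    "F4_material_cost_increase_inflation": -0.177,
    "F5_competition_at_tender_stage": -0.197,
    "F6_currency_fluctuations": -0.108,
    "F7_project_size": -0.078,
    "F8_delay_in_design_and_approval": -0.284,
    "F9_risk_retained_by_client": 0.080,
    "F10_drawings": 0.184,
    "F11_inaccurate_material_estimating": 0.080,
}

# 0/1 values reported in the paper for the example project (hotel in Egypt).
example_project = {
    "F1_financial_condition_of_owner": 1,
    "F2_cash_flow_of_contractor": 1,
    "F3_method_of_procurement": 1,
    "F4_material_cost_increase_inflation": 1,
    "F5_competition_at_tender_stage": 0,
    "F6_currency_fluctuations": 1,
    "F7_project_size": 1,
    "F8_delay_in_design_and_approval": 1,
    "F9_risk_retained_by_client": 1,
    "F10_drawings": 0,
    "F11_inaccurate_material_estimating": 0,
}


def predict_cost_overrun(project: dict[str, int]) -> float:
    """Evaluate the linear model: constant + sum(coefficient * 0/1 value)."""
    return CONSTANT + sum(
        coef * project[name] for name, coef in COEFFICIENTS.items()
    )


predicted = predict_cost_overrun(example_project)
print(f"Predicted cost overrun: {predicted:.3f} ({predicted:.1%})")
# Predicted cost overrun: 0.239 (23.9%)
```

The result reproduces the 23.9% overrun that the paper derives by hand for the same project.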


Financial condition of the owner was bad (1); cash flow of contractor was bad (1); method of procurement was open tender (1); material cost increase due to inflation occurred (1); competition at tender stage was not aggressive (0); fluctuations in the currency that the payment will be made occurred (1); project size was small (1); delay in design and approvals occurred (1); there was a risk retained by the client for quantity variations (1); drawings were detailed (0); material estimating was accurate (0).

The predicted cost overrun percentage is obtained as follows:

Cost overrun percentage
= 0.214 + 0.046*1 + 0.201*1 + 0.345*1 - 0.177*1 - 0.197*0 - 0.108*1 - 0.078*1 - 0.284*1 + 0.08*1 + 0.184*0 + 0.08*0
= 0.239.

This result means that the predicted cost overrun percentage is 23.9%. This model will be validated later using the second set of projects.

8. Case-Based Reasoning Model

Based on previous cases (the first set of projects), a case base is developed. Then, those cases that are similar to the new cases are retrieved from the case base in order to estimate the cost overrun percentage of the new cases. To retrieve similar cases, the similarity values are calculated by multiplying each similarity value (I_11, I_12, I_13, ..., I_1j, ..., I_m1, I_m2, I_m3, ..., I_mj) of each attribute (factor) for a case in the case base and the new case by the corresponding attribute weight (W_1, W_2, W_3, ..., W_j) and then summing all of them. The weights of attributes are variables. The case with the highest similarity value is used to estimate the cost overrun percentage of the new case. In the following subsections, these processes are described in detail.

8.1. Case Representation and Attributes

Each case is represented by the attributes identification and the dependent variable (percentage of cost overrun). The attributes are the previous 11 factors that impact the cost of construction projects and are included in the regression model. The attributes are used in calculating the degree of similarity between a new case (test case) and each case in the case base. In the current research, all the attributes are of nominal scale, since each attribute is assigned a value of one if it occurred, otherwise zero. Thus, the similarity value of an attribute is assigned one if its value in a case of the case base is the same as its value in the test case, otherwise zero. Also, three methods are used for calculating the attribute weights: feature counting, the standardized coefficient (β) of MRA, and the absolute value of the standardized coefficient (β), for the purpose of comparison to improve the prediction capacity. It must be noted that the standardized coefficient (β) is a regression coefficient in a standard format as given in the results of SPSS software. Also, if this coefficient is used to calculate attribute weights in the case based reasoning model, MRA must be run first. The feature counting method is used for calculating attribute weights without running MRA. The similarity value of each attribute is multiplied by its weight, obtained from any of the three previous methods, and summed to obtain the similarity value of each case (see Eq. 2).

8.2. Matching and Retrieval

For each new case whose percentage cost overrun is to be estimated, the most similar case (the case with the highest similarity value in the case base) is retrieved. If more than one case in the case base has the same similarity value, the author suggests using the mean value of the cost overrun percentage for these cases.

8.3. Adaptation

In this study, all the attributes are of nominal scale, thus no adaptation is used to adjust the cost overrun percentage for the new cases.

8.4. Case Retaining

In this research, the new cases are retained for future use, i.e., in finding the solution of the second new case, the first new case is used among the cases of the case base. Also, in finding the solution of the third new case, the first and second cases are used among the cases of the case base, and so on.

9. Example Project

To show how the CBR model performs, an example project is solved step by step to predict the cost overrun percentage for this project. This project is the project previously used in predicting cost overrun with the regression based model. The values of the different factors are given in Table 4, such that if the factor occurred it is assigned a value of 1, otherwise it is assigned zero.

Calculating the attributes similarity

Table 4 shows the data for 20 actual construction projects obtained from the questionnaire (the first set of projects). These data are the attribute (factor) values according to their occurrence and the corresponding actual percentage of cost overrun. The attribute F1 is assigned a similarity value of one, as its value in the example project is the same as that of case 1. On the other hand, F5 (for example) in case 1 is assigned a value of one, whereas it is assigned a value of zero in the example project; accordingly, its similarity value is zero. The similarity values of all other attributes in case 1 are given in Table 4. Other cases are calculated as presented in the example project.

Calculating attributes' weights

Table 5 shows the previously mentioned three methods for calculating attribute weights. In the feature counting method, each attribute receives a weight of (1/11 = 0.0909). In the second and third methods, the standardized coefficient (β) and its absolute value resulting from the regression model are
102 A. M. El-Kholy: Predicting Cost Overrun in Construction Projects

used as weights to the attributes (see Table 5). =-0.062


Similarity value of each case Calculating similarity value for all cases in case base and
Applying Eq. 2, the total similarity value for each case in test case using [standardized coefficient (β)] for weighting
case base and test case can be calculated by multiplying attributes, (for example), revealed that case 3 received the
similarity value of each attribute by its weight. The total highest value (1.656). Thus, case 3 is retrieved and the
similarity value (S1) between case 1 and test case (example predicted value of cost overrun percentage for the example
project) using the second method (standardized coefficient project is 15% (actual value for case 3).
(β)) for weighting attributes (for example), is calculated as Case reataining
follows: The new case, which consists of attributes of example
S1=1*0.127+1*0.582+1*0.998-*1*0.513-0*0.596 project and 15% cost overrun percentage is retained for
-1*0.328-1*0.238-1*0.69+0*0.232+0*0.398+0*0.238 future use in addition to cases of case base.
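Table 4. Profile of Cases for Case Base and Test Case (Example Project)

| Case base No. | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | Actual Cost Overrun (fraction, 0.20 = 20%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0.20 |
| 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0.25 |
| 3 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0.15 |
| 4 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0.55 |
| 5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 |
| 6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0.15 |
| 7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.40 |
| 8 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0.15 |
| 9 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0.75 |
| 10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.15 |
| 11 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.20 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0.05 |
| 13 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0.35 |
| 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0.35 |
| 15 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 |
| 16 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0.15 |
| 17 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0.25 |
| 18 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0.30 |
| 19 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0.35 |
| 20 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0.40 |
| Example Project | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0.25 |
| Attr. Sim. for case 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | - |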
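
The attribute matching and feature counting retrieval described for Table 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the names `CASE_BASE`, `EXAMPLE_PROJECT`, `attribute_similarity`, `feature_counting_similarity`, and `retrieve` are assumptions introduced here, and the data are copied from Table 4.

```python
# Case base from Table 4: case number -> (F1..F11 flags, actual cost overrun as a fraction).
CASE_BASE = {
    1: ([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1], 0.20),
    2: ([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1], 0.25),
    3: ([1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1], 0.15),
    4: ([1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0], 0.55),
    5: ([0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], 0.05),
    6: ([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0], 0.15),
    7: ([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 0.40),
    8: ([0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0], 0.15),
    9: ([1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1], 0.75),
    10: ([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 0.15),
    11: ([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 0.20),
    12: ([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0], 0.05),
    13: ([1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0], 0.35),
    14: ([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], 0.35),
    15: ([1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0], 0.25),
    16: ([0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0], 0.15),
    17: ([1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0], 0.25),
    18: ([0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1], 0.30),
    19: ([1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0], 0.35),
    20: ([1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0], 0.40),
}
EXAMPLE_PROJECT = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0]


def attribute_similarity(case, test):
    """Nominal attributes: similarity is 1 where the values match, else 0."""
    return [1 if c == t else 0 for c, t in zip(case, test)]


def feature_counting_similarity(case, test):
    """Equal weights of 1/n, so the score is the share of matching attributes."""
    sims = attribute_similarity(case, test)
    return sum(sims) / len(sims)


def retrieve(case_base, test):
    """Return (case number, similarity, overrun) for the most similar stored case."""
    best = max(case_base, key=lambda k: feature_counting_similarity(case_base[k][0], test))
    attrs, overrun = case_base[best]
    return best, feature_counting_similarity(attrs, test), overrun


sim_case1 = attribute_similarity(CASE_BASE[1][0], EXAMPLE_PROJECT)
best_case, best_score, predicted = retrieve(CASE_BASE, EXAMPLE_PROJECT)
print(sim_case1)                                  # last row of Table 4
print(best_case, round(best_score, 4), predicted)
```

Under these assumptions the sketch reproduces the "Attr. Sim. for case 1" row, and feature counting retrieves case 4 (10 of 11 attributes match), whose overrun of 0.55 agrees with the value of 55 listed for the example project under feature counting in Table 6.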

Table 5. Attribute Weights by Method of Weighting

| No. | Attribute | Feature Counting | Standardized Coefficient (β) | Absolute Standardized Coefficient (β) |
|---|---|---|---|---|
| F1 | Financial condition of the owner | 0.0909 | +0.127 | 0.127 |
| F2 | Cash flow of contractor | 0.0909 | +0.582 | 0.582 |
| F3 | Method of procurement (open tender or selective tender) | 0.0909 | +0.998 | 0.998 |
| F4 | Material cost increase due to inflation | 0.0909 | -0.513 | 0.513 |
| F5 | Competition at tender stage (aggressive or not) | 0.0909 | -0.596 | 0.596 |
| F6 | Fluctuations in the currency that the payment will be made | 0.0909 | -0.328 | 0.328 |
| F7 | Project size (small or large) | 0.0909 | -0.238 | 0.238 |
| F8 | Delay in design and approval | 0.0909 | -0.690 | 0.690 |
| F9 | Risk retained by client for quantity variations | 0.0909 | +0.232 | 0.232 |
| F10 | Drawings (detailed or not) | 0.0909 | +0.398 | 0.398 |
| F11 | Inaccurate material estimating | 0.0909 | +0.238 | 0.238 |
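
As a hedged illustration of how the weights in Table 5 combine with the attribute similarities of Table 4, the following Python sketch recomputes the total similarity between case 1 and the example project. The names `BETA`, `SIM_CASE1`, and `total_similarity` are assumptions introduced for this sketch; the weighted sum itself is the calculation described in the text.

```python
# Standardized coefficients (beta) for F1..F11, copied from Table 5.
BETA = [0.127, 0.582, 0.998, -0.513, -0.596, -0.328,
        -0.238, -0.690, 0.232, 0.398, 0.238]

# Attribute similarity of case 1 to the example project (last row of Table 4).
SIM_CASE1 = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]


def total_similarity(similarities, weights):
    """Weighted sum of attribute similarities: sum of I_j * W_j over all attributes."""
    return sum(s * w for s, w in zip(similarities, weights))


# Second method: signed standardized coefficients as weights.
s1_beta = total_similarity(SIM_CASE1, BETA)

# Third method: absolute values of the standardized coefficients.
s1_abs = total_similarity(SIM_CASE1, [abs(b) for b in BETA])

# First method: feature counting, every attribute weighted 1/11.
s1_count = total_similarity(SIM_CASE1, [1 / 11] * 11)

print(round(s1_beta, 3))   # -0.062, the value of S1 given in the worked example
print(round(s1_abs, 3))
print(round(s1_count, 4))
```

Because the signed coefficients can be negative, a matching attribute may lower the total similarity under the second method, which is one way to see why the three weighting methods can retrieve different cases.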

Table 6. Models Validation (Percent Cost Overrun Output)

| Case Project | Project Actual | Regression Model | CBR: Feature Counting | CBR: Standardized Coefficient (β) | CBR: Absolute Standardized Coefficient (β) |
|---|---|---|---|---|---|
| Example Problem | 25 | 23.9 | 55 | 15 | 55 |
| 2 | 40 | 22.6 | 22.5 | 75 | 23.3 |
| 3 | 30 | 22.6 | 22.5 | 75 | 23.3 |
| 4 | 15 | 9.8 | 25 | 35 | 25 |
| 5 | 10 | 15.8 | 15 | 25 | 15 |
| 6 | 45 | 30.6 | 25 | 75 | 25 |
| 7 | 25 | 27.3 | 20 | 15 | 25 |
| 8 | 20 | 30.6 | 25 | 75 | 25 |
| 9 | 45 | 34 | 40 | 75 | 35 |
| 10 | 60 | 49.5 | 42.5 | 35 | 40 |
| Average % Error | 0.00 | 34.8 | 45.9 | 58.2 | 40.7 |

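$$\text{Average \% Error} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|X_{\text{actual},i} - X_{\text{estimated},i}\right|}{X_{\text{estimated},i}} \times 100$$

where $n$ is the number of validation projects.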
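
The average percent error can be checked against the regression column of Table 6 with a short Python sketch. The function name `average_percent_error` and the list names are assumptions introduced here, and the reading of the formula (absolute difference, divided by the estimated value, averaged over the projects) is also an assumption about the flattened formula; it is adopted because it reproduces the regression figure reported in the table.

```python
def average_percent_error(actual, estimated):
    """Mean of |actual - estimated| / estimated * 100 over all projects."""
    errors = [abs(a - e) / e * 100 for a, e in zip(actual, estimated)]
    return sum(errors) / len(errors)


# Actual and regression-predicted cost overrun percentages from Table 6
# (example problem first, then projects 2 to 10).
ACTUAL = [25, 40, 30, 15, 10, 45, 25, 20, 45, 60]
REGRESSION = [23.9, 22.6, 22.6, 9.8, 15.8, 30.6, 27.3, 30.6, 34, 49.5]

regression_error = average_percent_error(ACTUAL, REGRESSION)
print(round(regression_error, 1))  # 34.8, as reported for the regression model
```

Dividing by the estimated value rather than the actual value penalizes under-prediction more heavily, so a single low estimate (such as 9.8 against an actual 15) contributes a large share of the average.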

10. Models Validation

A comparison between the regression model estimate and the case based reasoning model estimate is shown in Table 6. It provides the actual cost overrun percentage, the predicted cost overrun percentage, and the average percent error for 10 projects including the example project (these are the projects of the second set). In the case based reasoning model, three methods for calculating attribute weights were used, as presented previously. In general, the regression based model shows better prediction accuracy than the case based reasoning model. The average % error is 34.8 for the regression based model, whereas this percentage varies for the CBR model according to the weight assignment method for the attributes. The best results for the CBR model are obtained when applying the absolute standardized coefficient (β) as the weight assignment method (average % error = 40.7). This percentage is 45.9 for the feature counting method and 58.2 when applying the original value of the standardized coefficient (β).

11. Conclusions and Future Recommendations

This paper investigated the effect of causes of cost overrun affecting construction projects through a questionnaire survey. These causes were established from the literature. The questionnaire survey used a structured format to obtain information related to the occurrence of the previous causes in actual projects on a yes/no basis. Based on the results of the questionnaires, a relative importance weight was established for each cause to quantify its effect on project cost performance. Causes that received a relative importance weight higher than 10 were considered significant and were incorporated into the model as independent variables. Accordingly, 11 significant causes were identified. The dependent variable was the cost overrun percentage.

Two models were developed to predict cost overrun percentage in construction projects. The first model is based on regression analysis. Data of 20 projects was used for model building, while the data of the remaining 10 projects was used for validation purposes. The best model found to be accurate in predicting cost overrun percentage contains the previous 11 causes. These are: financial condition of the owner, cash flow of contractor, method of procurement (open tender or selective tender), material cost increase due to inflation, competition at tender stage (aggressive or not), fluctuations in the currency in which the payment will be made, project size (small or large), delay in design and approval, risk retained by client for quantity variations, drawings (detailed or not), and inaccurate material estimating.

The second model is based on case based reasoning. Validation of the two models revealed that the regression model has higher prediction capabilities than the CBR model in predicting cost overrun percentage for construction projects. On the other hand, testing the case based reasoning model's effectiveness with respect to the weight assignment method for attributes revealed that the best results are obtained when applying the absolute standardized coefficient (β). The feature counting method gave better results than the original value of (β). This research provides an approach for industry practitioners to predict cost overrun percentage for construction projects. It also provides researchers with a methodology to build regression and case based reasoning models for cost overrun percentage prediction. Computer implementation of the case based reasoning model is suggested for future research, for ease of implementation.
Eng., 38, 570-581, 2011.
[16] Ryu, H. "Construction Planning Methodology Using
Case-Based Reasoning (COPLA-CBR)" Ph.D., thesis, Seoul,
National Univ., Seoul. Korea, 2007.
REFERENCES
[17] Duverlie, P., and Castelain, J.M., "Cost Estimation during
[1] Oberlender, G.D., and Trost, S.M. "Predicting Accuracy of Design Step: Parametric Method Versus Case Based
Early Cost Estimates Based on Estimate Quality." J. of Constr. Reasoning Method" Adv. Manuf. Technol. 15 (12),1999.
Eng. and Manage., 127, 173-182, 2001.
[18] Kholif, W., Hosny, H., and Sanad, A. "Analysis of Time and
[2] Kim, K. J., and Kim, K." Preliminary Cost Estimation Model Cost Overruns in Educational Building Projects in Egypt" Int.
Using Case-Based Reasoning and Genetic Algorithms" J. J. of Eng. and Technical Research (IJETR), ISSN: 2321-0869,
Comput. Civ. Eng., 24 (6), 499-505, 2010. 10(1), 2013.
[3] Koo, C.W., Hong, T., Hyun, C., and Koo, K. "A CBR-Based [19] Aziz, R., F., "Factors Causing Cost Variation for
Hybrid Model for Predicting a Construction Duration and Constructing Wastewater Projects in Egypt" Alexandria Eng.
Cost Based on Project Characteristics in Multi-Family J., 52, 51–66, 2013.
Housing Projects" Can. J. Civ. Eng., 37, 739-752, 2010.
[20] Pal, S.K., and Shiu, S.C.K. "Foundations of Soft Case-Based
[4] Kangari, R. "Risk Management Perceptions and Trends of Reasoning" Wiley, Hoboken, N.J., 2004.
U.S. Construction." J. of constr. Eng. and Manage., 121(4),
422-429, 1995. [21] Karshenas, S., and Tse, J. "A Case –Based Reasoning
Approach to Construction Cost Estimating" J. Comput. in
[5] Ibbs, W.C., and Ashley, D.B., "Impact of Various Civil Eng., 113-123, 2002.
Construction Contract Clauses." J. of Constr. Eng. and
Manage., 113(3), 501-521,1987. [22] Chua, D.K.H, and Loh, P.K. "CB- Contract: Case Based
Reasoning Approach to Construction Contract Strategy
[6] Sonmez, R., Ergin, A., and Birgonul. T. "Quantitative Formulation" "J. Comput. in Civil Eng., 20 (5), 339-350,
Methodology for Determination of Cost Contingency in 2006.
International Projects", J. of Manage. in Eng., 23,(1), 35-39,
2007. [23] Yi, J. "A Study on Case- Based Forecasting Model for
Monthly Expenditures of Residential Building Project"
[7] Trost, S., M., and Oberlender, G., D. "Predicting Accuracy of Korean J. of Constr. Eng. and Manage., 79 (1), 128-137,
Early Cost Estimates Using Factor Analysis and multivariate 2006.
Regression", J. of constr. Eng. and Manage., 129 (2), 198-204,
2003. [24] An, S. H., Kim G., and Kang, K." A Case Based Reasoning
Cost Estimating Model Using Experience by Analytic
[8] Abu Hammad, A. A., Ali, S. M. A., Sweis, G., J., and Basher, Hierarchy Process" Building and Environment, 42 (7),
A." Prediction Model for Construction Cost and Duration in 2573-2579, 2007.
Jordan", Jordan J. of Civil Engineering, 2(3), 250-266, 2008.
[25] Kim, G., Kim, S., and Kang, K. "Comparing Accuracy of
[9] Lowe, D.J., Emsley, M.W., and Harding, A. "Predicting Prediction Cost Estimation Using Case-Based Reasoning and
Construction Cost Using Multiple Regression Techniques" J. Neural Networks" J. Architectural Institute of Korea, 20, (5),
of Constr. Eng. And Manage., 132(7), 750-758,2006. 93-102, 2004.
[10] Phaobunjong, K. "Parametric Cost Estimating Model for [26] Burroughs and Juntima "Exploring Techniques for
Conceptual Cost Estimating of Building Construction Contingency Setting" AACE Transactions, 2004.
Projects", Ph.D. thesis, University of Texas, Austin, TX,
2002. [27] Iyer, K.C., and Jha, K.N." Factors Affecting Cost
Performance: Evidence from Indian Construction Projects"
[11] Attala, M., and Hegazy, T., "Predicting Cost Deviation in Intern. J. of Project Management, 23: 283-295, 2005.
Reconstruction Projects: Artificial Neural Networks Versus
Regression" J. of Constr. Eng. and Manage., 129 (4), 405-411, [28] Touran, A." Probabilistic Model for Cost Contingency" J. of
2003. Constr. Eng. And Manage., 129 (3), 280-284, 2003.
[12] Dogan, S.Z, Arditi, D., and Gunaydin, H.M. "Determining [29] Dissanayaka, S.M, and Kumaraswamy, M.M "Comparing
Attribute Weights in A CBR Model for Early Cost Prediction Contributors to Time and Cost Performance in Building
of Structural System" J. of Constr. Eng. and Manage., Projects, Building and Environment, 34, 31-42, 1999a.
132(10), 1092-1098, 2006.
[30] Dissanayaka, S.M, and Kumaraswamy, M.M "Evaluation of
[13] Hegazy, T., and Ayed "Neural Network Model for Parametric Factors Affecting Time and Cost Performance in Hong Kong
Cost Estimation of Highway Projects" J. of Constr. Eng. and Building Projects" Engineering, Construction and
Manage., 124 (3), 210-218,1998. Architectural Management, 6(3), 287-298,1999b.

[14] Nassar, K.M., Gunnarsson, H.G., and Hegab, M.Y. "Using [31] Bacon, R.R. and Besant, J "Estimating Construction Costs
Weibull Analysis for Evaluation of Cost and Schedule and Schedules: Experience with Power Generation Projects in
International Journal of Construction Engineering and Management 2015, 4(4): 95-105 105

Developing Countries, Energy Policy, 26(4), 317-333, 1998. [35] Israel, G.D." Determining Sample Size" Agricultural
Education and Communication Department, Florida
[32] Kaming, P. F., Olomolaiye, P. Holt, G., and Harris F. "Factors Cooperative Extension Service, Institute of Food and
Influencing Construction Time and Cost Overruns on Agricultural Sciences, University of Florida, 1992.
High-Rise Projects in Indinesia, 1997.
[36] Cochran, W.G. "Sampling Techniques", 2nd Ed., New York:
[33] Akinsola, A.O., Potts, K.F., Ndekugri, I., and Harris John Wiley and Sons, Inc, 1963.
F.C."Identification and evaluation of Factors Influencing
Variations on Building Projects" Intern. J. of Project [37] Alreck, P. L., and Settle, R. B. "The survey research
Management, 15(4), 263-267, 2005. handbook.’’ Richard D. Irwin, Inc., Homewood, Ill, 1985.
[34] Miaoulis, G. and Michener, R.D. "An Introduction to [38] Oppenheim, A. N. "Questionnaire Design, Interviewing, and
Sampling". Dubuque, Iowa: Kendall/Hunt Publishing Attitude Measurement". Pinter publisher, London, 1992.
Company, 1976.
