where $X_1, \ldots, X_{p-1}$ are $p-1$ predictor variables. Therefore, the definition (1) implies that the observations $Y_i$ are independent normal variables, with mean $E[Y_i]$ as given by (2) and constant

It is a special case of (1), because it can be written as

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + e_i \qquad (10)$$

where $X_{i1} = X_i$, $X_{i2} = X_i^2$, and $X_{i3} = X_i^3$.
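Because (10) is linear in the coefficients, a cubic polynomial can be fitted with ordinary least squares on a design matrix whose columns are the powers of the predictor. The following is a minimal sketch of that idea, assuming NumPy is available; the function name `fit_cubic_as_linear` and the synthetic data are illustrative and are not taken from the paper.

```python
import numpy as np


def fit_cubic_as_linear(x, y):
    """Fit y = b0 + b1*x + b2*x^2 + b3*x^3 by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column, then X_i1 = x, X_i2 = x^2, X_i3 = x^3.
    design = np.column_stack([np.ones_like(x), x, x**2, x**3])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Synthetic, noise-free data (hypothetical values, not from the case study).
x = np.linspace(-2.0, 2.0, 25)
true_beta = np.array([1.0, -0.5, 0.25, 2.0])
y = true_beta[0] + true_beta[1] * x + true_beta[2] * x**2 + true_beta[3] * x**3

beta_hat = fit_cubic_as_linear(x, y)
```

With noise-free data the estimated coefficients should match the ones used to generate `y` up to floating-point error, which illustrates that no nonlinear fitting is needed for a polynomial model.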
of hourly load (in kW) and temperature (in °F) data from a medium utility are used in the case study. The first three years are used as the history data, while the last year is the holdout sample. Fig. 1 shows the load series from 2005 to 2008, while Fig. 2 shows the corresponding temperature series.

B. Linear Trend

We define a quantitative variable (Trend) to capture the locally increasing (or decreasing) trend by assigning a natural number to each hour in ascending order. For instance, the Trend variable of the first hour in 2005 is 1, the second hour in 2005 is 2, and the last hour of 2008 is 35064. It should be noted that such a trend applies only to utilities with a stable service territory and local economy. The linear trend can be interpreted as a linear approximation to the load series: when the load history is short relative to macroeconomic changes, such as recessions and booms, the overall trend of the load during the interval of interest can be approximated linearly. The benchmarking model discussed in this paper is not directly applicable to significant business events, such as merging two utilities or splitting one utility into two.

C. Temperature

The relationship between load and temperature has been intensively studied during the past several decades. For instance, a piecewise linear function was indicated in [2]; a piecewise quadratic function was used in [5]; third-order polynomials were suggested in [3]. Fig. 3 shows the load-temperature scatter plot of the utility in this study. The piecewise functions need the cut-off temperature(s), which may not be exactly the same in different service territories. The southern part of the United States may have a different cut-off temperature from the northern part, because the comfortable temperature zone may be different for people living in different regions. Therefore, third-order polynomials of the temperature, which need no cut-off temperature, are used to predict the load for benchmarking purposes.

D. Calendar Variables

It is well known that there are three seasonal blocks in the load series: day, week, and year. Each block can be treated differently depending upon the load consumption behavior of the particular service territory. For example, the 7 days of a week can be modeled by a qualitative variable with 2 classes (weekdays and weekends), 3 classes (weekdays, two weekend classes), etc. In different countries, the weekends may be defined differently. For instance, Thursday and Friday are weekends in Iran. As a benchmarking model, the qualitative variables (Hour, Day, and Month) with 24, 7, and 12 classes are used to model the 24 hours of a day, 7 days of a week, and 12 months of a year, respectively. Namely, each seasonal block is decomposed into units of the highest resolution.
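The Trend variable and the calendar class variables (Hour, Day, Month) can all be derived from the timestamp of each hourly observation. The sketch below uses only the Python standard library; the function name `calendar_features`, the 0-based coding of Hour and Day, and the assumption of a contiguous hourly series with no daylight-saving gaps are illustrative choices, not details given in the paper.

```python
from datetime import datetime, timedelta

START = datetime(2005, 1, 1, 0)    # first hour of the 2005-2008 data set
END = datetime(2008, 12, 31, 23)   # last hour of the 2005-2008 data set


def calendar_features(timestamp):
    """Return (Trend, Hour, Day, Month) for one hourly timestamp.

    Assumes a contiguous hourly series starting at START.
    """
    hours_since_start = int((timestamp - START).total_seconds() // 3600)
    trend = hours_since_start + 1   # natural number; the first hour is 1
    hour = timestamp.hour           # 24 classes, coded 0..23
    day = timestamp.weekday()       # 7 classes, 0 = Monday .. 6 = Sunday
    month = timestamp.month         # 12 classes, coded 1..12
    return trend, hour, day, month


first = calendar_features(START)
second = calendar_features(START + timedelta(hours=1))
last = calendar_features(END)
```

Under these assumptions, the Trend value of the last hour of 2008 comes out as 35064, in agreement with the figure quoted in the text (1461 days of 24 hours each, including the leap day in 2008).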
E. Interaction Effects

Normally the afternoon is warmer than midnight, and summer is warmer than winter. In other words, temperature is not independent of the hour of the day or the month of the year. Therefore, the interaction effects between the temperature (in the form of the third-order polynomial discussed above) and the calendar variables Hour and Month should be in the model. Since there is no evidence of a relationship between the day of the week and the temperature, the interaction between temperature and day of the week is not included in the model.

The same hour on different days of the week may show a different load due to human activities. For instance, the load may be lower on weekend mornings than on other mornings, because people do not have to get up as early as on weekdays to go to work, which results in less load at home and in office buildings on weekend mornings. Therefore, the interaction effect between Hour and Day should be included in the benchmark model.

When a qualitative predictor variable interacts with a quantitative one, the quantitative term is not required to appear in the model as a single regressor (main effect). When two qualitative predictor variables interact, neither of them is required in the model as a main effect. Therefore, the benchmarking model GLMLF-B includes the following:
1) Quantitative variables: Trend, and TMP (the current hour temperature);
2) Class variables: Hour, Day, Month;
3) Main effects: Trend, Month;
4) Interaction effects (also known as cross effects): Day×Hour, Month×TMP, Month×TMP², Month×TMP³, Hour×TMP, Hour×TMP², Hour×TMP³, where the cross sign represents the interaction effect;
5) Intercept.

The model can be written as:

$$E(\text{Load}) = \beta_0 + \beta_1\,\text{Trend} + \beta_2\,\text{Day}\times\text{Hour} + \beta_3\,\text{Month} + \beta_4\,\text{Month}\times\text{TMP} + \beta_5\,\text{Month}\times\text{TMP}^2 + \beta_6\,\text{Month}\times\text{TMP}^3 + \beta_7\,\text{Hour}\times\text{TMP} + \beta_8\,\text{Hour}\times\text{TMP}^2 + \beta_9\,\text{Hour}\times\text{TMP}^3 \qquad (16)$$

With the above information, the model GLMLF-B as specified in (16) can be easily implemented in commercial statistical software packages, such as SAS 9.2 with STAT.

IV. RESULTS AND DISCUSSIONS

This section presents the forecasting performance of the above benchmarking model GLMLF-B. The following error analysis is often used in forecasting:
1) Error, the difference between a quantity and its estimated or measured value.
2) Absolute Error, the absolute value of the error.
3) Absolute Percentage Error, 100% times the absolute value of the relative error (the error divided by the true value).

The distribution of the above errors can be characterized by the mean, standard deviation, minimum, Q1 (first quartile), median, Q3 (third quartile), and maximum values.

Several engineering concepts are involved when load forecasts are communicated:
1) Hourly load, the energy consumed in an hour. Sometimes it is calculated by averaging several instantaneous measurements taken within the hour.
2) Energy, the summation of the hourly load within a specific period.
3) Peak/valley load, the maximum/minimum of the hourly load within a specific period.
4) Peak/valley hour load, the load during the hour in which the actual peak/valley load occurs.

TABLE I
RESULTS (MAPE, %) OF THE NAÏVE MODEL

| Forecasting horizon (# of days) | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Hourly load | 4.98 | 5.00 | 5.01 | 5.01 | 5.02 | 5.03 | 5.04 |
| Daily peak hour | 4.27 | 4.29 | 4.29 | 4.29 | 4.30 | 4.30 | 4.30 |
| Daily valley hour | 5.44 | 5.46 | 5.47 | 5.49 | 5.50 | 5.50 | 5.52 |
| Daily peak | 3.94 | 3.96 | 3.97 | 3.97 | 3.98 | 3.99 | 3.99 |
| Daily valley | 4.93 | 4.95 | 4.96 | 4.98 | 4.99 | 5.00 | 5.02 |
| Daily energy | 3.49 | 3.51 | 3.52 | 3.53 | 3.53 | 3.54 | 3.55 |
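The error measures and load quantities defined in Section IV are simple to compute once actual and forecast hourly loads are aligned. The following standard-library sketch shows one way to do so; the function names and the four-hour sample series are hypothetical and chosen only for illustration, and the paper does not prescribe this implementation.

```python
from statistics import mean


def absolute_percentage_errors(actual, forecast):
    """APE in percent for each hour: 100 * |actual - forecast| / |actual|."""
    return [100.0 * abs(a - f) / abs(a) for a, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Mean absolute percentage error over all hours, in percent."""
    return mean(absolute_percentage_errors(actual, forecast))


def period_quantities(hourly_load):
    """Energy, peak load, and valley load of one period of hourly loads."""
    return {
        "energy": sum(hourly_load),
        "peak": max(hourly_load),
        "valley": min(hourly_load),
    }


def peak_hour_load(actual, forecast):
    """Forecast load during the hour in which the actual peak occurs."""
    peak_index = max(range(len(actual)), key=actual.__getitem__)
    return forecast[peak_index]


# Hypothetical hourly loads in kW (made-up values, not from the case study).
actual = [100.0, 200.0, 400.0, 250.0]
forecast = [110.0, 190.0, 380.0, 250.0]

sample_mape = mape(actual, forecast)
sample_quantities = period_quantities(actual)
sample_peak_hour_forecast = peak_hour_load(actual, forecast)
```

Comparing `peak_hour_load(actual, forecast)` with the actual peak gives the peak hour error, whereas comparing `max(forecast)` with `max(actual)` gives the peak error; the two differ whenever the forecast places the peak in a different hour.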
