Auto-Regression and Distributed Lag Models


SHRI PRAKASH, BIMTECH

OBJECTIVES OF LEARNING

The objectives of this module are: (1) knowing the following concepts: (i) model and modeling in the context of time series modeling; (ii) the auto-regressive model, covering the general auto-regressive model with infinite lags and the first and second order auto-regressive models; and (iii) Auto-Regressive Integrated Moving Average modeling; (2) knowing the concepts of (i) distributed lag models, and (ii) distributed lag models as a part of auto-regressive models; (3) the reasons for incorporation of time-lags in time series models; (4) Koyck's distributed lag model specification and interpretation, and Marc Nerlove's and Shirley Almon's specifications and interpretations; and (5) finally, the problems and procedures of estimation of auto-regressive and distributed lag models.

1. INTRODUCTION

Social, political and economic systems are mostly resistant to change. Even if the systems change, the change is generally low and slow. There is an in-built inertia which resists change. Most systems therefore change ever so slowly; the process of change is gradual. Gradualness of change ensures that future values remain entrapped in past values, though the immediately preceding value remains most relevant for each succeeding value.

Some Noteworthy Features

(i) Socio-economic systems are whirlpools of inter-dependencies; this makes discerning and detection of causal relations intricate and complex.

(ii) In time series analysis, practically all variables move with time, directly or inversely.

(iii) Time may be treated as a catch-all variable in dynamic systems; it may be treated as a proxy of any and every variable that is subject to change.

These are the traits of time series, which involves and treats time as a variable in dynamic models. Such models are considered dynamic, since the values of the variable trace the nature and time path of change. Change characterizes only dynamic systems; static systems do not depict historically repeating or once-for-all change. Dynamism makes change a continuum in time. This trait of the systems and their values, reflected in the time series of observed values, makes forecasting of the future manageable despite the complexities and uncertainty of the direction and magnitude of change.

Therefore, a simple single/multiple equation model without the backing of intricate theory may efficiently perform the function of describing the observed behavior and expected change in future values of a single variable, or a set of variables, in terms of its/their own values observed in the past. This belief is strengthened by the fact that large scale simultaneous multiple equation macro-econometric models have not served the purpose of forecasting as efficiently as simple single equation time series models have done. This observation may, however, not apply to returns as the base of investment in MFs and equities; their prices and returns are generally volatile. Incidentally, Campbell et al. suggest the use of returns rather than prices, since they consider returns to investment relatively non-volatile and to possess some beneficial attributes. The view is misplaced, since returns are directly related to prices, which fluctuate a lot from one time period to another (see Prakash and Panigrahi, 2007).

Single or multiple-variable time series models involve compact specification and only a few parameters for estimation. The limited number of parameters not only makes estimation simple and easy, it also saves time and effort in analysis. This facet depends upon the number of exogenous or endogenous variables used in the model. Time series models have the following advantages: (i) accurate forecasts of future values by the modeling of time series rather than the use of elaborate and large scale macro-econometric modeling; and (ii) close approximation to the empirical observations by interpolation and experience.
These two properties have made Box-Jenkins (1984) type single variable time series models popular among researchers.

2. FORECASTING AND AUTO-REGRESSION MODEL

Forecasting of future values of the variable(s) under consideration is one of the important functions of time series analysis. But forecasting cannot be done without proper explanation. Explanation of the behavior of observed values and forecasting of future values of the variable(s) require time series based modeling.

Time series is a set of data in which time enters as an independent entity. Time series models are, therefore, defined as dynamic models. We have to examine some basic questions in this context. Some of these questions are listed hereunder.

1. What is a model and what is modeling?
2. What is a time series model?
3. What is auto-regression (AR), or an auto-regressive model (ARM)?
4. What is the role of time in auto-regressive models?
5. How does time enter into regression modeling?

These and some other questions relate to basic concepts, some of which are newly evolved while others are of older vintage. Answers to the above questions will clarify the conceptual framework of time series analysis.

WORKING DEFINITION OF MODEL AND MODELING

Working definitions of model and modeling are discussed here in the context of time series analysis.

MODEL

A model is generally conceived as a miniaturized replica of reality. Replication of reality involves the conceptual elaboration and the specification of the mechanism or process by means of which the observed values can be generated. The time series model assumes that the observed values have been generated by the stochastic process which is embodied in the model. The process of generation of a series of values in the continuum of time is stochastic. Observed values are assumed to be (i) governed by the principle or theory of probability, so that all observed values of the given series constitute part of a specified Probability Distribution Function (PDF); and (ii) the expected value of all probable/possible values that could have materialized at a given point in time with a definite probability. This is explained as follows:

P1t: P10  P11  P12  P13 ... P1s ... P1n
Y1t: Y10  Y11  Y12  Y13 ... Y1s ... Y1n

In the above series of values, Y1s refers to one value of Y that could have materialized, and the probability of its materialization is depicted by P1s. The expected value of Y is given below:

E(Y) = Σs P1s Y1s

The expected value is the average value of the stochastic series of values. The above specification makes time series values emanate from a stochastic process.

MODELING

As against the concept of model, modeling refers to the process/mechanism, or the method of formulation, of the model. Each model encompasses a specific conceptual framework, assumptions and the process of generation of the series of values. The general process of modeling also needs the formulation of a causal relation that may be an essential component of the process of generation of the series of values. The conceptual framework is drawn from probability theory. For example, Markov Chain theory, Classical Probability theory, or Bayesian Probability theory may furnish the conceptual framework. Time series models are invariably stochastic in nature:

(i) If the same stochastic process underlies the generation of several sets of values, observed over a given period of time, each set will be different;

(ii) Observed values of every set will be governed by the same laws of probability distribution that characterize the stochastic process of generation of values at each point in time of the period under consideration.
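As a minimal numerical sketch of the expected-value formula above (the probabilities and values below are illustrative, not taken from the text):

```python
# Expected value of a stochastic series: E(Y) = sum over s of P_1s * Y_1s.
# The probabilities and values are hypothetical, for illustration only.
probabilities = [0.1, 0.2, 0.4, 0.2, 0.1]   # must sum to one (a valid PDF)
values = [10.0, 12.0, 15.0, 18.0, 20.0]     # possible realizations of Y

assert abs(sum(probabilities) - 1.0) < 1e-12

expected_value = sum(p * y for p, y in zip(probabilities, values))
print(expected_value)
```

The expected value here is 15, the probability-weighted average of the possible realizations at the given point in time.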

3. AUTO-REGRESSIVE MODELING

Auto-Regressive (AR) models are mainly a part of time series forecasting modeling. This implies that explanation of the behavior of the variables is a secondary rather than primary function of time series modeling; forecasting is the basic function, though forecasting may involve explanation as well. The term auto-regression is used to describe the stochastic process of generation of a set of observed values of one or several variables in which the current value of the variable under consideration and its random errors are treated as a function of their own lagged value(s).

GENERAL AUTO-REGRESSION MODEL

A general auto-regressive model is a single variable model of infinite lags, which stipulates the current value of the variable, Yt, and its random error, Ut, to depend upon their own past values, Yt-1, ..., Yt-s and Ut-1, ..., Ut-s, where t = 0, 1, ..., T stands for time. In other words, an auto-regressive model treats the current value of the variable and its errors as a function of all their past values alone. Such models do not involve any exogenous variable(s), Xt, as determinants of the dependent variable, Yt. It may, therefore, be inferred that auto-regression models are devoid of any theoretical backing; at most, such models may be considered to have very weak theoretical backing. If, however, such models are treated as difference equation models, which they actually are, then the models may be taken to be encompassed in the Cobweb Theorem. Besides, there is a theory of time being treated as a variable. The theory is embodied in the explanation of the four components of time series: seasonal, cyclical, trend and residual/random. Seasonal variation relates to the behavior of demand in different seasons of the year. Such fluctuations in demand and price are short lived, since the duration of seasons is a couple of weeks/months.

Cyclical fluctuations are of relatively longer duration. There are several competing theories of business cycles, so the explanation of cyclical or historically repeating changes has a theoretical base. Random fluctuations arise from the influence of random factors, but random factors do not have any definite theory for their explanation. The behavior of trend may be examined on the basis of alternative hypotheses which may be derived from empirics. Therefore, the above inference of absence of theory in time series analysis is erroneous.

General Auto-Regression Model

The general auto-regression model is specified as hereunder:

Yt = α + β1 Yt-1 + β2 Yt-2 + ... + βs Yt-s + ... + Ut + V1 Ut-1 + V2 Ut-2 + ... + Vs Ut-s + ... (1)

No specific value is assigned to t; it is neither required, nor feasible in principle, to exactly specify the origin of the process of generation of a time series. Relation 1 is defined as an auto-regression model of infinite lags. If the length of the lag is specified, the model becomes an auto-regression model with finite lags. The infinite lag model possesses certain mathematical properties which give it an advantage in empirical research. The length of the lag appropriate for a given time series is a matter of empirics, though the treatment of time as an independent entity is based on theoretical or a priori considerations.
3.1 FIRST ORDER AUTO-REGRESSIVE MODEL

A first order auto-regressive model is the simplest of this genre of models. Such models treat the current value as depending upon its immediately preceding value alone.

Such models are embedded in the concepts of seasonal and cyclical behavior of the series. In such cases, the preceding value of the variable exercises a more decisive influence than the remote values of the variable. A first order auto-regressive model is a Markov Process model; the first order Markov Process model is a special case of the general Markov Process. The thrust of the argument is that Markov Chain probability theory is the underlying base of such models. The first order auto-regression model is outlined below:

Yt = β0 + β1 Yt-1 + Ut ... (2)

This model contains only one lagged value of Y and U.

3.2 AUTO-REGRESSIVE PROCESS AND GENERAL FORM

The general form of the first order auto-regressive model is the auto-regressive process based model. This model is outlined below:

Yt - μ = U (Yt-1 - μ) + Ut ... (3)

Rearrangement of the terms of 3 will give the following model:

Yt = μ(1 - U) + U Yt-1 + Ut ... (4)

The term auto-regression applies not only to the stochastic process of generation of observed values of Y at different points in time; it also applies to the stochastic process of generation of the error term Ut. As a first order random error, Ut is also specified by an auto-regression model of first order:

Ut = V Ut-1 + εt ... (4a)

V is the coefficient of auto-covariance, or auto-correlation, of Ut at time t. The auto-correlation coefficient V is stipulated to lie in the range -1 < V < +1. The strict inequality sign on both sides implies that V will neither be zero nor one; this is a practical proposition which satisfies the theoretical stipulation. This is the feature of a stationary time series.
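A short simulation may make model 4 concrete. The sketch below assumes illustrative values μ = 10 and U = 0.6 (both hypothetical) and standard normal errors; with |U| < 1 the simulated series fluctuates around μ:

```python
import random

# Simulate the first order auto-regressive model (4):
#   Y_t = mu*(1 - U) + U*Y_{t-1} + u_t
random.seed(0)
mu, U = 10.0, 0.6        # illustrative parameter values; |U| < 1
y = mu                   # start the series at its mean
series = []
for _ in range(5000):
    u_t = random.gauss(0.0, 1.0)      # white-noise error
    y = mu * (1.0 - U) + U * y + u_t
    series.append(y)

sample_mean = sum(series) / len(series)
print(round(sample_mean, 2))  # close to mu
```

The sample mean of the simulated series settles close to μ = 10, which is exactly the stationary behavior described above.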

Like Ut in equation 4, εt in equation 4a of the process of error generation is itself a stochastic error. Like all random errors, εt satisfies the following conditions:

E(εt) = 0; E(εt²) = σε² ≠ 0; E(εt εt-1) = 0

The condition σε² ≠ 0 means that it is not possible to generate values of Ut without an error.

3.3 IMPLICATIONS OF THE AUTO-REGRESSION PROCESS OF FIRST ORDER {AR(1)}

AR(1) embodies the following important implications for the inter-relations between the variance and covariance of Ut and its coefficient of auto-correlation. These relations are derived mathematically:

E(Ut²) = σε² / (1 - V²)
E(Ut Ut-1) = V σε² / (1 - V²)
Cor(Ut, Ut-1) = V

The first and second relations show that the variance and covariance of the error Ut are a function of their correlation. It is this facet which makes auto-correlation of errors important.

FEATURES OF THE AUTO-REGRESSIVE MODEL

Auto-regressive model 4 depicts the following important facets of the stochastic process under consideration:

(1) The relation between Yt and its preceding value Yt-1 is characterized by two parts: an error or random part, shown by Ut, and a systematic part, shown by μ(1 - U) + U Yt-1; obviously, the behavior of the systematic part of the model depends on the mean, μ, and on U, the coefficient of Yt-1.

(2) The systematic relation is linear in variables.

(3) Relation 3 considers the relation between deviations from the common mean, μ, of both the dependent and independent variables. Hence, the function may be treated as linear in one parameter, U, alone.

(4) The two parameters of the model, U and μ, determine the behavior of the systematic part of the values of Yt; and

(5) Uncertainty with regard to precise values of Yt arises from the uncertain or random behavior of Ut. The behavior of Ut is reflected in the behavior of Yt.

Random Behavior of Errors

This warrants specification of the behavior of random errors. Ut is a purely random variable in so far as each value of Ut is drawn randomly and independently of all other values. Hence, the correlation between any two of its values at two different points in time is zero. In other words, random errors are uncorrelated. The random behavior of the error term is constrained to conform to the specified assumptions, given below. Random behavior of Ut involves such features as make its

(i) mean zero;
(ii) variance constant and equal to σu² for all values of t; and
(iii) correlation between any two values of Ut, say Ut and Ut+s, zero.

In a sampling of errors, each error is assumed to be drawn independently of the preceding or succeeding errors, to make the covariance zero. Errors are a part of a randomly drawn sample. Random sampling is characterized by independent trials and equality of probability of each value being included in the sample.

3.4 POSSIBLE SETS OF BEHAVIOR OF PARAMETERS

The parameters, U and μ, may assume several possible values. The set of infinitely many probable behaviors of U and μ is made manageable by stipulating specific conditions.
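The AR(1) error relations can be checked numerically. The sketch below simulates equation 4a with illustrative values V = 0.5 and σε = 1, and compares the sample variance and lag-one correlation of Ut with the formulas:

```python
import random

# Check the AR(1) error relations by simulation:
#   Var(U_t) = sigma_eps^2 / (1 - V^2),  Cor(U_t, U_{t-1}) = V
random.seed(1)
V, sigma_eps, n = 0.5, 1.0, 200_000   # illustrative values
u_prev, us = 0.0, []
for _ in range(n):
    u = V * u_prev + random.gauss(0.0, sigma_eps)
    us.append(u)
    u_prev = u

mean = sum(us) / n
var = sum((u - mean) ** 2 for u in us) / n
cov1 = sum((us[t] - mean) * (us[t - 1] - mean) for t in range(1, n)) / (n - 1)

print(round(var, 2))         # theory: 1/(1 - 0.25) = 1.33...
print(round(cov1 / var, 2))  # theory: V = 0.5
```

The simulated variance approaches σε²/(1 - V²) and the lag-one autocorrelation approaches V, confirming the derived relations.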

3.5 CONSTRAINING OF PARAMETRIC VALUES

Possible sets of values of U and μ are considered hereunder:

(1) If μ = 0 and U = 1, function 4 will converge to the random walk model:

Yt = Yt-1 + Ut ... (5)

(2) If μ ≠ 0 and U < 1, observed values will fluctuate around some constant value of μ, which is interpreted as the calculated value of the mean of the observed values of the time series.

(3) The closer U is to unity, the greater is the weight attached to the immediately preceding value Yt-1 of Yt. The greater the divergence of U from unity, the greater is the weight attached to the distant values Yt-s. Thus, the behavior of U determines the length of the lags to be included in the model.

(4) If the values of U lie in the range -1 < U < +1, that is, within the unit circle, the time series of observed values of Yt is stationary. If the condition U < 1 is satisfied, observed values of Yt tend to fluctuate around their mean, μ. This keeps fluctuations within narrow bands.

(5) If a time series of values is generated by a stationary process, observed values fluctuate around a constant mean; "there is no tendency for the spread of the values to either increase or decrease over time" (Harvey, 1981, p. 3).

Graphical Portrayal

The following diagram depicts a time series with non-zero auto-correlation, or auto-correlated errors. Hence, the time series in this graph violates the assumption of absence of correlation in errors in a stationary time series. Observed points, shown by crosses in the graph, depict the fluctuations of the values of Yt around their mean, μ. The horizontal line AB in the diagram runs through μ. The trend of errors may be linear. A non-linear trend of errors is captured by a second degree parabola:

Ut = a - bT + cT² ... (6)

In relation 6, T denotes time. At low values of T, the negative sign of b dominates, making the errors decline with time. But at higher values of T, the positive sign of c dominates, and the errors rise in magnitude. This ensures that the divergence from the mean first declines slowly but, after reaching its minimum, tends to rise with the passage of time. In the long run, values may depict volatile fluctuations due to this factor.

[Diagram: TIME SERIES WITH AUTO-CORRELATED ERRORS; observed values (crosses) fluctuating around the mean line AB drawn through μ]
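The contrast between the stationary case (condition 4) and the random walk case (condition 1) can be seen in a short simulation; U = 0.5 is an illustrative stationary value:

```python
import random

# Compare the spread of a stationary AR(1) (U = 0.5) with the
# random walk case (U = 1): Y_t = U*Y_{t-1} + U_t, with mu = 0.
def simulate(U, n=2000, seed=42):
    rng = random.Random(seed)
    y, path = 0.0, []
    for _ in range(n):
        y = U * y + rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def spread(path):
    m = sum(path) / len(path)
    return (sum((y - m) ** 2 for y in path) / len(path)) ** 0.5

print(spread(simulate(0.5)))  # stays near its theoretical value of about 1.15
print(spread(simulate(1.0)))  # far larger: the random walk wanders away
```

The stationary series keeps a bounded spread around its mean, while the random walk shows no such tendency, exactly as conditions 1, 4 and 5 describe.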
Thus, errors will not be characterized either by linearity or independence in such cases.

3.6 CONSTANCY OF AMPLITUDE OF FLUCTUATIONS

Interestingly, the errors, shown by the differences between actual and mean values, embody serial/auto-correlation. If the values of Yt are generated by a stationary stochastic process, values tend to fluctuate around a constant level; consequently, there is no tendency for the spread of fluctuations to increase or decrease with time. In other words, the amplitudes of fluctuations do not vary so much as to make the peaks and troughs go beyond the range. The errors are then purely random, so that the sums of negative and positive errors are equal; this makes the mean of errors converge towards zero. Besides, the variance is constant and the errors are uncorrelated. These are the most prominent features of a stationary time series. Except for the non-zero error covariance, the graph portrayed above displays the features of a stationary time series.

4. MORE LAGS AND GENERALIZATION

Introduction of more lags in relation 3 or 4 can help a time series to be characterized by the features of stationarity; incorporation of more lags can make the series converge towards being stationary. If we add Yt-2, Yt-3, Yt-4, ..., Yt-s to the right hand side of model 3 or 4, we obtain a higher order single variable auto-regression model:

Yt - μ = U1(Yt-1 - μ) + U2(Yt-2 - μ) + U3(Yt-3 - μ) + ... + Us(Yt-s - μ) + Ut ... (7)

Thus, we have two approaches to overcome the non-stationary features of a time series: (i) take deviations from the mean; and (ii) incorporate more lags.

5. AUTO-REGRESSIVE MOVING AVERAGE PROCESS

If we introduce lagged error terms in model 7, it yields the Auto-Regressive Moving Average (ARMA) process based model:

Yt - μ = U1(Yt-1 - μ) + U2(Yt-2 - μ) + ... + Us(Yt-s - μ) + Ut + W1 Ut-1 + ... + Ws Ut-s

A simple version of the above model is obtained by retaining only one lagged value of Yt and of the error Ut:

Yt - μ = U(Yt-1 - μ) + Ut + W Ut-1 ... (8)

The parameters attached to the lagged dependent variable and to the lagged error are, respectively, the auto-regressive and the moving average parameters. Thus, a model which includes the lagged values of the dependent variable as well as lagged random errors is defined as an auto-regressive moving average (ARMA) model. The ARMA model plays a pivotal role in dynamic modeling; it allows a PARSIMONIOUS representation of a stationary time series. In other words, it permits construction of a complex model with a limited number of parameters. The number of parameters in a model has a direct bearing upon the problem of estimation of the parameters and the degree of complexity involved. More parameters lead to more complexity and the possibility of technical problems such as multi-collinearity creeping in.

6. TYPES OF AUTO-REGRESSIVE MODELS

Auto-regressive models may be distinguished broadly into different types according to (i) the length of the lag involved in the time series model; (ii) the number of variables included in the model; and (iii) the order of the differences.

6.1 LENGTH OF TIME LAG

The length of the lag refers to the number of time periods included in the model. According to this criterion, auto-regressive models are broadly distinguished into two categories: (i) infinite lag models, and (ii) finite lag models.
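The simple ARMA(1,1) form of model 8 can be sketched by simulation; the mean of 5, auto-regressive parameter of 0.6 and moving average parameter of 0.4 below are all illustrative values:

```python
import random

# Simulate the ARMA(1,1) form of model (8):
#   Y_t - mu = phi*(Y_{t-1} - mu) + u_t + theta*u_{t-1}
random.seed(7)
mu, phi, theta = 5.0, 0.6, 0.4   # illustrative AR and MA parameters
y_dev, u_prev = 0.0, 0.0
series = []
for _ in range(10_000):
    u = random.gauss(0.0, 1.0)
    y_dev = phi * y_dev + u + theta * u_prev
    series.append(mu + y_dev)
    u_prev = u

sample_mean = sum(series) / len(series)
print(round(sample_mean, 2))  # fluctuates around mu
```

Only three parameters generate the whole series, which is the parsimony the ARMA representation is valued for.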

Infinite lag models are associated with extremely long time series. Such time series may spread over a century or even more (Subramanian Swamy, 1993). Finite lag models are based on finite time series data of, say, 20, 30, 40 or 50 periods. The duration of the period may range from an hour to a decade, or even centuries. A one period lag is most commonly used; generally, a lag of 2-3 periods suffices in most cases. However, Hemlatha Subramanian (2011) found a lag of 17 periods to be relevant in the determination of expenditure on senior secondary education in her time series model, which considered per capita income and lagged public expenditure on education as the determinants of current expenditure on education in India. The time lag to be included in the model may be determined empirically. This may involve experiments with models of different time lags; the model selected will depend upon the goodness of fit of the alternative models. A priori reasoning may also be employed for this purpose.

6.2 SINGLE OR MULTIPLE VARIABLE MODELS

An auto-regressive model may either be a single variable model or a multiple variable model.

6.2.1 SINGLE VARIABLE AUTO-REGRESSIVE MODEL

A single variable auto-regressive model considers the lagged values, Yt-s, of the dependent variable alone to be the determinants of its current value, Yt. Models 1 to 4 are examples of single variable auto-regression models.

6.2.2 MULTIPLE VARIABLE AUTO-REGRESSIVE MODEL

A multiple variable auto-regressive model includes the lagged values of the dependent variable along with one or more exogenous variables as determinants:

Yt = β0 + β1 Yt-1 + β2 Xt + et ... (7)

Model 7 is a two variable auto-regressive model; the variables are Y and X respectively. Y is an endogenous and X an exogenous variable in the model.

SOME FEATURES OF STOCHASTIC PROCESS

As we know, building blocks are needed for the formulation of models. Uni- or multi-variate time series models are no exception. The concept of white noise is one such building block for the formulation of time series models; the concept is borrowed into statistics from engineering. White noise is an essential component/feature of the stochastic process of the generation of values of a time series. A stochastic time series process contains white noise elements as an essential ingredient, the analysis of which is one of the building blocks of the formulation of time series based forecasting models. Incorporation of white noise in the model captures the uncertainty of the future, and hence the margin of forecasting errors. An error-free forecast is seldom possible in empirical analysis.

WHITE NOISE TIME SERIES PROCESS

The basic characteristics of a white noise stochastic time series process are as follows:

{εt}, t = -∞, ..., +∞

where t depicts time and each element in the sequence {εt} satisfies the following conditions:

E(εt) = 0; E(εt²) = σ²; E(εt εs) = 0 for all s ≠ t in the time domain.

The above features embody the assumptions that each value of the series, εt, is drawn randomly from a population with zero mean and constant variance. Sometimes it may also be assumed that the values are drawn independently, and/or that the values are normally distributed with zero mean and constant variance σ².

ONE VARIABLE TIME SERIES MODEL

A single variable time series model of white noise may be specified as an auto-regressive model. An auto-regressive model describes the behavior of the current values of a variable in terms of its own values observed in the past. An auto-regressive model of white noise is specified as follows:

Ut = V Ut-1 + et ... (8)

Auto-regressive errors like Ut in equation 8 generally represent residual variation in the dependent variable of a regression function, which is not explained by the systematic part of the equation and which is caused by random disturbances. The systematic part of the regression model is backed by some well established theory; the theory, however, offers no explanation of the residual part of the function. A regression model which treats the white noise variable Ut as an outcome of an auto-regressive process may be specified as follows:

Yt = β Xt + Ut ... (2)

The theory considers the values of the Ys and Xs to have been generated by a stochastic time series process. This assumption may be extended to cover the generation of the values of the residuals/errors by a stochastic time series process as well. The errors Ut may be represented by a function like 8, which considers each error Ut to equal its immediately preceding value Ut-1 plus an innovation, shown by et. The term innovation is used for et to avoid the use of the term error, which would otherwise convey that an error also embodies another error in itself. Alternatively, the series of values shown by Ut may be manipulated to depict the history of Ut.

HIGHER ORDER AUTO-REGRESSION PROCESS

Statistical evidence may sometimes show that the residuals/errors cannot be replicated by the first order auto-regression stochastic process embodied in relation 8, but that the series of residuals is subject to the more intricate and involved process of higher order auto-regression of the following type:

Ut = V1 Ut-1 + V2 Ut-2 + V3 Ut-3 + ... + Vs Ut-s + et ... (9)

The second order auto-regression time series process is a simple form of relation 9:

Ut = V1 Ut-1 + V2 Ut-2 + et ... (10)
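A numerical check of the second order process in relation 10: for a stationary AR(2), the Yule-Walker equations give a lag-one autocorrelation of V1/(1 - V2). The values V1 = 0.5 and V2 = 0.2 below are illustrative:

```python
import random

# Simulate U_t = V1*U_{t-1} + V2*U_{t-2} + e_t and compare the sample
# lag-one autocorrelation with the Yule-Walker value V1/(1 - V2).
random.seed(3)
V1, V2, n = 0.5, 0.2, 200_000   # illustrative, inside the stationarity region
u1 = u2 = 0.0
us = []
for _ in range(n):
    u = V1 * u1 + V2 * u2 + random.gauss(0.0, 1.0)
    u2, u1 = u1, u
    us.append(u)

mean = sum(us) / n
var = sum((u - mean) ** 2 for u in us) / n
rho1 = sum((us[t] - mean) * (us[t - 1] - mean) for t in range(1, n)) / ((n - 1) * var)

print(round(rho1, 3))  # theory: 0.5 / (1 - 0.2) = 0.625
```

The simulated lag-one autocorrelation settles near 0.625, matching the theoretical value implied by relation 10.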

Higher order auto-regression may better represent the stochastic process under consideration. In empirical analysis, however, second or third order auto-regression may suffice to capture the actual observations. Interestingly, errors, shown by the differences between the actual and the mean may embody serial/auto-correlation. If series of values of Yt are generated by a stationary stochastic process, values tend to fluctuate around a constant level and there is no tendency for the spread of fluctuations to increase or
17

decrease with time. In other words, amplitudes of fluctuations do not fluctuate so much as will make the peaks and troughs go beyond the range. Above features are the most prominent features of a stationary time series. ********************************** BASIS CONCEPTS AND MODELS OF TIME SERIES ANALYSIS Modern econometrics considers modeling as an essential part of econometric methods of analysis. Explanation of observed facts and their use for forecasting future values are two inseparable functions of econometric analysis. Some models are basically formulated for forecasting rather than to furnish an explanation of reality. Forecasting is generally based on modeling of time series data. Auto-Regressive Integrated Moving Average1 (ARIMA), propounded by Gerad P.E. Box and G.M. Jenkins (1978), and Vector Auto-Regression (VAR) Modeling2 enunciated by Christopher A. Sims (1980) have not only facilitated but also popularized Forecasting Time Series Modeling a great deal both among theoreticians and the practitioners. ARIMA and VAR offer
18

two alternative methodologies of time series forecasting modeling. AUTO-REGRESSIVE INTEGRATED MOVING

AVERAGE (ARIMA) MODELING ARIMA is also known as Box-Jenkins (B-J) methodology of time series forecasting. ARIMA modeling does not focus on the formulation of either single equation or simultaneous multiple equations models for forecasting. ARIMA methodology

concentrates mainly on evaluating stochastic properties of time series data as the base of forecasting. Consequently, ARIMA requires neither data massaging to match some

conceptual/theoretical framework of analysis, nor does it need data manipulation to satisfy the stipulated conditions of methodology. It rather works on the premise that Data Speak for Themselves, and hence, Data do not need any external or exogenous prop or propeller. ARIMA uses only lagged observed values of the given variable and random factors affecting its values to explain the current and predict the future values. Unlike the general regression model which includes a number of exogenous or pre-determined variables, Xj, j=1,2,.m for determining and explaining the
19

values

of

dependent/endogenous

variable

Y,

ARIMA

methodology does not use any pre-determined factors for determining the values of Y. It is as if values of Y are autonomous or autarkic and totally free from any exogenous influence except the random factor(s). For this feature of ARIMA, these models are also called a-theoretic. CRITIC OF ARIMA ARIMA approach is neither free from limitations nor is it immune to criticism. The following, in my view, are serious limitations of ARIMA. (1) ARIMA implicitly assumes that all the observed facts are scientific in nature. Hence, the observed facts do not need anything else for their explanation. This assumption runs counter to the socio-economic reality. Though all scientific facts are common facts, but all common/ordinary facts are not scientific. But scientific research uses only scientific facts. But how does one determine whether the given set of time series observations constitute scientific facts? Evaluation of the scientific or non-scientific character of the given time series one needs theory. Theory may either be an economic theory or financial theory to ascertain whether the series is truly economic
20

or financial. If not full blown theory, this task can not be performed without the use of certain concepts like GDP, employment, inflation, exchange rate, international trade, rate of returns, capital-output or labour-output ratio, etc.. Obviously, the basic premise of ARIMA runs counter to the above. (2)ARIMA approach and philosophy underlying it runs counter to the Marshalls Philosophy that Data do not speak for themselves, they are made to speak. Data or observed values of any variable do not connote any concrete meaning by itself. Meanings are invested by associating the numerical value with some concept or construct. For example, the following figure is cited here: Year 1980: 132232. What do we understand by it? Practically nothing is conveyed by the figure itself. If, however, it is stated that 132232 management graduates were seeking job at the beginning of the financial year 1980, we are immediately able to understand the meanings of the above cited figure. (3) ARIMA does not explain what are the random factors and how many of these factors are relevant to the analysis of any given series. Therefore, concepts are needed to attach

21

meanings to the observed values and theory is required to extricate meanings out of the observed scientific facts. In other words, ARIMA methodology is indep3endent of any theoretical paradigms for explaining the behavioral

characteristics of the observed facts. VECTOR AUTO-REGRESSION (VAR) METHODOLOGY In simultaneous structural equation models, one set of variables is treated as dependent and the other set is considered exogenous or pre-determined to the model. Pre-determined variables comprise both exogenous variables and lagged values of the dependent variables. Each regression function contains one endogenous and some predetermined variables. Such models cannot be estimated without first identifying the given regression functions. Rank and order conditions and a priori restrictions on certain parameters are used to facilitate identification. These restrictions relate either to magnitude or sign or both. For example, the sign of the coefficient attached to price in a demand function is expected to be negative, while price will have a positive sign in a supply function. Similarly, the quantum of rainfall in a particular season will affect output/supply and not demand. So, the value of the coefficient of rainfall in the demand function for an

agricultural good is expected to be zero, while it may have a non-zero value in the supply function. The equations may either be exactly identified or over-identified. In case an equation remains unidentified or under-identified, it cannot be estimated. Thus, identification precedes rather than follows estimation (Karl Fox). Assigning zero or non-zero values to the coefficient(s) of specific variables in selected functions is considered to be totally subjective. Christopher Sims (1980) criticized this practice on this ground. But a moment's thought will reveal that the decision to include or exclude a variable in a relation as an explanatory variable is guided mainly by theory. Exclusion of a variable implies that its coefficient has zero value in the function. Similarly, the negative sign of price in the demand function is based on the inverse relation between the price and the quantity demanded. Where, then, is the subjectivity? Besides, theoretical decisions cannot be and should not be treated as purely statistical or data based. However, this is precisely the approach of Box-Jenkins and Christopher Sims. According to Sims, if there are truly simultaneous equations and their variables, then all the variables should be treated on par

rather than some being considered as pre-determined and others not, or some as dependent and others as independent. So, all the variables should be on an equal footing. A logical extension of the argument will suggest that every variable is a function of every other variable; this would make the solution intractable. But Sims used the above premise to develop vector auto-regression modeling. The Granger causality test had already laid the

foundation of VAR.
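Sims's symmetric treatment can be illustrated with a minimal sketch: a two-variable VAR(1) is simulated and then estimated equation by equation with ordinary least squares, every variable entering every equation on an equal footing. The coefficient matrix, sample size and numpy-only estimator below are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary two-variable VAR(1): z_t = A z_{t-1} + e_t
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])
T = 2000
z = np.zeros((T, 2))
for t in range(1, T):
    z[t] = A @ z[t - 1] + rng.normal(scale=0.1, size=2)

# Estimate each equation by OLS on the full lagged vector: all variables
# enter every equation symmetrically, which is the essence of VAR.
Y = z[1:]          # current values, shape (T-1, 2)
X = z[:-1]         # one-period lags, shape (T-1, 2)
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = B.T        # rows = equations, columns = lagged regressors

print(np.round(A_hat, 2))
```

With a reasonably long simulated sample, the per-equation OLS estimates come close to the true coefficient matrix A.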


DISTRIBUTED LAG MODEL Regression models, based on time series data, are developed for the purpose of explaining the observed behavior and forecasting future outcomes of the operation of the forces at work in the continuum of time and the domain of given space. Such models are invariably dynamic in nature. Both auto-regressive and distributed lag models perform these twin roles. An auto-regressive model may be formulated independently of a distributed lag model. Alternatively, an auto-regressive model may be derived as a part of the distributed lag model. Both these models are dynamic and involve time as an independent variable in its own right. DISTRIBUTED LAG MODEL: CONCEPT If a regression model includes not only the current value of an independent variable, Xt, but also its past value(s), Xt-s, s = 1, 2, ……, as determinants of the current


value of the dependent variable, Yt, then such a model is defined as a distributed lag model. FROM TWO TO MULTI-VARIABLE MODEL Introduction of one or more lagged values of Xt as determinants of Yt transforms the two variable model into a multiple regression model, which necessitates estimation of more than two parameters. For example, we may consider the following model of aggregate consumption function (Prakash, S., 2010):

Yt = β0 + β1Xt + Ut ……(1)

Where Yt is aggregate consumption of India, say in 2011, and Xt is disposable income or per capita income of India at time t. Introduction of lagged values of Xt will transform equation 1 from a simple two variable regression model into a distributed lag multiple regression model:

Yt = β0 + β1Xt + β2Xt-1 + Ut ……(2)

Relation 2 is a particular case of the more general distributed lag model with infinite lags:

Yt = α + β0Xt + β1Xt-1 + β2Xt-2 + …… + βsXt-s + …… + βtX0 + Ut ……(3)

This model has certain mathematical properties which make its manipulation feasible and easy. This justifies accepting such an unrealistic model. If we accept the logic of this model, one may like to include the disposable income of India in the prehistoric period, or the Maurya period; consideration of even the Mughal period may sound irrational. The above difficulty of fixing the time of origin of X makes the practical difficulties of such models obvious. Still, it has some theoretical appeal which makes practical life comfortable. We shall see it in subsequent pages. AUTO-REGRESSION MODEL RE-DEFINED If, in a distributed lag model like model 2, one or more lagged values of the dependent variable Yt are introduced along with the current value Xt and lagged value(s) Xt-1 as determinants of the dependent variable Yt, then the distributed lag model is transformed into an auto-regressive model. The following model is an illustration of an auto-regressive model, which is the extended and modified form of distributed lag model 2:

Yt = α + β0Xt + β1Xt-1 + γYt-1 + Ut ……(4)

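A finite-lag version of the distributed lag models above can be estimated directly by ordinary least squares once the lagged regressors are stacked into a matrix. The following is a minimal simulated sketch (illustrative coefficients, noise level and numpy-only estimator are my assumptions), not an estimate from any actual consumption data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed finite distributed lag: Y_t = a + b0*X_t + b1*X_{t-1} + b2*X_{t-2} + u_t
a, b = 2.0, np.array([0.6, 0.3, 0.1])
T, k = 500, len(b)
X = rng.normal(size=T)
u = rng.normal(scale=0.05, size=T)

# Build the lag matrix: column j holds X_{t-j}
L = np.column_stack([X[k - 1 - j : T - j] for j in range(k)])
Y = a + L @ b + u[k - 1 :]

# OLS with an intercept recovers the intercept and the lag coefficients
Z = np.column_stack([np.ones(len(Y)), L])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print(np.round(coef, 2))   # intercept followed by b0, b1, b2
```

The key mechanical point is that each lag becomes just another column of the design matrix, so the model is an ordinary multiple regression once the lags are finite and few.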
Lapse of time makes both sets of values of the variables, X and Y, change. This transforms time into an independent variable of the system: time is treated as if it were an independent entity. Consequently, Y and X change with time. TIME AS A VARIABLE Why do we need the incorporation of time as an essential variable of the system? What role does time perform in the model relations? These are two important questions to be answered. Distributed lag models have been developed independently by Koyck, Cagan, Mark Nerlove and Shirley Almon; each of them furnished answers to these twin questions. Subsequently, others have also added to their contributions. REASONS FOR INCORPORATION OF TIME AND LAGS The following are the main reasons for the incorporation of time as a variable, and of lags, in regression models. 1. TIME PATH AND PATTERN OF CHANGE Incorporation of time in the analytical framework facilitates tracing of the time path of change, its nature,

direction and magnitude of change enables the investigator to understand the genesis and consequences of change, detect its pattern and estimate the trend, if any, involved in time series.


2. ROBUST AND DIVERSIFIED FRAMEWORK OF ANALYSIS Introduction of time in a linear series and its analysis furnishes a natural framework for the analysis of the dynamic structure of observed values and of the nature and direction of changes involved over a period of time. Close analysis of changes may highlight whether the changes have been purely random or embody some causal relation(s). Theories focus on such aspects of linear time series as (i) stationarity, (ii) dynamic causal relations, (iii) auto-correlation and the auto-correlation function, (iv) modeling, and (v) forecasting of values. 3. NON-INSTANTANEOUS CAUSAL RELATIONS Social relations in general, and economic and business relations in particular, involve time; these relations are seldom instantaneous. Consequently, the response of the dependent variable to any change in the independent factor, irrespective of the magnitude of change, is not immediate; it is generally delayed. The delay in the response of the dependent


variable to change(s) in exogenous/independent variable(s) is defined as a lag. Delayed responses are rarely uniform; the period of delay differs between cases. This makes the incorporation of time as an explicit variable essential in order to analyze each case according to the need of the situation. REASONS FOR DELAYED RESPONSES The response of the dependent variable to changes in exogenous factor(s) is delayed due to a variety of reasons, which are discussed here. These causes may also be taken as causes of lags. 1. PSYCHOLOGICAL FACTORS Psychological factors are among the important causes of delayed responses to change. People take decisions and implement them; but behavioral agents are subject to a general principle of inertia, which accounts for delays in decisions or their implementation. Decision making agents have their habits, opinions and perceptions; most people resist change and continue to stick to their given modes of thought and behavior.


Information is seldom perfect and people entertain doubts about the nature of change: Is change desirable and in the right direction? Does change require alteration in habits? Is change permanent or transitory? For example, 60 per cent of Austrians initially thought that joining the EU would harm them, but after the analysis of costs and benefits and the dissemination of the results, opinions changed and 60 per cent favoured it; so Austria became a member of the EU. Thus, not only access to information but access to the right information is pivotal for the acceptance of change. The principle of inertia psychologically affects people's responses to change. If prices are expected to fall in the near future, present purchase plans may be deferred. Then, if the price falls today, many people may expect the price to fall further in future, inducing them to wait and watch rather than rush to make purchases. So expectations and perceptions are also important. All these factors and facets of human behavior prevent people from responding to change instantaneously. 2. TECHNOLOGICAL REASONS The process of technological change takes time to be completed. If new plant and machinery are planned to be installed in a

factory, the placement of the order, its execution by the supplier, and the installation of the capital equipment cannot be completed quickly. The process is spread over several time periods. Similarly, if demand is expected to rise next month, production of more output has to be taken up in right earnest in order to make timely supply to the market. Production involves lags, and lags differ between goods. For example, the production process of a bicycle may be completed in a few weeks, but it takes 3 to 5 years to produce ships and airplanes. If the wages of labour fall, making substitution of labour for capital economically viable, technical bottlenecks may not permit it instantaneously. 3. INSTITUTIONAL FACTORS There are several institutional factors that inhibit instant response to change. For example, suppose a contract has been made with a supplier for a good by a company at a prior negotiated rate. If, subsequently, the price of the particular good falls in the market, it will not be possible to switch over to a new supplier who is willing to supply the good at a lower rate.


RBI announces its policy every six months, in response to which banks effect changes in their own lending and deposit/borrowing rates. Banks will find it difficult to alter the rates till the announcement of the new policy, even though the market based rates may be drastically different from such administered rates. Similarly, before the harvesting of crops, the Government announces a minimum support price for buying supplies from farmers at the given price. Traders will find it difficult to buy the crop output at a price lower than the minimum support price, whatever the state of the market. 4. POLICY IMPACTS There is always a lapse of time between the formulation of a policy and its implementation. The impact of the implementation of policy measures does not emerge instantaneously; the effect of a change in policy and its implementation always takes time to emerge perceptibly and visibly. The impact of incentives provided by the government to encourage exports does not appear immediately, as long lead times are involved in the provision of incentive schemes, the utilization of the incentives by exporters, and the emergence of the impact of the change. In

order to contain inflationary pressures, RBI controls the supply of money in general and credit in particular. So, the cheap money policy is replaced by a costlier money policy. But it takes a fairly long time for the results to become visible. Several times, the chosen measures may fail to achieve the desired goals. Time lags are involved in implementation also. Planning Practices: Unspent funds under a plan are carried forward to the next plan, where they are treated as non-plan expenditure; they cannot be spent on plan programmes/projects. 5. ROLE OF EXPECTATIONS The behavior of numerous business and economic variables is conditioned by the expectations of the agents or decision makers. These expectations are based on past experience, newly emerged scenarios, the availability of additional information and perceptions about the future. This necessitates the incorporation of lags explicitly in the analytical framework in order to take cognizance of expectations. That is why an expectational equilibrium is distinguished from other categories of equilibrium.


6. HISTORY AS DRIVER OF BUSINESS AND ECONOMIC DECISIONS Some decisions in business, and decisions pertaining to the economy, are guided and conditioned by the historical background, including the immediate context of the problem. The technology in use in an organization cannot be wished away simply because some new technology has emerged. How many more workers need to be employed for the production of additional output depends on the type of machinery and equipment already installed. If a slowdown occurs in an organization, it cannot retrench all its workers in one go. How many new workers are to be employed depends on the size of existing employment. What the electricity bill of a household in a given month shall be depends on the number and types of gadgets already acquired and used in the past. The total fixed cost of production depends upon the number and nature of the capital equipment already installed. Incidentally, Mark Nerlove highlighted the first three of these six factors as the causes of the incorporation of lags in the models. MAIN TYPES OF DISTRIBUTED LAG MODEL


Dynamic distributed lag models play an important role in economic analysis and in the analysis of problems of finance. In this context, two types of distributed lag models may be distinguished: (i) the single variable distributed lag model; and (ii) the multiple variable distributed lag model. The two variable distributed lag model is the simplest case of multiple variable distributed lag models. SINGLE VARIABLE DISTRIBUTED LAG MODEL A single variable distributed lag model contains only one variable in the regression function. The auto-regressive model is the simplest case of a single variable distributed lag model. In such models, the current value of the variable is expressed as a function of all its preceding values:

Yt = α + β1Yt-1 + β2Yt-2 + β3Yt-3 + …… + βsYt-s + …… + βtY0 + Ut ……(1)

Model relation 1 treats the current value of Y as depending upon its past values. It does not consider the influence of any variable other than itself. MULTI-VARIABLE DISTRIBUTED LAG MODEL A more general case will be one that has one or more explanatory variables, Xs, as determinants of the dependent variable, Y. A

distributed lag model with a single explanatory variable is a simple case of such models. But all values of X, ranging from the current to all of its lagged values, are assumed to influence the current value of Y:

Yt = β0 + β1Xt + β2Xt-1 + β3Xt-2 + …… + βsXt-s + …… + βtX0 + Ut ……(2)

Model 2, like model 1, is also a distributed lag model with infinite lags. The difference is that in model 2 we have lagged values of only the explanatory variable X as the determinants of the current value Yt of Y, while relation 1 has only lagged values of Yt itself as the determinants of Yt. A more complex model may be obtained by the combination of the lagged values of Y with the current and lagged values of X as explanatory variables in the same function:

Yt = α + Σ γiYt-i + Σ βiXt-i + Ut + ρUt-1 ……(3)

Estimation of a distributed lag model of the above types with all or even most of the time lags of a long time series is extremely complex; part of the complexity arises from the large number of lagged values of the endogenous and/or exogenous variables as determinants of the current value of the dependent variable.
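The single variable (auto-regressive) case of relation 1 is the simplest of these models to estimate once it is truncated to a few lags: Y's own past values become the regressors. A small simulated sketch with illustrative coefficients (my assumptions, numpy only):

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate a stationary AR(2): Y_t = 0.5*Y_{t-1} + 0.3*Y_{t-2} + u_t
T = 3000
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

# Single variable distributed lag in Y's own past: regress Y_t on its lags
Z = np.column_stack([y[1:-1], y[:-2]])   # columns: Y_{t-1}, Y_{t-2}
phi, *_ = np.linalg.lstsq(Z, y[2:], rcond=None)
print(np.round(phi, 2))
```

Because the regressors are just lagged copies of the dependent variable, the same OLS machinery used for multiple regression applies directly, which is why the auto-regressive model is described here as the simplest single variable distributed lag model.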


Even more complex will be a model like 3, which includes not only the current and lagged values of X but the lagged values of Y also. The difficulties will again arise from the necessity of estimating a large number of parameters in such models. DIFFICULTIES IN ESTIMATING DISTRIBUTED LAG MODELS Three difficulties may be mentioned in this context: (1) The presence of a large number of variables in the model reduces the degrees of freedom; the larger the number of parameters, the lower the number of degrees of freedom. Fewer degrees of freedom may compromise the statistical significance of the estimated parameters; (2) Too many independent variables in the equation may lead to multi-collinearity. Multi-collinearity will affect both the sign and the significance of some parameters. So, the test of significance of the estimated parameters becomes a serious problem in such cases; and (3) The error of such models is itself auto-regressive: Ut = ρ0 + ρ1Ut-1 + vt. This poses some technical problems.
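The multi-collinearity problem (difficulty 2) is easy to see numerically: for a persistent series of the kind typical in economics, Xt and Xt-1 are almost perfectly correlated, so their separate lag coefficients are poorly determined. A small illustrative check (simulated data, numpy only):

```python
import numpy as np

rng = np.random.default_rng(2)

# A persistent (slowly changing) regressor, as economic series often are
T = 1000
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.95 * x[t - 1] + rng.normal()

# Correlation between X_t and its own lag: near-multicollinearity
r = np.corrcoef(x[1:], x[:-1])[0, 1]
print(round(r, 2))
```

A correlation this close to one means the columns Xt, Xt-1, Xt-2, … of the design matrix are nearly linearly dependent, which inflates the standard errors of the individual lag coefficients.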

APPROACHES TO DISTRIBUTED LAG MODELS


There are four distinct approaches to distributed lag models: (i) Koyck's Approach/Specification/Hypothesis; (ii) Cagan's Adaptive Expectation Model; (iii) Marc Nerlove's Partial Adjustment Model; and (iv) Shirley Almon's Approach to Distributed Lag Models. We shall consider each of these approaches one by one. But first we consider Koyck's transformation of a distributed lag model with infinite lags into a model with few lags. KOYCKS APPROACH Koyck proposed an algebraic transformation of the distributed lag model with infinite lags into a model with finite lags, which makes it easy and simple to deal with and to empirically estimate the model. Koyck's model is also known as the geometrically declining lag model. But Koyck does not furnish a theoretical explanation of the distributed lag model. Koyck concentrated on enunciating an algebraic mechanism to simplify the application of distributed lag models but did not attempt to enrich the distributed lag model theoretically. This was done by Cagan and Marc Nerlove. Milton Friedman also furnished the permanent


income hypothesis for analyzing consumption behavior in a distributed lag framework. KOYCKS SIMPLIFICATION Koyck developed a simplifying mechanism to overcome the difficulty of estimating distributed lag models. He propounded a novel approach to specify a distributed lag model with a view to (i) simplify its estimation by the reduction of parameters to be estimated, and (ii) provide a meaningful interpretation of results derived from the application of the distributed lag model. He starts with the above distributed lag model with infinite lags:

Yt = α + β0Xt + β1Xt-1 + β2Xt-2 + …… + βsXt-s + …… + βtX0 + et ……(4)

If we consider only finite rather than infinite lags by putting s = k in the above function, where k is some finite number, then the above function will be

Yt = α + β0Xt + β1Xt-1 + β2Xt-2 + …… + βkXt-k + et ……(5)

k is some finite number and t denotes time. In the above function, β0 depicts the short run response of Y to a unit change in the current value of Xt in the given time period itself. As usual, βi measures the change in the value of Y in response to a unit change in the value of Xt-i. If it is assumed that the change in the value of Y in response to a unit change in the value of X in each period is the same, then the cumulative change up to each successive period will be as follows:

Change in Y in period t: β0
Period t-1: β0 + β1
Period t-2: β0 + β1 + β2
……
Period t = 0: β0 + β1 + β2 + …… + βt

It may be shown that

Σβi = β0 + β1 + β2 + …… + βk-1 + βk = β ……(6)

The above expression yields the total change in the value of Y in response to a given change in the value of X during the k different periods. β is defined as the total distributed lag multiplier. The impact of a change in the value of X on the value of Y is spread over k periods. The total impact is not exhausted within the period in which the value of X is changed. The interpretation is obviously based on the assumption that the response, evoked by the change in X, remains the same in each period.

KOYCKS ASSUMPTIONS All theories and models are formulated on the basis of a set of assumptions within which the models and theories are valid. Koyck also made the following assumptions: (i) All regression coefficients, denoted by βs, have the same sign, that is, either all the coefficients are positive or all are negative. The sign of the coefficients may either be specified a priori on the basis of theory, or it may be determined empirically; and (ii) the numerical values of the coefficients decline geometrically. The geometrical decline will conform to the following exponential schema:

βk = β0 λ^k, k = 0, 1, 2, …… ……(7)

Obviously, each coefficient βk is a constant multiple of β0. Relation 7 imposes an exponentially declining geometric pattern on the numerical values of the coefficients of the lagged variables in the model despite the constancy of β0. If we substitute in relation 5 from relations 6 and 7, we will obtain the following distributed lag model:

Yt = α + Σ β0 λ^k Xt-k + et, 0 < λ < 1 ……(8)

The exponential function is easy to manipulate. λ is the rate of decline in the values of the coefficients of the lagged values. λ < 1 will ensure that the values of successive coefficients decline, while λ > 0 will ensure that the decline is not transformed into a rise with the passage of time. Multiplication of each coefficient of the lagged variable by λ^k with increasing value of k makes the value of each successive coefficient lower than the value of the preceding coefficient. It ensures that the effect of the lagged values of the variables goes on declining till it totally disappears or becomes ineffective. Thus, the more distant the past, the smaller tends to be its impact on the present. Memory fades with time; consequently, the more distant the past, the greater is its haziness. This is quite rational even otherwise. The longer the period covered by the series, the greater is the probability of trend reversal due to twists and turns. The greater the number of twists and turns in a long time series, the larger will be the number of trend reversals. DETERMINANTS OF DECLINE IN COEFFICIENTS OF LAGGED VARIABLES The magnitude of the decline in the value of successive coefficients of lagged variables depends on two parameters:

(i) the value of the common coefficient, β0, of the explanatory variables, and (ii) the value of λ, the coefficient of decline. The condition 0 < λ < 1, coupled with the positive power k of λ, ensures that λ^k is positive but less than 1, so the value of each successive coefficient declines and no coefficient will turn from positive to negative. The greater the value of the common coefficient β0, the greater shall be the value of its product with λ^k; and the greater the value of λ, that is, the nearer it is to unity, the greater shall be the value of its product with the coefficient β0, and hence, the slower and lower shall be the decline in the values of the coefficients of the lagged explanatory variables. As against it, the smaller the value of λ, the greater shall be
the decline in the value of the successive coefficients of the regression function. BASIC FEATURES OF KOYCK MODEL This makes three features of Koycks approach obvious:
(i)

s are not allowed to change the sign to ensure that the

value of the subsequent coefficients of lagged variables will continue to decline;

42

(ii)

< 1 ensures not only the decline in the values of the

consecutive coefficients of lagged explanatory variables, but it also assures that the recent past gets greater weight than the remote past; as the remote shall get lower weight than the present value of the explanatory variable; and
(iii)

Sum of the coefficients,

s,

will remain finite. This sum is,


k

in fact, the long run multiplier. In other words, the sum,

measures the long run impact of change in exogenous variable(s) on the values of the dependent variable. But the sum of
k

is given by: (9)

F
0

= F 0{1/(1 - P )}

k=0,1,2 Above is the sum of an infinite geometric series with common ratio .
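Relation 9 can be checked numerically, and so can the estimation payoff of Koyck's scheme: a series built from geometrically declining lag weights obeys an autoregressive relation in which only α(1 − λ), β0 and λ need to be estimated. The numerical values below are purely illustrative assumptions (numpy only).

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta0, lam = 1.0, 0.5, 0.6      # illustrative values only

# (a) The geometric weights beta0*lam**k sum to the long run multiplier
w = beta0 * lam ** np.arange(200)
print(round(w.sum(), 6), round(beta0 / (1 - lam), 6))

# (b) Build Y_t from the (long, truncated) infinite geometric lag of X
T = 400
X = rng.normal(size=T + 200)
Y = alpha + np.array([w @ X[t : t - 200 : -1] for t in range(200, T + 200)])

# Koyck's transformation implies Y_t = alpha*(1-lam) + beta0*X_t + lam*Y_{t-1};
# OLS on (1, X_t, Y_{t-1}) should therefore recover these three parameters.
Z = np.column_stack([np.ones(T - 1), X[201 : T + 200], Y[:-1]])
coef, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)
print(np.round(coef, 4))
```

The regression in (b) needs only three parameters, however long the lag structure, which is exactly the simplification Koyck's geometric scheme buys.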

FUNCTION WITH INFINITE LAGS IN KOYCKS FRAMEWORK Substitution of Koycks transformation into functional relation 2 with infinite lags will furnish the following:

Yt = α + β0Xt + β0λXt-1 + β0λ²Xt-2 + …… + β0λ^s Xt-s + …… + β0λ^t X0 + et ……(10)

Lagging the above function by one time period and multiplying by λ will yield the following relation:

λYt-1 = λα + β0λXt-1 + β0λ²Xt-2 + β0λ³Xt-3 + …… + β0λ^t X0 + λet-1 ……(11)

Subtracting 11 from 10 will yield the following relation:

Yt − λYt-1 = α(1 − λ) + β0Xt + (et − λet-1), that is, Yt = α(1 − λ) + β0Xt + λYt-1 + vt, where vt = et − λet-1

THEORETICAL UNDERPINNINGS No theoretical underpinnings have been used in Koycks specification. His specification and transformation appear as a purely mathematical exercise. Cagan, Marc Nerlove, and Shirley Almon propounded three distinct approaches to the analysis of distributed lag models on the basis of distinct conceptual and theoretical underpinnings. Nerlove, Cagan and Almon furnished theoretical underpinnings to take distributed lag models out of the purely mathematical and/or statistical domain. But their approaches are totally different not only from Koycks approach but from each other also. All three approaches have played a pivotal role in popularizing distributed lag models in empirical research, though Almons approach has not been as popular as the approaches of Cagan and Nerlove. CAGANS ADAPTIVE EXPECTATION MODEL

In order to remove the limitation of the absence of a theoretical base of the distributed lag model, the basis and process of formation of expectations is introduced into the model. Let the basic model be given by the following:

Yt = β0 + β1Xt* + Ut ……(9)

Yt is the current value of the dependent variable, say, aggregate private final consumption expenditure; Xt* denotes long run disposable private per capita income, and U is the error as usual. Xt* may also be defined as the optimum, or equilibrium, or expected long run disposable income. The above function is based on the postulation that long run aggregate consumption expenditure depends on long run/expected disposable per capita income. But we do not have any observed value of Xt*, as it is based on expectation. We also do not have the basis and mechanism of formation of expectations. The following thesis is proposed for the expectation:

Xt* − X*t-1 = γ{Xt − X*t-1}, 0 < γ <= 1 ……(10)

γ will satisfy the above condition. This condition is slightly different from the condition imposed on λ in Koycks model in so far as λ is not allowed to equal unity, while unity is the upper limit of the value that γ cannot transgress; but γ can equal unity. A unit value of γ will imply that the current expectation does not differ from the past or preceding expectation. This is the hallmark of either a static and unchanging system, or a system in which the change may be anticipated to remain the same, which eliminates the need for revision or modification of expectations. γ is defined as the coefficient of adaptive expectation. It implies that expectation is revised and modified in the light of experience and/or the observed configuration of forces at work. Sometimes, this hypothesis is called the error learning hypothesis, or the progressive expectation hypothesis. This hypothesis was proposed by Cagan. Milton Friedman developed his permanent income hypothesis on similar considerations. The above relation is then simplified to express the unobservable expected value Xt* in terms of known value(s) of X:

Xt* = γXt + (1 − γ)X*t-1 ……(11)

If we substitute the value of Xt* from 11 into relation 9, we get

Yt = β0 + β1{γXt + (1 − γ)X*t-1} + Ut
= β0 + β1γXt + β1(1 − γ)X*t-1 + Ut ……(12)

If relation 9 is lagged now by one time period and then multiplied by (1 − γ), we get

(1 − γ)Yt-1 = (1 − γ)[β0 + β1X*t-1 + Ut-1]
= β0(1 − γ) + β1(1 − γ)X*t-1 + (1 − γ)Ut-1 ……(13)

If 13 is subtracted from 12, we will get the following relation:

Yt − (1 − γ)Yt-1 = β0 + β1γXt + β1(1 − γ)X*t-1 + Ut − [β0(1 − γ) + β1(1 − γ)X*t-1 + (1 − γ)Ut-1]

Yt = (β0 − β0(1 − γ)) + (1 − γ)Yt-1 + β1γXt + [β1(1 − γ)X*t-1 − β1(1 − γ)X*t-1] + Ut − (1 − γ)Ut-1
= β0γ + (1 − γ)Yt-1 + β1γXt + Ut − (1 − γ)Ut-1

Yt = Π0 + Π1Yt-1 + Π2Xt + Π3 ……E

Relation E is the reduced form of relations A, B, C, and D. The parameters Πs are reduced form parameters. Reduced form parameters may be used to derive the estimates of the structural parameters from the following equations: Π0 = β0γ; Π1 = 1 − γ, or γ = 1 − Π1; Π2 = β1γ; and Π3 = Ut − (1 − γ)Ut-1. Once γ is determined from γ = 1 − Π1, all other structural parameters will be determined as follows:

β0 = Π0/γ; β1 = Π2/γ; and the composite error term corresponds to Π3.
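The adaptive expectation relation 11 can be traced numerically: each new observation pulls the expectation part of the way toward itself, and the reduced form parameters can be inverted back to the structural ones as above. The income figures, γ and reduced-form values below are purely illustrative assumptions.

```python
gamma = 0.4                              # illustrative coefficient of adaptive expectation
X = [10.0, 12.0, 11.0, 15.0, 14.0]       # illustrative observed income series

# Relation 11: X*_t = gamma*X_t + (1 - gamma)*X*_{t-1}
x_star = X[0]                            # assume the initial expectation equals X_0
for x in X[1:]:
    x_star = gamma * x + (1 - gamma) * x_star
print(round(x_star, 4))

# Recovering structural parameters from illustrative reduced-form estimates:
Pi0, Pi1, Pi2 = 0.8, 0.6, 0.45
g = 1 - Pi1                              # gamma = 1 - Pi1
print(round(g, 4), round(Pi0 / g, 4), round(Pi2 / g, 4))   # gamma, beta0, beta1
```

Unrolling the recursion in the loop shows geometrically declining weights (γ, γ(1 − γ), γ(1 − γ)², ……) on past values of X, which is why the reduced form E has exactly the Koyck shape.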


In relation E above, Π2, or β1γ, measures the change in Y in response to a unit change in the value of X. As against this, in relation A it is β1 which measures the average change in the value of Y in response to a unit change in the expected value Xt* of Xt. This is an important distinction in the interpretation of these two specifications. Relation 11 shows that the expected disposable per capita income at time t is equal to the weighted average of the actual disposable per capita income at time t and its expected value in the preceding period. The weights are γ and 1 − γ respectively. Besides, if γ = 1, then Xt* = Xt. Obviously, the expected value coincides with the current value. There is no time lag involved in the realization of the expectation; it is immediately realized. This can happen if there is no uncertainty about the future, or the agents have perfect insights. This also makes it clear that the nearer the value of γ is to unity, the lower is the weight attached to the past. The immediate past is generally known. On this premise, X*t-1 may be treated as the known of the system. As against this, if γ = 0, then Xt* = X*t-1. This implies that there is no difference between the preceding and current periods' expectations. Expectation does not need any revision or modification. Thus, the conditions that prevail today will continue to prevail in future also. All expected future values will coincide with the current value. ELIMINATION OF UNOBSERVED EXPECTED VALUE If, however, this is not the case, then the mechanism for the elimination of the lags/unobserved expected value has also been developed. If relation A is lagged by one period and then multiplied by (1 − γ), we get

(1 − γ)Yt-1 = (1 − γ)[β0 + β1X*t-1 + Ut-1]
= (1 − γ)β0 + β1(1 − γ)X*t-1 + (1 − γ)Ut-1 ……D

If D is subtracted from C, we will get the following relation:

Yt − (1 − γ)Yt-1 = β0 + β1γXt + β1(1 − γ)X*t-1 + Ut − [(1 − γ)β0 + β1(1 − γ)X*t-1 + (1 − γ)Ut-1]

Yt = β0γ + (1 − γ)Yt-1 + β1γXt + Ut − (1 − γ)Ut-1 ……E
= Π0 + Π1Yt-1 + Π2Xt + Π3

This model and its underlying hypothesis have been criticized on several counts.


There is another alternative approach to distributed lag and/or auto-regression models. This approach is called the partial adjustment hypothesis and is discussed in the ensuing paragraphs. PARTIAL/STOCK ADJUSTMENT HYPOTHESIS Marc Nerlove furnished an alternative specification of the distributed lag model. This specification is known as the partial or stock adjustment model (PAM). It is assumed that there is an optimal or long run equilibrium value of Y. The optimal or desired level of Y is designated by Yt*, which is assumed to be linearly related to the exogenously given Xt by the following relation:

Yt* = β0 + β1Xt + et ……(1)

In order to eliminate the unobserved desired value Yt*, Nerlove assumes that the adjustment of the actual to the desired value of the variable Y is spread over several time periods. Consequently, it is not probable that the desired value Yt* is reached in one single time period. In any one time period only a fraction of the total desired adjustment/change in the value of Y is accomplished. This assumption leads to the following relationship between the actual and the desired change in the value of Yt:

Yt − Yt-1 = δ(Yt* − Yt-1) ……(2)

Alternatively, relation 2 may be expressed as follows:

Yt = δYt* + (1 − δ)Yt-1 ……(3)

Relation 3 shows that the current value Yt is the weighted average of the desired value Yt* and the previous period's value Yt-1, where δ and 1 - δ are the weights attached to the desired and preceding period's values respectively. The weighted process realizes the desired value in the long run; relation 1 is, therefore, defined as the long run function. The coefficient of adjustment δ satisfies the condition 0 < δ ≤ 1. Thus, δ > 0 ensures that the amount of adjustment in any time period will be positive, but only a fraction of the total desired adjustment or change in the actual value is achieved in each successive period. The lower the value of δ, the slower will be the process of adjustment of the actual to the desired value Yt*, and the smaller the proportion of total adjustment achieved in a single period. How much adjustment is completed within a period, and how prolonged the adjustment process is, depends upon the value of δ. If, however, δ equals 1, adjustment of actual to the desired value will be

instantaneous and it will be complete within the period concerned itself rather than being spread over several time periods. Substitution of the value of Yt* from 1 into 3 will yield the following relation, which does not involve the unobserved value Yt*:

Yt = δ[β0 + β1Xt + et] + (1 - δ)Yt-1

   = δβ0 + δβ1Xt + (1 - δ)Yt-1 + δet

   = Π0 + Π1Xt + Π2Yt-1 + Π3 ……(4)

where Π0 = δβ0, Π1 = δβ1, Π2 = 1 - δ (or δ = 1 - Π2), and Π3 = δet.
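As a numerical illustration of relation 4 (a sketch under assumed values, not estimates from any actual data), one can simulate the partial adjustment process and then recover δ, β0 and β1 from OLS estimates of the reduced form:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, delta = 5.0, 0.8, 0.6   # hypothetical structural values
T = 500
X = rng.normal(20, 3, T)
e = rng.normal(0, 0.05, T)

# Partial adjustment: Yt = delta*(beta0 + beta1*Xt + et) + (1 - delta)*Yt-1
Y = np.empty(T)
Y[0] = beta0 + beta1 * X[0]
for t in range(1, T):
    Y[t] = delta * (beta0 + beta1 * X[t] + e[t]) + (1 - delta) * Y[t - 1]

# Estimate the short run function (4) by OLS: Yt on a constant, Xt and Yt-1
A = np.column_stack([np.ones(T - 1), X[1:], Y[:-1]])
pi0, pi1, pi2 = np.linalg.lstsq(A, Y[1:], rcond=None)[0]

# Recover structural parameters: delta = 1 - Pi2, beta0 = Pi0/delta, beta1 = Pi1/delta
d_hat = 1 - pi2
b0_hat, b1_hat = pi0 / d_hat, pi1 / d_hat
assert abs(d_hat - delta) < 0.05 and abs(b1_hat - beta1) < 0.05
```

The recovery step mirrors the text that follows: Π2 gives δ, and dividing Π0 and Π1 by δ yields the long run parameters.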

Values of the structural parameters of the function may thus be determined from the above relations between the parameters. As relation 1 is assumed to furnish the long run desired or equilibrium/optimum value Yt*, relation 4 may be taken to represent its short run format; this function depicts the degree of divergence of Yt from Yt* in each period of observation. The deviation of the short run value Yt from its long run value Yt*, that is, Yt - Yt*, is accounted for by ignorance, bottlenecks, inertia or lack of exact foresight. Besides, change in the dependent variable lags behind the change in the exogenous variable Xt.

After the short run function 4 has been empirically estimated, it is easy to derive the estimate of the long run function 1. Subtraction of Π2 from 1 furnishes the value of δ; division of Π0 and Π1 by δ will then yield the values of β0 and β1. It means that division of the estimated coefficient of Xt of the short run function by δ will furnish an estimate of the long run propensity to change in the linear specification, and of the long run elasticity in the log-linear specification of the original function in the model. Omission of the lagged value Yt-1 will then furnish the estimate of the long run function itself. The process of adjustment may be represented graphically on the assumption that, say, δ = 0.75. The following chart furnishes the amount of adjustment accomplished in each period:

Period     Adjustment Within the Period   Cumulative Adjustment Up to the Period   Adjustment Remaining
Initial    0.000000                       0.000000                                 1.000000
First      0.750000                       0.750000                                 0.250000
Second     0.187500                       0.937500                                 0.062500
Third      0.046875                       0.984375                                 0.015625
Fourth     0.011719                       0.996094                                 0.003906

Thus, slightly more than 98 per cent of the total adjustment is accomplished in three periods. An interesting facet of adjustment is that the maximum adjustment is effected in the first period itself; thereafter the adjustment effected in each subsequent period goes down progressively till its magnitude becomes negligible. This facet of the temporal adjustment of the dependent variable in response to a given change in the independent variable implies that the lower the remaining proportion of total adjustment, the smaller is the amount of adjustment achieved in successive periods. These twin facets of the adjustment process are depicted by the following graphs. Blue bars depict the amount of adjustment effected in a given period. If the adjustments completed in each time period, shown by the tops of the blue bars, are joined together, a convex curve is traced; the curve falls sharply at first, but the extent of fall itself declines from one time period to the next. Such a curve may be depicted by a second degree parabola.

Red bars depict the cumulative adjustment completed up to each time period. If we join the tops of these red bars, we get a concave curve rising from left to right. The curve may approximate a logistic/logit curve: it rises most rapidly during the first period; thereafter it rises at a decreasing rate. This is shown by graph 2.

[Graph 1: Bar chart of the adjustment effected within each period and the cumulative adjustment up to each period]

[Graph 2: Bar chart of the cumulative adjustment up to each period, rising along a concave, logistic-like path]
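The schedule in the chart above follows from iterating the rule "adjust by δ of what remains"; a minimal sketch, assuming δ = 0.75:

```python
delta = 0.75        # assumed coefficient of adjustment
remaining = 1.0     # proportion of total desired adjustment still outstanding
cumulative = 0.0
schedule = []

for period in range(1, 5):
    step = delta * remaining       # adjustment effected within this period
    cumulative += step
    remaining -= step
    schedule.append((period, step, cumulative, remaining))

# slightly more than 98 per cent of the adjustment is done within three periods
assert schedule[2][2] > 0.98
```

Each period's adjustment is a fixed fraction of a shrinking remainder, which is exactly why the within-period bars decline while the cumulative bars flatten out.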

Koyck's Transformation, Partial Adjustment and Adaptive Expectation Models


All three models appear to be the same algebraically. The following are some of the important similarities of these models: (i) Partial adjustment function 4, Koyck's specification and adaptive expectation function E look alike except for the differences in symbols. (ii) The coefficient δ can be interpreted in exactly the same way as γ and λ, and vice versa. (iii) The empirical estimates of functions E and 4 will be exactly the same. (iv) All three models are auto-regressive in orientation. Besides, the error terms of Koyck's model, the adaptive expectation model and the distributed lag model are also auto-regressive. But these specifications/models are different conceptually and theoretically, their algebraic similarity notwithstanding.

Differences of Partial Adjustment and Adaptive Expectation Models

The following differences characterize these two functions: (i) The first and foremost difference is that Koyck's model is an algebraic expression which has not been backed up by any conceptual and theoretical paradigm, except that Koyck incorporated the observed empirical reality of investment in his specification. As against Koyck's model, the adaptive expectation and partial adjustment specifications have specific theoretical underpinnings which are used in interpreting the results. (ii) For interpreting γ as the coefficient of adaptive expectation, we will have to assume that the current value of the dependent variable Yt is linearly related to its desired value Yt*, rather than assuming that Yt* depends on its current value as has been done in relation 1. (iii) The adaptive expectation model incorporates uncertainty as an essential part of its conceptual specification; the uncertainty pertains to the future value of the independent variable. This is not the case with the partial adjustment model, where achievement of only partial adjustment in any period is explained by institutional and technical rigidities, inertia, bottlenecks or barriers to change, and lack of complete information, etc., and not by uncertainty about the value of the exogenous variable.


(iv) The error term δet of the partial adjustment model is simpler than that of the adaptive expectation model, since it is a simple multiple of the constant δ. As against this, the error term of the adaptive expectation model is itself a distributed lag expression, Ut - (1 - γ)Ut-1, which is more complex than δet. (v) The adaptive expectation model (AEM) and the partial adjustment model (PAM) are much alike; AEM, in its turn, has come to be compared with the rational expectation model (REM). AEM may be supported on the following counts: (i) AEM furnishes an extremely simple format for incorporating expectation in theory; one does not need an elaborate knowledge of probability theory to encompass expectation in theoretical analysis. (ii) AEM provides a simple tool for the analysis of decision makers in reasonably realistic terms. The postulation that real life decision makers learn from experience and generally do not repeat the same mistake is a simple common sense understanding of the real life behavior of

people. This assumption or postulate is much more reasonable and realistic than the assumption that people do not care for the past at all, or that the past does not matter, as is the case in the static framework where the current value equals the past value. (iii) The premise that the recent past exercises greater influence on current decisions than the remote past is also in consonance with the common sense understanding of the observed conditions in a dynamically changing state of business and the economy. The AE model was greatly popular in empirical research for long. But first J. Muth and then Robert Lucas and Thomas Sargent challenged the basic thrust and underpinnings of AEM. The main argument against AE is that it relies exclusively on the past values of the variable under consideration for the formulation of the expectation for the future. The rational expectation model assumes that individual economic agents use currently available and relevant information in forming their expectations and do not rely purely upon past experience. Expectations are designated as rational in so far as these are assumed to be formed on the basis of all the available information, including both the past and the present. True, the past alone cannot be the guide for the future; but then, the present alone is also not sufficient to form expectations. Expectation involves anticipation of the nature and magnitude of the change that is envisaged to materialize in future; but the envisagement depends partly on the nature and magnitude of the changes that occurred in the past. It is more realistic to assume that the expectation is formed not only on the basis of the observed past, but that the current state is also reckoned, and the anticipated future change is then concretized. The anticipation of future change requires insight, intuitive understanding and some assumptions to work upon. AE performed this function to some extent; the partial adjustment hypothesis also performs this role in its own way. So AE and PAM are not to be discarded altogether, though they may be modified and extended to perform the envisaged function more efficiently than they have done so far.


Shirley Almon's Polynomial Distributed Lag Model

Shirley Almon differed from all the above approaches on empirical and theoretical counts. The above three approaches use functional relations which are linear both in variables and in parameters. Almon observes that a long time series is often characterized by several twists and turns, upward swings and downward tumbles in values, resulting in reversals of trend as well as non-linearity. Besides, Koyck's transformation of infinite into finite lags, which also pushes the distant past into oblivion, is based on the assumption that λ lies in the range 0 < λ < 1 and k is an integer; these assumptions ensure that multiplication by λ^k makes the values of successive coefficients of the lagged variables decline. These assumptions are highly restrictive and do not conform to the observed facts most of the time. Periodic occurrence of trade cycles is a pretty well known phenomenon; extensive literature comprising diverse theories of trade cycles exists in economics. In fact, growth theory is an off-shoot of the theory of trade cycles (Prakash, S., 1992). Almon discarded the above restrictive assumptions and replaced them by the assumption that a time series covering the long run will depict non-linear changes in values. Such non-linear values of the variables can be captured by treating the regression parameters as approximated by some polynomial function.

Almon used the Weierstrass Theorem to assume that the βs will approximately conform to a polynomial of an appropriate degree. The Weierstrass Theorem states that a function continuous on a finite and closed interval can be uniformly approximated on that interval by a polynomial of an appropriate degree. Thus, the condition under which the theorem holds is that the interval in which the values lie should be finite and closed; the theorem will not hold if the interval is infinite or if the values belong to an open rather than a closed set.

The degree of the polynomial approximated by the βs depends upon the actually observed values of the series and the pattern of their changes through time. In some cases the polynomial may be linear; in some, second degree; and in other cases the changes may approximate a polynomial of three or more degrees. The actual degree of the polynomial that approximates a time series is a matter of empirics. However, most economic and business time series may be approximated by second or third degree polynomials. This may be elaborated by reconsidering the distributed lag model with finite lags:

Yt = α + Σj βj Xt-j + Ut

If the scatter diagram of the time series data is of the type shown in either graph 2 or 3, a second degree parabola may be fitted to it. In the first graph, the values first decline very rapidly, then the rate of decline slackens, and then the values start rising. As against this, the values in the second graph rise rapidly at first, then the rate of increase declines, and thereafter the values start declining. Both sets of values will be represented well by a second degree parabola; in the first case the curve will be convex to the origin, and in the second, concave in shape. If, however, the series is marked by cyclical fluctuations, it will be represented by a scatter diagram of the type that is approximated by a higher degree polynomial.
[Scatter diagrams (graphs): convex and concave patterns of values, and a cyclical scatter approximated by a higher degree polynomial]

Shirley Almon hypothesized that such cases may be represented by the following mathematical functions:
βj = a0 + a1 j + a2 j² ……(5)

This is a second degree polynomial which will fit well the data shown in figure 3 or 4. Curves representing the data of figure 3 or 4 will, however, differ in the signs attached to the coefficients, the as. Such functional relations capture a series which depicts trend reversal once.

βj = a0 + a1 j + a2 j² + a3 j³ ……(6)

This is a third degree polynomial which has more than one curvature; it implies reversal of trend from one time period to another more than once.

Operative Mechanism of Almon's Thesis

The first step is to evaluate the time series data with a view to determining the polynomial which will fit the data best. A careful perusal of the data is quite helpful in this. If it is not possible to judge the appropriate polynomial by mere perusal, one may adopt a trial and error method to arrive at the right decision: alternative curves may be fitted to the data in order to discover the curve of best fit.

Suppose the second degree concave curve fits the data best. Then the second step is to substitute the value of βj from relation 5 into the following relation:

Yt = α + Σj βj Xt-j + Ut

Yt = α + Σj (a0 + a1 j + a2 j²) Xt-j + Ut

   = α + a0 Σj Xt-j + a1 Σj j Xt-j + a2 Σj j² Xt-j + Ut ……(7)

For simplicity, we define the following new variables:

Z0t = Σj Xt-j
Z1t = Σj j Xt-j
Z2t = Σj j² Xt-j

Substituting these Zs in 7, we get

Yt = α + a0 Z0t + a1 Z1t + a2 Z2t + Ut ……(8)

Relation 8 appears like a general regression function; Yt is regressed on the new variables, the Zs, rather than on the observed variables, the Xs. Function 8 may be estimated by OLS. The estimated values of the parameters α and the as will have all the desirable properties of the OLS estimators if the error term Ut satisfies the stipulated classical conditions of OLS. Almon's method has one distinct advantage over Koyck's transformation in so far as relation 8 does not contain the stochastically generated lagged value Yt-1 as an explanatory variable, and hence avoids its possible correlation with the error term Ut. Once the parameters α and the as of relation 8 have been estimated, it is possible to determine the original parameters, the βs, from these estimates:
β0 = a0
β1 = a0 + a1 + a2
β2 = a0 + 2a1 + 4a2
β3 = a0 + 3a1 + 9a2
……
βs = a0 + s·a1 + s²·a2
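The whole procedure (constructing the Zs, estimating 8 by OLS, and recovering the βs from the as) can be sketched as follows; the polynomial coefficients, lag length q and data are all hypothetical, chosen only to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(2)
a0, a1, a2 = 1.0, 0.5, -0.1     # hypothetical polynomial coefficients
q = 4                            # lag length, fixed in advance
beta = np.array([a0 + a1 * j + a2 * j**2 for j in range(q + 1)])

T = 300
alpha = 3.0
X = rng.normal(0, 1, T)
U = rng.normal(0, 0.01, T)

# Distributed lag data: Yt = alpha + sum_j beta_j * Xt-j + Ut
Y = np.array([alpha + beta @ X[t - q:t + 1][::-1] + U[t] for t in range(q, T)])

# Constructed variables: Zp,t = sum_j j^p * Xt-j for p = 0, 1, 2
Z = np.array([[sum(j**p * X[t - j] for j in range(q + 1)) for p in (0, 1, 2)]
              for t in range(q, T)])

# Estimate relation 8 by OLS, then recover the betas from the a's
A = np.column_stack([np.ones(len(Y)), Z])
_, a0h, a1h, a2h = np.linalg.lstsq(A, Y, rcond=None)[0]
beta_hat = np.array([a0h + a1h * j + a2h * j**2 for j in range(q + 1)])
assert np.allclose(beta_hat, beta, atol=0.05)
```

Only three slope parameters (the as) are estimated however long the lag, which is the economy of the polynomial restriction.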

Problems of Application of Almon's Schemata


However, the application of Almon's thesis poses certain technical problems which have to be resolved. These difficulties are as follows: 1. The length of the lag has to be determined in advance. One may follow the guidelines furnished by Davidson and MacKinnon (1993), who opine that "The best approach is probably to settle the question of the length of the lag first, by starting with a very large value of q (the lag length) and then seeing whether the fit of the model deteriorates significantly when it is reduced without imposing any restrictions on the shape of the distributed lag."

We see a problem cropping up in this mechanism in so far as it still leaves the question of the length of the lag hanging in the balance. In our view, this is a trial and error method in the spirit of Hendry's top-down approach. The question is why one cannot adopt a bottom-up approach instead. It does not matter whether one starts with the top-down or the bottom-up approach so long as one is able to find the right length of the lag. But the top-down approach leaves open the question of what the peak point at the top is; as against this, it is easy to fix the bottom of the pyramid.

Probably the reason lies in the fact that if there is some true lag length encompassed in the time series, then the use of fewer lags than the true length will lead to omitted relevant variable bias in the estimate of the function. But one can commit the error at the other end by retaining more lags than the required number in the function. The consequences of omission of relevant variable(s) are quite serious. Choice of more lags than the warranted length amounts to the inclusion of variables which are not true determinants of Yt. Inclusion of such variables creates irrelevant variable bias; the consequences of this bias are much less serious than the consequences of exclusion of relevant variables, and the parameters can still be estimated consistently by OLS in such cases. In our view, there is a trade-off between these two types of biases. Alternatively, the Akaike or Schwarz information criterion may be used to determine the appropriate length of the lag. This

could also be used to determine the polynomial to be used in the model. We strongly think that the evaluation of the basic data should provide the base for all such decisions. In most cases, an appropriate evaluation of the features of the data will lead to the right choice.

2. The second problem is the determination of the appropriate power of the polynomial to be used in the distributed lag model. A general rule is that the power of the polynomial should be at least one more than the number of turning points in the curve relating βj to j. If the curve shows one turning point, then a second degree polynomial may be used; if the curve contains two turning points, then a third degree curve will be useful; and, in general, if the curve has s turning points, then a polynomial of degree s+1 will be appropriate. In practice, a fairly low degree polynomial is usually appropriate, though there is no a priori reasoning or precept for this.

3. The Zs, the constructed variables, are linear combinations of the original variables Xs. This is bound to lead to the problem of multicollinearity. Multicollinearity is, however, not as serious a problem as the others. 4. The chosen degree of the polynomial and the maximum length of the lag depend a lot on the subjective decision of the investigator.

Some Features of Stochastic Processes

As we know, building blocks are needed for the formulation of models; uni- or multivariate time series models are no exception. The concept of white noise is one such building block for the formulation of time series models. The concept of white noise has been borrowed from engineering into statistics. White noise is an essential component/feature of the stochastic process generating the values of a time series. A stochastic time series process contains white noise elements as an essential ingredient, the analysis of which is one of the building blocks of the formulation of time series based forecasting models. Incorporation of white noise in the model captures the uncertainty of the future, and hence the margin of forecast errors. An error-free forecast is seldom possible in empirical analysis.

White Noise Time Series Process

The basic characteristics of the white noise stochastic time series process are as follows:

{εt}, t = -∞, …, +∞

where t depicts time and each element in the sequence {εt} satisfies the following conditions:

E(εt) = 0, E(εt²) = σ², E(εt εs) = 0 for all s ≠ t in the time domain.

These features embody the assumption that each value of the series, εt, is drawn randomly from a population with zero mean and constant variance. Sometimes it may also be assumed that the values are drawn independently, and/or that the values are normally distributed with zero mean and constant variance σ².
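These population conditions can be checked against their sample analogues; a minimal sketch (sample size and σ are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = 2.0                      # hypothetical standard deviation
eps = rng.normal(0, sigma, 100_000)

# Sample analogues of the white noise conditions
mean = eps.mean()                          # E(eps_t) = 0
var = eps.var()                            # E(eps_t^2) = sigma^2
autocov = np.mean(eps[1:] * eps[:-1])      # E(eps_t eps_s) = 0 for s != t (lag 1)
assert abs(mean) < 0.05 and abs(var - sigma**2) < 0.1 and abs(autocov) < 0.05
```

All three sample quantities converge to their population counterparts as the sample grows, which is what makes white noise a usable building block for model diagnostics.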

One Variable Time Series Model

A single variable time series model of white noise may be specified as an auto-regressive model. An auto-regressive model describes the behavior of the current value of a variable in terms of its own values observed in the past. An auto-regressive model of white noise is specified as follows:

Ut = V Ut-1 + et ……(1)

Auto-regressive errors like Ut in equation 1 generally represent the residual variation in the dependent variable of a regression function: the part which is not explained by the systematic component of the equation and which is caused by random disturbances. The systematic part of the regression model is backed up by some well established theory; the theory, however, does not offer any explanation of the residual part of the function. Regression models which treat the white noise variable Ut as an outcome of an auto-regressive process may be specified as follows:

Yt = βXt + Ut ……(2)

The theory considers the values of the Ys and Xs to have been generated by a stochastic time series process. This assumption may be extended to cover the generation of the values of the residuals/errors as well. The errors Ut may be represented by a function like 1, which considers each error Ut to equal its immediately preceding value Ut-1, scaled by V, plus an innovation et. The term innovation is used for et in order to avoid the term error, which would otherwise convey that the error embodies another error in itself. Alternatively, the series of values of Ut may be manipulated to depict the history of Ut.

Higher Order Auto-Regression Process

Statistical evidence may sometimes show that the residuals/errors cannot be replicated by the first order auto-regressive stochastic process embodied in relation 1, but are instead subject to the more intricate and involved process of higher order auto-regression of the following type:

Ut = V1 Ut-1 + V2 Ut-2 + V3 Ut-3 + … + Vs Ut-s + et ……(3)

The second order auto-regressive time series process is a simple form of relation 3:

Ut = V1 Ut-1 + V2 Ut-2 + et ……(4)
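Relation 4 can be illustrated by simulation; a sketch with hypothetical coefficients V1 and V2 chosen inside the stationary region, recovering them by OLS:

```python
import numpy as np

rng = np.random.default_rng(4)
v1, v2 = 0.5, 0.3        # hypothetical AR(2) coefficients (stationary region)
T = 20_000
e = rng.normal(0, 1, T)

# Second order auto-regression (relation 4): Ut = v1*Ut-1 + v2*Ut-2 + et
U = np.zeros(T)
for t in range(2, T):
    U[t] = v1 * U[t - 1] + v2 * U[t - 2] + e[t]

# Recover v1, v2 by OLS of Ut on Ut-1 and Ut-2
A = np.column_stack([U[1:-1], U[:-2]])
v_hat = np.linalg.lstsq(A, U[2:], rcond=None)[0]
assert np.allclose(v_hat, [v1, v2], atol=0.05)
```

With both roots of the characteristic equation inside the unit circle, the simulated series fluctuates around a constant level with stable amplitude, which anticipates the stationarity discussion below.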

The higher order auto-regression may better represent the stochastic process under consideration. In empirical analysis, however, a second or third order auto-regression may suffice to capture the actual observations. Interestingly, the errors, shown by the differences between the actual values and the mean, may embody serial/auto-correlation. If the series of values of Yt is generated by a stationary stochastic process, the values tend to fluctuate around a constant level and there is no tendency for the spread of fluctuations to increase or decrease with time. In other words, the amplitudes of fluctuations do not vary so much as to push the peaks and troughs out of range. These are the most prominent features of a stationary time series.

Unit Root Test of Stationarity

The observed values of the time series may be evaluated by the application of the criterion |U| < 1. Application of this criterion is known as the unit root test of the stationarity of the time series under consideration.

Generalization of the Auto-Regression Model

The model given by the equation

Yt = (1 - U)μ + U Yt-1 + et ……(5)

is a special and simplified case of the general auto-regression model, where μ denotes the constant level around which the series fluctuates. The model may easily be generalized by the introduction of more past values of Y, such as Yt-2, Yt-3, Yt-4, and so on up to Yt-s, along with lagged values of the error term:

Yt = (1 - U)μ + U1 Yt-1 + U2 Yt-2 + U3 Yt-3 + … + Us Yt-s + et + θ1 et-1 + θ2 et-2 + … + θs et-s ……(6)


Model 6 is defined as an auto-regressive moving average process. The distinguishing feature of this model is that it involves the lagged values of both Y and the error term e. In equation 5 we have only one lagged value of Y on the right hand side, and only one random error, et. It is thus obvious that model 5 is a simple and special version of model 6.

References

Box, George E.P. and Jenkins, G.M. (1978), Time Series Analysis: Forecasting and Control, Holden Day, San Francisco.
Davidson, Russell and MacKinnon, James G. (1993), Estimation and Inference in Econometrics, Oxford University Press, New York.
Fox, Karl A. ( ), Agricultural Policy and Econometric Analysis.
Green
Gujarati, Damodar N. and Sangeetha (2007), Basic Econometrics, Tata-McGraw Hill, New Delhi.
Harvey, A.C.
Sims, C.A. (1980), Macro-econometrics and Reality, Econometrica, Vol. 48.