Linear Regression - Stats 2 (Translated)
REGRESSION ANALYSIS
Scope
Linear Regression Analysis
• There are several types of regression analysis; this material covers linear regression analysis
Continuous Predictor
• The examples in this material use predictors measured as continuous data
Correlation: Covariance = Shared Variance
• X correlates with Y
• X is related to Y
• The variance of X is in line with the variance of Y
• The X and Y variances are aligned
• etc.
Regression Analysis: Regression Analysis = Prediction
• X predicts Y
• X affects Y
• X determines Y
• X increases Y
• X decreases Y
• etc.
Regression analysis is an extension of the correlation test, so the coefficients it produces are influenced by the correlation between the variables
Correlation Test → Regression Analysis
Terminology
Dependent Variable
• Criterion
• Output/Outcome
• Consequence
• Resultant
• Effect
• Impact
Independent Variable
• Predictor
• Antecedent
• Explanatory variable
• Reason (cause)
• Driver
• Starter
• Originator
• Igniter
• Trigger
The choice of term depends on the context of the variables and the meaning of their relationship
Regression Analysis
• When to use
• When researchers want to predict the quantitative value of a dependent variable (Y) from an independent variable (X)
• What are the characteristics of the variables?
• Dependent variable = continuous (interval/ratio)
• It has to be interval/ratio because a mean value can be estimated from such data
• Independent variable = continuous/categorical (nominal/ordinal/interval/ratio)
• For categorical predictors, regression with dummy variables will be discussed separately
• This discussion is limited to regression analysis with independent variables in the form of continuous data
Purpose of Regression Analysis
• The purpose of regression analysis is to obtain an equation that
connects the dependent variable and the independent variable
• Example equation: Y = A + BX
• The derived equations can sometimes be used for predictive purposes,
but more often the goal of research is to establish the relative
contribution of independent variables in determining a dependent
variable or to develop a model to describe a phenomenon.
Visual Display of Research Design
[Figure: research design diagrams for Correlation and for Prediction]
Y = A + BX
• Y = Dependent Variable (Criteria)
• A = Baseline (intercept)
• B = Prediction coefficient (also called the regression coefficient or slope)
• X = Independent Variable (Predictor)
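As a minimal sketch of how A and B in Y = A + BX can be obtained from data, the ordinary least-squares formulas below estimate both from paired scores. The data values and the function name `fit_line` are hypothetical and chosen only for illustration.

```python
def fit_line(x, y):
    """Return (A, B) for Y = A + B*X using ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and sum of squared deviations of X
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx            # prediction coefficient (slope)
    a = mean_y - b * mean_x    # baseline (intercept)
    return a, b


x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [3, 5, 4, 6, 7]  # hypothetical criterion scores
a, b = fit_line(x, y)
print(f"Y = {a:.2f} + {b:.2f}X")
```

Because B is the cross-product term divided by the variation in X, it rises and falls with the covariance between X and Y, which is why the regression coefficient is tied to the correlation.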
Y = A + BX
Prediction Coefficient (B)
Coefficient B shows how big a role X plays in predicting Y
Evidence from the Equation
• The greater the coefficient value, the greater the change in the predicted value for each unit of X
Visual Evidence
• The larger the coefficient, the more slanted the prediction line
• The line can tilt either up or down
The greater the B value, the steeper the slope of the prediction line
[Figure: prediction lines with B = 1.02 (steep) and B = 0.05 (gentle slope)]
Prediction coefficients are often called slope parameters.
Examining the Role of Eating in Feelings of Fullness
Visual Evidence:
The prediction line for X3 is more slanted than the lines for X1 and X2 (the slide labels the predictors with eating conditions such as Makan Angin and Makan Roti)
Evidence from the Equation:
Y = A + B1(X1) + B2(X2) + B3(X3)
Y = 4.8 + 0.010(X1) + 0.019(X2) + 0.61(X3)
B1 = 0.010, B2 = 0.019, B3 = 0.61
B3 > B2 > B1
Understanding Slope Coefficient
• Self-Acceptance Scale
• This scale has a score range of 20 to 50
• 1 unit = 1 self-acceptance point
• Employee salary
• Unit = rupiah | 1 unit means 1 rupiah
• Frequency of anger
• Unit = episode of anger | 1 unit = one episode of anger
• Work experience
• Unit = year | 1 unit = one year of work
Y = A + B(X)
Loyalty = 2 + 3(Salary)
Every one-unit increase in salary increases predicted loyalty by B = 3 loyalty points. At Salary = 1, predicted loyalty is 2 + 3(1) = 5 loyalty points
Y = A + B(X)
Loyalty = 1 + 5(Experience)
Every additional year of experience increases predicted loyalty by B = 5 loyalty points. At Experience = 1, predicted loyalty is 1 + 5(1) = 6 loyalty points
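The difference between a predicted value and the change per unit can be checked with a small sketch. It uses the slide's equation Loyalty = 2 + 3(Salary); the helper name `predict` and the salary values 10 and 11 are illustrative assumptions.

```python
def predict(a, b, x):
    """Predicted Y for a given X under Y = A + B*X."""
    return a + b * x


A, B = 2, 3  # Loyalty = 2 + 3(Salary), taken from the slide

predicted_at_one = predict(A, B, 1)                       # value of Y when Salary = 1
change_per_unit = predict(A, B, 11) - predict(A, B, 10)   # effect of one extra unit

print(predicted_at_one)  # 5
print(change_per_unit)   # 3
```

The intercept cancels out when two predictions are subtracted, so a one-unit increase in X always changes the prediction by exactly B, whatever the starting value of X.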
Example of Regression Analysis with Different Units
• X1 = Education Variable
• Length of education (unit: per month)
• X2 = Experience Variable
• Work Experience (unit: per year)
[Figure: example of fictitious data and analysis results]
Interpretation
EDUCATION = Every 1-year increase in education increases the predicted current monthly salary by B = 0.5 million rupiah
SALARY = Every increase of 1 million rupiah in starting salary increases the predicted current monthly salary by B = 1.2 million rupiah
Adding the constant (1 + 0.5 = 1.5; 1 + 1.2 = 2.2 million rupiah) gives the predicted salary at one unit of the predictor with the other predictor at 0, not the size of the increase
Y = A + BX
Intercept Coefficient (A)
Example: the intercept is the level of aggressiveness before crowding plays a role (crowding = 0)
Understanding Intercept
• The intercept coefficient shows where the regression line crosses the Y axis, that is, the predicted value of Y when X = 0
• The intercept can be difficult to interpret when X = 0 is not a meaningful value
Body weight = 65.4 + 2.01 (height in cm)
Intercept = 65.4
Predicted body weight at a height of 0 cm = 65.4
Notes on Linear Equations in Regression Analysis
Significance of Coefficients
• Significance test of coefficients
• Each coefficient is tested for significance to check whether its value is "reliable". Reliable means that the value is not a coincidence but reflects something that exists in the population
Significance of Coefficients
• p value (support)
• t = (estimate − 0) / SE = (B − 0) / SE
• t = 0.0943 / 0.0159 ≈ 5.93
Premise
• Premise 1 = The t value shows how big the difference between B and 0 is
• Premise 2 = Very low t values tend to be statistically insignificant
• Premise 3 = A value of 0 indicates no effect
Conclusion
If the computed t value is not significant because it falls below the critical t-table value, the B value in the regression equation is treated as 0 (not significant, not reliable)
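The t computation can be sketched as follows, using the estimate (0.0943) and standard error (0.0159) from the slide. The p value here uses a normal approximation to the t distribution, which is an assumption that only holds for reasonably large samples; statistical software uses the exact t distribution with the model's degrees of freedom. The function names are hypothetical.

```python
import math


def t_statistic(b, se, null_value=0.0):
    """t = (B - null value) / SE."""
    return (b - null_value) / se


def two_sided_p_normal(t):
    """Two-sided p value under a normal approximation (assumption: large sample)."""
    return math.erfc(abs(t) / math.sqrt(2))


t_value = t_statistic(0.0943, 0.0159)
p_value = two_sided_p_normal(t_value)

print(f"t = {t_value:.2f}")
print("significant at 0.05:", p_value < 0.05)
```

A coefficient that is large relative to its standard error gives a large t, and a large t gives a small p value, which is the chain of reasoning in the premises.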
Standardized vs Non-Standardized Coefficients
• Unstandardized coefficient
• The interpretation of the prediction coefficient depends on the unit scale of the variable, which is why it is called an unstandardized coefficient (unstandardized estimate). A higher value therefore does not necessarily mean a stronger prediction.
• To compare predictors, researchers need to use standardized coefficients
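The dependence on units can be illustrated with a short sketch. The same hypothetical predictor is expressed in years and in months: the unstandardized B changes by a factor of 12, while the standardized coefficient, computed as B × SD(X) / SD(Y), stays the same. All data values and function names are assumptions made for illustration.

```python
import math


def slope(x, y):
    """Unstandardized B from ordinary least squares."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


def sd(values):
    """Sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def standardized(b, x, y):
    """Standardized coefficient: B rescaled by the SDs of X and Y."""
    return b * sd(x) / sd(y)


years = [1, 2, 3, 4, 5]            # hypothetical experience in years
months = [12 * v for v in years]   # the same experience in months
salary = [3, 5, 4, 6, 7]           # hypothetical criterion scores

b_years, b_months = slope(years, salary), slope(months, salary)
beta_years = standardized(b_years, years, salary)
beta_months = standardized(b_months, months, salary)

print(f"B (years) = {b_years:.3f}, B (months) = {b_months:.3f}")
print(f"beta (years) = {beta_years:.3f}, beta (months) = {beta_months:.3f}")
```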
Confidence Interval
• The confidence interval shows the range of plausible coefficient values in the population at a given level of confidence (for example 95%).
• Reading rule: if the interval DOES NOT CONTAIN 0, the coefficient can be trusted and is statistically significant at the corresponding level (p < 0.05 for a 95% interval, p < 0.01 for a 99% interval)
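A rough sketch of the reading rule, reusing the estimate 0.0943 and standard error 0.0159 from the significance example. The critical value 1.96 is the normal-approximation value for a 95% interval and is an assumption; software uses the t critical value for the model's degrees of freedom, which gives a slightly wider interval in small samples. The function names are hypothetical.

```python
def confidence_interval(b, se, critical=1.96):
    """Approximate interval: B plus/minus critical value times SE."""
    half_width = critical * se
    return b - half_width, b + half_width


def excludes_zero(lower, upper):
    """True when 0 lies outside the interval."""
    return lower > 0 or upper < 0


lower, upper = confidence_interval(0.0943, 0.0159)
print(f"95% CI: [{lower:.4f}, {upper:.4f}]")
print("excludes 0:", excludes_zero(lower, upper))
```

An interval that excludes 0 leads to the same decision as a significant t test at the matching alpha level, because both are built from the same estimate and standard error.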
Some Additional Notes
• Regression coefficients are relative
• The regression coefficient depends on the variables involved in the analysis. X1's prediction of Y can change when X1 is entered into the regression analysis together with X2 and so on
• It depends on the correlation between predictors. The higher the correlation between predictors, the more the coefficients can change (remember: partial correlation)
• Very high correlations between predictors give rise to multicollinearity (discussed later)
• Regression equations sometimes include predictors whose role is not proven to be significant or whose contribution to Y is small
• In some studies, researchers keep predictors that are not proven to be significant
• Be careful when reading the results of a regression analysis
Some Additional Notes
• Significance (p) vs effect size (r²)
• In a regression analysis with one predictor, the effective contribution of the predictor equals the squared standardized prediction coefficient (beta)
Principle of Effective Contribution #2
• When the correlation between predictors is small, the total effective contribution approaches the sum of the squared standardized prediction coefficients of the individual predictors
• Example: correlation between attitude and perception = 0.01
Attitude: 0.838² = 0.70
Perception: 0.285² = 0.08
Total = 0.70 + 0.08 = 0.78 (78%), close to the reported 0.777
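The arithmetic behind this principle can be reproduced in a few lines. The standardized coefficients 0.838 (attitude) and 0.285 (perception) come from the slide; treating the sum of squared betas as the total contribution is only an approximation that assumes the predictors are nearly uncorrelated (r = 0.01 in the slide's example).

```python
# Standardized coefficients taken from the slide
betas = {"attitude": 0.838, "perception": 0.285}

# Effective contribution of each predictor, approximated by beta squared
# (assumption: predictors are almost uncorrelated)
contributions = {name: beta ** 2 for name, beta in betas.items()}
total = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
print(f"total: {total:.2f}")
```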
Initial Conditions
• All predictors correlate positively with the criterion
• The correlation between predictors is very high (r = 0.890)
Anomalies in the Results
• The prediction coefficient of one predictor turns negative
• A standardized coefficient exceeds 1
The Multicollinearity Case
[Figure: contested territory]
Situation
• In a multicollinearity case, one predictor and another "go to war" over the territory that can predict the criterion
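Both anomalies (a sign flip and a standardized coefficient above 1) can be reproduced with the standard two-predictor formula for standardized coefficients, beta1 = (r_y1 − r_y2 · r_12) / (1 − r_12²). In this sketch the predictor intercorrelation 0.890 is the value from the slides, while the predictor-criterion correlations 0.80 and 0.60 are hypothetical values chosen for illustration.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a regression with two predictors,
    computed from the three pairwise correlations."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Both predictors correlate positively with the criterion (hypothetical values),
# and they correlate 0.890 with each other (value from the slides)
beta_1, beta_2 = standardized_betas(r_y1=0.80, r_y2=0.60, r_12=0.890)

print(f"beta_1 = {beta_1:.3f}")
print(f"beta_2 = {beta_2:.3f}")
```

With these inputs the first coefficient exceeds 1 and the second becomes negative even though both zero-order correlations are positive, because the denominator 1 − r_12² shrinks as the predictors overlap.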
Notes
• Why, in multiple regression analysis, does a high correlation between a predictor and the criterion not guarantee a large prediction coefficient?
• Because its role may be taken over by another predictor that is closely related to it
• Examining the correlation matrix of the variables before running the regression analysis helps greatly in understanding the pattern of effective contributions obtained from the analysis
Purpose #3 of Regression Analysis
Model Development
Regression Analysis as Modeling
1. A model is a miniature or simplification of a phenomenon
• Many predictors influence a phenomenon, but not all of them are included in a study
2. A model is developed from a particular perspective
• The choice of variables as predictors is based on theory or on the research objectives
3. Modeling is an attempt to explain a phenomenon simply
• A phenomenon emerges through a complex process, but in regression analysis this process is represented by a single path of influence
Regression Analysis as Modeling
4. A simple (parsimonious) model that still explains the phenomenon well is preferred over a complex model
• A regression analysis is expected to contain a few predictors with large contributions rather than many predictors with the same total contribution
5. The quality of a regression model is indicated by an index
• The quality of a regression model can be shown by its model fit values
[Figure: Model Fit menu in Jamovi]
Model Development Menu in Jamovi
• Modeling in regression analysis can be done manually or automatically (enter, stepwise, etc.). Jamovi does not yet accommodate automatic methods, which gives researchers the opportunity to develop their own models
• Researchers can enter predictors into the equation in stages through blocks. One block can contain one or more predictors.
• A block is usually filled with predictors that share similar characteristics
Regression Analysis with Blocks
[Figure: predictors X1 to X5 and criterion Y, entered in two stages. Stage 1: Block 1 enters the equation. Stage 2: Block 2 enters the equation. The diagram marks the variance Block 1 shares with the other blocks and the variance shared between blocks]
• Blocks are related to one another (through shared variance/covariance), and this shared variance is also related to the variance of the criterion.
Impact of Using Blocks
• Block 1 enters and explains the variance of Y
• Block 2 enters and explains only the unique variance not related to the variance of Block 1
• Block 3 enters and explains only the unique variance not related to the variance of Block 1 and Block 2
• Block 3 gets the leftover variance
• Blocks enter in sequence, so the block entered first tends to explain more of the criterion variance than blocks entered at the second stage and later
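The effect of entry order can be sketched for the simplest case of two single-predictor blocks, using the standard formula for R² with two predictors computed from correlations. The three correlation values are hypothetical and serve only to show how the variance credited to a block depends on when it enters.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R squared for a regression with two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations: each predictor with Y, and between the predictors
r_y1, r_y2, r_12 = 0.5, 0.4, 0.6
r2_full = r2_two_predictors(r_y1, r_y2, r_12)

# Order A: X1 is Block 1, X2 is Block 2
x1_first = r_y1 ** 2
x2_added = r2_full - x1_first   # unique variance left for X2

# Order B: X2 is Block 1, X1 is Block 2
x2_first = r_y2 ** 2
x1_added = r2_full - x2_first   # unique variance left for X1

print(f"full model R2 = {r2_full:.4f}")
print(f"X2 entered first: {x2_first:.4f}, X2 entered second: {x2_added:.4f}")
print(f"X1 entered first: {x1_first:.4f}, X1 entered second: {x1_added:.4f}")
```

The full-model R² is the same in both orders; only the split between blocks changes, since the shared variance is credited to whichever block enters first.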
Choosing Blocks
Block 1: Social constructs
Block 2: Internal to the individual
Block 3: Demographics
• Block 1 (trait)
• Extraversion (X1)
• Openness (X2)
• Block 2 (motivational)
• Perseverance (X3)
• Patience (X4)
Model Coefficients
• Outputs information about the confidence interval and the standardized coefficient (standardized estimate)