Statistics II

Welcome to Stats by Randolph. The main page offers further information and content.

Authors of this page: Markus Janczyk and Valentin Koob contributed to the content of this page. The content is used for teaching in the psychology degree programs by the AG Psychologische Forschungsmethoden und Kognitive Psychologie at the University of Bremen, but it is available to anyone interested. Feedback, error reports, and suggestions can be sent to randolph@uni-bremen.

Version history:

v1.0: first version published online (03.06.2024)

# Packages used here:
library(psych)

In the main text of Statistics II, the section on factor analysis (Part 13) introduced the practical procedure with the function efa() from the package lavaan. Here we present the alternative function fa() from the package psych. We use the same data as in the main text. We first load the data and then drop the first column, which only indexes the participants:

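# Read the data; the first row of the file holds the variable names
daten <- read.table("./Daten/daten_PCA_FA.dat",
                    header = TRUE)
# Drop the first column (participant index)
daten <- subset(daten,
                select = -1)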

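The function fa() offers, among other methods, maximum likelihood estimation (argument fm = "ml") and principal axis factoring (argument fm = "pa"). If fm is not specified, the minimum residual method ("minres") is used. Note that the estimation method is selected with fm, not with fa: an argument named fa has no effect on the estimation, which is why the outputs on this page report method = minres and label the factors MR1 and MR2. A call using the raw data as a data frame looks like this: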

fa1 <- fa(daten,           # raw data
          nfactors = 2,    # number of factors
          fa = "ml",       # intended as ML; fa() expects `fm`, so minres is used
          rotate = "none") # no rotation (or name a rotation method)
print.psych(fa1,           # fa() object
            cut = 0.3,     # hide loadings with absolute value below 0.3
            sort = TRUE)   # sort the variables by their loadings

## Factor Analysis using method =  minres
## Call: fa(r = daten, nfactors = 2, rotate = "none", fa = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##               item   MR1  MR2   h2   u2 com
## Schreiben        6  0.69 0.50 0.73 0.27 1.8
## Buchstabieren    4  0.66 0.56 0.74 0.26 2.0
## Lesen            5  0.65 0.52 0.70 0.30 1.9
## Algebra          1 -0.51 0.67 0.70 0.30 1.9
## Analysis         3 -0.56 0.66 0.75 0.25 2.0
## Geometrie        2 -0.55 0.63 0.70 0.30 2.0
## 
##                        MR1  MR2
## SS loadings           2.21 2.12
## Proportion Var        0.37 0.35
## Cumulative Var        0.37 0.72
## Proportion Explained  0.51 0.49
## Cumulative Proportion 0.51 1.00
## 
## Mean item complexity =  1.9
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  15  with the objective function =  3.35 with Chi Square =  992.23
## df of  the model are 4  and the objective function was  0.02 
## 
## The root mean square of the residuals (RMSR) is  0.01 
## The df corrected root mean square of the residuals is  0.02 
## 
## The harmonic n.obs is  300 with the empirical chi square  1.1  with prob <  0.89 
## The total n.obs was  300  with Likelihood Chi Square =  6.91  with prob <  0.14 
## 
## Tucker Lewis Index of factoring reliability =  0.989
## RMSEA index =  0.049  and the 90 % confidence intervals are  0 0.11
## BIC =  -15.9
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    MR1  MR2
## Correlation of (regression) scores with factors   0.94 0.94
## Multiple R square of scores with factors          0.89 0.88
## Minimum correlation of possible factor scores     0.78 0.77
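
As a cross-check of the output above, the columns h2 (communality) and u2 (uniqueness) and the row SS loadings can be recomputed from the loading matrix. The following sketch retypes the rounded loadings from the output by hand. The object names `L`, `h2`, `u2`, and `ss` are chosen here for illustration only, and the results agree with the printed values only up to rounding.

```r
# Unrotated loadings, copied by hand from the printed output (rounded to 2 digits)
L <- matrix(c( 0.69, 0.50,
               0.66, 0.56,
               0.65, 0.52,
              -0.51, 0.67,
              -0.56, 0.66,
              -0.55, 0.63),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("Schreiben", "Buchstabieren", "Lesen",
                              "Algebra", "Analysis", "Geometrie"),
                            c("MR1", "MR2")))

h2 <- rowSums(L^2)   # communality: variance of each variable explained by both factors
u2 <- 1 - h2         # uniqueness: remaining variance (the variables are standardized)
ss <- colSums(L^2)   # SS loadings: variance explained by each factor

round(cbind(h2, u2), 2)
round(rbind("SS loadings"    = ss,
            "Proportion Var" = ss / nrow(L)), 2)
```

Dividing the SS loadings by the number of variables (here 6) gives the row Proportion Var, because each standardized variable contributes a variance of 1.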

Among a range of further information, the block Standardized loadings reports the estimated loadings. It also nicely illustrates that an initial solution need not be easy to interpret: every variable has a loading of at least 0.50 in absolute value on both factors. We therefore rotate the solution using a varimax rotation:

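fa2 <- fa(daten,              # raw data
          nfactors = 2,       # number of factors
          fa = "ml",          # intended as ML; fa() expects `fm`, so minres is used
          rotate = "varimax") # request a varimax rotation
print.psych(fa2,
            cut = 0.3,        # hide loadings with absolute value below 0.3
            sort = TRUE)      # sort the variables by their loadings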

## Factor Analysis using method =  minres
## Call: fa(r = daten, nfactors = 2, rotate = "varimax", fa = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##               item   MR1   MR2   h2   u2 com
## Buchstabieren    4  0.86       0.74 0.26   1
## Schreiben        6  0.85       0.73 0.27   1
## Lesen            5  0.84       0.70 0.30   1
## Analysis         3        0.87 0.75 0.25   1
## Algebra          1        0.84 0.70 0.30   1
## Geometrie        2        0.83 0.70 0.30   1
## 
##                        MR1  MR2
## SS loadings           2.17 2.16
## Proportion Var        0.36 0.36
## Cumulative Var        0.36 0.72
## Proportion Explained  0.50 0.50
## Cumulative Proportion 0.50 1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  15  with the objective function =  3.35 with Chi Square =  992.23
## df of  the model are 4  and the objective function was  0.02 
## 
## The root mean square of the residuals (RMSR) is  0.01 
## The df corrected root mean square of the residuals is  0.02 
## 
## The harmonic n.obs is  300 with the empirical chi square  1.1  with prob <  0.89 
## The total n.obs was  300  with Likelihood Chi Square =  6.91  with prob <  0.14 
## 
## Tucker Lewis Index of factoring reliability =  0.989
## RMSEA index =  0.049  and the 90 % confidence intervals are  0 0.11
## BIC =  -15.9
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    MR1  MR2
## Correlation of (regression) scores with factors   0.94 0.94
## Multiple R square of scores with factors          0.89 0.89
## Minimum correlation of possible factor scores     0.78 0.77
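
To see what the rotation does, the unrotated loadings can be rotated by hand with base R's varimax() from the stats package. This is only a sketch under assumptions: the loadings are retyped from the rounded first output, the names `L` and `L_rot` are illustrative, and base R's routine is not necessarily the exact one that fa() calls internally. The sign and the order of the rotated columns may therefore differ from the printed output.

```r
# Unrotated loadings, copied by hand from the first output (rounded to 2 digits)
L <- matrix(c( 0.69, 0.50,
               0.66, 0.56,
               0.65, 0.52,
              -0.51, 0.67,
              -0.56, 0.66,
              -0.55, 0.63),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("Schreiben", "Buchstabieren", "Lesen",
                              "Algebra", "Analysis", "Geometrie"),
                            c("MR1", "MR2")))

rot   <- varimax(L)              # stats::varimax(), Kaiser normalization by default
L_rot <- unclass(rot$loadings)   # rotated loadings as a plain matrix

round(L_rot, 2)

# An orthogonal rotation only redistributes variance between the factors:
# the communalities are the same before and after the rotation
all.equal(rowSums(L^2), rowSums(L_rot^2))
```

After the rotation, each variable should show one large loading and one loading below the cut-off of 0.3, which is the simple structure visible in the rotated output.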

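The rotated solution is very similar to the one returned by efa(). Each variable now loads on only one factor: Buchstabieren, Schreiben, and Lesen on MR1, and Analysis, Algebra, and Geometrie on MR2. All other loadings are below the cut-off of 0.3 and are therefore not displayed.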
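
The test of the hypothesis that 2 factors are sufficient and the fit indices are identical in the unrotated and the rotated output, because a rotation does not change how well the model reproduces the correlations. As a rough illustration, several of these numbers can be recomputed from the printed chi square value. The formulas below are common textbook definitions and are an assumption here; they are not taken from the source code of fa(), and the variable names are illustrative.

```r
n     <- 300    # number of observations (from the output)
p     <- 6      # number of observed variables
m     <- 2      # number of factors
chisq <- 6.91   # likelihood chi square (from the output)

df_null  <- p * (p - 1) / 2              # df of the null model: 15
df_model <- ((p - m)^2 - (p + m)) / 2    # df of the two-factor model: 4

# p-value of the chi square test with df_model degrees of freedom
p_value <- pchisq(chisq, df = df_model, lower.tail = FALSE)

# BIC as chi square minus df times log(n) (assumed definition)
bic <- chisq - df_model * log(n)

# RMSEA, one common definition (assumed, may differ in detail from psych)
rmsea <- sqrt(max((chisq / df_model - 1) / (n - 1), 0))

round(c(p = p_value, BIC = bic, RMSEA = rmsea), 3)
```

Up to rounding, these values match the printed prob, BIC, and RMSEA index.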

Finally, we show how to run a factor analysis based on the covariance matrix (e.g., if the data should not be standardized). To do so, the argument covar = TRUE must be set. The resulting output now contains the block Unstandardized loadings:

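fa3 <- fa(daten,              # raw data; the covariance matrix is computed from them
          nfactors = 2,       # number of factors
          fa = "ml",          # intended as ML; fa() expects `fm`, so minres is used
          covar = TRUE,       # factor the covariance instead of the correlation matrix
          rotate = "varimax") # request a varimax rotation
print.psych(fa3,
            cut = 0.3,        # hide loadings with absolute value below 0.3
            sort = TRUE)      # sort the variables by their loadings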

## Factor Analysis using method =  minres
## Call: fa(r = daten, nfactors = 2, rotate = "varimax", covar = TRUE, 
##     fa = "ml")
## Unstandardized loadings (pattern matrix) based upon covariance matrix
##               item   MR1   MR2 h2   u2   H2   U2
## Geometrie        2  4.81       23 10.1 0.70 0.30
## Analysis         3  4.70       22  7.2 0.75 0.25
## Algebra          1  4.43       20  8.3 0.70 0.30
## Lesen            5        4.80 23  9.9 0.70 0.30
## Schreiben        6        4.53 21  7.6 0.73 0.27
## Buchstabieren    4        4.12 17  5.9 0.74 0.26
## 
##                         MR1   MR2
## SS loadings           64.87 60.63
## Proportion Var         0.37  0.35
## Cumulative Var         0.37  0.72
## Proportion Explained   0.52  0.48
## Cumulative Proportion  0.52  1.00
## 
##  Standardized loadings (pattern matrix)
##               item    MR1    MR2   h2   u2
## Geometrie        2   0.83        0.70 0.30
## Analysis         3   0.87        0.75 0.25
## Algebra          1   0.84        0.70 0.30
## Lesen            5          0.84 0.70 0.30
## Schreiben        6          0.85 0.73 0.27
## Buchstabieren    4          0.86 0.74 0.26
## 
##                  MR1  MR2
## SS loadings     2.15 2.17
## Proportion Var  0.36 0.36
## Cumulative Var  0.36 0.72
## Cum. factor Var 0.50 1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## df null model =  15  with the objective function =  151.62 with Chi Square =  44904.21
## df of  the model are 4  and the objective function was  0.02 
## 
## The root mean square of the residuals (RMSR) is  0.34 
## The df corrected root mean square of the residuals is  0.66 
## 
## The harmonic n.obs is  300 with the empirical chi square  1058.88  with prob <  6.2e-228 
## The total n.obs was  300  with Likelihood Chi Square =  6.94  with prob <  0.14 
## 
## Tucker Lewis Index of factoring reliability =  1
## RMSEA index =  0.049  and the 90 % confidence intervals are  0 0.11
## BIC =  -15.87
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    MR1  MR2
## Correlation of (regression) scores with factors   0.94 0.94
## Multiple R square of scores with factors          0.89 0.89
## Minimum correlation of possible factor scores     0.77 0.78
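
The two loading blocks in this output are linked by the standard deviations of the variables: dividing an unstandardized loading by the standard deviation of its variable gives the standardized loading. The sketch below illustrates this with the numbers retyped from the output. It approximates the variance of each variable by the squared primary loading plus the unique variance u2, which ignores the small cross-loadings hidden by cut = 0.3. The variable names are illustrative.

```r
vars <- c("Geometrie", "Analysis", "Algebra", "Lesen", "Schreiben", "Buchstabieren")

loading_raw <- c(4.81, 4.70, 4.43, 4.80, 4.53, 4.12)  # primary unstandardized loadings
u2_raw      <- c(10.1, 7.2, 8.3, 9.9, 7.6, 5.9)       # unique variances (column u2)
names(loading_raw) <- names(u2_raw) <- vars

# Approximate model-implied variance of each variable
# (cross-loadings below the cut-off are ignored)
var_implied <- loading_raw^2 + u2_raw

loading_std <- loading_raw / sqrt(var_implied)  # rescale to standard-deviation units
H2          <- loading_raw^2 / var_implied      # standardized communality

round(cbind(loading_std, H2), 2)
```

Up to rounding, the results reproduce the block Standardized loadings and the column H2. Note also that the factor labels are swapped relative to the correlation-based output: here MR1 collects the mathematics variables and MR2 the language variables.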

Factor analysis with the function `fa()`

AG Psychologische Forschungsmethoden und Kognitive Psychologie, Institute of Psychology, University of Bremen
