Chapter 5. Statistical Tests and Linear Regression

using Pkg;
Pkg.add("HypothesisTests");
using HypothesisTests, CSV, DataFrames, RDatasets
df = dataset("datasets", "iris")
first(df, 5)

5×5 DataFrame
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     │ Float64      Float64     Float64      Float64     Cat…
─────┼───────────────────────────────────────────────────────────
   1 │         5.1         3.5          1.4         0.2  setosa
   2 │         4.9         3.0          1.4         0.2  setosa
   3 │         4.7         3.2          1.3         0.2  setosa
   4 │         4.6         3.1          1.5         0.2  setosa
   5 │         5.0         3.6          1.4         0.2  setosa

T-tests

One-sample
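A minimal sketch of a one-sample t-test with HypothesisTests, assuming we test the mean of `SepalLength` against a reference value. The value `mu0 = 5.8` is a hypothetical choice made only for illustration.

```julia
using RDatasets, HypothesisTests, Statistics

df = dataset("datasets", "iris")

# H0: the population mean of SepalLength equals mu0 (hypothetical reference value)
mu0 = 5.8
t1 = OneSampleTTest(df.SepalLength, mu0)

println(t1)                        # full test summary
println("p-value: ", pvalue(t1))   # two-sided by default
ci = confint(t1)                   # confidence interval for the mean
println("CI: ", ci)
```

`pvalue` and `confint` can be called on the other test objects in this chapter in the same way.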

Two-samples
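A possible two-sample comparison, assuming we compare `SepalLength` between the setosa and versicolor species. Both the Welch test (unequal variances) and the pooled-variance test are shown; which one is appropriate depends on the data.

```julia
using RDatasets, HypothesisTests

df = dataset("datasets", "iris")

# Split SepalLength by species
setosa     = df.SepalLength[df.Species .== "setosa"]
versicolor = df.SepalLength[df.Species .== "versicolor"]

# Welch's t-test: does not assume equal variances
t_welch = UnequalVarianceTTest(setosa, versicolor)

# Student's t-test: assumes equal variances in both groups
t_pooled = EqualVarianceTTest(setosa, versicolor)

println(t_welch)
println("pooled p-value: ", pvalue(t_pooled))
```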

Paired-samples

Attention: a paired-samples test is not valid for these data; it is used here only to illustrate how to call the function.
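Purely to show the call, a sketch that treats `SepalLength` and `SepalWidth` of the same flower as a pair. In HypothesisTests, passing two equal-length vectors to `OneSampleTTest` tests the mean of their element-wise differences.

```julia
using RDatasets, HypothesisTests

df = dataset("datasets", "iris")

# Paired test: H0 is that the mean of (SepalLength - SepalWidth) equals 0
t_paired = OneSampleTTest(df.SepalLength, df.SepalWidth)

println(t_paired)
println("p-value: ", pvalue(t_paired))
```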

ANOVA
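A one-way ANOVA can be sketched with `OneWayANOVATest`, which takes one vector per group. The choice of `SepalLength` grouped by `Species` is an assumption for illustration, and the function requires a reasonably recent HypothesisTests release.

```julia
using RDatasets, HypothesisTests

df = dataset("datasets", "iris")

# One vector of SepalLength values per species
groups = [df.SepalLength[df.Species .== s] for s in unique(df.Species)]

# H0: all group means are equal
anova = OneWayANOVATest(groups...)

println(anova)
println("p-value: ", pvalue(anova))
```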

Models in Julia

1. Multivariate Linear Regression
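A minimal sketch of a regression with several predictors, assuming the GLM package is installed. The choice of `PetalLength` as the outcome is only an illustration.

```julia
using RDatasets, GLM

df = dataset("datasets", "iris")

# Ordinary least squares with three predictors
model = lm(@formula(PetalLength ~ SepalLength + SepalWidth + PetalWidth), df)

println(coeftable(model))      # estimates, standard errors, t and p values
println("R2 = ", r2(model))    # coefficient of determination
```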

2. Fixed-effects Models (fast)

The FixedEffectModels package also supports CUDA, which can speed up estimation on a machine with an Nvidia GPU and the CUDA toolkit.
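A sketch of a fixed-effects regression on the CPU, assuming we absorb a species-specific intercept with `fe(Species)`. The GPU call is left commented out because it depends on the installed package versions: the `method = :CUDA` keyword and the need to load CUDA.jl first are assumptions based on recent FixedEffectModels releases, so check the documentation of your version.

```julia
using RDatasets, FixedEffectModels

df = dataset("datasets", "iris")

# Species enters as a high-dimensional fixed effect rather than as dummy columns
fe_model = reg(df, @formula(PetalLength ~ SepalLength + SepalWidth + fe(Species)))
println(fe_model)

# GPU variant (assumption: CUDA.jl is installed, loaded before FixedEffectModels,
# and the installed version accepts `method = :CUDA`):
# using CUDA, FixedEffectModels
# reg(df, @formula(PetalLength ~ SepalLength + SepalWidth + fe(Species)); method = :CUDA)
```

On a dataset as small as iris the GPU brings no benefit; the speed-up matters for large data with many fixed-effect levels.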

3. Mediation

While it is common to run mediation analysis with packages in R and Python, I have not found an appropriate alternative in Julia.

Therefore, we may need to do it ourselves:
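A minimal sketch of the data setup, assuming the mapping X = `PetalWidth`, M = `SepalWidth`, Y = `PetalLength`. This mapping is inferred from the values in the table that follows, not stated in the text.

```julia
using RDatasets, DataFrames

df = dataset("datasets", "iris")

# Rename the columns to the roles they play in the mediation model
data = DataFrame(X = df.PetalWidth, M = df.SepalWidth, Y = df.PetalLength)
first(data, 3)
```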

3×3 DataFrame
 Row │ X        M        Y
     │ Float64  Float64  Float64
─────┼───────────────────────────
   1 │     0.2      3.5      1.4
   2 │     0.2      3.0      1.4
   3 │     0.2      3.2      1.3

Calculate direct and indirect effects:
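One way to do this is the classic product-of-coefficients approach with three OLS regressions. This is a sketch, assuming GLM and the same X/M/Y mapping as the table above (X = `PetalWidth`, M = `SepalWidth`, Y = `PetalLength`); it gives point estimates only, without standard errors for the indirect effect.

```julia
using RDatasets, DataFrames, GLM

df = dataset("datasets", "iris")
data = DataFrame(X = df.PetalWidth, M = df.SepalWidth, Y = df.PetalLength)

model_m = lm(@formula(M ~ X), data)       # path a:  X -> M
model_y = lm(@formula(Y ~ X + M), data)   # paths c' (X -> Y) and b (M -> Y)
model_t = lm(@formula(Y ~ X), data)       # total effect c

a        = coef(model_m)[2]   # coefficient order: (Intercept), X
c_prime  = coef(model_y)[2]   # coefficient order: (Intercept), X, M
b        = coef(model_y)[3]
c_total  = coef(model_t)[2]
indirect = a * b

println("direct effect   c' = ", c_prime)
println("indirect effect ab = ", indirect)
println("total effect    c  = ", c_total)
```

In OLS with the same sample for all three models, the total effect equals the direct plus the indirect effect, which gives a quick sanity check.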

(Attention: the theoretical assumptions of this example do not hold; it is for illustration only.)

4. Moderation
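A sketch of a moderation model as a regression with an interaction term, assuming GLM and the mapping X = `PetalWidth`, W = `SepalWidth`, Y = `PetalLength` (inferred from the table values below). The helper `slope_at` is a hypothetical name introduced here for the conditional effect of X at a given value of W.

```julia
using RDatasets, DataFrames, GLM, Statistics

df = dataset("datasets", "iris")
data = DataFrame(X = df.PetalWidth, W = df.SepalWidth, Y = df.PetalLength)

# X * W expands to X + W + X & W (both main effects plus the interaction)
model = lm(@formula(Y ~ X * W), data)
println(coeftable(model))

# Conditional (simple) slope of X on Y at a given value w of the moderator
b = coef(model)                 # order: (Intercept), X, W, X & W
slope_at(w) = b[2] + b[4] * w

w_lo = mean(data.W) - std(data.W)
w_hi = mean(data.W) + std(data.W)
println("slope of X at W = mean - 1 SD: ", slope_at(w_lo))
println("slope of X at W = mean + 1 SD: ", slope_at(w_hi))

first(data, 3)
```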

3×3 DataFrame
 Row │ X        W        Y
     │ Float64  Float64  Float64
─────┼───────────────────────────
   1 │     0.2      3.5      1.4
   2 │     0.2      3.0      1.4
   3 │     0.2      3.2      1.3

(Attention: the theoretical assumptions of this example do not hold; it is for illustration only.)

5. Conditional Process

A more complete analysis, therefore, should attempt to model the mechanisms at work linking X to Y while simultaneously allowing those effects to be contingent on context, circumstance, or individual differences. (Andrew F. Hayes, 2018, p. 395)

For instance, suppose we want to test whether the relation between X and Y is mediated by M, and whether the effect of M on Y is moderated by W. In the current dataset, we assume:

  • X: PetalWidth

  • M: SepalLength

  • W: SepalWidth

  • Y: PetalLength

3×4 DataFrame
 Row │ X        M        W        Y
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │     0.2      5.1      3.5      1.4
   2 │     0.2      4.9      3.0      1.4
   3 │     0.2      4.7      3.2      1.3

Suppose that we expect:

  • $M=i_M+aX+\epsilon_M$

  • $Y=i_Y+c'X+b_1M+b_2W+b_3WM+\epsilon_Y$
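The two equations above can be sketched as two OLS fits with GLM. Under this model the indirect effect of X on Y through M depends on W and equals $a(b_1 + b_3 W)$. The helper name `cond_indirect` is hypothetical, and only point estimates are computed.

```julia
using RDatasets, DataFrames, GLM, Statistics

df = dataset("datasets", "iris")
data = DataFrame(X = df.PetalWidth, M = df.SepalLength,
                 W = df.SepalWidth, Y = df.PetalLength)

model_m = lm(@formula(M ~ X), data)                  # M = i_M + a X
model_y = lm(@formula(Y ~ X + M + W + M & W), data)  # Y = i_Y + c'X + b1 M + b2 W + b3 WM

a  = coef(model_m)[2]
cy = coef(model_y)              # order: (Intercept), X, M, W, M & W
c_prime, b1, b2, b3 = cy[2], cy[3], cy[4], cy[5]

# Conditional indirect effect of X on Y through M at moderator value w
cond_indirect(w) = a * (b1 + b3 * w)

println("direct effect c' = ", c_prime)
println("indirect effect at mean W = ", cond_indirect(mean(data.W)))
println("a * b3 (change in indirect effect per unit of W) = ", a * b3)
```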

If you need bootstrapping and the other procedures introduced by Hayes, you can call R or Python from Julia (for example through RCall.jl or PyCall.jl).
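As an alternative to calling R or Python, a percentile bootstrap of the conditional indirect effect can also be sketched directly in Julia. This is not Hayes's PROCESS implementation: the function names `conditional_indirect` and `bootstrap_ci`, the number of resamples and the fixed seed are all assumptions made for illustration.

```julia
using RDatasets, DataFrames, GLM, Random, Statistics

df = dataset("datasets", "iris")
data = DataFrame(X = df.PetalWidth, M = df.SepalLength,
                 W = df.SepalWidth, Y = df.PetalLength)

# Point estimate of a * (b1 + b3 * w) on a given data frame
function conditional_indirect(d::AbstractDataFrame, w::Real)
    a  = coef(lm(@formula(M ~ X), d))[2]
    cy = coef(lm(@formula(Y ~ X + M + W + M & W), d))  # (Intercept), X, M, W, M & W
    return a * (cy[3] + cy[5] * w)
end

# Percentile bootstrap: resample rows with replacement and re-estimate each time
function bootstrap_ci(d::AbstractDataFrame, w::Real;
                      B::Int = 1000, level::Real = 0.95,
                      rng::AbstractRNG = MersenneTwister(1))
    n = nrow(d)
    est = [conditional_indirect(d[rand(rng, 1:n, n), :], w) for _ in 1:B]
    alpha = 1 - level
    return quantile(est, [alpha / 2, 1 - alpha / 2])
end

w_mean = mean(data.W)
ci = bootstrap_ci(data, w_mean)
println("estimate: ", conditional_indirect(data, w_mean))
println("95% percentile bootstrap CI: ", ci)
```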
