# Data Analytics in Julia

* [**🔗 Read the book online**](https://data-julia.rongxin.me)
* By [**Rongxin Ouyang**](https://rongxin.me/cv), PhD student in Computational Communication, NUS

![](https://github.com/reycn/data-analytics-in-julia/blob/main/gitbook/image/cover.png)\
*(Generated by GPT-4o)*

## Scope

This short book provides a practical guide for data analysis in social science using Julia. It replicates common analytical steps in the field.

The book uses Julia because of its speed.

## Outline

* [✅ Chapter 1. Installation](https://data-julia.rongxin.me/data-analysis-in-julia/1.installation.basics.jl)
  * ✅ Why do we need Julia
  * ✅ How to install Julia
  * ✅ How to install Julia as a Jupyter kernel for notebooks
  * ✅ The basics of operations, data structures, packages, methods, and function definitions
* [✅ Chapter 2. Data Loading and Selection](https://data-julia.rongxin.me/data-analysis-in-julia/2.data.loading.selection.jl)
  * ✅ Load a dataframe from a local file, an online link, or a common dataset; or create it from scratch
  * ✅ Select by rows, columns, and conditions
* [✅ Chapter 3. Transformation and Calculation](https://data-julia.rongxin.me/data-analysis-in-julia/3.transform.calculate.jl)
  * ✅ Split and combine
  * ✅ Grouping
  * ✅ Sorting
  * ✅ Transforming between long / wide tables
  * ✅ Find / fill / drop missing values
* [✅ Chapter 4. Pipeline and Useful Packages](https://data-julia.rongxin.me/data-analysis-in-julia/4.pipeline.tools.jl)
  * ✅ Data pipeline
  * ✅ Manipulate strings
  * ✅ Network
* [✅ Chapter 5.1 Models and Tests](https://data-julia.rongxin.me/data-analysis-in-julia/5.1.models.jl)
  1. ✅ Common parametric tests (t-tests and ANOVA)
  2. ✅ Regression (multivariate regression and fixed effects models)
  3. ✅ Path Analysis
     1. ✅ Mediation
     2. ✅ Moderation
     3. ✅ Conditional Path Analysis
* [✅ Chapter 5.2 Models and Tests (continued)](https://data-julia.rongxin.me/data-analysis-in-julia/5.2.models.jl)
  1. 🚧 / ✅ Counterfactual Framework
     1. 🚧 / ✅ Instrumental Variables
     2. 🚧 / ✅ Regression Discontinuity Design
     3. 🚧 / ✅ Difference-in-Differences
     4. 🚧 / ✅ Synthetic Control
     5. 🚧 / ✅ Synthetic Difference-in-Differences
* [✅ Chapter 6. Visualization](https://data-julia.rongxin.me/data-analysis-in-julia/6.visualize.jl) (ggplot2-like)
  * ✅ Scatterplot, barplot, lineplot, and histogram
  * ✅ Styles and themes
  * ✅ Multiple plots in facets
* [✅ Chapter 7. Using R and Python in Julia](https://data-julia.rongxin.me/data-analysis-in-julia/7.r.and.python.in.julia.jl)
  * ✅ Using R functions and R code blocks in Julia
  * ✅ Using Python functions and Python code blocks in Julia
* [✅ Chapter 8. Performance Optimization](https://data-julia.rongxin.me/8.performance.jl)
  * ✅ Tips for increasing the speed
  * ✅ Profiling tool and visualization
* [✅ Appendix. Code for Plotting](https://data-julia.rongxin.me/8.plot.and.notebooks)
  * ✅ All code used for plotting

## License

This work is licensed under a [Creative Commons Attribution-NonCommercial 4.0 International License](https://creativecommons.org/licenses/by-nc/4.0/).

[![CC BY-NC 4.0](https://licensebuttons.net/l/by-nc/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc/4.0/)
