Chapter 1. Installation and Basics

Why do we need Julia?

Given the popularity of Python, R, and Stata (in Econ.), why do we need Julia? Three reasons:

  • Fast: compared with popular programming languages for data analysis, Julia is highly efficient, as shown in the following figure.

    img

    (The evaluation was done by Daniel Moura at https://towardsdatascience.com/r-vs-python-vs-julia-90456a2bcbab)

  • Easy to use: the language is new, but it is quite similar to R and Python and is optimized for data analysis.

  • Python / R integrable: Julia lets you integrate ggplot2 from R and PyTorch from Python into a data project written in Julia. More details can be found in Chapter 7.

How to install Julia

For a fresh installation on Linux / macOS (if you are using Windows, click here):

curl -fsSL https://install.julialang.org | sh

If you have a previous juliaup installation, you can remove it with:

juliaup self uninstall

Using Julia in Jupyter Notebooks

  1. Run julia in your shell / command line

  2. Install IJulia, the package that installs a kernel for Jupyter notebooks

     using Pkg
     Pkg.add("IJulia")
  3. Install a kernel for Jupyter notebook:

    using IJulia
    installkernel("Julia", "--depwarn=no")
  4. (optional) Install additional kernels for multi-threaded computing:

    • 8 threads

      using IJulia
      installkernel("Julia (8 threads)", env=Dict("JULIA_NUM_THREADS"=>"8"))
    • 32 threads

      using IJulia
      installkernel("Julia (32 threads)", env=Dict("JULIA_NUM_THREADS"=>"32"))

    (Additional information can be found here)
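To check that a multi-threaded kernel picked up the setting, a quick sketch like the following can be run in a notebook cell. The variable names n and results are illustrative only, and the printed thread count depends on which kernel you selected.

```julia
# Number of threads available to the current Julia session.
n = Threads.nthreads()
println("This session runs with $n thread(s)")

# Spread independent iterations across the available threads.
results = zeros(Int, 8)
Threads.@threads for i in 1:8
    results[i] = i^2   # each iteration writes only to its own slot
end
println(results)
```

With the default single-threaded kernel the loop still runs correctly, just without parallelism.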

Now, you will be able to choose Julia for your Jupyter notebooks:

img

Update Julia

In your terminal, run juliaup update

Update the kernels for the notebook:

   ```Julia
   using IJulia
   installkernel("Julia (8 threads)", env=Dict("JULIA_NUM_THREADS"=>"8"))
   ```

Basics in Julia

Unlike Python and R, which typically use print(), Julia uses the println() function to print strings and logs:

println("Hello")
Hello

Sometimes, we want to create new variables and assign values to them. The syntax is like Python's:

text = "Hello"
println(text)
Hello

Often, we want to combine several strings into one log message to document the progress of an analysis:

text = "Hello"
text_to_print = string(text, ", world!")
println(text_to_print)
Hello, world!

Alternatively, you can use the * operator to concatenate strings:

text = "Hello"
text_to_print = text * ", world!"
println(text_to_print)
Hello, world!

(This seemingly weird operation is explained here)
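A third option, sketched here as a minimal example, is string interpolation with $, which is often easier to read in log messages. The variable names interpolated and with_expr are illustrative only.

```julia
text = "Hello"

# $name inserts the value of a variable into the string.
interpolated = "$text, world!"
println(interpolated)

# $(...) inserts the result of any expression.
with_expr = "1 + 2 = $(1 + 2)"
println(with_expr)
```

All three approaches (string(), * and interpolation) produce the same kind of String value, so the choice is mostly a matter of readability.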

You may have noticed that we are using operators. In Julia, the basic operators for numbers and Boolean values are:

  1. Arithmetic Operators

    Expression   Name             Description
    x + y        binary plus      performs addition
    x - y        binary minus     performs subtraction
    x * y        times            performs multiplication
    x / y        divide           performs division
    x ÷ y        integer divide   x / y, truncated to an integer
    x ^ y        power            raises x to the yth power
    x % y        remainder        equivalent to rem(x, y)

    (These are the common operators; see the full list here)

  2. Logical Operators

    Expression   Name
    !x           Not
    x && y       And
    x || y       Or

  3. Comparisons

    Operator   Name
    ==         Equal to
    !=, ≠      Not equal to
    <          Less than
    <=, ≤      Less than or equal to
    >          Greater than
    >=, ≥      Greater than or equal to

There are more operators that are used less frequently in data analysis; see the full list here.
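The operators listed above can be tried out with a short sketch such as this one. The values 7 and 2 and the names is_odd and in_range are arbitrary choices for illustration.

```julia
x, y = 7, 2

# Arithmetic: note that / always gives a float, while ÷ truncates.
println(x / y)   # 3.5
println(x ÷ y)   # 3
println(x % y)   # 1
println(x ^ y)   # 49

# Comparisons return Bool values, which logical operators combine.
is_odd = x % 2 == 1
in_range = x > 0 && x <= 10
println(is_odd)                 # true
println(!is_odd || in_range)    # true
```

In the Julia REPL or a notebook, ÷ can be typed as \div followed by Tab.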

Data structures

Some operations apply only to certain data types (e.g., the * operator concatenates strings but multiplies numerical values). If you are unsure about the data type of a variable, check it with typeof():

String

typeof(text)
String

Integer

To save memory, you can specify the bit width of an integer:

number_a::Int8 = 60;
println(typeof(number_a))

number_b::Int128 = 60;
println(typeof(number_b))
Int8
Int128

By default, an integer is 64 bits wide (on a 64-bit system):

number_c = 60;
println(typeof(number_c))
Int64
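A smaller integer type saves memory but also narrows the range of values it can hold. The following sketch, with illustrative variable names largest and wrapped, shows the limits of Int8 and what happens when a calculation exceeds them.

```julia
# Range of values an 8-bit signed integer can represent.
println(typemin(Int8), " to ", typemax(Int8))   # -128 to 127

# Going past the maximum wraps around instead of raising an error.
largest = typemax(Int8)
wrapped = largest + Int8(1)
println(wrapped)           # -128
println(typeof(wrapped))   # Int8

# Storage size in bytes for each type.
println(sizeof(Int8), " vs ", sizeof(Int64))    # 1 vs 8
```

Because of this wrap-around, narrow integer types are best reserved for data whose range is known in advance.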

Float

number_d = 1.27

println(typeof(number_d))
Float64

What would happen if an int is multiplied by a float?

println(typeof(number_a * number_d))
Float64

The result is a Float64: Julia promotes the integer to a float before multiplying. Julia also has a dynamic type system.
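Both points can be illustrated with a small sketch. The names a, d and v are illustrative only; promote_type and promote are functions from Julia's Base.

```julia
a = Int8(60)
d = 1.27

# The common type Julia picks for mixed Int8 / Float64 arithmetic.
println(promote_type(Int8, Float64))   # Float64
println(promote(a, d))                 # (60.0, 1.27)
println(typeof(a * d))                 # Float64

# Dynamic typing: an untyped variable can be rebound to a value of another type.
v = 60
println(typeof(v))
v = "sixty"
println(typeof(v))   # String
```

A variable declared with a type annotation, such as number_a::Int8 above, cannot be rebound this way: Julia converts assigned values to the declared type or raises an error.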

Vector (List / Array)

list_a = ["a", "b", "c"]

println(typeof(list_a))
Vector{String}
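Vectors are mutable, so they can grow after creation. A minimal sketch, using the illustrative names list_b and numbers:

```julia
list_b = ["a", "b", "c"]

# push! appends a single element in place.
push!(list_b, "d")
println(length(list_b))   # 4
println(last(list_b))     # d
println(eltype(list_b))   # String

# Start from an empty, typed vector and append several elements at once.
numbers = Int[]
append!(numbers, [1, 2, 3])
println(numbers)          # [1, 2, 3]
```

By convention, Julia functions whose names end in ! modify their first argument.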

A vector can be indexed by an integer (indices start at 1):

list_a[1]
"a"

Or by a vector of integers:

list_a[[1, 2]]
2-element Vector{String}:
 "a"
 "b"

Tuples

brands = ("Apple", "Samsung", "Google")

println(typeof(brands))
Tuple{String, String, String}

A tuple can be indexed by an integer:

brands[1]
"Apple"

Or by a vector of integers:

brands[[1, 2]]
("Apple", "Samsung")

To index with a run of consecutive integers, use a range:

brands[1:3]
("Apple", "Samsung", "Google")

This works because 1:3 creates a vector-like range:

println(typeof(1:3))
UnitRange{Int64}
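A range stores only its endpoints (and an optional step) rather than every element. The sketch below, with illustrative variable names, shows how to expand a range into a vector and how the end keyword works inside an index.

```julia
r = 1:3

# collect turns the lazy range into an ordinary vector.
println(collect(r))       # [1, 2, 3]

# start:step:stop creates a range with a custom step.
println(collect(1:2:9))   # [1, 3, 5, 7, 9]

# Inside brackets, end refers to the last index.
brands = ("Apple", "Samsung", "Google")
println(brands[2:end])    # ("Samsung", "Google")
```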

Dictionary

scores_a = Dict(
  "John" => 98,
  "Jessy" => 99
)

println(typeof(scores_a))
Dict{String, Int64}

A value in the dictionary can be retrieved by its key:

scores_a["John"]
98
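Looking up a key that does not exist raises a KeyError. A minimal sketch of safer lookups, using a separate illustrative dictionary scores_b and a hypothetical missing key "Maria":

```julia
scores_b = Dict("John" => 98, "Jessy" => 99)

# Check whether a key exists before reading it.
println(haskey(scores_b, "Maria"))   # false

# get returns a fallback value when the key is missing.
println(get(scores_b, "Maria", 0))   # 0
println(get(scores_b, "John", 0))    # 98

# Iterate over key-value pairs (the order is not guaranteed).
for (name, score) in scores_b
    println(name, " scored ", score)
end
```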

And the value can be changed:

scores_a["John"] = 30
scores_a
Dict{String, Int64} with 2 entries:
  "John"  => 30
  "Jessy" => 99

Using Packages

In many cases, we need to import many packages to finish diverse tasks. For example, we use ggplot2 in R to create beautiful figures.

In Julia, the workflow is similar to that in R (you can also use my package to simplify the same process in Python).

using Pkg;
Pkg.add("JSON");
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.11/Project.toml`
  No Changes to `~/.julia/environments/v1.11/Manifest.toml`

Once a package is installed, you need to load it with using before you can use it:

using JSON

After that, you can call the package's functions, such as JSON.parsefile("some.json") (see here).
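The same pattern applies to packages that ship with Julia and need no installation. As a sketch, this example uses the standard-library package Dates instead of JSON so that it runs without any input file; the date itself is an arbitrary choice.

```julia
using Dates   # standard library, no Pkg.add needed

d = Date(2024, 1, 15)

# Functions can be called with the package name as a prefix.
println(Dates.year(d))      # 2024
println(Dates.dayname(d))   # Monday
```

Names that a package exports, such as Date here, can also be used without the prefix once the package is loaded.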

Functions and methods

To define a function:

function greet(name)
    "Hello, $name."
end;
greet("Jessy")
"Hello, Jessy."

To use a method defined globally, such as println, just call it.

If it's from a package, use package.method, e.g., the JSON.parsefile("some.json") call mentioned above.
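The greet function above can be written in a few other ways. The sketch below shows a one-line definition, keyword arguments with defaults, and two methods of one function chosen by argument type. All function names here (greet_short, greet_with, describe) are hypothetical and exist only for illustration.

```julia
# One-line ("assignment form") definition.
greet_short(name) = "Hello, $name."

# Keyword arguments with default values come after the semicolon.
function greet_with(name; greeting="Hello", punctuation=".")
    return "$greeting, $name$punctuation"
end

# One function, two methods: Julia picks the method by argument type.
describe(x::Integer) = "an integer"
describe(x::AbstractString) = "a string"

println(greet_short("Jessy"))
println(greet_with("Jessy"; greeting="Hi", punctuation="!"))
println(describe(60))
println(describe("60"))
```

This is the sense in which Julia distinguishes functions from methods: a function is the name, and each method is one implementation for a particular combination of argument types.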

Great! Let's move on to loading your data.
