Chapter 7. Using R and Python in Julia

I recently faced an efficiency challenge in my data analysis pipeline: Python and R were too slow, even with parallelization. To address this, I began learning Julia.

Surprisingly, I discovered a method to integrate all major data analysis languages in a single notebook. Now we can seamlessly combine machine learning (in Python), fast data preparation (using Julia), and result visualization with ggplot2 (supported in R).

Prerequisites

Python, R, Julia installed
Jupyter Notebook installed
Install Julia kernel:
- Enter Julia REPL
- For example, we want to add a Julia kernel in Jupyter using multi-threads: installkernel("Julia (4 threads)", env=Dict("JULIA_NUM_THREADS"=>"4"))
Install PyCall and RCall in Julia

import Pkg; Pkg.add("RCall"); Pkg.add("PyCall")

# Import libraries
using RCall, PyCall, DataFrames

[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.11/Project.toml`
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.11/Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.11/Project.toml`
[32m[1m  No Changes[22m[39m to `~/.julia/environments/v1.11/Manifest.toml`

Using R in Julia

1. Using R-functions:

R"""
library(ggplot2)
"""

jmtcars = reval("mtcars")
first(jmtcars, 5)

5-element Vector{Any}:
 RObject{RealSxp}
 [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4

 RObject{RealSxp}
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

 RObject{RealSxp}
 [1] 160.0 160.0 108.0 258.0 360.0 225.0 360.0 146.7 140.8 167.6 167.6 275.8
[13] 275.8 275.8 472.0 460.0 440.0  78.7  75.7  71.1 120.1 318.0 304.0 350.0
[25] 400.0  79.0 120.3  95.1 351.0 145.0 301.0 121.0

 RObject{RealSxp}
 [1] 110 110  93 110 175 105 245  62  95 123 123 180 180 180 205 215 230  66  52
[20]  65  97 150 150 245 175  66  91 113 264 175 335 109

 RObject{RealSxp}
 [1] 3.90 3.90 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 3.92 3.07 3.07 3.07 2.93
[16] 3.00 3.23 4.08 4.93 4.22 3.70 2.76 3.15 3.73 3.08 4.08 4.43 3.77 4.22 3.62
[31] 3.54 4.11

2. Using R codes:

R"""
library(ggplot2)
library(dplyr)

data=head(mtcars, 30)
 
# 1/ add text with geom_text, use nudge to nudge the text
ggplot(data, aes(x=wt, y=mpg)) +
  geom_point() + # Show dots
  geom_text(
    label=rownames(data), 
    nudge_x = 0.25, nudge_y = 0.25, 
    check_overlap = T
  )
"""

[33m[1m┌ [22m[39m[33m[1mWarning: [22m[39mRCall.jl: 
[33m[1m│ [22m[39mAttaching package: ‘dplyr’
[33m[1m│ [22m[39m
[33m[1m│ [22m[39mThe following objects are masked from ‘package:stats’:
[33m[1m│ [22m[39m
[33m[1m│ [22m[39m    filter, lag
[33m[1m│ [22m[39m
[33m[1m│ [22m[39mThe following objects are masked from ‘package:base’:
[33m[1m│ [22m[39m
[33m[1m│ [22m[39m    intersect, setdiff, setequal, union
[33m[1m│ [22m[39m
[33m[1m└ [22m[39m[90m@ RCall ~/.julia/packages/RCall/0ggIQ/src/io.jl:172[39m

RObject{VecSxp}

Using Python in Julia

1. Using Python functions

os = pyimport("os")
os.listdir()

9-element Vector{String}:
 "6.visualize.jl.ipynb"
 "5.models.jl.ipynb"
 "res"
 "5.2.models.jl.ipynb"
 "1.installation.basics.md"
 "7.r.and.python.in.julia.jl.ipynb"
 "4.pipeline.tools.jl.ipynb"
 "3.transform.calculate.jl.ipynb"
 "2.data.loading.selection.jl.ipynb"

2. Using Python code blocks

py"""
print("This is Python")
"""

This is Python

Alternatives

2Sigma, BeakerX, https://github.com/twosigma/beakerx

PreviousChapter 6. Data Visualization in Julia NextChapter 9. Performance

Last updated 10 months ago

Was this helpful?