3  Language bridges: reticulate / rpy2

Below are some examples of how to incorporate R code in Python, and Python in R.

3.1 Using R in Python: rpy2

In a Python, you can incorporate R scripts using the rpy2 package.

Below shows an example where we have some R scripts that contains functions we want to use in Python:

from rpy2 import robjects
from rpy2.robjects import pandas2ri

# Load R functions
# Read R scripts from files
def load_r_script(filename):
    with open(filename, "r") as file:
        return file.read()

script_location = "src/R/"

script_filenames = [
    "read_in_data.R",
    "do_something.R"
]

scripts = [script_location + script for script in script_filenames]

for script in scripts:
    robjects.r(load_r_script(script))

# Assign the functions by loading them with robjects
read_in_data_r = robjects.r["read_in_data"]
do_something_r = robjects.r["do_something"]

We can now use the functions read_in_data_r and do_something_r in our script.

For example, reading in some csv data, and doing something to it:

r_df = read_in_data_r("/data/my_file.csv")
r_df = do_something_r(r_df)

We could keep using R throughout, but the r_df is an R dataframe object.

If we wanted to continue our analysis, but using pandas, we will have to convert our R dataframe, to a pandas dataframe:

with (robjects.default_converter + pandas2ri.converter).context():
    df = robjects.conversion.get_conversion().rpy2py(r_df)

Here, df is now a pandas dataframe that can be used natively in Python.

Some considerations:

  1. This is great method to quickly incorporate some R code into a Python script.
  2. You will need to manage dependencies across both R and Python, see Chapter 7.
  3. Performance - the conversion step can be computationally expensive in some cases. This is especially important to consider when using this method in a Shiny for Python application. If your app relies on snappy performance, use this method as a prototype, and subsequently consider whether the effort of porting your R scripts to a Python-based method is worth the performance improvement. If sticking with this method, try to minimise the number of conversions back and forth.

3.2 Using Python in R: reticulate