Generating Parameterised Reports with the help of purrr

Ayush Patel

2022-08-25

animated-memory-purrr-rmd

A repo to contain the example of parameterised reporting using Rmarkdown.

This repository accompanies my talk for the RLadies Bangalore Community and serves as an example of how to parameterise rmarkdown and quarto reports (files).

Note that the reports presented here are neither statistically rigorous nor aesthetically polished. The idea is to demonstrate the principle of parameterised reports and how to create several such reports with the help of {purrr}.

Project/file structure

All the raw data is stored in the data_raw folder. The raw data is the census village amenities directory for the state of Gujarat for the year 2011. One can look for this dataset here.

The raw data is cleaned using the script prepare_data_for_report.R and is stored in the data_prepared folder.

The report is created using the files district_report.rmd and district_report.qmd. Both files carry out the same operations and generate the same reports; the duplication is for demonstration purposes only. All reports generated by the .rmd file are stored in the folder store_district_RmdReports, and the reports generated by the .qmd file are stored in the folder store_district_QmdReports. This, again, is only for demonstration purposes; one need not make two folders.

The file generate_parameterised_reports.R contains code to programmatically generate several reports using {purrr}.

Below is the project structure.

## .
## ├── data_prepared
## │   ├── data_prepared.csv
## │   └── data_prepared.feather
## ├── data_raw
## │   └── DCHB_Village_Release_2400.xlsx
## ├── district_report.rmd
## ├── generate_parameterised_reports.R
## ├── ideal-chainsaw-purrr.Rproj
## ├── index.html
## ├── index.Rmd
## ├── LICENSE
## ├── prepare_data_for_report.R
## ├── README.md
## ├── store_district_QmdReports
## │   ├── district_report.qmd
## │   ├── District_ReportAhmadabad.html
## │   ├── District_ReportAmreli.html
## │   ├── District_ReportAnand.html
## │   ├── District_ReportBanasKantha.html
## │   ├── District_ReportBharuch.html
## │   ├── District_ReportBhavnagar.html
## │   ├── District_ReportDohad.html
## │   ├── District_ReportGandhinagar.html
## │   ├── District_ReportJamnagar.html
## │   ├── District_ReportJunagadh.html
## │   ├── District_ReportKachchh.html
## │   ├── District_ReportKheda.html
## │   ├── District_ReportMahesana.html
## │   ├── District_ReportNarmada.html
## │   ├── District_ReportNavsari.html
## │   ├── District_ReportPanchMahals.html
## │   ├── District_ReportPatan.html
## │   ├── District_ReportPorbandar.html
## │   ├── District_ReportRajkot.html
## │   ├── District_ReportSabarKantha.html
## │   ├── District_ReportSurat.html
## │   ├── District_ReportSurendranagar.html
## │   ├── District_ReportTapi.html
## │   ├── District_ReportTheDangs.html
## │   ├── District_ReportVadodara.html
## │   └── District_ReportValsad.html
## └── store_district_RmdReports
##     ├── District_ReportAhmadabad.html
##     ├── District_ReportAmreli.html
##     ├── District_ReportAnand.html
##     ├── District_ReportBanasKantha.html
##     ├── District_ReportBharuch.html
##     ├── District_ReportBhavnagar.html
##     ├── District_ReportDohad.html
##     ├── District_ReportGandhinagar.html
##     ├── District_ReportJamnagar.html
##     ├── District_ReportJunagadh.html
##     ├── District_ReportKachchh.html
##     ├── District_ReportKheda.html
##     ├── District_ReportMahesana.html
##     ├── District_ReportNarmada.html
##     ├── District_ReportNavsari.html
##     ├── District_ReportPanchMahals.html
##     ├── District_ReportPatan.html
##     ├── District_ReportPorbandar.html
##     ├── District_ReportRajkot.html
##     ├── District_ReportSabarKantha.html
##     ├── District_ReportSurat.html
##     ├── District_ReportSurendranagar.html
##     ├── District_ReportTapi.html
##     ├── District_ReportTheDangs.html
##     ├── District_ReportVadodara.html
##     └── District_ReportValsad.html

A note about Quarto rendering

The {quarto} package contains a function, quarto_render(), which mirrors rmarkdown::render(). However, quarto::quarto_render() has no output_dir argument. This means that the rendered .html files (or files in any other output format) are stored in the same place as the .qmd file. This is why the district_report.qmd file is in the store_district_QmdReports folder. There is also an open issue about this on the RStudio Community. To read more click here.
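As a rough illustration of the behaviour described above, a Quarto render call for one district might look like the sketch below. The helper names qmd_report_name() and render_district_qmd() and the parameter name district are assumptions made for this example, not names taken from the repository. Only input, output_file and execute_params of quarto::quarto_render() are used.

```r
# Hypothetical helper: build the output file name for one district.
qmd_report_name <- function(district) {
  paste0("District_Report", district, ".html")
}

# Hypothetical wrapper around quarto::quarto_render().
# No output folder is passed: the rendered file is written next to the .qmd,
# which is why the .qmd lives inside store_district_QmdReports.
render_district_qmd <- function(district,
                                input = "store_district_QmdReports/district_report.qmd") {
  quarto::quarto_render(
    input = input,
    output_file = qmd_report_name(district),
    # `district` is an assumed parameter name declared in the .qmd header.
    execute_params = list(district = district)
  )
}
```

Calling render_district_qmd("Surat") would need Quarto installed and a .qmd file that declares a matching parameter, so the call itself is left out of the sketch.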

A note on process

This is an attempt to lay down the steps to generate parameterised reports using purrr and rmarkdown.

Step 1 — Arriving at the analyses that you want to present in the report

It is important to have completed all exploratory analyses, including the various analytical detours and rabbit holes, beforehand. This process usually helps in deciding what is to be reported and in settling the flow of analyses (the steps or pieces of analysis). This framework should be set and frozen, although one can enhance it from time to time. It is also in this step that you decide which params you will need or keep.

Step 2 — Prepare data for the report

The exact structure and format of the data required for the report will become clear after Step 1. Create a separate R script that cleans, wrangles and makes all necessary changes to the raw data. The prepared data can then be saved as a csv or feather file, or in a format of your choosing. It is this prepared data that will be used to generate the report. The prepared data, in my opinion, should be shaped so that the report (.rmd file) has to carry out as few complex computations as possible.

In this case the raw data is in the data_raw folder. It is cleaned, wrangled and prepared using the script prepare_data_for_report.R, and the prepared data is stored in the folder data_prepared as data_prepared.csv and data_prepared.feather.
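A minimal sketch of what such a preparation script could do is shown below. It uses a small toy data frame in place of the census sheet; the column names, the cleaning rules and the function prepare_data() are all assumptions made for illustration and are not taken from prepare_data_for_report.R.

```r
# Toy stand-in for the raw village amenities sheet.
# The column names are hypothetical, not those of the census file.
raw <- data.frame(
  `District Name` = c(" Surat", "Surat ", "Tapi", NA),
  `Total Population` = c("1200", "850", "430", "90"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical cleaning function: tidy names, fix types, drop unusable rows.
prepare_data <- function(raw) {
  prepared <- data.frame(
    district = trimws(raw[["District Name"]]),
    population = as.integer(raw[["Total Population"]]),
    stringsAsFactors = FALSE
  )
  # Drop rows that cannot be assigned to a district.
  prepared[!is.na(prepared$district), , drop = FALSE]
}

prepared <- prepare_data(raw)

# Written to a temporary folder here; the repository uses data_prepared/ instead.
out_file <- file.path(tempdir(), "data_prepared.csv")
write.csv(prepared, out_file, row.names = FALSE)
```

Doing the cleaning once, outside the report, means every rendered report only has to read the prepared file.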

Step 3 — Write the Report Structure

A new .rmd file will generate the report. Declare all the parameters decided upon in Step 1 in this .rmd file. It is in this file that the analyses flow will be carried out. I suggest writing this .rmd file while keeping in mind some of the values that its params can take. This makes it easier to implement the analyses flow.

In this case the report structure is contained in district_report.rmd, and for quarto it is in the file district_report.qmd (this .qmd file is in the folder store_district_QmdReports for reasons described in the previous section, A note about Quarto rendering).
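Inside a parameterised report, parameters declared under params in the YAML header are available to the R chunks as a list called params. The sketch below imitates that list by hand so it can run as a plain script. The parameter name district and the toy prepared data are assumptions for illustration, not the contents of district_report.rmd.

```r
# Stand-in for the `params` list that rmarkdown creates when knitting a
# parameterised report. Here it is built by hand with an assumed name.
params <- list(district = "Surat")

# Toy prepared data with hypothetical columns.
prepared <- data.frame(
  district = c("Surat", "Surat", "Tapi"),
  population = c(1200L, 850L, 430L)
)

# Every chunk in the report works only on the rows for the requested district.
district_data <- prepared[prepared$district == params$district, , drop = FALSE]

# Values derived from the parameter can feed titles, text and tables.
report_title <- paste("District report:", params$district)
total_population <- sum(district_data$population)
```

Writing the chunks against one concrete value first, then swapping it for params$district, is one way to follow the suggestion above.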

Step 4 — One ring to rule them All (One script to create all the reports)

Create an R script for functionally generating multiple parameterised reports. In this script, create a wrapper function around the rmarkdown::render() function that takes as input the file created in Step 3, points to the folder in which the generated reports are to be saved, gives appropriate names to the generated reports and takes in the values that the params will assume for the report.

Once this function is created, create vectors/lists, one for each param, that contain the sequence of values to be passed to that parameter.

Use the appropriate {purrr} function (if there are two or more params, pmap is the way to go) to apply the wrapper function over the vectors/lists of param inputs. This will generate all your reports and save them in the location specified.

In this case this is carried out in the file generate_parameterised_reports.R.
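A wrapper of this kind could be sketched as follows. The function names report_file_name() and render_district_report() and the parameter name district are assumptions for this example. The file and folder names are the ones listed in the project structure, and the rmarkdown::render() arguments used are input, output_file, output_dir, params and envir.

```r
# Hypothetical helper: the name given to one generated report.
report_file_name <- function(district) {
  paste0("District_Report", district, ".html")
}

# Hypothetical wrapper around rmarkdown::render().
render_district_report <- function(district,
                                   input = "district_report.rmd",
                                   output_dir = "store_district_RmdReports") {
  rmarkdown::render(
    input = input,                            # the file written in Step 3
    output_file = report_file_name(district), # one file name per district
    output_dir = output_dir,                  # folder where reports are saved
    params = list(district = district),       # `district` is an assumed param name
    envir = new.env()                         # fresh environment per report, so
                                              # objects do not leak between renders
  )
}
```

A single call such as render_district_report("Surat") would then produce one report, provided the .rmd file declares a matching parameter.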
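The iteration step can be sketched with a stand-in for the wrapper, so that the example runs without the report file. Here plan_report() is a hypothetical function that only returns the path a report would be written to. In a real script the render wrapper would take its place, and purrr::pwalk() may be a better fit because rendering is done for its side effect.

```r
library(purrr)

# Hypothetical stand-in for the render wrapper: returns the path it would
# write to instead of rendering anything.
plan_report <- function(district, output_dir) {
  file.path(output_dir, paste0("District_Report", district, ".html"))
}

# One vector per parameter, all of the same length.
districts <- c("Ahmadabad", "Amreli", "Anand")
output_dirs <- rep("store_district_RmdReports", length(districts))

# pmap_chr() calls plan_report() once per position, matching the list names
# to the function's argument names.
planned <- pmap_chr(
  list(district = districts, output_dir = output_dirs),
  plan_report
)
```

With a single parameter, purrr::map() or purrr::walk() over one vector would do the same job.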

Where is the code for all this?

Here is the GitHub repo link.

Where are the reports?

Here are the Rmd reports

[1] “Ahmadabad”
[2] “Amreli”
[3] “Anand”
[4] “BanasKantha”
[5] “Bharuch”
[6] “Bhavnagar”
[7] “Dohad”
[8] “Gandhinagar”
[9] “Jamnagar”
[10] “Junagadh”
[11] “Kachchh”
[12] “Kheda”
[13] “Mahesana”
[14] “Narmada”
[15] “Navsari”
[16] “PanchMahals”
[17] “Patan”
[18] “Porbandar”
[19] “Rajkot”
[20] “SabarKantha”
[21] “Surat”
[22] “Surendranagar”
[23] “Tapi”
[24] “TheDangs”
[25] “Vadodara”
[26] “Valsad”

Here are the Qmd reports

[1] “Ahmadabad”
[2] “Amreli”
[3] “Anand”
[4] “BanasKantha”
[5] “Bharuch”
[6] “Bhavnagar”
[7] “Dohad”
[8] “Gandhinagar”
[9] “Jamnagar”
[10] “Junagadh”
[11] “Kachchh”
[12] “Kheda”
[13] “Mahesana”
[14] “Narmada”
[15] “Navsari”
[16] “PanchMahals”
[17] “Patan”
[18] “Porbandar”
[19] “Rajkot”
[20] “SabarKantha”
[21] “Surat”
[22] “Surendranagar”
[23] “Tapi”
[24] “TheDangs”
[25] “Vadodara”
[26] “Valsad”