animated-memory-purrr-rmd
A repo to contain the example of parameterised reporting using Rmarkdown.
As a part of my talk for the RLadies Bangalore Community, this repository will serve as an example for how to parameterize rmarkdown and quarto reports (files).
Note that the reports presented here are neither statistically rigorous nor aesthetically polished. The idea is to demonstrate the principle of parameterised reports and how to create several such reports with the help of {purrr}.
Project/file structure
All the raw data is stored in the data_raw folder. The raw data is the census village amenities directory for the Gujarat state for year 2011. One can look for this dataset here
The raw data is cleaned using the script prepare_data_for_report.R and is stored in the data_prepared folder.
The report is created using the files district_report.rmd and district_report.qmd. Both these files carry out the same operations and generate the same reports. This duplication is for demonstration purposes. All reports generated by the .rmd file are stored in the folder store_district_RmdReports and the reports generated by the .qmd file are stored in the folder store_district_QmdReports. This, again, is only for demonstration purposes; one need not make two folders.
The file generate_parameterised_reports.R contains code to programmatically generate several reports using {purrr}.
Below is the project structure.
## .
## ├── data_prepared
## │ ├── data_prepared.csv
## │ └── data_prepared.feather
## ├── data_raw
## │ └── DCHB_Village_Release_2400.xlsx
## ├── district_report.rmd
## ├── generate_parameterised_reports.R
## ├── ideal-chainsaw-purrr.Rproj
## ├── index.html
## ├── index.Rmd
## ├── LICENSE
## ├── prepare_data_for_report.R
## ├── README.md
## ├── store_district_QmdReports
## │ ├── district_report.qmd
## │ ├── District_ReportAhmadabad.html
## │ ├── District_ReportAmreli.html
## │ ├── District_ReportAnand.html
## │ ├── District_ReportBanasKantha.html
## │ ├── District_ReportBharuch.html
## │ ├── District_ReportBhavnagar.html
## │ ├── District_ReportDohad.html
## │ ├── District_ReportGandhinagar.html
## │ ├── District_ReportJamnagar.html
## │ ├── District_ReportJunagadh.html
## │ ├── District_ReportKachchh.html
## │ ├── District_ReportKheda.html
## │ ├── District_ReportMahesana.html
## │ ├── District_ReportNarmada.html
## │ ├── District_ReportNavsari.html
## │ ├── District_ReportPanchMahals.html
## │ ├── District_ReportPatan.html
## │ ├── District_ReportPorbandar.html
## │ ├── District_ReportRajkot.html
## │ ├── District_ReportSabarKantha.html
## │ ├── District_ReportSurat.html
## │ ├── District_ReportSurendranagar.html
## │ ├── District_ReportTapi.html
## │ ├── District_ReportTheDangs.html
## │ ├── District_ReportVadodara.html
## │ └── District_ReportValsad.html
## └── store_district_RmdReports
## ├── District_ReportAhmadabad.html
## ├── District_ReportAmreli.html
## ├── District_ReportAnand.html
## ├── District_ReportBanasKantha.html
## ├── District_ReportBharuch.html
## ├── District_ReportBhavnagar.html
## ├── District_ReportDohad.html
## ├── District_ReportGandhinagar.html
## ├── District_ReportJamnagar.html
## ├── District_ReportJunagadh.html
## ├── District_ReportKachchh.html
## ├── District_ReportKheda.html
## ├── District_ReportMahesana.html
## ├── District_ReportNarmada.html
## ├── District_ReportNavsari.html
## ├── District_ReportPanchMahals.html
## ├── District_ReportPatan.html
## ├── District_ReportPorbandar.html
## ├── District_ReportRajkot.html
## ├── District_ReportSabarKantha.html
## ├── District_ReportSurat.html
## ├── District_ReportSurendranagar.html
## ├── District_ReportTapi.html
## ├── District_ReportTheDangs.html
## ├── District_ReportVadodara.html
## └── District_ReportValsad.html
A note about Quarto rendering
The {quarto} package contains a function, quarto_render(), which mirrors rmarkdown::render(). However, quarto::quarto_render() has no output_dir argument. This means that the rendered .html files (or files in any other output format) are stored in the same place as the .qmd file. This is why the district_report.qmd file is in the store_district_QmdReports folder. There is currently an open issue about this on the RStudio Community as well. To read more click here.
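As a minimal sketch of what this means in practice, a single Quarto report could be rendered as shown here. It assumes the .qmd file declares a parameter called district; that name, and the helper names qmd_report_name() and render_district_qmd(), are hypothetical and not taken from the repository. Only a file name is passed to output_file, since the output lands beside the .qmd file.

```r
# Build an output file name in the style used in this repo,
# e.g. "District_ReportAnand.html".
qmd_report_name <- function(district) {
  paste0("District_Report", district, ".html")
}

# Hypothetical wrapper: `district` is an assumed parameter name in the .qmd file.
render_district_qmd <- function(district,
                                input = "store_district_QmdReports/district_report.qmd") {
  quarto::quarto_render(
    input          = input,
    output_file    = qmd_report_name(district),  # file name only, no directory
    execute_params = list(district = district)
  )
}

# Usage (needs the {quarto} package and the Quarto CLI to be installed):
# render_district_qmd("Anand")
```

Keeping the .qmd file inside the output folder is what makes the rendered files end up in store_district_QmdReports.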
A note on process
This is an attempt to lay down the steps to generate parameterised reports using purrr and rmarkdown.
Step 1 — Arriving at the analyses that you want to present in report
It is important to have completed all exploratory analyses and gone through the various analytical detours and rabbit holes beforehand. This process usually helps in deciding what is to be reported and the flow of the analyses (steps/pieces of analyses). This framework should be set and frozen, although one can enhance it from time to time. It is also in this step that you decide which params you will need or keep.
Step 2 — Prepare data for the report
The exact structure and format of the data required for the report will become clear after Step 1. Create a separate R script that cleans, wrangles and makes all necessary changes to the raw data. The prepared data can be saved as a csv or feather file, or in a format of your choosing. It is this prepared data that will be used to generate the report. The prepared data, in my opinion, should be shaped so that the report (.rmd file) has to carry out as few complex computations as possible.
In this case the raw data is in the data_raw folder. It is cleaned, wrangled and prepared using the script prepare_data_for_report.R, and the prepared data is stored in the folder data_prepared under the names data_prepared.csv and data_prepared.feather.
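The idea of Step 2 can be sketched with a toy data frame standing in for the raw Excel sheet. The column names are made up for illustration and will differ from the real census file, and the output is written to a temporary folder rather than data_prepared so that the snippet is safe to run anywhere. This is not the content of prepare_data_for_report.R, only an outline of the pattern.

```r
# Toy stand-in for the raw census sheet; the real column names will differ.
raw <- data.frame(
  District.Name    = c("Anand", "Anand", "Tapi", NA),
  Village.Name     = c("A", "B", "C", "D"),
  Total.Population = c("1200", "850", "430", "99"),
  stringsAsFactors = FALSE
)

# Clean once, outside the report: drop unusable rows, tidy names, fix types.
prepared <- raw[!is.na(raw$District.Name), ]
names(prepared) <- c("district", "village", "population")
prepared$population <- as.integer(prepared$population)

# Save the prepared data (a temporary folder is used here instead of data_prepared/).
out_file <- file.path(tempdir(), "data_prepared.csv")
write.csv(prepared, out_file, row.names = FALSE)

# The report then only has to read the file back.
reloaded <- read.csv(out_file, stringsAsFactors = FALSE)
```

With the cleaning done up front, the report itself can start from a single read.csv() call and go straight to filtering and plotting.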
Step 3 — Write the Report Structure
A new .rmd file will generate the report. Declare all the parameters that were decided upon in Step 1 in this .rmd file. It is in this file that the flow of analyses is carried out. I suggest writing this .rmd file with some concrete values for the params in mind, as this makes it easier to implement the flow of analyses.
In this case the report structure is contained in the file district_report.rmd and, for quarto, in the file district_report.qmd (this .qmd file is in the folder store_district_QmdReports for the reasons described in the previous section, A note about Quarto rendering).
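To illustrate how the body of such a report can depend on a parameter, here is a small sketch. In a knitted report the params list is created from the YAML header; here it is built by hand so that the snippet runs on its own. The parameter name district and the columns of the toy data are assumptions made for this example.

```r
# In a knitted report, `params` comes from the YAML header; here it is built by hand.
params <- list(district = "Anand")

# Toy prepared data (hypothetical columns).
prepared <- data.frame(
  district   = c("Anand", "Anand", "Tapi"),
  village    = c("A", "B", "C"),
  population = c(1200L, 850L, 430L)
)

# Every chunk refers to params$district rather than a hard-coded district name.
district_data <- prepared[prepared$district == params$district, ]
report_title  <- paste("Village amenities report:", params$district)
total_pop     <- sum(district_data$population)
```

Writing the chunks against one concrete value first, and only then swapping in other values, is the approach suggested above.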
Step 4 — One ring to rule them All (One script to create all the reports)
Create an R script for functionally generating multiple parameterised reports. In this script create a wrapper function around the rmarkdown::render() function that takes as input the file created in Step 3, points to the folder in which the generated reports are to be saved, gives appropriate names to the generated reports and takes in the values that the params will assume for the report.
Once this function is created, create vectors/lists, one for each param, that contain the sequence of values to be passed to that parameter.
Use the appropriate {purrr} function (if there are two or more params, pmap is the way to go) to apply the wrapper function over the vectors/lists of param inputs. This will generate all your reports and save them in the location specified.
In this case this is carried out in the file generate_parameterised_reports.R.
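The steps above can be sketched as follows. This is not a copy of generate_parameterised_reports.R: the wrapper name render_district_report() and the parameter name district are assumptions made for illustration. The sketch uses purrr::pwalk(), the variant of pmap() meant for functions called for their side effects, and the rendering call is guarded so the code can be sourced outside the repository without errors.

```r
# Hypothetical wrapper around rmarkdown::render(); `district` is an assumed param name.
render_district_report <- function(district,
                                   input      = "district_report.rmd",
                                   output_dir = "store_district_RmdReports") {
  rmarkdown::render(
    input       = input,
    output_file = paste0("District_Report", district, ".html"),
    output_dir  = output_dir,
    params      = list(district = district),
    envir       = new.env()  # fresh environment so reports do not share objects
  )
}

# One vector per parameter; with a single parameter this is just the district names.
districts <- c("Ahmadabad", "Amreli", "Anand")

# Render only when the report file and the required packages are available.
if (file.exists("district_report.rmd") &&
    requireNamespace("purrr", quietly = TRUE) &&
    requireNamespace("rmarkdown", quietly = TRUE)) {
  # The names in the list must match the argument names of the wrapper.
  purrr::pwalk(list(district = districts), render_district_report)
}
```

With more than one parameter, each extra vector becomes another named element of the list passed to pwalk() and another argument of the wrapper.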
Where is the code for all this??
Where are the reports
Here are the Rmd reports
[1] “Ahmadabad”
[2] “Amreli”
[3] “Anand”
[4] “BanasKantha”
[5] “Bharuch”
[6] “Bhavnagar”
[7] “Dohad”
[8] “Gandhinagar”
[9] “Jamnagar”
[10] “Junagadh”
[11] “Kachchh”
[12] “Kheda”
[13] “Mahesana”
[14] “Narmada”
[15] “Navsari”
[16] “PanchMahals”
[17] “Patan”
[18] “Porbandar”
[19] “Rajkot”
[20] “SabarKantha”
[21] “Surat”
[22] “Surendranagar”
[23] “Tapi”
[24] “TheDangs”
[25] “Vadodara”
[26] “Valsad”
Here are the Qmd reports
[1] “Ahmadabad”
[2] “Amreli”
[3] “Anand”
[4] “BanasKantha”
[5] “Bharuch”
[6] “Bhavnagar”
[7] “Dohad”
[8] “Gandhinagar”
[9] “Jamnagar”
[10] “Junagadh”
[11] “Kachchh”
[12] “Kheda”
[13] “Mahesana”
[14] “Narmada”
[15] “Navsari”
[16] “PanchMahals”
[17] “Patan”
[18] “Porbandar”
[19] “Rajkot”
[20] “SabarKantha”
[21] “Surat”
[22] “Surendranagar”
[23] “Tapi”
[24] “TheDangs”
[25] “Vadodara”
[26] “Valsad”