This repo contains Yiwen Zhang's research project for ETC5543.
- Extend the existing R Shiny app (found here). The app provides interactive analysis of CRAN download logs and textual analysis of package titles and descriptions by CRAN Task View.
- Exploratory data analysis of CRAN download logs. Some obvious trends include a drop in download counts over the weekend and a spike when there is an update on CRAN. Some questions to ask are:
  - Are older packages downloaded more? Does this mean that even when a better R package comes along, it takes a while before its downloads pick up?
  - How does an update to R affect downloads?
  - The download counts are inflated by bots and mirror downloads. How can the counts be adjusted?
- Find related literature. Some to consider are:
  - rtrends, although it's not clear to me this is doing anything particularly striking.
  - adjustedcranlogs. It looks like it sub-samples some CRAN packages and takes a certain quantile of their downloads to subtract from the download count.
  - packageRank. Also see the intro post, which is really well written!
  - cranlogs, to download the summary download data.
  - cran.stats
  - dlstats
  - pkgsearch
  - Visualize.CRAN.Downloads
- CRAN package download logs http://cran-logs.rstudio.com/
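
The quantile-based adjustment idea mentioned for adjustedcranlogs can be sketched as follows. This is only an illustration of the general idea (sub-sample some packages, take a quantile of their downloads, subtract it from every count), not the package's actual algorithm. The function names, the package names, the counts, and the choice of the median are all made up for the example, and Python is used purely for illustration.

```python
def quantile(values, q):
    """Linear-interpolation quantile (q in [0, 1]) of a non-empty list."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def adjust_downloads(daily_counts, sample_packages, q=0.5):
    """Subtract a baseline from every package's count, flooring at zero.

    The baseline is the q-th quantile of the counts of a sub-sample of
    packages, used here as a rough stand-in for automated (bot or mirror)
    downloads that every package receives.
    """
    baseline = quantile([daily_counts[p] for p in sample_packages], q)
    return {pkg: max(0.0, n - baseline) for pkg, n in daily_counts.items()}


# Hypothetical one-day download counts (not real data).
daily_counts = {"pkgA": 1200, "pkgB": 300, "pkgC": 40, "pkgD": 35, "pkgE": 30}

# Hypothetical sub-sample of rarely used packages that defines the baseline.
sample_packages = ["pkgC", "pkgD", "pkgE"]

adjusted = adjust_downloads(daily_counts, sample_packages, q=0.5)
print(adjusted)
```

With these made-up numbers the baseline is the median of the three sampled counts, so popular packages keep most of their downloads while the sampled packages drop to roughly zero. How the sub-sample and the quantile should be chosen is exactly the open question raised above.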