GitHub - qinwf/awesome-R: A curated list of awesome R packages, frameworks and software.

A curated list of awesome R packages and tools. Inspired by awesome-machine-learning.

heart marks the Top 50 most-downloaded CRAN packages or repos with 400+ stars

2023

2020

  • VSCode - vscode-R + vscode-r-lsp VSCode R Language Support
  • gt - Easily generate information-rich, publication-quality tables from R
  • lightgbm heart - Light Gradient Boosting Machine.
  • torch - Tensors and Neural Networks with 'GPU' Acceleration.

2019

Integrated Development Environments

Integrated Development Environment

Syntax

Packages change the way you use R.

  • magrittr heart - Let's pipe it.
  • pipeR - Multi-paradigm Pipeline Implementation.
  • lambda.r - Functional programming and simple pattern matching in R.
  • purrr - An FP package for R in the spirit of underscore.js.

Data Manipulation

Packages for cooking data.

Data Formats

Packages for reading and writing data of different formats.

Graphic Displays

Packages for showing data.

  • ggplot2 heart - An implementation of the Grammar of Graphics.
  • ggfortify - A unified interface for plotting results from popular statistical packages with ggplot2 using one line of code.
  • ggrepel - Repel overlapping text labels away from each other.
  • ggalt - Extra Coordinate Systems, Geoms and Statistical Transformations for ggplot2.
  • ggstatsplot - ggplot2 Based Plots with Statistical Details
  • ggtree - Visualization and annotation of phylogenetic tree.
  • ggtech - ggplot2 tech themes and scales
  • ggplot2 Extensions - Showcases of ggplot2 extensions.
  • lattice - A powerful and elegant high-level data visualization system.
  • corrplot - A graphical display of a correlation matrix or general matrix. It also contains some algorithms to do matrix reordering.
  • rgl - 3D visualization device system for R.
  • Cairo - R graphics device using cairo graphics library for creating high-quality display output.
  • extrafont - Tools for using fonts in R graphics.
  • showtext - Enable R graphics device to show text using system fonts.
  • animation - A simple way to produce animated graphics in R, using ImageMagick.
  • gganimate - Create easy animations with ggplot2.
  • misc3d - Powerful functions to deal with 3d plots, isosurfaces, etc.
  • xkcd - Use xkcd style in graphs.
  • imager - An image processing package based on CImg library to work with images and display them.
  • hrbrthemes - Opinionated, typographic-centric ggplot2 themes and theme components.
  • waffle - Make waffle (square pie) charts in R.
  • dendextend - Visualizing, adjusting and comparing trees of hierarchical clustering.
  • idendro - Interactive exploration of dendrograms (trees of hierarchical clustering).
  • r2d3 - R Interface to D3 Visualizations
  • Patchwork - Combine separate ggplots into the same graphic.
  • plot3D - Plotting Multi-Dimensional Data
  • plot3Drgl - Plotting Multi-Dimensional Data - Using 'rgl'
  • httpgd - Asynchronous http server graphics device for R.

HTML Widgets

Packages for interactive visualizations.

Reproducible Research

Packages for literate programming and reproducible workflows.

Web Technologies and Services

Packages to surf the web.

Parallel Computing

Packages for parallel computing.

High Performance

Packages for making R faster.

  • Rcpp heart - Rcpp provides a powerful API on top of R, making functions in R dramatically faster.
  • Rcpp11 - Rcpp11 is a complete redesign of Rcpp, targeting C++11.
  • compiler - Speed up your R code using the JIT.
  • cpp11 - cpp11 is a header-only R package that helps R package developers handle R objects with C++ code. It's similar to Rcpp but with different design trade-offs and features.

Language API

Packages for other languages.

  • rJava - Low-level R to Java interface.
  • jvmr - Integration of R, Java, and Scala.
  • reticulate heart - Interface to 'Python'.
  • rJython - R interface to Python via Jython.
  • rPython - Package allowing R to call Python.
  • runr - Run Julia and Bash from R.
  • RJulia - R package to call Julia.
  • JuliaCall - Seamless Integration Between R and Julia.
  • RinRuby - a Ruby library that integrates the R interpreter in Ruby.
  • R.matlab - Read and write of MAT files together with R-to-MATLAB connectivity.
  • RcppOctave - Seamless Interface to Octave and Matlab.
  • RSPerl - A bidirectional interface for calling R from Perl and Perl from R.
  • V8 - Embedded JavaScript Engine.
  • htmlwidgets - Bring the best of JavaScript data visualization to R.
  • rpy2 - Python interface for R.

Database Management

Packages for managing data.

  • RODBC - ODBC database access for R.
  • DBI - Defines a common interface between the R and database management systems.
  • elastic - Wrapper for the Elasticsearch HTTP API
  • mongolite - Streaming Mongo Client for R
  • odbc - Connect to ODBC databases (using the DBI interface)
  • RMariaDB - An R interface to MariaDB (a replacement for the old RMySQL package)
  • RMySQL - R interface to the MySQL database.
  • ROracle - OCI based Oracle database interface for R.
  • RPostgres - A DBI-compliant interface to the PostgreSQL database.
  • RPostgreSQL - R interface to the PostgreSQL database system.
  • RSQLite - SQLite interface for R
  • RJDBC - Provides access to databases through the JDBC interface.
  • rmongodb - R driver for MongoDB.
  • redux - Redis client for R.
  • RCassandra - Direct interface (not Java) to the most basic functionality of Apache Cassandra.
  • RHive - R extension facilitating distributed computing via Apache Hive.
  • RNeo4j - Neo4j graph database driver.
  • rpostgis - R interface to PostGIS database and get spatial objects in R.

Machine Learning

Packages for making R cleverer.

  • anomalize - Tidy Anomaly Detection using Twitter's AnomalyDetection method.
  • AnomalyDetection heart - AnomalyDetection R package from Twitter.
  • ahaz - Regularization for semiparametric additive hazards regression.
  • arules - Mining Association Rules and Frequent Itemsets
  • bigrf - Big Random Forests: Classification and Regression Forests for Large Data Sets
  • bigRR - Generalized Ridge Regression (with special advantage for p >> n cases)
  • bmrm - Bundle Methods for Regularized Risk Minimization Package
  • Boruta - A wrapper algorithm for all-relevant feature selection
  • BreakoutDetection heart - Breakout Detection via Robust E-Statistics from Twitter.
  • bst - Gradient Boosting
  • CausalImpact heart - Causal inference using Bayesian structural time-series models.
  • C50 - C5.0 Decision Trees and Rule-Based Models
  • caret heart - Classification and Regression Training
  • Clever Algorithms For Machine Learning
  • CORElearn - Classification, regression, feature evaluation and ordinal evaluation
  • CoxBoost - Cox models by likelihood based boosting for a single survival endpoint or competing risks
  • Cubist - Rule- and Instance-Based Regression Modeling
  • e1071 - Misc Functions of the Department of Statistics (e1071), TU Wien
  • earth - Multivariate Adaptive Regression Spline Models
  • elasticnet - Elastic-Net for Sparse Estimation and Sparse PCA
  • ElemStatLearn - Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
  • evtree - Evolutionary Learning of Globally Optimal Trees
  • fable - a collection of commonly used univariate and multivariate time series forecasting models
  • prophet heart - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • FSelector - A feature selection framework, based on subset-search or feature ranking approaches.
  • frbs - Fuzzy Rule-based Systems for Classification and Regression Tasks
  • GAMBoost - Generalized linear and additive models by likelihood based boosting
  • gamboostLSS - Boosting Methods for GAMLSS
  • gbm - Generalized Boosted Regression Models
  • glmnet heart - Lasso and elastic-net regularized generalized linear models
  • glmpath - L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
  • GMMBoost - Likelihood-based Boosting for Generalized mixed models
  • grplasso - Fitting user specified models with Group Lasso penalty
  • grpreg - Regularization paths for regression models with grouped covariates
  • h2o heart - Deep learning, Random forests, GBM, KMeans, PCA, GLM
  • hda - Heteroscedastic Discriminant Analysis
  • ipred - Improved Predictors
  • kernlab - kernlab: Kernel-based Machine Learning Lab
  • klaR - Classification and visualization
  • kohonen - Supervised and Unsupervised Self-Organising Maps.
  • L0Learn - Fast algorithms for best subset selection
  • lars - Least Angle Regression, Lasso and Forward Stagewise
  • lasso2 - L1 constrained estimation aka 'lasso'
  • LiblineaR - Linear Predictive Models Based On The Liblinear C/C++ Library
  • lightgbm heart - Light Gradient Boosting Machine.
  • lme4 heart - Mixed-effects models
  • nlme heart - Mixed-effects models, handling user-specified matrix of residual covariance, relevant for the analysis of repeated observations in longitudinal trials
  • glmmTMB - Generalized mixed-effects models, handling user-specified matrix of residual covariance, relevant for the analysis of repeated observations in longitudinal trials
  • LogicReg - Logic Regression
  • maptree - Mapping, pruning, and graphing tree models
  • mboost - Model-Based Boosting
  • Machine Learning For Hackers heart
  • mlr - Extensible framework for classification, regression, survival analysis and clustering [DEPRECATED]
  • mlr3 heart - Next generation extensible framework for classification, regression, survival analysis and clustering
  • mvpart - Multivariate partitioning
  • MXNet heart - MXNet brings flexible and efficient GPU computing and state-of-art deep learning to R.
  • ncvreg - Regularization paths for SCAD- and MCP-penalized regression models
  • nnet - Feed-forward Neural Networks and Multinomial Log-Linear Models
  • oblique.tree - Oblique Trees for Classification Data
  • pamr - Pam: prediction analysis for microarrays
  • party - A Laboratory for Recursive Partytioning
  • partykit - A Toolkit for Recursive Partytioning
  • penalized - L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
  • penalizedLDA - Penalized classification using Fisher's linear discriminant
  • penalizedSVM - Feature Selection SVM using penalty functions
  • quantregForest - quantregForest: Quantile Regression Forests
  • randomForest - randomForest: Breiman and Cutler's random forests for classification and regression.
  • randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
  • ranger - A Fast Implementation of Random Forests.
  • rattle - Graphical user interface for data mining in R.
  • rda - Shrunken Centroids Regularized Discriminant Analysis
  • rdetools - Relevant Dimension Estimation (RDE) in Feature Spaces
  • REEMtree - Regression Trees with Random Effects for Longitudinal (Panel) Data
  • relaxo - Relaxed Lasso
  • rgenoud - R version of GENetic Optimization Using Derivatives
  • rgp - R genetic programming framework
  • Rmalschains - Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
  • rminer - Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
  • ROCR - Visualizing the performance of scoring classifiers
  • RoughSets - Data Analysis Using Rough Set and Fuzzy Rough Set Theories
  • rpart - Recursive Partitioning and Regression Trees
  • RPMM - Recursively Partitioned Mixture Model
  • RSNNS - Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
  • Rsomoclu - Parallel implementation of self-organizing maps.
  • RWeka - R/Weka interface
  • RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
  • sda - Shrinkage Discriminant Analysis and CAT Score Variable Selection
  • SDDA - Stepwise Diagonal Discriminant Analysis
  • SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
  • survminer - Survival Analysis & Visualization
  • survival - Survival Analysis
  • svmpath - svmpath: the SVM Path algorithm
  • tgp - Bayesian treed Gaussian process models
  • tidymodels - A collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
  • torch - Tensors and Neural Networks with 'GPU' Acceleration.
  • tree - Classification and regression trees
  • varSelRF - Variable selection using random forests
  • xgboost heart - eXtreme Gradient Boosting Tree model, well known for its speed and performance.

Natural Language Processing

Packages for Natural Language Processing.

  • text2vec - Fast Text Mining Framework for Vectorization and Word Embeddings.
  • tm - A comprehensive text mining framework for R.
  • openNLP - Apache OpenNLP Tools Interface.
  • koRpus - An R Package for Text Analysis.
  • zipfR - Statistical models for word frequency distributions.
  • NLP - Basic functions for Natural Language Processing.
  • LDAvis - Interactive visualization of topic models.
  • topicmodels - Topic modeling interface to the C code developed by David M. Blei for Topic Modeling (Latent Dirichlet Allocation (LDA), and Correlated Topics Models (CTM)).
  • syuzhet - Extracts sentiment from text using three different sentiment dictionaries.
  • SnowballC - Snowball stemmers based on the C libstemmer UTF-8 library.
  • quanteda - R functions for Quantitative Analysis of Textual Data.
  • Topic Models Resources - Topic Models learning and R related resources.
  • NLP for Chinese - NLP related resources in R (in Chinese).
  • MonkeyLearn - R package for text analysis with Monkeylearn.
  • tidytext - Implementing tidy principles of Hadley Wickham to text mining.
  • utf8 - Manipulating and printing UTF-8 text that fixes multiple bugs in R's UTF-8 handling.
  • corporaexplorer - Dynamic exploration of text collections

Bayesian

Packages for Bayesian Inference.

  • brms - High-level interface for Bayesian regression models using Stan.
  • coda - Output analysis and diagnostics for MCMC.
  • mcmc - Markov Chain Monte Carlo.
  • MCMCpack - Markov chain Monte Carlo (MCMC) Package.
  • R2WinBUGS - Running WinBUGS and OpenBUGS from R / S-PLUS.
  • BRugs - R interface to the OpenBUGS MCMC software.
  • rjags - R interface to the JAGS MCMC library.
  • rstan heart - R interface to the Stan MCMC software.

Optimization

Packages for Optimization.

  • lpSolve - Interface to Lp_solve to Solve Linear/Integer Programs.
  • minqa - Derivative-free optimization algorithms by quadratic approximation.
  • nloptr - NLopt is a free/open-source library for nonlinear optimization.
  • ompr - Model mixed integer linear programs in an algebraic way directly in R.
  • Rglpk - R/GNU Linear Programming Kit Interface
  • ROI - The R Optimization Infrastructure ('ROI') is a sophisticated framework for handling optimization problems in R.

Finance

Packages for dealing with money.

Bioinformatics and Biostatistics

Packages for processing biological datasets.

  • Bioconductor heart - Tools for the analysis and comprehension of high-throughput genomic data.
  • genetics - Classes and methods for handling genetic data.
  • gap - An integrated package for genetic data analysis of both population and family data.
  • ape - Analyses of Phylogenetics and Evolution.
  • pheatmap - Pretty heatmaps made easy.
  • lme4 - Generalized mixed-effects models.
  • nlme - Mixed-effects models, handling user-specified matrix of residual covariance, relevant for the analysis of repeated observations in longitudinal trials.
  • glmmTMB - Generalized mixed-effects models, handling user-specified matrix of residual covariance, relevant for the analysis of repeated observations in longitudinal trials.

Network Analysis

Packages to construct, analyze and visualize network data.

  • Network Analysis List - Network Analysis related resources.
  • CRAN Task View NetworkAnalysis - CRAN Task View on network analysis resources
  • igraph heart - A collection of network analysis tools.
  • network - Basic tools to manipulate relational data in R.
  • sna - Basic network measures and visualization tools.
  • manynet - Tools for making and modifying many different types of networks.
  • autograph - Automagic plotting of network graphs and models.
  • netdiffuseR - Tools for Analysis of Network Diffusion.
  • networkDynamic - Support for dynamic, (inter)temporal networks.
  • ndtv - Tools to construct animated visualizations of dynamic network data in various formats.
  • statnet - The project behind many R network analysis packages.
  • ergm - Exponential random graph models in R.
  • latentnet - Latent position and cluster models for network objects.
  • tnet - Network measures for weighted, two-mode and longitudinal networks.
  • rgexf - Export network objects from R to GEXF, for manipulation with network software like Gephi or Sigma.
  • visNetwork - Using vis.js library for network visualization.
  • tidygraph - A tidy API for graph manipulation

Spatial

Packages to explore the earth.

  • CRAN Task View: Analysis of Spatial Data - Spatial Analysis related resources.
  • Leaflet - One of the most popular JavaScript libraries for interactive maps.
  • ggmap - Plotting maps in R with ggplot2.
  • REmap - R interface to the JavaScript library ECharts for interactive map data visualization.
  • sf - Improved Classes and Methods for Spatial Data.
  • sp - Classes and Methods for Spatial Data.
  • rgeos - Interface to Geometry Engine - Open Source
  • rgdal - Bindings for the Geospatial Data Abstraction Library
  • maptools - Tools for Reading and Handling Spatial Objects
  • gstat - Spatial and spatio-temporal geostatistical modelling, prediction and simulation.
  • spacetime - R classes and methods for spatio-temporal data.
  • RColorBrewer - Provides color schemes for maps
  • spatstat - Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests
  • spdep - Spatial Dependence: Weighting Schemes, Statistics and Models
  • tigris - Download and use Census TIGER/Line shapefiles in R
  • GWmodel - Geographically-Weighted Models
  • tmap - R package for thematic maps

R Development

Packages for packages.

Logging

Packages for Logging

  • futile.logger - A logging package in R similar to log4j
  • log4r - A log4j derivative for R
  • logging - A logging package emulating the python logging package.

Data Packages

Handy Data Packages

  • engsoccerdata - English and European soccer results 1871-2016.
  • gapminder - Excerpt from the Gapminder dataset (data about countries through the past 50 years).
  • wbstats - Tools for searching and downloading data and statistics from the World Bank Data API and the World Bank Data Catalog API.
  • ICON - complex systems & networks datasets from the Index of COmplex Networks (ICON) database webpage.
  • RCOBOLDI - Import COBOL CopyBook data files directly into R as properly structured data frames. Package builds are available via Drat and DockerHub.

Other Tools

Handy Tools for R

  • git2r - Gives you programmatic access to Git repositories from R.
  • Conda - Most R packages are available through the Conda polyglot cross-platform dependency manager.

Other Interpreters

Alternative R engines.

  • CXXR - Refactorising R into C++.
  • fastR - FastR is an implementation of the R Language in Java atop Truffle and Graal.
  • pqR - a "pretty quick" implementation of R
  • renjin - a JVM-based interpreter for R.
  • rho - Refactor the interpreter of the R language into a fully-compatible, efficient, VM for R.
  • riposte - a fast interpreter and JIT for R.
  • TERR - TIBCO Enterprise Runtime for R.

Learning R

Packages for Learning R.

  • swirl heart - An interactive R tutorial directly in your R console.
  • DataScienceR heart - a list of R tutorials for Data Science, NLP and Machine Learning.

Resources

Where to discover new R-esources.

Websites

Manuals

  • R-project - The R Project for Statistical Computing.
  • An Introduction to R - A very good introductory text on R that also covers some advanced topics. See also the Manuals section on CRAN.
  • CRAN Contributed Docs - CRAN Contributed Documentation in many languages.
  • Quick-R - An excellent quick reference
  • tryR - A quick course for getting started with R.

Tools and References

  • RDocumentation - Search through all CRAN, Bioconductor, GitHub packages and their archives with RDocumentation.
  • rdrr.io - Find R package documentation. Try R packages in your browser.
  • CRAN Task Views - Task Views for CRAN packages.
  • rnotebook.io - Create online R Jupyter Notebooks for free.

News and Info

  • R Weekly - Weekly updates about R and Data Science. R Weekly is openly developed on GitHub.
  • R Bloggers - There are people scattered across the Web who blog about R. This is simply an aggregator of many of those feeds.
  • R-users - A job board for R users (and the people who are looking to hire them)

Books

Free and Online

Paid

  • The Art of R Programming - It's a good resource for systematically learning fundamentals such as types of objects, control statements, variable scope, classes and debugging in R.
  • R Cookbook, 2nd ed. by JD Long & Paul Teetor (2019) - A quick and simple introduction to conducting many common statistical tasks with R.
  • R in Action - This book aims at all levels of users, with sections for beginning, intermediate and advanced R ranging from "Exploring R data structures" to running regressions and conducting factor analyses.
  • Use R! Series by Springer - This series from Springer publishes short, inexpensive, focused books aimed at practitioners. Books can discuss the use of R in a particular subject area, such as Bayesian networks, ggplot2 and Rcpp.
  • Learning R Programming - Learning R as a programming language from basics to advanced topics.

Book/monograph Lists and Reviews

Podcasts

Reference Cards

MOOCs

Massive open online courses.

Lists

Great resources for learning domain knowledge.

R Ecosystems

R communities and package collections (in alphabetical order):

2018

2017

  • prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • tidyverse - Easily install and load packages from the tidyverse
  • purrr - A functional programming toolkit for R
  • hrbrthemes - Opinionated, typographic-centric ggplot2 themes and theme components
  • xaringan - Create HTML5 slides with R Markdown and the JavaScript library remark.js
  • blogdown - Create Blogs and Websites with R Markdown
  • glue - Glue strings to data in R. Small, fast, dependency free interpreted string literals.
  • covr - Test coverage reports for R
  • lintr - Static Code Analysis for R
  • reprex - Render bits of R code for sharing, e.g., on GitHub or StackOverflow.
  • reticulate - R Interface to Python
  • tensorflow - TensorFlow for R
  • utf8 - Manipulating and printing UTF-8 text that fixes multiple bugs in R's UTF-8 handling.
  • Patchwork - Combine separate ggplots into the same graphic.

Other Awesome Lists

Contributing

Your contributions are always welcome!

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - CC BY-NC-SA 4.0