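Awesome R, Chinese edition (zh_CN)
A curated list of awesome R packages and tools. Inspired by awesome-machine-learning.
A navigable version of the package list, which is easier to browse, is available at https://awesome-r.com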
Highlighted entries are [Top 50](https://github.com/rstudio/RStartHere/blob/master/top_downloads_2016/top_packages) CRAN downloaded packages or repos with 400+ stars.
Integrated Development Environments
Integrated development environments for R.
Syntax
Packages that change the way you use R.
Data Manipulation
Packages for data manipulation.
Graphic Displays
Packages for showing data.
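- ggplot2 - An implementation of a powerful system for statistical and computational graphics. Highly recommended.
- ggfortify - A unified interface from ggplot2 to popular statistical packages using one line of code.
- ggrepel - Repel overlapping text labels.
- ggalt - Extra coordinate systems, geoms and stats for ggplot2.
- ggtree - Visualization and annotation of phylogenetic trees.
- ggplot2 Extensions - Showcase of ggplot2 extensions.
- lattice - A powerful and elegant high-level data visualization system.
- corrplot - Graphical display of a correlation matrix or general matrix. It also contains some algorithms for matrix reordering.
- rgl - 3D visualization system for R.
- Cairo - An R graphics device that uses the cairo library to create high-quality display output.
- extrafont - Tools for using fonts in R graphics.
- showtext - Lets R graphics devices use system fonts when showing text.
- animation - A tool for producing animated graphics in R using ImageMagick.
- gganimate - Create simple animations with ggplot2.
- misc3d - Powerful 3D plotting tools.
- xkcd - Use xkcd style in graphs.
- imager - An image processing package based on the CImg library.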
HTML Widgets
Packages for interactive visualizations.
Reproducible Research
Packages for literate programming.
Web Technologies and Services
Packages to surf the web.
Parallel Computing
Packages for parallel computing.
High Performance
Packages for making R faster.
Language API
Packages for other languages.
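- rJava - R interface to Java.
- jvmr - Integration of R, Java, and Scala.
- rJython - R interface to Python via Jython.
- rPython - Allows R to call Python.
- runr - Run Julia and Bash from R.
- RJulia - Call Julia from R.
- RinRuby - A Ruby library that integrates the R interpreter in Ruby.
- R.matlab - Read and write MAT files, and connect R with Matlab.
- RcppOctave - Interface to Octave and Matlab.
- RSPerl - A bidirectional interface for calling Perl from R and R from Perl.
- V8 - Embedded JavaScript engine.
- htmlwidgets - Brings the best of JavaScript data visualization to R.
- rpy2 - Python interface to R.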
Database Management
Packages for managing data.
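- RODBC - ODBC database access for R.
- DBI - Defines a common interface between R and database management systems.
- elastic - Wrapper for the Elasticsearch HTTP API.
- mongolite - Mongo client for R.
- RMySQL - R interface to the MySQL database.
- ROracle - R interface to the Oracle database.
- RPostgreSQL - R interface to the PostgreSQL database system.
- RSQLite - SQLite interface for R.
- RJDBC - Access databases through the JDBC interface.
- rmongodb - MongoDB driver for R.
- rredis - Redis driver for R.
- RCassandra - Direct interface (not Java) to the most basic functionality of Apache Cassandra.
- RHive - R extension facilitating distributed computing via Apache Hive.
- RNeo4j - Neo4j graph database driver.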
Machine Learning
Packages for making R cleverer.
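- AnomalyDetection - AnomalyDetection R package from Twitter.
- ahaz - Regularization for semiparametric additive hazards regression.
- arules - Mining association rules and frequent itemsets.
- bigrf - Big Random Forests: classification and regression forests for large data sets.
- bigRR - Generalized ridge regression (with special advantage for p >> n cases).
- bmrm - Bundle methods for regularized risk minimization.
- Boruta - A wrapper algorithm for all-relevant feature selection.
- BreakoutDetection - Breakout detection via robust E-statistics, from Twitter.
- bst - Gradient boosting.
- CausalImpact - Causal inference using Bayesian structural time-series models.
- C50 - C5.0 decision trees and rule-based models.
- caret - Classification and regression training.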
- Clever Algorithms For Machine Learning
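- CORElearn - Classification, regression, feature evaluation and ordinal evaluation.
- CoxBoost - Cox models by likelihood based boosting for a single survival endpoint or competing risks.
- Cubist - Rule- and instance-based regression modeling.
- e1071 - Misc statistical functions (e1071), including latent class analysis, Fourier transform, fuzzy clustering, support vector machines, shortest path computation, naive Bayes classifier and more.
- earth - Multivariate adaptive regression spline models.
- elasticnet - Sparse estimation and sparse principal component analysis.
- ElemStatLearn - Data sets, functions and examples from the book "The Elements of Statistical Learning, Data Mining, Inference, and Prediction".
- evtree - Evolutionary learning of globally optimal trees.
- forecast - Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
- forecastHybrid - Automatic ensemble and cross-validation of ARIMA, ETS, STLM, TBATS, and neural network models from the "forecast" package.
- FSelector - A feature selection framework based on subset-search or feature-ranking approaches.
- frbs - Fuzzy rule-based systems for classification and regression tasks.
- GAMBoost - Generalized linear and additive models by likelihood-based boosting.
- gamboostLSS - Boosting methods for GAMLSS.
- gbm - Generalized boosted regression models.
- glmnet - Lasso and elastic-net regularized generalized linear models.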
- glmpath - L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
- GMMBoost - Likelihood-based boosting for generalized mixed models.
- grplasso - Fitting user specified models with Group Lasso penalty
- grpreg - Regularization paths for regression models with grouped covariates.
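- h2o - Deep learning, random forests, GBM, k-means, PCA, GLM.
- hda - Heteroscedastic discriminant analysis.
- ipred - Improved predictors.
- kernlab - Kernel-based machine learning lab.
- klaR - Classification and visualization.
- kohonen - Supervised and unsupervised self-organising maps.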
- lars - Least Angle Regression, Lasso and Forward Stagewise
- lasso2 - L1 constrained estimation aka ‘lasso’
- LiblineaR - Linear predictive models based on the LIBLINEAR C/C++ library.
- lme4 - Mixed-effects models.
- LogicReg - Logic regression.
- maptree - Mapping, pruning, and graphing tree models.
- mboost - Model-Based Boosting
- Machine Learning For Hackers
- mvpart - Multivariate partitioning
- MXNet - MXNet brings flexible and efficient GPU computing and state-of-art deep learning to R.
- ncvreg - Regularization paths for SCAD- and MCP-penalized regression models
- nnet - Feed-forward Neural Networks and Multinomial Log-Linear Models
- oblique.tree - Oblique Trees for Classification Data
- pamr - Pam: Prediction Analysis for Microarrays.
- party - A Laboratory for Recursive Partytioning
- partykit - A Toolkit for Recursive Partytioning
- penalized - L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
- penalizedLDA - Penalized classification using Fisher's linear discriminant
- penalizedSVM - Feature selection SVM using penalty functions.
- quantregForest - Quantile Regression Forests
- randomForest - Breiman and Cutler's random forests for classification and regression.
- randomForestSRC - Random Forests for Survival, Regression and Classification (RF-SRC).
- rattle - Graphical user interface for data mining in R.
- rda - Shrunken Centroids Regularized Discriminant Analysis
- rdetools - Relevant Dimension Estimation (RDE) in Feature Spaces
- REEMtree - Regression Trees with Random Effects for Longitudinal (Panel) Data
- relaxo - Relaxed Lasso
- rgenoud - R version of GENetic Optimization Using Derivatives
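- rgp - R genetic programming framework.
- Rmalschains - Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
- rminer - Simpler use of data mining methods (e.g. neural networks and support vector machines) in classification and regression.
- ROCR - Visualizing the performance of scoring classifiers.
- RoughSets - Data analysis using rough set and fuzzy rough set theories.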
- rpart - Recursive Partitioning and Regression Trees
- RPMM - Recursively Partitioned Mixture Model
- RSNNS - Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
- Rsomoclu - Parallel implementation of self-organizing maps.
- RWeka - R interface to Weka (Weka is open-source, Java-based machine learning and data mining software).
- RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
- sda - Shrinkage Discriminant Analysis and CAT Score Variable Selection
- SDDA - Stepwise Diagonal Discriminant Analysis
- SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
- svmpath - svmpath: the SVM Path algorithm
- tgp - Bayesian treed Gaussian process models
- tree - Classification and regression trees.
- varSelRF - Variable selection using random forests.
- xgboost - eXtreme Gradient Boosting Tree model, well known for its speed and performance.
Natural Language Processing
Packages for Natural Language Processing.
- text2vec - Fast text mining framework for vectorization and word embeddings.
- tm - A comprehensive text mining framework.
- openNLP - Apache OpenNLP Tools interface.
- koRpus - An R package for text analysis.
- zipfR - Statistical models for word frequency distributions.
- NLP - Basic functions for natural language processing.
- LDAvis - Interactive visualization of topic models.
- topicmodels - Topic modeling interface to the C code developed by David M. Blei for Topic Modeling (Latent Dirichlet Allocation (LDA), and Correlated Topics Models (CTM)).
- syuzhet - Extracts sentiment from text using three different sentiment dictionaries.
- SnowballC - Snowball stemmers based on the C libstemmer UTF-8 library.
- quanteda - Quantitative analysis of textual data.
- Topic Models Resources - Topic model learning and R related resources.
- NLP for Chinese - NLP related resources in R. @Chinese
Bayesian
Packages for Bayesian Inference.
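- coda - Output analysis and diagnostics for MCMC (Markov chain Monte Carlo).
- mcmc - Markov chain Monte Carlo (MCMC).
- MCMCpack - Markov chain Monte Carlo (MCMC) package.
- R2WinBUGS - Running WinBUGS and OpenBUGS from R / S-PLUS.
- BRugs - R interface to the OpenBUGS MCMC software.
- rjags - R interface to the JAGS MCMC library.
- rstan - R interface to the Stan MCMC software.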
Optimization
Packages for Optimization.
- minqa - Derivative-free optimization algorithms by quadratic approximation.
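- nloptr - R interface to NLopt, a free, open-source library for nonlinear optimization.
- lpSolve - R interface to Lp_solve for solving linear and integer programs.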
Finance
Packages for dealing with money.
Bioinformatics
Packages for processing biological datasets.
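- Bioconductor - Tools for the analysis and comprehension of high-throughput genomic data.
- genetics - R package for handling genetic data.
- gap - An integrated package for genetic data analysis of both population and family data.
- ape - Analyses of phylogenetics and evolution.
- pheatmap - An easy-to-use heatmap tool.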
- ddpcr - Analysis and visualization of Droplet Digital PCR data.
Network Analysis
Packages to construct, analyze and visualize network data.
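- Network Analysis List - Network analysis related resources.
- igraph - A collection of network analysis tools.
- network - Basic tools to manipulate relational data.
- sna - Basic network measures and visualization tools.
- netdiffuseR - Tools for the analysis of network diffusion.
- networkDynamic - Support for dynamic and temporal networks.
- ndtv - Tools to construct animated visualizations of dynamic networks, supporting various data formats.
- statnet - Tools for the analysis, simulation and visualization of network data.
- ergm - Exponential random graph models.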
- latentnet - Latent position and cluster models for network objects.
- tnet - Network measures for weighted, two-mode and longitudinal networks.
- rgexf - Export network objects from R to GEXF, for manipulation with network software like Gephi or Sigma.
- visNetwork - Network visualization using the vis.js library.
R Development
Packages for packages.
Logging
Packages for Logging
- futile.logger - A logging package in R similar to log4j.
- log4r - A log4j derivative for R.
- logging - A logging package that implements log4j-style logging in R.
Data Packages
Handy Data Packages
- engsoccerdata - English and European soccer league results (1871-2016).
- gapminder - Excerpt from the Gapminder dataset.
Other Tools
Handy Tools for R
- git2r - Use Git from within R.
Other Interpreters
Alternative R engines.
- CXXR - Refactorising R into C++.
- fastR - FastR is an implementation of the R Language in Java atop Truffle and Graal.
- incanter - Clojure-based, R-like statistical computing and graphics environment for the JVM with Lisp spirit.
- pqR - A faster implementation of R.
- renjin - A JVM-based interpreter for R.
- rho - Refactor the interpreter of the R language into a fully-compatible, efficient, VM for R.
- riposte - A fast interpreter and JIT for R.
- RRO - Revolution R Open (Microsoft R Open).
- TERR - TIBCO Enterprise Runtime for R.
Learning R
Packages for Learning R.
- swirl - An interactive R tutorial that runs directly in your R console.
- DataScienceR - A guide to data science, neural networks, and machine learning.
Resources
Where to discover new R resources.
Websites
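- R-project - The official website of the R project.
- R Bloggers - A comprehensive blog site about R.
- DataCamp - Learn R data analytics online.
- Quick-R - An excellent quick reference.
- Advanced R - An online version of the Advanced R programming book.
- Efficient R Programming - Online home of the book "Efficient R Programming".
- CRAN Task Views - Task-based listing of CRAN packages.
- The R Programming Wikibook - A collaborative handbook for R.
- R-users - A job board for R users.
- R Cookbook - A problem-oriented website that supports the [R Graphics Cookbook](http://shop.oreilly.com/product/0636920023135.do).
- tryR - A quick way to get started with R.
- RDocumentation - Search all CRAN, Bioconductor, and GitHub packages and their documentation.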
Books
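- R Books List - List of R books.
- The Art of R Programming - A good resource for systematically learning the basics, such as object types, control statements, variable scope, and debugging.
- Free Books - CRAN contributed documentation in many languages.
- R Cookbook - A quick and simple introduction to R and common statistical tasks.
- Books written as part of the Johns Hopkins Data Science Specialization:
  - Exploratory Data Analysis with R - Basic analytical skills for all kinds of data.
  - R Programming for Data Science - More advanced data analysis that relies on R.
  - Report Writing for Data Science in R - Report generation and reproducible research in R.
- R Packages - A book about writing R packages (available in paper and website formats).
- R in Action - A book aimed at helping R users of all levels.
- Use R! - This series of inexpensive and focused books from Springer is aimed at practitioners. Books can discuss the use of R in a particular subject area, such as Bayesian networks, ggplot2 and Rcpp.
- R for SAS and SPSS users - A resource for users already familiar with SAS and SPSS.
- An Introduction to R - A very good introductory text for R that also covers some advanced topics.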
- Introduction to Statistical Learning with Application in R - A simplified and "operational" version of The Elements of Statistical Learning. Free softcopy provided by its authors.
- The R Inferno - Patrick Burns gives insight into R's ins and outs along with its quirks!
- R for Data Science - Free book from RStudio developers with emphasis on data science workflow.
Podcasts
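- Not So Standard Deviations - A data science podcast.
- R World News - News from the R community that helps you keep up to date.
  - @Bob Rudis and @Jay Jacobs.
- The R-Podcast - Practical advice on using R.
- R Talk - News and discussions about the statistical software and language R.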
Reference Cards
- R Reference Card 2.0 - Material from R for Beginners by permission of Emmanuel Paradis (Version 2 by Matt Baggott).
- Regression Analysis Refcard - R Reference Card for Regression Analysis.
- Reference Card for ESS - Reference Card for ESS.
- R Markdown Cheat sheet - Quick reference guide for writing reports with R Markdown.
- Shiny Cheat sheet - Quick reference guide for building Shiny apps.
- ggplot2 Cheat sheet - Quick reference guide for data visualisation with ggplot2.
- devtools Cheat sheet - Quick reference guide to package development in R.
MOOCs
Massive open online courses.
- The Analytics Edge - Hands-on introduction to data analysis with R from MITx.
- Johns Hopkins University Data Science Specialization - 9 courses including: Introduction to R, literate analysis tools, Shiny and some more.
- HarvardX Biomedical Data Science - Introduction to R for the Life Sciences.
- Explore Statistics with R - Covers introduction, data handling and statistical analysis in R.
Lists
Great resources for learning domain knowledge.
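- Books - List of R books.
- DataScienceR - A list of R guides for data science, neural networks, and machine learning.
- ggplot2 Extensions - Showcase of ggplot2 extensions.
- Natural Language Processing - NLP related resources in R. @Chinese
- Network Analysis - Network analysis related resources.
- Open Data - Using R to obtain, parse, manipulate, create, and share open data.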
- Posts - R blog posts and articles.
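- Package Development - Resources and tools that improve package development.
- R Project Conferences - Information about useR! conferences and DSC conferences.
- RStartHere - A guide to some very useful R packages.
- RStudio Addins - List of RStudio addins.
- Topic Models - Topic model learning and R related resources.
- Web Technologies - Information about how to use R with the World Wide Web.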
Other Awesome Lists
Contributing
Your contributions are always welcome! My email: asxinyu@qq.com
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - CC BY-NC-SA 4.0