Arules
Mining Association Rules and Frequent Itemsets with R
<img src="man/figures/logo.svg" align="right" height="139" /> R package arules - Mining Association Rules and Frequent Itemsets
Introduction
The arules package family for R provides the infrastructure for representing, manipulating and analyzing transaction data and patterns using frequent itemsets and association rules. The package also provides a wide range of interest measures and mining algorithms including the code of Christian Borgelt’s popular and efficient C implementations of the association mining algorithms Apriori and Eclat. In addition, the following mining algorithms are available via fim4r:
- Apriori
- Eclat
- Carpenter
- FPgrowth
- IsTa
- RElim
- SaM
Code examples can be found in Chapter 5 of the web book R Companion for Introduction to Data Mining.
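As a minimal sketch of the two algorithms bundled with the base package, the following mines frequent itemsets with both apriori() and eclat() at the same threshold. The IncomeESL data set and the support value of 0.1 are illustrative choices, not a recommendation.

```r
library("arules")

# IncomeESL ships with arules; convert the data.frame to transactions.
data("IncomeESL")
trans <- transactions(IncomeESL)

# Mine frequent itemsets with Apriori (support threshold is an assumption).
sets_apriori <- apriori(trans,
  parameter = list(support = 0.1, target = "frequent itemsets"),
  control = list(verbose = FALSE))

# Mine frequent itemsets with Eclat using the same threshold.
sets_eclat <- eclat(trans,
  parameter = list(support = 0.1),
  control = list(verbose = FALSE))

# Both algorithms should report the same number of frequent itemsets.
length(sets_apriori) == length(sets_eclat)
```

Both functions return an itemsets object, so the results can be inspected and sorted in the same way regardless of which algorithm produced them.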
To cite package ‘arules’ in publications use:
Hahsler M, Gruen B, Hornik K (2005). “arules - A Computational Environment for Mining Association Rules and Frequent Item Sets.” Journal of Statistical Software, 14(15), 1-25. ISSN 1548-7660, doi:10.18637/jss.v014.i15 https://doi.org/10.18637/jss.v014.i15.
@Article{,
title = {arules -- {A} Computational Environment for Mining Association Rules and Frequent Item Sets},
author = {Michael Hahsler and Bettina Gruen and Kurt Hornik},
year = {2005},
journal = {Journal of Statistical Software},
volume = {14},
number = {15},
pages = {1--25},
doi = {10.18637/jss.v014.i15},
month = {October},
issn = {1548-7660},
}
Packages
arules core packages
- arules: arules base package with data structures, mining algorithms (APRIORI and ECLAT), interest measures.
- arulesViz: Visualization of association rules.
- arulesCBA: Classification algorithms based on association rules (includes CBA).
- arulesSequences: Mining frequent sequences (cSPADE).
Other related packages
Additional mining algorithms
- arulesNBMiner: Mining NB-frequent itemsets and NB-precise rules.
- fim4r: Provides fast implementations for several mining algorithms. An interface function called fim4r() is provided in arules.
- opusminer: OPUS Miner algorithm for finding the top k productive, non-redundant itemsets. Call opus() with format = 'itemsets'.
- RKEEL: Interface to KEEL’s association rule mining algorithm.
- RSarules: Mining algorithm which randomly samples association rules with one pre-chosen item as the consequent from a transaction dataset.
In-database analytics
- ibmdbR: IBM in-database analytics for R can calculate association rules from a database table.
- rfml: Mine frequent itemsets or association rules using a MarkLogic server.
Interface
- rattle: Provides a graphical user interface for association rule mining.
- pmml: Generates PMML (predictive model markup language) for association rules.
Classification
- arc: Alternative CBA implementation.
- inTrees: Interpret Tree Ensembles provides functions for: extracting, measuring and pruning rules; selecting a compact rule set; summarizing rules into a learner.
- rCBA: Alternative CBA implementation.
- qCBA: Quantitative Classification by Association Rules.
- sbrl: Scalable Bayesian rule lists algorithm for classification.
Outlier Detection
- fpmoutliers: Frequent Pattern Mining Outliers.
Recommendation/Prediction
- recommenderlab: Supports creating predictions using association rules.
The following R packages use arules:
arc,
arlclustering,
arulesCBA,
arulesNBMiner,
arulesSequences,
arulesViz,
clickstream,
CLONETv2,
CRE,
ctsem,
discnorm,
fcaR,
fdm2id,
GroupBN,
ibmdbR,
inTrees,
nuggets,
opusminer,
pervasive,
pmml,
qCBA,
RareComb,
rattle,
rCBA,
recommenderlab,
rgnoisefilt,
RKEEL,
RulesTools,
sbrl,
SurvivalTests,
TELP
Installation
Stable CRAN version: Install from within R with
install.packages("arules")
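In scripts that may run more than once, the install step can be guarded so the package is only downloaded when it is missing. This is a small sketch using base R functions only.

```r
# Install arules from the configured repository only if it is not available yet.
if (!requireNamespace("arules", quietly = TRUE)) {
  install.packages("arules")
}

# Report the installed version to confirm which build is in use.
installed_version <- packageVersion("arules")
print(installed_version)
```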
Current development version: Install from r-universe.
install.packages("arules",
    repos = c("https://mhahsler.r-universe.dev",
              "https://cloud.r-project.org/"))
Usage
Load package and mine some association rules.
library("arules")
data("IncomeESL")
trans <- transactions(IncomeESL)
trans
## transactions in sparse format with
## 8993 transactions (rows) and
## 84 items (columns)
rules <- apriori(trans, supp = 0.1, conf = 0.9, target = "rules")
## Apriori
##
## Parameter specification:
##  confidence minval smax arem  aval originalSupport maxtime support minlen
##         0.9    0.1    1 none FALSE            TRUE       5     0.1      1
##  maxlen target  ext
##      10  rules TRUE
##
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
##
## Absolute minimum support count: 899
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[84 item(s), 8993 transaction(s)] done [0.01s].
## sorting and recoding items ... [42 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 4 5 6 done [0.03s].
## writing ... [457 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
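The call above passes supp and conf directly; apriori() adds such extra arguments to its parameter list for convenience. A sketch of the more explicit form follows, with the progress log switched off through the control list. The variable name rules_quiet is only used for illustration.

```r
library("arules")
data("IncomeESL")
trans <- transactions(IncomeESL)

# Same thresholds as above, written as an explicit parameter list.
# verbose = FALSE suppresses the progress output shown above.
rules_quiet <- apriori(trans,
  parameter = list(supp = 0.1, conf = 0.9, target = "rules"),
  control = list(verbose = FALSE))

# Number of rules found and the first rows of the quality measures.
length(rules_quiet)
head(quality(rules_quiet))
```

quality() returns a data.frame with one row per rule, which is convenient for filtering or plotting outside of arules.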
Inspect the rules with the highest lift.
inspect(head(rules, n = 3, by = "lift"))
##     lhs                           rhs                      support confidence coverage lift count
## [1] {dual incomes=no,
##      householder status=own}   => {marital status=married}    0.10       0.97     0.10  2.6   914
## [2] {years in bay area=>10,
##      dual incomes=yes,
##      type of home=house}       => {marital status=married}    0.10       0.96     0.10  2.6   902
## [3] {dual incomes=yes,
##      householder status=own,
##      type of home=house,
##      language in home=english} => {marital status=married}    0.11       0.96     0.11  2.6   988
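To make the lift column concrete: lift is the rule's confidence divided by the relative support of its right-hand side. The sketch below recomputes it by hand for the top three rules and compares the result with the stored values. The helper variable names are illustrative.

```r
library("arules")
data("IncomeESL")
trans <- transactions(IncomeESL)
rules <- apriori(trans,
  parameter = list(supp = 0.1, conf = 0.9, target = "rules"),
  control = list(verbose = FALSE))

top <- head(rules, n = 3, by = "lift")

# Rules mined by apriori() have a single item in the rhs,
# so unlist() yields exactly one item label per rule.
rhs_items <- unlist(LIST(rhs(top)))

# Relative support of each consequent item over all transactions.
rhs_support <- itemFrequency(trans)[rhs_items]

# lift = confidence / support(rhs)
manual_lift <- quality(top)$confidence / unname(rhs_support)

# Compare the hand-computed values with the lift reported by arules.
all.equal(manual_lift, quality(top)$lift)
```

A lift above 1 means the left-hand side and the right-hand side occur together more often than would be expected if they were independent.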
Using arules with tidyverse
arules works seamlessly with tidyverse. For example:
- dplyr can be used for cleaning and preparing the transactions.
- transaction() and other functions accept `ti
