Let's Start with the R Decision Tree
Install R Package:
Use the below command in R console to install the package.
install.packages("party")
The package "party" provides the function ctree(), which is used to create and analyze decision trees.
Syntax:
The basic syntax to create a decision tree is:
ctree(formula, data)
Where
- formula is a formula describing the predictor and response variables.
- data is the name of the data set used.
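To make the formula argument concrete, here is a small sketch in base R (no extra packages needed). The variable names are the columns of the readingSkills data set used in the rest of this post; the object names tree.formula, response and predictors are illustrative choices only.

```r
# Build the formula: response on the left of ~, predictors on the right.
tree.formula <- nativeSpeaker ~ age + shoeSize + score

# A formula is an ordinary R object, so it can be inspected before use.
print(class(tree.formula))
print(all.vars(tree.formula))   # every variable name, response first

# Separate the response from the predictors.
response   <- all.vars(tree.formula)[1]
predictors <- all.vars(tree.formula)[-1]
print(response)
print(predictors)
```

A formula stored this way can be passed to ctree() as its first argument in place of a formula typed inline.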
Input Data:
We will use the data set named readingSkills, which ships with the party package, to create a decision tree. Each record holds a person's age, shoe size ("shoeSize"), reading test score ("score") and whether the person is a native speaker ("nativeSpeaker"). The tree will predict nativeSpeaker from the other three variables.
Here is the sample data.
# Load the party package.
library(party)

# Print some records from data set readingSkills.
print(head(readingSkills))
When we execute the above code, it produces the following result:
  nativeSpeaker age shoeSize    score
1           yes   5 24.83189 32.29385
2           yes   6 25.95238 36.63105
3            no  11 30.42170 49.60593
4           yes   7 28.66450 40.28456
5           yes  11 31.88207 55.46085
6           yes  10 30.07843 52.83124
Loading required package: methods
Loading required package: grid
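Beyond the first six rows, it can help to look at the column types and the class balance before fitting a tree. The following is a minimal sketch that assumes the party package is installed; it uses only base R functions on the readingSkills data frame.

```r
library(party)

# Column names and types: the response should be a factor for a
# classification tree.
str(readingSkills)

# How many native and non-native speakers are in the data.
print(table(readingSkills$nativeSpeaker))

# Summary statistics of the three predictors.
print(summary(readingSkills[, c("age", "shoeSize", "score")]))
```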
Example:
We will use the ctree() function to create the decision tree and see its graph.
library(party)

# Create the input data frame.
input.dat <- readingSkills[c(1:105),]

# Give the chart file a name.
png(file = "decision_tree.png")

# Create the tree.
output.tree <- ctree(
  nativeSpeaker ~ age + shoeSize + score,
  data = input.dat)

# Plot the tree.
plot(output.tree)

# Save the file.
dev.off()
When we execute the above code, it saves the chart to decision_tree.png and prints the following result:
null device
          1
Loading required package: methods
Loading required package: grid
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4
Loading required package: strucchange
Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric

Loading required package: sandwich
Conclusion:
From the above tree we can conclude that anyone whose reading score is less than 38.3 and whose age is more than 6 is classified as not a native speaker.
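A conclusion read off the chart can also be checked against the fitted model itself. The sketch below refits the same tree, prints its split rules as text, and then predicts on rows the tree has not seen. Treating the rows after 105 as a check set is an assumption made here for illustration, and the names test.dat, predicted, confusion and accuracy are hypothetical.

```r
library(party)

# Same training subset and model as in the example above.
input.dat <- readingSkills[c(1:105), ]
output.tree <- ctree(nativeSpeaker ~ age + shoeSize + score, data = input.dat)

# Print the fitted split rules as text instead of a chart.
print(output.tree)

# Assumption: the remaining rows serve as a held-out check set.
test.dat <- readingSkills[106:nrow(readingSkills), ]
predicted <- predict(output.tree, newdata = test.dat)

# Cross-tabulate predicted against actual labels.
confusion <- table(predicted = predicted, actual = test.dat$nativeSpeaker)
print(confusion)

# Share of held-out rows classified correctly.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The printed rules show the exact split thresholds, which is usually easier than reading them from the plot.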
That's it!
Thank you and keep visiting.