The pairs plot compares every column in the selected data with every other column, one pair at a time. It is a great first pass to see if a relationship between two variables jumps out at you. To show this, I will create four variables. The first is drawn from a random uniform distribution. The second depends on the first through a linear relationship. The third has a squared relationship with the first variable. Finally, the fourth has a cubic relationship with the first variable.
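# Independent variable: 300 draws from a uniform distribution on [-25, 25]
indep <- runif(300, -25, 25)
# Linear, squared, and cubic functions of indep, each with Gaussian noise
linear <- 2 * indep + rnorm(300, 0, 3)
exp2 <- 0.05 * indep^2 + rnorm(300, 0, 2)
exp3 <- indep^3 / 1000 + rnorm(300, 0, 1)
# Collect the four columns in a data frame
samples <- data.frame(indep, linear, exp2, exp3)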
Now plot them using pairs and see what it produces:
pairs(~indep+linear+exp2+exp3,data=samples)
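If the curved relationships are hard to pick out by eye, one option is to overlay a smoothed trend line in each panel. The following is a minimal sketch, not part of the original walkthrough: it regenerates the same four variables so that it runs on its own, and the `set.seed(42)` call, the point style, and the colour are my own assumptions added for reproducibility and readability. `panel.smooth` ships with base R graphics and draws the points plus a lowess curve.

```r
# Sketch: regenerate the four variables so this block runs on its own.
set.seed(42)  # assumption: seed added only so the figure is reproducible
indep  <- runif(300, -25, 25)
linear <- 2 * indep + rnorm(300, 0, 3)
exp2   <- 0.05 * indep^2 + rnorm(300, 0, 2)
exp3   <- indep^3 / 1000 + rnorm(300, 0, 1)
samples <- data.frame(indep, linear, exp2, exp3)

# panel.smooth plots the points and adds a lowess curve to every panel,
# which makes the squared and cubic shapes easier to see.
pairs(~ indep + linear + exp2 + exp3, data = samples,
      panel = panel.smooth, pch = 20, col = "grey40")
```

The smoothed curve should look roughly straight in the indep/linear panel and visibly bent in the indep/exp2 and indep/exp3 panels.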
The plots in this case do a pretty good job of showing us what is going on. As an interesting aside, if we just call the plot function on the data.frame samples, R produces the same plot for us.
plot(samples)
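A quick way to see why the picture is worth having is to compare it with the plain correlation matrix. This is a hedged sketch rather than part of the original post: it rebuilds the same variables so it is self-contained, and the seed is an assumption added for reproducibility. `cor` computes Pearson correlations, which only measure linear association.

```r
# Sketch: rebuild the data so this block runs on its own.
set.seed(42)  # assumption: seed added only for reproducibility
indep  <- runif(300, -25, 25)
linear <- 2 * indep + rnorm(300, 0, 3)
exp2   <- 0.05 * indep^2 + rnorm(300, 0, 2)
exp3   <- indep^3 / 1000 + rnorm(300, 0, 1)
samples <- data.frame(indep, linear, exp2, exp3)

# Pearson correlation of every column with every other column
cor_matrix <- cor(samples)
print(round(cor_matrix, 2))
```

Because indep is symmetric around zero, its correlation with exp2 should come out close to zero even though exp2 clearly depends on it, while the correlation with linear should be close to one. The pairs plot shows the squared relationship that the single number hides.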