= Dec 8, 2016 =
[[MeetingNotes/2016-12/Attendees|Attendees]]

== Wiki Tour ==
Created accounts, gave attendees edit permissions, and gave an overview of the MoinMoin wiki features.

== R ==
Confirmed that people had been able to install R (r-base on Debian distros) and in some cases RStudio. Ran through some basic R commands:

{{{
install.packages("ISLR")
library(ISLR)
data(USArrests)
library(MASS)
head(USArrests)
head(USArrests, 10)
str(USArrests)
summary(USArrests)  # summary gives model/object-specific output
plot(1)
plot(USArrests)
USArrests$UrbanPop
plot(USArrests$UrbanPop)
plot(USArrests$UrbanPop, USArrests$Murder)
plot(Murder ~ UrbanPop, USArrests)
m <- lm(Murder ~ UrbanPop, USArrests)
summary(m)
plot(m)  # not predictions, just diagnostics
}}}

== Presentation ==
David continued his presentation on Linear Classification.

 * Finished part 1, on two-class classification.
 * Encode one class as 0 and the other as 1, then fit to those values; the fitted function gives the best separation based on the distance between the two classes.
 * Look for functions to fit, like probit and logit, to estimate the probability of being in one class or the other (see the logistic-regression sketch at the end of these notes).
 * Another approach is using decision trees (e.g. splitting on Petal Length). They can better handle unknown distributions.
 * You can use R to break a dataset into multiple classes, not just two. By increasing the number of classes on some datasets you start to approximate the probit or logit curves.
 * Explore your dataset to find a way to transform it so that the classes become linearly separable.
 * K-class classification
  * You can no longer use a single linear function to find the boundary. One approach is to assemble multiple linear classifiers.
  * One way to do this is using one-vs-all classifiers.
  * When a factor has more than two levels, you can encode it as indicator variables in extra data columns. You can get away with one fewer indicator column than you have factor levels.
  * {{{d <- data.frame(iris[,c(2,4,5)])}}}
  * {{{d$c1 <- d$Species == "setosa"}}}
  * We create and plot the three models to analyze. Next we combine the classifiers so that they "cast votes" (see the one-vs-all sketch at the end of these notes). Then try fitting in different ways, e.g. with a different predictor or a decision tree. For one vs all (1-v-a) you need O(n) models; for one vs one (1-v-1) you need O(n^2^) models, one per pair of classes.
 * Maximize D^2^/S, where D is the distance between class means and S is the within-class scatter (i.e. variance), or equivalently minimize S/D^2^.
 * The goal, again, is to reduce the problem to linear regressions to simplify solving; linear regression is thus a specialization of the K-class setup.

== Next Time ==
 * Thursday, January 12, 2017
 * Be ready to discuss Chapter 2 of AnIntroductionToStatisticalLearning
 * David will wrap up his presentation on Linear Classification.
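
A minimal logistic-regression sketch for the two-class case discussed above, assuming a two-species subset of iris and {{{glm()}}} with the binomial family; this is an illustration of the logit idea, not the exact example from the presentation:

{{{
# Two-class subset of iris (assumption for illustration)
d2 <- subset(iris, Species != "setosa")
d2$is_virginica <- as.numeric(d2$Species == "virginica")

# glm() with the binomial family fits the logit;
# family = binomial("probit") would fit the probit instead
g <- glm(is_virginica ~ Petal.Length, data = d2, family = binomial)
summary(g)

# Predicted probability of being in the "virginica" class
p <- predict(g, type = "response")
table(predicted = p > 0.5, actual = d2$is_virginica)
}}}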
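
A minimal one-vs-all sketch on iris, extending the {{{d <- data.frame(iris[,c(2,4,5)])}}} snippet from the presentation; the indicator columns, the use of {{{lm()}}}, and the voting rule are assumptions for illustration, not the presentation's exact code:

{{{
# Sepal.Width, Petal.Width, Species (columns 2, 4, 5)
d <- data.frame(iris[, c(2, 4, 5)])

# One 0/1 indicator column per class ("one vs all")
d$setosa     <- as.numeric(d$Species == "setosa")
d$versicolor <- as.numeric(d$Species == "versicolor")
d$virginica  <- as.numeric(d$Species == "virginica")

# Fit one linear model per indicator
m1 <- lm(setosa     ~ Sepal.Width + Petal.Width, d)
m2 <- lm(versicolor ~ Sepal.Width + Petal.Width, d)
m3 <- lm(virginica  ~ Sepal.Width + Petal.Width, d)

# Combine the classifiers: each "casts a vote" with its fitted value
# and the largest one wins
scores <- cbind(predict(m1), predict(m2), predict(m3))
pred   <- c("setosa", "versicolor", "virginica")[max.col(scores)]
table(pred, d$Species)  # confusion matrix
}}}

Swapping {{{lm()}}} for a different predictor, a decision tree, or a per-class {{{glm()}}} is the "try fitting in different ways" step from the notes.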