Shane Mueller
Sept 13 2018
A few core concepts give you a lot of capabilities in R.
They let you reuse and automate analyses, save time, and repeat processes.
Because R allows using functions on entire data vectors, it often blurs the line between conditionals and iteration.
Functions are defined using the function() function, and the result is assigned to a name.
se <- function(x)
{
   stdev <- sd(x)
   se <-  stdev/sqrt(length(x))
   return(se)
}
se <- function(x) {sd(x)/sqrt(length(x))}
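The two definitions of se above are interchangeable: a function returns the value of its last expression, so the explicit return() is optional. As a quick check, the sketch below defines both forms under different names (se_long and se_short are hypothetical names used only here) and compares them on a small made-up vector.

```r
# Long form: the intermediate value is named before it is used
se_long <- function(x) {
  stdev <- sd(x)
  stdev / sqrt(length(x))
}

# Short form: the value of the single expression is returned automatically
se_short <- function(x) sd(x) / sqrt(length(x))

vals <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data for illustration
all.equal(se_long(vals), se_short(vals))
```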
x <- 100
doit <- function(){
   x <- x + 1
  return(x)
}
print(doit())
[1] 101
print(x)
[1] 100
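The output above shows that the assignment inside doit() changed only a local copy of x; the x in the workspace is still 100. A minimal sketch of the usual way to update the outer value is to return the result and reassign it. The names add_one and counter are hypothetical.

```r
counter <- 100

add_one <- function(value) {
  value <- value + 1   # modifies only the local copy
  value                # hand the new value back to the caller
}

add_one(counter)             # computes 101, but counter itself is unchanged
counter                      # still 100
counter <- add_one(counter)  # reassign to keep the result
counter                      # now 101
```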
In R, function arguments are named:
fn <- function(x, y=NULL,title="My Title")
{
  print(paste(x,y,title))
}
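Because the arguments are named, a call can supply them by position, by name, or as a mix of both, and any argument with a default can be left out. A short sketch of the three calling styles, repeating fn so the block runs on its own:

```r
fn <- function(x, y = NULL, title = "My Title") {
  print(paste(x, y, title))
}

fn("a", "b")                    # by position: x = "a", y = "b", default title
fn(y = "b", x = "a")            # by name: the order no longer matters
fn("a", "b", title = "Other")   # positional and named arguments mixed
```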
You don't need to name a function to use it:
set.seed(111)
x2 <- data.frame(a=runif(10),b=runif(10)+.5)
lapply(x2, function(x){mean(x)})
$a
[1] 0.4069604
$b
[1] 0.8926438
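An unnamed (anonymous) function can do anything a named one can; it simply exists only for the duration of the call. A small sketch using sapply on a made-up data frame (small_df and spread are hypothetical names) to compute the range of each column:

```r
small_df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

# Max minus min of each column, using a function that is never given a name
spread <- sapply(small_df, function(v) max(v) - min(v))
spread
```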
mmm  <- function(x)
{
 c(min(x),median(x),max(x))
}
## See how these change when you have more samples:
mmm(1/runif(1000))
[1]    1.001720    1.965887 2248.495039
mmm(1/runif(100))
[1]  1.028069  2.216996 66.805299
data <- runif(100) + 1/rnorm(100)
mmm(data)
[1] -197.8342889    0.2196956  110.9483544
The most common conditional statement is the if statement.
##Conditional Branching
if( 0 )
{
  print("This will never print")
}
if(1)
{
  print("But this will")
}
[1] "But this will"
if(runif(1)<.5)
{
  print("less")
}else{
  print("more")
}
[1] "more"
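The examples above work because if accepts a number and treats it as a logical value: 0 counts as FALSE and any other number as TRUE. The condition must be a single value; recent versions of R raise an error when it has more than one element. A rough sketch, with made-up values, of reducing a vector comparison to one TRUE/FALSE using any():

```r
as.logical(0)    # FALSE
as.logical(1)    # TRUE
as.logical(-3)   # TRUE: any nonzero number counts as true

scores <- c(0.2, 0.7, 0.9)   # made-up values

# scores > 0.8 has three elements, so collapse it to one value before using if
if (any(scores > 0.8)) {
  msg <- "at least one score above 0.8"
} else {
  msg <- "no score above 0.8"
}
msg
```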
x <- sample(letters[1:5],10,replace=TRUE)
x2 <- ifelse(x=="a","A",x)   ## recode with ifelse: returns a new vector
x[which(x=="a")] <- "A"      ## recode with which: changes x in place
x
 [1] "b" "b" "e" "b" "e" "c" "e" "d" "A" "b"
  x2
 [1] "b" "b" "e" "b" "e" "c" "e" "d" "A" "b"
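Both recoding lines above turn every "a" into "A", which is why x and x2 print identically: ifelse builds a new vector, whereas the which form overwrites selected elements in place. A small sketch on a fixed vector (letters_in and the by_* names are hypothetical), including the logical-index form that works without which:

```r
letters_in <- c("a", "b", "a", "c")

by_ifelse <- ifelse(letters_in == "a", "A", letters_in)  # returns a new vector

by_which <- letters_in
by_which[which(letters_in == "a")] <- "A"                # replaces by position

by_logical <- letters_in
by_logical[letters_in == "a"] <- "A"                     # replaces by TRUE/FALSE mask

identical(by_ifelse, by_which) && identical(by_which, by_logical)
```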
Write a function that will take as its first argument a data vector (e.g., something produced by runif(1000)), and as its second argument a keyword which tells the function whether to plot a histogram or a scatterplot.
x <- exp(rnorm(1000)*.3)
myplot <- function(x,type="scatter")
{
  if(type=="scatter")
  {
    plot(x)   ##Plot a regular plot here
  }else if(type=="histogram")
  {
    hist(x) ##Plot a histogram
  }else{
    warning("Unknown plot type: ", type)   ##Say which keyword was rejected
  }
}
myplot(x,"histogram")
myplot(x,"scatter")
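One possible variation, not part of the solution above, is to let base R's match.arg() validate the keyword. The allowed values are listed in the function signature, the first one acts as the default, and an unrecognized keyword stops with an error rather than a warning. The name myplot2 is hypothetical.

```r
myplot2 <- function(x, type = c("scatter", "histogram")) {
  type <- match.arg(type)   # resolves to "scatter" by default; errors on other values
  if (type == "scatter") {
    plot(x)
  } else {
    hist(x)
  }
  invisible(type)           # return the resolved keyword without printing it
}

x <- exp(rnorm(1000) * 0.3)
myplot2(x)                  # scatterplot: the first listed choice is the default
myplot2(x, "histogram")
```

match.arg() also accepts unambiguous abbreviations, so myplot2(x, "hist") draws the histogram as well.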
Write a new mean function that does not
return an error when given a factor.  Rather, it returns
the modal (most common) value of that factor.  Then use
that function in the lapply and sapply on x2.
Use:
x2 <- data.frame(a=runif(100),b=runif(100),c= as.factor(sample(LETTERS,100,replace=T)))
x2 <- data.frame(a=runif(100),b=runif(100),c= as.factor(sample(LETTERS,100,replace=T)))
newmean <- function(data)
{
  if(is.factor(data))
  {
    tab <- table(data)
    names(tab)[which.max(tab)]
  } else {
    return(mean(data) )
  }
}
newmean(x2$a)
[1] 0.4977649
newmean(x2$c)
[1] "K"
lapply(x2,newmean)
$a
[1] 0.4977649
$b
[1] 0.4403626
$c
[1] "K"
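lapply returns a list, so the two means stay numeric and the mode stays a string. sapply behaves differently when the results are of mixed type: it simplifies them into a single atomic vector, and an atomic vector can hold only one type, so the numbers are converted to character strings. A minimal sketch with a small made-up data frame (toy is a hypothetical name; newmean is repeated so the block runs on its own):

```r
newmean <- function(data) {
  if (is.factor(data)) {
    tab <- table(data)
    names(tab)[which.max(tab)]   # most frequent level
  } else {
    mean(data)
  }
}

toy <- data.frame(a = c(1, 2, 3),
                  c = factor(c("u", "v", "v")))

as_list <- lapply(toy, newmean)
as_vec  <- sapply(toy, newmean)

class(as_list$a)   # "numeric": the list keeps the number as a number
class(as_vec)      # "character": everything was coerced to strings
```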
sapply(x2,newmean)
                  a                   b                   c 
"0.497764854519628" "0.440362612789031"                 "K" 
Looping and iteration are methods for repeating some code or operation many times. Usually, iteration refers to repeating an operation across elements of a data set, and looping is more general.
Important methods for this:
lapply, sapply
tapply and aggregate
Methods to avoid unless you know what you are doing:
while, repeat
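tapply and aggregate both apply a function to subsets of the data defined by a grouping variable. A minimal sketch on a made-up data frame (the names scores, group, and value are hypothetical), computing the mean of value within each group:

```r
scores <- data.frame(group = c("a", "a", "b", "b", "b"),
                     value = c(1, 3, 10, 20, 30))

# tapply: apply mean to value within each level of group
group_means <- tapply(scores$value, scores$group, mean)
group_means

# aggregate: the same summary, returned as a data frame
agg <- aggregate(value ~ group, data = scores, FUN = mean)
agg
```

tapply returns a named array with one entry per group, while aggregate returns a data frame with one row per group, which is often easier to merge or plot.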
The for keyword iterates a block of code over a set of values.
j <- 1
for(i in 1:1000000)
{
  j <- j + runif(1)  
}
print(j)
[1] 500314.9
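Because R functions operate on whole vectors, the same total can usually be computed without an explicit loop. A rough sketch, using a smaller n than above to keep it quick; resetting the seed makes both versions draw the same random numbers (the names j_loop and j_vec are hypothetical):

```r
n <- 100000

set.seed(1)
j_loop <- 1
for (i in 1:n) {
  j_loop <- j_loop + runif(1)   # one draw per pass through the loop
}

set.seed(1)
j_vec <- 1 + sum(runif(n))      # all draws at once, then a single sum

all.equal(j_loop, j_vec)        # equal up to floating-point rounding
```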
Version 1
x <- sample(LETTERS)
out <- ""
for(i in seq_along(x))
  out <- paste(out, x[i],sep="")
out
[1] "SHBOKAWGXILPYDMRTEJVFZQUNC"
Version 2
out <- ""
for(i in x)
  out <- paste(out, i,sep="")
out
[1] "SHBOKAWGXILPYDMRTEJVFZQUNC"
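Both loop versions build the string one letter at a time. For this particular task paste can also do the whole job in one call through its collapse argument. A short sketch comparing the two approaches (out_loop and out_vec are hypothetical names):

```r
x <- sample(LETTERS)

# Loop version, in the style of Version 2
out_loop <- ""
for (letter in x) {
  out_loop <- paste(out_loop, letter, sep = "")
}

# Vectorized version: collapse joins all elements of x into one string
out_vec <- paste(x, collapse = "")

identical(out_loop, out_vec)
```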
Both ifelse and which are useful for recoding:
vals <- sample(c("man","WOMAN"),10,replace=T)
coded <- ifelse(vals=="man",1,2)
coded2 <- vals
coded2[which(vals=="man")] <- "MAN"
coded
 [1] 2 2 1 1 2 1 2 2 2 2
coded2
 [1] "WOMAN" "WOMAN" "MAN"   "MAN"   "WOMAN" "MAN"   "WOMAN" "WOMAN"
 [9] "WOMAN" "WOMAN"
Create a series of 1,000,000 letters of the alphabet using
items <- sample(letters,1000000,replace=T)
Then recode the series, replacing 'a' values with an 'A', and 'b' values with a 'B'. Write several versions: one that uses an if statement, one that uses ifelse,
and one that uses which.
Some topics from discussion