### Answer Key ###
### Remember there is always more than one way to do something (even simple tasks) in R. That's what makes it a flexible, if difficult to learn, language. So the caveat with this "answer" key is that this is only my approach to the problem.

### Ecol592: Introduction to R
### Assignment 1
### Due: February 4, 2014

### Instructions: Answer the following questions by starting from a blank script file. Turn in both the answers to the questions and your code. Be sure to use adequate amounts of comments and metadata so that I (and you in the future) know what is going on.

# Basic Questions
# For this assignment, you'll be using the dataset built into R called 'trees'. First things first: how do you look at these data to see what is contained within the variable?

#1) Describe the dataset using R functions.
head(trees)
dim(trees)
str(trees)
# These 3 are especially useful, but you could also look at names(trees), summary(trees), and many others

#2) Access the 'Girth' column in two different ways.
# Here are 3:
trees$Girth
trees[, 1]
trees[, "Girth"]

#3) Rename the 'Girth' column to 'DBH' (diameter at breast height).
# My initial approach:
names(trees)[1] <- "DBH"
# After reading a few clever answers from your assignments:
names(trees)[names(trees) == "Girth"] <- "DBH"
# The second way makes the name switch more explicit. When coming back to this code, it will be obvious which column was changed to what.

#4) Convert the column data into metric units (currently in inches, feet, cubic feet). Do this by "writing over" the current columns.
# Inches to centimeters
trees$DBH <- 2.54 * trees$DBH
# Or someone found the cm() function, which converts inches to centimeters
# trees$DBH <- cm(trees$DBH)
# Feet to inches to centimeters
trees$Height <- 12 * trees$Height * 2.54
# Cubic feet to cubic centimeters: cm(12) is the number of centimeters in one foot
trees$Volume <- trees$Volume * cm(12)^3

#5) Add a 4th column that calculates the basal area of each tree in square centimeters.
# Area of a circle: radius squared times pi, with radius = DBH / 2
trees$basal.area <- (trees$DBH / 2) ^ 2 * pi

# Challenge questions.
#LovingR

#6) Write some code that accesses the last row of 'trees'.
dim(trees)
# 31 rows so just put in 31 for the row index
trees[31, ]

#7) Create a new data frame in the same format as the 'trees' data frame representing 3 new measured trees (just make up numbers for the DBH, Height, and Volume). Append it to the end of trees.
DBH <- c(20.3, 16.4, 18.7)
temporary.trees <- data.frame(DBH = DBH,
                              Height = c(1200, 987, 1024),
                              Volume = c(176000, 145000, 156000),
                              basal.area = (DBH / 2)^2 * pi)
new.trees <- rbind(trees, temporary.trees)
tail(new.trees)

#8) Does your code from question 6 still access the last row of trees? Rewrite that code so it is more flexible and will always access the final row of the trees data frame.
# Nope, not for the appended data frame: row 31 is no longer the last row of new.trees. Let's use nrow(), which returns the number of rows in the data frame, as the row index in bracket notation:
trees[nrow(trees), ]
new.trees[nrow(new.trees), ]

#9) Figure out how to "sort" the new, appended data frame by the 'Height' column. Make sure that you don't lose the association between each tree's height and its corresponding DBH and volume. Now sort it in reverse order by 'Height'.
# order() returns the row indices that put Height in ascending order; indexing whole rows keeps each tree's measurements together.
new.trees[order(new.trees$Height), ]
new.trees[order(-new.trees$Height), ]
# or
new.trees[order(new.trees$Height, decreasing = TRUE), ]

#10) Write some code that will always access the row of 'trees' that represents the tree with the maximum volume.
# Using a conditional statement to get a logical indexing vector
trees[trees$Volume == max(trees$Volume), ]
# Using the which.max() function, which returns the index number of the vector element with the greatest value.
trees[which.max(trees$Volume), ]
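
# A further note on question 10 (a minimal sketch, not part of the original key):
# the two approaches can give different answers when the maximum is tied.
# The data frame 'tie.demo' and its numbers are made up purely for illustration.
tie.demo <- data.frame(DBH = c(30, 45, 45.5), Volume = c(500, 900, 900))

# Logical indexing keeps every row whose Volume equals the maximum.
all.max <- tie.demo[tie.demo$Volume == max(tie.demo$Volume), ]

# which.max() returns only the index of the first maximum, so one row comes back.
first.max <- tie.demo[which.max(tie.demo$Volume), ]

nrow(all.max)
nrow(first.max)
# Which behavior you want depends on the question being asked: all tied trees, or just one.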