---
title: "Subsetting Data in R - Lab"
output: html_document
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
In this lab you can use the interactive console to explore, but please record your commands here. Remember that anything you type here can be "sent" to the console with Cmd-Enter (macOS) or Ctrl-Enter (Windows/Linux), but only inside the ```{r}``` areas.
```{r, message = FALSE}
library(dplyr)
library(tidyverse)
library(jhur)
```
# Part 1
1. Check to see if you have the `mtcars` dataset
```{r}
mtcars
```
2. What class is `mtcars`?
```{r}
class(mtcars)
```
3. How many observations (rows) and variables (columns) are in the `mtcars` dataset?
```{r}
dim(mtcars)
nrow(mtcars)
ncol(mtcars)
glimpse(mtcars)
```
4. Copy `mtcars` into an object called `cars` and rename `mpg` in `cars` to `MPG`. Use `rename()`.
```{r}
cars = mtcars
cars = rename(cars, MPG = mpg)
head(cars)
```
5. Convert the column names of `cars` to all upper case. Use `rename_all()` and the `toupper` command (or `colnames`).
```{r}
cars = rename_all(cars, toupper)
head(cars)
```
```{r alternative}
cars = mtcars
cn = colnames(cars) # extract column names
cn = toupper(cn) # make them uppercase
colnames(cars) = cn # reassign
head(cars)
```
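As a side note to question 5, `rename_all()` is marked as superseded in dplyr 1.0.0 and later, where `rename_with()` covers the same use case. The chunk below is a minimal sketch assuming dplyr >= 1.0.0 is installed; `cars_upper` and `cars_partial` are hypothetical names chosen so the lab's `cars` object is left untouched.

```{r}
library(dplyr)

# Apply toupper() to every column name (the default is .cols = everything())
cars_upper = rename_with(mtcars, toupper)
head(cars_upper)

# Restrict the renaming to a subset of columns with a select helper
cars_partial = rename_with(mtcars, toupper, .cols = starts_with("d"))
colnames(cars_partial)
```

The `.cols` argument is what makes `rename_with()` handy when only some columns need a new naming convention.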
# Part 2
You can create a column called `car` using the `rownames_to_column` function.
```{r}
cars = rownames_to_column(mtcars, var = "car")
```
6. Subset the columns from `cars` that end in `"p"` and call it `pvars`, use `ends_with()`.
```{r}
pvars = select(cars, ends_with("p"))
```
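`ends_with()` is one of several select helpers that match column names by pattern. The sketch below shows a few of them side by side; `cars_demo` is a hypothetical copy built the same way as `cars`, so the chunk runs on its own and does not modify the lab objects.

```{r}
library(dplyr)

# Hypothetical stand-alone copy of the lab's `cars` data
cars_demo = tibble::rownames_to_column(mtcars, var = "car")

# Columns whose names end in "p"
p_cols = colnames(select(cars_demo, ends_with("p")))
p_cols # "disp" "hp"

# Columns whose names start with "d"
d_cols = colnames(select(cars_demo, starts_with("d")))
d_cols # "disp" "drat"

# Columns whose names contain "a" anywhere
a_cols = colnames(select(cars_demo, contains("a")))
a_cols # "car" "drat" "am" "gear" "carb"
```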
7. Create a subset of the data that only contains the columns: `wt`, `qsec`, and `hp` and assign this object to `carsSub` - what are the dimensions of this dataset? Use `select()` (and `dim`):
```{r}
carsSub = select(mtcars, wt, qsec, hp)
dim(carsSub)
```
8. Convert the column names of `carsSub` to all upper case. Use `rename_all()`, and the `toupper` command (or `colnames`)
```{r}
carsSub = rename_all(carsSub, toupper)
```
# Part 3
9. Subset the rows of `cars` that get more than 20 miles per gallon (`mpg`) of fuel efficiency. How many are there? Use `filter()`.
```{r}
cars_mpg = filter(cars, mpg > 20) # threshold matches the question: more than 20 mpg
dim(cars_mpg)
nrow(cars_mpg)
```
There are `r nrow(cars_mpg)` cars.
Without the `r` prefix inside the backticks, as in `nrow(cars_mpg)`, the expression is printed literally rather than evaluated.
```{r}
cars %>% filter(mpg > 20) %>% nrow()
filter(cars, mpg > 20) %>% nrow()
```
10. Subset the rows that get less than 16 miles per gallon (`mpg`) of fuel efficiency and have more than 100 horsepower (`hp`) - how many are there?
```{r}
filter(cars, mpg < 16 & hp > 100)
nrow(filter(cars, mpg < 16 & hp > 100))
nrow(filter(cars, mpg < 16, hp > 100))
cars %>% filter(mpg < 16, hp > 100) %>% nrow()
```
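The answers to question 10 rely on the fact that conditions separated by commas in `filter()` are combined with AND, just like `&`. The sketch below checks that equivalence and contrasts it with `|` (OR); `cars_demo` and the `n_*` objects are hypothetical names used only for this illustration.

```{r}
library(dplyr)

# Hypothetical stand-alone copy of the lab's `cars` data
cars_demo = tibble::rownames_to_column(mtcars, var = "car")

# AND written two ways: with `&` and with a comma
n_amp = nrow(filter(cars_demo, mpg < 16 & hp > 100))
n_comma = nrow(filter(cars_demo, mpg < 16, hp > 100))
n_amp == n_comma # TRUE: both forms keep the same rows

# OR keeps rows that satisfy at least one condition,
# so it can never return fewer rows than AND
n_or = nrow(filter(cars_demo, mpg < 16 | hp > 100))
n_or >= n_amp # TRUE
```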
# Part 4
11. Create a subset from the `cars` data that only contains the columns:
`wt`, `qsec`, and `hp` (plus `car`, kept so each row stays identifiable) for only the cars with 8 cylinders
and reassign this object to `carsSub`. What are the dimensions of this dataset?
```{r}
# Step by step: filter the rows first, then select the columns
carsSub = filter(cars, cyl == 8)
carsSub = select(carsSub, wt, qsec, hp, car)
dim(carsSub)
# Same result in a single pipeline
carsSub = cars %>%
  filter(cyl == 8) %>%
  select(wt, qsec, hp, car)
dim(carsSub)
```
12. Re-order the rows of `carsSub` by weight in increasing order. Use `arrange()`
```{r}
carsSub = arrange(carsSub, wt)
```
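`arrange()` sorts in increasing order by default, as in question 12. For the reverse order, or for tie-breaking on a second column, a sketch along these lines should work; `cars8_demo`, `by_wt_desc`, and `by_hp_wt` are hypothetical names, and the data is rebuilt here so the chunk is self-contained.

```{r}
library(dplyr)

# Hypothetical stand-alone version of the 8-cylinder subset
cars8_demo = mtcars %>%
  tibble::rownames_to_column(var = "car") %>%
  filter(cyl == 8) %>%
  select(wt, qsec, hp, car)

# Heaviest cars first: wrap the column in desc()
by_wt_desc = arrange(cars8_demo, desc(wt))
head(by_wt_desc)

# Sort by horsepower (increasing), breaking ties by weight (decreasing)
by_hp_wt = arrange(cars8_demo, hp, desc(wt))
head(by_hp_wt)
```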
13. Create a new variable in `carsSub` called `wt2`, which is equal to `wt^2`, using `mutate()`. Use piping `%>%`:
```{r}
carsSub %>% mutate(wt2 = wt^2) # prints the result only; carsSub itself is unchanged
carsSub = carsSub %>% mutate(wt2 = wt^2) # reassign to keep the new column
```
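One detail of `mutate()` that is easy to miss: columns created earlier in a call can be used by later expressions in the same call. The sketch below illustrates this with a hypothetical `wt_demo` object and a hypothetical `wt2_root` column, neither of which is part of the lab.

```{r}
library(dplyr)

wt_demo = mtcars %>%
  select(wt) %>%
  mutate(
    wt2 = wt^2,           # new column
    wt2_root = sqrt(wt2)  # uses wt2, created on the line above
  )
head(wt_demo)

# Taking the square root of the square recovers the original weights
all.equal(wt_demo$wt, wt_demo$wt2_root)
```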