---
title: "Subsetting Data in R - Lab"
output: html_document
editor_options: 
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

In this lab you can use the interactive console to explore, but please record your commands here. Remember that anything you type here can be "sent" to the console with Cmd-Enter (macOS) or Ctrl-Enter (Windows/Linux), but only inside the ```{r}``` areas.

```{r, message = FALSE}
library(dplyr)
library(tidyverse)
library(jhur)
```

# Part 1

1. Check to see if you have the `mtcars` dataset 
```{r}
mtcars
```


2. What class is `mtcars`?
```{r}
class(mtcars)
```

3. How many observations (rows) and variables (columns) are in the `mtcars` dataset?

```{r}
dim(mtcars)
nrow(mtcars)
ncol(mtcars)
glimpse(mtcars)
```

4. Copy `mtcars` into an object called `cars` and rename `mpg` in `cars` to `MPG`. Use `rename()`.

```{r}
cars = mtcars
cars = rename(cars, MPG = mpg)
head(cars)
```

5. Convert the column names of `cars` to all upper case. Use `rename_all()` and the `toupper` command (or `colnames`).

```{r}
cars = rename_all(cars, toupper)
head(cars)
```

```{r alternative}
cars = mtcars
cn = colnames(cars) # extract column names
cn = toupper(cn) # make them uppercase
colnames(cars) = cn # reassign
head(cars)
```
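
Newer versions of dplyr mark `rename_all()` as superseded in favor of `rename_with()`. The chunk below is a minimal sketch of the same renaming, assuming dplyr 1.0.0 or later is installed. The names `cars_upper` and `cars_some` are illustrative only and are not used elsewhere in the lab.

```{r}
library(dplyr)

# apply toupper() to every column name
cars_upper = rename_with(mtcars, toupper)
head(cars_upper)

# apply toupper() only to selected columns (here mpg and cyl)
cars_some = rename_with(mtcars, toupper, .cols = c(mpg, cyl))
colnames(cars_some)
```

The `.cols` argument is what makes `rename_with()` convenient when only some of the names should change.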


# Part 2

You can create a column called `car` using the `rownames_to_column` function. 

```{r}
cars = rownames_to_column(mtcars, var = "car")
```

6. Subset the columns from `cars` that end in `"p"` and call it `pvars`. Use `ends_with()`.

```{r}
pvars = select(cars, car, ends_with("p"))
```
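
`ends_with()` has sibling helpers that match other parts of a column name. The following is a small sketch using `starts_with()` and `contains()` on `mtcars`; the object names `dvars` and `avars` are made up for illustration.

```{r}
library(dplyr)

# columns whose names start with "d"
dvars = select(mtcars, starts_with("d"))
colnames(dvars)

# columns whose names contain the letter "a" anywhere
avars = select(mtcars, contains("a"))
colnames(avars)
```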

7. Create a subset of the data that only contains the columns: `wt`, `qsec`, and `hp` and assign this object to `carsSub` - what are the dimensions of this dataset? Use `select()` (and `dim`):

```{r}
carsSub = select(cars, car, wt, qsec, hp)
dim(carsSub)
```

8. Convert the column names of `carsSub` to all upper case.  Use `rename_all()`, and the `toupper` command (or `colnames`)

```{r}
carsSub = rename_all(carsSub, toupper)
```


# Part 3

9. Subset the rows of cars that get more than 20 miles per gallon (`mpg`) of fuel efficiency - how many are there? Use `filter()`

```{r}
cars_mpg = filter(cars, mpg > 20)
dim(cars_mpg)
nrow(cars_mpg)
glimpse(cars_mpg)
# filter(cars, mpg > 20)
```

There are `r nrow(cars_mpg)` cars. 
There are `nrow(cars_mpg)` cars. 


```{r}
cars %>% filter(mpg > 20) %>% nrow()
filter(cars, mpg > 20) %>% nrow()

```
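
As a cross-check on the count, the same condition can be evaluated in base R. This sketch works on `mtcars` directly, and `over20` and `n_filter` are illustrative names.

```{r}
library(dplyr)

# a logical vector: TRUE where mpg is above 20
over20 = mtcars$mpg > 20

# summing a logical vector counts the TRUE values
sum(over20)

# the same count via filter(), compared with the base R count
n_filter = mtcars %>% filter(mpg > 20) %>% nrow()
n_filter == sum(over20)
```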

10. Subset the rows that get less than 16 miles per gallon (`mpg`) of fuel efficiency and have more than 100 horsepower (`hp`) - how many are there?
```{r}
filter(cars, mpg < 16 & hp > 100)
nrow(filter(cars, mpg < 16 & hp > 100))
nrow(filter(cars, mpg < 16, hp > 100))
cars %>% filter(mpg < 16, hp > 100) %>% nrow()
```
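
Inside `filter()`, separating conditions with a comma behaves like `&`: a row is kept only if every condition is true. To keep rows where at least one condition is true, use `|`. The chunk below is a sketch of the difference, with `both` and `either` as illustrative names.

```{r}
library(dplyr)

both = filter(mtcars, mpg < 16 & hp > 100)    # both conditions hold
either = filter(mtcars, mpg < 16 | hp > 100)  # at least one condition holds

nrow(both)
nrow(either)
```

Because every row that satisfies both conditions also satisfies at least one, `either` can never have fewer rows than `both`.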


# Part 4

11. Create a subset from the `cars` data that only contains the columns:
	`wt`, `qsec`, and `hp` for only the cars with 8 cylinders
	and reassign this object to `carsSub` - what are the dimensions of this dataset?

```{r}
# step by step: filter the rows, then select the columns
carsSub = filter(cars, cyl == 8) 
carsSub = select(carsSub, wt, qsec, hp, car)
dim(carsSub)
# the same result in one pipeline
carsSub = cars %>% 
  filter(cyl == 8) %>% 
  select(wt, qsec, hp, car)
dim(carsSub)
```
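
The order of the steps matters here: `cyl` is not among the selected columns, so filtering on it after `select()` would fail unless `cyl` is kept until the filter has run. The sketch below shows one way to select first anyway. It rebuilds the data as `cars_demo` so that it does not touch `cars` or `carsSub`; both `cars_demo` and `alt` are illustrative names.

```{r}
library(dplyr)
library(tibble)

cars_demo = rownames_to_column(mtcars, var = "car")

# keep cyl through the first select, filter on it, then drop it
alt = cars_demo %>%
  select(wt, qsec, hp, car, cyl) %>%
  filter(cyl == 8) %>%
  select(-cyl)
dim(alt)
```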


12. Re-order the rows of `carsSub` by weight in increasing order. Use `arrange()`

```{r}
carsSub = arrange(carsSub, wt)
```
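
`arrange()` sorts in increasing order by default. For decreasing order, wrap the column in `desc()`, and to break ties, list more than one column. This is a short sketch on `mtcars`, with `by_wt_desc` and `by_cyl_wt` as illustrative names.

```{r}
library(dplyr)

# heaviest cars first
by_wt_desc = arrange(mtcars, desc(wt))
head(by_wt_desc$wt)

# sort by cylinders, then by weight within each cylinder group
by_cyl_wt = arrange(mtcars, cyl, wt)
head(select(by_cyl_wt, cyl, wt))
```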


13. Create a new variable in `carsSub` called `wt2`, which is equal to `wt^2`, using `mutate()`. Use piping `%>%`:

```{r}
# prints the result but does not save it
carsSub %>% mutate(wt2 = wt^2)
# assign the result back to keep the new column
carsSub = carsSub %>% mutate(wt2 = wt^2)
```
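
`mutate()` can create several columns in one call, and a later column may use one created earlier in the same call. The chunk below is a minimal sketch on `mtcars`; `cars_more` and `wt2_root` are illustrative names that are not part of the lab.

```{r}
library(dplyr)

cars_more = mtcars %>%
  mutate(
    wt2 = wt^2,           # squared weight
    wt2_root = sqrt(wt2)  # uses wt2, created one line above
  )
head(select(cars_more, wt, wt2, wt2_root))
```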