Let’s get familiar with R

In this section we will investigate the following question: How does R understand data? We will work through four parts:

  1. How to use this script and R
  2. Creating objects
  3. Sequences and vectors (1-D)
  4. Data frames

Part 1: How to use this R script

A script is a text document that contains instructions and commands. The # symbol is used to leave comments, which R will not try to interpret as commands.

The console (below) is for submitting commands to be interpreted in R. To run a command in the console, you can copy+paste it into the console and press enter.

Copy and paste the following into the console and run it:

>     print("the instructor's name is Sydney")  
[1] "the instructor's name is Sydney"

Run a single line of your script in the console by placing your cursor on the line you want to submit and clicking the “Run” button at the top right. You can also use the shortcut key strokes cmd+enter (mac) or ctrl+enter (PC) to run a single line at a time.

The two other most important keyboard shortcuts that you’ll want to use are the Tab key to auto-complete your typing at the command line and ctrl+up arrow or cmd+up arrow to access the most recently typed commands.

You can also select only part of a line to have it run on the console. Try running the following command without copy+paste: print("my name is _____")

R can also be used to perform calculations, such as the following:

> 5+1/3
[1] 5.333333

What rules does R apply for the order of operations and how do you find out?

Let’s modify the statement above to see if adding parentheses changes the result:

> (5+1)/3
[1] 2

Does it matter if there are spaces added into this?

>  ( 5 + 1 ) / 3
[1] 2

That shows us that the spaces did not matter for the calculation.
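To answer the order-of-operations question: R follows the usual arithmetic precedence, and the built-in help page ?Syntax lists the full rules. The short sketch below uses made-up object names (no_parens, with_parens, and so on) purely for illustration:

```r
# Exponentiation binds tighter than multiplication and division,
# which in turn bind tighter than addition and subtraction.
no_parens   <- 5 + 1 / 3      # the division happens first
with_parens <- (5 + 1) / 3    # parentheses force the addition first
power_first <- 2 * 3^2        # 3^2 is evaluated before multiplying
neg_power   <- -2^2           # ^ binds tighter than the minus sign

print(no_parens)     # 5.333333
print(with_parens)   # 2
print(power_first)   # 18
print(neg_power)     # -4

# Type ?Syntax at the console to read the full precedence table.
```

When in doubt, adding parentheses makes the intended order explicit.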

R also has some pre-defined mathematical constants that you can use, such as pi.

What is pi times pi ?

> pi*pi
[1] 9.869604
> pi^2   # this does the same thing because ^ is, here, interpreted as "taken to the exponent"
[1] 9.869604

Part 2: Creating objects

Objects are like shortcuts: they store data so that you do not have to re-type it. An object is created only once something has been assigned to it. Anything can be stored in an object, including figures! Let’s repeat our simple math calculation above, this time using objects. If we want to calculate (5+1)/3 using objects, it needs to look like this: (a+b)/c

The objects a, b, and c do not exist yet, so we need to assign values to them in order to create them. R interprets the less-than symbol followed by a dash (<-) as “assign”. So we need to do the following:

> a <- 5   # assign the number 5 to a
> b <- 1   # assign number 1 to b
> c <- 3   # assign 3 to c

As you are assigning these numbers to objects, they appear in your environment (top right). These objects are not saved to your hard drive; they are stored only in your computer’s memory.

NOTE: if you assign something to an object that already exists, R will do what you tell it and overwrite that object with the new assignment.
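As a small illustration of overwriting, the sketch below uses a made-up object called score (so that a, b, and c stay untouched). R replaces the old value without any warning, and even the type of data can change:

```r
# 'score' is a hypothetical object name used only for this illustration
score <- 5            # create the object
score <- "five"       # assign again: the number 5 is replaced silently
print(score)          # "five"
print(class(score))   # "character"
```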

Now we can execute our calculation using objects instead of numbers. Try it!

> (a+b)/c
[1] 2

Avoid creating object names that start with a number, because R reads the leading digits as a number and then does not know what to do with the rest of the name. Try this:

> 2foxes <- 1   
Error: <text>:1:2: unexpected symbol
1: 2foxes
     ^

The error here tells us that something went wrong and R cannot proceed.
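For comparison, here is a short sketch of names that R does accept. The names are made up for illustration; make.names() is a base R function that shows how R itself would repair an invalid name:

```r
# Names that start with a letter are fine, even if they contain digits
foxes2    <- 1
two_foxes <- 2
two.foxes <- 3

# make.names() turns an invalid name into a valid one
fixed <- make.names("2foxes")
print(fixed)   # "X2foxes"
```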

If we want to assign (a+b)/c to a new object called ‘answer’ – what will the object contain? Find out:

> answer <- (a+b)/c

Take a look at the object ‘answer’ by typing the name into the command line:

> answer
[1] 2

What would you get if you multiplied answer by 2?

> answer*2
[1] 4

The examples above dealt with numeric values assigned to objects. We can also store character data in objects. Since some words and phrases can contain spaces or other punctuation, we need to place our words and phrases inside quotation marks.

Let’s use my name for this exercise. Let’s create two objects, one for my first name and another for my last name.

> first.name <- "Sydney"
> last.name <- 'Everhart'  # Single quotes work too!

We now have those two objects. Let’s look at them.

> first.name
[1] "Sydney"
> last.name
[1] "Everhart"

Since we each have a first name and a family name, I want you to modify these objects so that instead of my name, they contain your name.

> first.name <- "Sydney"  #Replace my name with your own name
> last.name <- "Everhart"

Did this work? Let’s look at the two objects:

> first.name
[1] "Sydney"
> last.name
[1] "Everhart"

Using the function c(), we can tell R to combine these two objects. This function combines the value in the first object with the value in the second object and returns them together as a single object. Let’s try it:

> c(first.name, last.name)
[1] "Sydney"   "Everhart"

Notice how the names are returned inside quotation marks, which tells us that these are interpreted as character data in R. You’ll also notice that each name sits inside its own pair of quotes; that’s because c() combined the names into a single vector that contains two elements, your first and your last name. This brings us to the next part in our introduction, vectors.

Part 3: Vectors and sequences

Up to here, the objects we’ve created only contained a single element. You can store more than one element in a 1-dimensional object of unlimited length. Let’s create an object that is a vector of our first and last names using the two objects that we created previously.

Avoid re-typing your commands. Since the last command that we ran contained what we want, we can simply use the up arrow to access the most recently submitted command and modify it. You can also access the History tab in the top right panel of RStudio or, at the command line, access a list of the most recent commands using the cmd + up arrow OR the ctrl + up arrow.

> name <- c(first.name, last.name)

We can inspect this object by typing name at the command line. We can inspect the structure of this object using the function str() on name.

> str(name)
 chr [1:2] "Sydney" "Everhart"

This shows us that this is a vector because the elements in it are ordered from 1 to 2, as shown by the [1:2]. This also tells us that this is a character vector, which is indicated by the chr label. We also see the two elements in this vector, which are your first and last name.

What is the length of your name? We can find out using the function length():

> length(name)
[1] 2
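Note that length() counts the elements of the vector, not the letters in each name. If you want the number of characters instead, base R provides nchar(). A minimal sketch, using a made-up object example_name so that your own name object is left alone:

```r
# Illustrative values; substitute your own name
example_name <- c("Sydney", "Everhart")

print(length(example_name))   # 2: the number of elements in the vector
print(nchar(example_name))    # 6 8: the number of characters in each element
```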

Let’s compare this to a vector that contains only numeric data. For this example, let’s create three objects to represent today’s date in numbers for the month (05), day (24), and year (2017).

> month <- 05
> day <- 24
> year <- 2017

Combine those three objects using the combine function, c():

> today <- c(month, day, year)

Inspect this object by typing the name today at the command line. You’ll see that R has eliminated the zero that precedes the 5 and has kept the order we provided for these elements in the vector. Let’s take a look at the structure of today.

> str(today)
 num [1:3] 5 24 2017

You’ll notice that the vector has three elements [1:3] and it contains only numeric data.

Let’s do the same thing using the name May for month and see how that changes our vector. Notice that we are not modifying the object month, we are simply combining our two existing objects with the word “May”.

> c("May", day, year)   
[1] "May"  "24"   "2017"

In this case we didn’t re-assign the object named today. To inspect the structure of this vector, we can wrap the statement within the str() function, as shown below. We also want to inspect the data class (i.e., whether numeric or character) using the function class(). Don’t forget to use the up-arrow to access the last line that you ran!

> str(c("May", day, year)) # this shows us the structure of the object
 chr [1:3] "May" "24" "2017"
> class(c("May", day, year))
[1] "character"

Notice how R is trying to keep our data organized according to type. Rather than coding this vector as containing numbers and characters, it has decided that because it can’t call everything in our vector a number that it will call everything characters. This process is called coercion.
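Coercion follows a fixed direction: logical values can become numbers, and numbers can become characters, but never the other way around unless you ask for it. The sketch below uses made-up object names to show both cases, plus an explicit conversion back with as.numeric():

```r
num_and_chr <- c(1, "a")     # number + character: everything becomes character
lgl_and_num <- c(TRUE, 2)    # logical + number: TRUE becomes 1

print(num_and_chr)           # "1" "a"
print(class(num_and_chr))    # "character"
print(lgl_and_num)           # 1 2
print(class(lgl_and_num))    # "numeric"

# Converting back is explicit: as.numeric() turns "24" into the number 24
day_chr <- "24"
print(as.numeric(day_chr) + 1)   # 25
```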

Let’s say we wanted to create a table that showed every date this month:

  day   month   year
  1     5       2017
  2     5       2017
  3     5       2017
  ...

We know there are 31 days in the month, so we can modify the object day to contain all of the 31 days in this month. Instead of typing each number out by hand, we can place a colon (:) between 1 and 31, which is a shortcut in R for creating sequences of numbers.

> 1:31
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
[24] 24 25 26 27 28 29 30 31

You can see in the console that this created a sequence of 31 numbers from 1 to 31. Let’s go ahead and assign this to the object day.

> day <- 1:31

We don’t need to change the values of month and year; however, we need to repeat each of them 31 times, once for each day.

We can easily repeat the number 5 a total of 31 times using the function rep(), specifying how many times we should repeat this object. Let’s assign 5 to month and modify the object month to contain 31 copies.

> month <- 5
> month <- rep(month, times = 31) 

Let’s check to make sure that month is correct using the function length():

> length(month)
[1] 31

There are 31 elements in this vector and we can inspect individual elements in the vector based on their ordered position using square brackets:

> day[24]  
[1] 24
> month[24] # the number inside the brackets is the position of the element in the vector, not its value
[1] 5

In this case, the 24th element in day is 24, and the 24th element in month is 5 which confirms that we created this correctly.

Type day[32] into your R console. What do you get? What does it mean? Ask yourself the question, “Are there any months with 32 days?”
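Once you have tried it, compare with this small sketch on a made-up three-element vector. Asking for a position past the end of a vector does not produce an error in R; it returns NA, R’s marker for a missing value:

```r
short_vec <- c(10, 20, 30)   # hypothetical vector with three elements
beyond    <- short_vec[5]    # position 5 does not exist

print(beyond)                # NA
print(is.na(beyond))         # TRUE
```

Because no error is raised, it is worth checking length() before relying on a position.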

We can create the object year to contain 31 repeats of 2017, however, this time, let’s say we wanted to make sure that this object was always the same length as the number of days we have in a month. Instead of specifying 31, we can simply get that information using the length() function. Here, we’ll replace 31 with length(day), which is equivalent.

> year <- rep(2017, times = length(day))
> year
 [1] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017
[15] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017
[29] 2017 2017 2017
> length(year)
[1] 31

We now have three vectors to create our table and they are exactly the same length:

> length(day)
[1] 31
> length(month)
[1] 31
> length(year)
[1] 31

Part 4: Data frames

Remember that our goal here is to create a table with the columns “day”, “month”, and “year”. First, here’s a quick reminder of what we want this to look like:

  day   month   year
  1     5       2017
  2     5       2017
  3     5       2017
  ...

In order to create a data frame, we can use the command data.frame(). This function will create columns out of vectors that are all the same length. In the function, we just have to specify the columns.

> May <- data.frame(day = day, month = month, year = year)

Let’s inspect this new object in the same way as vectors:

> May
   day month year
1    1     5 2017
2    2     5 2017
3    3     5 2017
4    4     5 2017
5    5     5 2017
6    6     5 2017
7    7     5 2017
8    8     5 2017
9    9     5 2017
10  10     5 2017
11  11     5 2017
12  12     5 2017
13  13     5 2017
14  14     5 2017
15  15     5 2017
16  16     5 2017
17  17     5 2017
18  18     5 2017
19  19     5 2017
20  20     5 2017
21  21     5 2017
22  22     5 2017
23  23     5 2017
24  24     5 2017
25  25     5 2017
26  26     5 2017
27  27     5 2017
28  28     5 2017
29  29     5 2017
30  30     5 2017
31  31     5 2017
> length(May)
[1] 3

Using the length() function, we see it says 3. This is because May has three columns: day, month, and year. A data frame is a two-dimensional object which stores its information in rows and columns.
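The reason length() reports the number of columns is that a data frame is stored as a list with one element per column. A minimal sketch on a made-up data frame called toy:

```r
# Small hypothetical data frame with 4 rows and 2 columns
toy <- data.frame(id = 1:4, label = c("a", "b", "c", "d"))

print(is.list(toy))      # TRUE: each column is one list element
print(length(toy))       # 2: the number of columns
print(names(toy))        # "id" "label"
print(length(toy$id))    # 4: the length of a single column
```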

Because this is a 2-dimensional object, we can inspect the dimensions using the dim() function:

> dim(May)
[1] 31  3

This tells us that we have 31 rows and 3 columns. R also provides the nrow() and ncol() functions to make it easier to remember which is which:

> nrow(May)
[1] 31
> ncol(May)
[1] 3

What happens when we use the str() function?

> str(May)
'data.frame':   31 obs. of  3 variables:
 $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ month: num  5 5 5 5 5 5 5 5 5 5 ...
 $ year : num  2017 2017 2017 2017 2017 ...

We can see that it’s listing the columns we have in our table and showing us how they are represented. Notice the $ to the left of each column name. This is how we access the columns of the data frame:

> May$day
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
[24] 24 25 26 27 28 29 30 31
> May$month
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
> May$year
 [1] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017
[15] 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 2017
[29] 2017 2017 2017

You can see that these are the same as the vectors we created earlier.
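You may have noticed that str() labelled day as int but month and year as num. This comes from how the vectors were created: the colon operator produces whole-number (integer) values, while numbers typed directly are stored as decimal (double) values. A short sketch with made-up object names:

```r
from_colon <- 1:3            # the colon operator produces integers
from_c     <- c(1, 2, 3)     # numbers typed directly are stored as doubles

print(class(from_colon))     # "integer"
print(class(from_c))         # "numeric"
print(from_colon == from_c)  # TRUE TRUE TRUE: the values still compare equal
print(class(5L))             # "integer": the L suffix requests an integer
```

For everyday calculations the difference rarely matters, since R converts between the two as needed.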

Because this object is rather large, we didn’t get to see the top rows of the object. A quick way to look at the top of the object is with the head() function and if we wanted to look at the bottom, we would use tail().

> head(may)  # if this didn't work, double-check that you spelled the object name correctly
Error in head(may): object 'may' not found
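R is case sensitive, so may and May are different names, and only May exists. Using the capitalised name works. The sketch below rebuilds May with the same values so that it can run on its own:

```r
# Rebuild the May table so this example is self-contained
May <- data.frame(day = 1:31, month = rep(5, 31), year = rep(2017, 31))

print(head(May))          # first six rows (the default)
print(tail(May, n = 3))   # last three rows; n controls how many are shown
```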

Now that we have our table, the question becomes, how the heck do we inspect different elements?

Just like we can inspect the 24th element in the day vector using day[24], we can also use the brackets to subset a table. The only catch is that we have to use the coordinates of the row(s) and the column(s) we want. We can do this by specifying [row, column]. These are analogous to X and Y Cartesian coordinates. Let’s take a look at the elements in the 24th row, separately:

> May[24, 1] # day
[1] 24
> May[24, "month"] # you can use characters when the elements are named!
[1] 5
> May[24, 3] # year
[1] 2017

If we don’t specify a dimension, R will give us the entire contents of that dimension. Let’s look at the row that contains today’s date:

> May[24, ]
   day month year
24  24     5 2017

You can also use this to access just one column of the data frame. Let’s look at month:

> May[, 2]
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5

Notice, however, that this result is now a vector! This is because of a sneaky default option called drop = TRUE. R tries to “help” by removing the dimensions of your data frame if you choose only one column. If you want to keep this as a data frame, you can turn off this option inside the brackets:

> May[, 2, drop = FALSE]
   month
1      5
2      5
3      5
4      5
5      5
6      5
7      5
8      5
9      5
10     5
11     5
12     5
13     5
14     5
15     5
16     5
17     5
18     5
19     5
20     5
21     5
22     5
23     5
24     5
25     5
26     5
27     5
28     5
29     5
30     5
31     5
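There are also two bracket forms that make the choice between vector and data frame explicit without the drop argument. A minimal sketch, rebuilding May so it runs on its own (as_vector and as_frame are made-up names):

```r
# Rebuild the May table so this example is self-contained
May <- data.frame(day = 1:31, month = rep(5, 31), year = rep(2017, 31))

as_vector <- May[["month"]]   # double brackets (like $) extract the column itself
as_frame  <- May["month"]     # single brackets with one index keep a data frame

print(class(as_vector))       # "numeric"
print(class(as_frame))        # "data.frame"
print(dim(as_frame))          # 31 1
```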

Now that we’ve inspected the object May, let’s create the same thing for the month of June. How should we do this?

One option would be to create new objects for day, month, and year and combine them just like we did for May. What is the simplest method to do this, using the fewest number of steps?

We can simply make a copy of May with 30 days.

> June <- May[1:30, , drop = FALSE]  # Created new object called June that used rows 1:30

Inspect what we have now:

> str(June)
'data.frame':   30 obs. of  3 variables:
 $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ month: num  5 5 5 5 5 5 5 5 5 5 ...
 $ year : num  2017 2017 2017 2017 2017 ...
> tail(June) # we should have 30 days.
   day month year
25  25     5 2017
26  26     5 2017
27  27     5 2017
28  28     5 2017
29  29     5 2017
30  30     5 2017

We need to change the month column so that it says 6 instead of 5, how can we do this? Let’s just look at the column first:

> June$month
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5

We need to add 1 to each of these values, so let’s try that!

> June$month + 1
 [1] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

This worked, so now we just need to replace the values in the month column (June$month, which is the same as June[, 2]) with the new expression:

> June$month <- June$month + 1    # Did it work?
> str(June)
'data.frame':   30 obs. of  3 variables:
 $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ month: num  6 6 6 6 6 6 6 6 6 6 ...
 $ year : num  2017 2017 2017 2017 2017 ...
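The addition works because arithmetic in R is applied to every element of a vector at once. Since every value in the column should end up as 6, assigning that single value directly is another option; R recycles it to fill the whole column. A sketch on a rebuilt copy (June_alt is a made-up name, so your June object is not touched):

```r
# Hypothetical copy of the 30-row table, still holding month = 5
June_alt <- data.frame(day = 1:30, month = rep(5, 30), year = rep(2017, 30))

print(head(June_alt$month + 1))   # element-wise addition: 6 6 6 6 6 6

June_alt$month <- 6               # a single value is recycled down the column
print(unique(June_alt$month))     # 6
print(nrow(June_alt))             # 30: the number of rows is unchanged
```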

Let’s combine both of these tables into one. Of course, now that we have two dimensions, there are two ways we can combine them, by rows or by columns. R provides two functions that can help us with that called rbind() and cbind() which bind together rows and columns, respectively. Which one should we use? If you’re unsure, try both!

> cbind(May, June)
Error in data.frame(..., check.names = FALSE): arguments imply differing number of rows: 31, 30
> rbind(May, June)
   day month year
1    1     5 2017
2    2     5 2017
3    3     5 2017
4    4     5 2017
5    5     5 2017
6    6     5 2017
7    7     5 2017
8    8     5 2017
9    9     5 2017
10  10     5 2017
11  11     5 2017
12  12     5 2017
13  13     5 2017
14  14     5 2017
15  15     5 2017
16  16     5 2017
17  17     5 2017
18  18     5 2017
19  19     5 2017
20  20     5 2017
21  21     5 2017
22  22     5 2017
23  23     5 2017
24  24     5 2017
25  25     5 2017
26  26     5 2017
27  27     5 2017
28  28     5 2017
29  29     5 2017
30  30     5 2017
31  31     5 2017
32   1     6 2017
33   2     6 2017
34   3     6 2017
35   4     6 2017
36   5     6 2017
37   6     6 2017
38   7     6 2017
39   8     6 2017
40   9     6 2017
41  10     6 2017
42  11     6 2017
43  12     6 2017
44  13     6 2017
45  14     6 2017
46  15     6 2017
47  16     6 2017
48  17     6 2017
49  18     6 2017
50  19     6 2017
51  20     6 2017
52  21     6 2017
53  22     6 2017
54  23     6 2017
55  24     6 2017
56  25     6 2017
57  26     6 2017
58  27     6 2017
59  28     6 2017
60  29     6 2017
61  30     6 2017

Notice how cbind gave us an error. What happened? Looks like rbind worked, so let’s assign that to a new object:

> spring <- rbind(May, June)

Inspect this object to ensure it was made correctly.

> str(spring)
'data.frame':   61 obs. of  3 variables:
 $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ month: num  5 5 5 5 5 5 5 5 5 5 ...
 $ year : num  2017 2017 2017 2017 2017 ...
> head(spring)
  day month year
1   1     5 2017
2   2     5 2017
3   3     5 2017
4   4     5 2017
5   5     5 2017
6   6     5 2017
> tail(spring)
   day month year
56  25     6 2017
57  26     6 2017
58  27     6 2017
59  28     6 2017
60  29     6 2017
61  30     6 2017
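Going back to the cbind() error: cbind() places tables side by side, so it needs the same number of rows in each, whereas rbind() stacks tables on top of each other, so it needs matching columns. The sketch below shows both on small made-up data frames (left, right, and short are hypothetical names) and uses tryCatch() to capture the error message instead of stopping:

```r
left  <- data.frame(x = 1:3)
right <- data.frame(y = c("a", "b", "c"))
short <- data.frame(y = c("a", "b"))

side_by_side <- cbind(left, right)   # works: both have 3 rows
print(side_by_side)

# Capture the error instead of stopping the script
msg <- tryCatch(cbind(left, short), error = function(e) conditionMessage(e))
print(msg)

stacked <- rbind(right, short)       # works: both have a column named y
print(nrow(stacked))                 # 5
```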

We now have a new object spring that contains only numeric data. Let’s revise this object so that it uses names for the month instead of numbers and so that we know what day of the week it is. We want it to look like this:

  day   month   year  wkday
  1     "May"   2017  "Mon"
  2     "May"   2017  "Tues"
  3     "May"   2017  "Wed"
  ...

Months need to be changed from the number 5 to “May” and from 6 to “June” in the second column. Let’s first look at the month column.

> spring$month
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6
[36] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

We want to specify only the cells in this column that are 5. We know that rows 1 to 31 contain 5’s and the rest contain 6’s, which means we can inspect those rows in the object spring:

> spring[1:31, "month"]     # May
 [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
> spring[-c(1:31), "month"] # June
 [1] 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

Notice that we used -c(1:31), what do you think this is doing? Why would this give us the values for the month of June?
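A negative index tells R which positions to leave out rather than which to keep. The sketch below shows this on a made-up five-element vector:

```r
v <- c(10, 20, 30, 40, 50)   # hypothetical vector

print(v[1:2])                # keep positions 1 and 2:  10 20
print(v[-c(1:2)])            # drop positions 1 and 2:  30 40 50
print(v[-1])                 # drop only the first one: 20 30 40 50
```

In the spring table, dropping rows 1 to 31 leaves rows 32 to 61, which are the June rows.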

We can use the ifelse() function to replace the values in our column. This function produces a new vector based on a condition specified for another vector. For example, if we graded students on a scale from 1 to 10 where anything above 5 was a passing grade, we could create a pass/fail vector like so:

> grades <- data.frame(grade = 1:10)
> grades
   grade
1      1
2      2
3      3
4      4
5      5
6      6
7      7
8      8
9      9
10    10
> grades$eval <- ifelse(grades$grade > 5, yes = "pass", no = "fail")
> grades
   grade eval
1      1 fail
2      2 fail
3      3 fail
4      4 fail
5      5 fail
6      6 pass
7      7 pass
8      8 pass
9      9 pass
10    10 pass

We can do the same thing with our spring data frame, except this time, we want to say if the month is 5, then it’s May, otherwise, we call it June:

> ifelse(spring$month == 5, yes = "May", no = "June")
 [1] "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May" 
[11] "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May" 
[21] "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May"  "May" 
[31] "May"  "June" "June" "June" "June" "June" "June" "June" "June" "June"
[41] "June" "June" "June" "June" "June" "June" "June" "June" "June" "June"
[51] "June" "June" "June" "June" "June" "June" "June" "June" "June" "June"
[61] "June"
> spring$month <- ifelse(spring$month == 5, yes = "May", no = "June")

Notice that we had to use == to test for equality. A single = is already used to assign values to function arguments (as in yes = "May"), so R needs a different symbol for comparison.
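The comparison itself is applied to every element and returns a vector of TRUE and FALSE values, which is what ifelse() works from. A minimal sketch with made-up values:

```r
months_num <- c(5, 5, 6)          # hypothetical month numbers
is_may     <- months_num == 5     # compares every element with 5

print(is_may)                     # TRUE TRUE FALSE
print(sum(is_may))                # 2: each TRUE counts as 1
print(months_num[is_may])         # 5 5: a logical vector can also subset
```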

Let’s inspect spring now.

> str(spring)
'data.frame':   61 obs. of  3 variables:
 $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ month: chr  "May" "May" "May" "May" ...
 $ year : num  2017 2017 2017 2017 2017 ...
> head(spring)
  day month year
1   1   May 2017
2   2   May 2017
3   3   May 2017
4   4   May 2017
5   5   May 2017
6   6   May 2017
> class(spring)
[1] "data.frame"

Now we are ready to add a new column to our data frame spring so that it looks like this:

  day   month   year  wkday
  1     "May"   2017  "Mon"
  2     "May"   2017  "Tues"
  3     "May"   2017  "Wed"
  ...

How should we do this, using the fewest number of steps?

We know that this column will repeat “Mon”, “Tues”, “Wed”, “Thurs”, “Fri”, “Sat”, “Sun” (since May starts on a Monday this year). We also know that we need that vector to repeat until its total length is equal to the number of days in May and June, which can be determined by using the nrow() function.

> eachday <- c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun")
> wkday   <- rep(eachday, times = nrow(spring))
> wkday
  [1] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
  [9] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
 [17] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
 [25] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
 [33] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
 [41] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
 [49] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
 [57] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
 [65] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
 [73] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
 [81] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
 [89] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
 [97] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[105] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[113] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[121] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[129] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[137] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[145] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[153] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[161] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[169] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[177] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[185] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[193] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[201] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[209] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[217] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[225] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[233] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[241] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[249] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[257] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[265] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[273] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[281] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[289] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[297] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[305] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[313] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[321] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[329] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[337] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[345] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[353] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[361] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[369] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[377] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[385] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[393] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
[401] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[409] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[417] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[425] "Fri"   "Sat"   "Sun"  

Uh-oh, it looks like we made this vector far too long. This is because times specifies how many times the entire vector is repeated. If we look at the examples in the help page for rep (type help("rep")), we can see that we need to use the argument length.out instead:

> wkday <- rep(eachday, length.out = nrow(spring))

Inspect your new object wkday and make sure it’s the correct length.

> length(wkday) == nrow(spring)
[1] TRUE
> wkday
 [1] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"  
 [9] "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues" 
[17] "Wed"   "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"  
[25] "Thurs" "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs"
[33] "Fri"   "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
[41] "Sat"   "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"  
[49] "Sun"   "Mon"   "Tues"  "Wed"   "Thurs" "Fri"   "Sat"   "Sun"  
[57] "Mon"   "Tues"  "Wed"   "Thurs" "Fri"  
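To compare the main arguments of rep() side by side, here is a short sketch on a made-up two-element vector:

```r
ab <- c("a", "b")   # hypothetical vector

print(rep(ab, times = 2))        # "a" "b" "a" "b": whole vector repeated twice
print(rep(ab, each = 2))         # "a" "a" "b" "b": each element repeated twice
print(rep(ab, length.out = 3))   # "a" "b" "a":     repeated, then cut at length 3
```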

Now we are ready to add this vector to our data frame. We can do this by specifying the name of the column we want to add.

> spring$wkday <- wkday
> spring
   day month year wkday
1    1   May 2017   Mon
2    2   May 2017  Tues
3    3   May 2017   Wed
4    4   May 2017 Thurs
5    5   May 2017   Fri
6    6   May 2017   Sat
7    7   May 2017   Sun
8    8   May 2017   Mon
9    9   May 2017  Tues
10  10   May 2017   Wed
11  11   May 2017 Thurs
12  12   May 2017   Fri
13  13   May 2017   Sat
14  14   May 2017   Sun
15  15   May 2017   Mon
16  16   May 2017  Tues
17  17   May 2017   Wed
18  18   May 2017 Thurs
19  19   May 2017   Fri
20  20   May 2017   Sat
21  21   May 2017   Sun
22  22   May 2017   Mon
23  23   May 2017  Tues
24  24   May 2017   Wed
25  25   May 2017 Thurs
26  26   May 2017   Fri
27  27   May 2017   Sat
28  28   May 2017   Sun
29  29   May 2017   Mon
30  30   May 2017  Tues
31  31   May 2017   Wed
32   1  June 2017 Thurs
33   2  June 2017   Fri
34   3  June 2017   Sat
35   4  June 2017   Sun
36   5  June 2017   Mon
37   6  June 2017  Tues
38   7  June 2017   Wed
39   8  June 2017 Thurs
40   9  June 2017   Fri
41  10  June 2017   Sat
42  11  June 2017   Sun
43  12  June 2017   Mon
44  13  June 2017  Tues
45  14  June 2017   Wed
46  15  June 2017 Thurs
47  16  June 2017   Fri
48  17  June 2017   Sat
49  18  June 2017   Sun
50  19  June 2017   Mon
51  20  June 2017  Tues
52  21  June 2017   Wed
53  22  June 2017 Thurs
54  23  June 2017   Fri
55  24  June 2017   Sat
56  25  June 2017   Sun
57  26  June 2017   Mon
58  27  June 2017  Tues
59  28  June 2017   Wed
60  29  June 2017 Thurs
61  30  June 2017   Fri

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.