Easy Way to Convert Col to Row in R
Select Data Frame Columns in R
In this tutorial, you will learn how to select or subset data frame columns by names and position using the R function select()
and pull()
[in dplyr package]. We'll also show how to remove columns from a data frame.
You will learn how to use the following functions:
- pull(): Extract column values as a vector. The column of interest can be specified either by name or by index.
- select(): Extract one or multiple columns as a data table. It can be also used to remove columns from the data frame.
- select_if(): Select columns based on a particular condition. One can use this function to, for example, select columns if they are numeric.
- Helper functions - starts_with(), ends_with(), contains(), matches(), one_of(): Select columns/variables based on their names
Contents:
- Required packages
- Demo dataset
- Extract column values as a vector
- Extract columns as a data table
- Select column by position
- Select columns by names
- Select column based on a condtion
- Remove columns
- Summary
Required packages
Load the tidyverse
packages, which include dplyr
:
library(tidyverse)
Demo dataset
We'll use the R built-in iris data set, which we start by converting into a tibble data frame (tbl_df) for easier data analysis.
my_data <- as_tibble(iris) my_data
## # A tibble: 150 x 5 ## Sepal.Length Sepal.Width Petal.Length Petal.Width Species ## <dbl> <dbl> <dbl> <dbl> <fct> ## 1 5.1 3.5 1.4 0.2 setosa ## 2 4.9 3 1.4 0.2 setosa ## 3 4.7 3.2 1.3 0.2 setosa ## 4 4.6 3.1 1.5 0.2 setosa ## 5 5 3.6 1.4 0.2 setosa ## 6 5.4 3.9 1.7 0.4 setosa ## # ... with 144 more rows
Extract column values as a vector
my_data %>% pull(Species)
## [1] setosa setosa setosa setosa setosa setosa ## [7] setosa setosa setosa setosa setosa setosa ## [13] setosa setosa setosa setosa setosa setosa ## [19] setosa setosa setosa setosa setosa setosa ## [25] setosa setosa setosa setosa setosa setosa ## [31] setosa setosa setosa setosa setosa setosa ## [37] setosa setosa setosa setosa setosa setosa ## [43] setosa setosa setosa setosa setosa setosa ## [49] setosa setosa versicolor versicolor versicolor versicolor ## [55] versicolor versicolor versicolor versicolor versicolor versicolor ## [61] versicolor versicolor versicolor versicolor versicolor versicolor ## [67] versicolor versicolor versicolor versicolor versicolor versicolor ## [73] versicolor versicolor versicolor versicolor versicolor versicolor ## [79] versicolor versicolor versicolor versicolor versicolor versicolor ## [85] versicolor versicolor versicolor versicolor versicolor versicolor ## [91] versicolor versicolor versicolor versicolor versicolor versicolor ## [97] versicolor versicolor versicolor versicolor virginica virginica ## [103] virginica virginica virginica virginica virginica virginica ## [109] virginica virginica virginica virginica virginica virginica ## [115] virginica virginica virginica virginica virginica virginica ## [121] virginica virginica virginica virginica virginica virginica ## [127] virginica virginica virginica virginica virginica virginica ## [133] virginica virginica virginica virginica virginica virginica ## [139] virginica virginica virginica virginica virginica virginica ## [145] virginica virginica virginica virginica virginica virginica ## Levels: setosa versicolor virginica
Extract columns as a data table
Select column by position
- Select columns 1 to 3:
my_data %>% select(1:3)
- Select column 1 and 3 but not 2:
my_data %>% select(1, 3)
Select columns by names
Select columns by names: Sepal.Length and Petal.Length
my_data %>% select(Sepal.Length, Petal.Length)
## # A tibble: 150 x 2 ## Sepal.Length Petal.Length ## <dbl> <dbl> ## 1 5.1 1.4 ## 2 4.9 1.4 ## 3 4.7 1.3 ## 4 4.6 1.5 ## 5 5 1.4 ## 6 5.4 1.7 ## # ... with 144 more rows
Select all columns from Sepal.Length to Petal.Length
my_data %>% select(Sepal.Length:Petal.Length)
## # A tibble: 150 x 3 ## Sepal.Length Sepal.Width Petal.Length ## <dbl> <dbl> <dbl> ## 1 5.1 3.5 1.4 ## 2 4.9 3 1.4 ## 3 4.7 3.2 1.3 ## 4 4.6 3.1 1.5 ## 5 5 3.6 1.4 ## 6 5.4 3.9 1.7 ## # ... with 144 more rows
There are several special functions that can be used inside select(): starts_with(), ends_with(), contains(), matches(), one_of(), etc.
# Select column whose name starts with "Petal" my_data %>% select(starts_with("Petal")) # Select column whose name ends with "Width" my_data %>% select(ends_with("Width")) # Select columns whose names contains "etal" my_data %>% select(contains("etal")) # Select columns whose name maches a regular expression my_data %>% select(matches(".t.")) # selects variables provided in a character vector. my_data %>% select(one_of(c("Sepal.Length", "Petal.Length")))
Select column based on a condtion
It's possible to apply a function to the columns. The columns for which the function returns TRUE are selected.
Select only numeric columns:
my_data %>% select_if(is.numeric)
## # A tibble: 150 x 4 ## Sepal.Length Sepal.Width Petal.Length Petal.Width ## <dbl> <dbl> <dbl> <dbl> ## 1 5.1 3.5 1.4 0.2 ## 2 4.9 3 1.4 0.2 ## 3 4.7 3.2 1.3 0.2 ## 4 4.6 3.1 1.5 0.2 ## 5 5 3.6 1.4 0.2 ## 6 5.4 3.9 1.7 0.4 ## # ... with 144 more rows
Remove columns
Note that, to remove a column from a data frame, prepend its name by minus -.
Removing Sepal.Length and Petal.Length columns:
my_data %>% select(-Sepal.Length, -Petal.Length)
Removing all columns from Sepal.Length to Petal.Length:
my_data %>% select(-(Sepal.Length:Petal.Length))
## # A tibble: 150 x 2 ## Petal.Width Species ## <dbl> <fct> ## 1 0.2 setosa ## 2 0.2 setosa ## 3 0.2 setosa ## 4 0.2 setosa ## 5 0.2 setosa ## 6 0.4 setosa ## # ... with 144 more rows
Removing all columns whose name starts with "Petal":
my_data %>% select(-starts_with("Petal"))
## # A tibble: 150 x 3 ## Sepal.Length Sepal.Width Species ## <dbl> <dbl> <fct> ## 1 5.1 3.5 setosa ## 2 4.9 3 setosa ## 3 4.7 3.2 setosa ## 4 4.6 3.1 setosa ## 5 5 3.6 setosa ## 6 5.4 3.9 setosa ## # ... with 144 more rows
Note that, if you want to drop columns by position, the syntax is as follow.
# Drop column 1 my_data %>% select(-1) # Drop columns 1 to 3 my_data %>% select(-(1:3)) # Drop columns 1 and 3 but not 2 my_data %>% select(-1, -3)
Summary
In this tutorial, we describe how to select columns by positions and by names. Additionally, we present how to remove columns from a data frame.
Recommended for you
This section contains best data science and self-development resources to help you on your path.
Source: https://www.datanovia.com/en/lessons/select-data-frame-columns-in-r/
0 Response to "Easy Way to Convert Col to Row in R"
Post a Comment