Easy Way to Convert Col to Row in R

Select Data Frame Columns in R

In this tutorial, you will learn how to select or subset data frame columns by name and by position using the R functions select() and pull() [in the dplyr package]. We'll also show how to remove columns from a data frame.

You will learn how to use the following functions:

  • pull(): Extract column values as a vector. The column of interest can be specified either by name or by index.
  • select(): Extract one or multiple columns as a data table. It can also be used to remove columns from the data frame.
  • select_if(): Select columns based on a particular condition. One can use this function to, for example, select columns if they are numeric.
  • Helper functions - starts_with(), ends_with(), contains(), matches(), one_of(): Select columns/variables based on their names

Contents:

  • Required packages
  • Demo dataset
  • Extract column values as a vector
  • Extract columns as a data table
    • Select column by position
    • Select columns by names
  • Select column based on a condition
  • Remove columns
  • Summary

Required packages

Load the tidyverse packages, which include dplyr:

                  library(tidyverse)                

Demo dataset

We'll use the R built-in iris data set, which we start by converting into a tibble data frame (tbl_df) for easier data analysis.

                  my_data <- as_tibble(iris)
                  my_data
                  ## # A tibble: 150 x 5
                  ##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
                  ##          <dbl>       <dbl>        <dbl>       <dbl> <fct>
                  ## 1          5.1         3.5          1.4         0.2 setosa
                  ## 2          4.9         3            1.4         0.2 setosa
                  ## 3          4.7         3.2          1.3         0.2 setosa
                  ## 4          4.6         3.1          1.5         0.2 setosa
                  ## 5          5           3.6          1.4         0.2 setosa
                  ## 6          5.4         3.9          1.7         0.4 setosa
                  ## # ... with 144 more rows

Extract column values as a vector

                  my_data %>% pull(Species)                
                  ##   [1] setosa     setosa     setosa     setosa     setosa     setosa
                  ##   [7] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [13] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [19] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [25] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [31] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [37] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [43] setosa     setosa     setosa     setosa     setosa     setosa
                  ##  [49] setosa     setosa     versicolor versicolor versicolor versicolor
                  ##  [55] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [61] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [67] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [73] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [79] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [85] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [91] versicolor versicolor versicolor versicolor versicolor versicolor
                  ##  [97] versicolor versicolor versicolor versicolor virginica  virginica
                  ## [103] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [109] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [115] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [121] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [127] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [133] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [139] virginica  virginica  virginica  virginica  virginica  virginica
                  ## [145] virginica  virginica  virginica  virginica  virginica  virginica
                  ## Levels: setosa versicolor virginica
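
As noted in the function list at the top, pull() accepts a column index as well as a column name. The sketch below is a minimal illustration of both forms on the same iris tibble. The variable names (species, sepal_length, last_col) are made up for this example, and it loads dplyr directly rather than the whole tidyverse.

```r
library(dplyr)

my_data <- tibble::as_tibble(iris)

# By name: returns the Species column as a plain factor vector
species <- my_data %>% pull(Species)

# By positive index: 1 is the first column (Sepal.Length)
sepal_length <- my_data %>% pull(1)

# By negative index: positions count from the right, so -1 is the last column
last_col <- my_data %>% pull(-1)

is.factor(species)            # TRUE
length(sepal_length)          # 150
identical(species, last_col)  # TRUE, Species is the last column of iris
```

Selecting by name is usually the safer choice, because a positional index silently points at a different column if the column order changes.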

Extract columns as a data table

Select column by position

  • Select columns 1 to 3:
                    my_data %>% select(1:3)                  
  • Select columns 1 and 3 but not 2:
                    my_data %>% select(1, 3)                  
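
To check what a positional selection actually returns, it can help to look at the column names of the result. The following is a small sketch on the iris tibble; the object names (first_three, one_and_three, reordered) are illustrative only.

```r
library(dplyr)

my_data <- tibble::as_tibble(iris)

# Columns 1 to 3
first_three <- my_data %>% select(1:3)
names(first_three)
# "Sepal.Length" "Sepal.Width"  "Petal.Length"

# Columns 1 and 3, skipping 2
one_and_three <- my_data %>% select(1, 3)
names(one_and_three)
# "Sepal.Length" "Petal.Length"

# Columns come back in the order they are requested
reordered <- my_data %>% select(3, 1)
names(reordered)
# "Petal.Length" "Sepal.Length"
```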

Select columns by names

Select columns by names: Sepal.Length and Petal.Length

                    my_data %>% select(Sepal.Length, Petal.Length)                  
                    ## # A tibble: 150 x 2
                    ##   Sepal.Length Petal.Length
                    ##          <dbl>        <dbl>
                    ## 1          5.1          1.4
                    ## 2          4.9          1.4
                    ## 3          4.7          1.3
                    ## 4          4.6          1.5
                    ## 5          5            1.4
                    ## 6          5.4          1.7
                    ## # ... with 144 more rows

Select all columns from Sepal.Length to Petal.Length

                    my_data %>% select(Sepal.Length:Petal.Length)                  
                    ## # A tibble: 150 x 3
                    ##   Sepal.Length Sepal.Width Petal.Length
                    ##          <dbl>       <dbl>        <dbl>
                    ## 1          5.1         3.5          1.4
                    ## 2          4.9         3            1.4
                    ## 3          4.7         3.2          1.3
                    ## 4          4.6         3.1          1.5
                    ## 5          5           3.6          1.4
                    ## 6          5.4         3.9          1.7
                    ## # ... with 144 more rows

There are several special functions that can be used inside select(): starts_with(), ends_with(), contains(), matches(), one_of(), etc.

                    # Select columns whose names start with "Petal"
                    my_data %>% select(starts_with("Petal"))

                    # Select columns whose names end with "Width"
                    my_data %>% select(ends_with("Width"))

                    # Select columns whose names contain "etal"
                    my_data %>% select(contains("etal"))

                    # Select columns whose names match a regular expression
                    my_data %>% select(matches(".t."))

                    # Select the variables provided in a character vector
                    my_data %>% select(one_of(c("Sepal.Length", "Petal.Length")))
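
In recent versions of dplyr and tidyselect, one_of() is marked as superseded in favour of all_of() and any_of(). The sketch below assumes dplyr 1.0.0 or later and shows how the two differ. The vectors wanted and maybe, and the column name "Not.A.Column", are made up for illustration.

```r
library(dplyr)

my_data <- tibble::as_tibble(iris)

# all_of(): strict, every name in the vector must exist in the data
wanted <- c("Sepal.Length", "Petal.Length")
strict <- my_data %>% select(all_of(wanted))
names(strict)
# "Sepal.Length" "Petal.Length"

# any_of(): lenient, names that do not exist are silently skipped
maybe <- c("Sepal.Length", "Not.A.Column")
lenient <- my_data %>% select(any_of(maybe))
names(lenient)
# "Sepal.Length"
```

Passing maybe to all_of() instead would stop with an error, which is often what you want when a missing column indicates a problem upstream.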

Select column based on a condition

It's possible to apply a function to the columns. The columns for which the function returns TRUE are selected.

Select only numeric columns:

                  my_data %>% select_if(is.numeric)                
                  ## # A tibble: 150 x 4
                  ##   Sepal.Length Sepal.Width Petal.Length Petal.Width
                  ##          <dbl>       <dbl>        <dbl>       <dbl>
                  ## 1          5.1         3.5          1.4         0.2
                  ## 2          4.9         3            1.4         0.2
                  ## 3          4.7         3.2          1.3         0.2
                  ## 4          4.6         3.1          1.5         0.2
                  ## 5          5           3.6          1.4         0.2
                  ## 6          5.4         3.9          1.7         0.4
                  ## # ... with 144 more rows
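
Since dplyr 1.0.0, select_if() is marked as superseded, and the same selection can be written with where() inside select(). The following sketch assumes dplyr 1.0.0 or later. The object names and the 3.5 threshold in the last example are arbitrary choices for illustration.

```r
library(dplyr)

my_data <- tibble::as_tibble(iris)

# Same result as select_if(is.numeric)
numeric_cols <- my_data %>% select(where(is.numeric))
names(numeric_cols)
# "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"

# Any function returning a single TRUE or FALSE per column works as a predicate
factor_cols <- my_data %>% select(where(is.factor))
names(factor_cols)
# "Species"

# A custom predicate: numeric columns whose mean is above 3.5
large_mean <- my_data %>%
  select(where(function(x) is.numeric(x) && mean(x) > 3.5))
names(large_mean)
# "Sepal.Length" "Petal.Length"
```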

Remove columns

To remove a column from a data frame, prefix its name with a minus sign (-).

Removing Sepal.Length and Petal.Length columns:

                  my_data %>% select(-Sepal.Length, -Petal.Length)                

Removing all columns from Sepal.Length to Petal.Length:

                  my_data %>% select(-(Sepal.Length:Petal.Length))                
                  ## # A tibble: 150 x 2
                  ##   Petal.Width Species
                  ##         <dbl> <fct>
                  ## 1         0.2 setosa
                  ## 2         0.2 setosa
                  ## 3         0.2 setosa
                  ## 4         0.2 setosa
                  ## 5         0.2 setosa
                  ## 6         0.4 setosa
                  ## # ... with 144 more rows

Removing all columns whose name starts with "Petal":

                  my_data %>% select(-starts_with("Petal"))                
                  ## # A tibble: 150 x 3
                  ##   Sepal.Length Sepal.Width Species
                  ##          <dbl>       <dbl> <fct>
                  ## 1          5.1         3.5 setosa
                  ## 2          4.9         3   setosa
                  ## 3          4.7         3.2 setosa
                  ## 4          4.6         3.1 setosa
                  ## 5          5           3.6 setosa
                  ## 6          5.4         3.9 setosa
                  ## # ... with 144 more rows

To drop columns by position, use the following syntax:

                  # Drop column 1
                  my_data %>% select(-1)

                  # Drop columns 1 to 3
                  my_data %>% select(-(1:3))

                  # Drop columns 1 and 3 but not 2
                  my_data %>% select(-1, -3)
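
As a quick check of the positional drops above, the sketch below lists the column names that remain after each call. The last example is an addition not covered in the original text: it assumes dplyr 1.0.0 or later and uses any_of() to drop columns by name without failing when a name is absent ("Not.A.Column" is a made-up name).

```r
library(dplyr)

my_data <- tibble::as_tibble(iris)

drop_first <- my_data %>% select(-1)
names(drop_first)
# "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

drop_one_to_three <- my_data %>% select(-(1:3))
names(drop_one_to_three)
# "Petal.Width" "Species"

drop_one_and_three <- my_data %>% select(-1, -3)
names(drop_one_and_three)
# "Sepal.Width" "Petal.Width" "Species"

# Drop by name, ignoring names that are not present in the data
drop_if_present <- my_data %>% select(-any_of(c("Species", "Not.A.Column")))
names(drop_if_present)
# "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
```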

Summary

In this tutorial, we describe how to select columns by positions and by names. Additionally, we present how to remove columns from a data frame.




Source: https://www.datanovia.com/en/lessons/select-data-frame-columns-in-r/
