Separate columns with data.table — dt

Separates a column of data into several columns by splitting on a separator or regular expression.

dt_separate(
  dt_,
  col,
  into,
  sep = ".",
  remove = TRUE,
  fill = NA,
  fixed = TRUE,
  immutable = TRUE,
  dev = FALSE,
  ...
)

Arguments

dt_: the data table (or if not a data.table then it is coerced with as.data.table)
col: the column to separate
into: the names of the new columns created from splitting col.
sep: the separator on which col is split. It is matched literally when fixed = TRUE and treated as a regular expression otherwise. Default is ".".
remove: should col be removed in the returned data table? Default is TRUE
fill: the value inserted when a split yields fewer pieces than there are names in into. Default is NA.
fixed: logical. If TRUE, sep is matched exactly as written; otherwise it is used as a regular expression. Has priority over perl. Default is TRUE.
immutable: If TRUE, dt_ is treated as immutable (it will not be modified in place). Alternatively, you can set immutable = FALSE to modify the input object. Default is TRUE.
dev: If TRUE, the function can be used within other functions. It bypasses the usual non-standard evaluation. Default is FALSE.
...: arguments passed to data.table::tstrsplit()
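
Because extra arguments are forwarded to data.table::tstrsplit(), it can help to see how that function treats the separator and the fill value on its own. The following is a minimal sketch that calls tstrsplit() directly on a character vector; it illustrates the fixed and fill behaviour described above and is not a reproduction of how dt_separate() is implemented internally.

```r
library(data.table)

x <- c("A.B", "A", "B", "B.A")

# fixed = TRUE: "." is matched literally; shorter results are padded with `fill`
literal <- tstrsplit(x, ".", fixed = TRUE, fill = NA)

# fixed = FALSE: the separator is a regular expression, so the dot must be escaped
regex <- tstrsplit(x, "\\.", fixed = FALSE, fill = NA)

# a custom fill value takes the place of the missing pieces
filled <- tstrsplit(x, ".", fixed = TRUE, fill = "none")

literal[[2]]
filled[[2]]
```

An unescaped "." with fixed = FALSE would match every character, which is why the literal match is the safer default for a dot separator.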

Value

A data.table with a column split into multiple columns.

Examples


library(data.table)
d <- data.table(
  x = c("A.B", "A", "B", "B.A"),
  y = 1:4
)

# defaults
dt_separate(d, x, c("c1", "c2"))
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A

# can keep the original column with `remove = FALSE`
dt_separate(d, x, c("c1", "c2"), remove = FALSE)
#>         x     y     c1     c2
#>    <char> <int> <char> <char>
#> 1:    A.B     1      A      B
#> 2:      A     2      A   <NA>
#> 3:      B     3      B   <NA>
#> 4:    B.A     4      B      A

# need to assign when `immutable = TRUE`
separated <- dt_separate(d, x, c("c1", "c2"), immutable = TRUE)
separated
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A

# don't need to assign when `immutable = FALSE` (not the default)
dt_separate(d, x, c("c1", "c2"), immutable = FALSE)
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A
d
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A
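
The difference between the two immutable settings mirrors the usual data.table distinction between working on a copy and updating by reference. The sketch below uses only data.table (copy(), := and tstrsplit()) to show that distinction; treating it as equivalent to dt_separate() is an assumption made for illustration, not a description of the function's internals.

```r
library(data.table)

d <- data.table(x = c("A.B", "A", "B", "B.A"), y = 1:4)

# work on a copy: the new columns land in `separated`, `d` is left untouched
separated <- copy(d)[, c("c1", "c2") := tstrsplit(x, ".", fixed = TRUE)][, x := NULL]
names_after_copy <- names(d)

# update by reference: `:=` changes `d` itself, so no assignment is needed
d[, c("c1", "c2") := tstrsplit(x, ".", fixed = TRUE)][, x := NULL]
names_after_inplace <- names(d)

names_after_copy
names_after_inplace
```

Updating by reference avoids copying the table, which matters for large inputs, at the cost of changing an object that other code may still be using.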