Separates a column of data into multiple columns by splitting on a separator or regular expression

dt_separate(
  dt_,
  col,
  into,
  sep = ".",
  remove = TRUE,
  fill = NA,
  fixed = TRUE,
  immutable = TRUE,
  dev = FALSE,
  ...
)

Arguments

dt_

the data table (or if not a data.table then it is coerced with as.data.table)

col

the column to separate

into

the names of the new columns created from splitting col.

sep

the separator on which col is split. It is matched literally when fixed = TRUE and treated as a regular expression otherwise. Default is ".".

remove

should col be removed in the returned data table? Default is TRUE

fill

the value inserted when a split yields fewer pieces than there are names in into. Default is NA.

fixed

logical. If TRUE match split exactly, otherwise use regular expressions. Has priority over perl.
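
As a rough illustration of what fixed changes, the sketch below calls data.table::tstrsplit() directly rather than dt_separate(). It assumes that sep and fixed are forwarded to that function unchanged, which this page implies but does not spell out. The vector x and the result names are made up for the example.

```r
library(data.table)

# hypothetical input, mirroring the column used in the Examples section
x <- c("A.B", "A", "B", "B.A")

# fixed = TRUE: the separator "." is matched literally
literal <- tstrsplit(x, ".", fixed = TRUE, fill = NA)

# fixed = FALSE: the separator is a regular expression, so a literal
# dot has to be escaped; an unescaped "." would match any character
regex <- tstrsplit(x, "\\.", fixed = FALSE, fill = NA)

# both calls return a list with one character vector per piece
literal[[1]]
literal[[2]]
identical(literal, regex)
```

With the default sep = "." it is therefore safest to leave fixed = TRUE, or to escape the dot when switching to regular expressions.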

immutable

If TRUE, dt_ is treated as immutable (it will not be modified in place). Alternatively, you can set immutable = FALSE to modify the input object in place.

dev

If TRUE, the function can be used within other functions. It bypasses the usual non-standard evaluation. Default is FALSE.

...

arguments passed to data.table::tstrsplit()
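
The kind of extra arguments that could travel through ... can be seen by calling the splitting function from data.table on its own. This is only a sketch: type.convert and keep are arguments of data.table::tstrsplit(), but how they interact with into inside dt_separate() is not covered here and is left as an assumption. The vector ids is invented for the example.

```r
library(data.table)

# hypothetical input: a number and a letter joined by "-"
ids <- c("1-a", "2-b")

# type.convert = TRUE converts each resulting piece to a suitable type
converted <- tstrsplit(ids, "-", fixed = TRUE, type.convert = TRUE)
converted[[1]]  # integer vector
converted[[2]]  # character vector

# keep = 1 returns only the first piece
first_only <- tstrsplit(ids, "-", fixed = TRUE, keep = 1L)
length(first_only)
```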

Value

A data.table with a column split into multiple columns.

Examples


library(data.table)
d <- data.table(
  x = c("A.B", "A", "B", "B.A"),
  y = 1:4
)

# defaults
dt_separate(d, x, c("c1", "c2"))
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A

# can keep the original column with `remove = FALSE`
dt_separate(d, x, c("c1", "c2"), remove = FALSE)
#>         x     y     c1     c2
#>    <char> <int> <char> <char>
#> 1:    A.B     1      A      B
#> 2:      A     2      A   <NA>
#> 3:      B     3      B   <NA>
#> 4:    B.A     4      B      A

# need to assign when `immutable = TRUE`
separated <- dt_separate(d, x, c("c1", "c2"), immutable = TRUE)
separated
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A

# don't need to assign when `immutable = FALSE` (not the default)
dt_separate(d, x, c("c1", "c2"), immutable = FALSE)
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A
d
#>        y     c1     c2
#>    <int> <char> <char>
#> 1:     1      A      B
#> 2:     2      A   <NA>
#> 3:     3      B   <NA>
#> 4:     4      B      A