factorize
uses user provided list of columns to define new parameter for each unique value and update the data.table.
Not for interaction terms
Arguments
- df
a data.table containing the columns of interest
- col_list
an array of column names that should have factor terms defined
- verbose
integer valued 0-4 controlling what information is printed to the terminal. Each level includes the lower levels. 0: silent, 1: errors printed, 2: warnings printed, 3: notes printed, 4: debug information printed. Errors are situations that stop the regression, warnings are situations that assume default values that the user might not have intended, notes provide information on regression progress, and debug prints out C++ progress and intermediate results. The default level is 2 and True/False is converted to 3/0.
Value
returns a list with two named fields. df for the updated dataframe, and cols for the new column names
See also
Other Data Cleaning Functions:
Check_Dupe_Columns()
,
Check_Trunc()
,
Check_Verbose()
,
Convert_Model_Eq()
,
Correct_Formula_Order()
,
Date_Shift()
,
Def_Control()
,
Def_Control_Guess()
,
Def_model_control()
,
Def_modelform_fix()
,
Event_Count_Gen()
,
Event_Time_Gen()
,
Joint_Multiple_Events()
,
Replace_Missing()
,
Time_Since()
,
factorize_par()
,
gen_time_dep()
,
interact_them()
Examples
library(data.table)
a <- c(0, 1, 2, 3, 4, 5, 6)
b <- c(1, 2, 3, 4, 5, 6, 7)
c <- c(0, 1, 2, 1, 0, 1, 0)
df <- data.table::data.table("a" = a, "b" = b, "c" = c)
col_list <- c("c")
val <- factorize(df, col_list)
df <- val$df
new_col <- val$cols