Check_Dupe_Columns
Check_Dupe_Columns checks for duplicated columns, columns with the same values, and columns with a single value. It has not yet been updated for multi-term models.
Arguments
- df
a data.table containing the columns of interest
- cols
columns to check
- term_n
term numbers for each element of the model
- verbose
an integer from 0 to 4 controlling what information is printed to the terminal. Each level includes the lower levels. 0: silent, 1: errors printed, 2: warnings printed, 3: notes printed, 4: debug information printed. Errors are situations that stop the regression, warnings are situations that assume default values that the user might not have intended, notes provide information on regression progress, and debug prints C++ progress and intermediate results. The default level is 2, and TRUE/FALSE is converted to 3/0.
- factor_check
a boolean used to skip comparing columns of the form ?_? with the same initial string, which is used for factored columns
See also
Other Data Cleaning Functions: Check_Trunc(), Check_Verbose(), Convert_Model_Eq(), Correct_Formula_Order(), Date_Shift(), Def_Control(), Def_Control_Guess(), Def_model_control(), Def_modelform_fix(), Event_Count_Gen(), Event_Time_Gen(), Joint_Multiple_Events(), Replace_Missing(), Time_Since(), factorize(), factorize_par(), gen_time_dep(), interact_them()
Examples
library(data.table)
# Three candidate covariate columns
a <- c(0, 1, 2, 3, 4, 5, 6)
b <- c(1, 2, 3, 4, 5, 6, 7)
c <- c(0, 1, 2, 1, 0, 1, 0)
df <- data.table::data.table("a" = a, "b" = b, "c" = c)
# Columns to check and the term number assigned to each
cols <- c("a", "b", "c")
term_n <- c(0, 0, 1)
unique_cols <- Check_Dupe_Columns(df, cols, term_n)
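The kinds of checks named in the description (single-valued columns and columns that repeat another column's values) can be illustrated with a minimal base R sketch. The helper `find_redundant_cols` below is hypothetical and is not part of the package. It does not reproduce how Check_Dupe_Columns is implemented, and it ignores `term_n` and `factor_check` entirely. It only shows the general idea under those assumptions.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# flags columns that hold a single value or that repeat an earlier column.
find_redundant_cols <- function(df, cols) {
  kept <- character(0)            # columns that pass both checks
  constant <- character(0)        # columns with a single distinct value
  duplicated_cols <- character(0) # columns identical to an earlier kept column
  for (col in cols) {
    values <- df[[col]]
    if (length(unique(values)) <= 1) {
      constant <- c(constant, col)
      next
    }
    # Compare against every column already kept
    is_copy <- any(vapply(
      kept,
      function(k) identical(df[[k]], values),
      logical(1)
    ))
    if (is_copy) {
      duplicated_cols <- c(duplicated_cols, col)
    } else {
      kept <- c(kept, col)
    }
  }
  list(kept = kept, constant = constant, duplicated = duplicated_cols)
}

# Small made-up data set: "a_copy" repeats "a", and "one" is constant
df_demo <- data.frame(
  a = c(0, 1, 2, 3),
  a_copy = c(0, 1, 2, 3),
  one = c(1, 1, 1, 1),
  c = c(0, 1, 0, 1)
)
res <- find_redundant_cols(df_demo, c("a", "a_copy", "one", "c"))
print(res)
```

Because the sketch reads columns with `df[[col]]`, it should work on a data.table as well as a data.frame. Exact comparison with `identical()` is a deliberate simplification: columns that differ only by type or by floating-point rounding would not be flagged.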