Title: | Implementation of Logistic Box-Cox Regression |
---|---|
Description: | Implements a logistic Box-Cox model. This model is fully described in Xing, L. et al. (2021) <doi:10.1002/cjs.11587>. |
Authors: | Li Xing [cre, aut], Shiyu Xu [aut], Jing Wang [aut], Kohlton Booth [aut], Xuekui Zhang [aut], Igor Burstyn [aut], Paul Gustafson [aut] |
Maintainer: | Li Xing <[email protected]> |
License: | GPL-3 |
Version: | 1.2 |
Built: | 2025-02-08 04:07:44 UTC |
Source: | https://github.com/cran/lboxcox |
This function applies the Box-Cox transform to variables.
box_cox_new(formula, mydata, ixx, lambda)
formula |
a formula of the form y ~ x + z1 + z2 where y is a binary response variable, x is a continuous predictor variable, and z1, z2, ... are covariates |
mydata |
dataset used in box cox transform |
ixx |
continuous predictor |
lambda |
lambda used in box cox transform |
The data set after the transform, containing the transformed ixx.
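As a rough illustration of the transform itself, the following base R sketch applies the standard Box-Cox formula, (x^lambda - 1) / lambda with log(x) as the limit at lambda = 0, to one column of a data frame. The names `box_cox` and `transform_column` are hypothetical helpers written for this example. They are not the package's `box_cox_new`, whose internals are not shown here, and the sketch assumes a strictly positive predictor.

```r
# Standard Box-Cox transform of a strictly positive numeric vector.
# Near lambda = 0 the limit log(x) is used.
box_cox <- function(x, lambda, tol = 1e-8) {
  stopifnot(is.numeric(x), all(x > 0))
  if (abs(lambda) < tol) log(x) else (x^lambda - 1) / lambda
}

# Hypothetical helper: return a copy of `data` in which the column
# named `x_name` has been replaced by its Box-Cox transform.
transform_column <- function(data, x_name, lambda) {
  data[[x_name]] <- box_cox(data[[x_name]], lambda)
  data
}

toy <- data.frame(y = c(0, 1, 1), x = c(1, 2, 4))
transformed <- transform_column(toy, "x", lambda = 2)
print(transformed$x)  # 0.0 1.5 7.5
```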
The depress data frame has 8,893 rows and 5 columns from the National Health and Nutrition Examination Survey (NHANES) 2009–2010.
depress
Sample survey data
binary response variable indicating whether the participant has depression (=1) or not (=0)
a numeric vector giving the log-transformed total blood mercury in micrograms per litre
0 if the participant is female and 1 if they are male
age of the participant
a numeric vector giving the sampling-weight.
Xing, L., Zhang, X., Burstyn, I., & Gustafson, P. (2021). On logistic Box–Cox regression for flexibly estimating the shape and strength of exposure‐disease relationships. Canadian Journal of Statistics, 49(3), 808-825.
Train the given formula using a Logistic Box-Cox model.
lbc_maxlik( formula, weight_column_name, data, init = NULL, svy_lambda_vector = seq(0, 2, length = 4), init_lambda_vector = seq(0, 2, length = 100), num_cores = 1, seed, iterlim, timelim )
formula |
a formula of the form y ~ x + z1 + z2 where y is a binary response variable, x is a continuous predictor variable, and z1, z2, ... are covariates |
weight_column_name |
the name of the column in 'data' containing the survey weights. |
data |
dataframe containing the dataset to train on |
init |
initial estimates for the coefficients. If NULL, the svyglm model will be used. |
svy_lambda_vector |
values of lambda used in training svyglm model. Best model is used for initial coefficient estimates. If init is not NULL this parameter is ignored. |
init_lambda_vector |
values of lambda used in finding the optimal lambda with the best loglikelihood |
num_cores |
the number of cores used when finding the best svyglm model. If init is not NULL this parameter is ignored. |
seed |
set seed for MaxLik function |
iterlim |
Maximum number of iterations of MaxLik |
timelim |
Maximum iteration time of MaxLik |
An object of class 'maxLik' from the 'maxLik' package. Contains the coefficient estimates that maximize the likelihood, among other statistics.
This is reliant on the following work:
Henningsen, A., Toomet, O. (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics, 26(3), 443-458.
Microsoft Corporation, Weston, S. (2020). foreach: Provides Foreach Looping Construct. R package version 1.5.1.
Microsoft Corporation, Weston, S. (2020). doParallel: Foreach Parallel Adaptor for the 'parallel' Package. R package version 1.0.16.
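To make the fitting step concrete, here is a minimal sketch of maximising a weighted logistic Box-Cox log-likelihood. It uses base R's `optim` as a stand-in for `maxLik`, simulated data rather than `depress`, and an assumed coefficient ordering (intercept, slope, lambda, then covariate coefficients). None of this is taken from the package source, and `loglik_sketch` is a hypothetical name.

```r
# Box-Cox transform with log(x) as the limit at lambda = 0.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Weighted log-likelihood; bb = (intercept, slope, lambda, covariate coefs).
# This ordering is an assumption made for the sketch.
loglik_sketch <- function(bb, x, y, w, Z) {
  eta <- bb[1] + bb[2] * box_cox(x, bb[3]) + drop(as.matrix(Z) %*% bb[-(1:3)])
  sum(w * (y * plogis(eta, log.p = TRUE) + (1 - y) * plogis(-eta, log.p = TRUE)))
}

# Simulated data standing in for a survey data set.
set.seed(42)
n <- 500
x <- runif(n, 0.5, 3)
Z <- cbind(z1 = rnorm(n))
w <- runif(n, 0.5, 2)
y <- rbinom(n, 1, plogis(-1 + 1.5 * box_cox(x, 0.5) + 0.3 * Z[, 1]))

init <- c(0, 1, 1, 0)
start_loglik <- loglik_sketch(init, x, y, w, Z)

# fnscale = -1 turns optim's minimiser into a maximiser;
# maxit plays a role similar to an iteration limit.
fit <- optim(init, function(bb) loglik_sketch(bb, x, y, w, Z),
             method = "BFGS", control = list(fnscale = -1, maxit = 200))

print(round(fit$par, 3))
print(c(start = start_loglik, fitted = fit$value))
```

In this sketch the starting vector `init` is arbitrary. The documentation above indicates that, when `init` is NULL, the package derives starting values from svyglm fits instead.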
Train the given formula using a Logistic Box-Cox model.
lbc_train( formula, weight_column_name, data, init = NULL, svy_lambda_vector = seq(0, 2, length = 100), num_cores = 1 )
formula |
a formula of the form y ~ x + z1 + z2 where y is a binary response variable, x is a continuous predictor variable, and z1, z2, ... are covariates |
weight_column_name |
the name of the column in 'data' containing the survey weights. |
data |
dataframe containing the dataset to train on |
init |
initial estimates for the coefficients. If NULL, the svyglm model will be used. |
svy_lambda_vector |
values of lambda used in training svyglm model. Best model is used for initial coefficient estimates. If init is not NULL this parameter is ignored. |
num_cores |
the number of cores used when finding the best svyglm model. If init is not NULL this parameter is ignored. |
An object of class 'maxLik' from the 'maxLik' package. Contains the coefficient estimates that maximize the likelihood, among other statistics.
This is reliant on the following work:
Henningsen, A., Toomet, O. (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics, 26(3), 443-458.
Microsoft Corporation, Weston, S. (2020). foreach: Provides Foreach Looping Construct. R package version 1.5.1.
Microsoft Corporation, Weston, S. (2020). doParallel: Foreach Parallel Adaptor for the 'parallel' Package. R package version 1.0.16.
Gives the predicted p value for the given LBC MaxLik model.
lboxcox_maxLik.predict(myMaxLikfit, newdata, formula)
myMaxLikfit |
Fitted model using lboxcox_maxLik model |
newdata |
Given data for prediction |
formula |
a formula of the form y ~ x + z1 + z2 where y is a binary response variable, x is a continuous predictor variable, and z1, z2, ... are covariates |
p value
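Assuming that "p value" here means the predicted probability that the response equals 1, which the documentation does not state explicitly, prediction from a logistic Box-Cox fit could be sketched as below. The function `predict_prob_sketch`, its arguments, and the coefficient ordering (intercept, slope, lambda, covariate coefficients) are hypothetical and are not the signature of `lboxcox_maxLik.predict`.

```r
# Box-Cox transform with log(x) as the limit at lambda = 0.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Hypothetical predictor: coefs = (intercept, slope, lambda, covariate coefs).
# Returns the inverse-logit of the linear predictor for each row of newdata.
predict_prob_sketch <- function(coefs, newdata, x_name, z_names) {
  Z <- as.matrix(newdata[, z_names, drop = FALSE])
  eta <- coefs[1] +
    coefs[2] * box_cox(newdata[[x_name]], coefs[3]) +
    drop(Z %*% coefs[-(1:3)])
  plogis(eta)
}

newdata <- data.frame(x = c(1, 2), z1 = c(0, 0))
probs <- predict_prob_sketch(c(0, 1, 1, 0.5), newdata, "x", "z1")
print(probs)
```

With lambda = 1 the transform reduces to x - 1, so the two linear predictors here are 0 and 1.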
This function gives the log likelihood of the Box-Cox model. Main purpose is to be an input to the maxLik function.
LogLikeFun(bb, ixx, iyy, iw, iZZ)
bb |
current values for the intercept and slope coefficients |
ixx |
continuous predictor |
iyy |
binary outcome |
iw |
sample weight |
iZZ |
covariates to be incorporated in the model |
the log likelihood estimate for the coefficients in 'bb'
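For orientation, a weighted log-likelihood of this kind can be written in a few lines of base R. The sketch below mirrors the argument names of `LogLikeFun`, but the body and the layout of `bb` (intercept, slope, lambda, then covariate coefficients) are assumptions made for illustration. The package's own implementation may differ.

```r
# Box-Cox transform with log(x) as the limit at lambda = 0.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Sketch of a survey-weighted Bernoulli log-likelihood.
# bb is assumed to hold (intercept, slope, lambda, covariate coefficients).
loglik_sketch <- function(bb, ixx, iyy, iw, iZZ) {
  iZZ <- as.matrix(iZZ)
  eta <- bb[1] + bb[2] * box_cox(ixx, bb[3]) + drop(iZZ %*% bb[-(1:3)])
  # plogis(eta, log.p = TRUE) is log(p); plogis(-eta, log.p = TRUE) is log(1 - p).
  sum(iw * (iyy * plogis(eta, log.p = TRUE) +
            (1 - iyy) * plogis(-eta, log.p = TRUE)))
}

ixx <- c(1, 2, 3, 4)
iyy <- c(0, 1, 0, 1)
iw  <- c(1, 1, 1, 1)
iZZ <- cbind(z1 = c(0.5, -0.5, 1, -1))

# With zero intercept, slope, and covariate effect, every p equals 0.5.
ll_null <- loglik_sketch(c(0, 0, 1, 0), ixx, iyy, iw, iZZ)
print(ll_null)
```

Working on the log scale through `plogis(..., log.p = TRUE)` avoids taking `log(0)` when the linear predictor is large in magnitude.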
This function gives the log likelihood of the Box-Cox model. Main purpose is to be an input to the maxLik function.
LogLikeFun_new(bb, ixx, iyy, iw, iZZ)
bb |
current values for the intercept and slope coefficients |
ixx |
continuous predictor |
iyy |
binary outcome |
iw |
sample weight |
iZZ |
covariates to be incorporated in the model |
the log likelihood estimate for the coefficients in 'bb'
Calculates a number that represents the overall gradient measurement between the predictor and log-odds of the risk
median_effect(formula, weight_column_name, data, trained_model)
formula |
the formula used to train the logistic box-cox model |
weight_column_name |
the name of the column in 'data' containing the survey weights |
data |
dataframe containing the dataset to train on |
trained_model |
the already trained model. The output of 'lbc_train' |
This function gives the gradient of the log likelihood of the Box-Cox model. Main purpose is to be an input to the maxLik function.
ScoreFun(bb, ixx, iyy, iw, iZZ)
bb |
initial values for the intercept and slope coefficients |
ixx |
continuous predictor |
iyy |
binary outcome |
iw |
sample weight |
iZZ |
covariates to be incorporated in the model |
the gradient of the log likelihood estimate for the coefficients in 'bb'
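The gradient of such a log-likelihood follows from the chain rule: each observation contributes its weighted residual, w * (y - p), multiplied by the derivative of the linear predictor with respect to each coefficient. The sketch below writes this out and checks it against central finite differences. The function names, the layout of `bb` (intercept, slope, lambda, covariate coefficients), and the simulated data are all assumptions for illustration, not the package's `ScoreFun`.

```r
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Derivative of the Box-Cox transform with respect to lambda.
box_cox_dlambda <- function(x, lambda) {
  if (abs(lambda) < 1e-8) return(log(x)^2 / 2)
  (lambda * x^lambda * log(x) - (x^lambda - 1)) / lambda^2
}

loglik_sketch <- function(bb, ixx, iyy, iw, iZZ) {
  eta <- bb[1] + bb[2] * box_cox(ixx, bb[3]) + drop(as.matrix(iZZ) %*% bb[-(1:3)])
  sum(iw * (iyy * plogis(eta, log.p = TRUE) + (1 - iyy) * plogis(-eta, log.p = TRUE)))
}

# Analytic gradient in the same coefficient order as bb.
score_sketch <- function(bb, ixx, iyy, iw, iZZ) {
  iZZ <- as.matrix(iZZ)
  eta <- bb[1] + bb[2] * box_cox(ixx, bb[3]) + drop(iZZ %*% bb[-(1:3)])
  r <- iw * (iyy - plogis(eta))  # weighted residuals
  c(sum(r),
    sum(r * box_cox(ixx, bb[3])),
    sum(r * bb[2] * box_cox_dlambda(ixx, bb[3])),
    drop(crossprod(iZZ, r)))
}

set.seed(1)
n <- 50
x <- runif(n, 0.5, 3)
Z <- cbind(z1 = rnorm(n))
y <- rbinom(n, 1, 0.4)
w <- runif(n, 0.5, 2)
bb <- c(-0.3, 0.8, 0.7, 0.2)

# Central finite differences as an independent check.
numeric_grad <- sapply(seq_along(bb), function(j) {
  h <- 1e-6
  e <- replace(numeric(length(bb)), j, h)
  (loglik_sketch(bb + e, x, y, w, Z) - loglik_sketch(bb - e, x, y, w, Z)) / (2 * h)
})
max_abs_diff <- max(abs(score_sketch(bb, x, y, w, Z) - numeric_grad))
print(max_abs_diff)
```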
This function gives the gradient of the log likelihood of the Box-Cox model. Main purpose is to be an input to the maxLik function.
ScoreFun_new(init, ixx, iyy, iw, iZZ)
init |
initial values for the intercept and slope coefficients |
ixx |
continuous predictor |
iyy |
binary outcome |
iw |
sample weight |
iZZ |
covariates to be incorporated in the model |
the gradient of the log likelihood estimate for the coefficients in 'init'
This function gives the list of initial values used in the MaxLik_ms function.
svyglm_ms( formula, data, lambda_vector = seq(0, 2, length = 100), weight_column_name = NULL, num_cores = 1 )
formula |
formula used in model |
data |
dataframe containing the dataset to train on |
lambda_vector |
values of lambda used in training svyglm model. |
weight_column_name |
the name of the column in 'data' containing the survey weights. |
num_cores |
the number of cores used when finding the best svyglm model. |
The list of initial values used in the MaxLik_ms function.
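One way to picture a grid search for starting values is sketched below: for each candidate lambda the predictor is transformed, a weighted logistic regression is fitted, and the fit with the highest weighted log-likelihood supplies the starting vector. This is only an illustration. It uses base R's `glm` with a `quasibinomial` family as a stand-in for `survey::svyglm`, the selection criterion is an assumption (the documentation above says only that the "best model" is used), and `start_values_sketch` is a hypothetical name.

```r
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Hypothetical helper: returns (intercept, slope, lambda, covariate coefs)
# taken from the best weighted logistic fit over the lambda grid.
start_values_sketch <- function(y, x, Z, w,
                                lambda_vector = seq(0, 2, length.out = 5)) {
  Z <- as.matrix(Z)
  fits <- lapply(lambda_vector, function(lambda) {
    hx <- box_cox(x, lambda)
    # quasibinomial avoids warnings about non-integer weights.
    fit <- glm(y ~ hx + Z, family = quasibinomial(), weights = w)
    p <- fitted(fit)
    list(lambda = lambda,
         coef = unname(coef(fit)),
         loglik = sum(w * (y * log(p) + (1 - y) * log(1 - p))))
  })
  logliks <- vapply(fits, function(f) f$loglik, numeric(1))
  best <- fits[[which.max(logliks)]]
  c(best$coef[1:2], best$lambda, best$coef[-(1:2)])
}

# Simulated data standing in for a survey data set.
set.seed(7)
n <- 400
x <- runif(n, 0.5, 3)
Z <- cbind(z1 = rnorm(n))
w <- runif(n, 0.5, 2)
y <- rbinom(n, 1, plogis(-1 + 1.2 * box_cox(x, 0.5) + 0.4 * Z[, 1]))

init <- start_values_sketch(y, x, Z, w)
print(round(init, 3))
```

A coarse grid such as this one only needs to land near a good region, since the subsequent likelihood maximisation refines lambda together with the other coefficients.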