rbartpackages.BART3.bartModelMatrix
- class rbartpackages.BART3.bartModelMatrix(X, numcut=0, *, usequants=False, type=7, rm_const=False, cont=False, xinfo=None)
Convert covariates to a matrix and compute the BART cutpoints.
Python interface to R’s BART3::bartModelMatrix. With the default numcut=0 the constructor returns the bare design matrix instead of a class instance; otherwise the instance carries the matrix together with the cutpoints metadata.
- Parameters:
  - X (Float64[ndarray, 'N p'] | DataFrame) – The covariates to convert; rows are observations. A dataframe’s factor columns are expanded into indicator columns.
  - numcut (int, default: 0) – Maximum number of cutpoints per variable; 0 means return the bare matrix without computing cutpoints.
  - usequants (bool, default: False) – Whether the cutpoints are quantiles of the data rather than uniformly spaced over its range.
  - type (int, default: 7) – The quantile algorithm used with usequants (see R’s quantile).
  - rm_const (bool, default: False) – Whether to remove the constant columns from X (they are flagged in rm_const either way).
  - cont (bool, default: False) – Whether to treat all variables as continuous, spacing numcut cutpoints over the range even when fewer unique values would do.
  - xinfo (Float64[ndarray, 'p numcut'] | None, default: None) – Cutpoints to use, one row per variable; overrides the computed ones.
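The difference between the two cutpoint schemes selected by usequants can be sketched in plain NumPy. This is only an illustration of the idea, not the BART3 implementation: the helper names are hypothetical, and the exact grid positions (interior points only, probabilities i/(numcut+1)) are assumptions. NumPy's default linear interpolation in np.quantile corresponds to R's quantile type 7.

```python
import numpy as np


def uniform_cutpoints(x, numcut):
    """numcut interior points evenly spaced between min(x) and max(x).

    Hypothetical helper; the grid placement is an assumption.
    """
    grid = np.linspace(x.min(), x.max(), numcut + 2)
    return grid[1:-1]  # drop the two endpoints


def quantile_cutpoints(x, numcut):
    """numcut interior quantiles of x.

    Hypothetical helper; np.quantile's default (linear) interpolation
    corresponds to R's quantile type 7.
    """
    probs = np.arange(1, numcut + 1) / (numcut + 1)
    return np.quantile(x, probs)


x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
uniform = uniform_cutpoints(x, 3)     # follows the range, pulled up by the outlier
quantiles = quantile_cutpoints(x, 3)  # follows where the data actually lie
print(uniform)
print(quantiles)
```

With a skewed covariate such as this one, the uniform grid spends most of its cutpoints in the sparse upper range, whereas the quantile grid stays close to the bulk of the observations.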
R documentation
title
-----
Create a matrix out of a vector or data.frame

name
----
bartModelMatrix

alias
-----
bartModelMatrix

keyword
-------
utilities

description
-----------
The external BART functions operate on matrices in memory. Therefore, if the user submits a vector or data.frame, then this function converts it to a matrix. Also, it determines the number of cutpoints necessary for each column when asked to do so.

usage
-----
    bartModelMatrix(X, numcut=0L, usequants=FALSE, type=7, rm.const=FALSE, cont=FALSE, xinfo=NULL)

arguments
---------
X
    A vector or data.frame to create the matrix from.
numcut
    The maximum number of cutpoints to consider. If numcut=0, then just return a matrix; otherwise, return a list containing a matrix X, a vector numcut and a list xinfo.
usequants
    If usequants is FALSE, then the cutpoints in xinfo are generated uniformly; otherwise, if TRUE, then quantiles are used for the cutpoints.
type
    Determines which quantile algorithm is employed.
rm.const
    Whether or not to remove constant variables.
cont
    Whether or not to assume all variables are continuous.
xinfo
    You can provide the cutpoints to BART or let BART choose them for you. To provide them, use the xinfo argument to specify a list (matrix) where the items (rows) are the covariates and the contents of the items (columns) are the cutpoints.

seealso
-------
class.ind

examples
--------
    set.seed(99)
    a <- rbinom(10, 4, 0.4)
    table(a)
    x <- runif(10)
    df <- data.frame(a=factor(a), x=x)
    b <- bartModelMatrix(df)
    b
    b <- bartModelMatrix(df, numcut=9)
    b
    b <- bartModelMatrix(df, numcut=9, usequants=TRUE)
    b
    f <- bartModelMatrix(as.character(a))
- X: Float64[ndarray, 'N p']
Design matrix, with vectors and data frames coerced to numeric and factors expanded to indicators.
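To make "factors expanded to indicators" concrete, here is a minimal NumPy sketch of one-column-per-level expansion. The helper name and the sample data are hypothetical. It is also an assumption that every level gets its own column and that the columns follow sorted level order; the page does not say how BART3 orders or prunes them.

```python
import numpy as np


def indicator_columns(labels):
    """Return (levels, matrix) with one 0/1 column per distinct label.

    Hypothetical helper illustrating the expansion; not the BART3 code.
    """
    levels, codes = np.unique(labels, return_inverse=True)
    # Row i of the identity matrix is the indicator vector for level i.
    return levels, np.eye(len(levels))[codes]


a = np.array(["1", "0", "2", "1"])      # a factor-like column
x = np.array([0.3, 0.7, 0.1, 0.9])      # a continuous column

levels, ind = indicator_columns(a)
design = np.column_stack([ind, x])      # indicators first, then x
print(levels)
print(design)
```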
- numcut: Int32[ndarray, 'p']
Number of cutpoints chosen per column.
- rm_const: Int32[ndarray, '<=p']
0-based indices of the non-constant columns of the expanded design.
The indices refer to the columns of X before removal: rm_const=True removes the constant columns from X, numcut and xinfo, while the default only detects them.
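The relationship between such an index vector and the design matrix can be illustrated as follows. This is a sketch under assumptions: the helper is hypothetical, and detecting constant columns by comparing the column minimum and maximum is a stand-in for whatever criterion BART3 applies internally.

```python
import numpy as np


def nonconstant_columns(X):
    """0-based indices of columns holding more than one distinct value.

    Hypothetical helper; the min/max test is an assumed criterion.
    """
    return np.flatnonzero(X.max(axis=0) > X.min(axis=0))


X = np.array([[1.0, 5.0, 0.2],
              [1.0, 6.0, 0.4],
              [1.0, 7.0, 0.9]])

keep = nonconstant_columns(X)  # column 0 is constant, so it is left out
X_reduced = X[:, keep]         # what removing the constant columns would yield
print(keep)
print(X_reduced.shape)
```

Because the indices are expressed in terms of the columns before removal, the same vector can be used to subset any other array that is aligned with the original columns.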
- xinfo: Float64[ndarray, 'p numcut']
Per-column cutpoint grid, NaN-padded to the maximum cut count.
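A NaN-padded grid of this kind can be turned back into one cutpoint array per column by slicing each row to its own length. The values below are made up to stand in for the xinfo and numcut attributes, and the sketch assumes the padding is trailing (valid cutpoints first, NaN afterwards).

```python
import numpy as np

# Hypothetical stand-ins for the `xinfo` and `numcut` attributes:
# column 0 has one cutpoint, column 1 has three.
xinfo = np.array([[0.50, np.nan, np.nan],
                  [0.25, 0.50, 0.75]])
numcut = np.array([1, 3], dtype=np.int32)

# Slice each row to its own cut count, dropping the NaN padding
# (assumes the padding sits at the end of each row).
cuts = [row[:n] for row, n in zip(xinfo, numcut)]

# Consistency check: the number of non-NaN entries per row matches numcut.
valid_counts = (~np.isnan(xinfo)).sum(axis=1)
print(cuts)
print(valid_counts)
```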