rbartpackages.dbarts.bart

class rbartpackages.dbarts.bart(x_train, y_train, x_test=None, *, sigest=None, sigdf=3.0, sigquant=0.9, k=2.0, power=2.0, base=0.95, splitprobs=None, binaryOffset=0.0, weights=None, ntree=200, ndpost=1000, nskip=100, printevery=100, keepevery=1, keeptrainfits=True, usequants=False, numcut=100, printcutoffs=0, verbose=True, nchain=1, nthread=1, combinechains=True, keeptrees=False, keepcall=True, sampleronly=False, seed=None, proposalprobs=None, keepsampler=None)[source]

Fit BART to continuous or binary outcomes (matrix interface).

Python interface to R’s dbarts::bart. The named numeric vector forms of splitprobs and proposalprobs are given as dictionaries in Python; a named splitprobs requires x_train to have column names (e.g. a data frame). Arguments left as None are omitted from the R call, so R computes its own defaults, described below.

In the attribute shapes below, ndpost counts the kept draws (ndpost / keepevery). Multiple chains are stacked into the ndpost axis when combined (combinechains=True, the bart default), and add a leading nchain axis otherwise.
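The shape rules above can be sketched as a small helper. This is an illustration only: the function name draws_shape is hypothetical, it assumes ndpost is a multiple of keepevery and that ndpost counts draws per chain, and it follows the R documentation in collapsing the chain axis when nchain is one.

```python
def draws_shape(n_points, *, ndpost=1000, keepevery=1, nchain=1,
                combinechains=True):
    """Expected shape of a draws array such as yhat_train (hypothetical helper)."""
    # Draws kept per chain after thinning; assumes ndpost % keepevery == 0.
    per_chain = ndpost // keepevery
    if combinechains or nchain == 1:
        # Chains are stacked along the draws axis.
        return (nchain * per_chain, n_points)
    # Otherwise a leading chain axis is kept.
    return (nchain, per_chain, n_points)


print(draws_shape(100, ndpost=1000, keepevery=5, nchain=4))
print(draws_shape(100, ndpost=1000, keepevery=5, nchain=4, combinechains=False))
```

Comparing these tuples with the shape of a returned array is a quick sanity check on the thinning and chain settings.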

Parameters:
  • x_train (Float64[ndarray, 'n p'] | DataFrame) – Training predictors; rows are observations. A data frame’s factor columns are expanded into indicator columns.

  • y_train (Float64[ndarray, 'n']) – Training response: continuous, or binary coded as 0/1 (which switches to a probit fit).

  • x_test (Float64[ndarray, 'm p'] | DataFrame | None, default: None) – Test predictors, with the same column structure as x_train.

  • sigest (float | None, default: None) – Rough estimate of the error SD anchoring the sigma prior; default the least-squares estimate. Continuous only.

  • sigdf (float, default: 3.0) – Degrees of freedom of the (inverse-chi-squared) sigma prior. Continuous only.

  • sigquant (float, default: 0.9) – Quantile of the sigma prior placed at sigest; closer to 1 puts more prior weight below sigest. Continuous only.

  • k (float, default: 2.0) – Number of prior SDs between f and the data extremes (+/-0.5 of the rescaled y for continuous, +/-3 on the latent scale for binary); bigger is more conservative. Can also be a chi hyperprior.

  • power (float, default: 2.0) – Exponent of the tree depth prior.

  • base (float, default: 0.95) – Scale of the tree depth prior.

  • splitprobs (dict[str, float] | Float64[ndarray, 'p'] | None, default: None) – Prior split probabilities of the variables; a dict mapping a subset of the column names to values plus a '.default' entry, or one value each. Uniform by default.

  • binaryOffset (float, default: 0.0) – Latent-scale offset for binary outcomes; P(Y = 1 | x) = Phi(f(x) + binaryOffset).

  • weights (Float64[ndarray, 'n'] | None, default: None) – Per-observation weights; the model becomes y | x ~ N(f(x), sigma^2 / w).

  • ntree (int, default: 200) – Number of trees in the sum.

  • ndpost (int, default: 1000) – Number of posterior draws; ndpost / keepevery are returned.

  • nskip (int, default: 100) – Number of burn-in iterations.

  • printevery (int, default: 100) – Interval, in draws, of the progress messages.

  • keepevery (int, default: 1) – Thinning: keep one draw out of keepevery.

  • keeptrainfits (bool, default: True) – Whether to return the training-point function draws.

  • usequants (bool, default: False) – Whether the decision rules use empirical quantiles of each predictor rather than a uniform grid over its range.

  • numcut (int | Integer[ndarray, 'p'], default: 100) – Maximum number of decision rules per predictor (a scalar recycled, or one each).

  • printcutoffs (int, default: 0) – Number of a variable’s decision rules printed before the run.

  • verbose (bool, default: True) – Whether to print progress to the R console.

  • nchain (int, default: 1) – Number of independent chains.

  • nthread (int, default: 1) – Number of threads to use.

  • combinechains (bool, default: True) – Whether the chains are stacked into the draws axis rather than kept on a leading nchain axis.

  • keeptrees (bool, default: False) – Whether the trees are kept, which predict, extract, and tree extraction require; memory-intensive.

  • keepcall (bool, default: True) – Whether the originating R call is stored in call.

  • sampleronly (bool, default: False) – Whether to build and return the underlying dbarts sampler without running it (changing the return type, so unsupported here).

  • seed (int | None, default: None) – Seed of the chains’ RNG; None (R’s NA) seeds from the clock when multi-threaded. Single-threaded, seed R with set.seed.

  • proposalprobs (dict[str, float] | Float64[ndarray, '4'] | None, default: None) – Tree-proposal probabilities, as a dict with keys 'birth_death', 'change', 'swap', and 'birth'.

  • keepsampler (bool | None, default: None) – Whether to keep the underlying sampler even without keeptrees; default keeptrees.

R documentation

title
-----

Bayesian Additive Regression Trees

name
----

bart

alias
-----

residuals.bart

keyword
-------

nonlinear

description
-----------

 BART is a Bayesian  sum-of-trees  model in which each tree is constrained by a prior to be a weak learner.


     For numeric response  y ,  y = f(x) + \epsilon , where  \epsilon \sim N(0, \sigma^2) .
     For binary response  y ,  P(Y = 1 \mid x) = \Phi(f(x)) , where  \Phi  denotes the standard normal cdf (probit link).



usage
-----


 bart(
     x.train, y.train, x.test = matrix(0.0, 0, 0),
     sigest = NA, sigdf = 3, sigquant = 0.90,
     k = 2.0,
     power = 2.0, base = 0.95, splitprobs = 1 / numvars,
     binaryOffset = 0.0, weights = NULL,
     ntree = 200,
     ndpost = 1000, nskip = 100,
     printevery = 100, keepevery = 1, keeptrainfits = TRUE,
     usequants = FALSE, numcut = 100, printcutoffs = 0,
     verbose = TRUE, nchain = 1, nthread = 1, combinechains = TRUE,
     keeptrees = FALSE, keepcall = TRUE, sampleronly = FALSE,
     seed = NA_integer_,
     proposalprobs = NULL,
     keepsampler = keeptrees)

 bart2(
     formula, data, test, subset, weights, offset, offset.test = offset,
     sigest = NA_real_, sigdf = 3.0, sigquant = 0.90,
     k = NULL,
     power = 2.0, base = 0.95, split.probs = 1 / num.vars,
     n.trees = 75L,
     n.samples = 500L, n.burn = 500L,
     n.chains = 4L, n.threads = min(dbarts::guessNumCores(), n.chains),
     combineChains = FALSE,
     n.cuts = 100L, useQuantiles = FALSE,
     n.thin = 1L, keepTrainingFits = TRUE,
     printEvery = 100L, printCutoffs = 0L,
     verbose = TRUE, keepTrees = FALSE,
     keepCall = TRUE, samplerOnly = FALSE,
     seed = NA_integer_,
     proposal.probs = NULL,
     keepSampler = keepTrees,
     ...)

 ## S3 method for class 'bart'
 plot(
     x,
     plquants = c(0.05, 0.95), cols = c('blue', 'black'),
     ...)

 ## S3 method for class 'bart'
 predict(
     object, newdata, offset, weights,
     type = c("ev", "ppd", "bart"),
     combineChains = TRUE,
     n.threads,
     ...)

 extract(object, ...)
 ## S3 method for class 'bart'
 extract(
     object,
     type = c("ev", "ppd", "bart", "trees"),
     sample = c("train", "test"),
     combineChains = TRUE, ...)

 ## S3 method for class 'bart'
 fitted(
     object,
     type = c("ev", "ppd", "bart"),
     sample = c("train", "test"),
     ...)

 ## S3 method for class 'bart'
 residuals(object, ...)


arguments
---------


     x.train
      Explanatory variables for training (in sample) data. May be a matrix or a data frame, with rows corresponding to observations and columns to variables. If a variable is a factor in a data frame, it is replaced with dummies. Note that  q  dummies are created if  q > 2  and one dummy is created if  q = 2 , where  q  is the number of levels of the factor.

     y.train
      Dependent variable for training (in sample) data. If  y.train  is numeric, a continuous response model is fit (normal errors). If  y.train  is a binary factor or has only values 0 and 1, then a binary response model with a probit link is fit.

     x.test
      Explanatory variables for test (out of sample) data. Should have same column structure as  x.train .  bart  will generate draws of  f(x)  for each  x  which is a row of  x.test .

     sigest
      For continuous response models, a rough estimate of the error standard deviation,  \sigma , used to calibrate the inverse-chi-squared prior placed on the error variance. If not supplied, the least-squares estimate is used instead. See  sigquant  for more information. Not applicable when  y  is binary.

     sigdf
      Degrees of freedom for error variance prior. Not applicable when  y  is binary.

     sigquant
      The quantile of the error variance prior that the rough estimate ( sigest ) is placed at. The closer the quantile is to 1, the more aggressive the fit will be, as you are putting more prior weight on error standard deviations ( \sigma ) less than the rough estimate. Not applicable when  y  is binary.

     k
      For numeric  y ,  k  is the number of prior standard deviations  E(Y|x) = f(x)  is away from  \pm 0.5 . The response ( y.train ) is internally scaled to range from  -0.5  to  0.5 . For binary  y ,  k  is the number of prior standard deviations  f(x)  is away from  \pm 3 . In both cases, the bigger  k  is, the more conservative the fitting will be. The value can be either a fixed number or a  hyperprior  of the form  chi(degreesOfFreedom = 1.25, scale = Inf) . For  bart2 , the default of  NULL  uses the value 2 for continuous responses and a  chi  hyperprior for binary ones. The default  chi  hyperprior is improper, and slightly penalizes small values of  k .

     power
      Power parameter for tree prior.

     base
      Base parameter for tree prior.

     splitprobs, split.probs
      Prior and transition probabilities of variables used to generate splits. Can be missing/empty/ NULL  for equiprobability, a numeric vector of length equal to the number of variables, or a named numeric vector with only a subset of the variables specified and a  .default  named value. Values given for factor variables are replicated for each resulting column in the generated model matrix. The  numvars  and  num.vars  symbols will be rebound before execution to the number of columns in the model matrix.

     binaryOffset
      Used for binary  y . When present, the model is  P(Y = 1 \mid x) = \Phi(f(x) + \mathrm{binaryOffset}) , allowing fits with probabilities shrunk towards values other than  0.5 .

     weights
       An optional vector of weights to be used in the fitting process. When present, BART fits a model with observations  y \mid x \sim N(f(x), \sigma^2 / w) , where  f(x)  is the unknown function.

     ntree, n.trees
      The number of trees in the sum-of-trees formulation.

     ndpost, n.samples
      The number of posterior draws after burn-in;  ndpost / keepevery  draws will actually be returned.

     nskip, n.burn
      Number of MCMC iterations to be treated as burn in.

     printevery, printEvery
      As the MCMC runs, a message is printed every  printevery  draws.

     keepevery, n.thin
      Every  keepevery -th draw is kept to be returned to the user. Useful for  thinning  samples.

     keeptrainfits, keepTrainingFits
      If  TRUE  the draws of  f(x)  for  x  corresponding to the rows of  x.train  are returned.

     usequants, useQuantiles
      When  TRUE , determine tree decision rules using estimated quantiles derived from the  x.train  variables. When  FALSE , splits are determined using values equally spaced across the range of a variable. See details for more information.

     numcut, n.cuts
      The maximum number of possible values used in decision rules (see  usequants , details). If a single number, it is recycled for all variables; otherwise must be a vector of length equal to  ncol(x.train) . Fewer rules may be used if a covariate lacks enough unique values.

     printcutoffs, printCutoffs
      The number of cutoff rules to be printed to screen before the MCMC is run. Given a single integer, the same value will be used for all variables. If 0, nothing is printed.

     verbose
      Logical; if  FALSE , suppress printing.

     nchain, n.chains
      Integer specifying how many independent tree sets and fits should be calculated.

     nthread, n.threads
      Integer specifying how many threads to use. Depending on the CPU architecture, using more than the number of chains can degrade performance for small/medium data sets. As such some calculations may be executed single threaded regardless.

     combinechains, combineChains
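      Logical; if  TRUE , the chains are combined and samples are returned in matrices with  nchain   \times   ndpost  rows and one column per observation. If  FALSE , samples are returned in arrays of dimensions equal to  nchain   \times   ndpost   \times  number of observations.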

     keeptrees, keepTrees
      Logical; must be  TRUE  in order to use  predict  with the result of a  bart  fit. Note that for models with a large number of observations or a large number of trees, keeping the trees can be very memory intensive.

     keepcall, keepCall
      Logical; if  FALSE , returned object will have  call  set to  call("NULL") , otherwise the call used to instantiate BART.

     seed
      Optional integer specifying the desired pRNG  seed . It should not be needed when running single-threaded, where  set.seed  will suffice; when multi-threaded, it can be used to obtain reproducible results. See the Reproducibility section below.

     proposalprobs, proposal.probs
      Named numeric vector or  NULL , optionally specifying the proposal rules and their probabilities. Elements should be  "birth_death" ,  "change" , and  "swap"  to control tree change proposals, and  "birth"  to give the relative frequency of birth/death in the  "birth_death"  step. Defaults are 0.5, 0.1, 0.4, and 0.5 respectively.

     keepsampler, keepSampler
      Logical that can be used to save the underlying  dbartsSampler-class  object even if  keepTrees  is false.

     formula
      The same as  x.train , the name reflecting that a formula object can be used instead.

     data
      The same as  y.train , the name reflecting that a data frame can be specified when a formula is given instead.

     test
      The same as  x.test . Can be missing.

     subset
      A vector of logicals or indices used to subset the data. Can be missing.

     offset
      The same as  binaryOffset . Can be missing.

     offset.test
      A vector of offsets to be used with test data, in case it is different from the training offset. If  offset  is missing, defaults to  NULL .

     object
      An object of class  bart , returned from either the function  bart  or  bart2 .

     newdata
      Test data for prediction. Obeys all the same rules as  x.train  but cannot be missing.

     sampleronly, samplerOnly
      Builds the sampler from its arguments and returns it without running it. Useful to use the  bart2  interface in more complicated models.

     x
      Object of class  bart , returned by function  bart , which contains the information to be plotted.

     plquants
      In the plots, beliefs about  f(x)  are indicated by plotting the posterior median and a lower and upper quantile.  plquants  is a double vector of length two giving the lower and upper quantiles.

     cols
      Vector of two colors. First color is used to plot the median of  f(x)  and the second color is used to plot the lower and upper quantiles.

     type
      The quantity to be returned by generic functions. Options are  "ev"  - samples from the posterior of the individual level expected value,  "bart"  - the sum of trees component; same as  "ev"  for linear models but on the probit scale for binary ones,  "ppd"  - samples from the posterior predictive distribution, and  "trees"  - a data frame with tree information for when model was fit with  keepTrees  equal to  TRUE . To synergize with  predict.glm ,  "response"  can be used as a synonym for  "ev"  and  "link"  can be used as a synonym for  "bart" . For information on extracting trees, see the subsection below.

     sample
      Either  "train"  or  "test" .


     ...
      Additional arguments passed on to  plot ,  dbartsControl , or  extract  when  type  is  "trees" . Not used in  predict .



details
-------


   BART is a Bayesian MCMC method. At each MCMC iteration, we produce a draw from the joint posterior  (f, \sigma) \mid (x, y)  in the numeric  y  case and just  f  in the binary  y  case.

   Thus, unlike a lot of other modeling methods in R,  bart  does not produce a single model object from which fits and summaries may be extracted. The output consists of values  f^*(x)  (and  \sigma^*  in the numeric case) where * denotes a particular draw. The  x  is either a row from the training data ( x.train ) or the test data ( x.test ).

    Decision Rules
     Decision rules for any tree are of the form  x \le c   vs.  x > c  for each  x  corresponding to a column of  x.train .  usequants  determines the means by which the set of possible  c  is determined. If  usequants  is  TRUE , then the  c  are a subset of the values interpolated half-way between the unique, sorted values obtained from the corresponding column of  x.train . If  usequants  is  FALSE , the cutoffs are equally spaced across the range of values taken on by the corresponding column of  x.train .

     The number of possible values of  c  is determined by  numcut . If  usequants  is  FALSE ,  numcut  equally spaced cutoffs are used covering the range of values in the corresponding column of  x.train . If  usequants  is  TRUE , then for a variable the minimum of  numcut  and one less than the number of unique elements for that variable are used.
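The two cutoff schemes can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions rather than documented behavior: that the equally spaced grid uses interior points of the range, and that the quantile scheme thins the midpoints at evenly spread positions when there are more than numcut of them.

```python
def uniform_cutoffs(values, numcut):
    """numcut equally spaced cutoffs across the range (usequants=False)."""
    lo, hi = min(values), max(values)
    # Assumption: interior points only, so neither endpoint is a cutoff.
    step = (hi - lo) / (numcut + 1)
    return [lo + step * (i + 1) for i in range(numcut)]


def quantile_cutoffs(values, numcut):
    """Midpoints between sorted unique values (usequants=True)."""
    uniq = sorted(set(values))
    mids = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    if len(mids) <= numcut:
        # At most (number of unique values - 1) cutoffs exist.
        return mids
    if numcut == 1:
        return [mids[len(mids) // 2]]
    # Assumption: keep numcut midpoints at evenly spread positions.
    last = len(mids) - 1
    return [mids[i * last // (numcut - 1)] for i in range(numcut)]


print(uniform_cutoffs([0.0, 1.0], 3))
print(quantile_cutoffs([1, 2, 2, 4], 100))
```

In both cases the number of cutoffs matches the rule stated above: numcut for the grid, and the minimum of numcut and one less than the number of unique values for the quantile scheme.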

    End-node prior parameter  k
     The amount of shrinkage of the node parameters is controlled by  k .  k  can be given as either a fixed, positive number, or as any value that can be used to build a supported hyperprior. At present, only  \chi_\nu s  priors are supported, where  \nu  is a degrees of freedom and  s  is a scale. Both values must be positive; however, the scale can be infinite, which yields an improper prior that is interpreted as just the polynomial part of the distribution. If  \nu  is 1 and  s  is  \infty , the prior is  flat .

     For BART on binary outcomes, the degree of overfitting can be highly sensitive to  k , so it is encouraged to consider a number of values. The default hyperprior for binary BART,  chi(1.25, Inf) , has been shown to work well in a large number of datasets; however, cross-validation may be helpful. Running for a short time with a flat prior may be helpful to see the range of values of  k  that are consistent with the data.
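To make the shrinkage concrete, the following sketch converts a fixed k into the prior standard deviation of a single leaf value. It assumes the standard BART prior, in which f(x) is a sum of ntree independent leaf values whose total prior standard deviation is 0.5 / k for a rescaled numeric response and 3 / k for a binary one; leaf_prior_sd is a hypothetical name, not part of the package.

```python
import math


def leaf_prior_sd(k, ntree=200, binary=False):
    """Prior SD of one leaf value under the assumed standard BART prior."""
    # Half-width that k prior SDs of f(x) are meant to span.
    half_range = 3.0 if binary else 0.5
    # Independent leaves add in variance, hence the sqrt(ntree) factor.
    return half_range / (k * math.sqrt(ntree))


for k in (1.0, 2.0, 4.0):
    print(k, leaf_prior_sd(k), leaf_prior_sd(k, binary=True))
```

Doubling k halves the leaf standard deviation, which is why larger values give a more conservative fit.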

    Generics
      bart  and  rbart_vi  support  fitted  to return the posterior mean of a predicted quantity, as well as  predict  to return a set of posterior samples for a different sample. In addition, the  extract  generic can be used to obtain the posterior samples for the training data or test data supplied during the initial fit.

     Using  predict  with a  bart  object requires that it be fitted with the option  keeptrees / keepTrees  as  TRUE . Keeping the trees for a fit can require a sizeable amount of memory and is off by default.

     All generics return values on the scale of expected value of the response by default. This means that  predict ,  extract , and  fitted  for binary outcomes return probabilities unless specifically the sum-of-trees component is requested ( type = "bart" ). This is in contrast to  yhat.train / yhat.test  that are returned with the fitted model.


    Saving
      Saving and loading fitted BART objects (with  save  and  load ) for use with  predict  requires that R's serialization mechanism be able to access the underlying trees, in addition to the model being fit with  keeptrees / keepTrees  as  TRUE . For memory purposes, the trees are not stored as R objects unless specifically requested. To do this, one must  touch  the sampler's state object before saving, e.g. for a fitted object  bartFit , execute  invisible(bartFit$fit$state) .


    Reproducibility
     Behavior differs when running multi- and single-threaded, as the pseudo random number generators (pRNG) used by R are not thread safe. When single-threaded, R's built-in generator is used; if set at the start, the global  .Random.seed  will be used and its value updated as samples are drawn. When multi-threaded, the default behavior is to draw new random seeds for each thread using the clock and use thread-specific pRNGs.

     This behavior can be modified by setting  seed , or by using  ...  to pass arguments to  dbartsControl . For the single-threaded case, a new pRNG is built using that seed that is separate from R's native generator. As such, the global state will not be modified by subsequent calls to the generator. For multi-threaded, the seeds for threads are drawn sequentially using the supplied seed, and will again be separate from R's native generator.

     Consequently, the  seed  argument is not needed when running single-threaded -  set.seed  will suffice. However, when multi-threaded the  seed  argument can be used to obtain reproducible results.


    Extracting Trees
     When a model is fit with  keeptrees  ( bart ) or  keepTrees  ( bart2 ) equal to  TRUE , the generic  extract  can be used to retrieve a data frame containing the tree fit information. In this case,  extract  will accept the additional, optional arguments:  chainNums ,  sampleNums , and  treeNums . Each should be an integer vector detailing the desired trees to be returned.

     The result of  extract  will be a data frame with columns:

          sample ,  chain ,  tree  - index variables
          n  - number of observations in node
          var  - either the index of the variable used for splitting or -1 if the node is a leaf
          value  - either the value such that observations less than or equal to it are sent down the left path of the tree or the predicted value for a leaf node

     The order of nodes in the result corresponds to a depth-first traversal, going down the left-side first. The names of variables used in splitting can be recovered by examining the column names of the  fit$data@x  element of a fitted  bart  or  bart2  model. See the package vignette  Working with dbarts Saved Trees .
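The depth-first, left-first node order means a single tree can be rebuilt from its rows with a simple recursive reader. The sketch below works on plain (var, value) pairs for one tree from one sample and chain; the names are hypothetical, and it assumes var is a 1-based column index, as is usual for values coming from R.

```python
def build_tree(rows):
    """Rebuild a nested tree from (var, value) rows in depth-first order."""
    it = iter(rows)

    def parse():
        var, value = next(it)
        if var == -1:
            # Leaf: value is the predicted contribution of this node.
            return {"leaf": value}
        left = parse()   # the left subtree comes first in the row order
        right = parse()
        return {"var": var, "cut": value, "left": left, "right": right}

    return parse()


def predict_tree(tree, x):
    """Send one observation down the tree and return its leaf value."""
    while "leaf" not in tree:
        # Assumption: var is 1-based; observations <= cut go left.
        go_left = x[tree["var"] - 1] <= tree["cut"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]


rows = [(1, 0.5), (-1, -0.1), (2, 0.3), (-1, 0.0), (-1, 0.2)]
tree = build_tree(rows)
print(predict_tree(tree, [0.4, 0.9]), predict_tree(tree, [0.8, 0.9]))
```

Summing such leaf values over all trees of one draw gives that draw's sum-of-trees value on the internal scale, which need not match the scale of the returned fits.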



value
-----


    bart  and  bart2  return lists assigned the class  bart . For applicable quantities,  ndpost / keepevery  samples are returned. In the numeric  y  case, the list has components:

    yhat.train
     An array/matrix of posterior samples. The  (i, j, k)  value is the  j -th draw of the posterior of  f  evaluated at the  k -th row of  x.train  (i.e.  f^*(x_k) ) corresponding to chain  i . When  nchain  is one or  combinechains  is  TRUE , the result is collapsed down to a matrix.

    yhat.test
     Same as  yhat.train  but now the  x s are the rows of the test data.

    yhat.train.mean
     Vector of means of  yhat.train  across columns and chains, with length equal to the number of training observations.

    yhat.test.mean
     Vector of means of  yhat.test  across columns and chains.

    sigma
     Posterior samples of  sigma , the residual/error standard deviation. A matrix with dimensions equal to the number of chains times the number of samples, collapsed to a vector when  nchain  is one or  combinechains  is  TRUE .

    first.sigma
     Burn-in draws of  sigma .

    varcount
     A matrix with number of rows equal to the number of kept draws and each column corresponding to a training variable. Contains the total count of the number of times that variable is used in a tree decision rule (over all trees).

    sigest
     The rough error standard deviation ( \sigma ) used in the prior.

    y
     The input vector of values for the dependent variable. This is used in  plot.bart .

    fit
     Optional sampler object which stores the values of the tree splits. Required for using  predict  and only stored if  keeptrees  or  keepsampler  is  TRUE .

    n.chains
     Information that can be lost if  combinechains  is  TRUE  is tracked here.

    k
     Optional matrix of posterior samples of  k . Only present when  k  is modeled, i.e. there is a hyperprior.

    first.k
     Burn-in draws of  k , if modeled.


   In the binary  y  case, the returned list has the components  yhat.train ,  yhat.test , and  varcount  as above.  In addition the list has a  binaryOffset  component giving the value used.

   Note that in the binary  y  case,  yhat.train  and  yhat.test  are  f(x) + \mathrm{binaryOffset} . For draws of the probability  P(Y = 1 \mid x) , apply the normal cdf ( pnorm ) to these values.
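In Python, the same conversion can be sketched with the standard library's normal cdf standing in for pnorm. The draws are assumed to be a plain nested list with one row per draw and one column per observation, already including the offset; the function names are hypothetical.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cdf, the analogue of R's pnorm


def latent_to_probability(draws):
    """Map latent draws of f(x) + binaryOffset to draws of P(Y = 1 | x)."""
    return [[_PHI(v) for v in row] for row in draws]


def posterior_mean_probability(draws):
    """Average the probability draws over the draws axis, per observation."""
    probs = latent_to_probability(draws)
    return [sum(col) / len(probs) for col in zip(*probs)]


latent = [[0.0, 1.0], [0.0, -1.0]]
print(latent_to_probability(latent))
print(posterior_mean_probability(latent))
```

The cdf is applied draw by draw before averaging; applying it to the posterior mean of the latent draws generally gives a different number.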

   The  plot  method sets  mfrow  to  c(1, 2)  and makes two plots. The first plot is the sequence of kept draws of  \sigma  including the burn-in draws. Initially these draws will decline as BART finds a good fit and then level off when the MCMC has burnt in. The second plot has  y  on the horizontal axis and posterior intervals for the corresponding  f(x)  on the vertical axis.


author
------


 Hugh Chipman:  hugh.chipman@gmail.com ,
 Robert McCulloch:  robert.mcculloch1@gmail.com ,
 Vincent Dorie:  vdorie@gmail.com .


references
----------


 Chipman, H., George, E., and McCulloch, R. (2009)
    BART: Bayesian Additive Regression Trees.

 Chipman, H., George, E., and McCulloch, R. (2006)
    Bayesian Ensemble Learning.
    Advances in Neural Information Processing Systems 19,
    Scholkopf, Platt and Hoffman, Eds., MIT Press, Cambridge, MA, 265-272.

 both of the above at:
 https://www.rob-mcculloch.org

 Friedman, J.H. (1991)
    Multivariate adaptive regression splines.
    The Annals of Statistics, 19, 1-67.


seealso
-------


 pdbart


examples
--------


 ## simulate data (example from Friedman MARS paper)
 ## y = f(x) + epsilon , epsilon ~ N(0, sigma)
 ## x consists of 10 variables, only first 5 matter

 f <- function(x) {
     10 * sin(pi * x[,1] * x[,2]) + 20 * (x[,3] - 0.5)^2 +
         10 * x[,4] + 5 * x[,5]
 }

 set.seed(99)
 sigma <- 1.0
 n     <- 100

 x  <- matrix(runif(n * 10), n, 10)
 Ey <- f(x)
 y  <- rnorm(n, Ey, sigma)

 ## run BART
 set.seed(99)
 bartFit <- bart(x, y)

 plot(bartFit)

 ## compare BART fit to linear model and truth = Ey
 lmFit <- lm(y ~ ., data.frame(x, y))

 fitmat <- cbind(y, Ey, lmFit$fitted, bartFit$yhat.train.mean)
 colnames(fitmat) <- c('y', 'Ey', 'lm', 'bart')
 print(cor(fitmat))

fit: dbarts | None = None

The sampler as a dbarts object, kept only with keeptrees or keepsampler.

binaryOffset: Float64[ndarray, 'n'] | None = None

Per-observation offset on the latent probit scale (binary outcomes only).

extract(*, type=None, sample=None, combineChains=None)[source]

Return the kept draws for the training (default) or test points.

Like predict, the draws are on the expected-value scale by default. With type='trees' (requires keeptrees=True) the tree structures are returned as a data frame instead. Arguments left as None are omitted from the R call.

Parameters:
  • type (Literal['ev', 'ppd', 'bart', 'trees'] | None, default: None) – Quantity returned: 'ev', 'ppd', 'bart' (see predict), or 'trees' for the tree structures.

  • sample (Literal['train', 'test'] | None, default: None) – Which points to extract: 'train' or 'test'.

  • combineChains (bool | None, default: None) – Whether the chains are stacked into the draws axis rather than kept on a leading nchain axis.

Returns:

Float64[ndarray, 'ndpost n'] | Float64[ndarray, 'nchain ndpost n'] | DataFrame – The draws at the requested points, or the tree-structure data frame with type='trees'.

first_k: Float64[ndarray, 'nskip'] | Float64[ndarray, 'nchain nskip'] | None = None

Burn-in draws of k (only when k is given a hyperprior).

first_sigma: Float64[ndarray, 'nskip'] | Float64[ndarray, 'nchain nskip'] | None = None

Burn-in error-SD draws (continuous outcomes only).

fitted(*, type=None, sample=None)[source]

Return the posterior mean for the training (default) or test points.

Parameters:
  • type (Literal['ev', 'ppd', 'bart'] | None, default: None) – Quantity averaged: 'ev', 'ppd', or 'bart' (see predict).

  • sample (Literal['train', 'test'] | None, default: None) – Which points to use: 'train' or 'test'.

Returns:

Float64[ndarray, 'n'] – The posterior mean at the requested points.

k: Float64[ndarray, 'ndpost'] | Float64[ndarray, 'nchain ndpost'] | None = None

End-node-prior k draws (only when k is given a hyperprior).

n_chains: int | None = None

Number of MCMC chains; None when the sampler is kept in fit.

sigest: float | None = None

Rough residual SD used to set the sigma prior (continuous outcomes only).

sigma: Float64[ndarray, 'ndpost'] | Float64[ndarray, 'nchain ndpost'] | None = None

Kept error-SD draws, continuous outcomes only (burn-in is in first_sigma).

y: Float64[ndarray, 'n'] | None = None

The training responses (continuous outcomes only).

yhat_test: Float64[ndarray, 'ndpost m'] | Float64[ndarray, 'nchain ndpost m'] | None = None

Test-point posterior function draws; None without test data.

yhat_test_mean: Float64[ndarray, 'm'] | None = None

Posterior mean of yhat_test (continuous outcomes with test data only).

yhat_train: Float64[ndarray, 'ndpost n'] | Float64[ndarray, 'nchain ndpost n'] | None = None

Training-point posterior function draws (latent probit scale for binary).

None with keeptrainfits=False.

yhat_train_mean: Float64[ndarray, 'n'] | None = None

Posterior mean of yhat_train (continuous outcomes only).

call: LangVector

The R call that created the fit.

With keepcall=False this is a dummy NULL() call, not None.

varcount: Int32[ndarray, 'ndpost p'] | Int32[ndarray, 'nchain ndpost p']

Per-draw count of splits on each variable, summed over trees.

predict(newdata, *, offset=None, weights=None, type=None, combineChains=None, n_threads=None)[source]

Compute predictions at new points; requires a keeptrees=True fit.

Arguments left as None are omitted from the R call, so R computes its own defaults, described below.

Parameters:
  • newdata (Float64[ndarray, 'm p'] | DataFrame) – New predictors, with the same column structure as x_train.

  • offset (Float64[ndarray, 'm'] | float | None, default: None) – Offset added to the predictions.

  • weights (Float64[ndarray, 'm'] | None, default: None) – Per-observation weights of the predictive distribution.

  • type (Literal['ev', 'ppd', 'bart'] | None, default: None) – Quantity returned: 'ev' (expected value, i.e. probabilities for binary fits), 'ppd' (posterior predictive draws), or 'bart' (the latent sum-of-trees).

  • combineChains (bool | None, default: None) – Whether the chains are stacked into the draws axis rather than kept on a leading nchain axis.

  • n_threads (int | None, default: None) – Number of threads to use.

Returns:

Float64[ndarray, 'ndpost m'] | Float64[ndarray, 'nchain ndpost m'] – The predictions at newdata, on the expected-value scale unless type='bart'.
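As a rough end-to-end illustration, the sketch below simulates the Friedman data used in the R example and then fits and predicts through this interface. The helpers friedman, simulate, and fit_and_predict are hypothetical names. The last one assumes that numpy, R, and the dbarts package are installed and that bart can be imported from rbartpackages.dbarts as the class path above suggests; it is defined but not called here, so the block runs without those dependencies.

```python
import math
import random


def friedman(row):
    """Friedman (1991) test function; only the first five inputs matter."""
    x1, x2, x3, x4, x5 = row[:5]
    return (10 * math.sin(math.pi * x1 * x2) + 20 * (x3 - 0.5) ** 2
            + 10 * x4 + 5 * x5)


def simulate(n=100, p=10, sigma=1.0, seed=99):
    """Draw uniform predictors and noisy Friedman responses."""
    rng = random.Random(seed)
    x = [[rng.random() for _ in range(p)] for _ in range(n)]
    y = [rng.gauss(friedman(row), sigma) for row in x]
    return x, y


def fit_and_predict(x, y, x_new):
    """Fit BART and predict at new rows (assumes numpy, R, and dbarts)."""
    import numpy as np
    from rbartpackages.dbarts import bart

    # keeptrees=True is required for predict, at the cost of more memory.
    fit = bart(np.asarray(x, dtype=float), np.asarray(y, dtype=float),
               keeptrees=True, verbose=False)
    draws = fit.predict(np.asarray(x_new, dtype=float), type="ev")
    return fit.yhat_train_mean, draws


x, y = simulate(n=100)
print(len(x), len(x[0]), len(y))
```

With the dependencies available, fit_and_predict(x, y, x[:5]) would return the training posterior means and an array of draws with one column per new row.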