compute_distribution_variance(x, x_type, categorical_mode='multinomial')

Computes the variance (or variances) of a given distribution, based on the type of its underlying data. Supports binary, date-like, numerical and categorical data.

Parameters

  • x (pd.Series) – The series representing the distribution for which the variance will be calculated.

  • x_type (str) – The underlying type of the target attribute distribution, passed explicitly to avoid errors caused by very specific grouping.

  • categorical_mode (str, optional) – Selects the method used for computing the variance of categorical (and, implicitly, binary) series. Can be “square”, “entropy”, which will use the mode, or “multinomial”, which returns the probability of each variable occurring. Defaults to “multinomial”.


Returns

The variance (or variances, for example when a categorical distribution is treated as multinomial) of the given distribution.

Return type

Union[float, pd.Series]
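
The two return shapes can be illustrated with a minimal sketch. It is not this library's implementation: the function name `sketch_distribution_variance` and the `x_type` strings "continuous", "categorical" and "binary" are hypothetical, chosen only for illustration. It assumes a numerical series yields a single sample variance, and that a categorical series treated as a one-trial multinomial yields one variance per category, p * (1 - p).

```python
import pandas as pd


def sketch_distribution_variance(x: pd.Series, x_type: str):
    """Hypothetical stand-in showing the float / pd.Series return shapes."""
    if x_type == "continuous":
        # Sample variance of a numerical series (pandas defaults to ddof=1).
        return float(x.var())
    if x_type in ("categorical", "binary"):
        # One-trial multinomial view: each category i has Var = p_i * (1 - p_i),
        # where p_i is the observed relative frequency of that category.
        p = x.value_counts(normalize=True)
        return p * (1 - p)
    # The x_type labels handled here are assumptions, not the library's own.
    raise ValueError(f"Unsupported x_type: {x_type!r}")


# A numerical series gives one float.
numbers = pd.Series([1.0, 2.0, 3.0, 4.0])
print(sketch_distribution_variance(numbers, "continuous"))

# A categorical series gives a pd.Series indexed by category.
colors = pd.Series(["a", "a", "b", "c"])
print(sketch_distribution_variance(colors, "categorical"))
```

Passing the type explicitly, as `x_type` does in the documented signature, avoids having to infer it from a series that may contain only a handful of rows after grouping.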