fairlens.plot.two_column_heatmap

two_column_heatmap(df, num_num_metric=<function pearson>, cat_num_metric=<function kruskal_wallis>, cat_cat_metric=<function cramers_v>, columns_x=None, columns_y=None)

Creates a correlation heatmap from a dataframe, using user-provided or default correlation metrics for each type of column pair (numerical-numerical, categorical-numerical, and categorical-categorical).

Parameters
  • df (pd.DataFrame) – The dataframe used for computing correlations and producing a heatmap.

  • num_num_metric (Callable[[pd.Series, pd.Series], float], optional) – The correlation metric used for numerical-numerical series pairs. Defaults to Pearson’s correlation coefficient.

  • cat_num_metric (Callable[[pd.Series, pd.Series], float], optional) – The correlation metric used for categorical-numerical series pairs. Defaults to the Kruskal-Wallis H test.

  • cat_cat_metric (Callable[[pd.Series, pd.Series], float], optional) – The correlation metric used for categorical-categorical series pairs. Defaults to the corrected Cramér’s V statistic.

  • columns_x (Optional[List[str]]) – The sensitive dataframe column names that will be used in generating the correlation heatmap.

  • columns_y (Optional[List[str]]) – The non-sensitive dataframe column names that will be used in generating the correlation heatmap.
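
Example

The following is a minimal sketch of how a custom metric could be supplied through num_num_metric, not an official usage example. The names abs_pearson and plot_with_custom_metric are hypothetical helpers introduced here for illustration. The metric only needs to accept two series and return a float, as the parameter types above describe. This version assumes the inputs contain no missing values and reports 0.0 for a constant column, which is a choice made for this sketch.

```python
import math


def abs_pearson(sr_a, sr_b):
    """Hypothetical custom metric: absolute Pearson correlation, in [0, 1].

    Works on any two equal-length numeric sequences, including pd.Series.
    Assumes there are no missing values in either input.
    """
    xs = [float(v) for v in sr_a]
    ys = [float(v) for v in sr_b]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Unnormalised covariance and variances (the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; this sketch returns 0.0.
        return 0.0
    return abs(cov) / math.sqrt(var_x * var_y)


def plot_with_custom_metric(df, sensitive_cols, other_cols):
    """Hypothetical wrapper: draw the heatmap using abs_pearson.

    Requires fairlens to be installed; df is a pd.DataFrame supplied by the caller.
    """
    from fairlens.plot import two_column_heatmap

    two_column_heatmap(
        df,
        num_num_metric=abs_pearson,  # replaces the default Pearson metric
        columns_x=sensitive_cols,    # sensitive column names
        columns_y=other_cols,        # non-sensitive column names
    )
```

Calling plot_with_custom_metric(df, ["age"], ["income"]) on a dataframe that has those columns would pass abs_pearson for numerical-numerical pairs and keep the default metrics for the other pair types. Depending on the environment, displaying the figure may require an explicit call such as matplotlib's plt.show(); that is an assumption about the plotting backend rather than something stated above.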