fairlens.plot.distr_plot

distr_plot(df, target_attr, groups, distr_type=None, show_hist=None, show_curve=None, shade=True, normalize=False, cmap=None, ax=None)

Plot the distribution of the groups with respect to the target attribute.

Parameters

df (pd.DataFrame) – The input dataframe.
target_attr (str) – The target attribute.
groups (Sequence[Union[Mapping[str, List[Any]], pd.Series]]) – A list of groups of interest. Each group is either a mapping (dict) from attribute name to a list of accepted values, or a predicate, i.e. a boolean pandas Series that can be used to index the subgroup from the dataframe. Examples: {"Sex": ["Male"]}, df["Sex"] == "Female"
distr_type (Optional[str]) – The type of distribution of the target attribute. One of "categorical", "continuous", "binary", or "datetime". If None, the type of distribution is inferred from the data in the column. Defaults to None.
show_hist (Optional[bool], optional) – Shows the histogram if True. Defaults to True if the data is categorical or binary.
show_curve (Optional[bool], optional) – Shows a KDE if True. Defaults to True if the data is continuous or a date.
shade (bool, optional) – Shades the curve if True. Defaults to True.
normalize (bool, optional) – Normalizes the counts so the sum of the bar heights is 1. Defaults to False.
cmap (Optional[Sequence[Tuple[float, float, float]]], optional) – A sequence of RGB tuples used to colour the histograms. If None, seaborn's default palette is used. Defaults to None.
ax (Optional[matplotlib.axes.Axes], optional) – An axis to plot the figure on. If None, plt.gca() is used. Defaults to None.
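
The two accepted forms of a group, and the effect of normalize, can be illustrated with plain pandas. The sketch below is not fairlens's implementation: the helper `group_to_mask`, the toy dataframe, and its column names are hypothetical and exist only to show that a mapping such as {"Sex": ["Female"]} and a predicate such as df["Sex"] == "Female" select the same rows, and that normalized counts within a group sum to 1.

```python
import pandas as pd

# Toy data; the column names and values are illustrative only.
df = pd.DataFrame({
    "Sex": ["Male", "Female", "Female", "Male", "Female"],
    "Score": ["low", "high", "low", "low", "high"],
})


def group_to_mask(df, group):
    """Hypothetical helper (not part of fairlens): turn a group into a boolean mask."""
    if isinstance(group, pd.Series):
        return group  # already a boolean predicate
    mask = pd.Series(True, index=df.index)
    for attr, values in group.items():
        # A row belongs to the group if its value is among the accepted ones.
        mask = mask & df[attr].isin(values)
    return mask


mapping_group = {"Sex": ["Female"]}
predicate_group = df["Sex"] == "Female"

mask_a = group_to_mask(df, mapping_group)
mask_b = group_to_mask(df, predicate_group)

# Relative frequency of each target value within the group; the shares sum to 1,
# which is what normalized bar heights would represent.
shares = df.loc[mask_a, "Score"].value_counts(normalize=True)
print(shares)
```

Both masks select the same rows here, so either form could be passed in groups for this subgroup.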

Returns

The matplotlib axis containing the plot.

Return type

matplotlib.axes.Axes

Examples

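>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> from fairlens.plot import distr_plot
>>> df = pd.read_csv("datasets/compas.csv")
>>> g1 = {"Ethnicity": ["African-American"]}
>>> g2 = {"Ethnicity": ["Caucasian"]}
>>> distr_plot(df, "RawScore", [g1, g2])
>>> plt.show()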
