Fairness Scorer

The class fairlens.FairnessScorer can be used to automatically analyze a dataset and assess fairness.

The fairness scorer takes two parameters: the dataset as a DataFrame, and a target variable. The target variable can be the output column of the dataset or the result of a prediction made by a model on the dataset. Sensitive attributes are detected automatically, but users can pass them in explicitly. Users can also specify the type of data in each column (categorical, continuous, binary, or date); by default this is inferred automatically.

The fairness scorer aims to measure group fairness, i.e. whether the chosen target column is independent of the sensitive attributes in a row. If the distribution of the target column in each subgroup of sensitive values is not significantly different from the overall distribution, we assume independence. Hence, to find the most prejudiced subgroups, we can use statistical distances and hypothesis tests to measure the significance of the skew in each subgroup.
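To make the idea of "distance between a subgroup and the overall distribution" concrete, here is a minimal standalone sketch that does not use FairLens at all. It assumes the distance is the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs); the helper `ks_distance` and the toy data are hypothetical and only illustrate the principle, not the library's implementation.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The empirical CDFs only change at observed values, so it is enough to check those.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


# Hypothetical toy data: (group, score) pairs standing in for a
# sensitive column and a continuous target column.
rows = [("A", 1.0), ("A", 2.0), ("A", 3.0),
        ("B", 4.0), ("B", 5.0), ("B", 6.0), ("B", 7.0)]
overall = [score for _, score in rows]

# Distance of each subgroup's target distribution from the overall distribution.
distances = {}
for group in sorted({g for g, _ in rows}):
    subgroup = [score for g, score in rows if g == group]
    distances[group] = ks_distance(subgroup, overall)

for group, distance in distances.items():
    print(f"{group}: {distance:.3f}")
```

A distance near zero means the subgroup's scores look like everyone else's; a large distance flags a skewed subgroup that a hypothesis test can then check for significance.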

Report Generation

The fairness scorer supports three different methods for report generation.

The plot_distributions method produces plots of the distribution of the target variable in each subgroup in a column, for each column. This is useful for understanding the different distributions of protected groups and identifying any inherent biases present in a dataset.

In [1]: import pandas as pd

In [2]: import fairlens as fl

In [3]: df = pd.read_csv("../datasets/compas.csv")

In [4]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20281 entries, 0 to 20280
Data columns (total 22 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   PersonID                 20281 non-null  int64  
 1   AssessmentID             20281 non-null  int64  
 2   CaseID                   20281 non-null  int64  
 3   Agency                   20281 non-null  object 
 4   LastName                 20281 non-null  object 
 5   FirstName                20281 non-null  object 
 6   MiddleName               5216 non-null   object 
 7   Sex                      20281 non-null  object 
 8   Ethnicity                20281 non-null  object 
 9   DateOfBirth              20281 non-null  object 
 10  ScaleSet                 20281 non-null  object 
 11  AssessmentReason         20281 non-null  object 
 12  Language                 20281 non-null  object 
 13  LegalStatus              20281 non-null  object 
 14  CustodyStatus            20281 non-null  object 
 15  MaritalStatus            20281 non-null  object 
 16  ScreeningDate            20281 non-null  object 
 17  RecSupervisionLevelText  20281 non-null  object 
 18  RawScore                 20281 non-null  float64
 19  DecileScore              20281 non-null  int64  
 20  ScoreText                20245 non-null  object 
 21  AssessmentType           20281 non-null  object 
dtypes: float64(1), int64(4), object(17)
memory usage: 3.4+ MB

In [5]: fscorer = fl.FairnessScorer(df, "RawScore", ["Ethnicity", "Sex"])

In [6]: fscorer.plot_distributions()

The demographic_report method estimates the extent to which the distribution of the target column is independent of the sensitive attributes. It does so by using a suitable distance metric to measure, and test the significance of, the distance between the distribution of each sensitive demographic and that of the overall population. In the case below, the target variable, “RawScore”, is continuous, so the Kolmogorov-Smirnov test is carried out by default. This is done for all possible demographics of the given sensitive attributes. The report lists the most prejudiced groups by this criterion, i.e. those whose distance from the overall distribution is largest or most significant.

In [7]: fscorer.demographic_report()
Sensitive Attributes: ['Ethnicity', 'Sex']

                 Group Distance  Proportion  Counts   P-Value
African-American, Male    0.201    0.353138    7162 4.03e-188
      African-American    0.156    0.444899    9023 3.25e-133
              Hispanic    0.164    0.143681    2914  5.07e-60
             Caucasian    0.107    0.358020    7261  2.33e-53
                Female    0.127    0.219072    4443  1.53e-51
     Caucasian, Female    0.176    0.089295    1811  1.70e-45
      Hispanic, Female    0.279    0.028450     577  4.77e-39
                 Other    0.220    0.042601     864  9.90e-36
        Hispanic, Male    0.136    0.115231    2337  2.32e-34
       Caucasian, Male    0.087    0.268724    5450  7.19e-29

Weighted Mean Statistical Distance: 0.11587609533968031

Alternatively, users can measure the distance between the distribution in each subgroup and the distribution in the rest of the data, i.e. the data with that subgroup removed.

In [8]: fscorer.demographic_report(method="dist_to_rest")
Sensitive Attributes: ['Ethnicity', 'Sex']

                 Group Distance  Proportion  Counts   P-Value
      African-American    0.281    0.444899    9023  0.00e+00
African-American, Male    0.311    0.353138    7162  0.00e+00
             Caucasian    0.166    0.358020    7261 8.93e-113
                  Male    0.162    0.780928   15838  2.14e-80
                Female    0.162    0.219072    4443  2.14e-80
              Hispanic    0.191    0.143681    2914  3.54e-80
     Caucasian, Female    0.193    0.089295    1811  2.20e-54
       Caucasian, Male    0.119    0.268724    5450  7.90e-50
        Hispanic, Male    0.154    0.115231    2337  3.05e-43
      Hispanic, Female    0.287    0.028450     577  2.19e-41

Weighted Mean Statistical Distance: 0.19632255130989817
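The two comparison modes are closely related when the distance is the Kolmogorov-Smirnov statistic. If a subgroup makes up a proportion p of the data, the overall empirical CDF is the mixture F_all = p * F_sub + (1 - p) * F_rest, so F_sub - F_all = (1 - p) * (F_sub - F_rest) and therefore D_all = (1 - p) * D_rest. This is consistent with the reports above: Male and Female are each other's "rest", so they share the same distance (0.162) in the second report. The sketch below checks the identity on hypothetical toy data; `ks_distance` is an illustrative helper, not a FairLens function.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


# Hypothetical target values for two complementary subgroups.
scores = {
    "A": [1.0, 2.0, 3.0, 6.0],
    "B": [2.5, 4.0, 5.0, 7.0, 8.0, 9.0],
}
overall = [s for values in scores.values() for s in values]

results = {}
for group, subgroup in scores.items():
    # "Rest" is the data with this subgroup removed.
    rest = [s for g, values in scores.items() if g != group for s in values]
    proportion = len(subgroup) / len(overall)
    d_all = ks_distance(subgroup, overall)   # subgroup vs. whole dataset
    d_rest = ks_distance(subgroup, rest)     # subgroup vs. everything else
    results[group] = (d_all, d_rest, proportion)

for group, (d_all, d_rest, p) in results.items():
    print(f"{group}: to_all={d_all:.3f}  to_rest={d_rest:.3f}  (1-p)*to_rest={(1 - p) * d_rest:.3f}")
```

Because of the (1 - p) factor, comparing against the whole dataset shrinks the distance of large subgroups, while comparing against the rest does not.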

Scoring API

Individual functions of the demographic report can be called for further analysis.

In [9]: sensitive_attrs = ["Ethnicity", "Sex"]

In [10]: target_attr = "RawScore"

In [11]: fscorer = fl.FairnessScorer(df, target_attr, sensitive_attrs)

In [12]: df_dist = fscorer.distribution_score()

In [13]: df_dist
Out[13]: 
                       Group  Distance  Proportion  Counts
                Caucasian  0.106594    0.358020    7261
         African-American  0.156011    0.444899    9023
                 Hispanic  0.163547    0.143681    2914
                    Other  0.220045    0.042601     864
                    Asian  0.320241    0.005325     108
          Native American  0.186467    0.003599      73
                 Oriental  0.306338    0.000641      13
                   Arabic  0.163487    0.001233      25
                     Male  0.035569    0.780928   15838
                   Female  0.126795    0.219072    4443
         Caucasian, Male  0.087198    0.268724    5450
  African-American, Male  0.201106    0.353138    7162
African-American, Female  0.050609    0.091761    1861
          Hispanic, Male  0.136174    0.115231    2337
       Caucasian, Female  0.175940    0.089295    1811
             Other, Male  0.196701    0.035156     713
        Hispanic, Female  0.278789    0.028450     577
             Asian, Male  0.307059    0.004388      89
           Other, Female  0.361963    0.007445     151
           Asian, Female  0.422706    0.000937      19
 Native American, Female  0.217331    0.000986      20
   Native American, Male  0.191119    0.002613      53
          Oriental, Male  0.274779    0.000493      10
            Arabic, Male  0.175154    0.001183      24
          Arabic, Female  0.773877    0.000049       1
        Oriental, Female  0.460184    0.000148       3
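Very small subgroups can produce large but unreliable distances (for example, "Arabic, Female" above has a single row and a distance of 0.77). Assuming the result of distribution_score() is an ordinary pandas DataFrame with the columns printed above, as the output suggests, it can be filtered and ranked with standard pandas operations. The sketch below rebuilds a few rows from the table by hand so that it runs on its own; the `MIN_COUNT` cutoff is a hypothetical choice, not a FairLens default.

```python
import pandas as pd

# A few rows copied from the distribution_score() output shown above.
df_dist = pd.DataFrame(
    {
        "Group": ["Caucasian", "Asian", "Oriental", "Hispanic, Female", "Arabic, Female"],
        "Distance": [0.106594, 0.320241, 0.306338, 0.278789, 0.773877],
        "Proportion": [0.358020, 0.005325, 0.000641, 0.028450, 0.000049],
        "Counts": [7261, 108, 13, 577, 1],
    }
)

MIN_COUNT = 50  # hypothetical cutoff for "enough rows to trust the distance"

# Keep only sufficiently large groups, then rank them by distance.
reliable = (
    df_dist[df_dist["Counts"] >= MIN_COUNT]
    .sort_values("Distance", ascending=False)
    .reset_index(drop=True)
)

print(reliable)
```

The right cutoff depends on the dataset and on how much noise is acceptable in the ranking.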
