fairlens.metrics.Norm
-
class Norm(bin_edges=None, ord=2)
Bases: fairlens.metrics.distance.CategoricalDistanceMetric
Lp norm between two probability distributions.
Methods
- __init__(bin_edges=None, ord=2): bin_edges is a list of bin edges used to bin continuous data, or to indicate the bins of pre-binned data.
- check_input(x, y): Check whether the input is valid.
- distance(x, y): Distance between the distribution of numerical data in x and y.
- distance_pdf(p, q, bin_edges): Distance between 2 aligned normalized histograms.
- p_value(x, y): Returns a p-value for the test that x and y are sampled from the same distribution.
-
__call__(x, y)
Calculate the distance between two distributions.
- Parameters
x (pd.Series) – The data in the column representing the first group.
y (pd.Series) – The data in the column representing the second group.
- Returns
The computed distance.
- Return type
Optional[float]
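The following is a minimal sketch of the idea behind calling the metric on two categorical columns, not the library's implementation. It uses plain Python lists in place of pd.Series, and the helper names `normalized_counts` and `lp_distance` are hypothetical. Returning None for empty input is an assumption made here to illustrate the Optional[float] return type; the source does not state when None is returned.

```python
from collections import Counter


def normalized_counts(values, categories):
    """Relative frequency of each category, in the order given by `categories`."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


def lp_distance(x, y, ord=2):
    """Lp norm of the difference between the category frequencies of x and y."""
    if not x or not y:
        # Assumption: nothing to compare, so no distance is reported.
        return None
    # Align both groups on the same set of categories.
    categories = sorted(set(x) | set(y))
    p = normalized_counts(x, categories)
    q = normalized_counts(y, categories)
    return sum(abs(a - b) ** ord for a, b in zip(p, q)) ** (1 / ord)


group_a = ["yes", "yes", "no", "no"]   # frequencies: no=0.5, yes=0.5
group_b = ["yes", "yes", "yes", "no"]  # frequencies: no=0.25, yes=0.75
dist = lp_distance(group_a, group_b)   # sqrt(0.25**2 + 0.25**2)
```

Aligning both groups on the union of their categories is what makes the two frequency vectors comparable element by element.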
-
__init__(bin_edges=None, ord=2)
- Parameters
bin_edges (Optional[np.ndarray], optional) – A list of bin edges used to bin continuous data, or to indicate the bins of pre-binned data. Defaults to None.
ord (Union[str, int], optional) – The order of the norm. Possible values include positive numbers, ‘fro’, ‘nuc’. See numpy.linalg.norm for more details. Defaults to 2.
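To illustrate how the order of the norm changes the result, here is a small pure-Python sketch. The helper `vector_norm` is hypothetical; for a one-dimensional input and the orders shown, it is intended to match what numpy.linalg.norm computes (sum of absolute values for ord=1, Euclidean length for ord=2, largest absolute value for ord=inf). It does not cover the string-valued orders.

```python
import math


def vector_norm(diff, ord=2):
    """Lp norm of a 1-D sequence for a positive order, or the max norm for ord=inf."""
    if ord == math.inf:
        return max(abs(d) for d in diff)
    return sum(abs(d) ** ord for d in diff) ** (1 / ord)


# Two normalized histograms over the same three bins.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
diff = [a - b for a, b in zip(p, q)]

l1 = vector_norm(diff, ord=1)            # sum of absolute differences
l2 = vector_norm(diff, ord=2)            # Euclidean length of the difference
linf = vector_norm(diff, ord=math.inf)   # largest single-bin difference
```

For the same pair of histograms the values satisfy l1 >= l2 >= linf, so distances computed with different orders are not directly comparable.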
-
check_input(x, y)
Check whether the input is valid. Returns False if x and y have different dtypes by default.
- Parameters
x (pd.Series) – The data in the column representing the first group.
y (pd.Series) – The data in the column representing the second group.
- Returns
Whether or not the input is valid.
- Return type
bool
-
distance(x, y)
Distance between the distribution of numerical data in x and y. Derived classes must implement this.
- Parameters
x (pd.Series) – Numerical data in a column.
y (pd.Series) – Numerical data in a column.
- Returns
The computed distance.
- Return type
float
-
distance_pdf(p, q, bin_edges)
Distance between 2 aligned normalized histograms. Derived classes must implement this.
- Parameters
p (pd.Series) – A normalized histogram.
q (pd.Series) – A normalized histogram.
bin_edges (Optional[np.ndarray]) – bin_edges for binned continuous data. Used by metrics such as Earth Mover’s Distance to compute the distance metric space.
- Returns
The computed distance.
- Return type
float
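As a rough sketch of what "aligned normalized histograms" means, the example below bins two numerical samples with the same bin edges and then takes the L2 norm of the difference. The helpers `normalized_histogram` and `norm_distance_pdf` are hypothetical and use plain lists rather than pd.Series. The binning convention (half-open bins, last bin closed on the right, out-of-range values ignored) is an assumption chosen to mirror numpy.histogram, not something the source specifies.

```python
from bisect import bisect_right


def normalized_histogram(values, bin_edges):
    """Fraction of in-range values falling in each bin defined by `bin_edges`."""
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for v in values:
        if v < bin_edges[0] or v > bin_edges[-1]:
            continue  # assumption: values outside the edges are ignored
        # Bins are [edge_i, edge_{i+1}); the last bin also includes its right edge.
        idx = min(bisect_right(bin_edges, v) - 1, n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def norm_distance_pdf(p, q, ord=2):
    """Lp norm of the bin-wise difference between two aligned histograms."""
    return sum(abs(a - b) ** ord for a, b in zip(p, q)) ** (1 / ord)


bin_edges = [0, 10, 20, 30]
x = [1, 5, 12, 25]
y = [11, 15, 22, 29]

p = normalized_histogram(x, bin_edges)  # [0.5, 0.25, 0.25]
q = normalized_histogram(y, bin_edges)  # [0.0, 0.5, 0.5]
d = norm_distance_pdf(p, q)
```

Because both samples are binned with the same edges, entry i of p and entry i of q describe the same interval, which is the alignment the bin-wise difference relies on.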
-
property id
A string identifier for the method. Used by fairlens.metrics.stat_distance(). Derived classes must implement this.
-
p_value(x, y)
Returns a p-value for the test that x and y are sampled from the same distribution.
- Parameters
x (pd.Series) – Numerical data in a column.
y (pd.Series) – Numerical data in a column.
- Returns
The computed p-value.
- Return type
float
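The source does not say how the p-value is computed. One generic way to obtain a p-value for a distance statistic is a permutation test, sketched below under that assumption; it should not be read as the library's method. The names `l2_distance` and `permutation_p_value`, the number of permutations, and the fixed seed are all hypothetical choices for illustration.

```python
import random
from collections import Counter


def l2_distance(x, y):
    """L2 norm of the difference between the category frequencies of x and y."""
    cx, cy = Counter(x), Counter(y)
    categories = set(x) | set(y)
    return sum((cx[c] / len(x) - cy[c] / len(y)) ** 2 for c in categories) ** 0.5


def permutation_p_value(x, y, n_permutations=1000, seed=0):
    """Share of random relabelings whose distance is at least the observed one."""
    rng = random.Random(seed)
    observed = l2_distance(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable.
        rng.shuffle(pooled)
        if l2_distance(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


same = ["a", "b"] * 20
p_same = permutation_p_value(same, same)              # identical groups
p_diff = permutation_p_value(["a"] * 30, ["b"] * 30)  # fully separated groups
```

A large p-value, as for the identical groups, gives no evidence against the two samples sharing a distribution; a small one, as for the fully separated groups, suggests they differ.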