diff --git a/background/eq.tex b/background/eq.tex
new file mode 100644
index 0000000..8a76ee7
--- /dev/null
+++ b/background/eq.tex
@@ -0,0 +1,42 @@
+
+\label{sec:bck_fair}
+Algorithmic fairness aims to reduce biases in ML model predictions.
+Indeed, data records belonging to certain subgroups influence $targetmodel$'s predictions more than others.
+For instance, in criminal justice, the ethnicity of a defendant plays a non-negligible role in predicting whether they will reoffend~\cite{fairjustice}. In general, data records in the minority subgroup face unfair prediction behaviour compared to data records in the majority subgroup. These subgroups are identified based on a sensitive attribute $S$ (e.g., race or sex).
+These biases are learnt by $targetmodel$ because they are part of the distribution of the training dataset.
+There are two main categories of fairness for an ML model:
+
+\textbf{Individual fairness} ensures that two data records with the same attributes, except for $S$, receive the same model prediction.
+This notion does not reason about subgroups defined by the sensitive attribute and, as such, is not directly useful for our goal of mitigating attribute inference attacks at inference time.
+We therefore set it aside for the rest of the paper.
+
+\textbf{Group fairness} comes from the idea that different subgroups defined by an attribute such as skin color or gender should be treated equally.
+We focus our study on group fairness where $S$ represents either sex or race (i.e., $S(i)$ equals 0 for a woman and 1 for a man, or 0 for black and 1 for white, respectively).
+Several definitions of group fairness have been introduced in prior work.
+We discuss two well-established and commonly used metrics: demographic parity and equality of odds.
+
+\begin{definition}
+\label{def:dp}
+ $\hat{Y}$ satisfies demographic parity for $S$ if and only if: $P(\hat{Y}=0 \mid S=0) = P(\hat{Y}=0 \mid S=1)$.
+ We call $|P(\hat{Y}=0 \mid S=0) - P(\hat{Y}=0 \mid S=1)|$ the demPar-level of $\hat{Y}$.
+\end{definition}
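+
+For illustration, the demPar-level can be estimated empirically from a set of binary predictions. The following Python sketch is only an illustration (the function name and inputs are hypothetical, not taken from any library):
+
+\begin{verbatim}
+import numpy as np
+
+def dempar_level(y_pred, s):
+    """|P(Yhat=0 | S=0) - P(Yhat=0 | S=1)| for binary predictions y_pred
+    and a binary sensitive attribute s; assumes both groups are non-empty."""
+    y_pred, s = np.asarray(y_pred), np.asarray(s)
+    p0 = np.mean(y_pred[s == 0] == 0)  # P(Yhat=0 | S=0)
+    p1 = np.mean(y_pred[s == 1] == 0)  # P(Yhat=0 | S=1)
+    return abs(p0 - p1)
+\end{verbatim}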
+
+Demographic parity is the historical definition of fairness.
+Disparate impact is the fairness definition recognized by law, where an 80\% disparity threshold is the agreed-upon tolerance decided in the legal arena.
+Demographic parity ensures that the proportion of each predicted class is the same across subgroups.
+However, this may result in different false positive and true positive rates if the true outcome actually varies with $S$~\cite{dpbad}.
+Hardt et al.~\cite{fairmetric2} proposed equality of odds as a modification of demographic parity to ensure that both the true positive rate and the false positive rate are the same for each subgroup.
+
+\begin{definition}
+ \label{def:eo}
+ $\hat{Y}$, a classifier of $Y$, satisfies equality of odds for $S$ if and only if: $\forall (\hat{y},y)\in\{0,1\}^2, \quad
+ P(\hat{Y}=\hat{y} \mid S=0, Y=y) = P(\hat{Y}=\hat{y} \mid S=1, Y=y)$.
+\end{definition}
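+
+Similarly, equality of odds can be checked empirically by comparing the conditional prediction rates of the two subgroups for every $(\hat{y}, y)$ pair. The sketch below is again only an illustration with hypothetical names:
+
+\begin{verbatim}
+import numpy as np
+
+def eo_gaps(y_pred, y_true, s):
+    """|P(Yhat=yhat | S=0, Y=y) - P(Yhat=yhat | S=1, Y=y)| for every
+    (yhat, y) pair; assumes each (S, Y) subgroup is non-empty."""
+    y_pred, y_true, s = map(np.asarray, (y_pred, y_true, s))
+    gaps = {}
+    for y in (0, 1):
+        for yhat in (0, 1):
+            p0 = np.mean(y_pred[(s == 0) & (y_true == y)] == yhat)
+            p1 = np.mean(y_pred[(s == 1) & (y_true == y)] == yhat)
+            gaps[(yhat, y)] = abs(p0 - p1)
+    return gaps
+\end{verbatim}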
+
+The above fairness definitions can be achieved using three main categories of fairness mechanisms: (a) pre-processing, (b) in-processing and (c) post-processing. \textit{Pre-processing} algorithms such as reweighing require access to the training data and assign weights to the data records to remove discrimination~\cite{preprocessing}.
+\textit{In-processing} algorithms such as adversarial debiasing~\cite{debiase} and the exponentiated-gradient reduction~\cite{reductions} add constraints during $targetmodel$'s training to ensure fairness.
+\textit{Post-processing} techniques, in turn, hide the bias in the output predictions to satisfy the above fairness constraints, but the underlying model remains biased.
+Similar to previous work~\cite{chang2021privacy}, we focus on in-processing algorithms.
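+
+As a concrete illustration of an in-processing mechanism, the sketch below trains a constrained model with the open-source fairlearn implementation of the exponentiated-gradient reduction; the synthetic data, base estimator and hyperparameters are illustrative assumptions and may differ from the setup used in our experiments.
+
+\begin{verbatim}
+import numpy as np
+from sklearn.linear_model import LogisticRegression
+from fairlearn.reductions import ExponentiatedGradient, DemographicParity
+
+# Illustrative synthetic data: features correlated with a binary
+# sensitive attribute s, so an unconstrained model would be biased.
+rng = np.random.default_rng(0)
+n = 1000
+s = rng.integers(0, 2, size=n)
+X = rng.normal(size=(n, 5)) + s[:, None]
+y = (X[:, 0] + rng.normal(size=n) > 0.5).astype(int)
+
+# Exponentiated-gradient reduction with a demographic parity constraint
+# (fairlearn.reductions.EqualizedOdds enforces equality of odds instead).
+mitigator = ExponentiatedGradient(LogisticRegression(solver="liblinear"),
+                                  constraints=DemographicParity())
+mitigator.fit(X, y, sensitive_features=s)
+y_pred = mitigator.predict(X)
+\end{verbatim}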
+
+Our work focuses on the theoretical guarantees against attribute inference attacks provided by the different fairness notions, rather than on how to implement in-processing fairness mechanisms.
+Nevertheless, in the experiments section we evaluate production-ready, state-of-the-art implementations of these fairness constraints alongside unconstrained ML algorithms.