diff options
author     Jan Aalmoes <jan.aalmoes@inria.fr>  2024-08-27 21:07:18 +0200
committer  Jan Aalmoes <jan.aalmoes@inria.fr>  2024-08-27 21:07:18 +0200
commit     57715cacec8d0f0d3d1436a26f92ae5c0f0e128e (patch)
tree       985ae1d9895e0233f4e24a1f34046a42b46eb648 /background/proba.tex
parent     4edf87ea8a5ce3e76285172af2eaecc7bc21813d (diff)

Start of the background on ZF

Diffstat (limited to 'background/proba.tex')
 -rw-r--r--  background/proba.tex | 42
 1 file changed, 42 insertions, 0 deletions
diff --git a/background/proba.tex b/background/proba.tex
new file mode 100644
index 0000000..bea43e7
--- /dev/null
+++ b/background/proba.tex
@@ -0,0 +1,42 @@
+
+Probability theory is deeply linked with machine learning: most properties of machine learning models, such as differential privacy, fairness definitions, and utility metrics, are written mathematically within this framework.
+This paper is no exception, so we provide a short background on this field and on how it connects with the notions of ML introduced in Section \ref{sec:ml}.
+
+Let $A$ be a set.
+The power set of $A$ is $\mathcal{P}(A)$; each element $a \in \mathcal{P}(A)$ satisfies $a \subset A$.
+A $\sigma$-algebra $\mathcal{A}$ is a subset of $\mathcal{P}(A)$ that contains $\emptyset$ and $A$ and is closed under complement and countable union.
+We say that $(A,\mathcal{A})$ is a measurable space.
+A measure is a function $d:\mathcal{A} \rightarrow [0,+\infty]$ such that $d(\emptyset) = 0$ and $d\left(\bigcup_{i\in \mathbb{N}} A_i\right) = \sum_{i\in \mathbb{N}}d(A_i)$ for every sequence $(A_1, A_2, \cdots) \in \mathcal{A}^\mathbb{N}$ with $A_i\cap A_j = \emptyset$ for all $i \neq j$.
+We then say that $(A, \mathcal{A}, d)$ is a measure space.
+We call measurable function a function $f$ from $A$ to $B$ such that $\forall b\in\mathcal{B}$,~$f^{-1}(b)\in\mathcal{A}$, where $\mathcal{B}$ is a $\sigma$-algebra on $B$.
+We then write $f:(A, \mathcal{A})\rightarrow (B, \mathcal{B})$ or $f:(A, \mathcal{A},d)\rightarrow (B, \mathcal{B})$.
+
+In the special case where $d(A) = 1$, we call $d$ a probability measure.
+$(A,\mathcal{A},d)$ is then a probability space, and the measurable functions on this space are called random variables.
+The probability distribution of a random variable $X$ valued in $(X,\mathcal{X})$ is the following probability measure:
+$d_X :\mathcal{X}\rightarrow [0,1]$, $x\mapsto d(X^{-1}(x))$.
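The finite probability measure and the law $d_X = d \circ X^{-1}$ defined above can be sketched numerically. This is a minimal illustration only: the sample space `A`, the uniform measure `d`, and the parity variable `f` below are assumptions chosen for the example, not objects from the paper.

```python
from fractions import Fraction

# Illustrative finite measure space: sample space A = {0,...,5}, with the
# sigma-algebra P(A) (all subsets) and the uniform probability measure d.
# All names here (A, d, f, law_f) are hypothetical, for illustration only.
A = set(range(6))

def d(event):
    # d(event) = |event| / |A|, so d(set()) == 0 and d(A) == 1 as required.
    return Fraction(len(event & A), len(A))

# A random variable (measurable function) f: A -> {0, 1}, here the parity map.
def f(a):
    return a % 2

def law_f(x):
    # The law of f: d_f(x) = d(f^{-1}({x})), the measure of the preimage of x.
    preimage = {a for a in A if f(a) == x}
    return d(preimage)
```

With the uniform measure, `law_f(0)` and `law_f(1)` are both $\frac{1}{2}$, since the even and odd elements split `A` evenly.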
+
+Having introduced probability theory, we now make explicit its relation with the ML theory described previously.
+Let $I$ be a finite set, and let $\mathcal{X}$, $\mathcal{S}$ and $\mathcal{Y}$ be the sets of features, sensitive attributes and labels.
+Let $d:I\rightarrow \mathcal{X}\times\mathcal{S}\times\mathcal{Y}$ be a dataset.
+Let $\#$ be the measure on $(I,\mathcal{P}(I))$ which maps every $a$ in $\mathcal{P}(I)$ to its number of elements.
+Let $P:\mathcal{P}(I)\rightarrow [0,1]$, $a\mapsto \frac{\#(a)}{\#(I)}$.
+Then $(I, \mathcal{P}(I), P)$ is a probability space.
+On this space we can define the following random variables:
+\begin{itemize}
+    \item $X:I\rightarrow \mathcal{X},~i\mapsto (d(i))_0$
+    \item $S:I\rightarrow \mathcal{S},~i\mapsto (d(i))_1$
+    \item $Y:I\rightarrow \mathcal{Y},~i\mapsto (d(i))_2$
+\end{itemize}
+where, for a vector $u$, $u_j$ refers to the $j$th element of $u$.
+
+From there we can define various random variables that will be useful in the rest of the paper.
+For instance, $\hat{Y}=f\circ X$ is the random variable that represents the prediction of a trained machine learning model $f$.
+We can use it to write the accuracy compactly as $P(\hat{Y}=Y)$, using the well-accepted abuse of notation that, for a random variable $A$ and an event $a$,
+$\{A\in a\} = \{i\in I~|~A(i)\in a\} = A^{-1}(a)$.
+Accuracy is a reliable metric of a trained model's utility when $P(Y=0) = P(Y=1) = \frac{1}{2}$, but much less so when the distribution of $Y$ is imbalanced.
+To take a possibly imbalanced label distribution into account, we will consider the balanced accuracy:
+$\frac{P(\hat{Y}=0|Y=0) + P(\hat{Y}=1|Y=1)}{2}$.
+
+Finally, in the context of attribute inference attacks at inference time, we define the random variable $\hat{S}=a\circ \hat{Y}$, where $a$ is a machine learning model trained to infer the sensitive attribute from the target model's output.
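The contrast between accuracy and balanced accuracy can be sketched on a toy empirical space $(I, \mathcal{P}(I), P)$ as defined above. The label and prediction lists below are made up for illustration; they are not data from the paper.

```python
# Hypothetical imbalanced labels Y and model predictions Y_hat over a finite
# index set I = {0,...,7}, to contrast accuracy P(Yhat = Y) with balanced
# accuracy (P(Yhat=0 | Y=0) + P(Yhat=1 | Y=1)) / 2. Data is illustrative only.
Y     = [0, 0, 0, 0, 0, 0, 1, 1]   # label imbalance: six 0s, two 1s
Y_hat = [0, 0, 0, 0, 0, 0, 0, 1]   # a model that almost always predicts 0

def accuracy(y_hat, y):
    # Empirical P(Yhat = Y): fraction of indices i with y_hat[i] == y[i].
    return sum(p == t for p, t in zip(y_hat, y)) / len(y)

def conditional(y_hat, y, pred, label):
    # Empirical conditional probability P(Yhat = pred | Y = label).
    idx = [i for i, t in enumerate(y) if t == label]
    return sum(y_hat[i] == pred for i in idx) / len(idx)

def balanced_accuracy(y_hat, y):
    # (P(Yhat=0 | Y=0) + P(Yhat=1 | Y=1)) / 2
    return (conditional(y_hat, y, 0, 0) + conditional(y_hat, y, 1, 1)) / 2
```

On this data the plain accuracy is $7/8 = 0.875$, which looks strong, while the balanced accuracy is $(1 + \frac{1}{2})/2 = 0.75$, exposing the weak performance on the minority class.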