traduction classification fini

author: Jan Aalmoes <jan.aalmoes@inria.fr> 2024-09-04 00:12:49 +0200
committer: Jan Aalmoes <jan.aalmoes@inria.fr> 2024-09-04 00:12:49 +0200
commit: 0e95544f85b523a95fb05b36c4e6b8789c73abfa (patch)
tree: 4229bad597b487e2395a2fc91e7023c6672bb29d /classification_finie/finit_classif.tex
parent: dc5a898dc39326fa3f733f3d9e006bbe3d1f8e4c (diff)
1 files changed, 138 insertions, 131 deletions
diff --git a/classification_finie/finit_classif.tex b/classification_finie/finit_classif.tex
index 44aa173..f518cdc 100644
--- a/classification_finie/finit_classif.tex
+++ b/classification_finie/finit_classif.tex
@@ -1,21 +1,21 @@
 
 \subsection{Notations}
-\begin{itemize}
-    \item $(x_0,x_1,\cdots,x_n)\in X_0\times X_1 \times\cdots\times X_n$
-    \item If $A$ is a finite set, then $\# A$ denotes the cardinal number of $A$
-    \item $f:\left\{\begin{matrix}A&\rightarrow &B\\ a&\mapsto & f(b)\end{matrix}\right.$ 
-            Denotes a function from $A$ to $B$ mapping each element $a$ in $A$ to $f(a)$ in $B$.
-    \item $f\circ g$ is the composition of $f$ and $g$.
-    \item $f^{-1}$ can be either the inverse function of $f$ if $f$ is a bijection or its inverse image otherwise.
-\end{itemize}
+%\begin{itemize}
+    %\item $(x_0,x_1,\cdots,x_n)\in X_0\times X_1 \times\cdots\times X_n$
+    %\item If $A$ is a finite set, then $\# A$ denotes the cardinal number of $A$
+    %\item $f:\left\{\begin{matrix}A&\rightarrow &B\\ a&\mapsto & f(b)\end{matrix}\right.$ 
+    %        Denotes a function from $A$ to $B$ mapping each element $a$ in $A$ to $f(a)$ in $B$.
+    %\item $f\circ g$ is the composition of $f$ and $g$.
+    %\item $f^{-1}$ can be either the inverse function of $f$ if $f$ is a bijection or its inverse image otherwise.
+%\end{itemize}
 
 \subsection{Problem setup}
-We dispose of two finite sets, a feature space $E$ and a classification space $F$.
-The cardinal numbers of $E$ and $F$ are respectively $m\in\mathbb{N}^*$ and $n\in\mathbb{N}^*$.
-Let $\varphi$ be a bijection from $E$ to $[|0,m-1|]$ and $\psi$ be a bijection from $F$ to $[|0,n-1|]$.
-We also dispose of a $o$-tuple $d: [|0,o-1] \rightarrow E\times F$.
-In machine learning we would say that $d$ models a dataset.
-We build the "dataset of indices" as such : 
+Nous nous donnons deux ensembles finits, un ensemble $E$ de données d'entrée et un espace d'étiquette $F$.
+Nous notons $m=\#E$ et $n=\#F$.
+Soit $\varphi$ une bijection de $E$ dans $[|0,m-1|]$ et $psi$ une bijection de $F$ dans $[|0,n-1|]$.
+Nous supposons que nous avons un $o$-uplet $d: [|0,o-1] \rightarrow E\times F$.
+$d$ modélise une jeu de donnée en pratique comme il est utilisé en apprantissage automatique.
+Nous pouvons alors construire un jeu de donnée d'indices :
 \begin{equation*}
     d' : \left\{
         \begin{matrix}
@@ -27,31 +27,33 @@ We build the "dataset of indices" as such :
 
 \begin{definition}
     \label{def:BA}
-    The balanced accuracy of $f$ on the $o$-tuple $d$ relatively to the set $F$, $BA_F^d(f)$, is a number in $[0,1]$ such that 
+    La \textit{balanced accuracy} empirique de $f$ sur le $o$-uplet $d$ relativements à $F$, que l'on appelle $BA_F^d(f)$, est un nombre dans $[0,1]$ tel que 
     \begin{equation*}
         BA_F^d(f) = \frac{1}{n}
             \sum_{y\in F} 
             \frac{
-                \#\left\{j\in [|0,o-1|]\quad| f(d_0(j))=d_1(j)\text{ and }d_1(j) = y\right\}
+                \#\left\{j\in [|0,o-1|]\quad| f(d_0(j))=d_1(j)\wedge d_1(j) = y\right\}
             }
             {\#\{j\in [|0,o-1|]\quad| d_1(j)=y\}}
     \end{equation*}
-
 \end{definition}
-\textbf{The problem consists in finding an application $f:E\rightarrow F$ such that the balanced accuracy of $f$ on $d$ is maximal.}
+Cette définition est un approximation de la \textit{balanced accuracy} qui nous avons définit plus haut.
+\textbf{Le problème consiste à trouver une application $f:E\rightarrow F$ telle que la \textit{balanced accuracy} de $f$ sur $d$ et maximal.}
+
+\subsection{D'un proclème sur les élément à un problème sur les indices}
 
-\subsection{From a problem on elements to a problem on indices}
-We call the set of functions from $E$ to $F$, $B_{E\rightarrow F}$.
-To simplify notation, the set of function from $[|0,m-1|]$ to $[|0,n-1|]$ we call it $B_{m\rightarrow n}$.
+Nous commencons par noter par $B_{E\rightarrow F}$ l'ensemble des fonctions de $E$ dans $F$.
+Pour simplifier un peu les notations, nous appelerons $B_{m\rightarrow n}$ l'ensemble des fnoctins de $[|0,m-1|]$ dans $[|0,n-1|]$.
 
 \begin{theorem}
     \label{th:bij}
-    Let $E$ and $F$ be two finite sets of cardinal numbers $m$ and $n$. There exists a bijection from $B_{E\rightarrow F}$ to $B_{m\rightarrow n}$.
+    Soient $E$ et $F$ deux ensemble finis de cardinaux $m$ et $n$. 
+    Il existe une bijection de $B_{E\rightarrow F}$ dans $B_{m\rightarrow n}$.
 \end{theorem}
 
 \begin{proof}
-    We proceed by expliciting such a bijection. 
-    Let 
+    Nous procédons en explicitant une telle bijection.
+    Soit
     \begin{equation}
         \Phi :\left\{
             \begin{matrix}
@@ -61,10 +63,10 @@ To simplify notation, the set of function from $[|0,m-1|]$ to $[|0,n-1|]$ we cal
             \right.
     \end{equation}
     
-    Let's show that $\Phi$ is an injection.
-    Let $(u,v)\in \left(B_{E\rightarrow F}\right)^2$ such that 
+    Montrons maintenant que $\Phi$ est un bijection.
+    Soit $(u,v)\in \left(B_{E\rightarrow F}\right)^2$ telle que
     $\Phi(u) = \Phi(v)$.
-    Then 
+    Alors
     \begin{align*}
         & \psi\circ u\circ\varphi^{-1} = \psi\circ v\circ\varphi^{-1}\\
         \Leftrightarrow& \psi^{-1}\circ\psi\circ u\circ\varphi^{-1} = \psi^{-1}\circ\psi\circ v\circ\varphi^{-1}\\
@@ -72,53 +74,57 @@ To simplify notation, the set of function from $[|0,m-1|]$ to $[|0,n-1|]$ we cal
          \Leftrightarrow&u\circ\varphi^{-1}\circ\varphi =  v\circ\varphi^{-1}\circ\varphi\\
          \Leftrightarrow&u = v\\
     \end{align*}
-    Hence $\Phi$ is injective. Let's show that $\Phi$ is surjective.
-    Let $g\in B_{m\rightarrow n}$.
-    Then 
+    Ainsi $\Phi$ est injective. 
+
+    Montrons maintenant que $\Phi$ est surjective.
+    Soit $g\in B_{m\rightarrow n}$.
+    Alors
     $\Phi(\psi^{-1}\circ g\circ\varphi) = 
     \psi\circ\psi^{-1}\circ g\circ\varphi\circ\varphi^{-1} = g$
-    Hence $\Phi$ is surjective. 
-    In conclusion $\Phi$ is both injective and surjective: it is a bijection.
+    Ainsi $\Phi$ est surjective. 
+
+    En conclusion $\Phi$ est à la fois injéctive et surjéctive : c'est une bijection.
 \end{proof}
 
-$\varphi$ and $\psi$ can be seen as indices on $E$ and $F$. 
-For instance, each element $e$ in $E$ has a unique index $\varphi(e)$.
-This abstraction step allows us to build explicit functions from $E$ to $F$ without taking into consideration the type of mathematical object that contains those sets.
-    Indeed, theorem \ref{th:bij} gives us that for each function mapping indices of $E$ to indices on $F$ we can find a unique function from $E$ to $F$.
-    And the proof even gives us how to find it: by using $\Phi^{-1}$.
+$\varphi$ et $\psi$ peuvent être vus comme des indives sur $E$ et $F$.
+Par exemple, chaque élément $e$ dans $E$ a un unqiue index $\varphi(e)$.
+Cette étape d'abstraction nous permet de contruire des fonctions explicites de $E$ dans $F$ sans prendre en comptes les spécificités de objets mathématiques dans ses ensembles.
+En effet, le théorème~\ref{th:bij} nous donne que pour chaque fonction des indices de $E$ vers les indices de $F$ nous pouvons trouver une unique fonction de de $E$ dans $F$.
+Et la preuve étant constructive nous indique que pour trouver cette fonction nous pouvons utiliser $\Phi^{-1}$.
 
-    Let's now explore how the balanced accuracy behaves when composing with $\Phi$.
+    Etudions donc comment se comporte la \textit{balanced accuracy} quand on compose avec $\Phi$.
 \begin{theorem}
     \label{th:BAphi=BA}
-    Let $E$ and $F$ two finite sets.
-    Let $d$ a tuple of $E\times F$.
-    We have the following equality 
+    Soit $E$ et $F$ deux ensembles finis.
+    Soit $d$ un uplet de $E\times F$.
+    Alors nous avons l'égalitée suivante :
     \begin{equation*}
         BA^{d'}_{[|0,\#F-1|]}\circ\Phi = BA^d_F
     \end{equation*}
 \end{theorem}
 
 \begin{proof}
-    Let $E$ and $F$ two finite sets.
+    Soit $E$ et $F$ deux ensemle finis.
+    Nous avons deux bijections :
     We have two bijections : 
-    $\varphi$ from E to $[|0,\#E-1|]$ and
-    $\psi$ from F to $[|0,\#F-1|]$.
-    With those two functions we build a third bijection 
-    $\Phi$ from $B_{E\rightarrow F}$ to $B_{\#E\rightarrow \#F }$  similarly as in the proof of theorem \ref{th:bij}.
-    Let $o\in\mathbb{N}^*$ and $d$ a $o$-tuple of $E\times F$.
+    $\varphi$ de $E$ dans $[|0,\#E-1|]$ et
+    $\psi$ de $F$ dans $[|0,\#F-1|]$.
+    Avec ces deux fonctions nous allons contruire une troisième bijections 
+    $\Phi$ de $B_{E\rightarrow F}$ dans $B_{\#E\rightarrow \#F }$  similaire à celle de la preuve du théorème~\ref{th:bij}.
+    Soient $o\in\mathbb{N}^*$ et $d$ un $o$-uplet de $E\times F$.
     
-    Let $f\in B_{E\rightarrow F}$ 
-    then 
+    Soit $f\in B_{E\rightarrow F}$ 
+    alors
     \begin{equation}
         \label{eq:BAdp}
         \left(BA^{d'}_{[|0,\#F-1|]}\circ\Phi\right)(f) =  
         \frac{1}{\#F}
         \sum_{i=0}^{\#F-1}\frac{
-            \#\left\{j\in[|0,o-1|]\quad | \Phi(f)(d'_0(j))=d'_1(j)\text{ and }d'_1(j)=i\right\}}
+            \#\left\{j\in[|0,o-1|]\quad | \Phi(f)(d'_0(j))=d'_1(j)\wedge d'_1(j)=i\right\}}
         {\#\left\{j\in[|0,o-1|]\quad | d'_1(j)=i\right\}}\\
     \end{equation}
     
-    We also remark that 
+    Nous remarquons aussi que 
     \begin{equation*}
     \Phi(f)\circ d'_0= 
     \psi\circ f\circ\varphi^{-1}\circ d'_0 = 
@@ -126,7 +132,7 @@ This abstraction step allows us to build explicit functions from $E$ to $F$ with
     \psi\circ f\circ d_0
     \end{equation*}
 
-    Hence, let $j\in[|0,o-1|]$
+    Ainsi, soit $j\in[|0,o-1|]$
     \begin{align*}
         &\left(\Phi(f)\circ d'_0\right)(j) = d'_1(j)\\
         \Leftrightarrow &\left(\psi\circ f\circ d_0\right)(j)= d'_1(j)\\
@@ -134,7 +140,7 @@ This abstraction step allows us to build explicit functions from $E$ to $F$ with
         \Leftrightarrow &\left(f\circ d_0\right)(j) = d_1(j)\\
     \end{align*}
 
-    Which gives us the following assertion: 
+    Ce qui nous donnes les assertions suivantes :
     \begin{equation}
         \label{eq:d1j}
         \forall j\in[|0,o-1|]\quad
@@ -145,7 +151,7 @@ This abstraction step allows us to build explicit functions from $E$ to $F$ with
             \right]
     \end{equation}
 
-    Let's now do the same work of switching from indices to elements on "$d'_1(j) = i$".
+    De même, passons des indices aux élements sur "$d'_1(j) = i$".
     Let $i\in[|0,\#F-1|]$ and $j\in[|0,o-1|]$.
     \begin{align*}
         &d'_1(j) = i\\
@@ -153,30 +159,30 @@ This abstraction step allows us to build explicit functions from $E$ to $F$ with
         \Leftrightarrow & d_1(j) = \psi^{-1}(i)
     \end{align*}
 
-    Hence from equations \ref{eq:BAdp} and \ref{eq:d1j} we obtain
-
-    \begin{align*}
+    Ansin avec les equations \ref{eq:BAdp} et \ref{eq:d1j} nous obtenons
+    \begin{align}
         &\left(BA^{d'}_{[|0,\#F-1|]}\circ\Phi\right)(f) =  
         \frac{1}{\#F}
         \sum_{y=0}^{\#F-1}\frac{
-            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\text{ and }d_1(j)=\psi^{-1}(i)\right\}}
-        {\#\left\{j\in[|0,o-1|]\quad | d_1(j)=\psi^{-1}(i)\right\}}\\
+            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\wedge d_1(j)=\psi^{-1}(i)\right\}}
+        {\#\left\{j\in[|0,o-1|]\quad | d_1(j)=\psi^{-1}(i)\right\}}\\\nonumber
         &=
         \frac{1}{\#F}
         \sum_{y=\psi^{-1}(0),\cdots,\psi^{-1}(\#F-1)}\frac{
-            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\text{ and }d_1(j)=y\right\}}
-        {\#\left\{j\in[|0,o-1|]\quad | d_1(j)=y\right\}}\\
+            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\wedge d_1(j)=y\right\}}
+        {\#\left\{j\in[|0,o-1|]\quad | d_1(j)=y\right\}}\\\nonumber
         &=
         \frac{1}{\#F}
         \sum_{y\in F}\frac{
-            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\text{ and }d_1(j)=y\right\}}
+            \#\left\{j\in[|0,o-1|]\quad | f(d_0(j))=d_1(j)\wedge d_1(j)=y\right\}}
         {\#\left\{j\in[|0,o-1|]\quad | d_1(j)=y\right\}}\\
-    \end{align*}
+        \label{eq:fini-egaba}
+    \end{align}
 
-    The final quantity in this sequence of equalities is equal to $BA_F^d(f)$ according to definition \ref{def:BA}.
+    D'après la définition~\ref{def:BA} l'experession~\ref{eq:fini-egaba} est égale à $BA_F^d(f)$
 \end{proof}
 
-Using theorem \ref{th:BAphi=BA} we can deduce the following result which will be key to finding $\text{argmax}\left(BA_F^d\right)$.
+En utilisant le théorème~\ref{th:BAphi=BA} nous déduisons le corollère suivant qui jouera un rôle clé dans le recherche de la solution à $\text{argmax}\left(BA_F^d\right)$.
 
 \begin{corollary}
     \label{co:argmax}
@@ -187,41 +193,41 @@ Using theorem \ref{th:BAphi=BA} we can deduce the following result which will be
 \end{corollary}
 
 \begin{proof}
-    Let $f' = \text{argmax}\left(BA_{[|0,\#F-1|]}^{d'}\right)$.
-    Then for all $g$ in $B_{E\rightarrow F}$,
+    Soit $f' = \text{argmax}\left(BA_{[|0,\#F-1|]}^{d'}\right)$. 
+    Alors, pour tout $g$ dans $B_{E\rightarrow F}$,
     $BA_F^d(g) = BA_{[|0,\#F-1|]}^{d'}(\Phi(g)) \leq
     BA_{[|0,\#F-1|]}^{d'}(f') = BA_F^d(\Phi^{-1}(f'))$
 \end{proof}
 
-With corollary \ref{co:argmax} we have that, to solve the classification problem on any dataset, it is sufficient to solve on one of the corresponding indices dataset.
-The focus of the next section is on finding an algorithm to solve such a problem.
+Grâce au corollère~\ref{co:argmax} nous avons que, pour résoudres le problème de classification sur n'importe quel ensemble, il est suffisant de le résoudre sur l'ensemble d'indices correspondant.
+L'objetif de la prochaine section est donc la recherche d'un algortihme de résolution d'un tel problème.
 
-\subsection{Building a classification algorithm on $B_{m\rightarrow n}$}
-Let $m$, $n$ and $o$ be natural numbers greater than zero.
-We take $d$, a $o$-tuple of $[|0,m-1|]\times[|0,n-1|]$.
-Since we already know that we are going to work on indices, we don't bother with the general sets $E$ and $F$ from the previous section.
-Instead we use $E=\{0,1,\cdots,m-1\}$ and $F=\{0,1,\cdots,n-1\}$.
+\subsection{Contruiction d'un algorithme de classification sur $B_{m\rightarrow n}$}
 
-The direct approach to find an algorithm that maximizes $BA_{[|0,n-1|]}^d$ would be to compute the balanced accuracy for every function of $B_{m\rightarrow n}$.
-This method works fine for small values of $m$ and $n$ but becomes quickly impossible to compute as those values increase.
-Indeed, we know from combinatorics that $B_{m\rightarrow n}$ contains $n^m$ elements.
-It results in an algorithm with a number of $\mathcal{O}(on^m)$ operations.
-Instead, we propose an algorithm in $\mathcal{O}(onm)$ operations.
+Soient $m$, $n$ et $p$ des entiers naturels non-nuls.
+Soit aussi $d$, un $o$-uplet de $[|0,m-1|]\times[|0,n-1|]$.
+Come nous savons que nous allons travailler sur les indices, nous ne nous préocupons pas d'ensembles quelconqus $E$ et $F$ comme à la section précedente.
+A la place nous prenons $E=\{0,1,\cdots,m-1\}$ and $F=\{0,1,\cdots,n-1\}$.
 
-To build it we are going to "distribute" the argmax operator, simplifying the expression of the optimal balanced accuracy.
-This distribution requires to find an expression of the balanced accuracy that is context-wise more appropriate.
-We express this alternative form of the balanced accuracy in the following lemma.
+L'aproche la plus directe pour maximiser $BA_{[|0,n-1|]}^d$ serait l'algorithme qui consiste à essayer de calculer la \textit{balanced accuracy} pour toutes les fonctions de $B_{m\rightarrow n}$.
+Cette methode est viable pour des petites valeures de $m$ et $n$ mais devient rapidement impossible à calculer pour des grandes valeures.
+En effet, par denombrement nous savons que $B_{m\rightarrow n}$ contiens $n^m$ éléments.
+L'algorithme directe à donc une complexite de $\mathcal{O}(on^m)$ operations. 
+Nous allons construire à la place un algoritheme que garantie de maximiser la \textit{balanced accuracy} en $\mathcal{O}(onm)$ operations.
+
+Pour le constuire nous allons, d'une certaine manière, distribuer l'opératuer argmax, simplifiant ainsi l'expression de la \textit{balanced accuracy} optimale.
+Pour cela, dans le lemme qui suit nous allons reformuler la \textit{balanced accuracy}.
 
 \begin{lemma}
     \label{lem:sumei}
-    For every $i$ in $[|0,m-1|]$, we introduction the following $n$-tuple:
+    Pour tout $i$ dans $[|0,m-1|]$, nous definissons le $n$-uplet suivant.
     \begin{equation*}
         e_i:\left\{
             \begin{matrix}
                 [|0,n-1|]&\longrightarrow&\mathbb{N}\\
                 l&\mapsto&
                 \frac{
-                \#\{j\in[|0,o-1|]\quad| d_0(j)=i\text{ and }d_1(j)=l\}
+                \#\{j\in[|0,o-1|]\quad| d_0(j)=i\wedge d_1(j)=l\}
             }{
                 \#\{j\in[|0,o-1|]\quad| d_1(j)=l\}
             }\\
@@ -229,7 +235,7 @@ We express this alternative form of the balanced accuracy in the following lemma
             \right.
     \end{equation*}
 
-    Then we can write the balanced accuracy as:
+    Nous pouvons alors écrir la \textit{balanced accuracy} de la manière suivant :
     \begin{equation*}
         BA_{[|0,n-1|]}^d(h) = \frac{1}{n}
         \sum_{i=0}^{m-1} e_i(h(i))
@@ -237,17 +243,17 @@ We express this alternative form of the balanced accuracy in the following lemma
     Where $h\in B_{m\rightarrow n}$.
 \end{lemma}
 \begin{proof}
-    Let $l\in[|0,n-1|]$ and $h$, a function in $B_{m\rightarrow n}$.
+    Soit $l\in[|0,n-1|]$ et $h$, une fonction dans $B_{m\rightarrow n}$.
     \begin{align*}
         &
         \frac{
-            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=d_1(j)\text{ and }d_1(j)=l\}
+            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=d_1(j)\wedge d_1(j)=l\}
         }{
             \#\{j\in[|0,o-1|]\quad| d_1(j)=l\}
         }\\
         =&
         \frac{
-            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=l\text{ and }d_1(j)=l\}
+            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=l\wedge d_1(j)=l\}
         }{
             \#\{j\in[|0,o-1|]\quad| d_1(j)=l\}
         }
@@ -255,7 +261,7 @@ We express this alternative form of the balanced accuracy in the following lemma
     \begin{align*}
         =&
         \frac{
-            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=l\text{ and }d_1(j)=h(d_0(j))\}
+            \#\{j\in[|0,o-1|]\quad| h(d_0(j))=l\wedge d_1(j)=h(d_0(j))\}
         }{
             \#\{j\in[|0,o-1|]\quad| d_1(j)=l\}
         }
@@ -270,17 +276,18 @@ We express this alternative form of the balanced accuracy in the following lemma
         }
     \end{align}
 
-    In the previous expression, $l$ refers to an element of the label set $F$.
-    To prove the result we substitute in equation \ref{eq:sansi} $h(d_0(j))$ by an expression containing $i$: an element of $E$.
-    The end goal is to exhibit the quantity of interest $e_{i,j}$.
-    We first remark that for all $j$ in $[|0,o-1|]$ 
+    Dans l'expression précédente, $l$ est un élément de l'ensemble des indeices $F$.
+    Pour montrer le résultat, on remplace $h(d_0(j))$~\ref{eq:sansi} par une expression qui contient $i$ : un élément de $E$.
+    Le but de faire aparaitre la quantité qui nous intersse : $e_{i,j}$.
+
+    Nous commencons par remarquer que pour tout $j$ dans $[|0,o-1|]$
     \begin{equation*}
         h(d_0(j))=l
         \Leftrightarrow d_0(j)\in h^{-1}(\{l\})
         \Leftrightarrow \exists i\in h^{-1}(\{l\}), d_0(j)=i
     \end{equation*}
 
-    Which means that 
+    Ce qui signifie que 
     \begin{align*}
         &\left\{
             j\in[|0,o-1|]\quad| h(d_0(j)) = l
@@ -294,7 +301,7 @@ We express this alternative form of the balanced accuracy in the following lemma
         \right\}
     \end{align*}
 
-    Hence, by substitution of $\{j\in[|0,o-1|]\quad| h(d_0(j)) = l\}$ en equation \ref{eq:sansi}, we obtain
+    Aninsi, par substitution de $\{j\in[|0,o-1|]\quad| h(d_0(j)) = l\}$ dans l'équation \ref{eq:sansi}, nous obtenons
     \begin{align*}
         &\frac{
             \#\left(
@@ -314,7 +321,7 @@ We express this alternative form of the balanced accuracy in the following lemma
                 \bigcup_{i\in h^{-1}(\{l\})}
                 \left\{
                     j\in[|0,o-1|]\quad| d_0(j)=i
-                    \text{ and }
+                    \wedge
                     d_1(j)=h(d_0(j))
                 \right\}
             \right)
@@ -325,7 +332,7 @@ We express this alternative form of the balanced accuracy in the following lemma
         \frac{
             \#\left\{
                 j\in[|0,o-1|]\quad| d_0(j)=i
-                \text{ and }
+                \wedge
                 d_1(j)=h(i)
             \right\}
         }{
@@ -338,17 +345,17 @@ We express this alternative form of the balanced accuracy in the following lemma
         e_i(h(i))
     \end{align}
 
-    Then, according to definition \ref{def:BA}
+    Ensuite, d'après la définition~\ref{def:BA} 
     \begin{equation*}
         BA_{[|0,n-1|]}^d(h) = \frac{1}{n}
             \sum_{l=0}^{n-1} 
             \frac{
-                \#\left\{j\in [|0,o-1|]\quad| h(d_0(j))=d_1(j)\text{ and }d_1(j) = l\right\}
+                \#\left\{j\in [|0,o-1|]\quad| h(d_0(j))=d_1(j)\wedge d_1(j) = l\right\}
             }
             {\#\{j\in [|0,o-1|]\quad| d_1(j)=l\}}
     \end{equation*}
-    We substitute the general term of this sum by the result obtained in equation \ref{eq:sumei}:
 
+    Par substitution du terme générale de cette somme par le résultat obtenu dans l'équation~\ref{eq:sumei} :
     \begin{align*}
         &BA_{[|0,n-1|]}^d(h)\\
         =&\frac{1}{n}
@@ -359,82 +366,82 @@ We express this alternative form of the balanced accuracy in the following lemma
         \sum_{i=0}^{m-1} e_i(h(i))\sum_{l=0}^{n-1}1_{h^{-1}(\{l\})}(i)\\
     \end{align*}
     
-    Since $1_{h^{-1}(\{l\})}(i) = 1$ if and only if $l=h(i)$, 
+    Comme $1_{h^{-1}(\{l\})}(i) = 1$ si et seulement si $l=h(i)$, nous avons  
     $\sum_{l=0}^{n-1}1_{h^{-1}(\{l\})}(i) = 1$.
-
-    Hence we have the expected result.
+    Ce qui donne le résultat attendu.
 \end{proof}
 
-This lemma allows us to shift the computing of the argmax from all possible functions of $B_{m\rightarrow n}$ to the entries of the matrix $M = \left(e_i(l)\right)_{i\in[|0,m-1|],l\in[|0,m-1|]}$.
-More precisely, we find the maximum of each row of $M$ resulting in parcouring once every element of $M$.
-We formalise this idea in the following theorem.
+Ce lemme nous permet de calculer l'argmax souhaité en calculant le entrée de la matrice $M = \left(e_i(l)\right)_{i\in[|0,m-1|],l\in[|0,m-1|]}$
+au lieu de calcule la \textit{balanced accuracy} de toutes le fonctions de  $B_{m\rightarrow n}$.
+Nous cherchons donc le maximum de chaque ligne de $M$ ce qui fait que nous n'avons qu'a parcourir une fois chaque élément de $M$.
+Nous formalisons cette idée dans le théorème suivant :
 
 \begin{theorem}
-    Let $e_i$ be the following $n$-tuples of $\mathbb{N}$:
+    Soit $e_i$ le $n$-uplet de $\mathbb{N}$ suivant :
     \begin{equation*}
         e_i:\left\{
             \begin{matrix}
                 [|0,n-1|]&\longrightarrow&\mathbb{N}\\
                 l&\mapsto&
                 \frac{
-                \#\{j\in[|0,o-1|]\quad| d_0(j)=i\text{ and }d_1(j)=l\}
+                \#\{j\in[|0,o-1|]\quad| d_0(j)=i\wedge d_1(j)=l\}
             }{
                 \#\{j\in[|0,o-1|]\quad| d_1(j)=l\}
             }\\
             \end{matrix}
             \right.
     \end{equation*}
-    Let $f\in B_{m\rightarrow n}$ such that for all $i$ in $[|0,m-1|]$
+    Soit $f\in B_{m\rightarrow n}$ telle que pour tout $i$ dans $[|0,m-1|]$
     \begin{equation*}
         f(i) = \text{argmax}\left(e_i\right)
     \end{equation*}
 
-    Then 
+    Alors
     \begin{equation*}
         f = \text{argmax}\left(BA_{[|0,n-1|]}^d\right)
     \end{equation*}
 \end{theorem}
 
 \begin{proof}
-    Let $g\in B_{m\rightarrow n}$. 
-    We are going to show that $BA_{[|0,n-1|]}^d(g)\leq BA_{[|0,n-1|]}^d(f)$.
-    We start by saying that
-    for all $i\in[|0,n-1|]$, $0\leq e_i(g(i))\leq e_i(f(i))$.
-    Which gives us that
+    Soit $g\in B_{m\rightarrow n}$. 
+    Nous allons montrer que $BA_{[|0,n-1|]}^d(g)\leq BA_{[|0,n-1|]}^d(f)$.
+    Nous commencons par dire que 
+    pour tout $i\in[|0,n-1|]$, $0\leq e_i(g(i))\leq e_i(f(i))$.
+    Ce qui donne que
     \begin{equation*}
         \sum_{i=0}^{m-1}e_i(g(i)) \leq \sum_{i=0}^{m-1}e_i(f(i))
     \end{equation*}
-    and hence
+    et donc
     \begin{equation*}
         \frac{1}{n}\sum_{i=0}^{m-1}e_i(g(i)) \leq \frac{1}{n}\sum_{i=0}^{m-1}e_i(f(i))
     \end{equation*}
 
-    And finally, according to lemma \ref{lem:sumei} we have the result.
+    Enfin, en appliquant le lemme~\ref{lem:sumei} nous avons le résultat attedu.
 \end{proof}
 
-According to this result, we can write the following algorithm in $\mathcal{O}(onm)$ to solve our optimisation problem.
+En utiliant ce résulat, nous pouvons maintenant écrir l'algorithm suivant en $\mathcal{O}(onm)$ pour résoudre notre problème d'optimisation.
 
 \begin{algorithm}
-    \caption{Optimisation: finding $\text{argmax}\left(BA^d_{[|0,n-1|]}\right)$}
+    \caption{Optimisation: recherche de l'$\text{argmax}\left(BA^d_{[|0,n-1|]}\right)$}
     \label{algo:argmax}
     \begin{algorithmic}
         \For{$i\gets 0,\cdots,m-1$} 
             \For{$l\gets 0,\cdots,n-1$}
                 \State $e_{i,l}\gets
                 \frac{
-                \#\{j\in[|0,o-1|]\quad | d_0(j)=i\text{ and }d_1(j)=l\}
+                \#\{j\in[|0,o-1|]\quad | d_0(j)=i\wedge d_1(j)=l\}
             }{
                 \#\{j\in[|0,o-1|]\quad | d_1(j)=l\}
             }$
                 \footnotesize
-                \Comment{Compute $e_i(l)$}
+                \Comment{Calcul de $e_i(l)$}
                 \normalsize
             \EndFor
         \EndFor
         \For{$i\gets 0,\cdots,n-1$}
             \State $f(i)\gets\text{argmax}_l(e_{i,l})$
             \footnotesize
-            \Comment{Value of $l$ that maximizes $e_{i,l}$}
+            \Comment{Valeur de $l$ que maximise $e_{i,l}$}
             \normalsize
         \EndFor
         \State \Return $f$
@@ -443,16 +450,16 @@ According to this result, we can write the following algorithm in $\mathcal{O}(o
 
 \FloatBarrier
 \subsection{Extention to unseen data}
-Alogrithm \ref{algo:argmax} is an efficient algorithm to find a classifier the maximizes balanced accuracy on the set of indices. 
-From the result $f$ of this alogrithm we find a classifier that solves the problem of maximizing the balanced accuracy on element by applying the inversse of $\Phi$.
-Hence $\Phi^{-1}(f)$ is solution.
-Computing it requires $\mathcal{O}(on)$ operations resulting in an overall complexity of $\mathcal{O}(onm)$.
+%Alogrithm \ref{algo:argmax} is an efficient algorithm to find a classifier the maximizes balanced accuracy on the set of indices. 
+%From the result $f$ of this alogrithm we find a classifier that solves the problem of maximizing the balanced accuracy on element by applying the inversse of $\Phi$.
+%Hence $\Phi^{-1}(f)$ is solution.
+%Computing it requires $\mathcal{O}(on)$ operations resulting in an overall complexity of $\mathcal{O}(onm)$.
 
-This classifier algorithm is limited to finite feature space but there are cases where we can find workaround to still use it.
-For instance, by using clusturing prior to our method we can reduce to a finit feature space.
-Also, if $(E, O)$ is a sub-topology we can match any element of the englobing set to its nearest counterpart in $E$.
-We did that on LAW and COMPAS dataset and compare our approach to a random forest.
+%This classifier algorithm is limited to finite feature space but there are cases where we can find workaround to still use it.
+%For instance, by using clusturing prior to our method we can reduce to a finit feature space.
+%Also, if $(E, O)$ is a sub-topology we can match any element of the englobing set to its nearest counterpart in $E$.
+%We did that on LAW and COMPAS dataset and compare our approach to a random forest.
 
 
-The main takeaway from figures \ref{fig:ba} and \ref{fig:time} is that our finite classifier alogirthm outperforms state of the art in terms of balanced accuracy and is way faster at achieving this result.
+%The main takeaway from figures \ref{fig:ba} and \ref{fig:time} is that our finite classifier alogirthm outperforms state of the art in terms of balanced accuracy and is way faster at achieving this result.
author	Jan Aalmoes <jan.aalmoes@inria.fr>	2024-09-04 00:12:49 +0200
committer	Jan Aalmoes <jan.aalmoes@inria.fr>	2024-09-04 00:12:49 +0200
commit	0e95544f85b523a95fb05b36c4e6b8789c73abfa (patch)
tree	4229bad597b487e2395a2fc91e7023c6672bb29d /classification_finie/finit_classif.tex
parent	dc5a898dc39326fa3f733f3d9e006bbe3d1f8e4c (diff)