I am reading Monge's formulation of the optimal transportation problem. It says that one wishes to find the a transport map $T: (X,\mu) \rightarrow (Y,\nu)$ that minimizes the transport cost $$\int_X |x-T(x)|d\mu (x).$$
Here we assume that $T$ is a mass preserving map, namely $\nu(B) = \mu(T^{-1}(B))$ for any measurable set $B$ in $Y$ (i.e. the amount of mass that ends up in any set $B$ in the target space $Y$ is exactly the amount of mass that was originally in the preimage $T^{-1}(B)$ in the source space $X$.)
My question: Why don't we define a mass preserving map as $\mu(B)= \nu(T(B))$ for any measurable set $B$ in $X$? Is there something unrealistic with this definition?