A few months back I managed to find an article presenting the Legendre transform in a satisfactory way. In essence:
The Legendre transform $f^*$ of a (strictly) convex $C^2$ function $f$ is a "reinterpretation" of the function in terms of its slope at every point, possible due to injectivity of the convex function's derivative.
Since then I have barely touched any related material, and a few days ago I tried to reconstruct the definition on my own. Assuming $f\in C^2,f''>0$, it didn't take me long to pin the description to the following formula:
$$f^*(s)=f\left((f')^{-1}(s)\right),$$
which turned out to be wrong (I use $s$ instead of $p$ to remind myself it's the slope).
I can easily deduce involutivity and $(f^*)'=(f')^{-1}$ from the correct formula $f^*(s)=x(s)\cdot s-f(x(s))$ where $x(s)=(f')^{-1}(s)$, but I fail to see how these properties determine what the correct formula should be.
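For reference, here is the computation I have in mind, as a sketch under the standing assumption $f\in C^2$, $f''>0$ (so that $x(s)$ is differentiable and $f'(x(s))=s$):

$$(f^*)'(s)=x'(s)\,s+x(s)-f'(x(s))\,x'(s)=x(s)+x'(s)\bigl(s-f'(x(s))\bigr)=x(s)=(f')^{-1}(s).$$

Involutivity then follows by applying the same formula to $f^*$: the point where $f^*$ has slope $x$ is $s=f'(x)$, so

$$f^{**}(x)=f'(x)\cdot x-f^*(f'(x))=f'(x)\,x-\bigl(x\,f'(x)-f(x)\bigr)=f(x).$$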
I also understand the geometric picture where $f^*(s)$ is, up to sign, the y-intercept of the tangent to $f$ at $x$ (the intercept itself is $f(x)-s\cdot x=-f^*(s)$), but that's just a visual representation of the correct formula and does not explain it any further (it is not clear why we should take the y-intercept instead of e.g. the x-intercept or the intersection with any other line).
Here are my questions. Assume $f\in C^2,f''>0$ for simplicity throughout the question; the generalization to $\max_x(sx-f(x))$ then follows easily.
- Why/how is my wrong formula wrong? I actually managed to partially answer this question myself, and share it as an answer below.
- Is there a natural way to fix the wrong formula? What reasoning can I use to deduce that I should subtract my formula specifically from $x(s)\cdot s$ to make it involutive (and not, say, multiply it by $\exp(f'(x))$ or whatever instead)?
- Alternatively, if there's no easy way to fix it, we can start afresh - is there really only one Legendre transform? Can we deduce its formula only from the two properties, involutivity and $(f^*)'=(f')^{-1}$? In fact, are these two really defining properties for the transform?
I'm studying classical mechanics with an eye towards QM and just want a perfectly clear view of what exactly physicists do when they change variables and transform the Lagrangian into a Hamiltonian. The paper where I read about the "reinterpretation" of $f$ is here https://arxiv.org/abs/0806.1147. Another paper with a similar line of thought is this one https://www.andrew.cmu.edu/course/33-765/pdf/Legendre.pdf, which even spends a paragraph on the wrong formula, but (sadly) then directly reveals the correct answer.
My gut feeling is that my formula does some horizontal squish-&-stretch on the graph of $f$, destroying convexity (I checked with $x\ln x$, it transforms into $(s-1)\exp(s-1)$ which is concave for $s<-1$). And somehow subtracting it from $s\cdot x(s)$ is a way to bring convexity back. Is it the only way?
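A quick numerical sanity check of the $x\ln x$ example, as a minimal sketch rather than a proof. The helper names (`x_of_s`, `naive`, `legendre`, `second_derivative`) and the step size `h` are my own choices for illustration; the second derivative is only estimated by a central finite difference.

```python
import math

def f(x):
    # Example function f(x) = x*ln(x), strictly convex for x > 0.
    return x * math.log(x)

def x_of_s(s):
    # Inverse of f'(x) = ln(x) + 1, i.e. the point where f has slope s.
    return math.exp(s - 1.0)

def naive(s):
    # The "wrong" formula: f evaluated at the point with slope s.
    return f(x_of_s(s))

def legendre(s):
    # The standard formula: s*x(s) - f(x(s)).
    x = x_of_s(s)
    return s * x - f(x)

def second_derivative(g, s, h=1e-3):
    # Central finite-difference estimate of g''(s); h is an assumed step size.
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

for s in (-3.0, -2.0, 0.0, 1.0):
    print(s, second_derivative(naive, s), second_derivative(legendre, s))
```

At the sampled slopes, the estimate for the wrong formula is negative for $s<-1$ and positive for $s>-1$, matching $\frac{d^2}{ds^2}\,(s-1)e^{s-1}=(s+1)e^{s-1}$, while the standard formula reduces to $e^{s-1}$ here and stays convex.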