Gradient of the softmax function with an inner-product argument

Suppose that $x_i$ and $y_j$ are vectors in $\mathbb{R}^d$, where $i,j\in\{1,2,\dots,N\}$. Let the loss function be defined as $$\mathcal{L} = -\frac{1}{N}\ln\left(\frac{\exp\left(\frac{x_i\cdot y_i}{\tau\lVert x_i \rVert \lVert y_i \rVert }\right)}{\sum_{j=1}^N \exp\left(\frac{x_i\cdot y_j}{\tau\lVert x_i \rVert \lVert y_j \rVert }\right)}\right),$$ where $\cdot$ denotes the dot product and $\tau$ is a constant. The goal is to find the gradient $\frac{\partial \mathcal{L}}{\partial x_i}$. I tried writing the vectors in component form, but computing the gradient that way is really difficult, even in the case $d = 2$ and $N = 2$, mainly because of the multiple function compositions. How can the gradient be computed using vector calculus? Perhaps with the help of vector calculus the gradient can be computed more efficiently.
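One route that avoids writing out components, sketched here with shorthand that is not in the question ($s_j$, $p_j$, $\hat{x}_i$, $\hat{y}_j$): write $s_j = \frac{x_i\cdot y_j}{\tau\lVert x_i\rVert\lVert y_j\rVert}$ and $p_j = \frac{e^{s_j}}{\sum_{k=1}^N e^{s_k}}$, so that $\mathcal{L} = -\frac{1}{N}\ln p_i$. Then
$$\frac{\partial\mathcal{L}}{\partial s_j} = \frac{1}{N}\left(p_j-\delta_{ij}\right), \qquad \frac{\partial s_j}{\partial x_i} = \frac{1}{\tau\lVert x_i\rVert}\left(\hat{y}_j-(\hat{x}_i\cdot\hat{y}_j)\,\hat{x}_i\right),$$
where $\hat{x}_i = x_i/\lVert x_i\rVert$ and $\hat{y}_j = y_j/\lVert y_j\rVert$, and the chain rule gives
$$\frac{\partial\mathcal{L}}{\partial x_i} = \sum_{j=1}^N\frac{\partial\mathcal{L}}{\partial s_j}\,\frac{\partial s_j}{\partial x_i} = \frac{1}{N\tau\lVert x_i\rVert}\sum_{j=1}^N\left(p_j-\delta_{ij}\right)\left(\hat{y}_j-(\hat{x}_i\cdot\hat{y}_j)\,\hat{x}_i\right).$$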


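Whatever closed form one derives can be sanity-checked numerically. Below is a minimal NumPy sketch that compares the analytic gradient from the sketch above against a central finite-difference approximation; the array names and the choices of `N`, `d`, `tau`, and `i` are illustrative assumptions, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, tau, i = 4, 3, 0.5, 1          # illustrative sizes, temperature, and anchor index
X = rng.normal(size=(N, d))          # rows are x_1, ..., x_N
Y = rng.normal(size=(N, d))          # rows are y_1, ..., y_N

def loss(xi):
    """L = -(1/N) * ln softmax_i of the scaled cosine similarities s_j."""
    s = (Y @ xi) / (tau * np.linalg.norm(xi) * np.linalg.norm(Y, axis=1))
    return -np.log(np.exp(s[i]) / np.exp(s).sum()) / N

def grad_analytic(xi):
    """Gradient from the chain-rule sketch:
    (1 / (N * tau * ||x_i||)) * sum_j (p_j - delta_ij) * (y_j_hat - (x_hat . y_j_hat) x_hat)."""
    x_hat = xi / np.linalg.norm(xi)
    Y_hat = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    s = (Y_hat @ x_hat) / tau                      # scaled cosine similarities
    p = np.exp(s - s.max())
    p /= p.sum()                                   # softmax probabilities p_j
    w = p.copy()
    w[i] -= 1.0                                    # p_j - delta_ij
    G = Y_hat - np.outer(Y_hat @ x_hat, x_hat)     # rows: y_j_hat - (x_hat . y_j_hat) x_hat
    return (w @ G) / (N * tau * np.linalg.norm(xi))

xi = X[i]
eps = 1e-6
grad_fd = np.array([(loss(xi + eps * e) - loss(xi - eps * e)) / (2 * eps)
                    for e in np.eye(d)])           # central finite differences
print(np.allclose(grad_analytic(xi), grad_fd))     # expected: True if the formula is right
```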