Gradient of the softmax function with an inner-product argument

Suppose that $x_i$ and $y_j$ are vectors in $\mathbb{R}^d$, where $i,j\in\{1,2,\dots,N\}$. Let the loss function be defined as $$\mathcal{L} = -\frac{1}{N}\ln\left(\frac{\exp\left(\frac{x_i \cdot y_i}{\tau\,\lVert x_i \rVert\, \lVert y_i \rVert}\right)}{\sum_{j=1}^N \exp\left(\frac{x_i \cdot y_j}{\tau\,\lVert x_i \rVert\, \lVert y_j \rVert}\right)}\right),$$ where $\cdot$ denotes the dot product and $\tau$ is a constant. The goal is to find the gradient $\frac{\partial \mathcal{L}}{\partial x_i}$. I tried writing the vectors in component form, but computing the gradient that way is really difficult, even for $d = 2$ and $N = 2$, mainly because of the multiple function compositions. How can the gradient be computed using vector calculus? Presumably a coordinate-free approach makes the computation much more manageable.
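For what it is worth, here is one way the chain rule might be organized in vector form (a sketch I have not fully verified). Writing $s_j = \frac{x_i \cdot y_j}{\lVert x_i \rVert \lVert y_j \rVert}$ for the cosine similarities and $p_j = \operatorname{softmax}(s/\tau)_j$, the standard log-softmax gradient combined with the gradient of the cosine similarity would give $$\frac{\partial \mathcal{L}}{\partial x_i} = \frac{1}{N\tau}\sum_{j=1}^N \left(p_j - \delta_{ij}\right)\nabla_{x_i} s_j, \qquad \nabla_{x_i} s_j = \frac{1}{\lVert x_i \rVert}\left(\frac{y_j}{\lVert y_j \rVert} - s_j\,\frac{x_i}{\lVert x_i \rVert}\right),$$ where $\delta_{ij}$ is the Kronecker delta.

Any such closed form can be checked against automatic differentiation. Below is a minimal JAX sketch of that check; the function name `loss` and the random test data are my own choices, not part of the problem statement.

```python
import jax
import jax.numpy as jnp

def loss(x_i, Y, i, tau):
    # Cosine similarities s_j = (x_i . y_j) / (||x_i|| ||y_j||), j = 1..N.
    s = (Y @ x_i) / (jnp.linalg.norm(x_i) * jnp.linalg.norm(Y, axis=1))
    logits = s / tau
    # L = -(1/N) * log softmax(logits)[i]
    return -(logits[i] - jax.scipy.special.logsumexp(logits)) / Y.shape[0]

d, N, tau, i = 4, 6, 0.5, 2
x_i = jax.random.normal(jax.random.PRNGKey(0), (d,))
Y = jax.random.normal(jax.random.PRNGKey(1), (N, d))

grad_auto = jax.grad(loss)(x_i, Y, i, tau)  # dL/dx_i via autodiff

# The chain-rule sketch above, evaluated directly for comparison.
norm_x = jnp.linalg.norm(x_i)
s = (Y @ x_i) / (norm_x * jnp.linalg.norm(Y, axis=1))
p = jax.nn.softmax(s / tau)
# Rows of grad_s are the per-term gradients nabla_{x_i} s_j.
grad_s = (Y / jnp.linalg.norm(Y, axis=1)[:, None] - s[:, None] * x_i / norm_x) / norm_x
grad_manual = ((p - jnp.eye(N)[i])[:, None] * grad_s).sum(axis=0) / (N * tau)

print(jnp.allclose(grad_auto, grad_manual, atol=1e-6))  # expected: True
```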

