How is the Shannon Information Content related to uncertainty?
I'm a data science student currently writing my master's thesis, which revolves around the Cross-Entropy (CE) loss function for neural networks. From my understanding, CE is based on entropy, which in turn is based on the Shannon Information Content (SIC). However, I struggle to interpret and explain it in a way that my fellow students can understand without using concepts from information theory (which is itself a completely different and complicated area).
In the book Elements of Information Theory by Cover and Thomas, the (Shannon) entropy is described as a measure of the uncertainty of a random variable:
$H(X) = \sum_x p(x) \log \left( \frac{1}{p(x)} \right)$
So my question is: if the entropy is the expected amount of uncertainty of a random variable $X$, shouldn't the SIC then quantify the uncertainty of a single outcome $x$?
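To check my own intuition, I wrote a minimal sketch (plain Python; the biased coin is just a hypothetical example distribution) that computes the SIC $\log \frac{1}{p(x)}$ of each individual outcome and verifies that the entropy is exactly the probability-weighted average of those per-outcome values:

```python
import math

# Hypothetical biased coin: P(heads) = 0.9, P(tails) = 0.1
p = {"heads": 0.9, "tails": 0.1}

# SIC of each individual outcome: h(x) = log2(1 / p(x)), in bits
sic = {x: math.log2(1 / px) for x, px in p.items()}
print(sic)  # heads ~0.152 bits (unsurprising), tails ~3.32 bits (surprising)

# Entropy = expected SIC, i.e. the probability-weighted average over outcomes
H = sum(px * sic[x] for x, px in p.items())
print(H)  # ~0.469 bits
```

The rare outcome carries a large SIC and the common one almost none; the entropy averages them, which is presumably why it reads as the expected surprise of the whole variable.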
The only interpretation (without relying on information theory) I could find is that the SIC is a natural measure of the information content of some event $x$. So, am I completely off with my interpretation, or did I overlook some literature?
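For completeness, since my thesis is about the CE loss: under this reading, cross-entropy would just be the SIC that a model distribution $q$ assigns to each outcome, averaged under the true distribution $p$. A minimal sketch of that, assuming a hypothetical two-class prediction:

```python
import math

p = [1.0, 0.0]  # true (one-hot) label distribution
q = [0.8, 0.2]  # hypothetical model output (softmax probabilities)

# Cross-entropy: expected SIC of the model's predictions under p,
# H(p, q) = sum_x p(x) * log(1 / q(x)); skip terms with p(x) = 0
ce = sum(px * math.log(1 / qx) for px, qx in zip(p, q) if px > 0)
print(ce)  # ~0.223 nats: the SIC the model assigns to the correct class
```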
I would be very grateful for any tips on how to proceed, as I'm getting desperate at this point and don't know how to explain the concept as simply as possible.
Topic cross-entropy information-theory probability loss-function
Category Data Science