The Nature of Information V
George Hrabovsky
MAST
Introduction
In this article I will write some Mathematica code for determining the entropy of a string.
Characterizing Probabilities
The new version of Mathematica allows you to deal with probabilities in a symbolic way.
We begin by determining a probability. Suppose we choose a probability distribution, such as the Gaussian, or normal, distribution. What is the probability of finding a value, say x, such that 1 ≤ x ≤ 4?
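One way to compute this with the new symbolic probability functions is the following sketch; it uses Probability and NormalDistribution, and the parameters are left symbolic:

(* probability that a normally distributed value x lies in the interval 1 <= x <= 4,
   with the mean μ and standard deviation σ left symbolic *)
Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[μ, σ]]
(* the result is equivalent to (1/2) (Erfc[(1 - μ)/(Sqrt[2] σ)] - Erfc[(4 - μ)/(Sqrt[2] σ)]) *)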
Here μ is the mean of the distribution and σ is the standard deviation. Erfc is the complementary error function, which for a complex number z is

Erfc(z) = 1 − erf(z),

where

erf(z) = (2/√π) ∫₀^z e^(−t²) dt.
We can make a plot of the probability over a range of values for μ and σ.
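A sketch of such a plot, taking the plotted quantity to be the probability above; the ranges of μ and σ are chosen only for illustration:

(* the probability as a symbolic expression in μ and σ *)
prob = Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[μ, σ]];
(* surface plot over assumed ranges, keeping σ away from 0 *)
Plot3D[prob, {μ, -2, 6}, {σ, 0.1, 3}, AxesLabel -> {"μ", "σ", "probability"}]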
From this we can choose μ = 0.5 and σ = 0.5.
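Substituting these values gives a number; for example:

(* the same probability with μ = 0.5 and σ = 0.5 *)
p = N[Probability[1 <= x <= 4, x \[Distributed] NormalDistribution[1/2, 1/2]]]
(* ≈ 0.158655 *)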
The Entropy and Strings
From this probability we can compute the entropy.
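One possible reading, and it is only an assumption here, is the Shannon entropy of the two-outcome event: the value either falls in 1 ≤ x ≤ 4 (probability p) or it does not (probability 1 − p):

(* two-outcome Shannon entropy, in bits, of the probability found above *)
p = 0.158655;
-p Log[2, p] - (1 - p) Log[2, 1 - p]
(* ≈ 0.631 bits *)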
This can be extended to strings. Say we have four symbols drawn from the same distribution.
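One way to set this up, purely as an illustration, is to give each of the four symbols the probability that a value drawn from the distribution above falls in one of four intervals that partition the real line; the choice of intervals is an assumption:

dist = NormalDistribution[1/2, 1/2];
(* four symbols, one for each interval of a partition of the real line *)
probs = N[{Probability[x <= 0, x \[Distributed] dist],
           Probability[0 < x <= 1, x \[Distributed] dist],
           Probability[1 < x <= 2, x \[Distributed] dist],
           Probability[x > 2, x \[Distributed] dist]}]
(* ≈ {0.1587, 0.6827, 0.1573, 0.0013}, summing to 1 *)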
The probability of the string is then the product of the probabilities of its symbols.
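Continuing with the probs list from the sketch above, the product of the entries gives the probability of the string:

(* probability of the particular four-symbol string, assuming independent symbols *)
pString = Times @@ probs
(* ≈ 2.3*10^-5 for the assumed intervals *)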
From these probabilities we can compute the entropy of the string.
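A sketch under the same assumptions, taking the entropy of the string to be the length of the string times the per-symbol Shannon entropy:

(* per-symbol Shannon entropy in bits, from the probs list above *)
h1 = -Total[probs Log[2, probs]];
(* entropy of a string of four such symbols *)
hString = 4 h1
(* ≈ 4.9 bits for the assumed probabilities *)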
We can also count the actual number of strings of this length.
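With four possible symbols in each of the four positions, the count is:

(* number of distinct strings of length 4 over a 4-symbol alphabet *)
4^4
(* 256 *)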
This tells us that there are 256 possible strings, while the effective number of strings is far smaller.
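Taking the effective number of strings to be 2 raised to the entropy of the string, a sketch under the assumptions above:

(* effective number of strings, 2^hString, with hString from the sketch above *)
2^hString
(* ≈ 30 for the assumed probabilities, compared with 256 possible strings *)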
The effective number is much lower than the actual number of strings, and this is what reduces the possible error.