Normal distribution is one of the most commonly found distribution types in nature. The normal distribution is a continuous probability distribution where the data tends to cluster around a mean or average. If you were to plot the frequency distribution of a normal distribution, you will tend to get the famous inverted bell-shaped curve also known as the Gaussian function.
Coming to the point, we are sometimes faced with situations where we would like to test out a hypothesis or carry out some experiment on a dummy set of normally distributed data points. To create a normally distributed set of number in Excel, you can use this formula:
=NORMINV(RAND(), mean_value, standard_deviation)
Ex. =NORMINV(RAND(), 10, 50)
All you need to do is to enter this formula in a single cell and copy to as many cells as you want.
mean = is the simple average of the given set of numbers
standard deviation = measure of the dispersion of a set of data from its mean
RAND() = Excel function that generates a random fractional value between 0 and 1 (both included)
How does this formula for generating normal distribution work ?
The formula NORMINV() function in excel basically generates a number that has a probability represented by RAND() and which belongs to a normal distribution curve with a mean represented by mean_value and standard deviation represented by standard_deviation in the above formula. Now RAND() basically generates a fractional random number between 0 and 1 and therefore can be used to generate a random percentage figure between 0 and 100%. Hence by using both the RAND() and the NORMINV() functions together, we can create a set of numbers that are normally distributed over the entire range (0 to 100%) of a normal distribution curve with a given mean and standard deviation.
To make this even more flexible, enter the desired mean in cell A1 and the desired standard deviation in A2, and reference these values in your formula:
=NORMINV(RAND(), $A$1, $A$2)
June 10th, 2009 at 9:45 amIf the above is true, then after we generate the random set of data, its mean and standard deviation should be 10 and 50 respectively. But thats not what comes if u try to find it for the set. Try it.
June 7th, 2010 at 11:40 pmThe answer should not be excactly what you give it i.e. 10 and 50, but it is close to the original values. The whole idea is that you want to get randomness into the dataset, therefore you get small random variations from the given values. It is meant to be like that, since we try to simulate a sampling process.
If you do it for a large sample (10 000) you get more accurate stdev and average.
August 30th, 2010 at 6:05 amWorks well, but it also generates negative numbers. I want to generate a normally distributed population with only positive numbers.
October 27th, 2014 at 12:00 pmSuppose, if I am having average values, i.e 100 average values of 5 observations each, and if i would like to creat a new data set with same average but different observations, then how could I do it?
April 5th, 2016 at 11:33 am