Histo:
The histo application calculates and displays univariate statistics for one or more data sets. For a
sample population, histo can calculate basic statistics such as the mean and variance. It can
also display the behavior of several populations at once using stacked histograms and
box and whisker plots. Histo can also be used to infer information about the population (for example,
whether the population is normal) using a χ² (chi-squared) test or by plotting the frequency
distribution as a probability plot.
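As an illustration of the kind of calculation described above (not histo's own implementation), the sketch below computes the mean and variance of a sample and a chi-squared goodness-of-fit statistic against a fitted normal distribution. The function name `chi_squared_normal_stat`, the choice of ten equal-probability bins, and the synthetic data are all assumptions made for the example.

```python
import random
import statistics
from statistics import NormalDist


def chi_squared_normal_stat(sample, n_bins=10):
    """Chi-squared goodness-of-fit statistic against a fitted normal.

    Hypothetical helper: bins are chosen so that each has equal
    probability under the normal fitted to the sample.
    """
    fitted = NormalDist(statistics.mean(sample), statistics.stdev(sample))
    # Interior bin edges at the 1/n_bins, 2/n_bins, ... quantiles.
    edges = [fitted.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    observed = [0] * n_bins
    for x in sample:
        # The bin index is the number of edges lying below x.
        observed[sum(x > e for e in edges)] += 1
    expected = len(sample) / n_bins
    return sum((o - expected) ** 2 / expected for o in observed)


# Synthetic, strongly skewed data used only for demonstration.
rng = random.Random(0)
demo_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

print("mean:", statistics.mean(demo_sample))
print("variance:", statistics.variance(demo_sample))
print("chi-squared statistic:", chi_squared_normal_stat(demo_sample))
```

With ten bins and two fitted parameters, the statistic would be compared against a chi-squared distribution with 10 - 1 - 2 = 7 degrees of freedom, whose 5% critical value is about 14.07; a much larger statistic suggests the sample is not normal.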
One of the basic assumptions of many estimation techniques is that the sample set is normally
distributed. A histogram is a quick way to get a feel for the shape of the sample
distribution. For a normal (Gaussian) distribution one would expect an approximately bell-shaped
curve with similar tails. This data set is not normally distributed.
The yellow dashed line indicates the mean, the white dashed line is the median, and the gray bars
represent one, two, and three standard deviations.
[Figure: histogram of the raw data]
Probability plots are also useful for determining whether a data set has a normal distribution. Ideally
the sample points should form a straight line. As the plot shows, this data set is not
normally distributed.
[Figure: probability plot of the raw data]
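The straight-line criterion can be made concrete with a short sketch. The code below pairs the sorted sample with standard normal quantiles and measures how close the points are to a straight line using the correlation coefficient. The function names and the plotting positions (i + 0.5)/n are assumptions for illustration; histo may use a different convention.

```python
import math
import random
from statistics import NormalDist


def probability_plot_points(sample):
    """Return (theoretical normal quantiles, sorted sample values)."""
    n = len(sample)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n stay strictly between 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, sorted(sample)


def straightness(sample):
    """Pearson correlation of the probability-plot points.

    Values close to 1 mean the points lie close to a straight line.
    Assumes the sample is not constant.
    """
    xs, ys = probability_plot_points(sample)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic data used only for demonstration.
rng = random.Random(0)
normal_demo = [rng.gauss(0.0, 1.0) for _ in range(500)]
skewed_demo = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

print("normal sample:", straightness(normal_demo))
print("skewed sample:", straightness(skewed_demo))
```

A skewed sample bends away from the line, so its correlation is visibly lower than that of a normal sample of the same size.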
Raw data that are not normally distributed can still be used with various
estimation techniques. Some methods don't rely on the normality assumption (e.g. indicator kriging),
but even for those that do, transforming the raw data can resolve the issue. Here a log10 transform
was applied to the sample. While not perfectly normal, the result is much closer.
[Figure: histogram of the log10-transformed data]
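A minimal sketch of the effect of the transform, assuming strictly positive data: the sample skewness of a right-skewed data set drops toward zero after taking log10. The helper names and the synthetic log-normal sample are hypothetical and are not taken from histo.

```python
import math
import random
import statistics


def skewness(sample):
    """Sample skewness: mean of the cubed standardized values."""
    mean = statistics.fmean(sample)
    sd = statistics.pstdev(sample)
    return sum(((x - mean) / sd) ** 3 for x in sample) / len(sample)


def log10_transform(sample):
    """Apply a log10 transform; every value must be positive."""
    return [math.log10(x) for x in sample]


# Synthetic right-skewed data used only for demonstration.
rng = random.Random(0)
raw_demo = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]

print("skewness of raw data:", skewness(raw_demo))
print("skewness after log10:", skewness(log10_transform(raw_demo)))
```

A symmetric distribution such as the normal has skewness near zero, so a value close to zero after the transform is consistent with (though not proof of) log-normal data.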
Again using a log10 transform and displaying the results on a probability plot, a nearly straight line
is shown, suggesting the data are log-normally distributed.
[Figure: probability plot of the log10-transformed data]
Histo can also compare several data sets at once (distcomp is more robust for this). Here, sonic
velocities from seven hydrofacies units are compared. The goal is to help show that the hydrofacies
are different.
[Figure: stacked histograms of sonic velocity for the seven hydrofacies units]
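Comparing several data sets in one histogram requires that all of them be counted against the same bin edges. The sketch below shows one way this might be done; the function name, the equal-width binning, and the made-up group values are assumptions for illustration and are not real hydrofacies data.

```python
def shared_bin_counts(groups, n_bins=8):
    """Count each group's values in equal-width bins spanning all groups.

    `groups` maps a group name to a list of numeric values.
    """
    all_values = [v for values in groups.values() for v in values]
    lo, hi = min(all_values), max(all_values)
    if hi == lo:
        raise ValueError("all values are identical; cannot form bins")
    width = (hi - lo) / n_bins
    counts = {}
    for name, values in groups.items():
        row = [0] * n_bins
        for v in values:
            # Clamp so the overall maximum falls in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            row[idx] += 1
        counts[name] = row
    return counts


# Made-up values for two hypothetical groups.
demo_groups = {"unit_a": [1.0, 2.0, 3.0], "unit_b": [3.0, 4.0, 5.0]}
for name, row in shared_bin_counts(demo_groups, n_bins=4).items():
    print(name, row)
```

Stacking the per-group rows bin by bin gives the bar heights of a stacked histogram.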
The same hydrofacies data are also shown in a box and whisker plot. These plots are useful for displaying
the mean, median, 10th/90th percentiles, 25th/75th percentiles, full sample range, and standard deviation of
each data set.
[Figure: box and whisker plot of sonic velocity for the seven hydrofacies units]
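The quantities listed above can be computed for a single data set as in the following sketch. The function name `box_whisker_summary` and the "inclusive" percentile interpolation are assumptions; histo may define its percentiles differently.

```python
import statistics


def box_whisker_summary(sample):
    """Summary statistics typically drawn on a box and whisker plot."""
    # n=20 yields cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(sample, n=20, method="inclusive")
    return {
        "mean": statistics.fmean(sample),
        "median": statistics.median(sample),
        "p10": cuts[1],
        "p25": cuts[4],
        "p75": cuts[14],
        "p90": cuts[17],
        "min": min(sample),
        "max": max(sample),
        "stdev": statistics.stdev(sample),
    }


# Simple illustrative data: the integers 1 through 100.
for key, value in box_whisker_summary(list(range(1, 101))).items():
    print(key, value)
```

Calling the function once per data set and drawing the results side by side reproduces the information content of a multi-group box and whisker plot.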