Tuesday, January 11, 2011

Probability and statistics

A sample space is the set of all possible outcomes. When rolling a die, the sample space is S = {1, 2, 3, 4, 5, 6}. Page 3 has a bunch of examples, such as the sum of two dice, where the space is S = {2, 3, ..., 12}. For the experiment of flipping a fair coin until two successive heads appear, the sample space for the number of flips required is S = {x : x = 2, 3, 4, ...}: the count starts at the smallest possible value (here two flips, since two successive heads cannot happen sooner) and continues without bound, so the space is countably infinite.
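As a quick check of the two-dice example, here is a minimal Python sketch (the variable names are my own, not from the book) that enumerates the possible sums:

```python
from itertools import product

# Enumerate the sample space for the sum of two fair dice.
# Matches the S = {2, 3, ..., 12} example above.
faces = range(1, 7)
sums = sorted({a + b for a, b in product(faces, faces)})
print(sums)  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```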

We can also have sample spaces such as {match, difference} or {male, female}, and we can code these outcomes as {0, 1} when convenient. Random variables are written with capital letters X, Y, Z, and their possible values with lowercase x, y, z. For example, let the random variable X be the number of heads in two flips of a coin. Then x = 0, 1, 2, and the sample space is S = {x : x = 0, 1, 2}.
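To make the heads-in-two-flips example concrete, here is a small Python sketch (mine, not from the notes) that lists the four equally likely flip sequences and tallies the theoretical distribution of X:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# X = number of heads in two flips of a fair coin.
# Each of the four equally likely sequences HH, HT, TH, TT has probability 1/4.
outcomes = Counter(seq.count('H') for seq in map(''.join, product('HT', repeat=2)))
dist = {x: Fraction(count, 4) for x, count in sorted(outcomes.items())}
print(dist)  # {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```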

Probability based on theory gives us the distribution of the random variable. If we can determine by theory the fractions of times a random experiment ends in the respective outcomes, we have described a distribution of the random variable, sometimes called a population. Often we cannot determine the distribution through theoretical reasoning and must actually perform the random experiment a number of times to obtain estimates of these fractions.
To understand the background behind statistical inferences made from a sample, we need some knowledge of probability, basic distributions, and sampling distribution theory.
Law of large numbers: a theorem that describes the result of performing the same experiment a large number of times (see Wikipedia: http://en.wikipedia.org/wiki/Law_of_large_numbers).
A relative frequency is usually very unstable for small values of n, but it tends to stabilize about some number, say p, as n increases.
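A quick simulation makes that stabilization visible. The sketch below (the true probability 0.5 and the checkpoints are my own choices for illustration) tracks the relative frequency of heads as the number of flips n grows:

```python
import random

random.seed(0)  # reproducible illustration

p_true = 0.5  # assumed probability of heads for a fair coin
heads = 0
for n in range(1, 100_001):
    heads += random.random() < p_true
    if n in (10, 100, 1_000, 10_000, 100_000):
        # Relative frequency of heads after n flips; it wanders for
        # small n and settles near p_true as n increases.
        print(f"n = {n:>6}: relative frequency = {heads / n:.4f}")
```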
Data summarization:
We summarize data with (a short sketch follows the list)
  1. frequency table
  2. histogram
    a. frequency histogram
    b. relative frequency histogram
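As a rough illustration of the first two tools (the data set below is made up for the example), here is one way to build a frequency table, relative frequencies, and a text histogram in Python:

```python
from collections import Counter

# Hypothetical data set, just for illustration.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]

# 1. Frequency table: value -> count, plus relative frequencies.
freq = Counter(data)
n = len(data)

print("value  freq  rel. freq")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]:>4}  {freq[value] / n:>9.2f}")

# 2. Histogram (text version): one '*' per observation in each class.
for value in sorted(freq):
    print(f"{value:>5} | " + "*" * freq[value])
```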
