See More information below if this is confusing. Try changing your sample size and watch what happens to the alternate scenarios; that tells you what happens if you don't use the recommended sample size. For more depth, consult a more advanced statistics text. This calculation is based on the Normal distribution, and assumes you have more than about 30 samples.

About response distribution: If you ask a random sample of 10 people if they like donuts, and 9 of them say "Yes", then the prediction that you make about the general population is different from the one you would make if 5 had said "Yes" and 5 had said "No". The sample size calculator computes the critical value for the normal distribution. Wikipedia has good articles on statistics.
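The calculation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual normal-approximation formula n = z² · p(1 − p) / E² with an optional finite population correction; the function names, parameter names and example inputs are illustrative assumptions, not the calculator's actual source.

```python
import math
from statistics import NormalDist
from typing import Optional


def critical_value(confidence: float) -> float:
    """Two-sided critical value z of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_proportion(margin: float,
                               confidence: float = 0.95,
                               response: float = 0.5,
                               population: Optional[int] = None) -> int:
    """Recommended sample size for estimating a proportion.

    margin     -- desired margin of error as a fraction (0.05 means 5%)
    response   -- assumed response distribution (0.5 is the most conservative)
    population -- optional population size for the finite population correction
    """
    z = critical_value(confidence)
    n = z ** 2 * response * (1 - response) / margin ** 2
    if population is not None:
        # Finite population correction (an assumption of this sketch).
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(sample_size_for_proportion(0.05))                    # 385
print(sample_size_for_proportion(0.05, response=0.9))      # 139
print(sample_size_for_proportion(0.05, population=20000))  # 377
```

A lopsided response distribution such as 90% "Yes" needs fewer respondents than an even 50/50 split, which is why 0.5 is the safe default when nothing is known in advance.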

In studies where the plan is to estimate the difference in means between two independent populations, the sample size required in each comparison group depends on the confidence level, the variability of the outcome and the desired margin of error. Recall from the module on confidence intervals that, when we generated a confidence interval estimate for the difference in means, we used Sp, the pooled estimate of the common standard deviation, as a measure of variability in the outcome. Sp is computed by pooling the data from the two comparison groups.

If data are available on variability of the outcome in each comparison group, then Sp can be computed and used in the sample size formula. However, it is more often the case that data on the variability of the outcome are available from only one group, often the untreated or unexposed group. When planning a clinical trial to investigate a new drug or procedure, data are often available from other trials that involved a placebo or an active control group. The standard deviation of the outcome variable measured in patients assigned to the placebo, control or unexposed group can be used to plan a future trial, as illustrated below.
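When standard deviations are available for both comparison groups, the pooled estimate Sp can be computed directly. The sketch below assumes the standard pooled-variance formula, Sp = sqrt(((n1 − 1)·s1² + (n2 − 1)·s2²) / (n1 + n2 − 2)); the function name `pooled_sd` and the example inputs are hypothetical and are not taken from any study mentioned here.

```python
import math


def pooled_sd(n1: int, s1: float, n2: int, s2: float) -> float:
    """Pooled estimate of a common standard deviation from two groups.

    n1, n2 -- group sizes
    s1, s2 -- sample standard deviations in each group
    """
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


# Hypothetical inputs: two groups of 50 with standard deviations 9.0 and 7.0.
print(round(pooled_sd(50, 9.0, 50, 7.0), 2))  # 8.06
```

With equal group sizes the pooled variance is simply the average of the two group variances, so Sp always falls between s1 and s2.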

Note that the formula for the sample size generates sample size estimates for samples of equal size. If a study is planned where different numbers of patients will be assigned or different numbers of patients will comprise the comparison groups, then alternative formulas can be used.

An investigator wants to plan a clinical trial to evaluate the efficacy of a new drug designed to increase HDL cholesterol (the "good" cholesterol).

The plan is to enroll participants and to randomly assign them to receive either the new drug or a placebo. HDL cholesterol will be measured in each participant after 12 weeks on the assigned treatment. The investigator would like the margin of error to be no more than 3 units.

## How to Determine Sample Size

How many patients should be recruited into the study? To plan this study, we can use data from the Framingham Heart Study, specifically the standard deviation of HDL cholesterol in participants who attended the seventh examination of the Offspring Study and were not on treatment for high cholesterol. We will use this value and the other inputs to compute the sample sizes. Again, these sample sizes refer to the numbers of participants with complete data.

In order to ensure that the required total sample size is available at 12 weeks, the investigator needs to recruit more participants to allow for attrition.
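The planning steps in this example can be sketched as follows. The sketch assumes the standard formula for equal-sized independent groups, n per group = 2 · (Z·σ / E)², and a simple attrition adjustment that divides the required number by the expected retention rate. The margin of error of 3 units comes from the example; the standard deviation of 15.0 and the 10% attrition rate are hypothetical placeholders, not the Framingham figures.

```python
import math
from statistics import NormalDist


def n_per_group_for_means(sigma: float, margin: float,
                          confidence: float = 0.95) -> int:
    """Participants needed in EACH group to estimate a difference in means.

    sigma  -- standard deviation of the outcome (assumed common to both groups)
    margin -- desired margin of error, in the units of the outcome
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(2 * (z * sigma / margin) ** 2)


def inflate_for_attrition(n_complete: int, attrition_rate: float) -> int:
    """Number to recruit so that n_complete participants remain at follow-up."""
    return math.ceil(n_complete / (1 - attrition_rate))


per_group = n_per_group_for_means(sigma=15.0, margin=3.0)  # hypothetical sigma
total = 2 * per_group
print(per_group, total)                    # 193 386
print(inflate_for_attrition(total, 0.10))  # 429
```

Because the margin of error enters the formula squared, halving it roughly quadruples the number of participants required.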

### Confidence intervals for proportions

An investigator wants to compare two diet programs in children who are obese. One diet is a low fat diet, and the other is a low carbohydrate diet. The plan is to enroll children and weigh them at the start of the study. Each child will then be randomly assigned to either the low fat or the low carbohydrate diet. Each child will follow the assigned diet for 8 weeks, at which time they will again be weighed.

The number of pounds lost will be computed for each child. How many children should be recruited into the study? To plan this study, investigators use data from a published study in adults. Suppose one such study compared the same diets in adults and involved an equal number of participants in each diet group.

## Sample Size Calculator with Steps

The study reported a standard deviation in weight lost over 8 weeks on a low fat diet of 8. These data can be used to estimate the common standard deviation in weight lost. Again, these sample sizes refer to the numbers of children with complete data. In order to ensure that the required total sample size is available at 8 weeks, the investigator needs to recruit more participants to allow for attrition.

In studies where the plan is to estimate the mean difference of a continuous outcome based on matched data, the sample size is determined from the confidence level, the desired margin of error and the standard deviation of the difference scores. It is extremely important that the standard deviation of the difference scores, rather than the standard deviation of the outcome itself, is used in this computation.
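For matched data, a minimal sketch of the computation is shown here, assuming the standard formula n = (Z·σd / E)², where σd is the standard deviation of the difference scores. The function name and the example values (σd = 10, margin of error = 2) are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_matched_pairs(sigma_d: float, margin: float,
                        confidence: float = 0.95) -> int:
    """Number of matched pairs (or participants measured twice) required.

    sigma_d -- standard deviation of the DIFFERENCE scores, not of the raw outcome
    margin  -- desired margin of error for the mean difference
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma_d / margin) ** 2)


print(n_for_matched_pairs(sigma_d=10.0, margin=2.0))  # 97
```

Passing the standard deviation of the raw outcome in place of `sigma_d` would typically overstate the variability, since differences within matched pairs tend to vary less than the outcome itself.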

In studies where the plan is to estimate the difference in proportions between two independent populations, the sample sizes required in each comparison group depend on the two proportions, p1 and p2. In order to estimate the sample size, we need approximate values of p1 and p2. If there is no information available to approximate p1 and p2, then 0.5 can be used for both, which yields the most conservative (largest) sample size because p(1 − p) is greatest at p = 0.5. Similar to the situation for two independent samples and a continuous outcome at the top of this page, it may be the case that data are available on the proportion of successes in one group, usually the untreated or unexposed group.

If so, the known proportion can be used for both p1 and p2 in the sample size formula. The formula generates sample size estimates for samples of equal size.
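A minimal sketch of this computation follows, assuming the standard formula for equal-sized independent groups, n per group = [p1(1 − p1) + p2(1 − p2)] · (Z / E)². The function name and the example proportions and margins are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_for_proportions(p1: float, p2: float, margin: float,
                                confidence: float = 0.95) -> int:
    """Participants needed in EACH group to estimate a difference in proportions.

    p1, p2 -- approximate proportions of successes in the two groups
    margin -- desired margin of error for the difference, as a fraction
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variability = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(variability * (z / margin) ** 2)


# No prior information: 0.5 for both groups gives the largest sample size.
print(n_per_group_for_proportions(0.5, 0.5, margin=0.05))  # 769
# A known proportion from one group, used for both groups.
print(n_per_group_for_proportions(0.2, 0.2, margin=0.05))  # 492
```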

Interested readers can see Fleiss for more details.

An investigator wants to estimate the impact of smoking during pregnancy on premature delivery. Normal pregnancies last approximately 40 weeks and premature deliveries are those that occur before 37 weeks. The sample sizes for the two comparison groups (women who smoked during pregnancy and women who did not) can be computed with the formula for two independent proportions. When an estimate of the proportion of premature deliveries is available from only one source, we use that estimate for both groups in the sample size computation.

In the module on hypothesis testing for means and proportions, we introduced techniques for means, proportions, differences in means, and differences in proportions.

While each test involved details that were specific to the outcome of interest, the tests shared some common elements. For example, in each test of hypothesis, there are two errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H0 when in fact it is true. The second is called a Type II error and refers to the situation where we do not reject H0 when it is false; the probability of a Type II error is denoted β. In hypothesis testing, we usually focus on power, which is defined as the probability that we reject H0 when it is false, i.e., power = 1 − β. In other words, power is the probability that a test correctly rejects a false null hypothesis.

A good test is one with a low probability of committing a Type I error and high power. Here we present formulas to determine the sample size required to ensure that a test has high power. The effect size is the difference in the parameter of interest that represents a clinically meaningful difference. Similar to the margin of error in confidence interval applications, the effect size is determined based on clinical or practical criteria and not statistical criteria.

The concept of statistical power can be difficult to grasp. Before presenting the formulas to determine the sample sizes required to ensure high power in a test, we will first discuss power from a conceptual point of view.

We compute the sample mean and then must decide whether the sample mean provides evidence to support the alternative hypothesis or not. This is done by computing a test statistic and comparing the test statistic to an appropriate critical value. Even if the null hypothesis is true, it is possible to select a sample whose mean is much larger or much smaller than the hypothesized value. When we run tests of hypotheses, we usually standardize the data (for example, by converting the sample mean to a Z or t statistic).

The rejection region is shown in the tails of the figure below. This concept was discussed in the module on Hypothesis Testing. Now, suppose that the alternative hypothesis, H1, is true. The figure below shows the distributions of the sample mean under the null and alternative hypotheses. The values of the sample mean are shown along the horizontal axis.

### Step 2: Apply the Equation Sample Size = (Z*σ / Margin of Error)^2

If the true mean is 94, then the alternative hypothesis is true. The critical value separates the rejection region from the rest of the distribution under H0. The effect size is the difference in the parameter of interest (here, the mean) that represents a clinically meaningful difference. The figure below shows the same components for the situation where the mean under the alternative hypothesis is 98. Notice that there is much higher power when there is a larger difference between the mean under H0 as compared to H1, that is, when the effect size is larger.

A statistical test is much more likely to reject the null hypothesis in favor of the alternative if the true mean is 98 than if the true mean is 94. Notice also in this case that there is little overlap in the distributions under the null and alternative hypotheses. If a sample mean of 97 or higher is observed, it is very unlikely that it came from the distribution specified under H0.

The inputs for the sample size formulas include the desired power, the level of significance and the effect size.
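To make the comparison between a true mean of 94 and a true mean of 98 concrete, the following sketch computes the power of a two-sided Z test for a single mean. The alternative means 94 and 98 come from the discussion above; the null mean of 90, the standard deviation of 20, the sample size of 100 and the 5% level of significance are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def power_of_z_test(mu0: float, mu1: float, sigma: float, n: int,
                    alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample Z test when the true mean is mu1."""
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Critical values for the sample mean, set under the null hypothesis.
    lower = mu0 - z_crit * standard_error
    upper = mu0 + z_crit * standard_error
    # Sampling distribution of the sample mean under the alternative.
    under_h1 = NormalDist(mu1, standard_error)
    # Power is the probability of landing in either tail of the rejection region.
    return under_h1.cdf(lower) + (1 - under_h1.cdf(upper))


# Assumed setup: H0 mean 90, sigma 20, n 100 (illustrative values only).
print(round(power_of_z_test(90, 94, 20, 100), 3))  # 0.516
print(round(power_of_z_test(90, 98, 20, 100), 3))  # 0.979
```

Under these assumptions the test detects a true mean of 94 only about half the time, while a true mean of 98 is detected almost always, which matches the point that power grows with the effect size.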

The effect size is selected to represent a clinically meaningful or practically important difference in the parameter of interest, as we will illustrate. The formulas we present below produce the minimum sample size to ensure that the test of hypothesis will have a specified probability of rejecting the null hypothesis when it is false (i.e., a specified power).

In planning studies, investigators again must account for attrition or loss to follow-up. The formulas shown below produce the number of participants needed with complete data, and we will illustrate how attrition is addressed in planning studies.
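As one instance of such a formula, the sketch below computes the sample size for a two-sided test of a single mean, assuming the standard result n = ((Z(1 − α/2) + Z(1 − β)) / ES)², where the standardized effect size is ES = |μ1 − μ0| / σ. The function name and the example inputs (a difference of 5 units, a standard deviation of 20, 80% power) are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_power_one_mean(difference: float, sigma: float,
                         power: float = 0.80, alpha: float = 0.05) -> int:
    """Minimum n, with complete data, for a two-sided one-sample test of a mean.

    difference -- clinically meaningful difference |mu1 - mu0|
    sigma      -- standard deviation of the outcome
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # level of significance
    z_beta = std_normal.inv_cdf(power)           # desired power, 1 - beta
    effect_size = abs(difference) / sigma
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)


print(n_for_power_one_mean(difference=5.0, sigma=20.0))              # 126
print(n_for_power_one_mean(difference=5.0, sigma=20.0, power=0.90))  # 169
```

The result is the number of participants needed with complete data, so it would still have to be increased to allow for attrition before recruitment begins.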

For example, many studies involve random sampling, in which a randomly selected subset of a target population is asked to complete a survey.

Confidence level: The confidence level of a sample is expressed as a percentage and describes how sure you can be that the sample is representative of the target population; that is, how frequently the true percentage of the population who would select a response lies within the confidence interval.

Margin of error: Margin of error is also measured in percentage terms. It indicates the extent to which the results from the sample reflect the overall population. The lower the margin of error, the nearer the researcher is to having an accurate response at a given confidence level.

Percentage of population selecting a given choice: The accuracy of the research outputs also varies according to the percentage of the sample that chooses a given response.
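The way these three quantities interact can be sketched by computing the margin of error for a given sample size. The sketch assumes the normal-approximation formula E = z · sqrt(p(1 − p) / n) for a simple random sample from a large population; the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(n: int, confidence: float = 0.95,
                    proportion: float = 0.5) -> float:
    """Margin of error (as a fraction) for a proportion estimated from n responses.

    proportion -- share of the sample selecting the response of interest
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(proportion * (1 - proportion) / n)


# 100 responses, evenly split: roughly a 9.8% margin of error at 95% confidence.
print(round(margin_of_error(100), 4))                  # 0.098
# The same 100 responses with a 90/10 split give a tighter margin.
print(round(margin_of_error(100, proportion=0.9), 4))  # 0.0588
# A higher confidence level widens the margin for the same data.
print(round(margin_of_error(100, confidence=0.99), 4)) # 0.1288
```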