Table 2. The questions asked, in order, for an example hypothesis test

From: StatXFinder: a web-based self-directed tool that provides appropriate statistical test selection for biomedical researchers in their scientific studies

| Step | Questions | The statistical terms observed in the questions* | Definitions | Answers |
|---|---|---|---|---|
| 1 | Does your data set have only one variable? | Variable (a) | Since the example only addresses the values that a single variable, systolic blood pressure (mmHg), could take, the answer to this question should be “yes” | Yes |
| 2 | Is it a “one sample” problem? | Sample (b) | Since the experimenter obtained measurement values from the patients before and after administering drugs, there are two groups of data in question | No |
| 3 | Is it a “two samples” problem? | Sample (b) | The experimenter has two groups of data: pre-administration and post-administration measurements | Yes |
| 4 | Does your data appear to be normally distributed (bell-shaped curve)? | Normally distributed (c) | The answer is given as “no” since the analysis was conducted with the assumption that the measurement values were not normally distributed and the sample size is smaller than 30 | No |
| 5 | Does your data have a binomial distribution? | Binomial distribution (d) | Since systolic blood pressure values do not have two possible outcomes such as “success” and “failure”, they do not follow a binomial distribution, and the answer to this question should be “no” | No |
| 6 | Do you have person-time data? | Person-time data (e) | Since the blood pressure measurement is not a variable observed over time (unlike, for example, the number of individuals who developed lung cancer over a year), this question was answered “no” | No |
| 7 | Are your samples (=groups) independent? | Sample (b), independent (f) | Since two measurements were taken on the same group of patients, before and after administering drugs (pre-treatment vs. post-treatment), the systolic blood pressure values in the two groups are dependent on each other; therefore, the question was answered “no” | No |
| | Recommendation: use the Wilcoxon signed-rank test or the sign test | | | N/A |
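
The question sequence in Table 2 can be sketched as a small decision function. This is an illustration only: the function names, the answer keys, and the blood pressure readings are hypothetical and are not taken from StatXFinder, and only the single path shown in the table is encoded. The sketch then runs an exact two-sided sign test, one of the two recommended tests, using the Python standard library.

```python
from math import comb

# Answers from Table 2, in the order the questions are asked.
# The key names are hypothetical labels chosen for this sketch.
TABLE_2_PATH = {
    "one_variable": True,
    "one_sample": False,
    "two_samples": True,
    "normal": False,
    "binomial": False,
    "person_time": False,
    "independent": False,
}


def recommend_test(answers):
    """Return the Table 2 recommendation if the answers match its path.

    Only the path shown in the table is encoded; any other combination
    returns None because the table does not cover it.
    """
    if answers == TABLE_2_PATH:
        return "Wilcoxon signed rank test or Sign test"
    return None


def sign_test(before, after):
    """Exact two-sided sign test for paired measurements.

    Pairs with zero difference are dropped. Under the null hypothesis the
    number of positive differences follows Binomial(n, 0.5).
    Returns (number of positive differences, n, p-value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Two-sided p-value: twice the smaller tail probability, capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)


# Hypothetical systolic blood pressure readings (mmHg), made up for illustration.
pre = [150, 142, 160, 155, 148, 162, 158, 151]
post = [141, 138, 152, 155, 140, 150, 149, 153]

print(recommend_test(TABLE_2_PATH))
print(sign_test(pre, post))
```

With these made-up readings, one pair is tied and dropped, one of the remaining seven differences is positive, and the two-sided p-value is 0.125. If SciPy is available, `scipy.stats.wilcoxon(pre, post)` would be one way to run the other recommended test on the same paired data.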

  1. (a) A characteristic that consists of two or more categories or values, and that differs from subject to subject or from time to time. Categories such as occupation or nationality, or values such as age or intelligence score, are examples of variables. The opposite of a variable is a constant. The term variable is often used as a shortened form of random variable
  2. (b) A subset of cases drawn or selected, according to some specified criteria, from a larger set or population of cases with the purpose of estimating characteristics of the larger set or population, drawing inferences about these characteristics, and generalizing results from sample to population. A sample should be representative of the population from which it is drawn in order to be useful. For instance, to find out the relationship between drug abuse and mental health, it would be possible or practical to investigate this relationship by taking a sample of the population. Thus, it would be possible to determine to what extent this relationship is likely to be found in the population
  3. (c) A theoretical distribution which shows the frequency or probability of all the possible values that a continuous variable can take. This distribution is bell shaped. The horizontal axis of the distribution represents all possible values of the variable while the vertical axis represents the frequency or probability of those values. In any normal distribution, approximately: (1) 68 % of the observations fall within 1σ of the mean μ, (2) 95 % of the observations fall within 2σ of μ, and (3) 99.7 % of the observations fall within 3σ of μ. This is known as the 68–95–99.7 rule. The normal distribution is also called the Gaussian distribution
  4. (d) The probability distribution of the number of successes in n independent Bernoulli trials, such as a person passing or failing or being a woman or a man, where each trial has two outcomes (conveniently labeled success and failure), and the probability of success p is the same for each trial
  5. (e) Data referring to a measurement obtained by combining person data and time data. It is obtained as the sum of individual units of time that the subjects in the study population have been exposed to a certain risk. It can also be obtained as the number of persons at risk of the event of interest multiplied by the average length of the study period
  6. (f) Independence is a characteristic of observations or random events. Essentially, the term is used to describe the property of independence of events or sample observations. It is an assumption required by many statistical tests. An independent variable is the variable in an experiment that is under the control of, and may be manipulated by, the experimenter. In regression analysis it is the variable being used to regress or predict the value of the dependent variable. It is also commonly known as a regressor, predictor, or explanatory variable
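
Three of the definitions above lend themselves to a quick numeric check. The following sketch, using only the Python standard library, reproduces the 68–95–99.7 rule, evaluates a binomial probability, and sums person-time. The helper names and the follow-up times are hypothetical and chosen only for illustration.

```python
from math import comb
from statistics import NormalDist


def within_k_sigma(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mu = 0, sigma = 1
    return z.cdf(k) - z.cdf(-k)


def binomial_pmf(successes, n, p):
    """Probability of exactly `successes` successes in n independent Bernoulli trials."""
    return comb(n, successes) * p ** successes * (1 - p) ** (n - successes)


def person_time(follow_up_times):
    """Total person-time: the sum of each subject's time at risk."""
    return sum(follow_up_times)


# 68-95-99.7 rule: coverage within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(within_k_sigma(k), 4))

# Chance of exactly 3 successes in 10 trials with success probability 0.5.
print(binomial_pmf(3, 10, 0.5))

# Hypothetical follow-up times in years for five subjects.
years_at_risk = [1.0, 0.5, 2.0, 1.5, 1.0]
print(person_time(years_at_risk))  # total person-years
```

The rounded coverage values come out as 0.6827, 0.9545, and 0.9973, which is where the rounded figures 68 %, 95 %, and 99.7 % in the rule come from.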