Statistical Tests

Questions 1 - 10

Based on this result, what percentage of the variability in serum LDL cholesterol levels can be explained by the variability in daily saturated fat intake?

35%

49%

70%

95%

Explanation

The coefficient of determination, or r-squared (r²), represents the proportion of the variance in the dependent variable that is predictable from the independent variable. To find this value, the correlation coefficient (r) is squared. In this case, r = 0.70, so r² = (0.70)² = 0.49. This means that 49% of the variability in serum LDL cholesterol levels can be explained by the variability in daily saturated fat intake in this study population.
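The arithmetic in this explanation can be checked with a short sketch. The function name variance_explained is a hypothetical helper introduced here for illustration; the value r = 0.70 comes from the explanation above.

```python
def variance_explained(r):
    """Return r squared, the proportion of variance explained."""
    return r ** 2

r = 0.70  # correlation coefficient quoted in the explanation
r_squared = variance_explained(r)
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")  # prints "r = 0.70, r^2 = 0.49"
```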

Based on this information, which of the following is the most appropriate conclusion?

Consumption of ice cream increases the risk of drowning.

Drowning incidents lead to increased ice cream consumption.

The observed association is likely due to a confounding variable.

The statistical test used was inappropriate for this type of data.

Explanation

Correlation does not imply causation. A strong correlation between two variables does not mean that one causes the other. In this classic example, a third factor, or confounding variable (e.g., warm weather), is likely responsible for both trends: warm weather increases ice cream sales, and it also increases swimming activity, which in turn leads to more drowning incidents. This is a much more plausible explanation than a direct causal link in either direction.

Which of the following is the best interpretation of this correlation coefficient?

Higher BMI is associated with higher fasting glucose levels in this group.

60% of the variation in fasting glucose is explained by the variation in BMI.

Elevated BMI is the primary cause of elevated fasting glucose.

For every 1 unit increase in BMI, fasting glucose increases by 0.60 mg/dL.

Explanation

The Pearson correlation coefficient (r) measures the strength and direction of a linear relationship between two continuous variables. A positive value (r = +0.60) indicates a positive association, meaning that as one variable (BMI) increases, the other variable (fasting glucose) tends to increase. The slope of the relationship is determined by regression analysis, not the correlation coefficient itself. The proportion of variance explained is r-squared (r²), which would be 0.36 or 36% in this case. Correlation does not establish causation.
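The distinction between the correlation coefficient and the regression slope can be illustrated with a small sketch. The data below are hypothetical values invented for illustration (not the study data), pearson_r_and_slope is a hypothetical helper, and the mg/dL to mmol/L conversion uses an approximate factor of 18. Changing the units of glucose changes the slope but leaves r unchanged, which is why r cannot be read as "change in glucose per unit of BMI".

```python
from math import sqrt

def pearson_r_and_slope(x, y):
    """Return (Pearson r, least-squares slope of y on x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)   # unitless, always between -1 and +1
    slope = sxy / sxx           # carries the units of y per unit of x
    return r, slope

# Hypothetical example data, not taken from the question.
bmi = [22.0, 25.0, 27.0, 30.0, 33.0, 36.0]
glucose_mg_dl = [88.0, 95.0, 90.0, 104.0, 99.0, 110.0]

r_mg, slope_mg = pearson_r_and_slope(bmi, glucose_mg_dl)

# Same measurements expressed in mmol/L (approximate conversion factor).
glucose_mmol_l = [g / 18.0 for g in glucose_mg_dl]
r_mmol, slope_mmol = pearson_r_and_slope(bmi, glucose_mmol_l)

print(f"r (mg/dL) = {r_mg:.3f}, r (mmol/L) = {r_mmol:.3f}")
print(f"slope (mg/dL per BMI unit) = {slope_mg:.3f}")
print(f"slope (mmol/L per BMI unit) = {slope_mmol:.3f}")
```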

Which statistical test is most appropriate for comparing the mean SBP before and after the intervention in this group of participants?

Chi-square test

Independent samples t-test

Linear regression

Paired t-test

Explanation

A paired t-test is the most appropriate test in this scenario because the measurements are taken from the same group of individuals at two different time points (before and after). This means the data points are paired, or dependent. An independent samples t-test would be used to compare the means of two different groups of people. A chi-square test is for categorical data, and linear regression models the relationship between a continuous outcome and one or more predictors rather than comparing two sets of paired measurements.
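As a minimal sketch of why pairing matters, the paired t statistic is computed on the within-person differences rather than on the two sets of raw values. The blood pressure readings below are hypothetical, and paired_t_statistic is a hypothetical helper name; a statistics package would normally be used to obtain the p-value as well.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for b, a in zip(before, after)]  # after minus before
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1

# Hypothetical SBP readings (mm Hg) for the same five participants.
sbp_before = [150, 142, 160, 155, 148]
sbp_after = [144, 140, 151, 150, 147]

t_stat, df = paired_t_statistic(sbp_before, sbp_after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```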

Which statistical test should be used to determine if there is a significant difference in mean pain scores among the four treatment groups?

A series of six independent t-tests

Analysis of variance (ANOVA)

Chi-square test

Paired t-test

Explanation

Analysis of variance (ANOVA) is the appropriate statistical test for comparing the means of a continuous variable across three or more independent groups. With four groups, comparing every pair would require six separate t-tests, and each additional test adds its own chance of a false positive, which inflates the overall probability of making a Type I error (falsely concluding there is a difference). ANOVA analyzes the variance between groups relative to the variance within groups to test the null hypothesis that all group means are equal.
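The two ideas in this explanation can be sketched numerically. The first function estimates the familywise Type I error rate under the simplifying assumption that the pairwise tests are independent (they are not strictly independent in practice, so treat the figure as an approximation). The second computes the one-way ANOVA F statistic as the ratio of between-group to within-group variance. Function names and the pain scores are hypothetical.

```python
from math import comb
from statistics import mean

def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, approximate chance of >= 1 false positive)."""
    n_tests = comb(n_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests  # assumes independent tests

def one_way_anova_f(groups):
    """Return the F statistic: between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

n_tests, fwer = familywise_error_rate(4)
print(f"{n_tests} pairwise tests, familywise error rate about {fwer:.2f}")

# Hypothetical pain scores for four treatment groups.
pain_scores = [[6, 7, 5, 6], [4, 5, 4, 3], [5, 6, 6, 5], [3, 2, 4, 3]]
print(f"F = {one_way_anova_f(pain_scores):.2f}")
```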

Which statistical test is most suitable for determining if there is a significant association between the presence of Allele A and the diagnosis of Crohn's disease?

Chi-square test

Pearson correlation

Independent samples t-test

Analysis of variance (ANOVA)

Explanation

The chi-square test is used to assess whether there is an association between two categorical variables. In this scenario, both the genetic polymorphism (present/absent) and the disease status (diagnosed/not diagnosed) are categorical. The test compares the observed frequencies in each category to the frequencies that would be expected if there were no association between the variables.
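The comparison of observed and expected frequencies can be sketched as follows. The 2 x 2 counts are hypothetical, chi_square_statistic is a hypothetical helper, and no continuity correction is applied. For one degree of freedom, a statistic above 3.841 corresponds to p < 0.05.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = Allele A present / absent,
# columns = Crohn's disease diagnosed / not diagnosed.
observed_counts = [[30, 20],
                   [20, 30]]

chi2, df = chi_square_statistic(observed_counts)
print(f"chi-square = {chi2:.2f}, df = {df}")  # prints "chi-square = 4.00, df = 1"
```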

The use of an independent samples t-test in this situation may be invalid because which of its key assumptions has been violated?

The samples must be independent.

The variances of the two groups must be equal.

The data from the underlying population must be normally distributed.

The outcome variable must be continuous.

Explanation

The t-test is a parametric test, which means it relies on certain assumptions about the data. A key assumption is that the data are sampled from a population with a normal distribution. Although the t-test is robust to minor violations, especially with larger sample sizes, significant skewness in a small sample (n = 12) makes its results unreliable. The assumption of independence is met, and the outcome is continuous. Equal variances (homoscedasticity) is also an assumption, but the violation of normality is the more critical issue here and is what prompts the use of a nonparametric test.

Which statistical test is the most appropriate to determine if there was a significant change in telomere length?

Mann-Whitney U test

Paired t-test

Wilcoxon signed-rank test

Spearman correlation

Explanation

This study uses a pre-post design with the same subjects, so the data are paired. The paired t-test is the standard parametric test for this design, but it assumes that the differences between paired measurements are normally distributed. Since this assumption is violated (the differences are skewed), the appropriate nonparametric alternative is the Wilcoxon signed-rank test.
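A rough sketch of how the Wilcoxon signed-rank statistic is built: the paired differences are ranked by absolute size, and the ranks are then summed separately for positive and negative differences, so the test depends on ranks rather than on the skewed raw values. The telomere lengths are hypothetical, the helper names are introduced here for illustration, and zero differences are dropped (one common convention). A statistics package would be needed for the p-value.

```python
def average_ranks(sorted_values):
    """Assign ranks 1..n to already-sorted values, averaging ranks across ties."""
    ranks = []
    i = 0
    while i < len(sorted_values):
        j = i
        while j < len(sorted_values) and sorted_values[j] == sorted_values[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; their mean is (i + 1 + j) / 2.
        ranks.extend([(i + 1 + j) / 2] * (j - i))
        i = j
    return ranks

def signed_rank_sums(before, after):
    """Return (sum of ranks for positive differences, sum for negative differences)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    ordered = sorted(diffs, key=abs)
    ranks = average_ranks([abs(d) for d in ordered])
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, ordered) if d < 0)
    return w_plus, w_minus

# Hypothetical telomere lengths (arbitrary units) before and after.
length_before = [10, 12, 9, 15, 11, 14]
length_after = [11, 10, 12, 11, 16, 8]

w_plus, w_minus = signed_rank_sums(length_before, length_after)
print(f"W+ = {w_plus}, W- = {w_minus}")
```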

What is the most appropriate statistical test to use to compare the central tendency of time to pain relief among the three groups?

Analysis of variance (ANOVA)

A series of Mann-Whitney U tests

Wilcoxon signed-rank test

Kruskal-Wallis test

Explanation

The Kruskal-Wallis test is the nonparametric alternative to one-way ANOVA. It is used to determine if there are statistically significant differences between two or more independent groups on a continuous or ordinal dependent variable when the assumptions of ANOVA (like normality) are not met. Since the data are highly skewed and there are three independent groups, the Kruskal-Wallis test is the most suitable choice.
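As a sketch of the rank-based logic, the Kruskal-Wallis H statistic pools all observations, ranks them, and compares the rank sums of the groups. The version below uses the basic formula without a tie correction, the times to pain relief are hypothetical, and the function names are introduced here for illustration only.

```python
def average_ranks(sorted_values):
    """Assign ranks 1..n to already-sorted values, averaging ranks across ties."""
    ranks = []
    i = 0
    while i < len(sorted_values):
        j = i
        while j < len(sorted_values) and sorted_values[j] == sorted_values[i]:
            j += 1
        ranks.extend([(i + 1 + j) / 2] * (j - i))
        i = j
    return ranks

def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic (no tie correction)."""
    pooled = sorted((value, g_idx) for g_idx, g in enumerate(groups) for value in g)
    ranks = average_ranks([value for value, _ in pooled])
    rank_sums = [0.0] * len(groups)
    for (_, g_idx), rank in zip(pooled, ranks):
        rank_sums[g_idx] += rank
    n = len(pooled)
    weighted = sum(rs ** 2 / len(g) for rs, g in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

# Hypothetical, right-skewed times to pain relief (minutes) in three groups.
relief_times = [[12, 15, 14, 90], [25, 30, 28, 120], [40, 45, 38, 200]]
print(f"H = {kruskal_wallis_h(relief_times):.2f}")
```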

Which statistical method would allow the researcher to evaluate the independent contribution of each factor to bone mineral density while controlling for the effects of the others?

Logistic regression

A series of Pearson correlations

Analysis of variance (ANOVA)

Multiple linear regression

Explanation

Multiple linear regression is the appropriate technique when assessing the relationship between a single continuous dependent variable (BMD) and two or more independent variables (age, calcium intake, vitamin D levels). This method allows for the assessment of each variable's independent effect while statistically controlling for the others, which helps to mitigate confounding.
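To make "controlling for the others" concrete, the sketch below fits an ordinary least-squares model with an intercept and several predictors by solving the normal equations. It is a standard-library illustration only, not a recommendation over a statistics package, and it reports coefficients without standard errors or p-values. The participant data and function names are hypothetical. Each fitted coefficient is the expected change in the outcome per unit change in that predictor with the other predictors held fixed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple_regression(predictor_rows, outcome):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in predictor_rows]  # add intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical participants: [age (years), calcium (mg/day), vitamin D (ng/mL)].
predictors = [[45, 1000, 30], [52, 800, 25], [60, 1200, 35],
              [63, 700, 20], [70, 900, 22], [58, 1100, 32]]
bmd = [1.10, 1.02, 1.05, 0.90, 0.88, 1.06]  # hypothetical BMD values (g/cm^2)

coefficients = fit_multiple_regression(predictors, bmd)
for name, value in zip(["intercept", "age", "calcium", "vitamin D"], coefficients):
    print(f"{name}: {value:.5f}")
```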
