Statistics Practice Test: Practice Test 21

Practice Test 21 for Statistics: real questions and explanations from the Varsity Tutors practice-test pool.

Question 1

A library wants to estimate the population parameter $p$, the proportion of all library visitors who prefer e-books over printed books. A random sample of $n = 120$ visitors was surveyed, and the sample statistic was $\hat{p} = 0.35$.

Students ran a simulation of repeated random sampling (5,000 samples of size 120) using a model where the true proportion is 0.35. They found that about 95% of the simulated sample proportions were within 0.08 of 0.35.

What does the margin of error mean in context?

  1. About 95% of random samples of 120 visitors will have $\hat{p}$ within 8 percentage points of the true population proportion $p$. (correct answer)
  2. About 95% of individual visitors prefer e-books, within 8 percentage points.
  3. The true population proportion $p$ is guaranteed to be between 0.27 and 0.43.
  4. The margin of error is 0.35 because that is the sample statistic.

Explanation: This problem asks about interpreting what a margin of error means in context. The simulation shows that when we repeatedly take random samples of 120 visitors, about 95% of those sample proportions fall within 0.08 (or 8 percentage points) of the true population proportion. This margin of error tells us about the reliability of our sampling process—it describes how much sample proportions typically vary from the true population value. The correct interpretation is that about 95% of random samples of 120 visitors will have sample proportions within 8 percentage points of the true population proportion. A common misconception is thinking the margin applies to individual visitors rather than to sample statistics. The margin of error describes sampling variability, not individual behavior. To interpret margin of error correctly, focus on what happens across many random samples, not on guarantees about the true value or individual data points.
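The repeated-sampling idea in this explanation can be sketched as a short simulation. This is a minimal sketch, not the students' actual simulation: the function name `fraction_within`, the seed, and the use of Python's `random` module are assumptions made for illustration. It draws 5,000 samples of size 120 from a model with true proportion 0.35 and reports the fraction of sample proportions that land within 0.08 of 0.35.

```python
import random

def fraction_within(p=0.35, n=120, margin=0.08, num_samples=5000, seed=1):
    """Simulate repeated random samples and return the fraction of
    sample proportions that fall within `margin` of the true proportion p."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = 0
    for _ in range(num_samples):
        # Each visitor prefers e-books with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        if abs(p_hat - p) <= margin:
            hits += 1
    return hits / num_samples

frac = fraction_within()
print(f"Fraction of sample proportions within 0.08 of 0.35: {frac:.3f}")
```

The printed fraction depends on the seed and on the fact that a sample of 120 can only produce proportions in steps of 1/120, so it should be read as "roughly 95%" rather than as an exact value.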

Question 2

Using the outcomes listed in $S=\{0,1,2,3,4,5,6,7,8,9\}$ for a random experiment where one digit is selected, let $A=\{x\in S\mid x\text{ is a multiple of }3\}$ and let $B=\{x\in S\mid x\text{ is even}\}$. Which set of outcomes represents $A\cup B$ (A or B)?

  1. $\{0,2,4,6,8\}$
  2. $\{0,3,6,9\}$
  3. $\{0,2,3,4,6,8,9\}$ (correct answer)
  4. $\{6\}$

Explanation: This question asks about the union of two events defined by mathematical conditions. An event is a subset of the sample space containing outcomes meeting specific criteria. The union A∪B means 'A or B', including any outcome in at least one set. Event A contains multiples of 3: {0,3,6,9} and event B contains even numbers: {0,2,4,6,8}. The correct answer {0,2,3,4,6,8,9} includes all numbers that are multiples of 3 OR even (or both, like 0 and 6). A common error is to look for outcomes in both sets instead of in at least one; the intersection here is {0,6}, and choice 4 lists only {6}. To solve unions with conditions: list outcomes satisfying each condition separately, then combine all unique outcomes.
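The "list each event, then combine" strategy can be checked with Python sets. This is a small illustrative sketch; the variable names are chosen here and are not part of the original question.

```python
# Sample space: the digits 0 through 9.
S = set(range(10))

# Event A: multiples of 3 (0 counts, since 0 = 3 * 0).
A = {x for x in S if x % 3 == 0}
# Event B: even digits.
B = {x for x in S if x % 2 == 0}

union = A | B          # outcomes in A or B (or both)
intersection = A & B   # outcomes in both A and B

print(sorted(union))         # [0, 2, 3, 4, 6, 8, 9]
print(sorted(intersection))  # [0, 6]
```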

Question 3

A company inspects products for features. Event $A$ is “a randomly selected product is waterproof,” and event $B$ is “it has wireless charging.” The data show $P(A)=0.55$, $P(B)=0.40$, and $P(A\cap B)=0.25$. What is the probability that $A$ or $B$ occurs, where “or” is inclusive (waterproof or wireless charging or both)?

  1. 0.95
  2. 0.20
  3. 0.45
  4. 0.70 (correct answer)

Explanation: This problem requires the Addition Rule to find the probability of waterproof or wireless charging or both. Adding P(A) = 0.55 and P(B) = 0.40 gives 0.95, but this counts products with both features twice. We must subtract P(A∩B) = 0.25 to remove the double-counting of products that are both waterproof AND have wireless charging. The calculation is P(A∪B) = 0.55 + 0.40 - 0.25 = 0.70. This correctly accounts for all products with at least one of these features. The key insight is that when events overlap, we need to subtract their intersection—imagine a Venn diagram where the middle section would be counted twice without this correction.

Question 4

There are 7 runners in a race. Gold, silver, and bronze medals are awarded. Because medals are different, order matters. What is the probability that two particular runners, A and B, both win medals (in any order)?

  1. $\dfrac{3!}{7\mathrm{P}3}$
  2. $\dfrac{2\mathrm{P}2\cdot 5\mathrm{P}1}{7\mathrm{P}3}$
  3. $\dfrac{\binom{2}{2}\binom{5}{1}}{\binom{7}{3}}$ (correct answer)
  4. $\dfrac{2\mathrm{P}2\cdot 5\mathrm{P}1\cdot 3!}{7\mathrm{P}3}$

Explanation: This problem involves awarding distinct medals (gold, silver, bronze), so the ordered outcomes are the 7P3 = 7×6×5 = 210 ways to award three different medals to 7 runners. For favorable outcomes (both A and B win medals), choose which 2 of the 3 medals go to A and B, in order (3P2 = 6 ways), then give the remaining medal to one of the other 5 runners (5P1 = 5 ways), for 6 × 5 = 30 outcomes. The probability is 30/210 = 1/7. Because the question asks only whether A and B are among the medalists, counting unordered sets of medalists gives the same value: C(2,2) × C(5,1) / C(7,3) = 5/35 = 1/7, which is choice 3. Choice 2 counts only 2P2 × 5P1 = 10 outcomes because it fixes A and B in two specific medal positions, giving 10/210 = 1/21 and undercounting by a factor of 3. Choice 4 multiplies by 3! and overcounts, giving 60/210 = 2/7. A quick check: P(A medals) = 3/7 and P(B medals | A medals) = 2/6, so P(both) = 3/7 × 2/6 = 1/7.
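A brute-force enumeration is one way to check a count like this. The sketch below is illustrative only: it labels the runners 0 through 6 and treats runners 0 and 1 as A and B (labels chosen here, not given in the question), lists every ordered (gold, silver, bronze) assignment, and counts those in which both A and B appear.

```python
from fractions import Fraction
from itertools import permutations

runners = range(7)   # runners labeled 0..6; 0 stands for A and 1 for B (assumed labels)
A, B = 0, 1

# Every ordered way to award gold, silver, bronze to three different runners.
podiums = list(permutations(runners, 3))
total = len(podiums)

# Ordered outcomes in which both A and B win a medal.
favorable = sum(1 for podium in podiums if A in podium and B in podium)

probability = Fraction(favorable, total)
print(total, favorable, probability)  # 210 30 1/7
```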

Question 5

A retailer suspects some online orders are fraudulent. Historically, 2% of orders are fraudulent.

Two automated rules are being considered:

  • Rule High-Sensitivity: flags 95% of fraudulent orders, but flags 10% of legitimate orders.
  • Rule High-Precision: flags 70% of fraudulent orders, but flags 1% of legitimate orders.

Flagged orders are held for manual review, while unflagged orders ship immediately. The retailer’s goal is to minimize the probability that a shipped (unreviewed) order is fraudulent, even if that increases the number of reviews.

Which strategy is most reasonable given the probabilities?

  1. Rule High-Sensitivity, because it misses fewer fraudulent orders, reducing fraud among shipped orders. (correct answer)
  2. Rule High-Precision, because it makes the review team more efficient by flagging mostly true fraud.
  3. Rule High-Precision, because fraud is only 2%, so missing some fraud will not change shipped-order risk much.
  4. Rule High-Sensitivity, because a flagged order has a 95% chance of being fraudulent.

Explanation: This problem involves analyzing decisions using probability to select a fraud detection rule. The decision goal is to minimize the probability that a shipped order is fraudulent. The key probabilities are the 2% base fraud rate, High-Sensitivity's 95% true positive and 10% false positive rates, and High-Precision's 70% true positive and 1% false positive rates. The High-Sensitivity rule best aligns with the goal because its higher detection rate leads to a lower conditional probability of fraud given shipping (about 0.11% versus 0.61% for High-Precision). One incorrect option is choice 4, which confuses the rule's sensitivity, P(flagged | fraud) = 95%, with P(fraud | flagged), a misapplication of conditional probability. Remember, probabilities guide minimizing fraud risk but do not guarantee that no fraudulent orders ship. To transfer this strategy, identify the goal, then compare the conditional probabilities that matter most, such as P(fraud | shipped).
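The conditional probability P(fraud | shipped) for each rule follows from Bayes' rule applied to the unflagged orders. The sketch below shows one way to compute it; the function name `p_fraud_given_shipped` and its parameter names are chosen here for illustration.

```python
def p_fraud_given_shipped(base_rate, flag_rate_fraud, flag_rate_legit):
    """P(fraud | not flagged): fraudulent orders that slip through,
    divided by all orders that ship without review."""
    missed_fraud = base_rate * (1 - flag_rate_fraud)          # fraud and not flagged
    shipped_legit = (1 - base_rate) * (1 - flag_rate_legit)   # legitimate and not flagged
    return missed_fraud / (missed_fraud + shipped_legit)

high_sensitivity = p_fraud_given_shipped(0.02, 0.95, 0.10)
high_precision = p_fraud_given_shipped(0.02, 0.70, 0.01)

print(f"High-Sensitivity: {high_sensitivity:.2%}")  # 0.11%
print(f"High-Precision:   {high_precision:.2%}")    # 0.61%
```

The High-Sensitivity rule ships a smaller share of fraudulent orders even though it sends far more legitimate orders to review, which matches the stated goal.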

Question 6

A librarian wants to test whether a reminder text message increases the rate of on-time book returns. Each student who checks out a book is randomly assigned to either receive a reminder text 2 days before the due date or receive no text. The librarian then records whether each book is returned on time. Which type of study is described?

  1. Sample survey, because students are contacted by text message
  2. Observational study, because the librarian is only recording return times
  3. Experiment, because the librarian imposes a treatment (text vs no text) using random assignment (correct answer)
  4. Observational study, because random assignment only applies to sampling, not treatments

Explanation: This is a clear example of an experiment because the librarian assigns a treatment using randomization. When students check out books, the librarian randomly assigns them to either receive a reminder text or no text—this is random assignment of a treatment. The key feature of experiments is that researchers manipulate variables by assigning treatments, which happens here. Random sampling would involve selecting which students to include in the study from a larger population, but here all students checking out books are included. The random assignment allows the librarian to make causal conclusions: if the text group has higher on-time return rates, the difference can be attributed to the reminder texts rather than other factors. This isn't an observational study because the librarian actively intervenes by sending texts to some students. The fact that the outcome (on-time return) is observed doesn't make it observational—what matters is that a treatment was assigned. Ask: "Did the researcher assign a treatment?" Yes, so it's an experiment.

Question 7

A delivery company is choosing between two routing software packages. For each day, exactly one traffic condition occurs, producing the listed net profit (after all costs) for that day.

Strategy A:

  • Net profit $900 with probability 0.50
  • Net profit $300 with probability 0.30
  • Net loss $200 with probability 0.20

Strategy B:

  • Net profit $700 with probability 0.70
  • Net profit $100 with probability 0.20
  • Net loss $500 with probability 0.10

Which strategy has the greater expected value (higher average net profit per day) over many days?

  1. Strategy A, because it has the highest possible profit ($900).
  2. Strategy A, because its expected value is higher over many days. (correct answer)
  3. Strategy B, because its expected value is higher over many days.
  4. Strategy B, because it has the higher probability of a profit (0.90 vs 0.80).

Explanation: The skill here is comparing strategies using expected value to pick the routing software with higher average daily profit. Expected value is the long-run average net profit over many days. Compute it for each strategy by multiplying each profit or loss by its probability and summing the products. For Strategy A, 900(0.50) + 300(0.30) + (−200)(0.20) = 450 + 90 − 40 = $500. For Strategy B, 700(0.70) + 100(0.20) + (−500)(0.10) = 490 + 20 − 50 = $460. Strategy A therefore has the higher expected value, and over many days it would lead to higher average net profits, though daily results fluctuate. A common distractor is choosing Strategy A just for its highest possible profit of $900, but the decision relies on the probability-weighted average. For other situations, multiply all outcomes by their probabilities and compare the totals instead of isolated highs or lows.

Question 8

A water tank is being filled and a model for the amount of water is $y = 6x - 10$, where $x$ is time since filling began (minutes) and $y$ is the amount of water in the tank (liters). Is the $y$-intercept meaningful in context? Choose the best statement.

  1. The intercept means the tank gains $-10$ liters per minute at the start.
  2. Mathematically, when $x=0$ the model predicts $y=6$ liters, so the tank starts with 6 liters.
  3. The intercept means that for each 1 minute increase in $x$, the water decreases by $10$ liters per minute.
  4. Mathematically, when $x=0$ the model predicts $y=-10$ liters, but a negative amount of water is not meaningful in context. (correct answer)

Explanation: Whether the y-intercept of a linear model is meaningful depends on context. The slope, 6 liters per minute, is the change in y per unit of x and describes how fast the tank fills. The intercept is the predicted value at x = 0 minutes, here -10 liters. It exists mathematically, but a negative amount of water is not physically meaningful, which suggests the model does not apply at x = 0 and that using it there is an extrapolation beyond the range where it fits. A common confusion is treating the intercept as a rate, like the slope. When interpreting, attach units to each quantity: water in liters on the y-axis and time in minutes on the x-axis.

Question 9

A wildlife volunteer recorded the number of minutes it took to spot a particular bird on 13 different mornings:

6, 7, 7, 8, 8, 8, 9, 9, 10, 10, 11, 12, 25

Which measure of spread is most increased by the 25-minute morning?

  1. Distance between the median and $Q_1$
  2. Spread of the middle 50% of the data (as described by the box length in a box plot)
  3. IQR
  4. Range (correct answer)

Explanation: This question addresses how an outlier affects measures of spread in the bird-spotting times. The distribution is right-skewed, with most times between 6 and 12 minutes and a single outlier at 25. The typical time is the median, 9 minutes, which is resistant to the outlier. The outlier increases the range the most, from 12 − 6 = 6 to 25 − 6 = 19, while the IQR changes little. Choice 4 is therefore supported: the 25 extends the maximum without shifting the quartiles much. A common error is thinking the IQR is most affected, but the IQR describes only the middle 50% of the data and ignores extremes. Spot outliers first, then ask which measures of spread, such as the range, are sensitive to them.
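The effect of the 25-minute morning can be checked by computing each measure with and without it. The sketch below assumes one common textbook convention for quartiles (the medians of the lower and upper halves, leaving out the middle value when the count is odd); other quartile conventions give slightly different IQR values but the same conclusion. The helper names are chosen here for illustration.

```python
from statistics import median

def data_range(values):
    return max(values) - min(values)

def iqr(values):
    """IQR using the median-of-halves convention (an assumption; the middle
    value is excluded from both halves when the count is odd)."""
    s = sorted(values)
    half = len(s) // 2
    lower, upper = s[:half], s[-half:]
    return median(upper) - median(lower)

times = [6, 7, 7, 8, 8, 8, 9, 9, 10, 10, 11, 12, 25]
without_outlier = times[:-1]  # drop the 25-minute morning

print(data_range(without_outlier), data_range(times))  # 6 19
print(iqr(without_outlier), iqr(times))                # 2.5 3.0
```

Under this convention the range more than triples while the IQR moves by only half a minute.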

Question 10

At a school, two events are defined: Event A = “a student earns an A in English,” and Event B = “a student is in the yearbook club.” A student says, “If you’re in yearbook club, you’ll get an A in English.” The data only show that A’s are more common among yearbook club members than among nonmembers. Which statement best describes whether A and B are independent?

  1. They are independent because some yearbook club members earn A’s and some do not.
  2. They are not independent because knowing a student is in yearbook club changes the chance the student earns an A in English. (correct answer)
  3. They are independent because being in yearbook club and earning an A in English are unrelated activities.
  4. They are not independent because being in yearbook club causes students to earn an A in English.

Explanation: The concept tested is independence between club membership and grades, using school data to check for association. Conditioning on B restricts attention to yearbook club members, among whom earning an A is more common than among nonmembers. Because knowing about club membership changes the probability of an A, the events are not independent, as choice 2 states. The data show an association without proving causation, which is why choice 4 goes too far. One misconception is equating independence with the mere existence of members who do and do not earn A's (choice 1); independence is about whether the conditional probability P(A | B) equals the unconditional probability P(A). A useful restatement is: 'Among yearbook members (B), what fraction earn an A?' If that fraction differs from the overall fraction, the events are dependent.

Question 11

A school collected data from 168 students about whether they bring lunch from home or buy lunch at school, and whether they drink water or juice at lunch. The two-way table shows the counts.

What is the conditional relative frequency (as a percent) of students who drink water given that the student brings lunch from home?

| | Water | Juice | Total |
|---|---|---|---|
| Brings from home | 54 | 30 | 84 |
| Buys at school | 36 | 48 | 84 |
| Total | 90 | 78 | 168 |

  1. 60.0%
  2. 53.6%
  3. 64.3% (correct answer)
  4. 32.1%

Explanation: This question asks for the conditional relative frequency of drinking water given that the student brings lunch from home, based on that subgroup. Use the row total for brings from home as the denominator, which is 84. Locate the water cell in that row, 54, and calculate 54 divided by 84, approximately 0.643 or 64.3%. A common misconception is using the water column total of 90, yielding 54/90 = 60%, disregarding the condition. To transfer this, interpret 'given brings from home' as the row total, then ratio the relevant cell to that total for the conditional frequency.
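The row-total rule described above can be written as a small helper. This is an illustrative sketch: the nested dictionary layout and the function name `conditional_relative_frequency` are choices made here, with the counts taken from the two-way table.

```python
# Counts from the two-way table, keyed by lunch source (row) then drink (column).
counts = {
    "Brings from home": {"Water": 54, "Juice": 30},
    "Buys at school": {"Water": 36, "Juice": 48},
}

def conditional_relative_frequency(table, row, column):
    """Relative frequency of `column` given `row`: the cell count
    divided by that row's total."""
    row_total = sum(table[row].values())
    return table[row][column] / row_total

water_given_home = conditional_relative_frequency(counts, "Brings from home", "Water")
print(f"{water_given_home:.1%}")  # 64.3%
```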

Question 12

A card is drawn from a standard 52-card deck. Let $A$ be “the card is a heart” and $B$ be “the card is a face card (J, Q, or K).” Given that $P(A)=\tfrac{1}{4}$, $P(B)=\tfrac{3}{13}$, and $P(A\cap B)=0.05$, are $A$ and $B$ independent? Justify using probabilities by comparing $P(A\cap B)$ to $P(A)P(B)$.

  1. No; $P(A)P(B)=\tfrac{1}{4}\cdot\tfrac{3}{13}=\tfrac{3}{52}\approx 0.0577$, and this is not $0.05$. (correct answer)
  2. Yes; $P(A)P(B)=\tfrac{1}{4}\cdot\tfrac{3}{13}=\tfrac{3}{52}=0.05$, matching $P(A\cap B)$.
  3. Yes; $P(A)+P(B)=0.25+\tfrac{3}{13}\approx 0.4808$, which equals $P(A\cap B)=0.05$.
  4. Yes; suit and rank seem unrelated, so $A$ and $B$ must be independent.

Explanation: Events are independent if one doesn't influence the other's probability. We check this by seeing if P(A∩B) equals P(A)P(B). Given P(A) = 1/4 = 0.25 and P(B) = 3/13 ≈ 0.2308, P(A)P(B) = 0.25 × 3/13 = 3/52 ≈ 0.0577. This does not match the given P(A∩B) = 0.05, so A and B are not independent. A misconception in choice 2 is claiming 3/52 equals 0.05, but 3/52 is actually about 0.0577, not 0.05. The reliable strategy is to compute the exact product and compare it to P(A∩B). Avoid assuming independence based on seemingly unrelated traits like suit and rank without verifying the probabilities.
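Exact fractions avoid the rounding trap of treating 3/52 as 0.05. The sketch below uses Python's `fractions.Fraction` to compare the product P(A)P(B) with the P(A∩B) value given in the question; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_a = Fraction(1, 4)           # P(A): heart
p_b = Fraction(3, 13)          # P(B): face card
p_a_and_b = Fraction(5, 100)   # P(A and B) = 0.05, as given in the question

product = p_a * p_b            # exactly 3/52
independent = (product == p_a_and_b)

print(product, round(float(product), 4), independent)  # 3/52 0.0577 False
```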

Question 13

A library randomly assigned 56 patrons to receive one of two checkout messages at the self-checkout kiosk: Treatment A (message emphasizing due date) and Treatment B (message emphasizing returning items for others). There were 28 patrons per group. The outcome was whether the patron returned the book on time.

Results: Treatment A on-time returns = 23/28 (82.1%), Treatment B on-time returns = 16/28 (57.1%). Observed difference in proportions (A − B) is about $0.25$.

A randomization test shuffled the 56 return outcomes between groups 5,000 times under “no treatment effect.” In the simulations, 15 of the 5,000 shuffled differences were at least as large as $0.25$.

Which conclusion is most reasonable about whether Treatment A increases the on-time return rate compared to Treatment B (one-sided, A − B)?

  1. Because 15 out of 5,000 simulated differences were at least as large as $0.25$, the observed difference is rare under no effect, so there is evidence that Treatment A increases the on-time return rate. (correct answer)
  2. Because the study used random assignment, it proves Treatment A increases the on-time return rate for every patron, not just on average.
  3. Because 15 out of 5,000 is a large number, the observed difference is common under no effect, so there is no evidence of a treatment effect.
  4. Because patrons were not randomly sampled from all library patrons everywhere, random assignment cannot support a causal conclusion within this experiment.

Explanation: In a randomized experiment analyzed by simulation, we ask whether emphasizing due dates (Treatment A) raises on-time book returns compared with emphasizing others' needs (Treatment B). Random assignment balances other factors between the groups on average, so a difference in outcomes can be attributed to the messages or to chance. The observed difference is the on-time proportion in A minus that in B, about 0.25 here. The 'no effect' simulation reshuffles the 56 outcomes between groups 5,000 times, forming a distribution of differences that arise from chance alone. Only 15 of the 5,000 shuffles (0.3%) produced a difference at least as large as 0.25, so the observed difference is rare under no effect, which is evidence that Treatment A increases the on-time return rate. Two misconceptions to avoid: random assignment is not random sampling (it supports a causal conclusion for the patrons in this experiment, not automatically for all patrons everywhere), and a rare result is evidence, not proof. For other studies, focus on how often the simulated differences are at least as extreme as the observed one.

Question 14

A ball is tossed upward from a balcony, and its height is measured at different times. The data are shown below.

Time (seconds): 0, 1, 2, 3, 4, 5 Height (meters): 12, 19, 22, 21, 16, 7

Which type of function is most reasonable for modeling the relationship between time and height?

  1. Inverse variation
  2. Quadratic (correct answer)
  3. Linear
  4. Exponential

Explanation: To fit a function to real-world data, we examine cues like constant differences (linear), constant ratios (exponential), or constant second differences with a turning point (quadratic). Here, the first differences are 7, 3, -1, -5, -9, and second differences are -4, -4, -4, -4, which are constant, signaling a quadratic model. This curved shape with a peak at around 2-3 seconds matches the parabolic trajectory of a tossed ball under gravity. Quadratic is reasonable for modeling projectile motion with acceleration due to gravity. A misconception is mistaking the initial rise for exponential growth, but the turning point and negative second differences rule that out. Always check second differences after first ones to identify quadratic patterns.
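The first- and second-difference check described above is easy to automate for equally spaced times. This is a minimal sketch; the helper name `differences` is chosen here for illustration.

```python
def differences(seq):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(seq, seq[1:])]

heights = [12, 19, 22, 21, 16, 7]   # meters at t = 0, 1, 2, 3, 4, 5 seconds

first = differences(heights)
second = differences(first)

print(first)   # [7, 3, -1, -5, -9]
print(second)  # [-4, -4, -4, -4]

# Constant, nonzero second differences point to a quadratic model.
is_quadratic_pattern = len(set(second)) == 1 and second[0] != 0
print(is_quadratic_pattern)  # True
```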

Question 15

A game uses a fair spinner with 4 equal sections labeled 1, 2, 3, and 4. The spinner is spun once. Define the random variable $X$ as the winnings in dollars, where the player wins 0, 1, 1, or 3 dollars when the spinner lands on 1, 2, 3, or 4, respectively. Which table correctly represents the probability distribution of $X$?

  1. (correct answer)

     | $X$ | $P(X)$ |
     |---|---|
     | $0$ | $\tfrac{1}{4}$ |
     | $1$ | $\tfrac{1}{2}$ |
     | $3$ | $\tfrac{1}{4}$ |

  2.

     | $X$ | $P(X)$ |
     |---|---|
     | $0$ | $\tfrac{1}{4}$ |
     | $1$ | $\tfrac{1}{4}$ |
     | $3$ | $\tfrac{1}{4}$ |

  3.

     | $X$ | $P(X)$ |
     |---|---|
     | $0$ | $\tfrac{1}{4}$ |
     | $1$ | $\tfrac{1}{4}$ |
     | $2$ | $\tfrac{1}{4}$ |
     | $3$ | $\tfrac{1}{4}$ |

  4.

     | $X$ | $P(X)$ |
     |---|---|
     | $1$ | $\tfrac{1}{4}$ |
     | $2$ | $\tfrac{1}{4}$ |
     | $3$ | $\tfrac{1}{4}$ |
     | $4$ | $\tfrac{1}{4}$ |

Explanation: This problem involves a random variable representing winnings based on spinner outcomes. A random variable X maps each outcome to a numerical value; here, the dollar amount won. The spinner outcomes {1, 2, 3, 4} map to winnings: X(1) = 0, X(2) = 1, X(3) = 1, X(4) = 3. Since outcomes 2 and 3 both yield 1 dollar, we combine their probabilities: P(X = 1) = 1/4 + 1/4 = 1/2. The complete distribution shows P(X = 0) = 1/4, P(X = 1) = 1/2, P(X = 3) = 1/4, totaling 1. Students often mistakenly think each X-value must have equal probability, but probabilities depend on how many outcomes map to each value. To construct the distribution: map each outcome to its X-value → group outcomes with the same X-value → sum their probabilities.
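The map, group, and sum steps can be sketched directly. The dictionary `winnings` below encodes the payout rule from the question; the other names are chosen here for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Map each spinner outcome to its winnings X (in dollars).
winnings = {1: 0, 2: 1, 3: 1, 4: 3}
p_outcome = Fraction(1, 4)  # fair spinner: four equally likely sections

# Group outcomes by X-value and sum their probabilities.
distribution = defaultdict(Fraction)
for outcome, x in winnings.items():
    distribution[x] += p_outcome

for x in sorted(distribution):
    print(x, distribution[x])
# 0 1/4
# 1 1/2
# 3 1/4
```

The probabilities sum to 1, which is a quick sanity check on any distribution built this way.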

Question 16

A card is drawn at random from a set of 12 equally likely cards labeled 1 through 12.

Let event $A$ be “the number is a multiple of 3” and event $B$ be “the number is even.”

Using the outcomes in this model, what is $P(A\mid B)$? Give your answer as a simplified fraction.

  1. $\tfrac{1}{2}$
  2. $\tfrac{1}{3}$ (correct answer)
  3. $\tfrac{1}{6}$
  4. $\tfrac{1}{4}$

Explanation: This question tests conditional probability with a card model containing 12 equally likely cards. The notation P(A|B) means we're finding the probability of A given that B has occurred. Event B (even numbers) contains {2, 4, 6, 8, 10, 12}, which gives us 6 outcomes. Among these 6 even numbers, we need to count how many are also multiples of 3 (event A). The numbers 6 and 12 are both even and multiples of 3, so 2 outcomes satisfy both conditions. Therefore, P(A|B) = 2/6 = 1/3, the fraction of even numbers that are also multiples of 3. A typical mistake is dividing by 12 (all cards), but the strategy is to first identify all outcomes in B, then count those also in A.
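The "restrict to B first, then count A" strategy can be checked by enumeration. This is a minimal sketch with variable names chosen here for illustration.

```python
from fractions import Fraction

cards = range(1, 13)  # cards labeled 1 through 12, equally likely

# Restrict to event B (even numbers) first.
B = [c for c in cards if c % 2 == 0]
# Within B, keep the outcomes that are also in A (multiples of 3).
A_and_B = [c for c in B if c % 3 == 0]

p_a_given_b = Fraction(len(A_and_B), len(B))
print(B)            # [2, 4, 6, 8, 10, 12]
print(A_and_B)      # [6, 12]
print(p_a_given_b)  # 1/3
```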

Question 17

A gym tracked 180 teen members by workout time and whether they prefer cardio or strength training. Based on the two-way table, which value represents the joint relative frequency (as a decimal) of a teen who works out in the Morning and prefers Cardio?

  1. 0.20 (correct answer)
  2. 0.33
  3. 0.36
  4. 0.56

Explanation: Joint relative frequency represents the proportion of the entire sample that falls into both specified categories simultaneously. Here we need teens who work out in the morning AND prefer cardio, divided by the total 180 teens. Locate the cell where the Morning row intersects with the Cardio column—this gives the count of morning cardio teens. Then divide by 180 to get the decimal form: (morning cardio count) ÷ 180. A frequent mistake is dividing by just the morning total or just the cardio total instead of the grand total. Remember: joint frequencies always use the entire sample size as the denominator, capturing how common this specific combination is overall.

Question 18

A box of candies is said to contain 25% strawberry candies, so $P(\text{strawberry})=0.25$. You randomly select 16 candies (with replacement) and get 7 strawberry candies. You simulate 200 runs of 16 selections under $P(\text{strawberry})=0.25$. In the simulations, 6 out of 200 runs produced 7 or more strawberry candies. What does the simulation suggest about how unusual the observed result is, and whether you should doubt the model?

  1. It is not unusual because the correct comparison is 7 or fewer strawberries, not 7 or more.
  2. It is somewhat rare (6 out of 200), so it may raise doubt about the 25% model, though it does not prove the model is wrong. (correct answer)
  3. It proves the model is wrong because the observed count is not exactly $0.25\times 16=4$.
  4. It is fairly common (6 out of 200), so it strongly supports the 25% model.

Explanation: We're checking whether a result like 7 strawberry candies in 16 selections is consistent with P(strawberry) = 0.25. Unusual outcomes do happen by chance, but very rare ones can prompt doubt about the model. The simulation repeats the 16 selections many times under the model and counts how often a result at least as extreme as the observed one (7 or more strawberries) occurs. Here that happened in 6 out of 200 runs (3%), so the observed result is somewhat rare, which may raise doubt without proving the model wrong. This is mild evidence against the 25% claim. A misconception is expecting random selections to follow a regular pattern or to match the expected count of 4 exactly; small samples fluctuate. In general, define what counts as 'extreme,' simulate many times, and use the resulting frequency to judge whether the data are consistent with the model.

Question 19

In a video game, two events happen in order. First, you must find a key (event $A$). If you find the key, you then attempt to open a locked door (event $B$). Suppose $P(A)=0.40$ and $P(B\mid A)=0.75$. What is the probability that both $A$ and $B$ occur, $P(A\cap B)$?

  1. $0.30$ (correct answer)
  2. $0.75$
  3. $1.15$
  4. $0.15$

Explanation: This question tests the Multiplication Rule for probability. The "and" represents sequential events in the game: first you must find the key, then you can attempt to open the door. The first event A (finding the key) has probability P(A) = 0.40. Given that you found the key, the conditional probability of opening the door is P(B|A) = 0.75. By the Multiplication Rule, P(A∩B) = P(A) × P(B|A) = 0.40 × 0.75 = 0.30. A common error is to add these probabilities (getting 1.15), but the strategy is to read "and" as "first find the key, then open the door" and multiply.

Question 20

A cafeteria manager claims: “Most students prefer the new menu.” To support this, they surveyed 200 students by standing next to the salad bar during lunch and asking students who walked by to respond. 150 said they prefer the new menu and 50 said they do not. Which critique best evaluates the claim?

  1. The claim is well supported because 150 out of 200 is more than half.
  2. The claim is not well supported because some students might have been hungry while answering.
  3. The claim is not well supported because the sample is a convenience sample taken near the salad bar and may not represent all students’ preferences. (correct answer)
  4. The claim is not well supported because 200 students is too small to estimate what most students prefer.

Explanation: In evaluating data-based reports, sample representativeness is critical for generalizing claims. The manager claims most students prefer the new menu, based on surveying 200 students near the salad bar, with 150 approving. The key limitation is the convenience sample, which may overrepresent health-conscious students and not reflect the whole population. This introduces sampling bias that can skew the results and undermines the claim. The correct critique (choice 3) points out that non-random sampling limits inferences about all students. A misconception is that a large sample like 200 ensures validity, but size doesn't correct bias; representativeness matters more. To evaluate a report, check (1) how the data were collected (here, a convenience method), (2) what comparisons are made (none, just a single proportion), and (3) whether the display is honest (no issues here).

Question 21

A company models the cost of producing custom T-shirts with $y = 120 + 7x$, where $x$ is the number of shirts produced (shirts) and $y$ is the total cost (dollars). What does the y-intercept represent in this context? Include units in your interpretation.

  1. When $x=0$, the model predicts the total cost is $7$ dollars.
  2. For each additional $1$ shirt, the model predicts the total cost is $120$ dollars per shirt.
  3. For each additional $1$ dollar of cost, the model predicts $7$ more shirts are produced.
  4. When $x=0$, the model predicts the total cost is $120$ dollars. (correct answer)

Explanation: The concept here is interpreting the slope and intercept in a linear model, which breaks down costs into fixed and variable parts in production contexts. The slope represents the predicted change in the response variable $y$ for each 1-unit increase in the predictor variable $x$, with units such as dollars per shirt. In this T-shirt production model, the slope of 7 means that for each additional shirt produced, the total cost is predicted to increase by 7 dollars, reflecting the marginal cost per unit. The y-intercept is the predicted value of $y$ when $x$ equals 0, often indicating setup costs. Here, the intercept of 120 shows that when no shirts are produced ($x=0$ shirts), the predicted total cost is 120 dollars, which is meaningful because it could cover initial setup or fixed expenses. A common misconception is mixing up slope and intercept, such as thinking the fixed cost is the per-unit rate rather than the baseline. To apply this elsewhere, label the axes with variable names and units, like 'shirts' for $x$ and 'dollars' for $y$, and attach units to the slope and intercept for precise analysis.

Question 22

A music app compared the number of songs streamed by users in one day. Data Set A is right-skewed with a few heavy users. Data Set B is also right-skewed but less extreme.

Summary statistics:

  • Data Set A: mean $\bar{x}=36.0$, median $=22.0$, standard deviation $s=40.0$, IQR $=18$
  • Data Set B: mean $\bar{x}=28.0$, median $=24.0$, standard deviation $s=18.0$, IQR $=12$

Which statement best compares the typical value and variability of Data Set A and Data Set B using appropriate measures?

  1. Use the median and IQR: Data Set A typically streams more songs (higher median) and is more variable (larger IQR) than Data Set B.
  2. Use the median and IQR: Data Set B typically streams more songs (higher median) and is more consistent (smaller IQR) than Data Set A. (correct answer)
  3. Use the mean and standard deviation: Data Set A typically streams more songs (higher mean) and is more consistent (smaller $s$) than Data Set B.
  4. Use the mean and standard deviation: Data Set B typically streams more songs (higher mean) and is more variable (larger $s$) than Data Set A.

Explanation: Comparing distributions means comparing center (the typical value) and spread (the variability). Both data sets are right-skewed, so the median and IQR, which are resistant to extreme values, are the appropriate measures. Data Set B's median of 24.0 is higher than A's 22.0, so a typical user in B streams more songs. Data Set B's IQR of 12 is smaller than A's 18, so B is more consistent. The correct statement therefore uses the median and IQR and reports B's higher center and smaller spread. The error to avoid is relying on the mean and standard deviation with skewed data: A's mean of 36.0 is pulled well above its median of 22.0 by a few heavy users. The process: identify the skew, choose the median and IQR, and compare in the context of streaming behavior.
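A short sketch can show why the mean is a poor summary of skewed data. The list `songs` is made-up illustration data, not the app's data; only the medians and IQRs in `data_a` and `data_b` come from the question.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most users stream a modest number of songs,
# while one heavy user streams far more.
songs = [10, 15, 20, 22, 25, 30, 200]
skew_gap = mean(songs) - median(songs)   # positive: the mean is pulled upward

# Summary statistics quoted in the question.
data_a = {"median": 22.0, "iqr": 18}
data_b = {"median": 24.0, "iqr": 12}

# Compare center with the median and spread with the IQR.
higher_center = "B" if data_b["median"] > data_a["median"] else "A"
more_consistent = "B" if data_b["iqr"] < data_a["iqr"] else "A"

print(skew_gap, higher_center, more_consistent)
```

The single value of 200 moves the mean far from the middle of the data while leaving the median unchanged, which is the same effect the heavy users have in Data Set A.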

Question 23

A cafeteria tracked 200 student purchases by drink choice and whether the student bought a dessert. Based on the two-way table, what is the conditional relative frequency (as a percent) of Bought dessert = Yes given that the student chose Water?

  1. 20%
  2. 40%
  3. 60%
  4. 30% (correct answer)

Explanation: To find the conditional relative frequency of buying dessert given water choice, focus only on water drinkers as your population. The phrase "given that the student chose Water" identifies this subgroup. Locate the water row and find both the total water drinkers and how many bought dessert. Calculate: (water drinkers buying dessert) ÷ (total water drinkers) × 100%. Don't use the full 200 students—conditional frequencies restrict to a specific subgroup. This reveals whether water drinkers have different dessert-purchasing habits compared to other drink choices. Remember to convert your decimal answer to a percentage.
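The calculation described above can be sketched as follows. The two-way table itself is not reproduced in this text, so the counts below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the table's actual values, and `conditional_percent` is an illustrative helper name.

```python
def conditional_percent(count_in_both: int, count_in_given_group: int) -> float:
    """Percent of the given group that also has the outcome of interest."""
    return 100 * count_in_both / count_in_given_group


# Hypothetical counts, NOT the table from the question.
water_total = 50          # students who chose Water
water_and_dessert = 15    # of those, students who bought dessert
all_students = 200        # grand total stated in the question

# Correct: restrict the denominator to the Water row.
conditional = conditional_percent(water_and_dessert, water_total)

# Common mistake: dividing by the grand total gives a joint relative frequency.
joint = conditional_percent(water_and_dessert, all_students)

print(conditional, joint)
```

Only the denominator differs between the two calls, which is exactly the distinction between a conditional and a joint relative frequency.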

Question 24

School context: Event A is “a student is enrolled in AP Biology,” and event B is “a student is a senior.” The school reports: among seniors, 30% are enrolled in AP Biology; among non-seniors, 30% are enrolled in AP Biology. Which statement best describes whether A and B are independent?

  1. They are independent because AP Biology and being a senior are unrelated topics in school.
  2. They are not independent because it proves being a senior causes a student to enroll in AP Biology.
  3. They are not independent because 30% is not a very large percent, so enrollment must depend on grade level.
  4. They are independent because being a senior does not change the chance of being enrolled in AP Biology in this report. (correct answer)

Explanation: This question examines independence of events, using grade level and class enrollment to see whether one changes the probability of the other. Events are independent if knowing that one occurred (being a senior, event B) doesn't change the probability of the other (enrolling in AP Biology, event A). Here, the chance of A is 30% among seniors and also 30% among non-seniors, so $P(A \mid B) = P(A \mid \text{not } B) = P(A) = 0.30$, and the information about B doesn't alter $P(A)$. A common misconception is thinking that a low percentage like 30% implies dependence, but independence is about equal probabilities across groups, not the size of the percentage. To verify, rephrase: among seniors, what fraction are in AP Bio? It's the same as among non-seniors, confirming independence. Always compare the conditional probabilities to check.
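The independence check can be sketched numerically with the law of total probability. The two 30% rates come from the report; the share of students who are seniors is not given, so the values passed as `p_senior` are assumptions used only to show that the conclusion does not depend on it.

```python
import math

p_a_given_senior = 0.30       # from the report
p_a_given_non_senior = 0.30   # from the report


def overall_p_a(p_senior: float) -> float:
    """P(A) by the law of total probability; p_senior is a hypothetical share."""
    return p_a_given_senior * p_senior + p_a_given_non_senior * (1 - p_senior)


# Whatever share of students are seniors, P(A) equals both conditional values,
# so knowing B does not change the probability of A.
independent = all(
    math.isclose(overall_p_a(p), p_a_given_senior) for p in (0.1, 0.25, 0.5, 0.9)
)

print(independent)
```

If the two conditional rates differed, `overall_p_a` would land between them and match neither, which is what dependence looks like in this framing.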

Question 25

A bag contains 5 red marbles and 3 blue marbles. Three marbles are drawn without replacement. Since only the set of marbles drawn matters, order does not matter. What is the probability that exactly 2 of the marbles drawn are red?

  1. $\dfrac{\binom{5}{2}\binom{3}{1}}{\binom{8}{2}}$
  2. $\dfrac{\binom{5}{3}}{\binom{8}{3}}$
  3. $\dfrac{5\mathrm{P}2\cdot 3\mathrm{P}1}{8\mathrm{P}3}$
  4. $\dfrac{\binom{5}{2}\binom{3}{1}}{\binom{8}{3}}$ (correct answer)

Explanation: This problem asks about drawing marbles where only the set of marbles matters, not the order drawn. Since drawing {red, red, blue} gives the same set as {red, blue, red}, order doesn't matter and we use combinations. The total possible outcomes are C(8,3) = 56 ways to choose any 3 marbles from 8. For favorable outcomes (exactly 2 red and 1 blue), we need C(5,2) ways to choose 2 red from 5 AND C(3,1) ways to choose 1 blue from 3, giving C(5,2) × C(3,1) = 10 × 3 = 30 ways. Therefore, the probability is [C(5,2) × C(3,1)]/C(8,3) = 30/56 = 15/28. Remember: when drawing marbles into a group, rearranging them doesn't create a new outcome, so use combinations.
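The count above can be cross-checked two ways: with the combinations formula and by listing every unordered draw. This is a sketch for verification only; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Formula from the explanation: C(5,2) * C(3,1) / C(8,3).
by_formula = Fraction(comb(5, 2) * comb(3, 1), comb(8, 3))

# Brute-force check: label the 8 marbles and list every unordered draw of 3.
marbles = ["R"] * 5 + ["B"] * 3
draws = list(combinations(range(len(marbles)), 3))
favorable = sum(
    1 for draw in draws if sum(marbles[i] == "R" for i in draw) == 2
)
by_counting = Fraction(favorable, len(draws))

print(by_formula, by_counting)  # both 15/28
```

Using `Fraction` keeps the result exact, so the two approaches can be compared for equality rather than approximately.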