Introduction to Planning a Study
A state transportation agency asks: “Do digital roadside signs that display a driver’s current speed reduce average driving speed?” The population is all drivers on a particular highway segment. The proposed plan is to measure speeds for one week, install the digital signs, then measure speeds for the next week and compare the two weeks’ average speeds. Which aspect is most important to address before collecting data?
Plan to compute both the mean and median speed for each week
Address potential confounding from time effects (weather, enforcement, traffic) by using a concurrent control segment or randomization
Collect data for more weeks so the analysis includes more observations
Increase the number of speed measurements each day so the sample size is very large
Use a more sensitive radar device to reduce measurement error
Explanation
This question tests understanding of confounding in before-after studies. The plan measures speeds before and after installing signs, but many factors could change between the two weeks: weather conditions, traffic patterns, police enforcement, holidays, or random variation. These time-related confounders, rather than the signs themselves, could explain any difference in speed. Using a concurrent control segment without signs, or randomizing sign installation across multiple segments (B), would control for these temporal effects. Simply increasing the number of measurements (D) or collecting more weeks of data (C) doesn't address the confounding. Without controlling for time effects, the study cannot determine whether speed changes are due to the signs or to other factors that varied between the two weeks.
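One common way to use a concurrent control segment is to subtract the control segment's change from the sign segment's change. The sketch below illustrates that arithmetic only; the function name `diff_in_diff` and all speeds are hypothetical values chosen for illustration, not data from any study.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change on the treated segment minus change on the control segment."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical weekly mean speeds in mph (illustrative numbers, not real data).
sign_before, sign_after = 65.0, 61.0        # segment that gets the sign
control_before, control_after = 64.0, 62.5  # similar segment, never gets a sign

# The naive before-after change mixes the sign's effect with time effects.
naive_change = sign_after - sign_before

# The control segment's change estimates the time effects alone.
adjusted_change = diff_in_diff(sign_before, sign_after, control_before, control_after)

print(f"Naive before-after change: {naive_change:+.1f} mph")
print(f"Change net of the control segment: {adjusted_change:+.1f} mph")
```

With these made-up numbers, the naive comparison credits the signs with a 4.0 mph drop, but 1.5 mph of that drop also appears on the segment without signs, leaving 2.5 mph. This adjustment assumes the two segments would have changed by the same amount without the signs.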
A fitness app company asks: “Does enabling a daily step-goal notification increase average daily steps?” The population is all current app users. The proposed plan is to compare average steps for users who turn on notifications in the settings to users who do not, using one month of app data. Which aspect is most important to address before collecting data?
Switch from average steps to total steps so the response variable is larger
Randomly assign notification status (or use a design that addresses confounding) because users who opt in may already be more motivated
Collect data for more months so the analysis includes more observations per user
Increase the number of users included so the sample size is extremely large
Plan to use a histogram of steps for each group in the final report
Explanation
This question focuses on self-selection bias and confounding. Users who choose to enable step-goal notifications are likely already more motivated about fitness than those who don't. This self-selection creates confounding: any observed difference in steps could be due to pre-existing motivation, not the notifications themselves. Random assignment of notification status (B) would address this confounding by making the groups comparable, on average, in everything except notification status. Simply increasing the sample size (D) or the data collection time (C) won't address this fundamental design issue. Without randomization or another method to address confounding, the study cannot determine whether notifications actually increase steps or whether motivated users both enable notifications and walk more.
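A small simulation can make the self-selection problem concrete. The sketch below is built entirely on assumptions: a hypothetical `motivation` score drives both step counts and the chance of opting in, and the true effect of notifications is set to zero. None of the numbers come from real app data.

```python
import random
from statistics import mean


def simulate(n_users=20_000, true_effect=0.0, seed=1):
    """Return (opt-in difference, randomized difference) in mean daily steps."""
    rng = random.Random(seed)
    opt_in, opt_out, assigned_on, assigned_off = [], [], [], []
    for _ in range(n_users):
        motivation = rng.random()  # hypothetical confounder between 0 and 1
        base_steps = 5000 + 3000 * motivation + rng.gauss(0, 1500)

        # Observational plan: more motivated users are more likely to opt in.
        chose_on = rng.random() < motivation
        (opt_in if chose_on else opt_out).append(base_steps + true_effect * chose_on)

        # Experimental plan: a coin flip decides, regardless of motivation.
        coin_on = rng.random() < 0.5
        (assigned_on if coin_on else assigned_off).append(base_steps + true_effect * coin_on)

    return mean(opt_in) - mean(opt_out), mean(assigned_on) - mean(assigned_off)


naive_diff, randomized_diff = simulate()
print(f"Opt-in vs. opt-out difference: {naive_diff:.0f} steps")
print(f"Randomly assigned difference:  {randomized_diff:.0f} steps")
```

Under these assumptions the opt-in comparison shows a large gap even though notifications do nothing, while the randomized comparison stays near zero.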
A local news station asks: “What proportion of city residents support building a new sports arena?” The population is all adult residents of the city. The proposed plan is to stand outside the arena at a weekend event and interview as many adults as possible about whether they support the new arena. Which aspect is most important to address before collecting data?
Increase the sample size by interviewing people for more hours at the event
Use a Likert scale (strongly oppose to strongly support) instead of a yes/no question
Plan to compute a 95% confidence interval for the proportion who support the arena
Ask additional demographic questions to make the survey more detailed
Use a random sample of adult residents (e.g., random digit dialing or address-based sampling) rather than a convenience sample at the arena
Explanation
This question tests understanding of sampling bias in surveys. Interviewing people outside the arena at an event creates severe bias: attendees at arena events are likely sports fans who support arena construction more than typical residents do. This convenience sample cannot represent the opinions of all adult city residents. To estimate the true proportion of city residents who support the arena, the study needs a random sample of residents (E), obtained through methods like random digit dialing or address-based sampling. Interviewing more people at the arena (A) just makes the biased sample larger. Without proper random sampling from the population, the results will likely overestimate support, because they capture mainly arena event attendees rather than representative city residents.
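The contrast between a convenience sample and a simple random sample can be sketched with a simulated city. Everything here is an assumption made for illustration: the share of sports fans, the support rates in each group, and the simplification that only fans attend arena events.

```python
import random

rng = random.Random(42)

# Hypothetical city: 20% of adults are sports fans who attend arena events.
# Assumed support rates: 90% among fans, 40% among everyone else.
population = []
for _ in range(50_000):
    is_fan = rng.random() < 0.20
    supports = rng.random() < (0.90 if is_fan else 0.40)
    population.append((is_fan, supports))


def support_rate(people):
    """Proportion of the given (is_fan, supports) records that support the arena."""
    return sum(supports for _, supports in people) / len(people)


true_rate = support_rate(population)

# Convenience sample: interview 1,000 people found at the arena (fans only here).
attendees = [person for person in population if person[0]]
convenience_rate = support_rate(rng.sample(attendees, 1000))

# Simple random sample: 1,000 adults drawn from the whole population.
random_rate = support_rate(rng.sample(population, 1000))

print(f"True support:           {true_rate:.2f}")
print(f"Arena convenience est.: {convenience_rate:.2f}")
print(f"Random sample est.:     {random_rate:.2f}")
```

Both samples have the same size, so the gap between the two estimates comes from who was reachable, not from how many people were asked.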
A principal asks: “Does playing instrumental music during independent work time improve math quiz scores?” The population is all 7th graders at the school. The proposed plan is to let each math teacher decide whether to play music in their classes for a month, then compare the average quiz scores of students in music classes vs. no-music classes. Which aspect is most important to address before collecting data?
Decide whether to report the results using medians instead of means
Make sure the quizzes are graded quickly so students get feedback
Randomly assign classes (or students) to music vs. no music to reduce confounding from teacher and class differences
Use a larger sample size by including 6th and 8th graders as well
Increase the number of quizzes so there are more scores to analyze
Explanation
This question addresses confounding in educational research. Letting teachers decide for themselves whether to play music creates multiple confounding issues: teachers who choose music might have different teaching styles, enthusiasm levels, or classroom management approaches. Additionally, different classes may have varying ability levels or dynamics. These teacher and class differences, not the music itself, could explain any observed difference in quiz scores. Random assignment of classes or students to music conditions (C) would control for these confounders and allow causal conclusions. Simply increasing the number of quizzes (E) or the sample size (D) doesn't address the fundamental design flaw. Without randomization, the study cannot determine whether music actually improves scores or whether other factors are responsible.
A restaurant chain asks: “Does a new menu layout increase average spending per customer?” The population is all customers at the chain’s locations. The proposed plan is to introduce the new menu at stores whose managers volunteer to try it, while other stores keep the old menu, then compare average spending across those stores over the next month. Which aspect is most important to address before collecting data?
Collect data for a longer time so there are more receipts to analyze
Randomly assign stores (or time periods within stores) to menu types to reduce confounding from store differences and manager selection
Decide whether to exclude customers who only buy drinks from the analysis
Use a more complex statistical model so the result is more convincing
Increase the number of participating stores so the sample size is larger
Explanation
This question addresses confounding in business experiments. Allowing managers to volunteer their stores for the new menu creates selection bias: managers who volunteer may be more innovative, run better-performing stores, or serve different customer bases. These store-level differences, not the menu layout itself, could explain any difference in spending. Random assignment of stores or time periods to menu types (B) would control for these confounders and allow causal conclusions about the menu's effect. Simply increasing the number of stores (E) or the collection time (A) doesn't fix this design flaw. Without randomization, the study cannot determine whether spending differences are due to the new menu or to systematic differences between volunteer and non-volunteer stores.
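Random assignment of stores can be as simple as shuffling the list of stores and splitting it in half. The sketch below shows one way to do that; the helper name `randomly_assign`, the store IDs, and the seed are all hypothetical.

```python
import random


def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical store identifiers for a 20-store chain.
store_ids = [f"store_{i:02d}" for i in range(1, 21)]

new_menu_stores, old_menu_stores = randomly_assign(store_ids, seed=7)

print("New menu:", sorted(new_menu_stores))
print("Old menu:", sorted(old_menu_stores))
```

Because chance alone decides which stores get the new menu, manager enthusiasm and store performance tend to balance out across the two groups. Fixing the seed simply makes the assignment reproducible for record keeping.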
An environmental group asks: “Do households that receive a water-conservation brochure reduce monthly water use?” The population is all households in a town. The proposed plan is to mail brochures to households that have emailed the group in the past, then compare their next month’s water use to the townwide average for that month. Which aspect is most important to address before collecting data?
Collect water-use data for more months to increase the number of observations
Increase the number of brochures mailed so the sample size is larger
Define the response variable precisely (e.g., gallons used per household) to avoid ambiguity
Plan to analyze the data with a confidence interval instead of a hypothesis test
Use a randomized comparison group from the town (and avoid selecting only prior supporters) to reduce bias and confounding
Explanation
This question tests recognition of selection bias and confounding. The plan sends brochures only to households that previously emailed the environmental group. These are likely already environmentally conscious households that may have been reducing water use anyway. Comparing them to the town average creates both selection bias and confounding: these motivated households don't represent typical town residents, and their water use patterns may differ for reasons unrelated to the brochure. Using a randomized comparison group from the general town population (E) would address this bias and allow valid conclusions about the brochure's effect. Without proper randomization, and without avoiding the selection of prior supporters, any observed difference cannot be attributed to the brochure itself.
A city health department asks: “Does sending text-message reminders increase flu vaccination rates?” The population is all adult residents of the city. The proposed plan is to recruit volunteers at a community health fair, then randomly assign those volunteers to receive weekly reminder texts or no texts, and compare vaccination within 2 months. Which aspect is most important to address before collecting data?
Use a double-blind procedure so neither group knows whether they received texts
Choose a larger sample size so the margin of error for the vaccination rate is smaller
Increase the number of reminder texts to improve the chance of finding a statistically significant effect
Decide whether to graph vaccination rates with a bar chart or a pie chart in the report
Ensure the volunteer sample is representative of all adult residents, since lack of representativeness limits generalizing to the city
Explanation
This question focuses on sampling bias and generalizability. The plan recruits volunteers at a health fair, which creates a biased sample: people attending health fairs are likely more health-conscious than the general population. This volunteer sample cannot represent all adult city residents, so the findings cannot be generalized to the entire city (the stated population). The random assignment within this sample is good for internal validity, but external validity is compromised. Increasing the number of reminders (C) or the sample size (B) won't fix this fundamental sampling problem. To answer questions about the city's residents, the study needs a representative sample obtained through random sampling (E), not convenience sampling at a health fair.
A consumer group wants to answer: “What proportion of residents in our state support a proposed ban on single-use plastic bags?” The population is all adult residents in the state. The plan is to call phone numbers from a list of people who previously signed online environmental petitions and ask whether they support the ban. Which aspect is most important to address before collecting data?
Decide whether the analysis will include a confidence interval or only a point estimate
Use a sampling frame that covers the full adult population (not just petition signers) to reduce selection bias
Increase the number of calls made each day to obtain a larger sample size
Use a 5-point scale instead of yes/no to measure strength of support
Plan to summarize results separately for urban and rural respondents
Explanation
This AP Statistics question on introduction to planning a study addresses sampling bias in opinion surveys for population proportions. Calling only prior petition signers creates selection bias toward environmentally inclined individuals, which would likely overestimate support for the plastic bag ban. A sampling frame covering all adults, as in choice B, reduces this bias and improves representativeness. Choice C, making more calls to get a larger sample, is a common distractor but doesn't fix the biased frame. Mini-lesson: Use probability sampling to mirror the population and avoid voluntary response or convenience bias. A flawed frame leads to unreliable estimates no matter how many people are reached. Planning for full coverage helps the sample reflect the population's range of views.
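The point that a larger sample does not repair a biased frame can be checked with the usual margin-of-error formula for a proportion. In the sketch below, both support rates are hypothetical, and the sample proportion is taken to equal the frame's support rate exactly in order to isolate the effect of sample size.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


true_support = 0.55   # hypothetical support among all adult residents
frame_support = 0.85  # hypothetical support among petition signers only

intervals = {}
for n in (100, 1_000, 10_000):
    moe = margin_of_error(frame_support, n)
    intervals[n] = (frame_support - moe, frame_support + moe)
    low, high = intervals[n]
    print(f"n = {n:>6}: interval ({low:.3f}, {high:.3f}), "
          f"contains true value: {low <= true_support <= high}")
```

Under these assumptions the interval narrows as the number of calls grows, but it narrows around the petition signers' rate, so it never covers the statewide value.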
A principal wants to answer: “Does allowing students to listen to instrumental music during independent work improve quiz scores?” The population is all students in the school. The plan is for one teacher who likes music to use music during work time in her classes, while another teacher who prefers silence keeps her usual routine; then they will compare quiz scores between the two teachers’ classes. Which aspect is most important to address before collecting data?
Use a harder quiz so there is more variation in scores
Randomly assign students (or class periods) to music versus no music to avoid confounding with teacher differences
Use a 90% confidence level to make the interval narrower
Increase the number of quizzes given so the sample size of scores is larger
Report both the mean and median quiz score for each group
Explanation
This question in AP Statistics' introduction to planning a study examines experimental design flaws, particularly confounding in non-randomized setups. The plan ties the music condition to each teacher's preference, confounding the music effect with teacher styles or class compositions. Randomly assigning music conditions, as in choice B, is vital to isolate the music's impact on quiz scores. Choice D, giving more quizzes to get a larger sample of scores, is a tempting distractor since it increases the amount of data but ignores the confounding. Mini-lesson: Randomization in experiments balances groups on average, reducing bias from extraneous variables like teacher differences. Without it, associations may be spurious. Design the study to control for confounders before collecting data so that causal inferences are reliable.
A city transportation office asks: “Is average commute time different for residents who primarily use public transit versus those who primarily drive?” The population is all city residents who commute to work. The plan is to post an online survey link on the city’s public transit social media pages and compare reported commute times of transit users and drivers who respond. Which aspect is most important to address before collecting data?
Plan to use a matched pairs design by pairing each transit user with a driver in the same neighborhood
Increase the number of questions about commute satisfaction to provide more context
Decide whether to report commute time in minutes or in hours
Use a sampling method that reaches a representative sample of all commuters, not just followers of transit pages
Increase the sample size by leaving the survey open for an extra month
Explanation
AP Statistics' introduction to planning a study covers sampling techniques to avoid bias in comparative studies. Posting the survey on transit social media pages introduces selection bias, as it overrepresents transit users and is unlikely to reach a representative set of drivers, skewing the commute time comparison. Addressing this by using a method that reaches a representative sample of all commuters, per choice D, is essential for unbiased estimates. Choice E, extending the survey period to get a larger sample, is a distractor because size alone doesn't correct for a non-representative frame. Mini-lesson: Ensure the sampling frame covers the entire population to prevent undercoverage; stratified or cluster sampling can help with subgroups. Biased samples lead to invalid generalizations. Planning for full coverage strengthens the study's credibility.
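Stratified sampling, mentioned in the mini-lesson, can be sketched as drawing a separate random sample from each subgroup of a complete frame. The frame below is hypothetical: it assumes the city already has a list of all commuters labeled by primary commute mode, and the IDs, counts, and 10% sampling fraction are made up for illustration.

```python
import random


def stratified_sample(frame, fraction, seed=None):
    """Draw the same fraction at random, without replacement, from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for stratum, members in frame.items():
        k = round(len(members) * fraction)
        sample[stratum] = rng.sample(members, k)
    return sample


# Hypothetical sampling frame covering all commuters, grouped by commute mode.
frame = {
    "transit": [f"T{i}" for i in range(3_000)],
    "drive": [f"D{i}" for i in range(7_000)],
}

sample = stratified_sample(frame, fraction=0.10, seed=3)

for stratum, chosen in sample.items():
    print(f"{stratum}: sampled {len(chosen)} of {len(frame[stratum])}")
```

Sampling within each mode guarantees that both transit users and drivers appear in proportion to their share of the frame, which an open link on transit pages cannot do.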