Biased and Unbiased Point Estimates
Help Questions
A school wants to estimate the population mean number of minutes students at the school spend on homework per night, $\mu$. A simple random sample of 60 students is selected, and each student reports their minutes. The point estimate is constructed as the sample mean, $\bar{x}$, computed by adding the 60 reported values and dividing by 60. Which statement describes whether the estimate is biased?
The estimate is biased because self-reported data increase variability, so $\bar{x}$ will tend to be too high.
The estimate is unbiased because for an SRS, $E(\bar{x})=\mu$.
The estimate is biased because $\bar{x}$ is only unbiased when the population is normally distributed.
The estimate is biased because samples of size 60 are not large enough to avoid bias.
The estimate is unbiased only if the sample mean equals the population mean exactly.
Explanation
This question tests understanding of unbiased estimators for population means. When the data come from a simple random sample (SRS), the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$: if the sampling process were repeated many times, the average of all those sample means would equal $\mu$. Self-reporting might affect the accuracy of individual values, but nothing in the scenario indicates systematic over- or underreporting, so the sample mean from an SRS still targets the true population mean. Choice A incorrectly confuses variability with bias; increased variability affects precision but not whether the estimator is unbiased. Remember: bias is about whether an estimator systematically over- or underestimates the parameter in the long run, while variability is about how spread out the estimates are.
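The unbiasedness claim follows from linearity of expectation. As a sketch, write the 60 sampled values as $X_1, \dots, X_{60}$ (notation introduced here, not part of the question). In an SRS each draw is equally likely to be any student in the population, so $E(X_i) = \mu$ for every $i$:

$$E(\bar{x}) = E\!\left(\frac{1}{60}\sum_{i=1}^{60} X_i\right) = \frac{1}{60}\sum_{i=1}^{60} E(X_i) = \frac{1}{60}\cdot 60\mu = \mu$$

The calculation never uses the shape of the population distribution or the particular value $n = 60$, which is why the choices about normality and sample size fail.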
A wildlife biologist wants to estimate the population mean weight $\mu$ of adult fish in a lake. The biologist catches fish using a net with holes large enough that very small fish often escape before being weighed. The point estimate is the sample mean $\bar{x}$ of the weights of the 70 fish that are successfully caught and weighed. Which statement describes whether the estimate is biased?
The estimate is unbiased because $\bar{x}$ is a sample mean, and sample means are always unbiased.
The estimate is biased because the sample mean will not equal the population mean exactly in a single sample.
The estimate is unbiased because catching 70 fish is a large sample, which eliminates bias.
The estimate is biased because the net tends to miss lighter fish, so the sample mean will tend to be larger than $\mu$.
The estimate is unbiased only if the fish weights are normally distributed.
Explanation
This question illustrates selection bias through a flawed sampling mechanism. The net's design systematically lets lighter fish slip through the holes, so heavier fish are overrepresented among the fish that are weighed. This creates an upward bias: $\bar{x}$ will tend to overestimate the true population mean weight $\mu$. No matter how many times the biologist repeats this sampling method, the average of all sample means will be larger than $\mu$ because lighter fish are systematically underrepresented. Choice A incorrectly assumes all sample means are unbiased, ignoring how the sample was obtained. Choice C wrongly claims large samples eliminate bias, but catching 700 or 7,000 fish with the same net would still miss the lighter ones. The bias comes from the systematic exclusion of part of the population, not from random sampling variability.
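A small simulation can make the upward bias concrete. The sketch below is illustrative only: the lake population, the 300 g cutoff, and the 80% escape rate are all invented assumptions, and fish are drawn with replacement for simplicity.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical lake: adult fish weights in grams (values invented for illustration).
lake = [rng.uniform(100, 1000) for _ in range(5000)]
mu = statistics.fmean(lake)

def netted_sample_mean(n):
    """Mean weight of the first n fish the net holds on to."""
    caught = []
    while len(caught) < n:
        fish = rng.choice(lake)
        # Assumption: fish under 300 g escape 80% of the time; heavier fish never escape.
        if fish < 300 and rng.random() < 0.8:
            continue
        caught.append(fish)
    return statistics.fmean(caught)

# Long-run average of the sample mean for two different sample sizes.
avg_70 = statistics.fmean([netted_sample_mean(70) for _ in range(2000)])
avg_700 = statistics.fmean([netted_sample_mean(700) for _ in range(200)])
print(f"mu = {mu:.1f}")
print(f"average sample mean, n = 70:  {avg_70:.1f}")
print(f"average sample mean, n = 700: {avg_700:.1f}")
```

Under these assumptions both long-run averages sit well above `mu`, and moving from 70 to 700 fish does not bring the average back toward it.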
A school district wants a point estimate of the population proportion $p$ of all high school students in the district who get at least 8 hours of sleep on a typical school night. The district selects a simple random sample of 200 students from the full enrollment list and computes the sample proportion $\hat p$ who report at least 8 hours. The district reports $\hat p$ as the point estimate for $p$. Which statement describes whether the estimate is biased?
The estimate is biased because $\hat p$ will vary from sample to sample.
The estimate is unbiased only if the sample size is much larger than 200; otherwise it is biased.
The estimate is unbiased because $\hat p$ is an unbiased estimator of $p$ for a simple random sample.
The estimate is biased because increasing the sample size would change the expected value of $\hat p$.
The estimate is biased because using a sample instead of a census always underestimates $p$.
Explanation
This question tests understanding of biased versus unbiased point estimates. An estimator is unbiased if its expected value (long-run average) equals the population parameter it estimates. The sample proportion $\hat{p}$ from a simple random sample is an unbiased estimator of the population proportion $p$, meaning that if we took many random samples and computed $\hat{p}$ for each, the average of all these $\hat{p}$ values would equal $p$. Choice A incorrectly confuses variability with bias—having sample-to-sample variation doesn't make an estimator biased. Remember: bias is about whether the estimator is systematically off-target on average, while variability is about how spread out the estimates are.
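The difference between varying and being off-target can be seen in a short simulation. This is a minimal sketch: the enrollment of 4,000 students and the 35% rate are invented for illustration, not taken from the question.

```python
import random
import statistics

rng = random.Random(5)
# Hypothetical enrollment list: 1 = sleeps at least 8 hours, 0 = does not.
enrollment = [1] * 1400 + [0] * 2600
p = statistics.fmean(enrollment)  # true proportion in this made-up population

# Repeat the district's procedure many times: SRS of 200, compute p-hat.
p_hats = [statistics.fmean(rng.sample(enrollment, 200)) for _ in range(4000)]
avg_p_hat = statistics.fmean(p_hats)

print(f"p = {p:.3f}, average p-hat = {avg_p_hat:.3f}")
print(f"smallest p-hat = {min(p_hats):.3f}, largest p-hat = {max(p_hats):.3f}")
```

Individual values of $\hat p$ land on both sides of $p$, yet their average stays close to $p$, which is what unbiased means.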
A company wants a point estimate of the population mean commute time $\mu$ (in minutes) for all its employees. To construct the estimate, the HR manager surveys the first 50 employees who arrive at work on Monday and computes their sample mean commute time $\bar x$, reporting $\bar x$ as the point estimate for $\mu$. Which statement describes whether the estimate is biased?
The estimate is biased because surveying early arrivals is not a random sample and could systematically misrepresent commute times.
The estimate is biased because $\bar x$ has sampling variability, so it cannot be unbiased.
The estimate is unbiased because any sample mean $\bar x$ is an unbiased estimator of $\mu$ regardless of how the sample is chosen.
The estimate is unbiased because bias depends only on the population standard deviation, not on the sampling method.
The estimate is unbiased because the sample size is 50, which is large enough to eliminate bias.
Explanation
This question tests recognizing bias from non-random sampling methods. The HR manager surveys only the first 50 employees to arrive, which is not a random sample of all employees. Early arrivals might have systematically different commute times (perhaps shorter commutes make it easier to arrive early), so this sampling method could produce a biased estimate. Choice A correctly identifies this bias. Choice C is wrong because $\bar{x}$ is an unbiased estimator of $\mu$ only when the sample is drawn at random from the target population. Remember: the formula for $\bar{x}$ doesn't create bias, but the sampling method can. If your sample systematically excludes or overrepresents certain groups, your estimate will tend to be biased regardless of what statistic you calculate.
To estimate the population mean commute time $\mu$ for all employees at a large firm, an analyst takes an SRS of 40 employees and records each commute time. She reports the point estimate $\bar{x}+5$ minutes, explaining that people often underreport commute times by about 5 minutes. Which statement describes whether the estimate $\bar{x}+5$ is biased for $\mu$?
The estimate is biased because $\bar{x}$ has sampling variability.
The estimate is biased low because its expected value is $\mu-5$.
The estimate is biased high because its expected value is $\mu+5$.
The estimate is unbiased for large samples, but biased for small samples like $n=40$.
The estimate is unbiased because adding a constant does not affect the center of the sampling distribution.
Explanation
This question evaluates understanding of biased and unbiased estimators, particularly how modifying the sample mean affects bias in estimating $\mu$. Adding 5 to $\bar{x}$ shifts the entire sampling distribution up by 5, so the expected value becomes $\mu + 5$, making the estimate biased high; over many SRSs, the long-run average of $\bar{x} + 5$ would be $\mu + 5$, not $\mu$. This intentional adjustment introduces systematic error, even though $\bar{x}$ itself is unbiased when the recorded times are taken as accurate. Choice A is a distractor that confuses bias with variability: $\bar{x}$ does vary from sample to sample, but that variation doesn't cause bias; the added constant does. Bias vs. variability mini-lesson: Bias is a fixed offset of the center of the sampling distribution from the true parameter and does not depend on sample size, while variability is the spread of the estimates, which shrinks as $n$ increases. Here the offset of 5 persists for $n = 40$ and for any larger sample. Hence, $\bar{x} + 5$ is biased high.
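The shift can be written out with the rule $E(X + c) = E(X) + c$ for a constant $c$. As a sketch, treating $\bar{x}$ as unbiased for $\mu$, as the explanation does:

$$E(\bar{x} + 5) = E(\bar{x}) + 5 = \mu + 5, \qquad \text{bias} = E(\bar{x} + 5) - \mu = 5$$

The bias of 5 minutes contains no $n$, so it is the same for every sample size.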
A public health team wants a point estimate of the population mean systolic blood pressure $\mu$ for adults in a county. They take an SRS of 80 adults, but only those who show up to the clinic are measured; 25% of selected adults do not show up. The team computes the sample mean of the measured participants and reports it as the estimate of $\mu$. Which statement describes whether the estimate is biased?
The estimate is biased, but the bias disappears if the measured sample size is still at least 30.
The estimate is unbiased because the original selection was an SRS of 80 adults.
The estimate may be biased because nonresponse can make the measured group systematically different from the selected sample.
The estimate is unbiased only if the blood pressure distribution is normal.
The estimate is unbiased because nonresponse only increases variability, not bias.
Explanation
This question evaluates bias from nonresponse in an SRS when estimating $\mu$. With 25% nonresponse, the mean of the measured group may differ systematically from the mean of the full selected sample if the adults who do not show up tend to have different blood pressures; over many repetitions, the estimates could then center away from $\mu$ because of this self-selection. Nonresponse bias affects the center of the sampling distribution. Choice E is a distractor stating that nonresponse only increases variability, but it can also introduce bias. Bias vs. variability mini-lesson: Bias systematically shifts the center and isn't fixed by the remaining sample size, while variability increases as nonresponse shrinks the measured sample but is a separate issue. Here, bias may arise from who responds. Therefore, the estimate may be biased.
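Why the answer is "may be biased" rather than "is biased" can be sketched by simulating two kinds of nonresponse. Everything numeric here is an assumption made for illustration: the county's blood pressures, the 125 mmHg threshold, and the show-up probabilities.

```python
import random
import statistics

rng = random.Random(7)
# Hypothetical county: systolic pressures in mmHg (invented for illustration).
county = [rng.gauss(125, 15) for _ in range(10000)]
mu = statistics.fmean(county)

def respondent_mean(show_prob):
    """Mean of the adults from an SRS of 80 who actually show up."""
    selected = rng.sample(county, 80)
    measured = [bp for bp in selected if rng.random() < show_prob(bp)]
    return statistics.fmean(measured)

def unrelated(bp):
    # Assumption: showing up has nothing to do with blood pressure.
    return 0.75

def related(bp):
    # Assumption: adults with higher readings are more likely to show up.
    return 0.90 if bp >= 125 else 0.60

avg_unrelated = statistics.fmean([respondent_mean(unrelated) for _ in range(3000)])
avg_related = statistics.fmean([respondent_mean(related) for _ in range(3000)])
print(f"mu = {mu:.2f}")
print(f"nonresponse unrelated to pressure: {avg_unrelated:.2f}")
print(f"nonresponse related to pressure:   {avg_related:.2f}")
```

In this sketch the long-run average stays near `mu` when showing up is unrelated to blood pressure and drifts above it when the two are related. The data alone cannot tell the team which situation they are in.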
A teacher wants a point estimate of the population mean score $\mu$ on a 20-question quiz for all students in the course. She randomly selects 10 quizzes to grade, but she only grades quizzes from students who were present on the day after the quiz (students who were absent are excluded). She computes the mean of the graded quizzes and reports it as the point estimate for $\mu$. Which statement describes whether the estimate is biased?
The estimate may be biased because excluding absent students can systematically change the average compared with all students.
The estimate is biased because the sample mean varies from sample to sample.
The estimate is biased only if the sample size is less than 30; with 10 it is biased, but with 30 it would be unbiased.
The estimate is unbiased because a mean is always an unbiased estimator of a population mean.
The estimate is unbiased because the 10 graded quizzes were randomly selected.
Explanation
This question tests recognizing potential bias from excluding absent students when estimating $\mu$ from the selected quizzes. Excluding absentees may systematically alter the mean if their scores differ (e.g., are lower), so the expected value of the estimate might not equal $\mu$; over many selections, the long-run average could deviate from $\mu$ because of this undercoverage. The method doesn't ensure that the sampling distribution is centered at $\mu$. Distractor choice B mistakes variability for bias: sample-to-sample variation exists, but it doesn't cause a systematic shift. Mini-lesson: Bias displaces the center of the sampling distribution from the parameter because of flawed sampling and persists regardless of sample size (here 10), whereas variability is spread that shrinks with larger $n$; a larger $n$ cannot eliminate bias. Thus, the estimate may be biased.
A researcher wants a point estimate of the population mean daily screen time $\mu$ for all adults in a state. She selects a simple random sample of 100 adults from the state and computes the sample mean $\bar{x}$. She then reports the point estimate $2\bar{x}$ (twice the sample mean) to "account for multitasking." Which statement describes whether $2\bar{x}$ is biased for $\mu$?
The estimate is unbiased because multiplying by 2 reduces sampling variability.
The estimate is biased high because its expected value is $2\mu$ rather than $\mu$.
The estimate is biased, but the bias disappears for $n=100$.
The estimate is unbiased because $\bar{x}$ is unbiased and scaling does not change bias.
The estimate is unbiased only when $\mu=0$.
Explanation
This question tests identifying bias in modified estimators, such as using the scaled sample mean $2\bar{x}$ to estimate $\mu$. Multiplying $\bar{x}$ by 2 scales the expected value to $2\mu$, which is larger than $\mu$ because screen time is positive; over many SRSs, the long-run average would be $2\mu$, not $\mu$, because of this arbitrary adjustment. The center of the sampling distribution is scaled by the same factor. Distractor choice D incorrectly states that scaling doesn't change bias, but it does whenever the factor isn't 1. Mini-lesson: Bias is a systematic deviation of the estimator's expected value from the parameter and is unaffected by the sample size $n$, whereas variability measures spread and shrinks with larger $n$. Here, $n = 100$ lowers variability, but the bias remains $2\mu - \mu = \mu$. Thus, $2\bar{x}$ is biased high.
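A simulation can show both halves of the mini-lesson at once: the center of $2\bar{x}$ stays at $2\mu$ while the spread shrinks as $n$ grows. This is a sketch under invented assumptions; the population of screen times and the sample sizes other than 100 are hypothetical.

```python
import random
import statistics

rng = random.Random(3)
# Hypothetical state: daily screen time in hours (invented for illustration).
state = [rng.uniform(1, 9) for _ in range(20000)]
mu = statistics.fmean(state)

def doubled_mean_summary(n, reps=1000):
    """Return (average, standard deviation) of 2 * x-bar over `reps` SRSs of size n."""
    estimates = [2 * statistics.fmean(rng.sample(state, n)) for _ in range(reps)]
    return statistics.fmean(estimates), statistics.stdev(estimates)

results = {n: doubled_mean_summary(n) for n in (25, 100, 400)}
for n, (center, spread) in results.items():
    print(f"n = {n:3d}: average of 2*xbar = {center:.2f} (mu = {mu:.2f}), sd = {spread:.2f}")
```

The standard deviation falls as $n$ increases, but the average of the estimates stays near $2\mu$ for every sample size.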
A city wants to estimate the population proportion $p$ of households that have a working smoke detector. Inspectors visit a convenience sample of 120 households by stopping at homes on two nearby streets and record whether each home has a working detector. The point estimate is the sample proportion $\hat{p}$, calculated as (number of sampled homes with working detectors)/120. Which statement describes whether the estimate is biased?
The estimate is biased because using a convenience sample can systematically overestimate or underestimate $p$.
The estimate is unbiased because $\hat{p}$ is always an unbiased estimator of $p$ regardless of how the sample is selected.
The estimate is biased because $\hat{p}$ varies from sample to sample, so it cannot be unbiased.
The estimate is unbiased because the sample size is large, which eliminates bias.
The estimate is unbiased only if the sample proportion equals the true proportion in the population.
Explanation
This question addresses bias from non-random sampling methods. A convenience sample, where inspectors only visit homes on two nearby streets, is not representative of all households in the city. Homes on these particular streets might have systematically different smoke detector rates than the city as a whole: perhaps they're in a newer neighborhood with stricter building codes, or an older area where detectors are less common. This sampling method can cause the sample proportion $\hat{p}$ to systematically over- or underestimate the true population proportion $p$ in the long run. Choice B is wrong because the sampling method matters greatly for bias. Choice D incorrectly claims that large samples eliminate bias, but no sample size can fix the fundamental problem of non-representative sampling. Bias is about the sampling method's systematic tendency, not sample-to-sample variability.
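The contrast between a convenience sample and an SRS can be sketched with a made-up city. All specifics are assumptions for illustration: 40 streets of 60 homes each, with detector rates that rise steadily from the first street to the last, and inspectors who always visit the first two streets.

```python
import random
import statistics

rng = random.Random(11)
# Hypothetical city: the detector rate rises from 40% on street 0 to 95% on street 39.
streets = []
for i in range(40):
    rate = 0.40 + 0.55 * i / 39
    with_detector = round(rate * 60)
    streets.append([1] * with_detector + [0] * (60 - with_detector))

all_homes = [home for street in streets for home in street]
p = statistics.fmean(all_homes)

# Convenience sample: every home on the two streets nearest the inspectors (120 homes).
p_hat_convenience = statistics.fmean(streets[0] + streets[1])

# SRS of 120 homes from the whole city, repeated many times.
avg_p_hat_srs = statistics.fmean(
    [statistics.fmean(rng.sample(all_homes, 120)) for _ in range(3000)]
)
print(f"p = {p:.3f}")
print(f"convenience p-hat = {p_hat_convenience:.3f}")
print(f"average SRS p-hat = {avg_p_hat_srs:.3f}")
```

In this setup the convenience estimate falls far below $p$ every time the procedure is repeated, while the SRS estimates average out close to $p$. With different streets the convenience sample could just as easily overestimate.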
A farmer wants a point estimate of the population mean weight $\mu$ of all pumpkins harvested from a large field this week. The farmer takes a simple random sample of 30 pumpkins and computes the sample mean $\bar x$ of their weights, then reports $\bar x$ as the point estimate for $\mu$. Which statement describes whether the estimate is biased?
The estimate is biased because $\bar x$ will not equal $\mu$ exactly for most samples.
The estimate is unbiased because $\bar x$ is an unbiased estimator of $\mu$ when the sample is selected at random.
The estimate is biased because random sampling reduces variability but not bias.
The estimate is biased because the sample size is only 30, which is too small to avoid bias.
The estimate is unbiased only if the population distribution of weights is approximately normal.
Explanation
This question asks about bias in estimating a population mean. The sample mean $\bar{x}$ from a simple random sample is an unbiased estimator of the population mean $\mu$, regardless of sample size or population distribution shape. This means that over many random samples, the average of all sample means equals the population mean. Choice A incorrectly suggests that because $\bar{x}$ won't exactly equal $\mu$ for most samples, the estimator is biased, but bias is about long-run average behavior, not individual sample results. The key distinction: an unbiased estimator can still produce estimates that differ from the true parameter value; it just doesn't systematically over- or underestimate on average.
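The distinction between individual misses and long-run bias can be checked numerically. The sketch below uses an invented, right-skewed population of pumpkin weights to show that neither normality nor a large $n$ is needed.

```python
import random
import statistics

rng = random.Random(8)
# Hypothetical field: right-skewed pumpkin weights in pounds (invented for illustration).
field = [rng.lognormvariate(2.5, 0.5) for _ in range(3000)]
mu = statistics.fmean(field)

# Error of the sample mean (x-bar minus mu) for many SRSs of 30 pumpkins.
errors = [statistics.fmean(rng.sample(field, 30)) - mu for _ in range(5000)]
too_high = sum(e > 0 for e in errors)
too_low = sum(e < 0 for e in errors)

print(f"samples with x-bar above mu: {too_high}, below mu: {too_low}")
print(f"average error: {statistics.fmean(errors):.3f}")
```

Individual samples miss $\mu$ in both directions, yet the errors average out to roughly zero, which is the long-run behavior that "unbiased" describes.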