Data Compression

Questions 1 - 10
1

Read the passage. In a streaming-service setting, data compression reduces the number of bits needed to store or transmit information. Compression works by finding patterns and representing them more efficiently, which helps platforms deliver movies, music, and captions quickly over limited bandwidth. Two broad categories appear: lossless compression preserves every original detail so the decompressed data matches exactly, while lossy compression removes some information to achieve smaller files. For text such as subtitles and chat logs, lossless methods are common because even a small change can alter meaning; Huffman coding assigns shorter bit patterns to frequent symbols, and LZW replaces repeated sequences with short codes. For images and video, lossy methods often dominate because the human eye tolerates small changes; JPEG compresses still images by discarding subtle visual details, and MPEG compresses video by combining spatial compression with frame-to-frame prediction. The trade-off is constant: stronger compression usually means smaller files but more noticeable artifacts, so services adjust settings to balance quality and smooth playback.

Based on the text, what trade-offs are involved in using JPEG compression?

It reduces file size but may introduce visible artifacts.

It increases file size to prevent any streaming delays.

It preserves every pixel while greatly shrinking files.

It replaces repeated words with codes for perfect text accuracy.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically the trade-offs involved in lossy compression methods like JPEG. Compression reduces file size; lossless methods do so by removing redundancy, while lossy methods achieve greater reduction by permanently discarding some data that users may not notice. The passage says JPEG compresses still images 'by discarding subtle visual details' and notes that 'stronger compression usually means smaller files but more noticeable artifacts.' The correct choice is 'It reduces file size but may introduce visible artifacts,' because it captures both sides of the JPEG trade-off: reduced file size (benefit) and potential visible artifacts (cost). The choice 'It preserves every pixel while greatly shrinking files' is incorrect because it contradicts the lossy nature of JPEG: a format that discards visual details does not preserve every pixel. To help students: Focus on identifying key terms like 'trade-off' and 'artifacts' in compression contexts. Practice distinguishing between the benefits (smaller files) and costs (quality loss) of different compression methods.
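The size-versus-quality trade-off can be illustrated with a toy quantizer. This is a minimal sketch and not how JPEG actually works: the names `quantize` and `bits_per_sample` are hypothetical, the samples are made-up 8-bit values, and `step` is assumed to be a power of two no larger than 256.

```python
def quantize(samples, step):
    """Lossy step: round each 0-255 sample down to a multiple of `step`."""
    return [(s // step) * step for s in samples]


def bits_per_sample(step):
    """Bits needed to tell the remaining levels apart after quantizing."""
    levels = 256 // step
    return (levels - 1).bit_length()


samples = list(range(256))  # hypothetical data: every possible 8-bit value once

for step in (1, 8, 32):
    approx = quantize(samples, step)
    worst_error = max(abs(a - b) for a, b in zip(samples, approx))
    print(f"step={step:2d}  bits/sample={bits_per_sample(step)}  worst error={worst_error}")
```

A larger step needs fewer bits per sample but lets the worst-case error grow, which is the same direction of trade-off the passage describes for stronger compression.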

2

Read the passage. A media company publishes articles, photos, and videos and uses different compression methods depending on content. Compression reduces the number of bits needed by encoding patterns more efficiently, helping files download faster and cost less to store. Lossless compression preserves exact data and is favored for text, where accuracy matters, while lossy compression discards some information to achieve smaller files and is common for images and video. Huffman coding and LZW are lossless techniques often used for text or general data, while JPEG is a popular lossy format for still images. MPEG is widely used for video compression, often balancing quality against file size. Selecting the right method depends on whether perfect fidelity or smaller size is the priority.

Based on the text, how does Huffman coding differ from JPEG compression?

Huffman increases size; JPEG always eliminates artifacts entirely.

Huffman discards details; JPEG preserves every original bit.

Huffman is a video standard; JPEG compresses text files.

Huffman is lossless for symbols; JPEG is lossy for images.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically comparing the fundamental differences between Huffman coding and JPEG compression. Different compression methods suit different content types: Huffman coding is a lossless technique for text and symbols, while JPEG is a lossy format designed for images. The passage groups Huffman coding with LZW as 'lossless techniques often used for text or general data' and identifies JPEG as 'a popular lossy format for still images.' The correct choice is 'Huffman is lossless for symbols; JPEG is lossy for images,' because it states this distinction exactly as the passage does. The choice 'Huffman discards details; JPEG preserves every original bit' is incorrect because it reverses their characteristics: Huffman preserves data (lossless) while JPEG discards details (lossy). To help students: Create comparison tables showing compression methods grouped by lossless vs. lossy and their typical applications. Emphasize that the choice of compression method depends on both the content type and whether perfect accuracy is required.

3

Read the passage. A school archives student essays and wants to reduce file sizes while preserving exact wording. Data compression helps by finding patterns and encoding them with fewer bits, which saves storage and speeds transfers. For text, lossless compression is essential because changing even one character can alter meaning or grading. Huffman coding is a lossless method that compresses by giving frequent characters shorter codes, and LZW is another lossless method that replaces repeated sequences with dictionary codes. JPEG and MPEG are usually lossy and focus on media like images and video, where small changes may be acceptable. The archive therefore chooses lossless methods to maintain integrity.

What is the primary benefit of using lossless compression for archived essays?

It improves image sharpness by discarding extra pixels.

It ensures the restored text matches the original exactly.

It converts documents into MPEG for easier editing.

It always produces the smallest possible file size.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically why lossless compression is essential for text archival. Compression reduces file sizes, but when archiving text documents like essays, exact accuracy is crucial because even small changes can alter meaning or affect grading. The passage states this directly: 'For text, lossless compression is essential because changing even one character can alter meaning or grading.' The correct choice is 'It ensures the restored text matches the original exactly,' because exact restoration is what maintains the integrity of archived essays. The choice 'It always produces the smallest possible file size' is incorrect because lossless compression does not always produce the smallest files; lossy compression typically achieves greater reduction, at the cost of data loss. To help students: Use examples of how a single character change can alter meaning (e.g., 'not' vs 'now'). Emphasize that academic integrity requires perfect preservation of student work.
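The 'matches the original exactly' property can be checked with a round trip. The sketch below uses Python's standard `zlib` module as a stand-in for whatever lossless method an archive might use; the essay text is invented for illustration.

```python
import zlib

# Hypothetical essay text, repeated so there are patterns to exploit.
essay = ("The committee did not approve the proposal. " * 50).encode("utf-8")

compressed = zlib.compress(essay)
restored = zlib.decompress(compressed)

print("original bytes:  ", len(essay))
print("compressed bytes:", len(compressed))
print("exact match:     ", restored == essay)
```

Because the method is lossless, the comparison holds byte for byte, so a 'not' can never silently become a 'now'.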

4

Read the passage. Streaming services rely on compression to deliver content efficiently across networks with varying speeds. Compression reduces the number of bits sent, which can prevent buffering and lower data usage. Lossless compression preserves exact data and is common for text like subtitles, while lossy compression is common for images and video because it can shrink files much more by discarding information viewers are less likely to notice. JPEG is a familiar lossy format for still images, and MPEG is widely used for video, often combining multiple strategies to reduce size. The main challenge is balancing smaller files against visible artifacts or reduced clarity. Services tune compression levels to maintain acceptable quality while keeping playback smooth.

Why is lossy compression preferred in many streaming video applications?

It is required for subtitles because text can change safely.

It ensures every frame is restored with perfect accuracy.

It works by assigning shorter codes to frequent letters only.

It achieves much smaller files, with tolerable quality loss.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically why lossy compression is preferred for streaming video applications. Compression reduces file sizes to enable efficient streaming, and lossy compression achieves much greater reduction by removing information viewers are unlikely to notice. The passage says lossy compression 'can shrink files much more by discarding information viewers are less likely to notice,' and that sending fewer bits can prevent buffering and keep playback smooth. The correct choice is 'It achieves much smaller files, with tolerable quality loss,' because it captures the essential trade-off the passage describes when it discusses balancing smaller files against visible artifacts. The choice 'It ensures every frame is restored with perfect accuracy' is incorrect because it describes lossless compression; the passage clearly states that lossy compression discards information. To help students: Discuss real-world streaming constraints like bandwidth limitations and why perfect quality isn't always necessary. Use examples of quality settings in streaming platforms to illustrate the compression trade-offs.
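A quick calculation shows why streaming video needs 'much smaller files'. Every number below is an assumption chosen for illustration (a 1920x1080 frame, 24 bits per pixel, 30 frames per second, and a 25 Mbit/s connection); none of them comes from the passage.

```python
# Assumed, illustrative video parameters.
width, height = 1920, 1080
bits_per_pixel = 24
frames_per_second = 30

# Bits per second needed to send every pixel of every frame uncompressed.
raw_bps = width * height * bits_per_pixel * frames_per_second

# Assumed connection speed: 25 megabits per second.
connection_bps = 25_000_000

ratio = raw_bps / connection_bps
print(f"uncompressed: {raw_bps:,} bits per second")
print(f"reduction needed to fit the connection: about {ratio:.1f}x")
```

Under these assumptions the raw stream is roughly 1.5 billion bits per second, so the video must shrink by a factor of about 60 just to fit the connection. Reductions of that size are the kind the passage attributes to lossy methods.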

5

Read the passage. A cloud platform stores millions of customer documents and wants to reduce storage costs without changing any file contents. Data compression helps by representing repeated patterns more efficiently, lowering the number of bits needed to store data. Lossless compression is used when exact recovery matters, such as for text files, code, and spreadsheets, while lossy compression is common for media where small imperfections are acceptable. LZW is a lossless algorithm that builds a dictionary of repeated sequences and substitutes short codes, which can be effective when the same phrases or patterns appear many times. Huffman coding is also lossless, using shorter bit patterns for more frequent symbols. JPEG and MPEG, by contrast, are typically lossy and focus on shrinking images and video.

Which scenario best demonstrates the use of LZW compression?

Shrinking a photo by discarding subtle visual details.

Compressing repetitive server logs without altering any characters.

Reducing video size by predicting changes between frames.

Improving audio clarity by adding extra data to the stream.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically identifying appropriate use cases for LZW compression. Compression reduces file sizes by finding and encoding patterns, and LZW is a lossless method that builds a dictionary of repeated sequences and replaces them with shorter codes. The passage describes LZW as 'a lossless algorithm that builds a dictionary of repeated sequences and substitutes short codes, which can be effective when the same phrases or patterns appear many times.' The correct choice is 'Compressing repetitive server logs without altering any characters,' because server logs typically contain repetitive patterns (timestamps, IP addresses, error messages) that LZW can compress effectively without altering any characters, matching the lossless requirement. The choice 'Shrinking a photo by discarding subtle visual details' is incorrect because it describes lossy compression, while LZW is explicitly lossless. To help students: Provide examples of repetitive data (logs, source code, structured documents) where LZW excels. Emphasize that LZW's dictionary approach works best with repeated patterns.
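The dictionary idea can be sketched in a few lines. This is a minimal teaching version of LZW, not a production codec: it assumes every character has a code point below 256, it outputs a list of integers rather than packed bits, and the log line is invented.

```python
def lzw_compress(text):
    """Return a list of integer codes for `text` (code points must be < 256)."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate            # keep extending a known sequence
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # remember the new sequence
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original text by regrowing the same dictionary."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]  # sequence defined by this very step
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


log = "GET /index.html 200\n" * 20  # hypothetical repetitive server log
codes = lzw_compress(log)
print(len(log), "characters ->", len(codes), "codes")
print("exact match:", lzw_decompress(codes) == log)
```

The repeated log line is stored as progressively longer dictionary entries, so far fewer codes than characters are emitted, and decompression restores every character.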

6

Read the passage. A messaging app compresses text to save bandwidth and storage while keeping messages readable and unchanged. Compression reduces file size by encoding common patterns with fewer bits, and for text this must be lossless so every character is restored exactly. Huffman coding is one lossless technique that assigns shorter bit patterns to characters that appear more often, which can shrink many natural-language messages. LZW is another lossless method that replaces repeated sequences with short codes from a growing dictionary, which can be effective for repetitive logs or structured text. In contrast, JPEG and MPEG are typically lossy and are chosen for images and video where small inaccuracies are acceptable. The app therefore favors lossless algorithms for text to avoid altering meaning.

Based on the text, what is the primary benefit of using Huffman coding?

It increases file size to preserve network reliability.

It assigns shorter codes to frequent symbols to reduce size.

It removes subtle pixels to improve photo realism.

It converts text into MPEG for smoother playback.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically how Huffman coding works as a lossless compression technique. Compression reduces file sizes by encoding information more efficiently, and Huffman coding is a lossless method that assigns variable-length codes based on symbol frequency. The passage describes Huffman coding as 'one lossless technique that assigns shorter bit patterns to characters that appear more often, which can shrink many natural-language messages.' The correct choice is 'It assigns shorter codes to frequent symbols to reduce size,' because it describes Huffman coding's primary benefit exactly as the passage states it. The choice 'It removes subtle pixels to improve photo realism' is incorrect because it describes lossy behavior (removing pixels), while the passage explicitly identifies Huffman coding as lossless. To help students: Use frequency analysis exercises to demonstrate how Huffman coding works. Show examples of common letters (like 'e' in English) getting shorter codes than rare letters (like 'q').
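A frequency-analysis exercise of this kind might look like the sketch below. It is a minimal illustration using Python's standard `heapq` and `collections.Counter`; the function names and the sample message are hypothetical, and the output is a string of '0' and '1' characters rather than packed bits.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Map each character in `text` to a bit string; frequent ones get shorter codes."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a lone symbol still needs one bit
    # Heap entries: (weight, unique tie-breaker, {char: code built so far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent groups
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    reverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # no code is a prefix of another
            out.append(reverse[current])
            current = ""
    return "".join(out)


message = "see the bees flee"
codes = huffman_codes(message)
bits = encode(message, codes)
print(sorted(codes.items(), key=lambda item: len(item[1])))
print(len(bits), "bits vs", 8 * len(message), "bits at 8 bits per character")
print("exact match:", decode(bits, codes) == message)
```

In this message 'e' is the most frequent character, so it receives one of the shortest codes, and decoding recovers the message exactly.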

7

Read the passage. In a text-focused workflow, data compression helps reduce storage and speed up transfers without changing the meaning of documents. Compression works by representing information with fewer bits, often by exploiting repetition or predictable patterns. Lossless compression is essential for text because decompressed output must match the original exactly, while lossy compression intentionally discards some information and is better suited to media where small changes are acceptable. Huffman coding is a common lossless approach that uses shorter bit patterns for more frequent characters, shrinking many text files efficiently. LZW is another lossless method that builds a dictionary of repeated sequences and replaces those sequences with short codes, which can work well for logs and repetitive data. By contrast, JPEG and MPEG are widely used for images and video, where some quality loss is tolerated to achieve much smaller files.

Based on the text, how does lossless compression differ from lossy compression?

Lossless applies only to video; lossy applies only to text.

Lossless exactly preserves data; lossy discards some information.

Lossless always makes files larger; lossy always makes them smaller.

Lossless removes details; lossy preserves every original bit.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically differentiating between lossless and lossy compression methods. Compression reduces file sizes by encoding information more efficiently: lossless compression preserves all original data for perfect reconstruction, while lossy compression permanently removes some information to achieve smaller file sizes. The passage explicitly states that 'Lossless compression is essential for text because decompressed output must match the original exactly, while lossy compression intentionally discards some information.' The correct choice is 'Lossless exactly preserves data; lossy discards some information,' because it reflects this fundamental distinction as directly stated in the passage. The choice 'Lossless removes details; lossy preserves every original bit' is incorrect because it reverses the definitions: lossless preserves details rather than removing them. To help students: Create comparison charts showing lossless vs. lossy characteristics and their typical applications. Emphasize that 'lossless' means 'no loss' of data, making it essential for text where accuracy matters.

8

Read the passage. During backup storage, organizations compress data to reduce the amount of disk space required and to speed up copying large archives. Compression works by encoding information more efficiently, often by finding repeated patterns across files. In this context, lossless compression is preferred because backups must restore data exactly, including program files, spreadsheets, and databases. Huffman coding supports lossless compression by assigning shorter codes to common symbols, while LZW builds a dictionary of repeated sequences so the same patterns can be stored once and referenced many times. Lossy compression, used in formats like JPEG for images and MPEG for video, can shrink files further by discarding subtle details, but that risk is unacceptable for most backups. As a result, backup systems typically prioritize data integrity over maximum size reduction.

Why is lossless compression preferred in backup storage?

It works only for photos, not for documents.

It converts files into MPEG to speed up recovery.

It intentionally removes details to maximize shrinkage.

It guarantees exact restoration of the original data.

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically why lossless compression is critical for certain applications like backup storage. Compression reduces file sizes, but the choice between lossless (preserving all data) and lossy (discarding some data) depends on the application's requirements for data integrity. The passage says lossless compression is preferred for backup storage because 'backups must restore data exactly, including program files, spreadsheets, and databases.' The correct choice is 'It guarantees exact restoration of the original data,' because exact restoration is the reason the passage gives for preferring lossless compression in backups. The choice 'It intentionally removes details to maximize shrinkage' is incorrect because it describes lossy compression's behavior, a risk the passage calls 'unacceptable for most backups.' To help students: Use real-world scenarios to illustrate when data integrity is critical versus when some loss is acceptable. Emphasize that backups serve as insurance policies: they must perfectly restore data when needed.
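One way a backup tool might confirm exact restoration is to store a checksum next to each compressed file. The sketch below is an assumption-laden illustration, not a description of any real backup product: `back_up` and `restore` are hypothetical helpers, the file contents are invented, and `zlib` and SHA-256 stand in for whatever compression and hashing a real system uses.

```python
import hashlib
import zlib


def back_up(files):
    """Compress each file and record a SHA-256 digest of the original bytes."""
    return {
        name: (zlib.compress(data), hashlib.sha256(data).hexdigest())
        for name, data in files.items()
    }


def restore(archive):
    """Decompress each file and refuse any whose digest no longer matches."""
    restored = {}
    for name, (blob, digest) in archive.items():
        data = zlib.decompress(blob)
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"{name} failed its integrity check")
        restored[name] = data
    return restored


# Hypothetical files to protect.
files = {
    "ledger.csv": b"id,amount\n" + b"1,100.00\n" * 200,
    "notes.txt": "Quarterly totals reviewed.".encode("utf-8"),
}

archive = back_up(files)
print("restored exactly:", restore(archive) == files)
```

A lossy method could not pass this check, because any discarded detail would change the restored bytes and therefore the digest.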

9

Based on the text, how does lossless compression differ from lossy compression?

Lossless preserves all information; lossy may remove some detail

Lossless always makes files larger; lossy always makes them smaller

Lossless discards details; lossy preserves every original bit

Lossless works only for video; lossy works only for text

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically differentiating between lossless and lossy compression methods. Lossless compression reduces file size by removing redundancy without losing any original information, while lossy compression achieves greater size reduction by permanently discarding some data deemed less important. The fundamental distinction is that lossless compression allows perfect reconstruction of the original data, whereas lossy compression results in an approximation of the original. The correct choice is 'Lossless preserves all information; lossy may remove some detail,' because it captures the essential difference between these compression types. The choice 'Lossless discards details; lossy preserves every original bit' is incorrect because it reverses the definitions: lossless preserves details while lossy discards them, not the other way around. To help students: Create visual comparisons showing how the same file looks after lossless versus lossy compression. Emphasize that the choice between methods depends on whether perfect accuracy or smaller file size is more important for the specific use case.

10

Based on the text, why is lossless compression preferred in backup storage?

It preserves exact data so restored files match originals

It works only on images, making documents easier to search

It removes unneeded pixels, creating smaller but blurrier archives

It converts backups into streaming formats for faster playback

Explanation

This question tests AP Computer Science Principles skills in understanding data compression, specifically why lossless compression is critical for backup storage applications. Backup systems require absolute data integrity because users need to restore files exactly as they were originally saved, without any loss of information or corruption. Lossless compression achieves file size reduction through techniques like pattern recognition and redundancy removal while maintaining the ability to perfectly reconstruct the original data. The correct choice is 'It preserves exact data so restored files match originals,' because it identifies the key requirement for backups: restored files must match the originals perfectly. The choice 'It removes unneeded pixels, creating smaller but blurrier archives' is incorrect because it describes lossy compression characteristics (removing pixels, creating blur), which would be unacceptable for backup systems where data integrity is paramount. To help students: Discuss real-world scenarios where data loss would be catastrophic (financial records, medical data, source code) to illustrate why backups must use lossless compression. Emphasize that while lossy compression has valid uses, backup storage is not one of them.
