Parallel and Distributed Computing

Help Questions

AP Computer Science Principles › Parallel and Distributed Computing

Questions 1 - 10
1

Refer to the text: in genome sequencing, distributed nodes communicate via network messages, unlike tightly coupled parallel processors. What is a key difference between parallel and distributed computing?

Parallel computing is defined as any computing that uses electricity, unlike distributed computing.

Parallel systems communicate only through the public internet, while distributed systems never communicate.

Distributed systems cannot scale, while parallel systems scale indefinitely without coordination.

Distributed systems usually require message passing, while parallel systems often use shared memory or fast links.

Explanation

This question tests understanding of communication differences between parallel and distributed computing architectures. Parallel computing typically uses shared memory or fast interconnects for processor communication within a single system, while distributed computing relies on network message passing between separate nodes. In this passage, genome sequencing explicitly contrasts distributed nodes communicating via network messages with tightly coupled parallel processors. The correct choice identifies exactly this distinction: distributed systems usually require message passing, while parallel systems often use shared memory or fast links. The choice claiming parallel systems communicate only through the public internet is incorrect because it makes false claims about communication methods: parallel systems do not use the public internet for internal communication, and distributed systems must communicate to function at all. To help students: Create comparison charts showing communication methods for each architecture. Emphasize that communication method follows from physical architecture: close processors can share memory, distant nodes cannot. Watch for: oversimplification of communication patterns or absolute statements about what each system can or cannot do.
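The shared-memory versus message-passing contrast can be sketched in a few lines. This is a simplification: both styles run as threads on one machine, a queue stands in for the network, and the read strings are made up for illustration. Two workers count reads either by updating one shared dictionary under a lock, or by sending private tallies as messages that are merged afterward.

```python
import threading
import queue

# Shared-memory style (parallel computing): worker threads update one
# dictionary directly, synchronizing with a lock.
shared_counts = {}
lock = threading.Lock()

def count_shared(reads):
    for r in reads:
        with lock:
            shared_counts[r] = shared_counts.get(r, 0) + 1

# Message-passing style (distributed computing): each worker keeps a
# private tally and sends it as a message; only messages cross the boundary.
def count_messaging(reads, outbox):
    local = {}
    for r in reads:
        local[r] = local.get(r, 0) + 1
    outbox.put(local)

reads = ["ACGT", "ACGT", "TTAG"]   # made-up reads for illustration
half = len(reads) // 2

# Run the shared-memory version on two threads.
t1 = threading.Thread(target=count_shared, args=(reads[:half],))
t2 = threading.Thread(target=count_shared, args=(reads[half:],))
t1.start(); t2.start(); t1.join(); t2.join()

# Run the message-passing version, then merge the two messages.
outbox = queue.Queue()
w1 = threading.Thread(target=count_messaging, args=(reads[:half], outbox))
w2 = threading.Thread(target=count_messaging, args=(reads[half:], outbox))
w1.start(); w2.start(); w1.join(); w2.join()

merged = {}
for _ in range(2):
    for read, n in outbox.get().items():
        merged[read] = merged.get(read, 0) + n

print(shared_counts)   # both styles reach the same totals
print(merged)
```

In a real distributed system the queue would be a network channel between separate machines, but the shape of the code stays the same: local work, then an explicit merge of messages.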

2

Refer to the text: genome sequencing pipelines may distribute data across nodes to handle massive read volumes. Which scenario best exemplifies distributed computing?

A single GPU renders one image faster by using many cores on one card.

One laptop increases speed by overclocking its single processor.

A spreadsheet sorts a small list using one thread to avoid coordination.

Multiple lab servers share read alignment tasks and exchange results over a network.

Explanation

This question tests understanding of distributed computing through example identification, specifically recognizing scenarios with multiple networked systems. Distributed computing involves multiple independent computer systems working together over a network to solve a problem or provide a service. In this passage, genome sequencing pipelines distribute data across nodes to handle massive volumes. The correct choice clearly exemplifies distributed computing: multiple lab servers (separate systems) sharing read alignment tasks and exchanging results over a network. The choice describing a single GPU with many cores is incorrect because it represents parallel computing within one device, not distributed computing across multiple systems. To help students: Emphasize the key identifier of distributed computing: multiple separate computer systems connected by a network. Practice categorizing scenarios based on whether they involve one system or multiple networked systems. Watch for: confusion between many processors in one system (parallel) versus many systems working together (distributed).

3

Refer to the text: genome sequencing uses parallel computing for rapid per-read analysis and distributed computing for cluster-wide throughput. What is a key difference between parallel and distributed computing?

Parallel computing is inherently fault tolerant, while distributed computing fails whenever one node fails.

Parallel computing cannot be used in science, while distributed computing is only for science.

Parallel computing depends on geographic separation, while distributed computing requires one shared cache.

Parallel computing coordinates processors within a single system, while distributed computing coordinates multiple nodes over a network.

Explanation

This question tests understanding of the fundamental distinction between parallel and distributed computing architectures. Parallel computing coordinates multiple processors within a single computer system, typically sharing memory or connected by fast internal links, while distributed computing coordinates multiple independent computer systems (nodes) connected over a network. In this passage, genome sequencing uses parallel computing for rapid per-read analysis on one system and distributed computing for cluster-wide throughput across multiple systems. The correct choice states exactly this architectural difference: parallel computing coordinates processors within a single system, while distributed computing coordinates multiple nodes over a network. The choice claiming parallel computing is inherently fault tolerant is incorrect because it reverses the fault tolerance characteristics: distributed computing is generally more fault tolerant than parallel computing, since independent nodes can take over when one fails. To help students: Always start with the physical architecture distinction: one system versus multiple systems. Create clear visual representations showing the boundary of a single system versus multiple networked systems. Watch for: misconceptions about which architecture provides better fault tolerance or assumptions about application domains.

4

Refer to the text on genome sequencing: parallel computing accelerates read mapping by dividing independent read batches among processors. How does parallel computing improve processing speed?

By reducing internet latency between hospitals sharing patient records.

By replicating the same task on many nodes to prevent any incorrect result.

By always lowering costs through fewer processors and less memory usage.

By dividing read batches into concurrent tasks executed by multiple processors.

Explanation

This question tests understanding of parallel computing's speed benefits, specifically how task division improves performance. Parallel computing achieves speedup by breaking a large task into smaller, independent subtasks that can be executed simultaneously on multiple processors. In this passage, genome sequencing demonstrates parallel computing by dividing read batches among processors for concurrent mapping. The correct choice describes this process accurately: dividing read batches into concurrent tasks executed by multiple processors directly increases processing speed. The choice about replicating the same task on many nodes is incorrect because it describes redundancy for error checking, not speed improvement through parallelization. To help students: Emphasize that parallel computing's primary speed benefit comes from simultaneous execution of independent tasks. Use analogies like multiple workers painting different walls simultaneously. Watch for: confusion between parallelization for speed versus replication for reliability.

5

Refer to the text: Genome sequencing pipelines split read alignment into many independent chunks, then merge partial matches; processors must synchronize to avoid duplicate counting. How does parallel computing improve processing speed?

By ensuring node failures never occur, so no time is lost to recovery procedures.

By dividing alignment work into chunks processed simultaneously, then merging results efficiently.

By moving data to distant nodes over the internet, which always accelerates computation.

By replacing algorithmic steps with manual verification to increase accuracy and speed.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically how parallel computing achieves speed improvements in genome sequencing pipelines. Parallel computing divides a single task into smaller subtasks that can be processed simultaneously by multiple processors, then combines the results to complete the original task faster. In this passage, parallel computing is illustrated through genome sequencing pipelines that split read alignment into many independent chunks processed simultaneously, with processors synchronizing to avoid duplicate counting when merging results. The correct choice describes this accurately: dividing alignment work into chunks processed simultaneously, then merging results efficiently, is the fundamental speedup mechanism of parallel computing. The choice about moving data to distant nodes over the internet is incorrect because it relates to distributed computing and introduces network latency rather than improving speed. To help students: Use the analogy of multiple workers assembling parts of a product simultaneously versus one worker doing everything sequentially. Demonstrate speedup calculations showing how parallel processing reduces time. Watch for: confusion between parallel speedup (simultaneous processing) and distributed computing characteristics.
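The split/process/merge pattern can be sketched with Python's standard library. Everything here is illustrative: the "alignment" is just a motif-containment check, the read strings and worker count are made up, and a thread pool stands in for real parallel processors.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-chunk "alignment": count reads containing a motif.
def align_chunk(chunk, motif="ACG"):
    return sum(1 for read in chunk if motif in read)

def parallel_align(reads, workers=4):
    # Split the reads into independent chunks.
    size = max(1, len(reads) // workers)
    chunks = [reads[i:i + size] for i in range(0, len(reads), size)]
    # Process chunks concurrently, then merge the partial counts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(align_chunk, chunks))
    return sum(partials)   # merge step: each read counted exactly once

reads = ["ACGT", "GGCC", "TACG", "ACGA", "TTTT"]
print(parallel_align(reads))   # 3 reads contain "ACG"
```

Because the chunks never overlap, the merge is a simple sum and no read is double-counted; in pipelines where chunks could overlap, this merge step is exactly where the synchronization cost mentioned in the passage appears.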

6

Based on the passage: A lab uses a supercomputer where many processors share fast internal links for genome alignment; another lab uses a cluster of separate machines that exchange messages and replicate data. Which scenario best exemplifies distributed computing?

A faster algorithm alone replaces computation, removing the need for additional hardware.

Multiple networked nodes store read subsets, exchange messages, and continue if one node fails.

A single processor runs all alignments sequentially to avoid synchronization overhead.

One computer splits alignment across its processors and merges results through shared coordination.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically identifying which scenario exemplifies distributed computing versus parallel computing. Distributed computing involves multiple independent computers (nodes) connected via network, each with its own processor and memory, communicating through message passing and providing fault tolerance. In this passage, two scenarios are presented: a supercomputer with processors sharing fast internal links (parallel) and a cluster of separate machines exchanging messages with data replication (distributed). The correct choice describes distributed computing accurately: multiple networked nodes that store read subsets, exchange messages, and can continue operating if one node fails, which are the defining characteristics of distributed systems. The choice describing one computer splitting alignment across its processors with shared coordination is incorrect because that is parallel computing within a single machine. To help students: Emphasize the physical separation of machines in distributed computing versus multiple processors in one machine for parallel computing. Use diagrams showing network connections between separate computers versus internal processor connections. Watch for: students focusing on task division rather than system architecture when distinguishing between parallel and distributed computing.

7

Based on the passage: Distributed genome workflows replicate data across nodes so alignments can resume after failures; parallel workflows focus on coordinated processors within one machine. How does distributed computing enhance fault tolerance?

By preventing failures through faster processors, so recovery mechanisms are unnecessary.

By using a single shared disk, ensuring all nodes depend on one storage device.

By duplicating data and reassigning tasks when a node becomes unavailable.

By merging partial results more frequently, which guarantees nodes cannot crash mid-task.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically how distributed computing achieves fault tolerance through redundancy and task reassignment. Distributed computing's fault tolerance comes from data replication across independent nodes and the ability to reassign work when nodes fail, unlike parallel computing, which typically operates within a single failure domain. In this passage, distributed genome workflows are shown to replicate data across nodes so alignments can resume after failures, contrasting with parallel workflows that focus on coordinated processors within one machine. The correct choice describes the core mechanisms of fault tolerance in distributed systems: duplicating data and reassigning tasks when a node becomes unavailable. The choice about using a single shared disk is incorrect because a single shared disk creates a single point of failure, eliminating fault tolerance rather than enhancing it, the opposite of distributed design. To help students: Explain redundancy concepts using examples like RAID arrays or backup systems. Demonstrate how task reassignment works when nodes fail in distributed systems. Watch for: students confusing fault tolerance mechanisms with performance optimization or thinking shared resources improve fault tolerance.
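The replicate-and-reassign idea can be sketched as a toy scheduler. All names here are hypothetical (node_a, align_chunk_1, and the two-replica placement are invented for illustration): each chunk of work lives on two nodes, and when one node fails, its chunks are picked up by a surviving replica.

```python
def run_with_failover(tasks, failed_nodes):
    """tasks: {task_name: [replica_nodes]}. Returns which node runs each task,
    skipping any node in failed_nodes."""
    assignment = {}
    for task, replicas in tasks.items():
        # Reassign to the first replica that is still alive.
        alive = [n for n in replicas if n not in failed_nodes]
        if not alive:
            raise RuntimeError(f"all replicas holding {task} are down")
        assignment[task] = alive[0]
    return assignment

# Each chunk of alignment work is replicated on two nodes.
tasks = {
    "align_chunk_1": ["node_a", "node_b"],
    "align_chunk_2": ["node_b", "node_c"],
    "align_chunk_3": ["node_c", "node_a"],
}

# node_b fails mid-run; its chunks shift to the surviving replicas.
placement = run_with_failover(tasks, failed_nodes={"node_b"})
print(placement)
```

A real cluster scheduler also detects failures and re-copies data to restore the replication level, but the core idea is the same: no task has only one home, so losing one node loses no work permanently.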

8

Based on the passage: Parallel genome alignment uses many processors in one system with tight coordination; distributed systems use separate nodes communicating over a network and can keep running during node failures. What is a key difference between parallel and distributed computing?

Parallel eliminates communication needs; distributed eliminates the need for task distribution.

Parallel relies on local processor coordination; distributed relies on message-based network communication.

Parallel exists only for small data sets; distributed exists only for large data sets.

Parallel uses networked nodes; distributed uses shared memory inside one machine.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically the key architectural differences between these two computing paradigms. Parallel computing uses multiple processors within one system with shared memory and tight coordination, while distributed computing uses separate networked nodes that communicate through message passing. In this passage, the distinction is illustrated through genome alignment using many processors in one system with tight coordination (parallel) versus separate nodes communicating over a network with fault tolerance capabilities (distributed). The correct choice identifies that parallel computing relies on local processor coordination within one machine, while distributed computing relies on message-based network communication between separate nodes. The choice claiming parallel uses networked nodes while distributed uses shared memory is incorrect because it reverses the characteristics: parallel computing uses shared memory inside one machine, and this reversal is a common student misconception. To help students: Create comparison charts showing parallel (one machine, multiple processors, shared memory) versus distributed (multiple machines, network communication, message passing). Practice categorizing real computing scenarios. Watch for: students reversing the characteristics or thinking the difference is only about data size.

9

Refer to the text: In parallel genome alignment, processors must coordinate when merging partial matches; excessive coordination can reduce speed gains. Which statement best captures a limitation implied by the passage?

Fault tolerance is irrelevant in genomics because hardware failures never occur in practice.

Distributed computing eliminates all communication delays by using shared memory across nodes.

Communication and synchronization overhead can constrain parallel speedup as processors increase.

Parallel computing cannot run genome sequencing because it never allows task division.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically the limitations of parallel computing due to coordination overhead. Parallel computing's speedup is limited by the need for processors to communicate and synchronize, which becomes more significant as the number of processors increases, following Amdahl's Law. In this passage, this limitation is illustrated through parallel genome alignment, where processors must coordinate when merging partial matches and excessive coordination can reduce speed gains. The correct choice captures that communication and synchronization overhead can constrain parallel speedup as processors increase, a fundamental limitation of parallel computing. The choice claiming parallel computing cannot run genome sequencing or allow task division is incorrect because it contradicts the passage, which explicitly describes parallel genome alignment through task division. To help students: Introduce Amdahl's Law mathematically and show how even small sequential portions limit speedup. Use examples where adding more processors provides diminishing returns. Watch for: students thinking parallel computing has no limitations or misunderstanding that coordination overhead increases with processor count.
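Amdahl's Law can be demonstrated in a few lines. With parallel fraction p on n processors, the speedup is 1 / ((1 - p) + p/n); the 95% figure below is an arbitrary example value, chosen to show that even a 5% sequential portion caps speedup at 20x no matter how many processors are added.

```python
def amdahl_speedup(p, n):
    """Speedup of a task whose parallelizable fraction is p, run on n
    processors: 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - p) + p / n)

# 95% parallelizable: returns diminish sharply as processors grow,
# approaching but never reaching the 1 / (1 - 0.95) = 20x ceiling.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 1))   # 1.9, 5.9, 15.4, 19.6
```

The merge-and-synchronize costs in the passage behave like the sequential fraction (1 - p) here: they do not shrink as processors are added, so they eventually dominate.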

10

Refer to the text: In genome sequencing, distributed computing stores reads across networked nodes and uses message passing; if one node fails, other nodes continue and data can be re-copied. How does distributed computing enhance fault tolerance?

By guaranteeing perfect accuracy in every alignment, eliminating the impact of hardware faults.

By requiring constant shared-memory access, preventing any single component from failing.

By replicating data and rerouting tasks so other nodes continue when one node fails.

By keeping computation on one machine so failures cannot spread across a network.

Explanation

This question tests understanding of parallel and distributed computing concepts, specifically how distributed computing provides fault tolerance in genome sequencing applications. Distributed computing involves multiple independent systems connected via network that can continue operating even when individual nodes fail, unlike parallel computing, which typically operates within a single system. In this passage, distributed computing is illustrated through genome sequencing that stores reads across networked nodes using message passing, with the ability to re-copy data and continue when nodes fail. The correct choice describes the fundamental fault-tolerance mechanism of distributed systems: replicating data and rerouting tasks so other nodes can continue when one node fails. The choice about requiring constant shared-memory access is incorrect because shared memory is characteristic of parallel computing within a single system, not distributed computing across networked nodes. To help students: Emphasize that fault tolerance in distributed systems comes from redundancy and independence of nodes. Use real-world examples like cloud storage services that continue working even when servers fail. Watch for: confusion between parallel computing's shared memory and distributed computing's message passing architectures.
