Decomposition and Challenges in Parallel Programming: Is It Useful for Cloud Computing?
A recent article from Dr. Dobb's introduced "Fundamental Concepts of Parallel Programming." Richard Gerber and Andrew Binstock, authors of Programming with Hyper-Threading Technology, discussed three forms of decomposition for multi-threading:
- Functional decomposition. It’s one of the most common ways to achieve parallel execution. Using this approach, individual tasks are catalogued. If two of them can run concurrently, they are scheduled to do so by the developer.
- Producer/Consumer. A form of functional decomposition in which one thread's output is the input to a second. This pattern can be hard to avoid, but it is frequently detrimental to performance.
- Data decomposition, a.k.a. "data level parallelism." It breaks down tasks by the data they work on, rather than by the nature of the task. Programs that are broken down via data decomposition generally have many threads performing the same work, just on different data items.
To make the three forms easy to understand, the authors used gardening as an analogy, where the threads map to gardeners. For example, functional decomposition in gardening would have one gardener mow the lawn while another weeds. I find this analogy very intuitive and easy to follow: even if you don't know multi-threading, you can work it out from the gardening analogy.
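The producer/consumer form can be sketched in a few lines of Python, staying with the gardening analogy. This is a minimal illustration, not code from the article: the `mow` and `bag` task names are made up, and a sentinel object marks the end of production.

```python
import queue
import threading

# Producer/consumer decomposition: one "gardener" thread mows and produces
# clippings; a second thread bags whatever the first produces.
clippings = queue.Queue()
DONE = object()          # sentinel telling the consumer to stop
bagged = []

def mow():
    # Producer: the output of this thread...
    for section in range(5):
        clippings.put(f"section {section}")
    clippings.put(DONE)

def bag():
    # ...is the input to this one.
    while True:
        item = clippings.get()
        if item is DONE:
            break
        bagged.append(item)

producer = threading.Thread(target=mow)
consumer = threading.Thread(target=bag)
producer.start(); consumer.start()
producer.join(); consumer.join()
print(bagged)  # all five batches of clippings, in put order
```

Note how the queue couples the two threads: the consumer can never run ahead of the producer, which is exactly why this form is "frequently detrimental to performance."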
The challenges in working with multi-threading are:
- Synchronization is the process by which two or more threads coordinate their activities. For example, one thread waits for another to finish a task before continuing.
- Resource limitations refer to the inability of threads to work concurrently due to the constraints of a needed resource. For example, a hard drive can only read from one physical location at a time, which deprives threads of the ability to use the drive in parallel.
- Load balancing refers to the distribution of work across multiple threads so that they all perform roughly the same amount of work.
- Scalability is the challenge of making efficient use of a larger number of threads when software is run on more-capable systems. For example, if a program is written to make good use of four processors, will it scale properly when run on a system with eight processors?
In cloud computing, we still need multi-threading, and the above discussion about multi-threading still holds.
More often than not, we need to think about parallelism in a bigger scope – multiple processes, multiple applications, multiple machine instances, and multiple data centers. Although the scope and granularity are bigger, the above basic forms of decomposition and challenges are still helpful for architecting applications for the cloud.
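Data decomposition, in particular, translates directly to larger scopes: the same code can be parameterized by worker count, which is the scalability question in miniature. A small sketch using Python's `concurrent.futures` (the weed-counting task is a made-up stand-in):

```python
from concurrent.futures import ThreadPoolExecutor

# Data decomposition: every worker performs the same task on a different
# data item (here, a different garden plot).
def count_weeds(plot):
    return sum(1 for plant in plot if plant == "weed")

plots = [["weed", "rose"], ["weed", "weed"], ["tulip"], ["weed"]]

# max_workers is a parameter, so the same code can scale from 4 workers
# to 8 without changes -- swap in ProcessPoolExecutor (or a cluster of
# machine instances) and the decomposition itself stays identical.
with ThreadPoolExecutor(max_workers=4) as pool:
    totals = list(pool.map(count_weeds, plots))

print(sum(totals))  # total weeds across all plots
```

Whether the workers are threads, processes, or whole machine instances, the shape of the problem is the same; only the granularity changes.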
What do you think?