31 October 2011

Distributed Computing on Mobile Devices

Since my last year as an undergraduate, I have been enamored of distributed computing and the ingenuity with which it leverages standard desktop PCs to churn through vast amounts of information and solve incredibly difficult computing problems. In recent years, computer scientists and scientific researchers have needed powerful supercomputing clusters in order to process huge sets of data. By clustering smaller, cheaper computers together, these researchers were able to build faster, cheaper and more reliable supercomputers than their older, larger and more specialized mainframe ancestors. More recently, scientists have begun using distributed computing as a means to further reduce cost while increasing processing power by letting members of the general populace donate idle CPU time to crunching sets of scientific data.

Distributed computing is the ability to distribute discrete pieces of a large computational problem among a large number of smaller processors, which, in turn, return processed data for evaluation or further processing. It is, in a sense, a method to asynchronously solve complex analytical and computationally intensive problems - a way of taking "Divide-and-Conquer" to massively distributed proportions. Typically in a distributed computing cluster, a central site breaks a computationally exhausting problem into smaller chunks that more modest hardware can process. Idle systems then request some of these small chunks from the central server to work on. The systems process the data into smaller, more meaningful pieces of information based on a prescribed set of rules and finally return the results to the central site.
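The request, process and return cycle described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: everything runs in one process instead of over a network, the "problem" is a toy sum of squares, and the names `CentralSite`, `process_chunk` and `run_worker` are hypothetical and not taken from any real project.

```python
from collections import deque


def split_into_chunks(data, chunk_size):
    """Break a large list into consecutive chunks of at most chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class CentralSite:
    """Hypothetical stand-in for the central server that hands out work."""

    def __init__(self, data, chunk_size):
        # Each pending entry is (chunk_id, chunk) so results can be matched up.
        self.pending = deque(enumerate(split_into_chunks(data, chunk_size)))
        self.results = {}

    def request_chunk(self):
        """Give an idle system the next chunk, or None when no work is left."""
        return self.pending.popleft() if self.pending else None

    def submit_result(self, chunk_id, result):
        """Store the processed result returned by a worker."""
        self.results[chunk_id] = result

    def combine(self):
        """Merge the partial results into the final answer."""
        return sum(self.results.values())


def process_chunk(chunk):
    """The 'prescribed set of rules': here, a toy sum of squares."""
    return sum(x * x for x in chunk)


def run_worker(site):
    """One idle system: request chunks until none remain; return how many it did."""
    processed = 0
    while True:
        work = site.request_chunk()
        if work is None:
            return processed
        chunk_id, chunk = work
        site.submit_result(chunk_id, process_chunk(chunk))
        processed += 1


site = CentralSite(list(range(10)), chunk_size=3)
chunks_done = run_worker(site)
total = site.combine()
```

In a real deployment the two method calls on `site` would be network requests and many workers would run at once, but the division of labour would be the same.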
Distributed computing has seen a lot of success, especially in research projects such as SETI@Home and Folding@Home, two of the most popular distributed computing projects. It is also used in non-academic contexts, such as Google's search infrastructure or other MapReduce systems such as Hadoop, where the pool of machines is privately owned rather than volunteered. However, desktop distributed computing is now being threatened by advancements and shifts in general computer-use paradigms. More and more users are shifting to low-power portable platforms such as laptops and other mobile devices. These devices rely on servers to do the large majority of heavy-processing work, which lets them maintain lower loads for the sake of battery longevity, cheaper components and overall responsiveness. Compounded by a growing desire to reduce energy use and cost, this means desktop platforms are now relied upon almost exclusively for high-load work, leaving little capacity for voluntary distributed computing tasks. Thus, due to changes in computer and energy use, distributed computing is losing a majority of its computing base. But I believe that there may be some hope.
In the last few years, smartphones have seen a huge surge in growth globally as their adoption has increased dramatically among the general populace. This dramatic growth can be largely attributed to the increasing affordability of these devices and huge increases in their technological power. In fact, in the last few months, smartphones such as the iPhone and a multitude of Android devices have become much more powerful, with higher processor speeds and multiple cores as well as hugely expanded memories. These devices are becoming roughly equivalent to desktop PCs from five years ago. However, like the powerful PCs before them, these devices' processors are largely underutilized, as they lie idle the majority of the time they are on, particularly while they are sitting on one's nightstand charging.
Much of the groundwork for allowing some form of background processing has been available in mobile device operating systems for some time now. Therefore, it would not be too much of a stretch to allow scientific number-crunching applications to be installed on a mobile device and begin processing during the device's idle time - a sort of mobile edition of Folding@Home. The application could be set to activate only while the device's screen has been off for a certain duration - a great indicator that the device is idle. Some would be concerned about the effect of this increased processor use on battery life. This too could be addressed by activating the application only when the device is plugged into utility power. In fact, the application could require a number of conditions to be true before processing, such as location, power and screen state, network connectivity type and state, time of day, and many more. Combine this with the ability to let the user choose which factors trigger processing, and the application could run almost autonomously for extremely long periods, providing scientists with a small but valuable computing resource to crunch their data.
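To make the idea of user-selectable triggers a little more concrete, here is a minimal sketch in Python. It does not use any real mobile operating system API; `DeviceState`, `Policy` and `should_process` are hypothetical names, and the default thresholds (ten minutes of screen-off time, a 22:00 to 06:00 window) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Snapshot of the conditions the app would read from the device (hypothetical)."""
    screen_off_seconds: int  # how long the screen has been off
    plugged_in: bool         # connected to utility power
    on_wifi: bool            # connected via Wi-Fi rather than a cellular network
    hour: int                # local hour of day, 0-23


@dataclass
class Policy:
    """User-adjustable rules; every default here is an assumed example value."""
    min_screen_off_seconds: int = 600
    require_power: bool = True
    require_wifi: bool = True
    allowed_hours: tuple = (22, 6)  # start inclusive, end exclusive, may wrap midnight


def hour_allowed(hour, window):
    """Return True if hour falls inside the window, handling wrap past midnight."""
    start, end = window
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


def should_process(state, policy):
    """Process only when every condition the user has enabled is satisfied."""
    if state.screen_off_seconds < policy.min_screen_off_seconds:
        return False
    if policy.require_power and not state.plugged_in:
        return False
    if policy.require_wifi and not state.on_wifi:
        return False
    return hour_allowed(state.hour, policy.allowed_hours)


# A phone charging on the nightstand at 2 a.m. qualifies under the defaults.
overnight = DeviceState(screen_off_seconds=1200, plugged_in=True, on_wifi=True, hour=2)
ok_to_run = should_process(overnight, Policy())
```

A user who does not mind battery use could relax a single rule, for example `Policy(require_power=False)`, without touching the others.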
Of course, these devices would not process information at nearly the same speed as their desktop counterparts, but where their individual power is their weakness, their sheer number is their strength. Though powerful for their form factor, they will likely never have the computing capacity that their desktop brethren afford. But the analogy between a large number of these devices working on a problem far too complex for any one of them and an ant colony holds true: Suppose a large amount of food is available to a colony of ants. The amount of food is too great for a single ant to handle. However, a large number of ants can pull apart very small pieces of the food, process those pieces and return for more. The more ants that participate in the task, the less time it takes to process the entire piece of food. Likewise, the more devices that participate in the distributed computing environment, the less time the whole problem takes to process, even though each device works at a slower pace. But even here the problem's solution could be optimized in a number of ways according to the device processing the data. For example, a processing chunk normally destined for a desktop processor could be reduced in size to accommodate more restrictive network bandwidth and allow quicker processing on mobile devices; and/or the processing method could be optimized for the devices, since their architectures can be extremely similar (e.g. the iPhone), thereby allowing the use of special instructions available only on those devices.
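One way to picture the chunk-sizing idea is the following Python sketch. It is a back-of-the-envelope model, not a description of how any existing project sizes its work units: the function names, the 30-second transfer budget, the 10-minute compute budget and the device figures are all assumptions made up for the example.

```python
def chunk_size_for(bandwidth_bytes_per_s, items_per_s, bytes_per_item,
                   max_transfer_s=30.0, max_compute_s=600.0):
    """Pick the largest chunk (in items) a device can both download and process
    within the given time budgets. All budgets are assumed example values."""
    by_network = int(bandwidth_bytes_per_s * max_transfer_s / bytes_per_item)
    by_cpu = int(items_per_s * max_compute_s)
    # The tighter of the two limits wins; never hand out an empty chunk.
    return max(1, min(by_network, by_cpu))


def ideal_completion_seconds(total_items, n_devices, items_per_s):
    """Idealized 'ant colony' estimate: identical devices working in parallel,
    ignoring network transfer and coordination overhead."""
    return total_items / (n_devices * items_per_s)


# Made-up figures: a desktop on broadband versus a phone on a slow mobile link.
desktop_chunk = chunk_size_for(1_000_000, 5_000, bytes_per_item=100)
phone_chunk = chunk_size_for(50_000, 500, bytes_per_item=100)

# Ten times as many phones finish the same job in a tenth of the time.
few_phones = ideal_completion_seconds(1_000_000_000, 1_000, 500)
many_phones = ideal_completion_seconds(1_000_000_000, 10_000, 500)
```

With these invented numbers the phone receives a chunk twenty times smaller than the desktop's because its network link, not its processor, is the limiting factor.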
The applications for this type of distributed computing are far-reaching. From the well-known SETI@Home, Folding@Home and LHC@home scientific projects to processing weather data, these applications could help solve many otherwise intractable problems and provide very real results (aside: mobile devices are now becoming sophisticated enough to, in some ways, help process atmospheric data locally, giving meteorologists access to more accurate, real-time processed information gathered autonomously from mobile data-gathering applications). Unfortunately, many people are concerned about privacy, and any perceived threat to it could be a death blow to the idea of mobile processing. However, I personally believe that it depends on the person: there is a certain amount of trust you must give to an organization requesting voluntary use of your equipment to perform processing duties, just as there is some trust involved in allowing a repairman into your home. For this reason, I strongly urge that these distributed computing applications be open-sourced to give individuals the transparency they need to calm their nerves.
Overall, I think the prospect of being able to tap into the idle potential of millions of mobile devices is very promising for distributed computing. My only wish is that I could find the time to work on a proof-of-concept for this type of processing, and a team of other enthusiastic engineers with whom I could work. I think this would be very interesting and exciting for computer science and other areas of science, and it represents an extremely efficient use of idle consumer devices.

1 comment:

Clockworkapps said...

Your wish is my command 8-)

Sorry it took so long... I just released a free app that uses distributed mobile computing.

Check out www.clockworkapps.com for more info