Abstract:

The computational world is becoming very large and complex. New raw data is generated every day, every hour, and every minute; according to the Data Never Sleeps infographic, Google today receives 4 million search queries per minute. Recently, attention has turned to reducing the power used by computing processors and improving system throughput. Major computing problems have arisen in sectors such as IT and ICT, and these have led to the evolution of the previously used, traditional computing environments to meet the demand for more computational power and storage space. With so much at stake, failures and faults of any kind are unacceptable, and fault tolerance is therefore the prime need for making computing environments reliable, robust, dependable, and available. This paper explores various fault tolerance methodologies in parallel computing, which includes grid, cluster, and cloud processing environments, and in serial computing, which includes homogeneous and heterogeneous computing environments. Fault tolerance challenges in ubiquitous computing are also described. The paper is a comparative and intensive study of the distinct advantages, challenges, and issues of fault tolerance in cloud computing. It also attempts to describe the evolution of computing frameworks over time.


Keywords: Fault tolerance, Computing environment, Cloud computing, Reliability, Ubiquitous computing, Heterogeneous computing, Reactive, Proactive