Antra Tripathi edited this page May 27, 2019 · 1 revision

Welcome to the R-D-Kernel-Level-Implementation-of-Hadoop wiki!

Hadoop has gained popularity for its ability to store, analyze, and access large amounts of data quickly and cost-effectively on clusters of commodity hardware. But no one uses the kernel alone. “Hadoop” is usually taken to mean the combination of HDFS and MapReduce. A variety of other projects complement these core modules with specialized services that make Hadoop more usable and more accessible to non-specialists; together they are known as the Hadoop ecosystem. Each component of the ecosystem is a distinct entity that addresses a particular need. The current Hadoop ecosystem is organized into layers, each performing a different kind of task: storing data, processing stored data, allocating resources, and supporting the different programming languages used to develop applications.
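To make the MapReduce half of that combination concrete, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It runs in memory on a single machine and does not use HDFS or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, the step a framework
    performs between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over in-memory lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data on Android", "Hadoop on Android"]
    print(word_count(sample))
```

In a real cluster the mappers and reducers would run on different nodes and read their input from distributed storage; the sketch only shows how the three steps fit together.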

With the rapid evolution of Big Data, its processing frameworks are evolving just as quickly. The large web companies that adopted Apache Hadoop had to depend on the pairing of Hadoop HDFS with a resource management environment and MapReduce programming. The Hadoop ecosystem has since introduced a new processing model that lends itself to common big data use cases, including interactive SQL over big data, machine learning at scale, and the analysis of big-data-scale graphs. Apache Hadoop is not a single product but a collection of several components. Combined, these components make Hadoop much more user friendly. The Hadoop ecosystem and its commercial distributions continue to evolve, with new or improved technologies and tools emerging all the time.

Hadoop reduces processing time and provides reliability, scalability, and protection from node failure, all handled within the system itself using only commodity hardware. This project is intended to optimize the current Hadoop implementation for low-energy Android devices. New Android devices are becoming more powerful, with dual-, quad-, or octa-core processors and 1 to 2 GB of RAM now common in mobile devices. Using this mobile computing power for industrial purposes can save cost even relative to commodity hardware. Big data is a big thing today. Analyzing and processing it to produce predictions that support automated decision-making systems is one of the main tasks Hadoop can do, and this application is intended to bring that productivity to the Android platform. With this application, it becomes possible to take Big Data processing to Android phones.

The main need of this project is to process Big Data on the Android platform. Android Hadoop will provide a mobile Hadoop environment at reduced hardware cost. It is also energy efficient: the quality of the hardware remains the same while its cost is reduced. The growing power of mobile devices encourages thinking about how to use their processing capacity. This led to the idea of bringing mobility to the existing workforce environment. Hadoop was chosen as the target, in the form of an optimized MapReduce implementation on the Android platform, for turning a desktop workforce into a mobile workforce. The problem definition is to provide an application that allows Hadoop to be ported to the Android platform. That capability has been ported to an Android device.

Android Hadoop provides a limited installation of Linux, just enough to support a Hadoop installation. Because Hadoop is a distributed processing system, understanding it calls for an efficient system that can teach and demonstrate how Hadoop works to students and learners without investing in large-scale clusters. Small-scale industries also cannot afford to build large-scale clusters. Today, mobile devices and mobile applications have become part of day-to-day life. With the revolution in mobile computing, many features and technologies were added to the field, and mobile devices became smaller, faster, and better over the past decade. This gave rise to new mobile operating systems, among them an open source operating system for programmers named Android.

Android and Hadoop are the two technologies we selected for the research and development of our project. Individually, each solves many real-world problems. So how about combining them? This question interested us throughout this research work, and it also matters for the future development of the technology. Could it be possible to run Hadoop on a cluster of mobile phones? This can be considered the future scope of Hadoop. Reportedly, a team at Google pushed Hadoop to its limits by creating a cluster on the order of 100,000 nodes, running on the then recently released Nexus One mobile phone hardware, powered by Android. By pushing computation out to these devices, the Nexus.

So porting Hadoop to Android makes a lot of sense if one looks a step ahead to future technology. Mobile devices are getting more powerful and advanced, with octa-core devices just around the corner. Modern phones and tablets also have several GB of RAM, typically 3-4 GB, along with a GPU, and ARM-powered devices consume less power than comparable Intel-based devices. The topic “Kernel Level Implementation of Hadoop” is mostly theoretical, and little practical knowledge about it is available. A kernel-level implementation of Hadoop has so far proved next to impossible.
