It is always interesting when two hot technologies come together. GridGain, a California-based software house, has brought together in-memory computing and Hadoop by offering a high-performance in-memory file system that it describes as 100% compatible with HDFS. The GridGain File System (GGFS) is a plug-and-play alternative to the disk-based Hadoop HDFS. According to GridGain, it enables up to 10x faster performance for IO- and network-intensive Hadoop MapReduce jobs running on tens or hundreds of computers in any Hadoop cluster.
Unlike any other file system, GGFS can work either as a standalone file system in a Hadoop cluster or in tandem with HDFS, acting as a primary caching layer in front of the secondary HDFS. As a caching layer, it provides highly configurable read-through and write-through behavior. In either case, GGFS can be used as a drop-in alternative to, or an extension of, standard HDFS, providing an instant performance boost.
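To make the read-through and write-through terms concrete, here is a minimal Python sketch of a two-tier store. It is a toy model, not GGFS code: the class name `TwoTierStore`, its flags, and the use of plain dictionaries for both tiers are all assumptions made for illustration.

```python
class TwoTierStore:
    """Toy model of an in-memory primary tier in front of a slower secondary store.

    Hypothetical illustration only; this is not the GGFS API.
    """

    def __init__(self, secondary, read_through=True, write_through=True):
        self.memory = {}              # primary tier: path -> bytes
        self.secondary = secondary    # stand-in for the disk-based store
        self.read_through = read_through
        self.write_through = write_through

    def read(self, path):
        # Serve from memory when the file is already cached.
        if path in self.memory:
            return self.memory[path]
        # Read-through: on a miss, fetch from the secondary store and cache it.
        if self.read_through and path in self.secondary:
            data = self.secondary[path]
            self.memory[path] = data
            return data
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes always land in memory first.
        self.memory[path] = data
        # Write-through: also persist to the secondary store right away.
        if self.write_through:
            self.secondary[path] = data


# Caching mode: memory sits in front of an existing secondary store.
disk = {"/data/a.txt": b"hello"}
cached = TwoTierStore(disk)
cached.read("/data/a.txt")            # miss, then cached in memory
cached.write("/data/b.txt", b"world") # stored in memory and on "disk"

# Standalone mode: both behaviors off, so nothing touches the secondary store.
standalone = TwoTierStore({}, read_through=False, write_through=False)
standalone.write("/tmp/x", b"1")
```

With both flags turned off the store behaves like a standalone in-memory file system; with them on it behaves like a cache in front of the slower tier, which is the distinction the paragraph above draws.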
Interestingly, GridGain’s In-Memory Hadoop isn’t yet another proprietary Hadoop distribution. Rather, it is a Hadoop accelerator that works with any choice of Hadoop distribution (including the highly reliable, 100% open source CDH from Cloudera). The architecture behind In-Memory Hadoop gives you the freedom not only to choose any Hadoop distribution but also to use any of the dozens of Hadoop-based tools that you currently utilize. Whether you use standard tools such as HBase, Hive, Mahout, Oozie, Flume, Sqoop, and Pig, or any of the commercial BI, data visualization, or data analytics platforms, you can continue to use them without any change while enjoying an instant performance boost.
Another hot feature of In-Memory Hadoop is Visor, its GUI-based management and monitoring tool. It provides deep development and operational capabilities, including an operations and telemetry dashboard, data grid and compute grid management, and GGFS monitoring and file management across HDFS, local, and GGFS file systems.
If you want to read more or need faster Hadoop, see GridGain.com.