Cloudera working with Hadoop: An update

Today, Apache Hadoop is one of the most popular open source software frameworks for Big Data. There are many reasons for this, but key among them is its ability to process data at scale. Hadoop spreads both storage and computation across a cluster of machines, so jobs that were once slow or even impossible on a single system can now finish quickly.

The best way to get started with Hadoop is to use the 100% open source distribution of Apache Hadoop provided by Cloudera. Cloudera provides enterprise support and enterprise-ready versions of Apache Hadoop, which offers several key benefits for Hadoop users:

  • They provide services to help install, configure, optimize, and tune Hadoop for large-scale data processing and analysis.
  • They provide production support for Hadoop so that users do not have to rely on the open source community for critical bug fixes or enhancements.
  • They provide extensive training and certification programs for Hadoop.
  • They assist and certify other vendors that are creating product offerings to work with Hadoop.
  • They have over 700 hardware and software partners, including leading storage, business intelligence, reporting, and business application vendors.
  • They enhance Hadoop with key features such as batch processing, interactive SQL, security, and interactive search, as well as enterprise-grade continuous availability. This includes Impala, a supplement to Hive (a Hadoop component) that addresses many of Hive's performance limitations.

Impala is one of the most recent pieces of technology that Cloudera has added to the Hadoop platform. Impala is an open source massively parallel processing (MPP) query engine for Apache Hadoop. It lets users issue low-latency SQL queries against data stored in HDFS (Hadoop Distributed File System) and Apache HBase without moving or transforming that data. As a result, analytics on data stored in Hadoop run faster, and there is no need to migrate data sets into specialized systems or proprietary formats just to analyze them. Compared to Hive, Impala takes fuller advantage of hardware resources and typically generates less CPU load. One important thing to remember is that Impala is not a replacement for MapReduce or Hive; it works alongside these technologies to address speed and efficiency issues.
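To make the idea of issuing SQL directly against data in Hadoop more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Cloudera's recommended setup: the table `web_logs`, its columns, and the helper names are hypothetical, and the Impala connection relies on the third-party impyla package, which is assumed to be installed and pointed at a reachable Impala daemon. Because the query helper accepts any DB-API 2.0 connection, an in-memory SQLite database stands in for the cluster so the sketch can run anywhere.

```python
import sqlite3
from contextlib import closing

# Hypothetical table and column names, used only for illustration.
TOP_PAGES_SQL = """
SELECT page, COUNT(*) AS hits
FROM web_logs
GROUP BY page
ORDER BY hits DESC, page
LIMIT 3
"""


def top_pages(conn):
    """Run the aggregate query on any DB-API 2.0 connection and return the rows."""
    with closing(conn.cursor()) as cur:
        cur.execute(TOP_PAGES_SQL)
        return [tuple(row) for row in cur.fetchall()]


def connect_to_impala(host, port=21050):
    """Open an Impala connection via the third-party impyla package.

    Assumptions: impyla is installed and an Impala daemon is listening on
    the given host and port. Neither is provided by this sketch.
    """
    # Imported lazily so the rest of the sketch loads without impyla.
    from impala.dbapi import connect
    return connect(host=host, port=port)


def demo_connection():
    """Build an in-memory SQLite stand-in so the sketch runs without a cluster."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE web_logs (page TEXT, user_id INTEGER)")
    conn.executemany(
        "INSERT INTO web_logs VALUES (?, ?)",
        [("/home", 1), ("/home", 2), ("/docs", 1),
         ("/home", 3), ("/docs", 4), ("/about", 2)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Swap demo_connection() for connect_to_impala("your-impala-host")
    # to run the same SQL against a real cluster.
    for page, hits in top_pages(demo_connection()):
        print(page, hits)
```

The point of the sketch is that the analyst writes ordinary SQL and the data stays where it is. On a real cluster the same statement would be sent to Impala rather than to SQLite.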

To learn more about Impala and Hadoop, Cloudera offers both a webinar and written material on the subject.





About Kimberly Kelly

Kimberly Kelly has been involved in entrepreneurial activities on the Internet since she was 8 years old, when she created an organization that raised over $2,000 for charities. Since then she has continued to immerse herself in technology for positive change. She also writes for the analysis-focused Analyst One.