Hadoop and Business Intelligence

Like my colleague Alex Olesker, I too attended Cloudera Day 2012. While there were many panels of interest, perhaps one of the most important was Amr Awadallah’s talk about big data applications to business intelligence. Many CTOVision readers with backgrounds in the intelligence community may think of corporate espionage when they hear the phrase “business intelligence,” but I assure you that this is not the case. Business intelligence is also different from competitive intelligence, which is primarily based on open-source analysis of competitors and markets. Rather, business intelligence is quantitative analysis of an organization’s internal data using advanced analytics techniques.

As we’ve noted before on CTOLabs, business intelligence is a changing field that is increasingly awash in information. Analysts face three core problems:

  • More information – Critical information now comes from new sources, such as online customer reviews and Facebook updates.
  • More change – The pace of competition, changes in customer preferences and organizational changes are all accelerating.
  • More questions – More people are asking new questions, such as: Why did that happen? How do these trends relate?

Adding to the analytical difficulties is the fact that complex and unstructured data, as opposed to relational data, is exploding. The question increasingly becomes–how do you manage the data and categorize it?

Awadallah talked about the unfavorable use-to-price ratio of archived data. Once you’ve archived it, you’ve more or less lost it, because it becomes too expensive to retrieve on demand. The solution? A combined compute/storage layer built on Hadoop. Awadallah discussed opportunities to employ HDFS and MapReduce in business analytics. Hadoop offers three “core values”: scalability by adding nodes, flexibility in data storage and analysis, and data longevity. The upfront transformation implied by the previous model is not necessary, so data can start flowing into Hadoop at any time. Awadallah contrasted what he called the slow “Data Council” model with a new, forward-thinking approach he dubbed “Data Scientist,” built on Hadoop products. Hadoop can also grow without requiring developers to re-architect their applications and algorithms. Both structured and raw data can be logged from multiple applications, and Hadoop offers centralized logging across all execution platforms.

All of these approaches, Awadallah argued, offer a much better RoB (Return on Byte) for business intelligence analysts looking to use big data to optimize their enterprises.
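To make the “no upfront transformation” point concrete, here is a minimal sketch in plain Python of a MapReduce-style job over raw log lines. It does not use Hadoop itself. The log format, the sample lines, and the function names (map_line, reduce_pairs, revenue_by_app) are all hypothetical, invented for illustration, and do not come from Awadallah’s talk. The idea it tries to show is that lines from several applications are stored as-is and only given structure when a question is asked of them.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical raw log lines from two applications, stored exactly as written.
# No schema was imposed when they were logged.
RAW_LOGS = [
    "2012-11-01 web checkout user=17 amount=30",
    "2012-11-01 web view user=17",
    "2012-11-02 mobile checkout user=42 amount=12",
    "malformed line",
    "2012-11-02 web checkout user=17 amount=8",
]


def map_line(line):
    """Map step: parse one raw line at read time and emit (app, amount) pairs."""
    parts = line.split()
    if len(parts) < 3:
        return []  # skip lines that do not fit the structure we are asking for
    _date, app, event = parts[:3]
    fields = dict(p.split("=", 1) for p in parts[3:] if "=" in p)
    if event == "checkout" and "amount" in fields:
        return [(app, int(fields["amount"]))]
    return []


def reduce_pairs(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def revenue_by_app(lines):
    """Run the map step over every line, then reduce the combined output."""
    return reduce_pairs(chain.from_iterable(map_line(line) for line in lines))


print(revenue_by_app(RAW_LOGS))
```

A different question, such as views per user, would need only a different map function over the same untouched lines, which is one way to read the flexibility and longevity claims above.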

About Adam Elkus

Adam Elkus is a PhD student in Computational Social Science at George Mason University. He writes on national security, technology, and strategy at CTOvision.com, the new analysis-focused Analyst One, War on the Rocks, and his own blog, Rethinking Security. His work has been published in The Atlantic, Journal of Military Operations, Foreign Policy, West Point Counterterrorism Center Sentinel, and other publications.

  • InnovativeSoftware(.info)

    “Adding to the analytical difficulties is the fact that complex and unstructured data, as opposed to relational data, is exploding.”

    You call it unstructured data. But, actually the data is structured. It’s just not been given a structure upfront by a systems analyst.

    “The question increasingly becomes–how do you manage the data and categorize it?”

    Correct and that is the *right* question to ask! When you force structure on data you lose information. Even if you don’t lose data you lose information about the data.
