Hadoop and Business Intelligence

Like my colleague Alex Olesker, I attended Cloudera Day 2012. While there were many panels of interest, perhaps one of the most important was Amr Awadallah’s talk about big data applications to business intelligence. Many CTOvision readers with backgrounds in the intelligence community may think of corporate espionage when they hear the phrase “business intelligence,” but that is not what the term means. Business intelligence is different from competitive intelligence, which is primarily based on open-source analysis of competitors and markets. Rather, business intelligence is quantitative analysis of internal data using advanced analytics techniques.

As we’ve noted before on CTOLabs, business intelligence is a changing field that is increasingly awash in information. Analysts face three core problems:

  • More information – Critical information now comes from new sources, such as online customer reviews and Facebook updates.
  • More change – The pace of competition, changes in customer preferences and organizational changes are all accelerating.
  • More questions – More people are asking new questions, such as: Why did that happen? How do these trends relate?

Adding to the analytical difficulties is the fact that complex and unstructured data, as opposed to relational data, is exploding. The question increasingly becomes: how do you manage the data and categorize it?

Awadallah talked about the unfavorable use/price trade-off involved in archiving data. Once you’ve archived it, you’ve more or less lost it, because it becomes too expensive to retrieve on demand. The solution? A combined compute/storage layer with Hadoop applications. Awadallah discussed opportunities to employ HDFS and MapReduce in business analytics. Hadoop offers three “core values”: scalability by adding nodes, flexibility in data storage and analysis, and data longevity. The up-front transformation implied under the previous model is no longer necessary, as data can start flowing at any time with Hadoop applications.

Awadallah contrasted what he called the slow “Data Council” model with a new forward-thinking approach he dubbed “Data Scientist,” built on Hadoop products. Hadoop can also grow without requiring developers to re-architect their applications and algorithms. Both structured and raw data can be logged from multiple applications, and Hadoop offers centralized logging across all execution platforms.
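To make the "no transformation" point concrete, here is a minimal sketch of the MapReduce pattern applied to raw, mixed-format log lines. It is plain Python that imitates the map, shuffle, and reduce phases in a single process; it does not use any Hadoop API, and the log formats, the sample data, and every function name (`parse`, `map_line`, `shuffle`, `reduce_counts`, `run_job`) are assumptions invented for illustration, not anything shown in the talk.

```python
import json
from collections import defaultdict

# Hypothetical raw log lines from two applications. The formats are invented
# for this sketch; nothing is transformed or validated before "storage".
RAW_LINES = [
    '{"app": "web", "event": "purchase", "amount": 30}',
    '{"app": "web", "event": "review", "stars": 4}',
    'mobile|purchase|12',
    'mobile|purchase|7',
    'mobile|login|',
    'not a parseable line',
]


def parse(line):
    """Interpret a raw line at read time; return (app, event) or None."""
    try:
        record = json.loads(line)
        return record["app"], record["event"]
    except (ValueError, KeyError, TypeError):
        pass  # not a JSON record with the fields we need; try the next format
    parts = line.split("|")
    if len(parts) == 3:
        return parts[0], parts[1]
    return None  # unreadable today, but the raw line is still kept


def map_line(line):
    """Map phase: emit ((app, event), 1) for every line we can interpret."""
    key = parse(line)
    return [(key, 1)] if key is not None else []


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: collapse the values for one key into a count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the raw lines."""
    mapped = [pair for line in lines for pair in map_line(line)]
    return dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    for key, count in sorted(run_job(RAW_LINES).items()):
        print(key, count)
```

The design choice worth noticing is that the structure lives in `parse`, not in the stored data. If an analyst later wants the purchase amounts, only the map function changes, and the original lines, including the one that cannot be parsed today, are still available to be read again.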

All of these approaches, Awadallah argued, offered a much better RoB (Return on Byte) for business intelligence analysts looking to use big data to optimize their enterprise.

About Adam Elkus

Adam Elkus is a PhD student in Computational Social Science at George Mason University. He writes on national security, technology, and strategy at CTOvision.com and the new analysis-focused Analyst One, War on the Rocks, and his own blog Rethinking Security. His work has been published in The Atlantic, Journal of Military Operations, Foreign Policy, West Point Counterterrorism Center Sentinel, and other publications.

  • InnovativeSoftware(.info)

    “Adding to the analytical difficulties is the fact that complex and unstructured data, as opposed to relational data, is exploding.”

    You call it unstructured data. But, actually the data is structured. It’s just not been given a structure upfront by a systems analyst.

    “The question increasingly becomes–how do you manage the data and categorize it?”

    Correct and that is the *right* question to ask! When you force structure on data you lose information. Even if you don’t lose data you lose information about the data.
