Hadoop Adoption Proves Slow But Steady

By Michael Vizard  |  Posted 04-29-2013

What Is the Status of Your Hadoop Project?  Only 24% of respondents have a Hadoop project in production, while 15% have a project in pilot and 11% are running one in a sandbox.

What Are Your Reasons for Using Hadoop?  53% identified low cost of scaling as a key reason to move to Hadoop. But 44% also identified better analysis capabilities as a key driver.

How Much Data Are You Managing in Hadoop Today?  Only 26% are managing more than 50TB in a Hadoop system, and just 19% have more than 500TB.

How Much Data Are You Managing in Production?  Once Hadoop goes into production, the amount of data starts to scale rapidly. About half the Hadoop systems in production have more than 500TB.

What Challenges Have You Faced With Hadoop?  37% identified a lack of real-time capabilities as the biggest challenge, 26% cited the amount of time it takes to reach production, and 25% cited the need for manual coding.

What Hadoop Query Methods Do You Use Today?  68% said they use Hive, 57% use the native MapReduce interface, and Pig and native SQL received 34% and 15%, respectively.

What’s Your Hadoop Responsibility?  Most of the respondents were developers (37%), but it’s clear that other job functions, such as data scientist (24%) and IT architect (20%), wield influence.

What Is Hadoop?  Hadoop is a framework composed of several components that enables the distributed processing of large data sets across a fault-tolerant cluster of servers. The core Hadoop project includes Hadoop Common, the utilities that support the other Hadoop modules; the Hadoop Distributed File System (HDFS), a distributed file system that provides high-throughput access to application data; Hadoop YARN, a framework for job scheduling and cluster resource management; and Hadoop MapReduce, a YARN-based interface for parallel processing of large data sets.
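The MapReduce model mentioned above can be sketched in plain Python. This is a simulation of the programming model only, not Hadoop's actual API (real Hadoop jobs are typically written against the Java MapReduce interface, and the framework runs the phases in parallel across the cluster with fault tolerance; here they run serially in one process):

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as Hadoop does
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Example input: each string stands in for one input record.
docs = ["Hadoop scales out", "Hadoop processes large data sets"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["hadoop"])  # 2
```

The word-count job shown is the canonical MapReduce example: the map phase is embarrassingly parallel over input records, and the shuffle lets each reducer see every value for a given key.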