IBM System x Reference Architecture for Hadoop: IBM InfoSphere BigInsights Reference Architecture

An IBM Redpaper publication

Note: A new draft version of this publication is now available



Abstract

The IBM® System x® reference architecture is a predefined and optimized hardware infrastructure for IBM InfoSphere® BigInsights™ 2.1, a distribution of Apache Hadoop with value-added capabilities that are specific to IBM. It provides a predefined hardware configuration for implementing InfoSphere BigInsights 2.1 on System x hardware, and it can be implemented in one of two ways, to support either MapReduce workloads or Apache HBase workloads.


  • MapReduce is a core component of Hadoop that provides an offline, batch-oriented framework for high-throughput data access and distributed computation.
  • Apache HBase is a schemaless NoSQL database that is built on Hadoop to provide high-throughput random data reads and writes and data caching.


The predefined configuration provides a baseline for an InfoSphere BigInsights cluster, along with modifications for an InfoSphere BigInsights cluster that is running HBase. The predefined configurations can be modified to meet specific customer requirements, such as lower cost, improved performance, and increased reliability.

Table of contents

Introduction
Business problem and business value
Reference architecture use
Requirements
InfoSphere BigInsights predefined configuration
InfoSphere BigInsights HBase predefined configuration
Deployment considerations
Customizing the predefined configurations
Predefined configuration bill of materials
References



Profile

Publish Date
10 June 2013




Author(s)

IBM Form Number
REDP-5009-00

Number of pages
40