
Governing and Managing Big Data for Analytics and Decision Makers

Redguide


Published on 26 August 2014

  1. .EPUB (0.5 MB)
  2. .PDF (0.8 MB)


IBM Form #: REDP-5120-00


Authors: Mandy Chessell, Ferd Scheepers, Nhan Nguyen, Ruud van Kessel and Ron van der Starre


    Abstract

    It is estimated that a staggering 70% of the time spent on analytics projects is concerned with identifying, cleansing, and integrating data. Acquiring data for analytics in an ad hoc manner creates a huge burden on the teams that own the systems supplying data. Often the same type of data is repeatedly requested and the original information owner finds it hard to keep track of who has copies of which data. As a result, many organizations are considering implementing a data lake solution. A data lake is a set of one or more data repositories that have been created to support data discovery, analytics, ad hoc investigations, and reporting. The data lake contains data from many different sources. People in the organization are free to add data to the data lake and access any updates as necessary.

    However, without proper management and governance, such a data lake can quickly become a data swamp. A data swamp is overwhelming and unsafe to use because no one is sure where the data came from, how reliable it is, or how it should be protected. IBM® proposes an enhanced data lake solution that is built with management, affordability, and governance at its core. This solution is known as a data reservoir.

    This IBM Redguide™ publication describes the value of a data reservoir, explains how it fits into the existing business IT environment, and identifies sources of data for the data reservoir. It also provides a high-level architecture of a data reservoir and discusses the key components of that architecture. Finally, it identifies the key roles essential to creating, supporting, and maintaining the data reservoir, and explains how information integration and governance play a pivotal role in supporting it.

    Table of Contents

    Introduction

    A view from ING

    What is a data reservoir

    Analytics in the business world

    Working with a data reservoir

    Data reservoir in the business environment

    Architecture of the data reservoir

    Organizational change and impact

    Rolling out a data reservoir

    How IBM helps make it happen

    Summary

     
