Apache Spark Implementation on IBM z/OS

An IBM Redbooks publication

Published 13 August 2016

ISBN-10: 0738414961
ISBN-13: 9780738414966
IBM Form #: SG24-8325-00
(142 pages)

Authors: Lydia Parziale, Joe Bostian, Ravi Kumar, Ulrich Seelbach, Zhong Yu Ye


The term big data refers to extremely large sets of data that are analyzed to reveal insights, such as patterns, trends, and associations. The algorithms that analyze this data to provide these insights must extract value from a wide range of data sources, including business data and live, streaming, social media data.

However, the real value of these insights comes from their timeliness. Rapid delivery of insights enables anyone (not only data scientists) to make effective decisions, applying deep intelligence to every enterprise application.

Apache Spark is an integrated analytics framework and runtime that accelerates and simplifies algorithm development, deployment, and the realization of business insight from analytics. Apache Spark on IBM® z/OS® puts the open source engine, augmented with unique differentiated features built specifically for data science, where big data resides.

This IBM Redbooks® publication describes the installation and configuration of IBM z/OS Platform for Apache Spark for field teams and clients. Additionally, it includes examples of business analytics scenarios.

Table of contents

Chapter 1. Architectural overview
Chapter 2. Components and extensions
Chapter 3. Installation and configuration
Chapter 4. Spark application development on z/OS
Chapter 5. Production integration
Chapter 6. IBM z/OS Platform for Apache Spark and the ecosystem
Chapter 7. Use case patterns
Appendix A. Sample code to run on Apache Spark cluster on z/OS
Appendix B. FAQ: Frequently asked questions, and answers
