Cataloging Unstructured Data in IBM Watson Knowledge Catalog with IBM Spectrum Discover

A draft IBM Redpaper publication

Updated 17 July 2020

ISBN-10: 073845902X
ISBN-13: 9780738459028
IBM Form #: REDP-5603-00

Authors: Joseph Dain, Abeer Selim, Anil Patil, Christopher Vollmar, Flavio de Rezende, PhD, Frank Greco, Frank N. Lee, PhD, Isom Crawford, PhD, Ivaylo B. Bozhinov, Joanna Wong, PhD, Joshua Blumert, Larry Coyne

Abstract

This IBM® Redpaper publication explains how IBM Spectrum® Discover integrates with the IBM Watson® Knowledge Catalog (WKC) component of Cloud Pak for Data (CP4D) to make the enriched catalog content in IBM Spectrum Discover, along with the associated data, available in WKC and CP4D. From an end-to-end IBM solution point of view, CP4D and WKC provide state-of-the-art data governance, collaboration, and AI and analytics tools, and IBM Spectrum Discover complements them by adding support for unstructured data residing on large-scale file and object storage systems on premises and in the cloud. Several in-depth use cases show examples from healthcare (including COVID-19), life sciences, and financial services.

IBM Spectrum Discover’s integration with Watson Knowledge Catalog enables storage administrators, data stewards, and data scientists to efficiently manage, classify, and gain insights from massive amounts of data. The integration improves storage economics, helps mitigate risk, and accelerates large-scale analytics to create competitive advantage and speed critical research.

Table of contents

Chapter 1. IBM Spectrum Discover overview
Chapter 2. IBM Watson Knowledge Catalog and Cloud Pak for Data Overview
Chapter 3. IBM Spectrum Discover Integration with Watson Knowledge Catalog Architecture and Benefits
Chapter 4. Curating Unstructured Data for Watson Knowledge Catalog with IBM Spectrum Discover
Chapter 5. Healthcare and life sciences use cases
Chapter 6. Financial services use case - PII detection and data governance
Chapter 7. Conclusion

These pages are web versions of IBM Redbooks and Redpapers in progress. They are published here for those who need the information now and may contain spelling, layout, and grammatical errors. This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. Your feedback is welcome and helps improve the usefulness of the material to others.
