With the new z/OS Platform for Apache Spark, IBM aims to make it easier for businesses and governments to access their mainframe data and analyze it in place, without time-consuming offloading. This should open up new opportunities for data analysts and developers, who can apply advanced analytics tools to the rich data held on the mainframe and gain more insight in real time.
The z/OS Platform for Apache Spark allows Spark, an open source analytics framework, to run natively on the z/OS operating system. The new offering lets experts analyze data where it originates. Because the analytics library is decoupled from the underlying file system, there is no need to extract, transform, and load the data first, as the well-known ETL (Extract, Transform, Load) method requires.
In the cognitive age, where data is the new natural resource that computer systems can understand, evaluate and learn from, companies need to respond quickly to change and extract knowledge from information before it becomes irrelevant. With the new offering, which also includes accelerators from z Systems business partners, organizations can leverage mainframe data and resources even more easily. This should enable them to better understand market changes and individual customer needs, and to adjust their market activity in real time and with a shorter payback period.
z Systems mainframes handle mission-critical data and transactions for many of the world’s largest banks, insurance companies, and retail and transportation companies. According to IBM, they offer the industry’s fastest commercial microprocessors and the ability to run analytics during transactions: predictive models can be executed within a transaction in two milliseconds or less. With Spark, organizations can now leverage these resources using advanced in-memory analytics without having to move data off the mainframe. This saves time and money, and limits risk.
“As companies of all sizes transform into digital organizations, they need to be able to get a clear picture of all of their business data. The time wasted and the risks of offloading data cannot be afforded,” says Rod Smith, IBM Fellow, Emerging Internet Technologies. “Apache Spark, which is now operational on IBM platforms, including the mainframe, allows customers to run analytics on the transaction systems that host the critical data. At the same time, contextual insights from other data sources can be integrated. This allows users to better serve their customers and generate more revenue in real time.”
The IBM z/OS Platform for Apache Spark includes Spark’s open source components, consisting of Apache Spark Core, Spark SQL, Spark Streaming, the Machine Learning Library (MLlib), and GraphX, combined with what IBM describes as the industry’s only mainframe-based data abstraction solution for Spark. The new platform gives organizations several capabilities for gaining insights more effectively and securely.
Developers and data analysts can leverage their existing know-how with programming languages such as Scala, Python, R and SQL. This reduces the time needed to reach actionable findings. Optimized data abstraction services address complexity by providing seamless access to legacy enterprise data such as IMS, VSAM, DB2 for z/OS, PDSE, or SMF, using familiar tools through Apache Spark APIs. Apache Spark processes data in memory to deliver quick results. The platform includes data abstraction and integration services that enable z/OS analytics applications to use standard Spark APIs. This allows organizations to analyze data in place and avoid the expensive processing and the security concerns that come with offloading and ETL.
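As a rough illustration of what using standard Spark APIs with familiar languages can look like, the following Python sketch reads a table through Spark's generic JDBC data source and queries it with Spark SQL. It is a sketch under stated assumptions, not the platform's documented interface: the JDBC URL, driver class, table name, and the `account_id` and `amount` columns are hypothetical placeholders, and the article does not say how the data abstraction services are addressed.

```python
from typing import Dict


def jdbc_options(url: str, table: str, user: str, password: str, driver: str) -> Dict[str, str]:
    """Collect the options that Spark's generic JDBC data source expects."""
    return {
        "url": url,          # hypothetical JDBC URL of the data service
        "dbtable": table,    # hypothetical table or view name
        "user": user,
        "password": password,
        "driver": driver,    # hypothetical JDBC driver class name
    }


def top_accounts_query(view: str, limit: int) -> str:
    """Build a Spark SQL query that totals amounts per account (column names assumed)."""
    return (
        f"SELECT account_id, SUM(amount) AS total "
        f"FROM {view} GROUP BY account_id ORDER BY total DESC LIMIT {limit}"
    )


def run(options: Dict[str, str]) -> None:
    """Load the table into Spark and run the query in memory."""
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("mainframe-sketch").getOrCreate()
    df = spark.read.format("jdbc").options(**options).load()
    df.createOrReplaceTempView("transactions")
    spark.sql(top_accounts_query("transactions", 10)).show()
    spark.stop()
```

Calling `run(jdbc_options(...))` with real connection details would execute the query where Spark runs, so no separate extract and load step is needed. The same query could be written in Scala or R against the same APIs.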
IBM is also working with three partners, DataFactZ, Rocket Software, and Zementis, to develop custom solutions based on the IBM z/OS Platform for Apache Spark. DataFactZ is a new partner and is working with IBM to develop Spark analytics, based on Spark SQL and MLlib, for data and transactions processed on the mainframe. Rocket Software and IBM have a long history of collaboration that now extends to the z/OS Platform for Apache Spark. For example, Rocket’s new launchpad solution will allow customers to try out the platform and use data on z/OS. Zementis is complementing its in-transaction predictive analytics offering for z/OS with a standards-based execution engine for Apache Spark. The solution allows users to deploy and execute advanced predictive models.
The new z/OS Platform for Apache Spark and the partner solutions will enable data analysts who need to gather data from multiple sources to use their preferred formats and tools.
Last year IBM announced its commitment to Spark, which includes more than 3,500 IBM researchers and developers working on related projects. As part of the commitment to bring open source analytics technologies to the mainframe, z Systems has created a new GitHub developer organization for collaboratively building tools around Spark on z/OS. For example, a combination of “Project Jupyter” and a NoSQL database can provide a flexible and extensible data processing and analysis solution.
This approach can help make modern open source tools more accessible by letting developers choose their own tools and languages, providing new visual aids for monitoring analytics results across disparate data environments, and making advanced data processing techniques and capabilities possible.