In a 2021 PwC survey of US corporate executives, 34 percent of respondents said that using data analytics to make better business decisions was a factor in their decision to move to the cloud. This was tied for the most popular reason for a migration. But most long-running enterprises have on-premises data centers, and any data contained in those centers is siloed off and not available to cloud-based analytics tools.
Data from traditional line-of-business applications running on IBM Power hardware is separated from data from newer applications running on x86 (which may already be in the cloud). Breaking down these well-established data silos is one motivation for cloud migration. But refactoring these long-running, complex, often highly customized applications (think of an airline’s flight reservation system or an Oracle or SAP ERP system) is a large project that many enterprises don’t have the expertise, budget or risk tolerance to take on.
Luckily, there are solutions to lift and shift IBM i, AIX and Linux on Power workloads to the public cloud and run them there unchanged. Then, companies can use cloud-native services to run modern data analytics and business intelligence (BI) on the comprehensive data set. Extending workloads in this way can still provide a great deal of value, even if companies don’t want to modernize those traditional workloads to run natively in the cloud (or plan to do so in the future but want to run data analytics on them in the meantime).
Here’s an example of this use case based on a recent customer deployment using Microsoft Azure and Azure Synapse.
Business intelligence and IBM i applications, step by step
Step One: Prepare
First, build IBM i virtual machines (VMs) or logical partitions (LPARs) to receive the data. This requires using a specialized solution to run IBM i workloads unchanged (each major public cloud provider offers one). The destination environment in Azure should match the on-premises environment exactly: the same number of VMs/LPARs, the same memory, disk and CPU allocations, the same file system structures, the same IP addresses, the same network subnets, and so on.
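Because the destination has to mirror the source so closely, it can help to compare the two inventories mechanically before migrating. The following is a minimal Python sketch of such a parity check. The `PartitionSpec` fields, partition names and addresses are hypothetical examples, not the output of any particular IBM or Azure tool.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PartitionSpec:
    """One VM/LPAR as recorded in a (hypothetical) inventory export."""
    name: str
    cpu_cores: float
    memory_gb: int
    disk_gb: int
    ip_address: str
    subnet: str


def find_mismatches(on_prem, cloud):
    """Return human-readable differences between two partition inventories."""
    problems = []
    source = {p.name: p for p in on_prem}
    target = {p.name: p for p in cloud}

    # Partitions that exist on only one side.
    for name in sorted(source.keys() - target.keys()):
        problems.append(f"{name}: missing in cloud")
    for name in sorted(target.keys() - source.keys()):
        problems.append(f"{name}: not present on-premises")

    # Field-by-field comparison for partitions present on both sides.
    for name in sorted(source.keys() & target.keys()):
        expected_fields = asdict(source[name])
        actual_fields = asdict(target[name])
        for field, expected in expected_fields.items():
            actual = actual_fields[field]
            if actual != expected:
                problems.append(f"{name}: {field} is {actual!r}, expected {expected!r}")
    return problems


# Illustrative data only.
on_prem = [
    PartitionSpec("PROD01", 4.0, 64, 2048, "10.0.1.10", "10.0.1.0/24"),
    PartitionSpec("DEV01", 1.0, 16, 512, "10.0.1.20", "10.0.1.0/24"),
]
cloud = [
    PartitionSpec("PROD01", 4.0, 64, 2048, "10.0.1.10", "10.0.1.0/24"),
    PartitionSpec("DEV01", 1.0, 8, 512, "10.0.1.20", "10.0.1.0/24"),
]

for line in find_mismatches(on_prem, cloud):
    print(line)  # prints "DEV01: memory_gb is 8, expected 16"
```

An empty result means the two inventories agree on every field the sketch tracks. File system structures would need a separate comparison.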
Step Two: Migrate
Next, move the data into the destination VMs/LPARs. Options include Azure ExpressRoute, a fast FTP connection, or a physical storage device for large volumes of data. (Much more could be written about how to manage a successful migration, but that’s beyond the scope of this article.) This is a straight lift and shift: the workloads are moved without modification.
Step Three: Connect
Next, connect the IBM i DB2 databases that are now running in the cloud to a data lake in that cloud. In Azure Synapse, create a new DB2 connector and fill in all the required info (server name, database name, etc.). This does not require coding or building pipelines from scratch. The transfer can be run all at once or scheduled for the future in multiple stages. There are many options for filtering and modifying data in transit (like choosing delimiters and using the first row as the header), for bringing in only specific tables, and for selecting or building destination folders and subfolders. Once the pipeline has been deployed, all the selected tables from the IBM i application will be available for analytics.
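Although the connector is configured through the Synapse interface, it can be useful to see the information it asks for written down as data. The Python sketch below collects the connection details and copy options into plain dictionaries. The JSON shape is modeled on a DB2 linked-service definition but should be treated as an assumption and checked against the current Azure documentation. The server, database, user, table and folder names are all hypothetical.

```python
import json

REQUIRED_SETTINGS = ("server", "database", "username")


def build_db2_connection(name, settings):
    """Assemble a connector definition, failing early on missing details."""
    missing = [key for key in REQUIRED_SETTINGS if not settings.get(key)]
    if missing:
        raise ValueError("missing connector settings: " + ", ".join(missing))
    return {
        "name": name,
        "properties": {
            "type": "Db2",  # assumed type name; verify against the docs
            "typeProperties": {
                "server": settings["server"],
                "database": settings["database"],
                "authenticationType": "Basic",
                "username": settings["username"],
                # Placeholder only: keep real credentials in a secret store.
                "password": {"type": "SecureString", "value": "<set-in-workspace>"},
            },
        },
    }


def plan_copy(tables, root_folder, delimiter=",", first_row_as_header=True):
    """Describe which tables to copy and where each one should land."""
    return [
        {
            "table": table,
            "sink_folder": f"{root_folder}/{table.lower()}",
            "delimiter": delimiter,
            "first_row_as_header": first_row_as_header,
        }
        for table in tables
    ]


# Hypothetical values for illustration.
connection = build_db2_connection(
    "IbmIDb2",
    {"server": "ibmi-prod.example.internal", "database": "PRODDB", "username": "etl_reader"},
)
plan = plan_copy(["ORDERS", "CUSTOMERS"], "raw/ibmi")

print(json.dumps(connection, indent=2))
print(json.dumps(plan, indent=2))
```

Listing only the tables of interest in `plan_copy` mirrors the option of bringing in specific tables rather than the whole database.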
Step Four: Modify
Now, users can use the business intelligence or big data tool to perform operations on the data such as joining, filtering, sorting and selecting. In cloud-native tools, these operations require little to no coding, so they’re available to less-technical employees (and are simply easier to do overall). Most importantly for our example, users can also build a pipeline to combine data from the IBM i application with data from cloud-native applications or data pulled into the lake through other methods. Subsequent analysis can then be run once, on all relevant data, to produce more useful results.
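To make the join, filter and sort operations concrete, here is a small Python sketch that combines order rows from the IBM i application with customer rows from a cloud-native application. A BI tool would do this through its visual pipeline builder. The field names and values below are invented for illustration.

```python
# Rows as they might arrive in the data lake from the IBM i application.
ibm_i_orders = [
    {"customer_id": 101, "product": "WIDGET", "quantity": 5},
    {"customer_id": 102, "product": "GADGET", "quantity": 1},
    {"customer_id": 101, "product": "GADGET", "quantity": 2},
]

# Rows from a cloud-native application (for example, a CRM).
cloud_customers = [
    {"customer_id": 101, "segment": "enterprise"},
    {"customer_id": 102, "segment": "small business"},
]


def combine(orders, customers):
    """Inner join: attach each customer's segment to their orders."""
    segment_by_id = {c["customer_id"]: c["segment"] for c in customers}
    return [
        {**order, "segment": segment_by_id[order["customer_id"]]}
        for order in orders
        if order["customer_id"] in segment_by_id
    ]


combined = combine(ibm_i_orders, cloud_customers)

# Filter to one product, then sort by quantity, largest first.
gadget_orders = sorted(
    (row for row in combined if row["product"] == "GADGET"),
    key=lambda row: row["quantity"],
    reverse=True,
)

for row in gadget_orders:
    print(row["customer_id"], row["segment"], row["quantity"])
```

The point of the join is that `segment` exists only in the cloud-native data and `quantity` only in the IBM i data. Neither silo could answer a question involving both on its own.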
Step Five: Analyze
Now, with everything in place and connected, it’s possible to extract valuable information from the combined data. Most cloud-native tools will offer a way to run SQL queries against the gathered data to do things like find the most common times customers order a specific product, find all customers that ordered a particular product, check when they placed an order, and much more. Like building pipelines, these operations can usually be done without provisioning resources manually – the tool should take care of that, so non-technical employees can also work with the data. Depending on the tool of choice, organizations may also be able to run predictive analytics against data tables using ML models, and to export reports to share the results of their work.
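The kinds of SQL queries described above can be sketched with Python's built-in `sqlite3` module standing in for the analytics tool's query engine. The `orders` table, its columns and its rows are made up for the example, and SQL dialects differ, so date functions such as `strftime` may need adapting for another engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT, ordered_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Acme", "WIDGET", "2023-03-01 09:15:00"),
        ("Birch", "WIDGET", "2023-03-01 09:40:00"),
        ("Acme", "GADGET", "2023-03-02 14:05:00"),
        ("Cedar", "WIDGET", "2023-03-03 16:30:00"),
    ],
)

# Most common hour of the day at which a given product is ordered.
busiest_hour = conn.execute(
    """
    SELECT strftime('%H', ordered_at) AS hour, COUNT(*) AS order_count
    FROM orders
    WHERE product = ?
    GROUP BY hour
    ORDER BY order_count DESC, hour
    LIMIT 1
    """,
    ("WIDGET",),
).fetchone()

# All customers who ordered that product.
widget_customers = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT customer FROM orders WHERE product = ? ORDER BY customer",
        ("WIDGET",),
    )
]

print(busiest_hour)      # ('09', 2)
print(widget_customers)  # ['Acme', 'Birch', 'Cedar']
```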
Cloud benefits without going cloud-first
Security and privacy are important when working with customer data. Azure Synapse offers the ability to restrict access to certain data to specific roles (as defined in Azure Active Directory), and data masking to blur out details like credit card numbers so analysts can work with data while maintaining customer confidentiality.
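The idea behind role-based access combined with masking can be illustrated in a few lines of Python. This is a conceptual sketch only, not how Azure Synapse implements either feature. The role names and the rule that only a `billing` role sees full card numbers are assumptions made for the example.

```python
import re

# Hypothetical roles allowed to see unmasked card numbers.
UNMASKED_ROLES = {"billing"}


def mask_card_number(value):
    """Hide all but the last four digits of a card number."""
    digits = re.sub(r"\D", "", value)  # drop spaces, dashes and other separators
    if len(digits) <= 4:
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + digits[-4:]


def view_row(row, role):
    """Return the row as a user with the given role would see it."""
    if role in UNMASKED_ROLES:
        return dict(row)
    return {**row, "card_number": mask_card_number(row["card_number"])}


order = {"customer": "Acme", "card_number": "4111 1111 1111 1111"}

print(view_row(order, "analyst")["card_number"])  # ************1111
print(view_row(order, "billing")["card_number"])  # 4111 1111 1111 1111
```

An analyst can still group or count by the masked value without ever seeing the full number.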
Overall, this process allows IBM i users to apply modern data analytics and BI capabilities to previously siloed IBM i applications without the need for refactoring. Depending on the capabilities of the exact solutions used, it may even allow less-technical employees to get hands-on with the data. For organizations that aren’t prepared to move to a cloud-first model, this process offers a low-risk way to extend IBM Power workloads and to combine data from different silos. Ultimately, it will enable BI systems to provide better insights based on more complete data.