Automation has become an invaluable resource that drives business agility and modernizes data center operations. It’s prudent for organizations to leverage any type of cloud, application, or infrastructure automation available to them, as it can increase operational flexibility and relieve stress on their internal IT team.
Service delivery can be streamlined by automating the configuration and provisioning of the network, security, and infrastructure components that support application development. Embracing automation to reduce complexity and eliminate repetitive, manual work delivers benefits across the organization.
Enabling automation becomes necessary to maximize the time spent supporting innovation and business growth and to stay competitive, especially when you consider the risk of errors from continuously performing tedious manual tasks. Many of these processes probably run on older, hardware-based infrastructure that can be difficult to automate and may require specialized processes. To truly take the next step, organizations need to adopt a software-defined approach that provides a single control plane across clouds. This integrated cloud management gives IT the chance to automate the delivery of composable infrastructure and application services, with self-service capabilities. It also reduces risk across private and hybrid clouds, since automation can enable self-driving operations for continuous performance optimization, proactive capacity management, and intelligent troubleshooting and remediation.
Adopting automation and a software-defined approach is a large undertaking, one that involves corporate culture and communication. Creating a company mindset where automation is encouraged requires refining job roles; a software-oriented team that knows how to develop code and incorporates DevOps into its mission and responsibilities is vital. In addition, there should be a clear roadmap of which automation should be implemented on which parts of the infrastructure, and when. It begins with identifying repeatable processes and areas where automation can help pay off the “process debt,” then determining what is and is not doable with the resources available.
VMware has developed a set of best practices for automating multicloud infrastructures, referred to as the “Six Sevens”: six process steps for proper data orchestration and seven key foundations for optimizing operations across multiple clouds. In multicloud environments, most data orchestration scenarios can be boiled down to a set of basic requirements:
- A user will need to copy or move a specific partition of data, which may contain sensitive information, from a VMware cloud to a public or developer cloud, for a specific purpose.
- The output of the data processing is extracted and placed into production via containers.
- The original data is no longer needed and must be wiped on decommissioning.
This data movement can be broken down into six process components (a brief illustrative sketch of the full flow follows the list):
- Data Description: Defining a data description is the first step in data orchestration. Data description focuses on understanding the nature of data and the effect of its loss, either through destruction or loss of security. Once organizations understand the nature of data, it can be partitioned, moved, used, or destroyed as needed for a variety of operations.
- Data Partitioning: Partitioning data involves setting up governance, security rules, or other guardrails for data that specify which extraction or aggregation can (or should) occur. For example, if an organization is seeking to extract part of a large data lake for training or other purposes, properly classifying the data will minimize risk and speed up the process. Data partitioning should occur as part of the process of preparing data for movement. The data description will play a key role in helping IT apply governance or security requirements.
- Data Placement: Once data has been sufficiently partitioned, it can be placed. This step includes selecting the right storage and security profiles—such as at-rest encryption and cloud endpoint—for the selected storage type. For example, if an organization is streaming data from a data lake cloud to a processing cloud, it may wish to set up local storage outside the data lake. Data placement should consider criteria like sovereignty, data gravity, and speed of data access. Again, a thorough data description process can help organizations better understand the nature of their data and how to place and process it.
- Connectivity and Access: All data movement requires some form of communications mechanism for its transfer. When thinking about connectivity and access, organizations should consider both network access and application programming interface (API) access. APIs can be used to help establish the right type of data storage and support for accessing it.
- Processing: Once data is placed, and storage, connectivity, and access are set up via networking, it’s time to process the data. Machine learning (ML) is a key part of data processing. Organizations must define the nature of the code that will be executed, specifying criteria like workload type (VM or container), and the best tools for creating, deploying, configuring, and managing the data processing software. Cloud-native application models are not required for data processing, but they can help expedite the development and deployment of applications across multiple clouds.
- Cleanup: Once processing is complete, organizations will decommission cloud compute, storage, and network resources. Cloud operators often set up default decommissioning processes, such as disk wipes and network destruction, but default processes may not be enough to safeguard sensitive information. Organizations should consider regulatory and business compliance policies as they specify data cleanup processes.
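Taken together, the six components map naturally onto a simple orchestration pipeline. The following minimal Python sketch illustrates one way the flow could be expressed in code; every class, function, and endpoint name in it (DataDescription, place, cleanup, the dev-cloud target, and so on) is a hypothetical illustration rather than part of any VMware product or cloud provider API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    RESTRICTED = "restricted"


@dataclass
class DataDescription:
    """Step 1 - Data Description: what the data is and the impact of losing it."""
    name: str
    sensitivity: Sensitivity
    loss_impact: str  # e.g. "regulatory fine", "reputational damage"


@dataclass
class Partition:
    """Step 2 - Data Partitioning: the subset approved for movement."""
    description: DataDescription
    rows: list = field(default_factory=list)


def place(partition: Partition, target_cloud: str) -> str:
    """Step 3 - Data Placement: pick a storage and security profile for the target."""
    encrypt_at_rest = partition.description.sensitivity is not Sensitivity.PUBLIC
    endpoint = f"{target_cloud}://staging/{partition.description.name}"
    print(f"placing {partition.description.name} at {endpoint} "
          f"(at-rest encryption: {encrypt_at_rest})")
    return endpoint


def connect(endpoint: str) -> None:
    """Step 4 - Connectivity and Access: open network and API access to the placed data."""
    print(f"opening API and network access to {endpoint}")


def process(partition: Partition) -> dict:
    """Step 5 - Processing: run the workload (e.g. ML training) against the partition."""
    return {"model": f"trained-on-{partition.description.name}",
            "rows_used": len(partition.rows)}


def cleanup(endpoint: str, description: DataDescription) -> None:
    """Step 6 - Cleanup: decommission resources and wipe anything sensitive."""
    if description.sensitivity is Sensitivity.RESTRICTED:
        print(f"secure-wiping {endpoint} to meet compliance policy")
    print(f"decommissioning {endpoint}")


if __name__ == "__main__":
    desc = DataDescription("customer-orders-q3", Sensitivity.RESTRICTED, "regulatory fine")
    part = Partition(desc, rows=[{"order": 1}, {"order": 2}])
    target = place(part, target_cloud="dev-cloud")
    connect(target)
    print(process(part))
    cleanup(target, desc)
```

In a real environment each step would call out to the relevant cloud, storage, and network automation tooling; the point of the sketch is simply that the data description drives every downstream decision, from at-rest encryption at placement to secure wiping at cleanup.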
Beyond the six process steps, there are seven key foundations for optimizing operations across multiple clouds. These foundations aim to enable business agility, operational consistency, and compliance.
- Data Classification: Classifying data creates a mechanism for describing data and the effect of losing any security objective. When automating multi-cloud data orchestration, organizations should consider issues like:
- What data is involved?
- What are its qualities?
- Is it growing at a high rate, creating potential future issues?
- Are there legal or policy restrictions around its location?
Organizations should also consider how the description of the data influences requirements for storage, security, instrumentation, networking, PaaS choices, and other variables.
- Governance: Governance is the mechanism that lets IT form and monitor the policies that control data access. Data access control also covers data movement, such as passing parameters to APIs. To determine how best to automate processes, organizations should consider governance questions such as:
- What government, industry, or business policies affect processing data?
- Are there key auditing requirements?
- What service levels are promised, and what are the impacts of missing them?
- How are certificates, identities, authorization, and authentication handled on each cloud endpoint?
- Instrumentation: The ability to observe software execution is key to operations, and instrumentation enables it. Instrumentation provides mechanisms for projecting and sensing the operating characteristics of the processes (executable data) acting on other data, and for tracking data movement. Logging movement, network creation, access, and other processes is critical to understanding failures, as well as to meeting compliance reporting requirements. Instrumentation solutions should encompass metrics storage and analytics facilities for all processing, as well as logging services. When automating instrumentation, IT should ask whether their solution can:
- Observe the key service level indicators for all services.
- Monitor for security breaches and ensure all data and processing are complying with policy.
- Identify issues in the system and predict future failover or scaling needs.
To help safeguard the integrity of data and processes, logging services must also conform to proper security and corporate compliance standards.
- Networking: Networking is the connectivity and access needed to move data across or within clouds. As cloud-friendly and cloud-native architectures at the software level have emerged, such as software-defined networking (SDN) and SD-WAN solutions, multi-cloud network setup has become far more dynamic than traditional, manually configured networking. Nonetheless, networking will always involve a physical setup, increasingly handled by cloud providers. As organizations seek to automate their processes, they should consider networking questions like:
- Is sufficient network capacity available to run a workload in a given cloud or transfer data to a particular site, without causing unintended saturation?
- Can the network ensure inflight encryption of data, based on governance and data classification?
- Is the networking correctly implementing the needs informed by security requirements?
- How do networking and dynamic provisioning processes for SDN or SD-WAN resources impact instrumentation, security, and governance?
- PaaS: Organizations can choose from a variety of PaaS platforms. To successfully automate their operations, organizations must consider their software delivery strategy and how it will support the processing stage discussed above. Some key questions to address when forming and implementing a PaaS strategy include:
- Are unique platforms required to accomplish processing goals and integrate specific applications?
- Is the PaaS platform available in the clouds intended for use?
- Can applications be tailored to a generalized PaaS to easily support multi-cloud deployments? Does the PaaS have built-in instrumentation, governance, security, and other services?
- Security: Security is all about assuring the integrity of, and access to, data through micro-segmentation, encryption, and other technologies. It spans nearly every foundation and process across multi-cloud environments. When considering what mechanisms should be in place to automate the security layer for all data, organizations should ask themselves questions like:
- What level of security is needed for data in use, whether executable or content?
- What are the consequences of compromising any key security objectives, such as availability, confidentiality, or integrity?
- What methods are needed to detect and protect against compromise?
- What are the policies for handling and preventing compromise?
- Storage: Storage encompasses not only basic criteria like disk availability, but also how data gets stored. For example, storing data in the Hadoop Distributed File System (HDFS) for use by ML algorithms also introduces data access and privilege issues. When evaluating storage, organizations should consider issues like:
- Are role-based access controls in place at the appropriate layer to assure proper limitations?
- Does storage have the required speed or durability for the application?
- Is the needed storage available in preferred clouds, or is a new provider necessary?
- Does the storage need to be encrypted at the file system or block layer?
- Does storage need to be wiped after purging of data?
Storage is a fundamental concern that touches multiple areas of an infrastructure. Organizations should consider its impact on security, governance, networking, and instrumentation, and how to automate those areas so that storage can be used properly without compromising data.
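As a closing illustration, the sketch below shows how a single data classification record could drive automated policy decisions across several of these foundations at once (storage, networking, security, instrumentation, and governance). It is a hypothetical example only, assuming made-up fields and thresholds such as contains_pii and a 100 GB/day growth flag; it is not an implementation of the white paper's guidance or of any VMware API.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """Hypothetical classification record attached to a data set."""
    name: str
    contains_pii: bool
    sovereignty: str          # e.g. "EU", "US", or "any"
    growth_rate_gb_day: int


def storage_policy(c: Classification) -> dict:
    """Derive storage requirements (foundation: Storage) from the classification."""
    return {
        "encrypt_at_rest": c.contains_pii,
        "wipe_on_purge": c.contains_pii,
        "capacity_review": c.growth_rate_gb_day > 100,  # flag fast-growing data sets
    }


def network_policy(c: Classification) -> dict:
    """Derive connectivity requirements (foundations: Networking and Security)."""
    return {
        "encrypt_in_flight": c.contains_pii,
        "allowed_regions": [c.sovereignty] if c.sovereignty != "any" else ["any"],
    }


def audit_events(c: Classification) -> list:
    """Events the instrumentation layer should log (foundations: Instrumentation and Governance)."""
    events = ["placement", "access", "decommission"]
    if c.contains_pii:
        events.append("wipe-confirmation")
    return events


if __name__ == "__main__":
    orders = Classification("customer-orders-q3", contains_pii=True,
                            sovereignty="EU", growth_rate_gb_day=250)
    print(storage_policy(orders))
    print(network_policy(orders))
    print(audit_events(orders))
```

The design point is that classification, done once and done well, lets the other foundations be automated as straightforward policy lookups rather than case-by-case decisions.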
Following the “Six Sevens” is a reliable way to ensure your organization is thorough when adopting and implementing automation in your infrastructure. If you or your team needs extra support along the way, please reach out to us and one of our VMware expert engineers can help you along your automation journey.