Cloud adoption brings about significant changes in cost management. In a traditional on-premises environment, cost management is based on renewal cycles, host refreshes, and periodic maintenance. These costs can be predicted, planned, and refined to align with annual investment budgets.
In practice, we observe that IT engineers generate cloud costs manually or automatically through code, without involving finance and procurement. At some point, it becomes apparent that the costs are much higher than expected or budgeted, which triggers cost reduction actions. FinOps methodologies often focus on gaining insight into and optimizing expenditures for existing services. There is relatively little emphasis on estimating cloud costs upfront.
To address this, we have developed an approach that prevents unnecessary cloud costs before they are incurred. In our approach, each workload being migrated to or initiated in the cloud requires an approved budget. The budget is obtained through a business case or cost case. This business/cost case examines the proposed solution from various perspectives: alignment with required service levels, architecture, security, and costs. To determine the expected costs, we use the pricing calculators offered by the cloud providers.
We achieve this by conducting workshops to develop the business case, involving stakeholders from business, finance, procurement, and IT. These workshops facilitate discussions on the business needs, service levels (such as availability), and the proposed architecture. Combined with cost estimation for different scenarios, a business case is ultimately created, incorporating cost-saving methods such as Reserved Instances (RIs) and optimal sizing.
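As a rough illustration of what "cost estimation for different scenarios" can look like, the sketch below compares the monthly cost of a few scenarios. All prices, VM counts, and the discount percentage are hypothetical placeholders; in practice the inputs would come from the provider's pricing calculator. The `Scenario` class and its fields are assumptions made for this example only.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for the number of hours in a month


@dataclass
class Scenario:
    name: str
    hourly_rate: float        # hypothetical pay-as-you-go price per VM hour
    vm_count: int
    discount: float = 0.0     # hypothetical reservation discount (0..1)

    def monthly_cost(self) -> float:
        # Assumes the VMs run around the clock, which is the usual case
        # for workloads that are candidates for reservations.
        return self.vm_count * self.hourly_rate * HOURS_PER_MONTH * (1 - self.discount)


# Placeholder inputs, not figures from any real estimate.
scenarios = [
    Scenario("PAYG, as proposed", hourly_rate=0.50, vm_count=4),
    Scenario("PAYG, right-sized", hourly_rate=0.40, vm_count=4),
    Scenario("Reserved, right-sized", hourly_rate=0.40, vm_count=4, discount=0.30),
]

for s in scenarios:
    print(f"{s.name:<24} {s.monthly_cost():>10,.2f} per month")
```

Putting the scenarios side by side like this gives the workshop participants a shared view of which lever (sizing, reservation) has the largest effect on the budget being requested.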
The advantages of this approach are as follows:
- Overall expenses are lower
- The business gains a better understanding of the levers that can influence costs
- IT engineers are prompted to think more about costs, thereby increasing cost awareness
- Cost forecasting is improved
- Stakeholders collaborate more efficiently on anomalies and incidents in their day-to-day activities because they are already familiar with one another
Case Study
An organization intends to move its ERP environment to the cloud. Currently, the entire ERP environment is managed by an external party, which will also oversee the transition to the cloud as a project and continue to manage the cloud solution.
Cost Calculation
The external party used the Azure calculator to estimate the costs of the entire intended environment, including all services (e.g., Azure Backup). Subsequently, a cloud architect and a cost engineer from the client reviewed the architecture, sizing, assumptions, and costs. Their review produced the following findings and measures:
- The intended VMs are relatively expensive (accepted due to the requirement for certification by the ERP vendor; these VMs are slightly oversized)
- Test and acceptance environments should only be active from 7:00 AM to 6:00 PM on workdays
- Purchase Reserved Instances for production environments
- Leverage Azure Hybrid Benefit for Windows Servers
- Prepay for SUSE Linux
- Switch from Premium SSD to Standard SSD in non-production environments
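The effect of the operating-hours measure in the list above can be estimated with simple arithmetic. The following is a minimal sketch, assuming a 07:00 to 18:00 window on Monday through Friday and ignoring public holidays; the function name `should_be_running` is hypothetical and not part of any Azure tooling.

```python
from datetime import datetime, time

START = time(7, 0)    # environments switch on at 7:00 AM
STOP = time(18, 0)    # environments switch off at 6:00 PM


def should_be_running(now: datetime) -> bool:
    """Return True if test/acceptance VMs should be on at the given moment."""
    is_workday = now.weekday() < 5  # Monday=0 .. Friday=4
    return is_workday and START <= now.time() < STOP


# 5 workdays x 11 hours, compared with a full week of 7 x 24 hours.
WEEKLY_ON_HOURS = 5 * 11
uptime_fraction = WEEKLY_ON_HOURS / (7 * 24)

print(f"Share of hours running: {uptime_fraction:.1%}")
```

Under these assumptions the environments run for 55 of the 168 hours in a week, so roughly two thirds of the compute hours are avoided. Storage for deallocated VMs is still billed, so the saving applies to compute only.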
Implementing these adjustments has the following effect on the (expected) running costs.
Additionally, the handling of costs during the transition phase was discussed. Initially, the provider intended to provision the necessary machines in Azure and keep them running. However, after consultation, it was agreed to critically assess when the VMs need to be active and to turn them off automatically during idle periods. As the client possesses sufficient Windows Server Datacenter licenses, Azure Hybrid Benefit was applied from the start.
Compared with the organization’s usual practice, this prevented €35,000 in costs during the project phase.
The justification for this is as follows:
- Typically, no optimization occurs during the project phase. In the optimal scenario, machines are turned off more frequently, and Azure Hybrid Benefit is already activated.
- At the go-live in July, both scenarios run on pay-as-you-go (PAYG) pricing. In the optimal scenario, the acceptance and test environments are switched on and off automatically, and Azure Hybrid Benefit is applied. In the normal scenario, all environments run 24×7 without further optimization.
- After two months of live operation, the sizing of the environment in the optimal scenario is verified, and RIs and SUSE licenses are purchased. In the normal scenario, action is only taken after four months, by turning off the test and acceptance environments when they are not in use.
- In the normal scenario, commitments for RIs and SUSE licenses are only made six months after the go-live, since procurement and finance still need to be involved.
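The timing differences between the two scenarios can be modeled month by month. The sketch below is illustrative only: every amount and the discount percentage are hypothetical placeholders, it does not reproduce the €35,000 figure, and it leaves out Azure Hybrid Benefit for brevity. It assumes the normal scenario is the one that commits to RIs and SUSE licenses after six months.

```python
# All amounts are hypothetical placeholders, not figures from the case study.
PROD_PAYG = 10_000.0        # monthly production cost at pay-as-you-go rates
NONPROD_PAYG = 6_000.0      # monthly test/acceptance cost when running 24x7
SCHEDULE_FACTOR = 55 / 168  # share of hours on a 07:00-18:00 workday schedule
COMMIT_DISCOUNT = 0.30      # assumed combined effect of RIs and prepaid SUSE


def monthly_cost(month: int, schedule_from: int, commit_from: int) -> float:
    """Cost in a given month after go-live (month 0 is the go-live month)."""
    prod = PROD_PAYG * ((1 - COMMIT_DISCOUNT) if month >= commit_from else 1.0)
    nonprod = NONPROD_PAYG * (SCHEDULE_FACTOR if month >= schedule_from else 1.0)
    return prod + nonprod


def cumulative(months: int, schedule_from: int, commit_from: int) -> float:
    """Total cost over the first `months` months after go-live."""
    return sum(monthly_cost(m, schedule_from, commit_from) for m in range(months))


# Optimal: scheduling from go-live, commitments after two months.
optimal = cumulative(12, schedule_from=0, commit_from=2)
# Normal: scheduling after four months, commitments after six months.
normal = cumulative(12, schedule_from=4, commit_from=6)

print(f"Avoided cost over 12 months: {normal - optimal:,.0f}")
```

From month six onward both scenarios cost the same in this model, so the avoided cost comes entirely from acting earlier, which is the point of preparing the business case before go-live.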
In this process, we saw the attitude of the IT engineers change during the preparation of the business case. At first they found it annoying to investigate and discuss potential cost-saving measures, but as the process progressed they came up with creative cost-saving initiatives of their own.
In conclusion, this case study demonstrates how developing a business case upfront encourages IT engineers to think more proactively about costs, leading to more cost-conscious decision-making. By involving stakeholders from multiple departments and leveraging the expertise of cloud architects and cost engineers, organizations can achieve better cost control, forecast expenses more accurately, and avoid unnecessary spending in the cloud environment.