Our project shows how to build a dedicated infrastructure that reduces costs by matching computing power to actual demand at any given time.
Implementing a solution that automatically scales virtual machine farms based on the list of pending tasks and their processing time is a complex undertaking. When it has to run in Microsoft Azure and serve video production, the challenge grows further and calls for an analytical approach and extensive knowledge. Using the mechanism we developed as an example, we will show that despite the scale of the challenge and the limitations it brings, the implementation can be carried out efficiently, in a short time and, most importantly, without failures.
Our client was a company that produces personalized videos and automates sales processes and marketing activities as part of individual communication with customers. Its solution operates on a large amount of individual data, which was to be moved to the cloud, hence the search for experts specializing in the Azure environment. We were recommended by Microsoft, which took part in the project to migrate the solution to the cloud. We have been working with cloud technology for many years, so we decided to take on this challenge.
The idea was to automatically scale entire virtual machine farms according to the number of tasks waiting in the queue and their average expected processing time. Because adding a new machine takes a relatively long time and scaling had to be fast, the mechanism could not rely on simply deleting and creating machines. It also had to keep a pool of stopped machines that, when needed, can be started in much less time than it takes to create a new one. It is worth adding that when work on the project began, Microsoft Azure did not offer a ready-made solution that met these goals.
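The scaling idea described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the project: the names `required_machines`, `plan_scaling` and `ScalingPlan`, the `target_drain_seconds` parameter and the `pool_size` limit are all assumptions introduced here to show how queue length and average processing time could drive the decision, with stopped machines preferred over newly created ones.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPlan:
    """Hypothetical result of one scaling decision."""
    start_from_pool: int  # stopped machines to start
    create_new: int       # machines to create from scratch
    stop_to_pool: int     # running machines to stop and keep in the pool
    delete: int           # running machines to delete entirely


def required_machines(queued_tasks: int, avg_task_seconds: float,
                      target_drain_seconds: float) -> int:
    """Machines needed to empty the queue within the target time (assumed formula)."""
    if queued_tasks <= 0:
        return 0
    total_work_seconds = queued_tasks * avg_task_seconds
    return math.ceil(total_work_seconds / target_drain_seconds)


def plan_scaling(required: int, running: int, stopped: int,
                 pool_size: int) -> ScalingPlan:
    """Prefer starting stopped machines; create new ones only for the remainder."""
    if required > running:
        missing = required - running
        from_pool = min(missing, stopped)
        return ScalingPlan(from_pool, missing - from_pool, 0, 0)
    # Scaling down: refill the stopped pool first, delete whatever does not fit.
    surplus = running - required
    pool_room = max(pool_size - stopped, 0)
    to_pool = min(surplus, pool_room)
    return ScalingPlan(0, 0, to_pool, surplus - to_pool)


if __name__ == "__main__":
    needed = required_machines(queued_tasks=120, avg_task_seconds=30.0,
                               target_drain_seconds=600.0)
    print(needed, plan_scaling(needed, running=2, stopped=3, pool_size=5))
```

In this sketch, a growing queue is first served from the stopped pool, which is the fast path the text describes, and new machines are created only when the pool is exhausted.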
We had to choose between two options. The first, and the obvious one, was to use an Azure virtual machine scale set. Unfortunately, two significant shortcomings disqualified it from the start:
– when scaling, machines are always deleted and re-created; they cannot be stopped;
– there is no control over the order in which machines are removed.
The second option was to manage individual machines through the Azure Resource Manager REST API. We chose option two and, as time has shown, it was the right decision. On this basis we built our own solution that recognizes which machines are currently active, which are stopped, which are being created, and so on. The result is a dedicated application that can create a new machine, start one from the stopped pool, stop an active one, or delete it. This gave us full control over individual machines, which the scale set did not provide.
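To give a feel for what managing individual machines through the Azure Resource Manager REST API involves, here is a minimal sketch, not the application described above. It only builds the HTTP method and URL for each operation and maps instance view status codes to the states mentioned in the text; sending the request with a bearer token is left to the caller. The helper names, the `API_VERSION` value, the use of `deallocate` for "stop", and the exact set of status codes handled are assumptions made for illustration.

```python
from typing import Iterable, Tuple

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-03-01"  # assumed Microsoft.Compute API version

# Operation name -> (HTTP method, URL suffix). "stop" uses deallocate here,
# assuming stopped machines should not incur compute charges.
OPERATIONS = {
    "create": ("PUT", ""),        # request body with the VM definition is omitted
    "start": ("POST", "start"),
    "stop": ("POST", "deallocate"),
    "delete": ("DELETE", ""),
    "status": ("GET", "instanceView"),
}


def build_request(operation: str, subscription_id: str,
                  resource_group: str, vm_name: str) -> Tuple[str, str]:
    """Return (method, url) for one virtual machine operation."""
    method, suffix = OPERATIONS[operation]
    path = (f"/subscriptions/{subscription_id}"
            f"/resourceGroups/{resource_group}"
            f"/providers/Microsoft.Compute/virtualMachines/{vm_name}")
    if suffix:
        path += f"/{suffix}"
    return method, f"{ARM_ENDPOINT}{path}?api-version={API_VERSION}"


def classify_state(status_codes: Iterable[str]) -> str:
    """Map instance view status codes to a coarse state (hypothetical mapping)."""
    codes = {code.lower() for code in status_codes}
    if "provisioningstate/creating" in codes:
        return "creating"
    if "provisioningstate/deleting" in codes:
        return "deleting"
    if "powerstate/running" in codes:
        return "active"
    if codes & {"powerstate/deallocated", "powerstate/stopped"}:
        return "stopped"
    return "unknown"


if __name__ == "__main__":
    print(build_request("start", "my-subscription", "video-farm", "worker-01"))
    print(classify_state(["ProvisioningState/succeeded", "PowerState/running"]))
```

Keeping the operation table separate from the transport code makes it easy to see that every action the text lists (create, start, stop, delete) is a single call on one machine, which is exactly the per-machine control a scale set did not offer.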
Did the application we developed work? Yes, although building it turned out to be much more complicated than using a scale set would have been, had that been possible. What gave us the greatest satisfaction, however, was not the development itself but the fact that the software ran without problems immediately after deployment and throughout its further use.
Thanks to our changes, the solution makes much better use of the cloud's capabilities than was initially assumed. Our research and the resulting implementation have allowed the company to deliver materials much faster and more conveniently and, importantly, to reduce costs significantly.
Would we do anything differently if a DeLorean DMC-12 took us back to the beginning of the project? No. If you need to control individual virtual machines, using the Resource Manager API directly is the best solution, because it in no way limits the operations you can perform on the machines. Besides, there is no reason to force changes on an implementation that runs smoothly. That said, it would be interesting to see how Service Fabric compares in terms of cost and ease of implementation.
Authors: Tomasz Telepko, Senior LAB Developer at Billennium, and Kamil Wysocki, Cloud Solutions Architect at Billennium.