What if data could help you optimize your IT environment?

Digital. Innovative, groundbreaking insights. Discovering new business opportunities. That is what often comes to mind when thinking of data. To a certain extent this is true: data can indeed be used to uncover interesting opportunities you had not thought of before. However, the benefits of using data can also be found closer to home, for instance when it comes to optimizing your IT environment and gaining insight into its performance in terms of cost, efficiency, and security. Staying in control of that environment and safeguarding optimal performance can be challenging. How can we use data for this?

In control of IT – using all the data you can get your hands on

At Itility we manage quite a few IT environments. Our passion is to automate everything away, and to use analytics to work smarter and stay in control. Staying in control of an IT environment requires collecting data from multiple sources, combining it, and using statistics, machine learning, and algorithms to generate useful insights.

In a typical IT environment, you will find physical servers, storage, virtual machines, network devices, databases, applications, and so on. All these systems produce their own data in the form of logs and metrics. And since these systems are interconnected, correlations exist among them. For example: a lack of storage can degrade application performance, which has a direct business impact when the application is business-critical. Combining multiple data sources helps in quickly finding the root cause of such an issue.

To make this kind of analysis possible, all this data needs to be stored in a data lake. Consider, for example, the application that crashed due to a lack of storage. Instead of having to dive into this multifaceted problem manually, we have a monitoring solution in place that generates automated alerts based on the multiple events occurring together: the lack of storage, an unresponsive virtual machine, and degraded application performance. It then automatically contacts one of our DevOps engineers to solve the issue. This is still a ‘responsive’, or reactive, way of working, but that is often the starting point.
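To make the idea of alerting on multiple related events concrete, here is a minimal sketch in Python. It is not the monitoring solution described above: the `Event` type, the five-minute window, and the rule "page an engineer when at least two different sources are involved" are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    """Hypothetical event record as it might be read from a data lake."""
    timestamp: float  # seconds since epoch
    source: str       # e.g. "storage", "vm", "application"
    message: str


def correlate(events: Iterable[Event], window_seconds: float = 300.0) -> List[List[Event]]:
    """Group events into incidents.

    An event joins the current incident when it occurs within
    `window_seconds` of the previous event; otherwise it starts a new one.
    """
    incidents: List[List[Event]] = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if incidents and event.timestamp - incidents[-1][-1].timestamp <= window_seconds:
            incidents[-1].append(event)
        else:
            incidents.append([event])
    return incidents


def needs_engineer(incident: List[Event], min_sources: int = 2) -> bool:
    """Assumed rule: escalate only when several distinct sources report trouble."""
    return len({e.source for e in incident}) >= min_sources


sample_events = [
    Event(0.0, "storage", "volume almost full"),
    Event(60.0, "vm", "virtual machine unresponsive"),
    Event(120.0, "application", "response time degraded"),
    Event(10_000.0, "network", "single packet-loss warning"),
]
sample_incidents = correlate(sample_events)
```

With the sample data, the storage, virtual machine, and application events end up in one incident that would be escalated, while the isolated network warning forms its own incident and would not. A time window is the simplest possible correlation rule; a real setup would more likely also use knowledge of which systems depend on each other.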

More important to us is that we also use all the available data to prevent this problem from happening again. To do so, we turn to advanced analytics and combine the historical data of the storage component with data science techniques to predict its future state. This predictive, or proactive, way of working enables us to act on the issue before it actually occurs. The main benefit of this proactive alerting is that IT issues do not result in business impact for our customers. A nice side benefit for our own team is that no DevOps engineer on stand-by has to be woken up at night because of an issue that could have been prevented.
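As an illustration of predicting the future state of a storage component from its history, the sketch below fits a straight line through daily usage samples and extrapolates to the day the volume would be full. A plain least-squares trend is an assumption made here for simplicity, not necessarily the technique used in practice; the function names and the 14-day lead time are hypothetical as well.

```python
from typing import Optional, Sequence


def days_until_full(used_gb: Sequence[float], capacity_gb: float) -> Optional[float]:
    """Estimate the days left until capacity is reached.

    `used_gb` holds one usage sample per day, oldest first. A least-squares
    line is fitted through the samples and extrapolated. Returns None when
    usage is flat or shrinking, because no fill date can be estimated then.
    """
    n = len(used_gb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gb))
    slope = sxy / sxx  # growth in GB per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity_gb - intercept) / slope
    # Count from the most recent sample; never report a negative number of days.
    return max(0.0, day_full - (n - 1))


def should_alert(used_gb: Sequence[float], capacity_gb: float,
                 lead_time_days: float = 14.0) -> bool:
    """Raise a proactive alert when the volume is predicted to fill up soon."""
    remaining = days_until_full(used_gb, capacity_gb)
    return remaining is not None and remaining <= lead_time_days


history = [700.0, 710.0, 720.0, 730.0, 740.0]  # made-up sample data, GB per day
remaining_days = days_until_full(history, capacity_gb=800.0)
```

With this made-up history the volume grows by 10 GB per day, so an 800 GB volume would be full about six days after the last sample, well inside the assumed lead time. Real usage is rarely this linear; seasonality, clean-up jobs, and sudden jumps are reasons to reach for more capable forecasting models.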

But there are more benefits to using data for predictive IT maintenance. It enables us to transform unplanned work into planned work, changing from fire-fighting mode into fire-prevention mode, which is more reliable and less stressful for those involved. Using data also allows us to improve capacity planning, rightsize assets automatically, and simulate changes before we actually carry them out. This helps us predict the impact of a given change on user performance and enables us to plan additional changes alongside that change.

This is how you actively use an IT data lake to stay in control of your IT environment!