Cloud Data Centre Monitoring

Monitor and correlate application performance with hardware and data centre utilization, and predict failures!

Global Information Services (GIS) departments in large enterprises are in charge of maintaining complex systems, such as data centres or distributed server networks, and of improving their performance. It is important for GIS to monitor end-user response time and overall system performance. Response time, however, is a lagging indicator of most performance problems. Many other factors need to be monitored, such as depletion of work threads, memory leaks, changes to the application infrastructure, error detection, internal failures, and the application's interaction with back-end systems. It is also very important to identify actual problems before they result in performance and availability issues.

As IT networks grow more complex, their components can no longer be analysed in isolation. Root-cause analysis, the discovery of frequent behavioural patterns and the prediction of possible failure points all require an understanding of the complex relationships between the components in the network, the metrics collected from these components and their attributes. This calls for continuous queries over massive monitoring streams (100 metrics per second per node in data centres with tens of thousands of nodes) and an efficient way to handle them in order to reduce operational costs. Although products on the market cover this spectrum, new technology is needed to combine streaming events efficiently with information held in different data stores and formats, and especially to do so with lower resource consumption.

Visualization of such complex systems usually results in visual layouts that are difficult to interpret, which makes it harder to analyse actual performance problems and detect their root causes. We seek to use new visualization tools based on emerging technologies in the area of human-machine interaction, which make system analysis and management more natural, efficient and straightforward.

LeanBigData Enablers as Differentiators:

  • Absorption and storage of large volumes of metrics (ten million metrics per second), and combination of these data streams with data warehouse information and other big data stores.

  • Management of continuous queries in new efficient ways.

  • Analysis of complex systems through new visualization methods and mechanisms to allow for IT management through efficient human interaction.

  • Measurement of energy consumption within the Big Data managers, leading to measurable improvements in the resource consumption of the Big Data system. The ability to measure and control “leanness” will enable a trade-off between “leanness”, robustness, scalability and query execution time requirements.

  • Creation of controls within a Big Data system that enable a measurable change in energy consumption. This will make it possible to measure the impact on energy consumption of deploying a new version of an application or data management technology (e.g., a new concurrency control algorithm).