Introduction

Automated reasoning tool for networked incidents

Summary

Modern cloud-based applications depend on both distributed application components and network infrastructure, and these complex interdependencies make it difficult to reason about their performance. We are designing Murphy, an automated performance diagnosis system that can work with commonly available telemetry in practical enterprise environments while achieving high accuracy. Murphy uses loosely defined associations between entities, obtained from commonly available monitoring data. Its learning algorithm is based on a Markov Random Field (MRF), which can take advantage of such loose associations to reason about how entities affect each other in the context of a specific incident.
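To make the MRF idea concrete, the following is a minimal, illustrative sketch and not Murphy's actual algorithm. It assumes a toy pairwise MRF in which each entity is either normal (0) or anomalous (1), each loose association is an undirected edge that mildly favors neighboring entities sharing a state, and inference is done by brute-force enumeration. The entity names, the anomaly scores in `unary`, the `coupling` strength, and the `marginals` helper are all hypothetical choices made for this example.

```python
from itertools import product


def marginals(nodes, edges, unary, coupling=3.0, clamp=None):
    """Return P(entity is anomalous) for each entity in a toy pairwise MRF.

    nodes:    list of entity names (hypothetical).
    edges:    list of (a, b) pairs, the loose associations between entities.
    unary:    per-entity anomaly evidence in (0, 1), e.g. derived from telemetry.
    coupling: weight (> 1) rewarding associated entities that share a state.
    clamp:    optional {entity: state} used to fix entities for a what-if query.
    """
    clamp = clamp or {}
    totals = {n: 0.0 for n in nodes}
    z = 0.0  # normalizing constant (partition function)
    # Enumerate every joint assignment; feasible only for tiny graphs.
    for states in product((0, 1), repeat=len(nodes)):
        assign = dict(zip(nodes, states))
        if any(assign[n] != s for n, s in clamp.items()):
            continue  # skip assignments inconsistent with the clamped states
        weight = 1.0
        for n in nodes:
            weight *= unary[n] if assign[n] == 1 else 1.0 - unary[n]
        for a, b in edges:
            weight *= coupling if assign[a] == assign[b] else 1.0
        z += weight
        for n in nodes:
            if assign[n] == 1:
                totals[n] += weight
    return {n: totals[n] / z for n in nodes}


# Hypothetical incident: the switch looks anomalous, the app and VM are ambiguous.
nodes = ["app", "vm", "switch"]
edges = [("app", "vm"), ("vm", "switch")]
unary = {"app": 0.5, "vm": 0.5, "switch": 0.9}

base = marginals(nodes, edges, unary)
# What-if query: how likely is the app to be anomalous if the switch were normal?
counterfactual = marginals(nodes, edges, unary, clamp={"switch": 0})

print("P(app anomalous):", round(base["app"], 3))
print("P(app anomalous | switch normal):", round(counterfactual["app"], 3))
```

In this sketch, comparing the two marginals for `app` gives a rough sense of how much the switch's state influences the application within this incident. A practical system would need learned potentials and approximate inference, since enumeration grows exponentially with the number of entities.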

Researchers

2020 Interns

External Researchers

  • Brighten Godfrey

Category

  • Active Research Projects

Research Areas

  • Network diagnosis
  • Network management