Site Reliability Engineers

History

Benjamin Treynor Sloss, a software engineer at Google, is credited with originating the term site reliability engineering in 2003. Google established a reliability engineering team to help its product development team (which launches new products and studies how users utilize them) and its operations team (which makes sure that existing products continue working). The reliability team aimed to create highly scalable and extremely reliable products, automating as many processes as possible to do so more effectively. The use of site reliability engineering practices at Google (which started with seven site reliability engineers and now has 2,500 SREs) was so successful that other tech companies such as Netflix and Amazon adopted them, while also reimagining them to fit their particular products and methods of operation.

Some people believe that the fields of DevOps and site reliability engineering are the same. While workers in these fields share some responsibilities, the fields are distinct. “DevOps focuses on engineering continuous delivery to the point of deployment; site reliability engineering focuses on engineering continuous operations at the point of customer consumption,” according to Jayne Groll, CEO of the DevOps Institute, in an article about the two fields at InfoWorld.com.

Related Professions