REDMOND, WA, USA
6 days ago
Microsoft Build Operations Engineer
Job Seekers, Please send resumes to resumes@hireitpeople.com

The .NET Core team employs a wide variety of build technologies and runs hundreds of builds per day, many of which employ dozens of individual legs. With so many moving parts, failed builds are common (especially during any sort of outage). The Engineering Services team creates and maintains build infrastructure, promotes best practices, and lives on the front line of analysis when things go wrong. We’re looking for someone with organizational skills, build experience, and a general knack for diagnosing failures, who can be the front line of build failure investigation.

 

Incoming dotnet/core-eng issue triage:

- Monitor newly created issues in our GitHub repo and determine whether each should go to the DDFUN team (create tickets), the DncEng team (issues with components we own), an Azure support ticket, or one of the .NET Core product teams (when the build failure is theirs).
- Apply labels to issues to track the types of problems encountered, and regularly report status on them.
- Coordinate with team members to make sure assignments are accurate.

Build Failure investigation work includes:

- Monitor a specified set of .NET Core team builds (surfaced on https://mc.dot.net), acting whenever a given build fails:
  - Determine root cause(s) of failure using VSTS logs and other artifacts. This may include but is not limited to:
    - Source code history analysis
    - Connecting to build machines and directly investigating them
    - Obtaining a local repro of the failure
  - Ensure GitHub issue(s) are created or routed to the appropriate owners.
  - Associate build failures with their matching GitHub issue in mc.dot.net.
- Recommend areas for investment to increase reliability (e.g. suggest where better logging, more retries, etc. could reduce the incidence of failure).
- Communicate with the individual or team that caused a build failure to ensure that they are working to unblock it (if applicable).
- Regularly communicate a summary of build failure analysis to the DDFUN and DncEng teams.

QUALIFICATIONS:


The minimum set of skills needed to be successful in this role includes:

- MSBuild familiarity: Be able to understand MSBuild output and the project file format well enough to track down the source of an error. Ideally, have at least some experience with both desktop and .NET Core-based MSBuild.
- VSTS familiarity: Be able to use VSTS online to look through build history, agent pools, and other aspects that may help explain a specific build failure.
- GitHub familiarity: Be able to navigate issues in GitHub using the ZenHub Kanban plugin, updating, assigning, and labelling issues as appropriate.
- Operating systems: Be able to use systems running macOS, Windows, and Linux well enough to troubleshoot common file system issues (file ownership, file handles, disk space, etc.) and find system logs. When a problem appears to be related to machine state, you will need to investigate the machines used for builds.
- Scripting languages: Python, PowerShell (both Desktop and Core versions), Bash, and cmd scripts are all used as part of our builds. While you won’t be expected to write in these languages, being able to understand and debug them is essential for understanding build issues.

PREFERRED SKILLS:


These skills are not mandatory but will help:

- Basic use of command line debuggers: Be able to attach a debugger, get a managed call stack, and create a minidump using windbg/cdb on Windows, or lldb and others on *nix systems.
- VSTS YAML-based definitions: Be able to parse the new .yaml build definition format and debug issues.
- Git: Be able to use GitHub or other sites to track down change histories, “blame” specific changes on users, and in general navigate Git as a source control system.