Engineering - DXR Engineering - Systems Engineer - Associate - Dallas
Goldman Sachs
The Procmon Platform delivers a highly scalable and reliable ecosystem for scheduling business-critical jobs across Goldman Sachs.
Our platform is responsible for scheduling tens of millions of daily jobs for Global Banking & Markets, Asset & Wealth Management, Risk and other business and engineering functions.
The ecosystem includes a number of high availability, very large scale systems including:
- Job scheduling
- Event streaming
- Log shipping
- Data warehouses
- Security infrastructure
RESPONSIBILITIES
- Own technical operations for systems that manage hundreds of thousands of compute cores
- Build observability for new deployments to ensure robustness from day one, as well as for mature deployments to identify and implement improvements
- Troubleshoot and resolve issues with block devices, file descriptors, and packet loss
- Lead real-time outage investigations and present postmortems to senior management
- Define SLIs and SLOs and partner with development teams to ensure systems are sufficiently well designed and instrumented
- Partner with our development team throughout development and operations
- Plan and manage deployments and migrations (including end-of-life programs)
- Plan and implement robust business continuity and security programs
- Provide regional coverage for the Procmon platform and participate in on-call support
REQUIREMENTS
- Excellent problem-solving and automation skills
- Strong Linux fundamentals and system administration skills
- Good networking fundamentals (familiarity with TCP/IP, IP routing, firewalls, secure tunneling protocols)
- Experience working with distributed computing systems and Cloud computing environments
- Proficiency in at least one programming language; the team uses a mix of Go, Python and Erlang
- Able to operate effectively in a mission critical, highly regulated financial services environment