The AI2NE Org strives to be a global leader in the RDMA cluster networking domain and to enable seamless, accelerated High-Performance Compute (HPC), Artificial Intelligence, and Machine Learning advancements. We envision a future where artificial intelligence and machine learning revolutionize industries, reshape societies, and unlock limitless possibilities. Our vision is to be a pioneering force, driving the development and design of state-of-the-art RDMA clusters tailored specifically for AI, ML, and HPC workloads.
We strive to be the go-to experts in RDMA cluster architecture, leveraging our deep understanding of the unique demands of AI/ML and HPC applications. By staying at the forefront of technological advancements, we aim to redefine the boundaries of what is possible, pushing the envelope of computational capabilities and unlocking unprecedented performance.
We’re looking for a hands-on leader with strong management experience to help us build new features and grow our team. The role involves leading a team of network development engineers in a fast-paced environment that requires agility and the drive to deliver. The team will be responsible for provisioning, securing, scaling, and operating the network stack required to run distributed AI workloads across a cluster spanning thousands of GPUs. The candidate should be comfortable building complex distributed systems that manage and control hundreds of thousands of network devices.
Career Level - M3