Timisoara, UK
3 days ago
HPC Application Support Engineer

Eviden is an Atos Group business with annual revenue of circa €5 billion and a global leader in data-driven, trusted and sustainable digital transformation. As a next-generation digital business with worldwide leading positions in digital, cloud, data, advanced computing and security, it brings deep expertise to all industries in more than 53 countries. By uniting unique high-end technologies across the full digital continuum with 57,000 world-class talents, Eviden expands the possibilities of data and technology, now and for generations to come.

 

HPC Application Support Engineer:

 

An Application Support Engineer in a High-Performance Computing (HPC) environment maintains software applications and systems used in computational tasks. Their role involves ensuring applications run efficiently on HPC infrastructures, troubleshooting issues, and providing technical support to key users.

 

This role is critical for leveraging HPC resources to achieve optimal computational performance and support advanced research and development activities.

 

Role Expectations:

 

Maintain the software stack of HPC applications.

Application Deployment & Configuration – Install, configure, and manage software applications on Linux servers.

Roll out automated deployment scripts using tools like Ansible, Chef, or Puppet.

System Monitoring & Maintenance – Monitor application performance, system logs, and resource usage using tools like Nagios, Prometheus, or Grafana.

Incident Management – Investigate and resolve application crashes, slow performance, or connectivity issues by analyzing logs and system behavior.

User Support & Issue Resolution – Assist end-users with application-related issues, permissions, and access requests.

Keep the HPC software stack up to date, including libraries and application dependencies, and diagnose and resolve issues related to software applications and system performance.

Support customers in fine-tuning HPC applications to use HPC resources efficiently, ensuring maximum performance and resource utilization.

Support and maintain technology standards, processes, and policies related to the on-premises and cloud infrastructure in scope.

Produce and maintain appropriate documentation and diagrams describing system setups and overall inventory.

 

Capabilities and Expertise:

 

Strong working knowledge of Linux server operating systems.

Experience with automated configuration management tools such as Ansible or Chef for deployment.

Experience with Linux package managers (yum, apt, rpm, zypper, etc.).

Familiar with application performance monitoring tools like Prometheus, Grafana, or Nagios.

Log analysis and diagnostics: Proficient in reading and interpreting log files (e.g., application logs, system logs, web server logs).

Familiar with authentication and authorization frameworks (e.g., LDAP, OAuth, Kerberos, Active Directory integration).

Knowledge of version control tools such as Git.

Familiar with backup tools (e.g., rsync, tar, Bacula, Amanda) and strategies for application-specific backups.

 

Nice to have:

 

Knowledge of networking basics (DNS, HTTP, TCP/IP, Load Balancers).

Firewall and Security Tools: Familiarity with tools like iptables.

Disaster Recovery: Basic understanding of backup and restore processes for critical applications and services.

 

What we offer: 

Training and Certifications: Access to continuous learning and career development opportunities.

Flexible working environment.

Competitive salary and benefits package.

Reimbursement: Receive a fixed yearly reimbursement amount.

Performance Bonus: Earn an annual performance bonus based on your achievements.

Career Advancement: Explore numerous opportunities for professional growth and career advancement.

Extra Vacation Days: Take advantage of additional vacation days to relax and recharge.

 

 

Let's grow together. 
