Job Title: Solutions Integration Engineer / HPC Systems Engineer
Location: Houston TX 77042
Duration: 6+ Months
**This is an onsite position; no hybrid opportunity.**
Summary:
• As a Solutions Integration Engineer, your mission is to support the Geo units, internal users, and solutions teams in ensuring that the proper cloud solution in DELFI is deployed for the customer (internal or external) and that adoption is addressed. The engineer provides HQ technical support to the field locations to position and deploy the various solutions.
• Develop the necessary Solution Design capabilities and competencies for ensuring successful deployment of cloud solutions.
• Continuous knowledge capture on best practices.
• Technology watch: keep track of technology trends to improve our Cloud designs and deployments, and share knowledge with Solution teams and Engineering.
• Ensure successful client onboarding to their DELFI environment.
• Manage Infrastructure and Image Customizations in line with Cloud Adoption processes.
• Engage with Geo units across the sales cycle to provide technical advisory (client presentations, workshops, deployment assessment, solution design) and contribute to the technical proposal.
• Find, diagnose, and troubleshoot complex issues with web applications, backend services, and IaaS deployments across Azure, Google, and other Cloud Service Providers.
• Aid in the translation of technical issues and complex problems into non-technical summaries for management up to and including president level.
Specific Work Requirements:
• A minimum of 5 years’ experience working in a large HPC enterprise environment comprising thousands of servers, large storage solutions, tape and tape automation.
• Proficient in the installation, configuration, and management of Linux-based operating systems, preferably RHEL, CentOS, or Rocky Linux.
• Experience with IBM’s xCAT distributed computing management software.
• Experience with installation and maintenance of computer hardware, including servers, tape drives, robotic tape libraries, GPGPUs, SSDs, and disk arrays.
• Experience with containerization.
• Knowledge of networking and data center technologies, including switching, routing, high availability, LAN / WAN / WLAN topologies, and system configuration for Ethernet, InfiniBand, and Fibre Channel SAN.
• Experience with HPC Storage Solutions, for example configuration and operation of HPE ClusterStor systems, NetApp, Dell Isilon, and Pure Storage.
• Ability to write and troubleshoot Bourne, Bash, and C shell scripts, as well as Perl, Python, Ruby, and MRTG scripts.
• Experience with PostgreSQL and database installation and support.
• Experience with Google Cloud Platform and Azure public clouds, including the ability to provision and manage instances, build images, and write installation scripts.
• Experience with configuration tools like Ansible and Terraform.
• Experience with backup and recovery tools such as IBM Spectrum and Dell NetWorker.
• Good knowledge of Linux security, including configuration of endpoint security tools.
• Ability to evaluate HPC system environments and make recommendations for improvement in performance and manageability.
• Ability to investigate, debug and diagnose system level issues.
General Work Requirements:
• Conform to local change management philosophies, including full testing on non-production systems prior to production deployment.
• Effectively communicate all change activities to all affected parties, including a clear description of the change, related service outages, and possible effects on the different environments we support.
• Ensure IT deployment standards are maintained, with verification through reporting systems.
• Meet KPO requirements for InTouch support processing, including full documentation of problem resolution, creation of knowledge content and best practice items.
• Show a good understanding of computer equipment and its care and maintenance.
• Work with other internal support groups (systems, networking, programming, desktop support, computer operations, and facilities) as required to complete administration functions.
• Work with a variety of vendors in technical environments and in the reporting and investigation of system problems.
• Provide a written weekly status report to the team manager and be prepared to present and discuss this with the team at a weekly status meeting.
• Be prepared to work outside of normal hours, as system maintenance must often be performed outside of prime time; provide 24/7 support to computer operations; and work with other remote support locations, for example Kuala Lumpur, to back follow-the-sun support.
• Participate in the support on-call schedule, in weekend power outages (normally two per year), and in emergency data center activities.
• Peer-review all major projects as part of the normal deployment philosophy.
• Ensure compliance with all quality assurance, best practice procedures and QHSE requirements, as defined by job position.
Personal Traits:
• Self-motivated, able to work with minimum direction.
• Able to work as part of a team, either in small groups or as part of the Data Center support team as a whole, in either a lead or a reporting role.
• Able to demonstrate good written, phone, and face-to-face communication skills when working with a peer group, internal and external customers, and vendors.
• Adhere to industry-standard systems administration techniques and procedures.
• Document standard user and operational requirements.
• Willingness to train others.
Beneficial Experience:
• Experience with client software, including Omega2 and Petrel.
• Experience with Omega2 infrastructure software such as OPM, RDM, OSM, OCM, WHSM, MMS, etc.
• Experience with other client software such as ECL, IX, Visage, and PetroMod.
Qualifications:
• Bachelor's degree in computer science from a four-year college, or 5-10 years of equivalent work experience, plus current industry-recognized training and certification, for example from Cisco, Red Hat, or Microsoft.