PRINCIPAL CLOUD ENGINEER

San Francisco | Posted 10 days ago | Full-time | External
Negotiable
The Cloud Center of Excellence (CCoE) is responsible for mission-critical infrastructure used across San Francisco to provide essential public services. The CCoE provides a centrally managed infrastructure service that is offered to 50+ City and County of San Francisco departments. The service offering includes design and architecture as well as daily operational support to ensure optimal performance, security, availability, resiliency, and cost for commercial cloud providers and services. The ideal candidate is excited about bringing their technical expertise to serve San Franciscans through improvement and modernization of IT infrastructure, as well as advocacy of public cloud services.

Position: The City and County of San Francisco, Department of Technology (DT), is seeking a highly experienced Principal Cloud Engineer to help design, develop, and maintain commercial cloud infrastructure serving multiple City and County of San Francisco departments. This position will focus on Disaster Recovery and on increasing the resiliency of City services by leveraging public cloud infrastructure.

Job Responsibilities:
• Act as a cloud team lead.
• Contribute to thought leadership on building cloud infrastructure for resiliency.
• Proactively work with the DT Disaster Recovery (DR) team and city departments to lead the cloud infrastructure development of application and system DR, including data migration.
• Assist leadership in documenting processes and procedures for system recovery during an incident or emergency.
• During any emergency or outage, provide technical leadership, troubleshooting, and system recovery.
• Contribute cloud expertise and knowledge for the development of city commercial cloud services that are optimized for multiple city departments and applications.
• Consult with and advise business partners on best practices, efficiencies, and economies of cloud technologies.
• Architect, build, operate, deploy, and maintain secure, scalable, and highly available commercial cloud infrastructure.
• Migrate business systems and data to commercial cloud infrastructure for production and disaster recovery environments.
• Develop and maintain software solutions/frameworks to automate cloud configuration and administration in commercial cloud platforms.
• Design and implement cloud-native solutions for rapid but reliable feature enhancements and conduct cloud capacity planning.
• Collaborate with city development, applications, operations, and security teams to ensure solutions meet functional and non-functional requirements.
• Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
• Configure and deploy cloud cost management, cloud budgeting, and cloud capabilities at scale.
• Create and implement cloud infrastructure best practices and guidelines to support audits and various compliance requirements.
• Develop and leverage automation tools to ensure consistent, efficient, transparent, and secure operational and fiscal management of commercial cloud environments.
• Enhance and maintain disaster recovery and business continuity plans, including identifying critical systems and designing and implementing backup/restore processes.
• Implement and maintain operations monitoring and alerting systems to proactively identify and resolve issues/outages, and serve as 24x7 lead for escalating problem resolution with cloud providers.
• Create, manage, and maintain Operations Scope of Procedures (SOPs) consisting of standard operating procedures, configurations, lessons learned, root cause analysis, diagnostic steps, and solutions to resolve incidents.
• Research and evaluate industry trends to continuously improve infrastructure solutions.
• Provide 24-hour on-call support to ensure rapid recovery from software or hardware problems for mission-critical systems and networks.

Job Type: This Permanent Exempt - Full Time position is excluded by the Charter from the competitive civil service examination process and shall serve at the discretion of the appointing officer. The anticipated duration of this project position is thirty-six (36) months, and it will not result in an eligible list or permanent civil service hiring.

Nature of Work: Incumbent must be willing to work the schedule determined by the department. Travel within San Francisco may be required. The incumbent must be a resident of the State of California or be willing to relocate within 4 weeks of beginning employment with the City and County of San Francisco.

Work Location: Incumbent will conduct the majority of work at the Department of Technology (1 S Van Ness Ave, San Francisco, CA 94103). However, the incumbent may be required to work at other sites throughout the City of San Francisco as necessary. This position does not support fully remote work. Employees may be permitted to work a hybrid schedule with supervisor approval, in which case they must work at least two days in the office every two weeks.