Careers in Crypto and Blockchain

Multicoin Capital
Director Production Support Engineering

Bakkt

Customer Service
Alpharetta, GA, USA
Posted on Tuesday, April 2, 2024
About Us

Founded in 2018, Bakkt builds technology that connects commerce.

Our vision is to connect the digital economy by offering one ecosystem for cryptocurrency and digital assets, loyalty, and commerce. We enable our partners and clients to deliver new opportunities to their customers through SaaS and API solutions that unlock crypto and drive loyalty, powering engagement and performance.

Come build with us.

As the Director of Production Support Engineering, you will be responsible for hands-on management and monitoring of our production environments, swiftly addressing issues and applying creative solutions to ensure the seamless operation of our Loyalty platforms. You will use your natural curiosity, people management skills, and strong problem-solving skills to investigate and resolve technical issues across our applications, services, databases, and infrastructure.

Responsibilities

  • Observability:
      • Implement and manage robust monitoring systems to continuously track the functional and non-functional health and performance of our production systems.
      • Proactively identify anomalies and potential issues before they impact our clients.
  • Client Support:
      • Partner with software engineering, project management, and customer success teams to respond to client requests and support inquiries.
      • Work closely with our clients to provide support during integration and ensure a positive experience.
  • Incident Management:
      • Lead escalation remediation by working across multiple teams, such as software engineering, DevOps, and project management, for web applications and services running in a 24/7, always-on cloud platform environment.
      • Manage an on-call coverage schedule to address and resolve critical incidents outside of regular business hours.
  • Operations:
      • Execute and develop operational procedures necessary for service requests and incident response.
      • Ensure documentation of incident root causes, including permanent mitigations and knowledge base updates.
      • Maintain critical platform support knowledge, such as customer contact lists, vendor escalation procedures, scheduled job inventories, and operational playbooks.
      • Support planning and execution of production changes and software releases.
  • Automation:
      • Develop scripts and tools to automate repetitive tasks, streamline workflows, and improve the efficiency of the production support process.
      • Assist in the automation of customer operational tasks and ensure alignment with business requirements for customer-facing processes such as customer order reconciliation.
      • Ensure timely execution of scheduled and repeatable processes such as periodic system validations, daily triage, system monitoring, and event log management.
  • Continuous Improvement:
      • Actively drive process improvement initiatives, implementing enhancements to observability, logging strategies, incident response procedures, and support workflows.

Requirements

  • A bachelor’s degree in Computer Science, Information Technology or equivalent
  • 10+ years of application support and production support experience supporting cloud-based platforms using an SRE support model.
  • 5+ years of people leadership experience in a direct manager role or equivalent
  • Proven track record in a production support/SRE role, demonstrating your ability to monitor and troubleshoot complex systems in highly available production environments.
  • Clear ability to bring order and focus to high-pressure incident resolution across multiple stakeholder teams, experts and platforms
  • Experience with GCP, Google Kubernetes Engine, Google Compute Engine
  • Experience with common development tools and practices, including Java-based Spring Boot environments and source control tools such as Git in a team environment
  • Demonstrated ability to understand application logs and to support various monitoring and visualization tools (e.g., AlertSite, Logstash, Datadog)
  • Excellent communication skills, both written and verbal, for effective interaction with technical and non-technical stakeholders.
  • Self-starter who can work independently and effectively across functional team environments.
  • Proven ability to learn new IT technologies and disciplines.

Preferred

  • Experience with n-tier web and services application architectures and with Java-based Spring Boot and Tomcat environments
  • Ability to read and interpret Java, Angular, SQL and other software coding languages
  • Working knowledge of SQL Server
  • Experience with Jira or other service desk tools
  • Experience with multiple OS platforms (Linux, Windows)
  • Experience with MongoDB and a scripting language such as Python

Bakkt is devoted to having diversity in its workforce and is proud to be an equal opportunity employer. Bakkt does not make any employment decisions based on race, color, religion, sex, national origin, veteran status, disability, age, sexual orientation, gender identity, or any other characteristic protected by law.