Tech Lead Manager, Site Reliability Engineer, Product - USDS

The USDS TikTok Product Engineering SRE team works with engineering and product teams to build, maintain, and run large-scale, globally distributed, observable, fault-tolerant systems. SREs on this team take ownership of production and are responsible for observability and automation across complex, large-scale service mesh architectures.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager or department. We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities:
- Provide technical leadership and mentorship to a team of Site Reliability Engineers focused on building observable, fault-tolerant systems
- Drive architectural decisions for large-scale, globally distributed service mesh architectures
- Establish and maintain production ownership models, incident response protocols, and service level objectives
- Develop strategic roadmaps for observability and automation initiatives that enhance system reliability
- Balance technical contributions with people management responsibilities, including career development, performance evaluations, and team growth
- Foster a culture of reliability, continuous improvement, and knowledge sharing within your team and across the organization
- Lead security initiatives to safeguard critical assets, partnering with security and compliance teams to implement robust protocols that ensure data protection and regulatory compliance across all services