Site Reliability Engineer Graduate (TikTok Product - USDS) - 2025 Start (BS/MS)

About the Team
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. Product SREs help ensure the reliability and uptime of the services underpinning the TikTok product. Our team focuses on optimizing existing systems, working closely with cross-functional teams, and eliminating toil through automation.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

In the USDS TikTok Product Engineering SRE team, you'll have the opportunity to manage complex challenges at scale, while utilizing your knowledge in coding, algorithms, troubleshooting, complexity analysis and large-scale, distributed system design. We embrace a culture of diversity, intellectual curiosity, openness and problem solving. We encourage close collaboration while promoting self-direction.

What You'll Do:
- Gain a solid understanding of the various components and services that power the TikTok experience
- Utilize computer science fundamentals to troubleshoot and resolve real-world problems across our globally distributed, microservice-based architecture, which handles petabytes of data
- Help maintain services to meet service-level agreements (SLAs) or service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health
- Work with teams across different pillars to continuously find new and creative ways of operating at scale via software solutions and tooling
- Utilize modern software development best practices to eliminate toil, automate production updates, and build robust services to improve observability (metrics, traces, logs)
- Participate as part of a global team supporting site-up issues to ensure we deliver best-of-breed services for TikTok's business