Craft Ventures Portfolio Job Board

Senior Site Reliability Engineer

ClickUp

This job is no longer accepting applications

Software Engineering

Canada

Posted 6+ months ago

ClickUp is the world's only all-in-one productivity platform that flexes to the way people want to work. It replaces all individual workplace productivity tools with a single, unified platform including project management, document collaboration, spreadsheets, chat, goals, and more. On a mission to make the world more productive, ClickUp is headquartered in San Diego and scaling remotely and internationally. As one of the fastest-growing SaaS companies in the world, ClickUp helps millions of users to be more productive and save at least one day every week. 🦄

We are looking for driven and innovative software engineers with a strong site reliability engineering (SRE) discipline, or an interest in this area, to help us make ClickUp the "one app to rule them all". As an SRE at ClickUp, your primary role will be improving the stability, availability and reliability of the globally distributed, cloud-based infrastructure that powers our app for thousands of users daily. If you are a rockstar engineer with an entrepreneurial, fast-paced mindset who is ready to own, drive and tackle some of the most complex problems out there, we would love to hear from you!

What you'll do:

Build a deep understanding of how ClickUp's systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation
Own, drive and improve the incident management process across the engineering org and participate in the team's follow-the-sun model
Define SLOs and SLIs for all of our services and introduce error budgeting
Own and improve our observability on all of our services
Build software solutions to enable reliability and operability of large scale distributed systems handling petabytes of data and serving
Build tools and automation to eliminate toil and reduce operational overhead. Create frameworks, processes and best practices to be used across ClickUp Engineering
Automate critical portions of ClickUp engineering processes, to minimize risk and maximize the speed of innovation
Manage capacity and performance to help scale our infrastructure on both public and private clouds around the world

What we’re looking for:

Software engineering: At the very core, we are looking for strong software engineers with an operational, infrastructure or SRE mentality who can design and build systems for the platform and infrastructure layers
Cloud experience: Production experience in a major cloud environment, including running CI/CD deployments, using managed services, bootstrapping and provisioning services via infrastructure-as-code (IaC) systems, automation and operations
Infrastructure management: You have worked with and managed production-grade infrastructure using IaC or configuration management tools
Operating systems: Strong knowledge of *nix-based operating systems, their internals and advanced troubleshooting commands
Compute: Experience of working with VMs, containers and container orchestration systems
Database: Experience working with RDBMS and NoSQL storage solutions in production, and you know your way around running and inspecting queries. A good understanding of indexing, locking, replication and sharding is a bonus!
Observability: You have worked with logging, monitoring and alerting tools, and you know how logs are collected, aggregated and ingested. You have set up monitors and alerts for production services and know your way around concepts such as SLOs and SLIs
Bonus points: We believe strong engineers can pick up any technology or tool quickly and hit the ground running, so we avoid listing specific technologies. However, experience with at least one of the technologies in our stack would definitely be a bonus:
- CloudFormation/CDK, ECS, Elastic Beanstalk
- PostgreSQL, DynamoDB, AuroraDB
- TypeScript or any JavaScript-based framework

ClickUp was founded on a culture of hard work, consistent growth, and a desire to break norms. We’re a values-driven company and hire based on ambition, merit, and a willingness to do what it takes to succeed. We don’t care where you’re from, what you look like, or who you’re in a relationship with—we hire the best people for the job, and create an environment that supports employees on their journey to do the most exciting work of their lives! ClickUp is an Equal Opportunity Employer, and qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.
