Over a million designers use Sketch to transform their ideas into incredible products, every day. Would you like to join us and help take the infrastructure that supports this leading design tool to the next level? We're looking to expand our team with a full-time Site Reliability Engineer.
At Sketch, we work with a unique technology blend: a cloud platform and macOS and iOS applications. Our cloud stack is based on a mix of serverless and traditional server applications built on Elixir and Go, along with other cloud services like RDS PostgreSQL, S3, SQS, ... Most pieces are deployed on AWS ECS and automated through Terraform; we use Chef for configuration management where it's needed. Our SRE team usually codes in Python whenever we need to write a small program or script.
As a Site Reliability Engineer at Sketch, you will focus on shaping our cloud infrastructure and making sure all the pieces work well together: development environments, metrics processing and observability, security policies, network design, deployment strategies, high availability, etc. You will work closely with backend, frontend, and Mac developers and with product managers to guarantee smooth, product-focused engineering processes.
As an example of a complex project we have worked on, we recently migrated our production database from MariaDB to PostgreSQL. We used streaming replication to minimise potential downtime and to have replicated environments in which to adapt and test our backend APIs properly.
We are looking for someone who has experience with different stacks (mainly Linux-based), technologies, and production models, and who has actively participated in building essential pieces of a cloud platform.
You know how to conduct a technical operation that potentially affects users, and you can also code small applications and scripts to automate the platform and debug problems in other people's code.
You care about security, code quality, scalability, performance, and simplicity. Above all, you seek operational excellence and apply the best engineering practices possible. Not everything that you or your team do can be perfect, but you make sure that you always know the trade-offs. You back your decisions with arguments. You don't care for hype and always try to find the best solution and technology for the job and its context.
Sketch is a 100% remote company, and your colleagues are distributed around the globe. Being remote adds great flexibility, and helps us build a more diverse team. We put respect for each other above everything else.
Besides being remote, we work asynchronously as often as we can. This means that our team communicates mostly using Slack and GitHub. When we need to, we also have video calls.
Our Technology team has more than 50 people today, split between Mac, Backend, Frontend, Infrastructure and QA. In particular, the Infrastructure team has 6 members. We work in multidisciplinary squads: people from different roles, including members of the Product team, work together on solving problems and delivering functionality to our users.
Professional experience managing Linux-based and cloud-native distributed systems
Experience coding with high-level programming languages like Python
Experience with Infrastructure as Code tools such as Terraform, and configuration management tools to automate manual operations
A good understanding of the HTTP protocol and the behavior of production web services
Excellent communication skills and good written and spoken English
You're based in European/African timezones
We care about your well-being and your professional success, so we offer you:
Flexibility to organize your own time, no set hours
As many vacation days as you need
Whatever training you need to develop in your job
A powerful laptop
The option to work anywhere in European/African timezones