Due to continued business growth, we have a fantastic opportunity for an experienced and talented Hadoop Administrator with strong hands-on experience of the Hadoop ecosystem to join our team.
The ideal candidate will have a passion for the Hadoop ecosystem. This individual will have a proven track record of working with leading-edge data technologies and be a dependable, hard-working, and creative problem-solver.
Our business involves managing data platforms that are architected to store large volumes of data, to be highly available, and to be deployed across distributed infrastructure. We provide expertise in technologies such as Hadoop, Apache Kafka, Apache Cassandra, Apache Spark, PostgreSQL, Elasticsearch, Kubernetes, AWS, Google Cloud and a variety of complementary technologies.
Our engineering approach is to automate and instrument all aspects of deploying and managing these systems to provide a 24x7, always-on deployment. Although the primary skill for this role is Hadoop, there will be the opportunity to learn and work with a wide variety of other technologies and platforms.
Automation and configuration management are critical to how we work and engineer our systems, and this role will require developing and enhancing our automation on a daily basis. It’s great if you have this experience already, but if you don’t, we will help you learn it.
This role is remote, based in Europe, and involves working with a team that is distributed across Europe. The candidate must be comfortable working remotely and communicating over instant messaging (Slack, etc.) and video calls.
Responsibilities
• Responsible for implementation and ongoing administration and configuration of the Hadoop data platform and Linux environments.
• Responsible for maintaining, upgrading and managing high availability for all components in the Hadoop stack.
• Responsible for analysing, monitoring and tuning Hadoop services: Spark, Hive/LLAP, Kafka, Ranger, HDFS, etc.
• Administer, upgrade and patch the PostgreSQL/MySQL/MariaDB databases used for Hadoop services.
• Responsible for Hadoop user management (new accounts, permissions, access).
• Responsible for developing new Bash and/or Python scripts to monitor and manage the Hadoop cluster.
• Responsible for reviewing performance stats and query execution/explain plans and recommending changes for tuning.
• Responsible for implementation of Hadoop security, including Kerberos, Ranger and Knox.
• Responsible for deploying and swapping out SSL certificates across Hadoop services.
• Responsible for backup and recovery tasks, in addition to implementing controls on the Hadoop platform.
• Responsible for resolving open support tickets in a timely manner.
• Responsible for interacting with internal teams and working with vendors as needed.
• Responsible for providing documentation and ongoing training to the engineering teams for the Hadoop platform.
• Responsible for contributing to the production support role via an on-call rotation.
Mandatory Skills
• Bachelor’s degree in Computer Science, or the equivalent through a combination of education and related work experience in Computer Engineering, Computer Science, Data Sciences, or a related science or engineering field.
• Experience managing and maintaining Hadoop clusters, in particular Hortonworks Data Platform, Hortonworks Data Flow or Cloudera Data Platform.
• Experience with managing and tuning CentOS or Red Hat operating systems.
• Experience in advanced relational and Big Data database systems including Hive/LLAP, MySQL, MariaDB, PostgreSQL, Oracle, etc.
• Experience applying security across the Hadoop stack: Kerberos, SSL, Knox, etc.
• Experience with Ranger, creating policies and troubleshooting authorization issues.
• Previous consultancy experience is a big plus.
• Experience with deploying Hadoop monitoring: Grafana, Prometheus, etc.
• Knowledgeable about integration of third-party client tools to connect to the Hadoop platform.
• Experience with programming and scripting languages including Python, Bash and sed/awk.
• Must be very comfortable with reading and writing Python and Java code.
• Strong development/automation skills. Familiarity with Ansible, RunDeck, Ansible Tower and Jenkins.
• Demonstrated research, analytical, critical thinking, decision-making, consulting and problem-solving skills.
• Ability to work with limited direct supervision.
• Ability to maintain effective working relationships with all functional units of the organization.
• Excellent written, verbal and presentation skills.
• Excellent interpersonal skills.
Desirable Skills
• Experience managing commodity hardware as well as cloud-based infrastructure such as Amazon Web Services.
• Experience with virtualization tools like VMware ESXi and KVM.
• Experience with non-relational data technologies such as MongoDB, Elasticsearch, Redis, Cassandra, DynamoDB, etc.
• Previous experience with IBM Db2 Big SQL.
• Experience with container technologies such as Docker and Kubernetes.
• Familiarity with data visualization tools such as DBeaver, Jupyter, etc.
• Experience researching and driving the adoption of new technology platforms and proofs of concept.
• Hadoop certification is a huge plus.
• The job entails sitting and working at a computer for extended periods of time. You should be able to communicate by telephone, video conferencing, email or face to face. Travel may be required as per the job requirements.
• Experience with Hadoop patches and upgrades and with troubleshooting Hadoop job failures.
• Experience resolving issues by working with dependent and support teams according to priority.