Job Description
Job Purpose
The Data Integration Engineer is responsible for:
1. Designing and optimizing KRA’s data warehousing and big data platforms.
2. Leveraging appropriate technologies to deliver robust processes for real-time/near real-time data ingestion, job automation and model deployment.
3. Creating reproducible, lean engineering processes that improve the memory and storage efficiency of data management solutions.
4. Proactively building and implementing services, including end-to-end monitoring of data pipelines and platforms, and scripting and automation of data lifecycle and quality processes.
5. Resolving recurrent incidents escalated by support teams around data analytics solutions.
6. Coordinating and supervising assigned development teams.
7. Developing or enhancing existing data services in line with procedures and standards.
Key Responsibilities
1. Identifying different sources of data and building a roadmap for real-time / near real-time data collection.
2. Responsible for data integration into the Enterprise Data Warehouse and Big Data Platforms across projects, and for supporting business teams in data quality automation.
3. Responsible for planning, research, design and implementation of new data analytics platforms and technologies to address the organization’s analytics demands, including big data platforms.
4. Responsible for automating big data lifecycle management, big data storage systems, data security and data governance.
5. Responsible for design and implementation of processes to ensure data reliability, efficiency, quality, and continuous improvement.
6. Responsible for eliminating tool redundancy and ensuring timely data availability.
7. Responsible for performing analytics infrastructure sizing in projects, based on requirements and design.
8. Responsible for creating data pipelines using both proprietary and emerging technologies (such as Apache NiFi and Apache Kafka, among others).
9. Identification of the correct analytics technology stacks to use as per project requirements.
10. Taking technical responsibility for working with the business to identify, develop, pilot and scale ML and AI use cases.
11. Working closely with BI support and application support teams to ensure that all big data applications and pipelines are highly available and perform as expected.
12. Reviewing design and architecture to guarantee service availability, performance and resilience.
13. Reviewing application development tasks allocated to supervised staff to ensure that they are accomplished within the set requirements and meet the highest standards of quality.
14. Ensuring that solutions built comply with quality assurance (including fixing of functional and non-functional issues) and release guidelines; and have the requisite documentation.
15. Planning demos of delivered solutions/enhancements to gather stakeholder feedback and support adoption.
16. Reviewing analytics domain coding standards and recommending/implementing improvements.
Academic qualifications
• Bachelor’s degree in Computer Science, Information Systems, Information Technology, Analytics or other related fields from a recognized university.
Professional Qualifications/Certifications
1. Data Warehousing Solutions design, setup and optimization.
2. Big Data Platforms design, setup and optimization.
3. Structured and Unstructured Database Systems.
4. ETL / ELT Jobs design, optimization and tooling.
5. Machine Learning and AI (an added advantage).
6. Data Architecture and Design (an added advantage).
Work experience required
1. Four (4) years of hands-on experience, of which one (1) should be at supervisory level, working with Java, plus experience in Python, R, SQL and Scala in the analytics field, within a busy environment processing large, high-velocity data sets.
2. Proven experience in design, development and implementation of big data processing architectures and data ingestion techniques.
3. Demonstrated experience in big data querying techniques and tools.
4. Working experience in solutions development and delivery using agile frameworks will be an added advantage.
Functional and Technical Skills
1. Hands-on experience in implementing, managing, monitoring and administering overall Hadoop infrastructures as well as development and monitoring of Hadoop jobs.
2. Hands-on experience in structured and unstructured databases administration and development.
3. Hands-on experience supporting installation and code deployments into Hadoop clusters.
4. Experience in sizing & capacity planning of data platforms as per data requirements.
5. Hands-on experience monitoring and reporting on Hadoop resource utilization and troubleshooting.
6. Hands-on experience performing data backup and recovery tasks.
7. Experience in data lifecycle management (including data retention and purging strategies).
8. Experience in ETL, big data jobs, data streams processing and database performance optimization.
9. Hands-on experience in the installation and performance maintenance of Apache NiFi, Apache Kafka, Apache Airflow and MLflow.
10. Experience with ML/AI model deployments will be an added advantage.
11. DevOps and infrastructure automation experience (containerization technologies, Ansible, etc.) will be an added advantage.
12. Working knowledge of Linux/Unix and Windows operating system platforms.
Behaviours and Competencies
1. Demonstrated analytical, technical and problem-solving skills, with high levels of creativity and a practical approach that is self- and principle-driven.
2. Ability to balance the long-term and short-term implications of individual decisions, and to drive short-term actions that are consistent with long-term goals.
3. Ability to interact confidently with users to establish real problems and explain the solutions while prioritizing competing work commitments and delivering on time.
4. Excellent written and verbal communications skills, able to distil complex technical concepts into simple terms, with strong persuasion skills to gain support for and establish principles, standards, and change.
5. Excellent relationship-building, teamwork and collaboration skills that enable the provision of effective support and guidance across programs, as well as negotiation.
6. Vendor- and technology-neutral: driven primarily by long-term business outcomes rather than personal preferences.
7. Be resilient, focused, results oriented and a team player.