Job Properties
  • Job Type
    Full-time Position
  • Background
  • Languages
  • Experience Required
    1 - 2 Years
  • Degree Required
  • Province
  • Date Posted
    November 20, 2020
  • VISA

    Lead Data Engineer - Enterprise Data Management

    Since we started facilitating bookings in 1996, the amount of data we produce and consume has grown to proportions (~20 TB/day) our founders could hardly have imagined. Over the last decade, open-source data tools (Hive, Spark, Cassandra, and Kafka) running on large internal server parks have enabled hundreds of colleagues working closely with data to build a variety of data products, e.g. in Machine Learning and Analytics. As the community has grown, so has the number of challenges around working with data. The need for flexible compute resources brought cloud platforms into use alongside a heavily utilized on-premise environment. Governments have introduced standards for personal data protection. A growing, physically distributed employee base is less able to share tribal knowledge about data. Hence the establishment of 'Enterprise Data Management', a group that governs the production and consumption of data so that it can be trusted and understood.

    The Lead Data Engineer is a technical leader who drives broad data engineering strategy and delivery across a business area. You will lead solution envisioning, technical design, and hands-on implementation, as well as provide operational support across multiple data domains. You influence, differentiate, and guide the business and technology strategies in your area, as they relate to data, through constant interaction with various teams. You ask the right questions of the right people in order to align data strategy with commercial strategy, demonstrating deep technical expertise and broad business knowledge. You play an active role in identifying data engineering skill gaps within your area and support the development of tools, materials, and training to bridge them.

    Responsibilities

    • Support the data requirements of new and existing solutions by developing scalable and extensible physical data models that can be operationalised within the company’s workflows and infrastructure
    • Drive efficiency by mapping data flows between systems/workflows across the company
    • Ensure standardisation by following patterns in line with data governance requirements
    • Manage and automate the entire life-cycle of data processing systems and platforms
    • Support varied business requirements by building extensible data pipelines spanning different data encodings, protocols and unstructured data across different systems
    • Support real-time internal and customer-facing actions by developing real-time event-based streaming data pipelines
    • Enable rapid data-driven decisions by developing efficient and scalable data ingestion
    • Drive high-value data by connecting disparate datasets from different systems into a well-managed, unified solution
    • Own end-to-end data applications by defining, monitoring and adjusting SLIs and SLOs
    • Handle, mitigate and learn from incidents in a manner that improves the overall system health
    • Ensure accuracy by developing criteria, automation, and processes for data production, transport, transformation, and storage
    • Drive data validity by defining criteria and ongoing validation strategies
    • Ensure ongoing reliability of data pipelines by developing and implementing standards for end-to-end testing
    • Improve failure detection by evolving the maturity of monitoring systems and processes
    • Ensure compliance with data-related requirements by building solutions in line with all applicable standards and regulations
    • Ensure ongoing resilience of data processes by monitoring system performance and proactively identifying bottlenecks, potential risks, and failure points that degrade quality
    • Build software applications by using the relevant development languages and applying in-depth knowledge of the systems, services and tools used by the specific business area
    • Write readable and reusable code by applying standard patterns and standard libraries
    • Choose the right technology by researching and understanding requirements
    • Continuously evolve your craft by keeping up to date with the latest developments in data engineering and related technologies, introducing them to the community and promoting their application in areas where they can generate impact
    • Actively contribute to Data Engineering through training, exploration of new technologies, interviewing, onboarding, and mentoring colleagues
    • Push for improvements by scaling and extending data engineering tooling and infrastructure in collaboration with central teams

    Level of Education

    • Bachelor's degree in Computer Science or a related field
    • Master's degree in Computer Science or a related field

    Relevant Job Knowledge

    • 7+ years of experience in data engineering or a related field using server-side programming languages, preferably Scala, Java, Python, or Perl
    • 5+ years of experience building data pipelines and transformations at scale, with technologies such as Hadoop, Cassandra, Kafka, Spark, HBase, MySQL
    • 3+ years of experience in data modelling
    • 3+ years of experience handling data streaming

    Requirements of special knowledge/skills

    • Strong knowledge of data modelling methods, e.g. Relational, Data Vault, Dimensional
    • Strong understanding of data architecture best practices, e.g. DAMA and TOGAF
    • Strong knowledge of data warehouses
    • Intermediate knowledge of data governance requirements based on best practices, e.g. DAMA, and tooling for continuous automated data governance activities
    • Strong knowledge of data quality requirements and implementation methodologies
    • Excellent English communication skills, both written and verbal