At Maersk, we have many opportunities to work with data each and every day. In this role as Data Engineering & Architecture Lead on the Maersk Global Data Analytics (GDA) team, your primary responsibility will be to partner with key stakeholders, data product leads, and data scientists to support and enable the continued growth priorities critical to Maersk's end-to-end integrator strategy.
We are looking for a Data Engineering Lead with software engineering chops who will not only build data pipelines but also build the data tools that enable us to take full advantage of this data. In this role, you will get opportunities to expand GDA's impact, enrich the strategic signal of our data, and work more closely with new areas such as Data Lake infrastructure and Applied Science. You are an inquisitive, out-of-the-box thinker who is continually on the lookout for opportunities to solve some of the most interesting data challenges efficiently as we continue to expand and grow into newer markets with newer services and offerings.
You will be responsible for creating a robust, scalable, extensible data architecture that works in unison with the Platform architecture, so that our stakeholders (internal and external) can drive business-critical decisions, identify efficiency opportunities using extensible data models and marts, and serve insights, predictions, and recommendations via consumable APIs. This exciting role will bridge knowledge and experience from data engineering and software engineering, combining the best of both to evaluate ML products at scale. The ideal candidate will have a passion for working in white space and creating impact from the ground up in a fast-paced environment.
· Build conceptual, logical, and physical architecture to high standards by collaborating with multiple stakeholders (e.g. EA, Platform Architecture) in line with defined roadmaps and milestones.
· Proactively drive the engineering excellence and vision for Data Engineering and BI across multiple squads in GDA by defining the processes needed to achieve operational excellence in all data product development and ML engineering, including system reliability, scalability, and extensibility.
· Architect and build high-performance, scalable data lakehouses, data marts, and data models for optimal storage and retrieval, including data quality checks and domain-related data pipelines.
· Partner with leadership, data engineers, program managers, and data scientists to understand data needs from OP and Non-OP priorities and design innovative solutions.
· Contribute to a variety of aspects from low-level data processing runtime & storage, to ML training & inference infrastructure and to knowledge serving subsystems & APIs.
· Communication and leadership experience, with experience initiating and driving projects.
· Build and lead a high-caliber team of engineers, and provide technical guidance and career development.
· BS/MS in Computer Science with 10+ years of architecture and design experience working with cloud or hybrid cloud, focused on Big Data/MPP analytics platforms (e.g. Databricks on Azure Data Lake).
· Strong experience with ML evaluation methodologies applied to large-scale production deployments with strong understanding of algorithms and software design.
· 5+ years of development experience in Python or another modern programming language (Scala, Golang, etc.).
· 5+ years of experience with any workflow management tool (e.g. Azure Data Factory, Airflow, Luigi).
· 3+ years of experience with ML tools (e.g. PyTorch) and logging and debugging tools (e.g. Grafana).
· Solid experience with CI/CD automation tools
· Experience designing and optimizing end-to-end real-time pipelines (SQL performance optimization).
· Experience querying massive datasets using Spark and applying anomaly/outlier detection frameworks.