Data Engineer

Petrofac | Chennai, TN, IN


Description

Petrofac is a leading international service provider to the energy industry, with a diverse client portfolio including many of the world’s leading energy companies.

We design, build, manage and maintain infrastructure for our clients. We recruit, reward, and develop our people based on merit regardless of race, nationality, religion, gender, age, sexual orientation, marital status or disability. We value our people and treat everyone who works for or with Petrofac fairly and without discrimination.

The world is re-thinking its energy supply and energy security needs, planning for a phased transition to alternative energy sources. We are here to help our clients meet these evolving energy needs.

This is an exciting time to join us on this journey.

We support flexible working requests and have adopted a hybrid approach for most of our office-based roles. We ask employees to be present in the office at least three days per week.

Are you ready to bring the right energy to Petrofac and help us deliver a better future for everyone?

Job Title: Data Engineer

Key Responsibilities:

Architect and define data flows for big data and data lake use cases (a purely illustrative sketch follows this list).
Apply the full life cycle of data management principles, including data governance, architecture, modelling, storage, security, master data, and quality.
Collaborate with analytics and business stakeholders to improve the data models that feed BI tools, increasing data accessibility and fostering data-driven decision-making across the organization.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
Estimate cluster and core sizing, and monitor and troubleshoot the Databricks cluster and analysis server to provide optimal capacity for data ingestion workloads.
Lead master data cleansing and improvement efforts, including automated, cost-effective solutions for processing, cleansing, and verifying the integrity of the data used for analysis.
Secure the big data environment through encryption, tunnelling, access control, and secure isolation.
Guide and build highly efficient OLAP cubes using data modelling techniques that cover all the required business cases and mitigate the limitations of Power BI through Analysis Services.
Deploy and maintain highly efficient CI/CD DevOps pipelines across multiple environments, such as development, staging, and production.
Follow a Scrum-based agile development approach, working from allocated stories.
The above is an outline of key duties and accountabilities, rather than an exclusive or exhaustive list of responsibilities. The post holder is expected to undertake any tasks which may reasonably be expected within the scope of the position.
The post holder is also expected to adhere to the 9 Life-Saving Rules and the Petrofac Values and Behaviours.
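
For illustration only (this is not part of the formal job description): a minimal PySpark sketch, in the spirit of the data-flow responsibilities above, that lands a raw CSV drop as partitioned Parquet in a data lake. The storage account, paths, and column names are all hypothetical.

```python
# Purely illustrative PySpark ingestion flow: raw CSV -> partitioned Parquet
# in an Azure Data Lake. Storage account, paths, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("lake-ingestion").getOrCreate()

# Read a raw batch drop from the landing zone.
raw = (
    spark.read
    .option("header", "true")
    .csv("abfss://landing@examplelake.dfs.core.windows.net/events/")
)

# Light standardisation before the data reaches the curated zone.
cleaned = (
    raw.dropDuplicates(["event_id"])
       .withColumn("ingest_date", F.current_date())
)

# Write partitioned Parquet for efficient downstream reads.
(
    cleaned.write
    .mode("append")
    .partitionBy("ingest_date")
    .parquet("abfss://curated@examplelake.dfs.core.windows.net/events/")
)
```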

Essential Qualifications and Skills:

Bachelor’s degree (master’s preferred) in Computer Science, Engineering, or another technology-related field.
5+ years of experience with data analytics platforms, with hands-on experience in ETL and ELT transformations and strong SQL programming knowledge.
5+ years of hands-on experience in big data engineering, distributed storage, and processing massive data volumes into a data lake using Scala or Python.
Proficient knowledge of the Hadoop and Spark ecosystems, including HDFS, Hive, Sqoop, Oozie, Spark Core, and Spark Streaming.
Experience with programming languages such as Scala, Java, and Python, as well as shell scripting.
Proven experience pulling data through REST APIs, OData, XML, and web services.
Experience with Azure product offerings and the Azure data platform.
Experience in data modelling (data marts, snowflake/star schemas, normalization, SCD Type 2); an illustrative SCD Type 2 sketch follows this list.
Strong project management skills and the ability to balance multiple priorities without losing momentum.
Ability to architect and define data flows and to build highly efficient, scalable data pipelines.
Strong decision-making, troubleshooting, and problem-solving skills for resolving issues that block business progress.
Ability to coordinate with multiple business stakeholders to understand requirements and deliver against them.
Ability to understand the physical and logical execution plans and to optimize data pipeline performance (see the execution-plan example after this list).
Extensive background in data mining and statistical analysis.
Able to understand various data structures and common methods of data transformation.
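
For illustration only: a minimal SCD Type 2 sketch in PySpark against a hypothetical dim_customer Delta table. It assumes a Databricks/Delta Lake environment and that the staging feed contains only new or changed customer records; all table, path, and column names are made up for the example.

```python
# Illustrative SCD Type 2 upsert for a hypothetical dim_customer Delta table.
# Assumes Databricks/Delta Lake, and that the staging feed holds only new or
# changed customer records; all names here are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

updates = spark.read.parquet("/lake/staging/customer_updates/")
dim = DeltaTable.forName(spark, "dim_customer")

# Step 1: close out the current version of each incoming customer.
(
    dim.alias("d")
    .merge(updates.alias("u"),
           "d.customer_id = u.customer_id AND d.is_current = true")
    .whenMatchedUpdate(set={
        "is_current": "false",
        "valid_to": "current_date()",
    })
    .execute()
)

# Step 2: append the incoming records as the new current versions.
new_rows = (
    updates.withColumn("is_current", F.lit(True))
           .withColumn("valid_from", F.current_date())
           .withColumn("valid_to", F.lit(None).cast("date"))
)
new_rows.write.format("delta").mode("append").saveAsTable("dim_customer")
```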
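
Likewise for execution plans: in Spark, the logical and physical plans can be inspected directly with `explain`, which is a routine first step when tuning a pipeline. The datasets in this sketch are hypothetical.

```python
# Inspecting Spark's logical and physical plans for a hypothetical join,
# a routine first step when tuning pipeline performance.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("plan-inspection").getOrCreate()

orders = spark.read.parquet("/lake/curated/orders/")        # hypothetical path
customers = spark.read.parquet("/lake/curated/customers/")  # hypothetical path

joined = orders.join(customers, "customer_id")

# mode="extended" prints the parsed, analyzed, and optimized logical plans
# along with the physical plan, including the chosen join strategy.
joined.explain(mode="extended")
```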