Data Mining Lead

FactSet | Hyderabad, TG, IN

Posted a month ago



FactSet combines hundreds of databases into a single, powerful information system. It is a one-stop source of financial information and analytics that business analysts, portfolio managers, investment bankers, management firms, and other financial professionals use to analyze companies, portfolios, markets, and economies. FactSet was formed in 1978 and operates out of 48 locations worldwide. FactSet, with over $1.1 billion in annual revenues, is headquartered in Norwalk, Connecticut, and employs nearly 10,000 people worldwide. Our operations extend across North America as well as Europe and the Pacific Rim. Since 1996, the Company has been publicly traded on the New York Stock Exchange under the symbol FDS.


This team is primarily responsible for the design and development of the Publisher Service, FactSet’s premier cloud-based reporting platform. This service drives the reporting and batching functionality behind Publisher Manager (PUMA), Portfolio Guide (PG), and Portfolio Reporting Batcher (PRB), and pushes the bounds of what is possible for our clients to design, consume, and interact with their custom-designed reports. One of the top corporate goals is the Portfolio Life Cycle, and this team is directly responsible for its last stage, reporting. We are focused on expanding into the “Digital Strategy” market with portals, components, and interactive data, and the Publisher Service is the heartbeat of that functionality. One of the best parts of this team is that its code base is very stable. Stability is the key to a balanced work life, and while this team is on-call, the curator rarely gets a page. You will be joining a team that is growing its functionality.


We are unified by the spirit of going above and beyond for our clients and each other. We look to foster a globally inclusive culture, enabling our people to be themselves at work and to join in, be heard, contribute, and grow. We continually seek to expand our workforce with diverse perspectives, backgrounds, and experiences. We recognize that our best ideas can come from anyone, anywhere, at any time and help us provide the best solutions for our clients around the globe.

Our inclusive work environment maximizes our diversity values, engagement, and productivity, and ultimately makes FactSet a fun place to work.


FactSet is looking for a leader for the Data Mining team to increase the company's ability to automate across a wide variety of data sources. The lead will collaborate with the Data Lake team and contribute to building the pipeline by enhancing or enriching the data using machine learning techniques.

Manage a science agenda that balances short-term deliverables that have measurable business impact with long-term projects. Proficient in statistical analysis, standard machine learning techniques, and ML model deployment engineering best practices.

Work with data scientists, engineers, and cross-functional teams to produce end-to-end production-ready solutions

Drive a culture of quality, performance, scalability, and reliability


Total Experience: 7 to 9 years

Must have

Computer skills

Practice of programming in Python

Comfortable working in Linux and Windows environments

Practice of source control, code review, testing frameworks

Practice of Big Data frameworks, like Hadoop, Spark

Knowledge of both SQL and NoSQL databases (column, document, key-value)


Basic knowledge of cloud environments, including their constraints and opportunities


Basic knowledge of statistics: probability, correlation, regression, linear algebra, stochastic processes

Practice of data analysis, data cleanup, and data investigation, including how to detect and handle imbalance, bias, and noise

Proficient in data structures and algorithms, in particular: lists, queues, trees, and graphs; sorting, searching, and traversing; dynamic programming; and the map-reduce pattern

Knowledge and practice of ML and NN algorithms and frameworks

Knowledge and practice of Natural Language Processing


Practice of data mining and data science projects, from understanding the problem to presenting results and deploying in production

Communication skills, project presentation, and the ability to explain techniques and results to a broad audience

Knowledge of Data Mining / Data Science projects and publications

Nice to have

Computer skills

Practice of programming in other languages such as R, Java, Julia, MATLAB

Practice working in Jupyter Notebook/JupyterLab or VS Code

Practice working with Git/GitHub


Practice with AWS cloud / SageMaker

Returning from a break

We are here to support you! If you have taken time out of the workforce and are looking to return, we encourage you to apply and chat with our recruiters about our available support to help you relaunch your career.