Description

As a Principal Data Scientist, you will build and model digital products and platforms that bring Amgen’s AI/ML and GenAI solutions to life.

In this role, you will collaborate with Amgen’s Technical Architects, Product Managers, UX designers, and back-end engineers to design secure, scalable, and user-centric products that accelerate discovery, manufacturing, commercial analytics, and corporate functions.

**Key Responsibilities**

+ Strong experience with statistics and machine learning, including deep learning and natural language processing (NLP), as well as experience building cloud-scale systems and working with open-source data stacks

+ Experiment with large language models (LLMs), artificial intelligence (AI) for code or related fields, generative AI, foundation models, and supervised and unsupervised learning

+ Collaborate with cross-functional teams, including Data Architects, business SMEs, and Technical Architects, to understand requirements and to design and develop solutions that meet fast-paced business needs

+ Explore new tools and technologies that enable rapid development of solutions

+ Participate in sprint planning meetings and provide estimates for technical implementation

+ Embed responsible-AI and security-by-design controls

**Required Qualifications**

+ 6–12 years of experience in Data Science (e.g., managing structured and unstructured data, applying statistical techniques, and reporting results)

+ Doctorate in Computer Science, Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, or a related field

+ Experience leveraging cloud platforms (AWS preferred) to build scalable and efficient solutions

+ Strong background in Deep Learning, Machine Learning, NLP, and Data Mining.

+ Excellent communication and stakeholder management skills.

**Preferred Skills**

+ Experience in Generative AI, Foundation Models, LLMs, Feature Engineering (Selection and Extraction), BI and Automation, Predictive Modelling, Data Visualization, CNNs, RNNs, GNNs, Transformers, and Exploratory Data Analysis

+ Familiarity with Python packages (PyTorch, TensorFlow, Hugging Face, scikit-learn, Pandas, NumPy, Matplotlib, Cloud Vision API, RAG, TensorBoard, OpenCV, NLTK) and programming languages (C, C++, Java, CUDA, SQL, NoSQL, PHP, HTML, JS, CSS, etc.)

+ Prior exposure to pharma / life sciences AI environments.

+ Strong understanding of Responsible AI and model validation principles.