Research Associate in Machine Learning for Drug Discovery

University of Sheffield
Sheffield, South Yorkshire, United Kingdom
Job Type:
  • Postdoctoral (Up to 3 years)

We have an exciting opportunity at the NIHR Sheffield Biomedical Research Centre and the Department of Computer Science for someone with a passion for machine learning looking to use their skills in developing transfer learning models to predict drug response using high-dimensional genomic measurements.

You will join the Bioinformatics and Machine Learning group in our large academic department and world-leading research institute in translational medicine. You will also work in a well-connected team that collaborates with pharmaceutical companies, hospitals and research institutes in Canada, the USA and Europe.

We are seeking candidates with demonstrable knowledge of a wide range of machine learning techniques, in particular probabilistic modelling, and practical experience with genomic or dose-response data. You will hold a PhD, or have equivalent experience, in a computational discipline, with a solid background in mathematics/statistics, excellent scientific programming skills and an eagerness to contribute to open source software.

If you are passionate about the practical impact of machine learning on healthcare, then we would love to hear from you.


● Propose and develop novel machine learning concepts and models to allow the creation of a toolkit which enables dose-response prediction, modelling and analysis of genomic datasets across diseases.
● Use best-practice software development methodologies (code repositories, unit testing, etc.) to provide software implementations of high-dimensional pharmacogenomic models, inference procedures and optimal drug screening algorithms.
● Maintain up-to-date knowledge of the relevant literature and organise time to ensure very good knowledge of the background of the research area, particularly machine learning and statistical models for pharmacogenomic datasets in cancer and neurodegenerative diseases.
● Assess and develop pre-processing tools for a variety of data generated by collaborators.
● Apply the prediction models to cancer cell line datasets (e.g. GDSC, DepMap, NCI60).
● Validate the performance of the methods developed with experimental data from collaborators and compare them against current state-of-the-art design and analysis methods.
● Plan own research activities in discussion with supervisor, incorporating issues such as the availability of resources, deadlines, project milestones and overall research aims.
● Attend international project meetings (when possible) and training events, and collaborate and communicate with overseas researchers and other project sites to ensure project progress is maintained.
● Coordinate with other members of the Machine Learning and Bioinformatics groups to ensure objectives are met.
● Write papers for presentation at conferences and publication in journals.
● Write supporting documents to contribute to and support the work of the research groups.
● Continuously monitor and check results. The unpredictability of research means that daily planning needs to accommodate new developments.
● You will make a full and active contribution to the principles of the ‘Sheffield Academic’. These include the achievement of excellence in applied teaching and research, and scholarly pursuits to make a genuine difference in the subject area and to the University’s achievements as a whole. Further information on the underpinning values of the Sheffield Academic can be found at: Sheffield Academic.
● As a member of staff you will be encouraged to make ethical decisions in your role, embedding the University sustainability strategy into your working activities wherever possible.
● Any other duties, commensurate with the grade of the post.


You will be able to demonstrate knowledge of a wide range of machine learning techniques (in particular probabilistic modelling) and practical experience handling genomics data. You will hold a PhD in a quantitative discipline with a solid background in mathematics/statistics.

Additional Information

PI Dr Wang is a computational biologist with nearly 15 years of experience in translational genomics and bioinformatics. He is a Senior Lecturer in Genomic Medicine in the departments of Neuroscience and Computer Science at the University of Sheffield. He has previously identified comparable genomic features in preclinical cancer models that can facilitate transfer learning, and has developed both supervised and unsupervised machine learning approaches for modelling drug response. He leads the Genomics and Bioinformatics theme of the NIHR Sheffield Biomedical Research Centre, which supports short-read (Illumina) and long-read (Oxford Nanopore) sequencing capabilities for validating biomarkers. Dr Wang has helped the University of Sheffield attract more than £1.8M of funding for translational research in the past 3.5 years. Having worked in the technology, pharmaceutical and healthcare sectors, he has managed multi-disciplinary projects with clinicians, biologists and computer scientists.

Co-I Dr Mauricio Álvarez is a Senior Lecturer in the Machine Learning (ML) group at the Department of Computer Science. Dr Álvarez’s main research interests are the development of new probabilistic models and their application in different engineering and scientific areas. He is internationally known for his work on multi-task Gaussian processes. Dr Álvarez has served as an Area Chair and Senior Programme Committee member for conferences such as NeurIPS, UAI and ICLR. He is also an Editor of the Statistics & Computing journal. He has co-organised several international workshops on Gaussian processes (Vancouver, 2009; Manchester, 2009; Long Beach, California, 2017; Sheffield, 2017 to 2020). He will advise on which multitask and transfer learning methods to use and on developing new models that can cope with the characteristics of the datasets provided by the collaborators.

Co-I Dr Ferraiuolo is a Reader in Translational Neurobiology at the University of Sheffield, and the work of her research group has demonstrated that oligodendrocytes and astrocytes play an active role in neuronal degeneration. She is currently investigating gefitinib and nilotinib, two inhibitors commonly used in cancer that were found to be effective in motor neurone models, and collaborates with BenevolentAI to develop these drugs for motor neurone disease (MND). In this project, she will provide expertise in neuro-disease biology to optimise ML models for MND and validate our predicted genetic markers of response using her in vitro cell models of MND. Also relevant to this project, she manages the onsite high-content screening and imaging facility (Opera Phenix), capable of screening 4000 compounds at a time.

Staff Researcher Dr Mark Dunning is part of the Sheffield Bioinformatics Core and will assist in processing and managing the pharmacogenomic datasets used for developing the methods. He is an experienced cancer informatician familiar with correcting for batch effects and has been involved in several high-profile cancer genomics projects.

The Department of Computer Science, established in 1982, has since attained an international reputation for its research and teaching. In the 2014 Research Excellence Framework (REF) exercise, we were ranked 5th out of 89 departments in the UK for computer science research. Much of our research spans the boundaries between engineering, medicine and the life sciences. Research in the Department is organised into a number of groups: Machine learning, Algorithms, Complex systems modelling, Natural language processing, Speech and hearing, Organisations, information and knowledge (OAK), Verification and testing, Visual computing, and Security of Advanced Systems. We attract substantial external funding from UK Research Councils, the European Union and industry. More information on the Department of Computer Science can be found at