Research Interests

I consider myself a generalist rather than a specialist. A specialist focuses on a relatively narrow domain and becomes an expert in it; a generalist covers a broader domain at the expense of depth. While this may keep a generalist from becoming a true expert in any one domain, it allows them to connect ideas from different disciplines and act as a bridge between specialists. As a generalist, I try to be topic- and method-agnostic, but I nonetheless have priorities in my research.

Since 2018, my advisors (Prof. Nick Street and Prof. Barrett Thomas) and I have been exploring the application of Reinforcement Learning (a branch of machine learning that deals with finding the best course of action in sequential decision-making settings) in Health Analytics. More specifically, we study the dosing of warfarin, a commonly used anticoagulant.

In the first part of this work, we showed that Deep Q-Networks (DQN) can produce a better dosing protocol. However, the model acts as a black box, and users might not trust its recommendations. In the second phase, we implemented the work using a policy gradient method, Proximal Policy Optimization (PPO). The idea is to learn a limited set of doses and then make the resulting protocol easier to understand and use. The end goal is a dosing protocol that performs better and is explainable and individualized.

My other areas of interest include:

  • Artificial Intelligence

  • Machine Learning

  • Health Analytics

  • AI in Optimization

  • Ethics of AI, Fairness, Interpretability, Explainability

Current Research

  • Anzabi Zadeh, S., Street, W., & Thomas, B. "An Explainable Deep Reinforcement Learning Model for Warfarin Dosing". (working paper)


  • Anzabi Zadeh, S., Street, W., & Thomas, B. "Optimizing Warfarin Dosing using Deep Reinforcement Learning." Journal of Biomedical Informatics (in press), DOI: 10.1016/j.jbi.2022.104267.

  • Ashrafi, M., & Anzabi Zadeh, S. (2017). "Lifecycle risk assessment of a technological system using dynamic Bayesian networks." Quality and Reliability Engineering International, 33(8), 2497-2520, DOI: 10.1002/qre.2213.


  • Anzabi Zadeh, S., Street, W., & Thomas, B. (October 2022). Optimal Dosing of Warfarin Using Policy Gradient Methods, INFORMS Annual Meeting, Indianapolis, IN.

  • Anzabi Zadeh, S. (March 2022). Optimizing warfarin dosing using deep reinforcement learning, Tippie Advisory Council, Iowa City, IA.

  • Anzabi Zadeh, S., Street, W., & Thomas, B. (October 2021). Optimal Dosing Protocol For Warfarin Using Deep Reinforcement Learning, INFORMS Annual Meeting, Anaheim, CA.

  • Anzabi Zadeh, S., Street, W., & Thomas, B. (November 2020). Application Of Deep Reinforcement Learning In Optimal Warfarin Dosing, INFORMS Annual Meeting, Virtual.

  • Anzabi Zadeh, S., Street, W., & Thomas, B. (October 2019). Optimal Warfarin Dosing Using Reinforcement Learning, INFORMS Annual Meeting, Seattle, WA.

  • Namdar, J., Zhao, K., Anzabi Zadeh, S., Blackhurst, J. (October 2019). Modeling and Analysis of Cascading Disruptions in Chain Network: A Cascading Simulation Model, INFORMS Annual Conference, Seattle, WA.


For my dissertation, I developed a Reinforcement Learning package in Python called ReiL. It is not the best, the fastest, the most efficient, or even the right way of doing RL, but it is available for use under the MIT License. You may install it from PyPI (pip install ReiL), or fork the source code from the repository here:

Most of the code is fully type-annotated and documented, and I am committed to continuing to improve and support it.




  • The Effect of Integration on Supply Chain Performance (Supervisor: Prof. S.H. Zegordi).

  • Solving Fuzzy Decompositions (SoT and ToS) using Electromagnetic Metaheuristic Method (Supervisor: Prof. M.H. Fazel Zarandi).

  • Member of MASAF Strategic Management Research Group, Amirkabir University of Technology (2004 - 05).