Artificial Intelligence and Big Data Analytics for Public Health (AIBDA) Certificate

The Artificial Intelligence and Big Data Analytics for Public Health (AIBDA) Certificate trains social, behavioral, and health scientists to master Python programming for data science, learn state-of-the-art machine learning and deep neural network models, and implement responsible, ethical artificial intelligence (AI) applications to solve real-world social and health problems.

AI is fundamentally transforming how the world operates. Social, behavioral, and health scientists with modern data science skills are in growing demand and highly valued on the job market. The Brown School is uniquely positioned to offer this training. Our interdisciplinary programs, deeply rooted strength in quantitative modeling, and extensive real-world research opportunities provide a rich environment for students to learn and apply modern data analytics skills.

The AIBDA Certificate consists of two 3-credit courses and two 1-credit skill labs. Participants receive rigorous, comprehensive, hands-on training in Python programming, machine learning, deep learning, and data ethics. The Certificate focuses squarely on practicality and application, directly addressing the needs of public health and social work practitioners and social and behavioral scientists.


Applied Machine Learning Using Health Data

This course teaches popular machine learning (ML) models using Python and their applications to health data and beyond. The topics include (1) Python programming basics (coding with Python and essential Python modules such as NumPy, pandas, Matplotlib, and scikit-learn); (2) Classification ML models; (3) Regression ML models; (4) ML model training and validation; (5) Support vector machines and decision trees; (6) Ensemble methods; (7) Dimensionality reduction; and (8) Unsupervised learning techniques. Students who complete this course will: (1) Understand the mathematical and statistical algorithms and computer programming routines for ML models widely adopted in health and social sciences; (2) Proficiently apply ML models to analyze real-world data; and (3) Appraise the pros and cons of alternative ML models in the context of problem solving. The course builds a solid foundation for students who want to go on to learn deep learning, a subfield of ML based on artificial neural networks with representation learning. (3 credits)

Applied Deep Learning Using Health Data

This course teaches a wide range of deep learning (DL) models using Python and their applications to health data and beyond. The topics include (1) Introduction to deep learning, Python, and NumPy; (2) Introduction to PyTorch and neural networks; (3) Computer vision (image classification, object detection, image segmentation, keypoint detection, audio classification, and video classification); (4) Natural language processing (text preprocessing, text classification, text generation, text summarization, and question answering); (5) Time series forecasting; (6) Recommender systems; (7) Generative adversarial networks; and (8) Synthetic data generation. Students who complete this course will: (1) Gain a deep understanding of the key concepts and elements of AI, ML, and DL; (2) Familiarize themselves with a broad range of popular, state-of-the-art DL models and their applications in health and beyond; (3) Understand the strengths, limitations, and tradeoffs of different DL models and best practices in implementing them; and (4) Use Python in conjunction with popular libraries and cloud platforms (e.g., PyTorch, PyTorch Lightning/Flash, fastai2, IceVision, Hugging Face, spaCy, Haystack, Synthetic Data Vault, Google Colab, and Kaggle) to implement DL models (e.g., convolutional neural networks, recurrent neural networks, transformers) on various data types (e.g., text, image, video, audio, tabular). (3 credits)


Skill Lab: Data & Algorithmic Bias

This skill lab focuses on thinking critically about data practices that have the potential to amplify rather than reduce the racial, economic, gender, age, and other biases found in society today. Students will build ethical reasoning skills to evaluate and recommend standards of right and wrong conduct, with attention to the transparency and defensibility of AI-driven actions and decisions concerning data in general and personal data in particular. (1 credit)

Skill Lab: Introduction to Python for Public Health Data Analysis

This skill lab will introduce students to the fundamentals of the Python language, common Python modules for data manipulation and analysis, and the Jupyter notebook environment. The skill lab will begin with acquiring data from publicly available sources and databases, cleansing and transforming data, and creating descriptive statistics and graphics. The skill lab will also introduce Python’s natural language processing and machine learning modules for basic data classification and predictive modeling applications. (1 credit) Students may pursue select external Python courses to fulfill this requirement with approval from their Advisor and the Certificate Chair.

Ruopeng An

Certificate Chair

Ruopeng An is an Associate Professor at the Brown School. He conducts research to assess environmental influences and population-level interventions on weight-related behaviors and outcomes throughout the life course. His research aims to develop a well-rounded knowledge base and policy recommendations that can inform decision-making and the allocation of resources to combat obesity. He has over 190 peer-reviewed journal publications and has been named in the Stanford University – Elsevier list of the top 2% most cited scientists. His research has been widely featured in media outlets, including TIME, New York Times, Los Angeles Times, Washington Post, Chicago Tribune, Boston Globe, Reuters, USA Today, Bloomberg, Forbes, Harvard Health, Atlantic, Guardian, CBS, FOX, ABC, NPR, NBC, and CNN. In 2018, he was elected as a Fellow of the American College of Epidemiology.


Kim Johnson, Brown School
Jenine Harris, Brown School
Randi Foraker, School of Medicine

“Data are now available in a way and quantity that has never existed before, presenting unprecedented opportunities for advancing research and practices through state-of-the-art data analytics. Dealing with extensive, complex, and unconventional data requires revolutionary analytic tools only made available during the past decade. Artificial Intelligence (AI) is regarded as the new electricity of the century, applied to almost every sector and transforming our physical and virtual world at an accelerated rate. AI has become increasingly recognized as an indispensable tool in health and social sciences, with relevant applications expanding from disease outbreak prediction to medical imaging and patient communication to behavioral modification.”