Course Readings
This page contains relevant papers, courses, and webpages organized by topic.
Topic: Code Large Language Models
- Code Llama - [2023]
- StarCoder - [2023]
- DeepSeek Coder - [2024]
- CodeSage - [2024]
- Llama - [2023]
- Llama 2 - [2023]
- CodeFuse - [2023]
- Causal Masking - [2022]
- SantaCoder - [2023]
- The Stack - [2022]
- 8K token context length - [2024]
- Fill-in-the-middle - [2022] (see the prompt-format sketch after this list)
- Multi-Query-Attention - [2019]
- BERT - [2018]
- DeepSeek Coder Repo
- CodeGen - [2022]
- BPE - [2015]
- RoPE - [2021]
- Large Language Models Meet NL2Code - [2023]
- A Survey on Language Models for Code - [2023]
- Deep Learning for Source Code Modeling and Generation - [2020]
- CodeT5+ (Encoder-Decoder Models) - [2023]
- CodeFusion (Diffusion Models) - [2023]
- DALL-E 2 - [2022]
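Note: several of the readings above (Fill-in-the-middle, SantaCoder, StarCoder) train with infilling objectives. Below is a minimal sketch of the prefix-suffix-middle (PSM) prompt layout; the sentinel strings follow StarCoder's convention and differ across models, so treat them as an assumption.

```python
# Minimal sketch of prefix-suffix-middle (PSM) fill-in-the-middle formatting.
# Sentinel strings follow StarCoder's convention; other models use different
# special tokens.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Rearrange a completion task so a left-to-right decoder can condition
    on the code both before and after the gap, then generate the middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

code_before = "def add(a, b):\n    "
code_after = "\n    return result\n"
print(build_fim_prompt(code_before, code_after))
# The model is expected to emit the missing middle span, e.g. "result = a + b".
```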
Topic: Evaluation of Code Models
- LiveCodeBench Repo
- HumanEval/Codex (Accuracy) - [2021] (see the pass@k sketch after this list)
- ReCode: Robustness Evaluation of Code Generation Models (Trustworthiness) - [2022]
- MBPP - [2021]
- DevBench: A Comprehensive Benchmark for Software Development - [2024]
- DevEval: Evaluating Code Generation in Practical Software Projects - [2024]
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion - [2023]
- Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT - [2023]
- CodeXGLUE - [2021]
- XLCoST - [2022]
- APPS - [2021]
- CodeContests/AlphaCode - [2022]
- DS-1000 - [2022]
- xCodeEval - [2023]
- BigCode Eval Harness
- BigCodeBench
- LMSYS Coding
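Note: the HumanEval/Codex paper reports results with the unbiased pass@k estimator, pass@k = E[1 - C(n-c, k) / C(n, k)] over problems, where n samples are drawn per problem and c pass the unit tests. A small sketch of its numerically stable form:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval/Codex paper:
    n samples generated per problem, c of which pass the unit tests."""
    if n - c < k:
        return 1.0
    # 1 - C(n-c, k) / C(n, k), expanded as a numerically stable product
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples per problem, 37 pass -> estimate pass@10
print(pass_at_k(200, 37, 10))
```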
Topic: Improving Code Generation
- CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context - [2022] (see the retrieve-then-prepend sketch after this list)
- RepoFusion: Training Code Models to Understand Your Repository - [2023]
- Guiding Language Models of Code with Global Context using Monitors - [2023]
- CodePlan: Repository-level Coding using LLMs and Planning - [2023]
- A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware - [2023]
- REPOFUSE: Repository-Level Code Completion with Fused Dual Context - [2024]
- Generation-Augmented Retrieval for Open-domain Question Answering - [2020]
- Query2doc: Query Expansion with Large Language Models - [2023]
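Note: most of the repository-level papers above share a retrieve-then-prepend structure: fetch related cross-file snippets and place them ahead of the in-file prefix. The sketch below illustrates that general pattern only; the token-overlap scoring and comment-style prompt layout are illustrative assumptions, not the recipe of any single paper.

```python
# Minimal sketch of retrieve-then-prepend prompting for repository-level
# completion: rank repo chunks by lexical overlap with the in-file prefix
# and prepend the best matches as commented context.
import re

def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[A-Za-z_]\w+", text))

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / max(len(a | b), 1)

def build_repo_prompt(infile_prefix: str, repo_chunks: dict[str, str], top_k: int = 2) -> str:
    query = tokenize(infile_prefix)
    ranked = sorted(repo_chunks.items(),
                    key=lambda kv: jaccard(query, tokenize(kv[1])),
                    reverse=True)
    context = "\n".join(f"# From {path}:\n{chunk}" for path, chunk in ranked[:top_k])
    return f"{context}\n\n{infile_prefix}"

repo = {
    "utils/math_ops.py": "def scaled_sum(xs, factor):\n    return factor * sum(xs)",
    "io/reader.py": "def read_lines(path):\n    return open(path).read().splitlines()",
}
print(build_repo_prompt("def total(xs):\n    return scaled_", repo))
```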
Topic: Interpretability of Code Models
- Explainable AI - [2021]
- Rethinking Interpretability in the Era of Large Language Models - [2024]
- Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges - [2021]
- Benchmarking and Explaining Large Language Model-based Code Generation: A Causality-Centric Approach - [2023]
- Benchmarking Causal Study to Interpret Large Language Models for Source Code - [2023]
- Towards Causal Deep Learning for Vulnerability Detection - [2023]
References
This course draws inspiration from the following sources:
- Generative Models for Code -- COMS 6998 by Baishakhi Ray, Columbia University
- Introduction to Program Synthesis -- 6.S981 by Armando Solar-Lezama, MIT
- Program Synthesis -- CSE 291 by Nadia Polikarpova, UCSD
- Program Synthesis for Everyone -- CS294 by Ras Bodik and Emina Torlak, UC Berkeley