EN.601.727 Machine Programming

Johns Hopkins University — Fall 2025

Instructor: Ziyang Li   |   Email: ziyang@cs.jhu.edu

Final Project

The final project is a significant component of this course, worth 35% of your grade. It provides an opportunity to explore advanced topics in machine programming and apply the concepts learned throughout the semester.

Project Overview

Students will pick a research problem in program synthesis, program analysis, or a related area. We provide a list of project ideas below, but you are encouraged to propose your own ideas based on your interests and background. The project can be theoretical, experimental, or a combination of both.

Working in a team

The project should be completed by a team of at most 3 students. In the proposal, describe the project idea, the team members, and the division of labor.

Project Requirements

Project Timeline

Project Ideas

Here are some example project areas to consider:

  1. Construct (manually, by crawling, or using existing models) a mid-to-large-scale dataset for studying a specific programming language or domain
  2. Pick a relatively low-resource language (Haskell, Prolog, Datalog, Scallop, CodeQL, Lean, Coq, Z3, Racket, Clojure, LaTeX, TikZ, Processing, PDDL), construct a small-scale dataset, and evaluate LLMs' performance on synthesizing programs in that language
  3. Pick a specific domain within a high-resource language (kernel drivers in C, CUDA drivers, Unreal Engine programming in C++, neural network modules in Python/PyTorch/JAX) and evaluate LLMs' performance in that domain
  4. Design a small domain-specific language (for robot control, games, theorem proving, databases, maths, logic, visualization, animation, etc.) and demonstrate LLMs' capability to generate programs in that language
  5. Implement a conversational coding agent that helps users make edits to a codebase or add requested features
  6. Implement a downstream application that involves synthesizing programs
  7. Use LLMs to synthesize formal specifications for existing programs, and properly evaluate the generated specifications
  8. Identify security vulnerabilities in LLM-generated programs
  9. Implement a technique from an existing paper (refer to Readings)

Note: These are suggestions only. You are encouraged to propose your own ideas based on your interests and background.