Final Project

The final project is a significant component of this course, worth 35% of your grade. It provides an opportunity to explore advanced topics in machine programming and apply the concepts learned throughout the semester.

Project Overview

Students will pick a research problem in program synthesis, analysis, or related areas. We provide a list of project ideas below, but you are encouraged to propose your own ideas based on your interests and background. The project can be theoretical, experimental, or a combination of both.

Working in a team

Each project should be completed by a team of at most 3 students. In the proposal, you should describe the project idea, list the team members, and explain the division of labor.

Project Requirements

Proposal (5%): A 2-page document outlining your project idea, methodology, and expected outcomes
Implementation (20%): Working code and/or theoretical development
Report (5%): A comprehensive written report describing your work
Presentation (5%): A 10-minute presentation to the class

Project Timeline

Week 13 (Nov 18): Project proposal due
Finals week (Dec 10-18): Final project presentations
End of Finals week: Final report due

Project Ideas

Here are some example project areas to consider:

Construct (manually, by crawling, or using existing models) a mid-to-large-scale dataset for studying a specific programming language or domain
Pick a relatively low-resource language (Haskell, Prolog, Datalog, Scallop, CodeQL, Lean, Coq, Z3, Racket, Clojure, LaTeX, TikZ, Processing, PDDL), construct a small-scale dataset, and evaluate LLMs' performance on synthesizing programs in that language
Pick a high-resource language but a specific domain (kernel drivers in C, CUDA drivers, Unreal Engine programming in C++, neural network modules in Python/PyTorch/JAX) and evaluate LLMs' performance in that domain
Design a small domain-specific language (for robot control, games, theorem proving, databases, maths, logic, visualization, animation, etc.) and demonstrate LLMs' ability to generate programs in that language
Implement a conversational coding agent that helps users make edits to a codebase or fulfill feature requests
Implement a downstream application that involves program synthesis
Use LLMs to synthesize formal specifications for existing programs, and properly evaluate the generated specifications
Identify security vulnerabilities in LLM-generated programs
Implement a technique from an existing paper (refer to Readings)

Note: These are suggestions only. You are encouraged to propose your own ideas based on your interests and background.
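
For the ideas above that involve evaluating LLMs' performance on program synthesis, one commonly used metric is pass@k: the probability that at least one of k sampled programs passes all tests for a task. The sketch below is only an illustration and not a course requirement. It assumes you have already sampled n programs per task and counted how many, c, pass your tests; the function names pass_at_k and mean_pass_at_k are hypothetical helpers introduced here, and your project may call for a different metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task as 1 - C(n - c, k) / C(n, k).

    n: number of programs sampled for the task (assumed input)
    c: number of those samples that passed all tests (assumed input)
    k: number of attempts the metric allows, with 1 <= k <= n
    """
    if not (0 <= c <= n):
        raise ValueError("c must satisfy 0 <= c <= n")
    if not (1 <= k <= n):
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; results holds one (n, c) pair per task."""
    if not results:
        raise ValueError("results must contain at least one task")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts for three tasks, 10 samples each.
    per_task = [(10, 5), (10, 0), (10, 10)]
    print(mean_pass_at_k(per_task, k=1))
```

Sampling more than k programs per task and applying this formula generally gives a lower-variance estimate than drawing exactly k samples and checking whether any of them passes.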