Final Project
The final project is a significant component of this course, worth 35% of your grade. It provides an opportunity to explore advanced topics in machine programming and apply the concepts learned throughout the semester.
Project Overview
Students will pick a research problem in program synthesis, analysis, or related areas. We provide a list of project ideas below, but you are encouraged to propose your own ideas based on your interests and background. The project can be theoretical, experimental, or a combination of both.
Working in a team
The project should be completed in teams of at most 3 students. In the proposal, you should describe the project idea, list the team members, and explain the division of labor.
Project Requirements
- Proposal (5%): A 2-page document outlining your project idea, methodology, and expected outcomes
- Implementation (20%): Working code and/or theoretical development
- Report (5%): A comprehensive written report describing your work
- Presentation (5%): A 10-minute presentation to the class
Project Timeline
- Week 13 (Nov 18): Project proposal due
- Finals week (Dec 10-18): Final project presentations
- End of Finals week: Final report due
Project Ideas
Here are some example project areas to consider:
- Construct (manually, by crawling, or by using existing models) a mid-to-large-scale dataset for studying a specific programming language or domain
- Pick a relatively low-resource language (Haskell, Prolog, Datalog, Scallop, CodeQL, Lean, Coq, Z3, Racket, Clojure, LaTeX, TikZ, Processing, PDDL), construct a small-scale dataset, and evaluate LLMs' performance on synthesizing programs in that language
- Pick a high-resource language but a specific domain (kernel drivers in C, CUDA drivers, Unreal Engine programming in C++, neural network modules in Python/PyTorch/JAX) and evaluate LLMs' performance in that domain
- Design a small domain-specific language (for robot control, games, theorem proving, databases, maths, logic, visualization, animation, etc.) and demonstrate LLMs' ability to generate programs in that language
- Implement a conversational coding agent that helps users make edits to a codebase or fulfill feature requests
- Implement a downstream application that involves program synthesis
- Use LLMs to synthesize formal specifications for existing programs, and rigorously evaluate the generated specifications
- Identify security vulnerabilities in LLM-generated programs
- Implement a technique from an existing paper (refer to Readings)
Note: These are suggestions only. You are encouraged to propose your own ideas based on your interests and background.
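Several of the ideas above involve evaluating LLM-generated programs. One common way to do this is to check functional correctness against test cases. The following is only a minimal sketch, assuming the generated programs are Python source strings; the names `passes_tests` and `pass_rate` and the test-case format are hypothetical illustrations, not a required interface for the project.

```python
from typing import Any, Dict, List, Tuple

# A test case pairs positional arguments with the expected return value.
TestCase = Tuple[tuple, Any]


def passes_tests(source: str, entry_point: str, tests: List[TestCase]) -> bool:
    """Return True if `source` defines `entry_point` and passes every test."""
    namespace: Dict[str, Any] = {}
    try:
        # WARNING: exec runs arbitrary code. Generated programs are untrusted,
        # so a real evaluation should run them in a sandbox with a timeout.
        exec(source, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors count as failures.
        return False


def pass_rate(candidates: List[str], entry_point: str, tests: List[TestCase]) -> float:
    """Fraction of candidate programs that pass all tests."""
    if not candidates:
        return 0.0
    passed = sum(passes_tests(c, entry_point, tests) for c in candidates)
    return passed / len(candidates)


# Hypothetical example: three candidate "generations" for an `add` function.
candidates = [
    "def add(a, b):\n    return a + b",   # correct
    "def add(a, b):\n    return a - b",   # wrong result
    "def add(a, b) return a + b",         # does not parse
]
tests = [((1, 2), 3), ((0, 0), 0)]
rate = pass_rate(candidates, "add", tests)
```

For languages other than Python, the same structure applies, but the check would typically invoke the language's compiler or interpreter in a subprocess instead of using `exec`.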