Choosing a Branching Scheme for GitHub when Teaching Newbies
One branching scheme is commonly used in professional software development: a developer creates (and shares) a branch dedicated to developing a feature, then issues a pull request from that branch into the main line.
I am skeptical that this scheme is practical for teaching students. In particular, when teaching students who have never done any computer programming before, it is very important to minimize the additional overhead of version control, especially managing branches.
Which Features of Version Control are Most Useful for Beginners?
What are our objectives in having students use Git(Hub)? While we would like this experience to acclimate our students to the rhythms they might expect in a professional software development project, this is not foremost when teaching beginners.
The objectives that are foremost in this use of Git/GitHub are, in my opinion, as follows:
- provide starting code and files to students for assignments when such initial files are necessary or beneficial
- provide a means for the student and the instructor to trace the development of related assignments over time
- provide a tool for the instructor to give useful feedback with in-line comments and direction
- facilitate delivery of assignments to the instructor
- provide a means for the student to recover work in case of errors or system problems
What aspects of Git should we avoid for beginners?
I believe it is important to only use those aspects of Git and GitHub that are critical to accomplishing these objectives. For many students, simply learning how to do computer programming is a heavy cognitive load. Consequently, we should avoid aspects of version control that add to that cognitive load. In particular, we should avoid or minimize the student’s need for these operations:
- issuing (textual) git commands via terminal/command window
- selective file staging
- multiple repositories
- switching branches
I believe we should strenuously avoid having students do the following operations, some of which are challenging even for experienced professional developers:
- merging branches
- attaching submodules
- resolving merge conflicts
- creating branches
- changing upstream/origin
- SSH authentication
Key use cases
- INITIATION A student retrieves the starter files for assignment N into their workspace and begins edits and other work.
- SUBMISSION A student performs the work on assignment N and submits it for review and grading via GitHub.
- NEXT SUBMISSION After the student submits assignment N, the student begins to work on assignment N+1.
- PROGRESSION A student has overlapping consecutive assignments. That is, the later assignment requires approved work from the prior assignment. How does the student use branching simply for the purpose of submitting that work?
Detailed sequence of Progression use case:
|Time||Event|
|T||student submits assignment N|
|T+1||student begins work on assignment N+1|
|T+2||instructor returns assignment N to student with required revisions|
|T+3||student fixes assignment N and resubmits|
|T+4||instructor approves assignment N|
|T+5||student applies corrections from assignment N to ongoing work on assignment N+1|
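To make step T+5 concrete, here is a minimal Python sketch that models a branch as an ordered list of commit labels. The commit labels, the function name `missing_commits`, and the two layouts are illustrative assumptions, not part of Git or GitHub. It compares a single working branch with per-assignment branches and reports which fix commits the student would still have to bring over.

```python
def missing_commits(source, target):
    """Return the commits on `source` that `target` lacks, in source order."""
    present = set(target)
    return [c for c in source if c not in present]

# Layout 1 (assumed): one working branch for every assignment.
# The fix for assignment N lands in the same history as the N+1 work.
single_branch = ["N-submit", "N+1-start", "N-fix", "N+1-more"]
fixes_needed_single = missing_commits(["N-submit", "N-fix"], single_branch)

# Layout 2 (assumed): one branch per assignment.
# The N+1 branch was started at T+1, before the fix at T+3 existed.
branch_n = ["N-submit", "N-fix"]
branch_n_plus_1 = ["N-submit", "N+1-start", "N+1-more"]
fixes_needed_split = missing_commits(branch_n, branch_n_plus_1)

print(fixes_needed_single)  # []
print(fixes_needed_split)   # ['N-fix']
```

In the first layout nothing is missing, so no merge is needed. In the second, the fix exists only on the assignment N branch, so the student would have to merge it (or redo it by hand) on the N+1 branch.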
List of possible branches and their meanings
To consider the impact of different branching strategies, we need a list of candidate branches. By considering these candidate branches and their usage, we can assess the work sequence, and thus the workload for the novice student, under particular branching strategies.
In the following table, each branch name is essentially a model. While the “master” branch has a fixed name, the other names are figurative and may, of course, be any actual name. As you may guess, the letters “xyz” in a branch name stand in for the name or number of the actual assignment.
The headings “W?” and “P?” are indicators as to whether a particular branch is intended to be used for ongoing work by the student, or for a delivery or “Pull,” or both.
|Branch||W?||P?||Meaning|
|==master==||Y||Y||This is the usual Git master branch|
|==ASGN-xyz==||Y|| ||This is a working branch for assignment ‘xyz’|
|==MILE-xyz==|| ||Y||This is a milestone for assignment ‘xyz’|
|==done==|| ||Y||This is a branch to which all assignments would be merged|
|==DONE-xyz==|| ||Y||This is a branch specifically to be the pull request target for assignment ‘xyz’|
|==WORK-???==||Y|| ||This would be a branch created by the student with any arbitrary name the student selects|
|==work==||Y|| ||This would be the branch on which the student would work for every assignment|
List of Possible Branching Schemes
So we have four possible branches that can be used as the student’s working branch, and four branches that can serve as the “Pull-To” branch. I am not including the milestone branch in the table below, however.
I am currently using option “C” for my COSC-A211 course at Loyola New Orleans. I have two dozen “pull-to” branches in the assignment repo.
|ID||Working Branch||Pull-To Branch||Notes|
|C||==master==||DONE-xyz||This has been the approach to date in COSC A211. Student does not create/name branches, does not switch branches, does not merge.|
|D||ASGN-xyz||==master==||Student must switch branches. Progression use case would require merges.|
|E||ASGN-xyz||==done==||Student must switch branches. Progression use case would require merges.|
|F||ASGN-xyz||DONE-xyz||This branching scheme is arguably the norm in professional software development organizations. Student must switch branches. Progression use case would require merges.|
|J||==work==||DONE-xyz||Comparable to scheme C. Instructor could designate ==work== branch as default, eliminating need for the student to switch branches from ==master==|
|K||WORK-???||==master==||Student would have to create branches, switch, merge. Like F, very typical for professional organizations.|
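The Notes column can be read as a list of operations each scheme demands of the student. The following Python sketch is one rough way to compare the schemes on that basis. The operation sets are transcribed from the notes above; the weights in `burden` (1 for an operation to minimize, 3 for an operation to avoid strenuously) are an arbitrary assumption chosen only for illustration.

```python
# Operations the text asks us to minimize, and those to avoid strenuously.
MINIMIZE = {"switch"}
AVOID = {"create", "merge"}

# Student operations per scheme, read off the Notes column.
# Merges arise in the Progression use case.
SCHEMES = {
    "C": set(),
    "D": {"switch", "merge"},
    "E": {"switch", "merge"},
    "F": {"switch", "merge"},
    "J": set(),  # assumes the instructor makes `work` the default branch
    "K": {"create", "switch", "merge"},
}

def burden(ops):
    """Hypothetical score: 1 per 'minimize' operation, 3 per 'avoid' operation."""
    return sum(3 if op in AVOID else 1 for op in ops if op in MINIMIZE | AVOID)

# Sort by burden, breaking ties by scheme ID.
ranking = sorted(SCHEMES, key=lambda s: (burden(SCHEMES[s]), s))
for scheme in ranking:
    print(scheme, burden(SCHEMES[scheme]), sorted(SCHEMES[scheme]))
```

Under these assumed weights, C and J tie for the lowest burden and K carries the highest, which matches the qualitative reading of the table.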
Four of these options, specifically A, B, G, and H, are non-starters and do not appear in the table above. Pairing catchall branches such as “master,” “done,” and “work” with each other would prevent us from distinguishing work (and pull requests) for different assignments.
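The non-starter rule can be checked mechanically. The Python sketch below enumerates every pairing of a working branch with a pull-to branch (milestone branch excluded) and flags those where both ends are catchall branches. The names `WORKING`, `PULL_TO`, `CATCHALL`, and `is_non_starter` are illustrative. The text does not say which pairing corresponds to which letter, so the sketch reports pairings rather than IDs.

```python
from itertools import product

# Branch roles as described in the text; the milestone branch is left out.
WORKING = ["master", "ASGN-xyz", "work", "WORK-???"]
PULL_TO = ["master", "done", "DONE-xyz"]
# Catchall branches are shared by every assignment.
CATCHALL = {"master", "done", "work"}

def is_non_starter(working, pull_to):
    """Both ends are catchalls, so nothing ties the pull request to one assignment."""
    return working in CATCHALL and pull_to in CATCHALL

pairs = list(product(WORKING, PULL_TO))
non_starters = [p for p in pairs if is_non_starter(*p)]

print(len(pairs))         # 12
print(len(non_starters))  # 4
for working, pull_to in non_starters:
    print(working, "->", pull_to)
```

The count of four agrees with the four options ruled out above.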
Of course, it’s not consistent with a professional workflow.
When a student issues a pull request to a later “Pull-To” branch, the PR contains all of the commits from all prior work, making the PR very muddy.
Similarly, it’s hard to tune the notification emails to make them usable.
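The muddy pull request follows from how a pull request selects its commits: it lists the commits reachable from the head branch that are not reachable from the base branch. Here is a minimal Python sketch of that rule on a toy history. The commit names and the `parents` map are hypothetical, and the sketch assumes each pull-to branch still points at the shared starter commit and that earlier pull requests were not merged into the later pull-to branch.

```python
def ancestors(commit, parents):
    """Return `commit` plus every commit reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def pr_commits(head, base, parents):
    """Commits a pull request would list: reachable from head but not from base."""
    return sorted(ancestors(head, parents) - ancestors(base, parents))

# Hypothetical history on one working branch: start <- a1 <- a2 <- b1 <- b2.
# a1 and a2 belong to the first assignment, b1 and b2 to the second.
parents = {"a1": ["start"], "a2": ["a1"], "b1": ["a2"], "b2": ["b1"]}

print(pr_commits("a2", "start", parents))  # ['a1', 'a2']
print(pr_commits("b2", "start", parents))  # ['a1', 'a2', 'b1', 'b2']
```

The second pull request carries the first assignment’s commits along with its own, which is the muddiness described above.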