I recently did a bunch of reengineering of my tools, notably adding a caching layer so you’re spending less time waiting for GitHub to respond with a list of all the repos. I merged together a bunch of separate repos that I had for each tool, and the result is here:
Please play with it and let me know what you think.
- github_clone_all: make a local clone of all repositories matching a given prefix
- github_rate_limit: print your GitHub API rate limits
- github_private_all: make every repo matching a given prefix private (i.e., fix it if you accidentally made them public in GitHub Classroom)
- github_graders: assign student repos randomly to graders
- github_event_times: print the push timestamps for each commit in a student repository
Performance details, because it’s fun: I was commonly hitting my head against the GitHub API rate limits, which say you can only consume 5000-ish units of work in an hour or thereabouts, where a given query seems to burn more than one of these unitless work units. Also, you can sit there for a minute or more while the tool is enumerating all of the student repositories, because GitHub’s APIs will only give you 100 answers at a time, requiring you to “page” through those answers.
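To make the paging concrete, here is a minimal sketch in Python of what that loop looks like. It is not the code from my tools. The names `list_all_repos` and `github_page_fetcher` are made up for illustration, and the sketch assumes the repos live under an organization and that you have a personal access token.

```python
import json
import urllib.request
from typing import Callable, Dict, List

PAGE_SIZE = 100  # the most answers GitHub hands back per page


def github_page_fetcher(org: str, token: str) -> Callable[[int, int], List[Dict]]:
    """Hypothetical helper: build a function that fetches one page of an org's repos."""
    def fetch_page(page: int, per_page: int) -> List[Dict]:
        url = f"https://api.github.com/orgs/{org}/repos?per_page={per_page}&page={page}"
        req = urllib.request.Request(url, headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        })
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch_page


def list_all_repos(fetch_page: Callable[[int, int], List[Dict]],
                   prefix: str = "") -> List[Dict]:
    """Request page after page until a short page signals the end."""
    repos: List[Dict] = []
    page = 1
    while True:
        batch = fetch_page(page, PAGE_SIZE)
        repos.extend(batch)
        if len(batch) < PAGE_SIZE:
            break  # a page that isn't full is the last one
        page += 1
    # Keep only the repos whose name starts with the requested prefix.
    return [r for r in repos if r["name"].startswith(prefix)]
```

Something like `list_all_repos(github_page_fetcher("my-org", token), prefix="hw1-")` would then walk every page, one request per 100 repos, which is where both the waiting and the rate-limit burn come from.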
There are several “search” APIs that promise to do this significantly faster, but I stumbled into an ugly bug, where they seem to return a subset of the correct answers. GitHub hasn’t fixed this yet, but a helpful engineer there told me about using HEAD requests and checking the ETag. The idea is that you can cache the results and then use this ETag thing as a way of indicating whether your cache is still valid. I hacked that together yesterday and it seems to be working.
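Roughly, the trick looks like the following Python sketch. Again, this is an illustration rather than the actual code in my tools: `load_repos`, `github_head_etag`, and the cache file layout are all assumptions, and which URL you send the HEAD request to depends on what listing you are caching.

```python
import json
import urllib.request
from typing import Callable, List


def github_head_etag(url: str, token: str) -> str:
    """Hypothetical helper: send a HEAD request and return the ETag header, if any."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("ETag", "")


def load_repos(cache_path: str,
               head_etag: Callable[[], str],
               fetch_all: Callable[[], List[dict]]) -> List[dict]:
    """Return the cached repo list if its stored ETag still matches; otherwise rebuild it."""
    try:
        with open(cache_path) as f:
            cached = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cached = None  # no usable cache yet

    current = head_etag()  # one cheap request instead of paging through everything
    if isinstance(cached, dict) and cached.get("etag") == current:
        return cached["repos"]

    repos = fetch_all()  # the slow path: enumerate every repo again
    with open(cache_path, "w") as f:
        json.dump({"etag": current, "repos": repos}, f)
    return repos
```

The point of passing `head_etag` and `fetch_all` in as functions is that the caching logic doesn't care how the data is fetched, so you can try it out with fake functions before pointing it at GitHub.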
For my class of 50 students that just wrapped up, the cache file is 2.9MB of JSON data. For my class last fall, with 180 students, with a separate repo per student per week, the cache file is a more impressive 20MB of JSON data and takes three minutes to create. Thereafter, it’s just one of those HEAD requests to validate it and then everything runs fast.