Detecting Trouble Spots in Programming Homework Assignments


Hi. I’m Chelsea Troy. I’m a computer science educator at the University of Chicago. I frequently write about teaching. I put it all in the teaching category for easy perusal :).

I taught a Python Programming class in the winter. Now I’m updating the class to prepare for another session in the fall. I thought I’d take you through my process because:

  1. It sheds some light on what kind of work goes into teaching a class, especially outside the classroom.
  2. Y’all have seen me write two case studies (here and here) about learning from large datasets. But this series can demonstrate learning from small datasets (25 students).

This series has a tag called course-overhaul. Check it out here!

The Pedagogical Function of a Homework Problem

Why do we assign homework problems? To get students to practice, sure. But what makes a practice session effective? My answer to this question has two influences:

  1. Reality is Broken, a book by game designer Jane McGonigal that proposes a role for game design in skill acquisition
  2. Seven years of functional fitness training in which I’ve tried out dozens of ways to acquire various advanced physical skills, only some of which have worked.

I’ll skip past how these two influences impacted my current understanding: suffice it to say that, to me, an effective practice problem:

  1. Includes a clear context that connects the skill to a practical application. Examples: a problem about calculating graduated income tax, or a final project in which the student chooses the subject matter.
  2. Gives the student room to experiment with different approaches and understand the reason for choosing one over another. Example: instead of a problem that demands a hashmap-based solution, a problem in which a hashmap-based solution is by far the most concise and maintainable, such that the student would not want to choose, say, an array-based solution.
  3. Provides enough work volume to ingrain the material. In particular, I try to hit a homework volume of a median six hours per week (plus three hours of synchronous instruction, and not counting asynchronous lectures).
  4. Does not provide so much work volume that the students resort to substandard solutions or copy/cheat their way to answers instead of taking time to think. In particular, I try to hit a homework volume of no more than a median of nine hours per week, with a maximum of 12 hours per week for the students who take the longest.
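To make the second point concrete, here is a minimal sketch of the kind of problem I mean. The vote-tallying task and the function names are hypothetical illustrations, not a problem from the course. Both functions produce the same counts, but the dict (Python's hashmap) version is short enough that a student who tries both would be unlikely to keep the list-based one.

```python
def tally_with_dict(votes):
    """Count votes per candidate using a dict (Python's hashmap)."""
    counts = {}
    for candidate in votes:
        counts[candidate] = counts.get(candidate, 0) + 1
    return counts


def tally_with_lists(votes):
    """Same tally with two parallel lists: it works, but it is longer
    and has to search the names list on every vote."""
    names = []
    counts = []
    for candidate in votes:
        if candidate in names:
            counts[names.index(candidate)] += 1
        else:
            names.append(candidate)
            counts.append(1)
    return list(zip(names, counts))


# Hypothetical input data for illustration.
votes = ["ada", "grace", "ada", "alan", "grace", "ada"]
print(tally_with_dict(votes))
print(tally_with_lists(votes))
```

Nothing in the problem statement has to forbid the list-based approach. The extra bookkeeping does that work on its own.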

Let’s look at some data on the practice problems I assigned in the winter to see how they stacked up against these goals.

Identifying Trouble Spots in the Homework

For each homework, I ask my students how many minutes each problem took. The following chart for homework 1 shows one stacked bar per student, with each problem in a different color.

The first thing I notice, looking at this, is a low overall volume. I’m eyeballing this for a median of maybe 300 minutes (5 hours), with a third of the class under 4 hours and only two students breaking 9 hours (one of whom evidently devoted six hours to git troubles; this student would also have come in under four hours if they had attended office hours and gotten help with the git thing). This means I can increase the homework volume in this week. I have some git exercises that I assigned in homework 8. I’ll move them into this first homework, as the number of git frustrations students experienced throughout the course suggests that early, graded attention to git will help them a lot.
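I eyeballed these numbers from the chart, but the same check is easy to script. The sketch below is one way it might look. The student labels and minute values are made-up placeholders (not my class data), and summarize_volume is a hypothetical helper name. The 4-hour and 9-hour cutoffs are the ones from the paragraph above.

```python
from statistics import median

# Hypothetical self-reported minutes per problem, one list per student.
minutes_by_student = {
    "student_a": [30, 45, 40, 60, 120],
    "student_b": [20, 30, 25, 45, 90],
    "student_c": [45, 60, 50, 90, 360],
    "student_d": [25, 40, 35, 50, 150],
    "student_e": [40, 50, 45, 70, 100],
}


def summarize_volume(minutes_by_student, low=4 * 60, high=9 * 60):
    """Return the median total time plus the students outside the band."""
    totals = {s: sum(m) for s, m in minutes_by_student.items()}
    return {
        "median_total": median(totals.values()),
        "under_low": sorted(s for s, t in totals.items() if t < low),
        "over_high": sorted(s for s, t in totals.items() if t > high),
    }


summary = summarize_volume(minutes_by_student)
print(summary)
```

With real data, the "over_high" list is the one to read by hand, since one outlier problem (like the git trouble above) can account for most of a student's total.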

Now let’s look at the individual problems in this homework: the standout here in terms of time spent is problem 5. That’s not necessarily a bad thing: sometimes a problem requires more thinking. The computational thinking is the desired stimulus. However, a homework problem can also take a long time for lots of reasons related to undesired stimuli. For example, when I ask students about unclear and misleading instructions on this homework assignment, I see that many students bring up problem 5:

This indicates to me that problem 5 either needs clearer instructions, or it needs to be reworked. In this particular case, I already didn’t like problem 5. This problem asks students to build something from scratch that could be done in just a few lines with the datetime library, except that the instructions say “Don’t use the datetime library.” (Pythonistas sometimes say ‘builtins’ in this context, but strictly speaking that word covers only the functions and types available without an import; datetime is a module in the language’s standard library. I will use the term ‘standard library’ to be clear for CS instructors who do not write Python.)
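To show the shape of that mismatch, here is a hedged illustration. The task below (finding the day of the year for a date) is a stand-in I picked for this post, not the actual problem 5, and the function names are hypothetical. The from-scratch version has to carry its own leap year rule and month table, while the standard library version is a single expression.

```python
from datetime import date


def is_leap_year(year):
    """Gregorian leap year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_of_year_from_scratch(year, month, day):
    """What a 'no datetime' rule forces students to write."""
    february = 29 if is_leap_year(year) else 28
    days_in_month = [31, february, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return sum(days_in_month[:month - 1]) + day


def day_of_year_with_datetime(year, month, day):
    """What a working developer would write instead."""
    return date(year, month, day).timetuple().tm_yday


print(day_of_year_from_scratch(2024, 3, 1))
print(day_of_year_with_datetime(2024, 3, 1))
```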

I don’t like homework problems where I have to rely on unrealistic rules to get the appropriate stimulus. I would rather:

  1. Design a problem for which the primary stimulus comes from something other than reimplementing a standard library (so that no well-known import would make the problem trivial), or
  2. Design a problem that deliberately replicates (rather than uses) a component of the standard library.

The first option mimics the work experience of an end user app dev. The second option mimics the work experience of a programming language maintainer. Asking someone to solve a problem where an end-user app dev would apply (not build) a part of the standard library and then banning the use of that library mimics no one’s work experience and therefore fails my standards for a maximally useful stimulus. I’m switching this problem out for a different one in the fall.
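The per-problem check I just walked through (which problems took the longest, and do the same problems show up in the "unclear instructions" responses?) can also be sketched in code. Everything here is an assumption for illustration: the data is made up, flag_problems is a hypothetical helper, and the two thresholds are arbitrary starting points rather than values I have validated.

```python
from statistics import median

# Hypothetical minutes reported per problem (one entry per student).
minutes_by_problem = {
    1: [30, 20, 45, 25, 40],
    2: [45, 30, 60, 40, 50],
    3: [40, 25, 50, 35, 45],
    4: [60, 45, 90, 50, 70],
    5: [120, 90, 360, 150, 100],
}

# Hypothetical count of students who named each problem when asked
# about unclear or misleading instructions.
complaints_by_problem = {1: 0, 2: 1, 3: 0, 4: 0, 5: 4}


def flag_problems(minutes, complaints, time_factor=1.5, complaint_share=0.3):
    """Label problems that are slow, unclear, or both."""
    medians = {p: median(m) for p, m in minutes.items()}
    typical = median(medians.values())
    flagged = {}
    for problem, med in medians.items():
        slow = med > time_factor * typical
        share = complaints.get(problem, 0) / len(minutes[problem])
        unclear = share >= complaint_share
        if slow and unclear:
            flagged[problem] = "slow and unclear"
        elif slow:
            flagged[problem] = "slow but clear"
        elif unclear:
            flagged[problem] = "unclear"
    return flagged


flagged = flag_problems(minutes_by_problem, complaints_by_problem)
print(flagged)
```

A "slow but clear" label is not a verdict. It marks a problem where the time may well be the desired stimulus, so it goes on the list to read more closely rather than the list to replace.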

Now that you’ve seen one full analysis, I’ll speed through some similar ones.

Here’s homework 2. Once again, low overall volume. Once again, clear standouts in problems 2 and 3. In these cases there were also some instruction clarity issues, which I’ll address for the fall. I’m keeping in mind that, if I want to move some homework volume up, I have room in Homework 2.

Homework 3 also had a very unpopular problem 5:

In this case, problem 5 asks students to reimplement the @lru_cache decorator in the Python standard library. While I don’t dislike the question on its premise, I question its utility: a number of core Python maintainers to whom I spoke about this class expressed surprise that beginner students were even introduced to decorators, let alone asked to implement one.

If the goal were to teach these students Python such that they might write Python in a job, I’d agree with that assessment. However, the goal is to use Python as a substrate to teach these students about programming more broadly. To that end, I’d like to introduce them to some functional programming constructs. Python is, to be candid, not a good medium for this type of instruction. I mean, come on: the language maintainers devote space in several PEPs to expressing outright disdain for functional constructs. So I can’t be picky in Python about which functional constructs I introduce, because there are so few (no, merely having a map function does not make a language functional). So I’m going to keep decorators in the class. However, I think asking students to reimplement @lru_cache is too hard with no benefit. In addition to the individual drawbacks of “too hard” and “no benefit,” combining the two is just begging students to cheat. I will switch this problem out for a different decorator with more obvious general-purpose utility. I’ll also be sure it doesn’t include twists that I couldn’t reasonably expect a person with three weeks of Python to work out.
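For readers who have not written one, here is a minimal sketch of what an LRU caching decorator involves. It is not the course's reference solution and not how functools.lru_cache is actually implemented. The name simple_lru_cache is hypothetical, and this version assumes positional, hashable arguments only. Even so, it needs a decorator factory, a closure, and eviction bookkeeping, which is a lot for week three.

```python
from collections import OrderedDict
from functools import wraps


def simple_lru_cache(maxsize=128):
    """Decorator factory: cache up to maxsize results, evicting the
    least recently used entry when the cache is full."""
    def decorator(func):
        cache = OrderedDict()

        @wraps(func)
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)    # mark as most recently used
                return cache[args]
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)  # drop least recently used
            return result

        return wrapper
    return decorator


calls = []  # records every real (uncached) call, for demonstration


@simple_lru_cache(maxsize=2)
def square(n):
    calls.append(n)
    return n * n


square(2)
square(2)   # served from the cache
square(3)
square(4)   # cache is full, so the entry for 2 is evicted
square(2)   # recomputed
print(calls)
```

The real functools.lru_cache also handles keyword arguments and a typed option, among other things, which is where many of the twists come from.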

At this point in the class, I am deliberately bringing down the median time spent on homework. This is the week when my course staff approves students’ final project proposals and plans, meaning they should begin executing on the plans. Almost everyone’s project included at least one potential integration with a service or library that they did not know well or that might not work, so I wanted them prodding those integrations heavily this week and finding backup integration options if their first choices didn’t work out. So I adjusted the homework volume to allow for that work.

Problem 4 took people some time, but I saw no indication in the short answer questions that this one had an instructions problem. I’ll keep this one: in this case, I think the stimulus is appropriate.

In homework 5, I introduced students to the fluent interface, and they implemented one for themselves. As I mentioned before, by this point in the course, students are implementing their final projects. I trust these projects to provide them with the volume they need for an appropriate practice stimulus. The homework, therefore, focuses on providing a small volume of novel technical challenges.
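If the term is new to you: a fluent interface is an API whose methods each return an object you can keep calling methods on, so a sequence of operations reads as one chain. The sketch below is a generic illustration and not the homework itself. The Query class and its method names are hypothetical.

```python
class Query:
    """A tiny fluent interface over a list of dicts.

    Each method returns a new Query, so calls can be chained.
    """

    def __init__(self, rows):
        self._rows = list(rows)

    def where(self, predicate):
        return Query(row for row in self._rows if predicate(row))

    def order_by(self, key):
        return Query(sorted(self._rows, key=lambda row: row[key]))

    def limit(self, n):
        return Query(self._rows[:n])

    def to_list(self):
        return list(self._rows)


# Hypothetical data for illustration.
rows = [
    {"name": "ada", "score": 91},
    {"name": "grace", "score": 78},
    {"name": "alan", "score": 85},
    {"name": "edsger", "score": 60},
]

result = (
    Query(rows)
    .where(lambda row: row["score"] >= 70)
    .order_by("score")
    .limit(2)
    .to_list()
)
print(result)
```

Returning a new object from each method (instead of mutating and returning self) is a design choice. It keeps intermediate queries reusable, at the cost of some copying.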

Overall volume on homework 6 matched my target and the short answer responses produced nominal (in fact, very positive) results. This homework asks students to watch this lecture, essentially, and then presents them with a toy data analysis problem having to do with toxin readings in a collection of water sources. Students evaluate the data and write up recommendations based on their findings.

In homework 7, I ask students to watch three tutorials: one about making a REST API in FastAPI, one about testing endpoints in FastAPI, and one about building a few test-driven in-memory datastore methods in Python. I then ask them to implement a REST API in FastAPI in which all of the endpoints are tested and all of the data lives in a tested in-memory datastore implementation. I have no idea what happened with those students who logged almost zero time on this exercise. Note for next time: check this one immediately after its due date and reach out to students whose results aren’t nominal.
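To give a sense of the datastore half of this assignment, here is a minimal sketch of a tested in-memory store using only the standard library. The class name, method names, and tests are hypothetical and are not taken from the tutorials. The FastAPI endpoints that would sit on top of it are left out.

```python
class InMemoryStore:
    """Keeps records in a dict keyed by an auto-incrementing integer id."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def create(self, data):
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = dict(data)  # copy so callers can't mutate it
        return item_id

    def get(self, item_id):
        item = self._items.get(item_id)
        return None if item is None else dict(item)

    def delete(self, item_id):
        """Return True if something was deleted, False otherwise."""
        return self._items.pop(item_id, None) is not None


def test_create_then_get_returns_the_item():
    store = InMemoryStore()
    item_id = store.create({"name": "heron"})
    assert store.get(item_id) == {"name": "heron"}


def test_get_missing_item_returns_none():
    assert InMemoryStore().get(42) is None


def test_delete_removes_the_item():
    store = InMemoryStore()
    item_id = store.create({"name": "wren"})
    assert store.delete(item_id) is True
    assert store.get(item_id) is None
    assert store.delete(item_id) is False


# Run the tests directly; a test runner such as pytest would also find them.
for test in (
    test_create_then_get_returns_the_item,
    test_get_missing_item_returns_none,
    test_delete_removes_the_item,
):
    test()
print("all datastore tests passed")
```

Keeping the store free of any web framework code means it can be tested on its own, and the endpoint tests only need to cover the HTTP layer.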

So homework 8 is a bit of an “odds and ends” homework. Students receive a couple of debugging challenges to practice their debugging skills, plus a couple of questions about git. At this point in the quarter, the students are putting the finishing touches on their final projects. I’d like to add something even more advanced in here, and I have the room from a volume perspective, even if I want students to have time for the final projects.

This is the homework from which I intend to pull the git question and place it in the first homework.

So there’s the data (or at least, enough of it that you understand some of my reasoning. I didn’t want to overwhelm you with every student response for every single homework).

Planned Improvements

Based on the above analysis, I made myself the following to-do list:

  1. Change the datetime question (HW1 P5) to the voting methods question and write hw solution and update rubric for hw1
  2. Homework1_problem6
  3. Homework2_problem1
  4. Homework2_problem3 
  5. Homework3_problem5 (lru_cache—replace with exercise at the end about caching a common request for some specific birds or something)
  6. Homework5_problem2
  7. Homework5_problem3
  8. Homework8 git problem (specify that they don’t open a PR with git itself)

I’ll be happy to share some of the changes with you in more detail as I make them, but this post is getting long, so I think we’ll call this the “analysis” post and move on to implementation in future posts!

If you liked this piece, you might also like:

The debugging posts (a toolkit to help you respond to problems in software)

The Listening Series (Prepare to question much of what you know about how to be good at your job.)

Skills for working on distributed teams (including communication skills that will make your job easier)
