The Case of the Failing Upload, Part 2


In part 1 of this walkthrough, I introduced you to an app designed to help researchers run earth science projects on the Zooniverse. We discovered an upload issue with the app, began investigating its cause, and confirmed that we cannot fetch unpublished projects unless we’re authenticated as one of the project owners.

We also summarized our approach to debugging thusly:

We programmers tend to think in building mode. But while we’re debugging, we often get more mileage for our time spent by switching to investigating mode.

To that end, throughout our debugging session, our question has not been “How would I solve this if I’m right about what the problem is?”

But rather “How little work can I do to confirm that I’m right about what the problem is?”

The difference here saves us time every time we’re wrong—which is a lot more of the time than we realize.

Today, we continue capitalizing on that approach to dive deeper into the code and determine how we might get upload working.

We have learned that I can fetch my example project by instantiating a Panoptes client with the username and password that I used to create it. The problem: that’s not how we authenticate ourselves with Panoptes in client applications. Instead, we do it with something called social auth, which is built on top of an authentication protocol called OAuth2. What does this do? It allows Theia to upload images to the Zooniverse on behalf of a researcher’s Panoptes account, so that researchers don’t need to make separate credentials for Theia, and Panoptes doesn’t have to hand over authentication information to another app.

Theia already allows researchers to authenticate through social auth. The code lives in a file called panoptes_oauth2.py. Once a researcher authenticates successfully, Panoptes provides a bearer token and a refresh token. The bearer token is a string of letters and numbers that Theia can send up with a request to prove that it has authenticated. This bearer token works for a while, but expires after a set period of time. The refresh token can be used to forestall the expiration. Here’s where we assign those tokens to the client:

Screen Shot 2020-01-15 at 3.45.39 PM.png
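To make the token mechanics concrete, here is a minimal sketch of that kind of assignment, plus the expiry check and the standard OAuth2 bearer header. It is not Theia’s actual code: `StubClient` is a hypothetical stand-in for the API client, and the attribute names (`bearer_token`, `refresh_token`, `bearer_expires`, `logged_in`) and helper functions are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


class StubClient:
    """Hypothetical stand-in for an API client; not the real Panoptes class."""

    def __init__(self):
        self.bearer_token = None
        self.refresh_token = None
        self.bearer_expires = None
        self.logged_in = False


def assign_tokens(client, bearer_token, refresh_token, expires_in_seconds, now=None):
    """Copy OAuth2 tokens onto a client and record when the bearer token expires."""
    now = now or datetime.now(timezone.utc)
    client.bearer_token = bearer_token
    client.refresh_token = refresh_token
    client.bearer_expires = now + timedelta(seconds=expires_in_seconds)
    client.logged_in = True
    return client


def bearer_is_expired(client, now=None):
    """Return True once the bearer token's lifetime has run out."""
    now = now or datetime.now(timezone.utc)
    return client.bearer_expires is None or now >= client.bearer_expires


def auth_header(client):
    """Build the Authorization header that carries a bearer token."""
    return {"Authorization": f"Bearer {client.bearer_token}"}
```

Passing `now` explicitly keeps the expiry logic easy to check without waiting for a real token to expire.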

Theia’s upload step does not use the authenticated client to find projects. At least, we don’t think it does. But we can do a quick and dirty experiment to find out what would happen if it did.

Step 1: Print out the bearer token, refresh token, and expiration.

Screen Shot 2020-01-06 at 1.21.20 PM

Step 2: Run the app locally, visit it in the browser, and go to the homepage, where we can activate social auth by clicking “Login with Panoptes,” provided I am logged into the Zooniverse in this browser (which I am).

Screen Shot 2020-01-06 at 1.25.06 PM

Here’s how it looks when we have successfully performed the requisite click:

Screen Shot 2020-01-06 at 1.25.00 PM

Yes, it’s very bare bones. It is working, so for now this is fine.

Let’s check our run logs to see if we got our tokens printed:

Screen Shot 2020-01-06 at 1.24.42 PM

We did! (Nota bene: by the time this post publishes, this token will have long since expired. Something still felt weird to me about publishing it in a legible state, so I slapped red zigzags on it. Security achieved ;) )
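Red zigzags work for a screenshot. For a run log that other people might read, a small masking helper can do the same job. This is only a sketch: `redact` is a hypothetical helper, not something that ships with Theia or the Panoptes client.

```python
def redact(token, visible=4):
    """Mask all but the last few characters of a token before printing it."""
    if token is None:
        return "<none>"
    if len(token) <= visible:
        # Too short to show anything safely, so mask the whole thing.
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

Printing `redact(bearer_token)` instead of the raw value still lets us tell two tokens apart by their tails while keeping the log safe to share.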

So, if we tried to fetch a project with the fully authenticated Panoptes client from the social auth utility, would that client be able to fetch it? Let’s find out by inserting a Project.find into the authenticated client:

Screen Shot 2020-01-06 at 2.17.19 PM

If this works, we should be able to log out of Theia, log back in, and then see the project name “Example Image Project” appear in the running logs. And lo and behold:

Screen Shot 2020-01-06 at 2.17.33 PM

There it is!

This raises a question: why do we need to make a new Panoptes client when it’s time to find the project and upload the images? Could we instead save the authenticated client and use it throughout the app?

To find out, I turned back to the source code for the API client. Reading the source code of my dependencies has saved me a lot of time and frustration during debugging processes; it’s second only (and even that’s a maybe) to reading and studying the exception itself.

Screen Shot 2020-01-08 at 12.00.41 PM.png

The Panoptes object appears to provide a mechanism called .connect to allow us to initialize it only once. However, for us to persist this individual object across tasks that are spun up on a cluster of worker threads, we’d have to somehow store it in the database—even though it’s not a data object. Do we need to introduce the concept of a Model Object to our app? This approach feels like a dead end.

In fact, the whole construction of this API client differs quite a bit from integrations on mobile apps I’ve done. In those circumstances, typically I’ve saved off the tokens in some kind of Session object and then pulled them up whenever I needed to make API calls. Any API client I used, if I even had one, provided no more than a thin layer of syntactic sugar over the HTTP integration API, maybe with some endpoints baked in.

Could we do that? Could we store tokens as data objects, pass them to a new API client in the upload task, and have the whole thing work?

In order to build that, we’ll need to make a new model and a new database table. We’ll need to think about whether to host an ever-growing database of mostly-invalid tokens.
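For a sense of how small that table could be, here is a rough sketch using the standard library’s sqlite3 so that it stays self-contained. The table name, column names, and helper functions are all hypothetical; a real version would live in whatever model layer the app already uses. Keying the table by user is one possible answer to the ever-growing-table worry: a fresh login overwrites the stale row instead of adding another.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a token store with one row per user."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS oauth_tokens (
               user_id TEXT PRIMARY KEY,
               bearer_token TEXT NOT NULL,
               refresh_token TEXT NOT NULL,
               expires_at REAL NOT NULL
           )"""
    )
    return conn


def save_tokens(conn, user_id, bearer_token, refresh_token, expires_at):
    """Store the latest tokens for a user, replacing any earlier row."""
    conn.execute(
        "INSERT OR REPLACE INTO oauth_tokens VALUES (?, ?, ?, ?)",
        (user_id, bearer_token, refresh_token, expires_at),
    )
    conn.commit()


def load_tokens(conn, user_id):
    """Return (bearer_token, refresh_token, expires_at), or None if absent."""
    return conn.execute(
        "SELECT bearer_token, refresh_token, expires_at "
        "FROM oauth_tokens WHERE user_id = ?",
        (user_id,),
    ).fetchone()
```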

But we’re not in building mode right now: we’re in investigation mode. Which means the important question for us right now is this one: How little work can we do to confirm that we’re right about what the problem is?

Let’s do something quick and dirty to determine if this would even work. What if we manually copy and paste the tokens into a new Panoptes client, then see if we can fetch my project with it?

Step 1: Print out the tokens from the social auth step.

Screen Shot 2020-01-13 at 4.05.35 PM.png

Step 2: Let the app refresh, log out, log back in, and check the run log for the authentication details:

Screen Shot 2020-01-13 at 4.03.46 PM

Step 3: Make a new debugging utility method specifically to confirm that a new Panoptes client with these same details will make successful authenticated requests. I put this file adjacent to the upload file, but I did not commit it. We’re only using it for testing.

Screen Shot 2020-01-13 at 4.05.54 PM

Step 4: Open the console and use the utility method, passing in the copy-pasted values from the app’s run log for the bearer token, refresh token, and expiration.

Screen Shot 2020-01-13 at 4.05.09 PM

Ta-da! There at the bottom, we see “Example Image Project.” This makes me reasonably confident that saving our tokens in a data object and making a new client with them at upload time will work for us.
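The shape of that throwaway utility can be sketched roughly as follows. This is not the code from the screenshot. To keep it runnable without the real library, the client class and the project lookup are passed in as arguments, and `FakeClient` and `fake_find` are hypothetical stand-ins; in the app, those roles would be played by the Panoptes client and Project.find. The attribute names set on the client are assumptions to check against the client’s source.

```python
from datetime import datetime


def fetch_project_name(client_factory, find_project, project_id,
                       bearer_token, refresh_token, bearer_expires):
    """Build a fresh client from pasted tokens and try one authenticated fetch."""
    client = client_factory()
    # Assumed attribute names; verify them against the real client.
    client.bearer_token = bearer_token
    client.refresh_token = refresh_token
    client.bearer_expires = bearer_expires
    client.logged_in = True
    project = find_project(client, project_id)
    return project["display_name"]


class FakeClient:
    """Stand-in client: just a bag of attributes."""


def fake_find(client, project_id):
    """Stand-in lookup that only 'authorizes' one known bearer token."""
    if not client.logged_in or client.bearer_token != "good-token":
        raise PermissionError("unpublished project: not authorized")
    return {"id": project_id, "display_name": "Example Image Project"}


if __name__ == "__main__":
    # The project id and token values here are made up for the demo.
    print(fetch_project_name(FakeClient, fake_find, 1234,
                             "good-token", "some-refresh-token",
                             datetime(2020, 1, 13, 18, 0)))
```

The point of the experiment is the same as in the console session: nothing is carried over from the login step except the three pasted values, so a successful fetch tells us the tokens alone are enough.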

Now that we’ve confirmed an approach to fixing our problem that’s likely to work, we can close our case and switch out of investigating mode, back into building mode.

But here’s the important part of this debugging walkthrough:

Debugging insidious issues requires us to admit that we don’t fully understand the code we’re debugging, then switch from our fast, execution-oriented building mode into a more cautious, more measured investigating mode. We make debugging unnecessarily hard for ourselves when we don’t do this. To feel happy and confident while debugging, we need to improve our skills at:

  1. Identifying our assumptions
  2. Asking questions about those assumptions, and
  3. Checking (rather than assuming) the answers

And the more insidious the bug, the bigger the advantage that we earn from having developed these skills.

If you liked this piece, you might also like:

The rest of the debugging posts (including a pairing session, illustrations, and exercises!)

The series on reducing job interview anxiety—no relation to this post, but people seem to like this series.

This talk about the technology and psychology of refactoring—in which you’ll hear me explain some of what you see me do when I code.
