I’m writing this blog series about what software engineers can learn from spaceflight. You can check out all the posts in the space series (as well as some other space-related code posts) right here.
Today we’ll talk about requirement gathering in the context of…plants!
The CRS-20 mission brought 12 of these things to the space station:
It says Tupperware on it, but it’s not a sandwich case. Florida Field Operations Director Dave Reed from TechShot is showing us a PONDS (Passive Orbital Nutrient Delivery System), designed to answer this question:
How do we water plants in microgravity with no power and no crew interaction?
If you read my blog, you probably implement more software systems than space tools. But, like the PONDS team, you’re building the answer to a question: “How do we _________, given a set of constraints?” Here are some examples of products you might have helped build and the questions they answer:
- Git: How do we organize the iterations of one piece of software, developed over time to have new and different features, possibly by several different developers?
- Swift: How do we make a programming language that reads more like natural language than Objective-C, but makes problems easier to find than Ruby?
- Twitter: How do we minimize the amount of time it takes for two total strangers to develop a personal vendetta? (Kidding. Sort of.)
Let’s talk about what it takes to get from the “how” question to the product release.
Enterprise software development in the ’70s and ’80s typically incorporated what we call the waterfall method. In this method we gather all the requirements, constraints, and solution choices up front, and the software gets built once we have all that information and make all the choices about its design. It takes a relatively long time to get to coding, and the initial design doesn’t end up getting implemented anyway if we make discoveries in the implementation process that invalidate that design.
Engineers looked for ways to shorten the feedback loop between implementation and information gathering. They formalized various rapid prototyping methods that made their way into industry in the ’90s. Then in 2001, seventeen software engineers published a manifesto (their word, not mine) that coined a term for this: Agile.
Agile methodologies encourage engineers to begin implementing against a small list of requirements and course-correct as they learn more. They leave decisions to the last responsible moment, when they have the most information to make the right choice. This methodology has characterized most of my software engineering career.
The question: which info must we gather up front, and what can wait?
Let’s look at the initial set of PONDS requirements as an example:
- in microgravity: we’re growing these plants in space.
- with no power: access to power delivery systems on the Space Station is finite.
- and no crew interaction: crew time is at a huge premium on the space station, so the projects that succeed up there consider this as a constraint.
The engineering team actually had far more requirements than this lined up before starting. They knew, based on the size and shape of the devices that astronauts currently use to grow plants on the space station, what kind of area their device needed to fit in. They knew they wanted a design that discouraged algae growth, because—to quote Dave— “Where there’s water, algae miraculously appears” (he was joking). They knew that, without gravity, they needed another way to fill water reservoirs from the bottom up. How many constraints—and which ones—are enough to begin implementation?
Anytime we implement before gathering information, we risk making an inaccurate assumption in the absence of information. Sometimes, wrong assumptions mean we have to redo work and redeploy. The PONDS you see in that picture up there is the third version they’ve sent to space, because the first one delivered too much water to plants, and the second one too little.
The PONDS team is a bit different from a software team, of course. Their release is to physically send an object to space. That’s expensive and time-consuming, so they’d better know as many of the constraints as possible before they do it.
You and I might experience less pressure to gather all the critical requirements up front. We don’t have to wait for the next spacecraft with room for us when we want to release.
But inaccurate assumptions can cost us a lot, too.
I worked on an app for ramp agents working on the tarmac at airports to scan luggage, pets, mail, military equipment, and hazardous materials into the holds of commercial airplanes. We built this beautiful app. API clients made calls to a server in real time to scan each bag, and each call returned updated information about the whole flight, calculated on the server. We sent it to the tarmac for user testing. The manager came back with this: while the app is open on the scanner, agents are only spending 10% of that time scanning bags.
So what were they doing the other 90% of the time?
They were walking around on the tarmac, where they could get run over by a plane, waving the scanners in the air. Not wholly unlike this:
Why? Because this was the most likely way to get a network signal, since a network response blocked every single scan. Our whole design assumed reliable internet access, but ramp agents didn’t have that. We had to completely refactor the app to fix it.
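To make the difference concrete, here is a minimal Python sketch of an offline-first design, where recording a scan never waits on the network. It is not the code we actually shipped: `OfflineScanQueue`, `FakeNetwork`, and the bag tags are hypothetical names made up for illustration, and I’m assuming the uploader raises `ConnectionError` when there is no signal.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OfflineScanQueue:
    """Hypothetical offline-first buffer: save scans locally, upload them later."""

    # Assumed uploader: raises ConnectionError when there is no signal.
    send: Callable[[str], None]
    pending: List[str] = field(default_factory=list)

    def scan(self, bag_tag: str) -> None:
        # Recording a scan only touches local state, so it never blocks on the network.
        self.pending.append(bag_tag)

    def sync(self) -> int:
        """Upload pending scans in order; stop at the first network failure."""
        sent = 0
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                break  # still offline: keep the remaining scans for the next attempt
            self.pending.pop(0)
            sent += 1
        return sent


class FakeNetwork:
    """Stand-in for a real server so the sketch runs on its own."""

    def __init__(self) -> None:
        self.online = False
        self.uploaded: List[str] = []

    def send(self, bag_tag: str) -> None:
        if not self.online:
            raise ConnectionError("no signal on the tarmac")
        self.uploaded.append(bag_tag)


network = FakeNetwork()
queue = OfflineScanQueue(send=network.send)
queue.scan("BAG-001")    # works with no signal
queue.scan("BAG-002")
queue.sync()             # uploads nothing while offline, loses nothing
network.online = True
queue.sync()             # both scans go up once the signal comes back
```

The point of the design is that the agent’s work (scanning) and the network’s work (syncing) are decoupled, so a dead zone on the tarmac slows down the server’s view of the flight rather than the agent.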
What critical constraints do we need to know before building?
This is a complicated question. It depends on the project, the means of deployment, and the risk profile. In general, the costliest inaccurate assumptions I have seen might have been prevented by asking these three questions at the start of the project:
- Under what circumstances might this product save someone’s life?
- Under what circumstances could this product ruin someone’s life?
- How is my client addressing the problem I aim to solve right now—and what are both the strengths and the weaknesses of that solution?
Saving someone’s life: Say I’m building a voice-controlled assistant. If the assistant’s user becomes suicidal and asks the assistant for help, the assistant has the opportunity to save the user’s life. So she should respond appropriately by calling the person’s designated lifeline or at least providing a hotline number. Apple had to issue a fix because this question was not asked during the development of their voice assistant:
Ruining someone’s life: Say I’m building a system that stores half the U.S. population’s social security numbers. If a hacker gets in there, all those people are at risk of identity theft. So my system should always implement security patches immediately upon their release, rather than allow known vulnerabilities to persist in the software.
How is the problem being addressed right now: before my team built that cargo scanning app, ramp agents used a system that couldn’t coordinate weight distribution or hazmat. But it worked offline and it had some accessibility advantages that our own first design didn’t have. Had we gone to the tarmac and used the TC-70 that our app was replacing, our replacement would have worked better on the first try.
What can wait until later?
It’s inevitable: we’re human. We make mistakes. And when we try new things, we learn new things. So we’re likely, in any implementation process, to discover that we need to change things. We try to do that before deployment, if possible, and we try to create adjustable, maintainable systems that prevent us from having to start over if we were wrong about something.
Testing before deployment: It’s quite difficult to test PONDS on Earth because many phenomena in space do not manifest on the ground. A lot of engineering goes into trying to simulate the space environment on Earth. Again, that’s critical to do because deployment for PONDS is putting a thing in space. Nevertheless, it inspires us as software engineers to also care about our testing systems.
Do our tests run locally and on the staging environment? Are we taking measures to test in production as well? These systems can be time-consuming and frustrating to set up, but they can prevent us from botching a release. We talk about that more over here.
Adjustable systems: Plants on Earth look green because their chlorophyll takes in white light (typically from the sun), absorbs the red and blue wavelengths, and reflects the green one. Plants don’t technically need the green wavelength: only the red and blue ones. The Space Station plants can survive with an LED that serves them only the red and blue light, with the green turned off. This saves power. Great news, right?
The PONDS team thought so. Here’s the rub: people like looking at green plants. And with crew time at such a premium, horticulture experiments in space benefit from the way that green plants comfort people. Astronauts were volunteering their free time to work with the plants. But it felt weird to them that the plants did not look green.
Luckily, this was an easy production fix: the LED board on the space station had the green lights; the team had just turned them off. They turned them on, and voilà! Green plants! Had the board been a custom setup with no green light, changing this would have been harder. What can we do to make course-corrections in production easy?
We talked about that over here, but here’s the most relevant slide:
We have to balance how likely it is that we’ll have to change something, and how hard it is to change, against the amount of work we’re happy to do.
In finance, when we multiply the probability of something happening and the cost if it does happen, we call that the expected cost. So in this case we’re balancing the expected cost of maintenance with the cost we’d be happy (or at least willing) to bear.
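Here is that arithmetic as a small Python sketch. The numbers are made up purely for illustration (they aren’t from PONDS or any real project), and `expected_cost` is a hypothetical helper, not a library function.

```python
def expected_cost(probability: float, cost_if_it_happens: float) -> float:
    """Expected cost = probability of the event times its cost if it occurs."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * cost_if_it_happens


# Made-up example: a 25% chance that a design decision turns out wrong,
# and 40 days of rework if it does.
rework_days = expected_cost(probability=0.25, cost_if_it_happens=40)

# Made-up cost of building the adjustable version up front.
flexibility_days = 4

# Building in the flexibility is worth it when it costs less than the expected rework.
worth_building_flexibility = flexibility_days < rework_days
print(rework_days, worth_building_flexibility)
```

In real life neither the probability nor the cost is known precisely, so treat the result as a way to compare options rather than a forecast.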
This piece talks about what questions we can ask to try to strike that balance as software engineers.
Why does this matter?
We want these fast feedback loops so we can arrive at the answer to our question: “How can we accomplish A under some set of constraints B?” But where do these questions come from, and how do we decide they are important?
Why do we care about organizing iterations of our code, or having a programming language that combines some of the advantages of other programming languages? What’s the value of making it easy for strangers to get into arguments? And dingdangit, why do we grow plants in space?
We’ll explore these larger questions in the next post of the series.
If you liked this post, you might also like:
The rest of the space series (which is, of course, ongoing)
This piece that dives into the scipy CSR sparse matrix (perhaps of interest to you engineer types)
This applied data science case study (perhaps of interest to you scientist types)