Best practices for completing CS107 assignments

Written by Julie Zelenski

We keep your pipeline filled with regular programming assignments, and most of your learning will come from the efforts you expend in completing them. We want to share some of the best practices that will help you achieve successful results on your programs with minimal loss of sanity.

Getting your "C legs"

Syntactically, C is not a large jump from Java or C++, but it has vast philosophical differences. Intended for professional use, C is designed for high efficiency and unrestricted programmer control, with no emphasis on safety and little support for high-level abstractions.

Decomposition and incremental development

Our CS106 courses emphasize the value of good decomposition, yet the take-home for some students has been that decomposition is something you do when finished coding to appease your grader. Attempting to get the program working using a chaotic, sprawling main function and then spending even more time to decompose it afterward is utter foolishness. Decompose problems, not programs. Decomposition is not the frosting to spread on the finished cake; it's the tool that helps you get the job done. Starting with a good decomposition, you'll have an easier path through coding, testing, and debugging, and you'll spend less time at it.

Your initial work should be in design space: decomposing the problem from the top down into sub-problems that you further decompose as needed. Sketch each function's role and have a rough idea of its inputs and outputs. A function should be designed to complete one well-defined task. If you can't describe the function's role in a sentence or two, then maybe your function is doing too much and should be decomposed further. Commenting the function before you write the code may help you clarify your design (what the function does, what inputs it takes, what outputs it produces, and how it will be used). Pushing yourself to be specific now will force you to state your assumptions and resolve ambiguities earlier rather than later. Exploit opportunities for code unification: a sufficiently general function can handle multiple use cases within the program.

When ready to implement, write one function at a time and thoroughly test before moving on. To test, you might need to write code to specifically exercise the function (this may be dead-end code that is later discarded), create sample input files, and/or run under gdb and Valgrind to look for problems. Thorough testing gives you peace of mind that further functions can build on these pieces with confidence, rather than adding another floor on what amounts to a house of cards.

A corollary to this is my suggestion: always have a working program. Add features to your program one at a time, testing until complete, while verifying no regressions have been introduced. At a given point, your program may not cover all requirements, but the existing code is correct and can be verified as functional. That is vastly preferable to a program that attempts everything yet succeeds at nothing. It is much easier to extend a working but incomplete program than to fix a bug-riddled "complete" one.

Testing

Testing is an incredibly important skill for all programmers and we intend for you to become proficient at it. Testing is not something separate from programming; it is an integral part of the development process. We provide some sample inputs and the simple sanity check to get your testing started, but these are intentionally insufficient. Your efforts are needed to brainstorm and identify additional cases that need to be tested, devise test inputs, and monitor your progress toward satisfying all test cases. Go check out our advice on software testing for a plenitude of ideas about testing tactics and strategies. We're especially keen on short-cycle test-driven development!

Debugging

Many students have up to now done all their debugging via print statements. That works for simple cases, but becomes unwieldy as programs get larger and have more complex interactions. Now is the time to invest in mastering the powerful tools provided by a debugger. During development, you may want to always run your program under gdb (CS107 guide to gdb), so that when the unexpected hits, you have the ability to poke around and get information to better understand the program state.

Successful debugging depends on a careful and systematic approach.

  1. Observe the bug. If you never see the bug, you'll likely never fix it. Another reason you want comprehensive testing!
  2. Create a reproducible input. Creating a trivial input that reliably induces the failure is a huge help.
  3. Narrow the search space. Studying the entire program or tracing the execution line-by-line is generally not feasible. Some suggestions for how to narrow down your focus:

    • Start where your intuition believes is the likely culprit, such as a function that recently changed or one you find suspicious.
    • Use binary search to dissect. Set a breakpoint at the midpoint and poke around to determine whether the program state is already corrupt (which indicates the problem is in the front half) or looks good (so you need to focus your attention on the back half). Repeat to further narrow down.
    • Run under Valgrind to identify the root cause of any lurking memory errors.
    • Use gdb conditional breakpoints or watchpoints to identify the point where data is first noticed to be corrupt.
  4. Analyze. With only a small amount of code under scrutiny, execution tracing becomes feasible. Use gdb to see what the facts (values of variables and flow of control) are telling you. Drawing pictures may help.

  5. Devise and run experiments. Make inferences about the root cause and run experiments to validate your hypothesis. Iterate until you identify the root cause.
  6. Modify code to squash bug. The fix should be validated by your experiments and by passing the original failed test case. You should be able to explain the series of facts, tests, and deductions that connect the observed symptom to the root cause and the corrected code.

Do not change your code haphazardly. This is like a scientist who changes more than one variable at a time. It makes the observed behavior much more difficult to interpret, and tends to introduce new bugs. That said, if you find buggy code, even if it is not obviously related to the bug you are tracking, you still might want to make a detour to fix it, using a reproducible input to trigger that bug and validate its fix. That bug might be related to or obscuring the original bug, and it's good to remove any source of potential interference.

Use Valgrind early and often

We run submissions under Valgrind during grading to report on memory errors and leaks. Some students have the impression that Valgrind is merely a final "double-check" on a finished program. Nothing could be further from the truth. Doing regular Valgrind runs is an important part of your testing coverage. Valgrind reports on two types of memory issues: errors and leaks. Memory errors are toxic and should be found and fixed without delay. Valgrind can be a huge help with this. Memory leaks are of less concern and can be ignored early in development. Given that a wrong deallocation can wreak havoc, we recommend you write the initial code with all free() calls commented out. Much later, after you have finished the correct functionality and turned your attention to polishing, add in the free calls one at a time, run under Valgrind, and iterate until you verify complete and proper deallocation. (CS107 guide to Valgrind)

Mercurial: the power of undo

Using the revision control system may at first seem more impediment than benefit. But the day you accidentally wipe out a critical file or make a last-minute change that introduces an evil bug, you will be eternally grateful for the "undo" capability provided by maintaining a revision history. Even without such catastrophes, revision control allows you to monitor your progress, review changes, try experiments that are easily backed out, and manage your efforts with more effective organization and less room for error. Adopt the habit of committing very regularly: after making a critical fix, after adding a new feature, when pausing for a snack break, and definitely at the end of each work session. Having this complete audit trail can also serve as an insurance policy should something go astray in your submission. We can grab a previous version from your history and confirm its provenance, but if there is nothing in your history to return to, you're stranded. Every serious project is managed under revision control or should be. (CS107 guide to Mercurial)

Write the high-quality version first (and only)

When faced with a challenging programming task, it can be tempting to first hack together a low-quality solution where you use little/no decomposition, use one-letter identifiers, hard-code magic numbers, and copy-and-paste code, all in a slapdash effort to get something working. After much iteration and debugging, you eventually get the functionality together, at which point you go back to clean up the decomposition, choose better names, unify common code, and so on to get the program up to the "A" level. This strategy has been tried quite a bit, and it doesn't work. It's easier and takes less time to write it at the "A" level right from the start. Well-decomposed, readable code is easier to write, easier to test, will have fewer bugs, and what bugs there are will be more isolated and easier to track down. Realistically, most of your development time is not going to be consumed by typing in long identifier names or writing two 10-line functions instead of a 20-line one. Write it once, and write it right. (Read Nick's awesome Landmarks in coding quality.)

Although it may go without saying, the same reasoning justifies why you should strive to write functionally correct code from the get-go. When faced with a question (do I need to use calloc instead of malloc? do I need a +1 or -1 on this calculation? should this void * be cast to a char * or a char **?) you could make a quick guess and figure you'll find out later if it wasn't right. "Throwing code at the wall to see if it sticks" is not an effective strategy! If you happened to guess right, consider yourself lucky, but did you learn how to approach that decision so you can make the right choice in the future (say on the next assignment or the exam)? Worse, if your guess was wrong, how and when will you discover it? Maybe the bug will just lurk there undiscovered (until the autotester sniffs it out), or perhaps extensive testing/debugging/Valgrind will eventually lead you back to the incorrect passage; either way leads to much suffering. The truth is that the simplest and best time to get the code functionally correct is when you are writing it for the first time. When the code you are writing involves something tricky, take the time to think it through: draw some pictures, review the underlying concepts, ask questions about anything unresolved, whatever it takes, so that when you are writing that code, you understand the how/why of each step, you can accurately predict the behavior, and you feel strongly confident that it is correct. (But still test it... anyone can make a mistake!)

Healthy working habits