Part of The Inner Chapters Unbook.
Originally part of podcast episode number eleven.
Dedicated audio available from Podiobooks.
- Really huge pet peeve
- Surprising how few practicing programmers seem to get it
- Learned from a veteran two jobs back
- If I can get you to think about this as you program, you start to think about everything else you do when you're programming
- Academic example
- Instructor dinged a student for not using the private methods he prototyped
- Professor didn't get that functional decomposition, like re-factoring, is a process, not a goal
- Would have been better served to teach his students how to break their code down than to give them an arbitrary target
- Important qualities of programming
- Most non-programmers think it is like math
- Cut and dried
- Computers (hardware) may be like this, but not software
- Software is a complex phenomenon
- Turing halting problem
- The practice more like writing
- Is about communication, if it wasn't, we'd still be using machine code
- Good source code communicates the author's intention
- Good programmers plan a little first
- Good programmers constantly revise or edit (re-factor) their code
- More like an outline than prose
- Not linear, more hierarchical
- Additionally, about managing complexity
- Cannot figure out what a program will do solely from first principles
- Break program down into manageable chunks whose properties can be predicted
- Work top down and bottom up, creating and managing context to minimize complexity and increase understanding
- An example, return a library book
- First breakdown
- Find book
- Go to library
- Return book
- Return home
- Further breakdown, find book
- Look in bedroom
- Look in living room
- Re-factor, recurse or iterate
- Can be applied to any language, even scripting languages
- Start with the loosest description of the program
- Routines should not be more than a screen long
- Any code that starts to interrupt the logical flow of the routine should be extracted into its own routine
- Use language capabilities to manage state
- Introduce parameter objects, structs
- Avoid global variables unless you have no other recourse
- Avoid side effects, comment them clearly when you have to use them
- If you are having a hard time decomposing because of limitations in the language, reassess your language choice
As promised, I want to talk about functional decomposition, a baby step into talking more in depth about the practice of programming ― something that I've touched upon in some of the news items that I've talked about in the past, some of the other rants and topics that I've spent time on. But this one is targeted much more firmly at programmers or people who are interested in getting into programming, either professionally or in-depth as a hobbyist.
So, ‘Why is functional decomposition an important topic to talk about?’ I guess is a good place to start before we talk about what it is and how it works and how to do it better.
Well, first of all, being a curmudgeon, people who can't do functional decomposition properly are really a huge pet peeve of mine. You know, code that's just disorganised and incoherent just drives me crazy. And it's really, I guess, kind of surprising in my professional experience, which spans a decade ― if you include my hobbyist-level experience, that goes well beyond that, close to two decades ― just how few practising programmers seem to kind of get functional decomposition, and what that really says about programming as an activity, programming as a discipline, and whether they grok that.
Now, I lucked out: two jobs ago, I learned this from a fellow by the name of Brandon, who is ten years my senior in a lot of respects, chronologically as well as professionally ― really sharp guy, wish I had listened to him more when I had the chance, but you know, he has the luxury of telling me he told me so. But one of the things that I did listen to him about, one thing that I think I learned from him very well, was how to do functional decomposition, and how to really think about it, and why it is necessary.
So, if I can in turn get you to think about it, I think that's a good thing. And it gets to a point that I've been trying to make for a little while now, more about just being more introspective and contemplative about the things that I do professionally and technically. So that's part of why I want to talk about this, too. Like I said, if I can get you thinking about this in the same way that I've started thinking about this, then maybe we can move our conversations, our dialogues, to a higher level.
I also have an academic example of why I wanted to talk about this. I have a friend ― and I've mentioned him before ― who has gone back to school, and he has taken some initial programming courses in C++. Now, he had an instructor, and we used to chat a lot about his assignments (I didn't help him do any of his assignments, before you leap to that conclusion). But he had good questions at a higher level that his instructor didn't seem to be answering, about how to go about programming. So, the instructor dinged him at one point for not implementing all of the functions that he prototyped. And this includes private implementation-level functions. And what it really says to me is: the professor, rather than understanding that functional decomposition, like re-factoring, is a process, saw it as some sort of goal, as if you could, in essence, get a right answer in decomposing a problem into a program.
That is like, if you can remember back to high school math, if you had a good teacher, they said, ‘I not only want to see the right answer, because that is only part of the story, only part of what I am interested in, but I want to see how you got there’. In this case, with software, with source code, this is a perfect opportunity for the same kind of thing. You can audit whether they got it right by whether the program does what it is supposed to.
But you can encourage some of that kind of creativity, and there is an opportunity for teaching, getting more hands-on with some of that process-oriented stuff, and being able to audit that, too, in a more subjective fashion by reading the source code. So I think that is really just an opportunity wasted there.
So let me back up and talk about the practice of programming just a little bit to help also frame my comments about functional decomposition. I am assuming some of the audience are already programmers. So to some of you this may be blindingly obvious. To some that are maybe on a more green or more junior end of the scale, maybe you have not thought about this as much. And to those who are not programmers, who are either just listening just because you like listening to me (which is kind of a strange thought to contemplate for more than five seconds) or are interested in getting into programming but have not made the leap yet...
To back up: there are people out there, non-programmers, who tend to assume that programming is a very analytical, logical, very cut-and-dried discipline. And, surprisingly, that is not so. If that were true ― if computer science and programming were more like math, then we would still be programming in machine code. Most of the languages that you are going to encounter today are what we call high-level languages. And, really, if you look back at the evolution of programming, if you look back at the history of languages and how we came from machine code, up through assembly, and through functional, and procedural, and structural programming, and object-oriented (OO) ― when we pass that high-water mark of high-level languages, that really says something about what programming really is all about and what a good programming language is really meant to do in terms of managing complexity and communication. And the first thing is that...
Let me back up again to that cut-and-dried point and say that computers, the hardware, may be cut-and-dried and may be what people are thinking of as ‘do this’ or ‘do that’, you know: binary. There's no ambiguity. But software is actually ― I guess this dovetails into my point about complex software ― software is actually a complex phenomenon. And there is a thing that, again, some of my audience may be familiar with called the Turing Halting Problem that essentially says, you cannot predict what a program will do, short of actually running it.
You may make a good guess, but until you actually run it, you cannot be entirely, 100% sure. I think actually, more formally stated, it says you cannot write a program that can decide, for every other program and every input, whether that program will eventually halt or run forever. So it speaks to the fact that software, being emergent ― you cannot, in general, sit down and write a proof that says that, ‘oh, this program will do what I anticipate’. You actually have to run it. And that leads me into my other point. Programming has more to do with communication and managing complexity ― and actually, in practice, it is a lot like writing, in that you cannot really understand whether a piece of writing works well until you actually read it aloud or, better yet, give it to a reader or audience, and kind of assess it from there.
And there are actually, surprisingly, in my experience, a lot of striking similarities between the practice of writing ― whether it is a narrative or an essay ― and programming. And, again, this dovetails with my point that programming is more about communication. Like I said, if it wasn't about communication with other programmers, with yourself later in time (a point I will get back to later), we would still be writing in machine code. Machine code is very easy: it's binary. That's what computers understand. High-level languages are not binary. They are really meant for human-readability.
So good source code, then, should speak to or communicate the author's intention, just like a good piece of writing should speak to the author's state of mind or their intention. And high-level languages facilitate that intentionality. There is a new page on my website, cmln.net, selected essays ― I just added a new one by Peter Norvig, who I think is the head of search quality at Google these days ― it's an old essay, been around for a while, but I had not seen it until just recently, called ‘Learn Programming in Ten Years’.... No, that's a good essay, but what I am thinking about is ‘Thinking about Programming’ on that selected essays page, and I think the second or third instalment of that talks about literary criticism and programming. That is my substantiating point. And they talk about the possible, sheer author's intentionality, and then intention captured in the program source code.
Also, like good writers, good programmers plan a little bit first. Two camps, I guess: some of the best writers do stream of consciousness, write as they go. But I would argue that those kinds of writers have such strong native talent or are so well practised at their art that they just don't see the planning, or they are able to do it on the fly. Other good writers will actually outline the story first. They will think about characters and what-not. Good programmers, I think, evince those same kinds of characteristics, that the bulk of good programmers always do a little bit of planning first, no matter how big or how small the task.
So we are not necessarily talking about big design up-front for a huge, six-month software project. It could be a 15-minute console app somebody needs to whack together. Good programmers are going to take five or ten minutes ― whatever is commensurate with the task ― and just kind of think through, ‘Now, what do I want to accomplish, how do I want to structure things?’ before they actually attack it.
There are, I think, genius programmers, if you want to put them that way, that you may not actually see them pause to plan, but as they are writing, they are kind of assembling a plan in their head. And there is some planning activity; it just may not be as overt. Or they are leveraging past experience.
And then, I think the most important similarity between good writing, best writing, and programming, is that good programmers constantly revise or edit their code. And this is what we refer to as re-factoring ― one term for it is re-factoring. Again, it is a very rare writer whose first draft of anything they write is what gets published and what you read. It has been said somewhere, I don't know by whom, that a writer is no better than their best editor, basically, that that act of revising and thinking it through ― scanning it, ‘does it scan well?’ ― is as much responsible for producing good works of literature as the writing itself. And I think that is very true with programming, as well. That first cut through it may be logically accurate and correct, but maybe it is not the best, most maintainable code. You are going to go through and revise and edit it down to something that speaks more clearly; it's simpler, and also less likely to have bugs.
So let me jump ahead and, just very quickly, speak to differences. The analogy does break down, that programming is actually more like outlining than like prose, I would say, in that you have kind-of multiple, nested hierarchies. Prose tends to be very linear. Programming: not so much. Maybe if you think about it as, not only writing prose ― this is another interesting analogy that I have been mulling over in preparing for this piece ― programming is like prose, so there is a linear aspect to it, and I guess there is, if you look at it a certain way.
It is like maybe writing prose in a new language that you are making up as you go. So you have to define terms along the way. And so, this gets a little ahead of myself to talk more about what functional decomposition is and how to do it well. But it is a good insight, that those definitions are your routines, the functions that you are decomposing, if you will. That, in order to be able to write a linear sequence, you want to use a pithy, succinct word, or series of words, or clauses, but each of those pithy little pieces blows out into something much more detailed. And because the computer is so arbitrary and precise, those definitions have to be arbitrary and precise. But you don't want to repeat them in-line. Think about that: if I had to define every term that I use in speaking to you, you would not be able to follow my train of thought. I would be incoherent.
So programming is like that, except that we do have to define a lot of terms, but the technologies allow us to do that in such a way that, once you grok it, you don't have to keep coming back to it. And if you do your functional decomposition well, people will very much have that kind of experience of, the first time they see a function call of yours they go, ‘uh huh?’, and then they go read the function call, and, ‘okay, I understand what he's doing there’. And then they don't have to read your function again because it makes sense. It's internally consistent. It fits well into that over-arching kind of logical flow at that top line of your code.
Even though there is that difference, even that speaks to some of the similarities, which I think helps make my point. And then I also want to think about complexity really quickly: the act of making those definitions, of using routines to define what you want, in a higher context, to treat as an atomic operation or single building block at a higher level of abstraction, recurses up and down the scale quite considerably, based on how complex any given piece is. It goes back to my point that you cannot deduce what a program is going to do from first principles, that when you start to combine level upon level upon level of operations and abstractions, that becomes very hard to predict. Well, functional decomposition can help us with that, too, in the sense that you are managing that complexity, that your decomposition breaks down along several levels, arbitrary levels that you choose, and if you choose them well, then you are dealing with complexity at a particular level as appropriate, and you can deal with those levels of complexity in isolation.
So if you pick good points at which to extract upwards and downwards, then if you are working at a particular level, then ideally, you do not get a lot of bleed from a higher level or a lower level. So it really allows you to scope your brain, if you will, as you are programming, a kind of up and down. And if you do it right, then you can do that effectively. And so, if you are working at a very high level, and you are working more towards the business space of your software, the user space of your software, and you want to think about it more on their terms, if you have decomposed your functions well, if you have decomposed a program into a good set of nested functions, you can do that. So at the highest level, you are reading through it, you know, ‘If this, do that’, ‘If that, do this’, and it works well. And if one of those pieces is misbehaving and you have a bug, then you only have to step so far to resolve that. And once you make that step downwards, then you can make that operate at that level really well and not concern yourself as much with the upper level or the lower level. So it just makes life a lot easier, if you want to just think about it that way.
Very quickly, I want to give you a real-world example to get at what I am talking about with functional decomposition, maybe in a more visceral way. Say you need to return a library book to the library. That sounds on the surface like, ‘Oh, you need to put that on the to-do list, it's an atomic thing, it's one thing you do’. But let's think about this. You want to break this down a little bit. It's not atomic. There are several steps that you have to accomplish in order to return the library book.
So, for the sake of discussion, your first decomposition step, then, given the requirement of returning a library book to a library, is that you need to find the book.
The second step is, while you keep the book with you, go to the library.
The third step is that you need to actually return the book, so you need to put it in the return slot.
And then, the last step is that you want to return home, or wherever it was that you originally came from.
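As a rough sketch of that first breakdown, assuming Python and made-up function names (none of them come from the episode), the top-line routine reads almost like the to-do list itself. Each step just returns a description of what it would do, so the shape of the decomposition is the only thing on show:

```python
# A minimal sketch of the first decomposition step.
# All names here are hypothetical and exist only for illustration.

def find_book(title: str) -> str:
    # Deliberately left coarse; this step gets broken down further later.
    return f"find '{title}'"


def go_to_library() -> str:
    return "go to library"


def put_in_return_slot(title: str) -> str:
    return f"put '{title}' in return slot"


def return_home() -> str:
    return "return home"


def return_library_book(title: str) -> list[str]:
    """Top line: one call per step, nothing more detailed than that."""
    return [
        find_book(title),
        go_to_library(),
        put_in_return_slot(title),
        return_home(),
    ]


if __name__ == "__main__":
    for step in return_library_book("library book"):
        print(step)
```

Someone reading `return_library_book` only needs to drop into one of the four routines if that particular step is the one they care about.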
Again, we think that this would be enough detail right there to specify the task, but suppose we have somebody that got conked on the head and lost all common knowledge and all common sense. They're going to baulk at the very first step, and they're not going to understand how to pursue that.
Think about, for those parents out there who have small children, think about your experiences at helping them acquire some of these common-sense, common-knowledge operations that allow them to do things. You can give them simpler instructions ― when they're very young, you have to give them very detailed instructions ― but they remember. And then you can give increasingly simple instructions as they develop and mature: very similar process, here, very similar to programming. This is my point.
So we can take the first step of finding a book. We can break that down even further. How do you find a book, then? You might look in your bedroom. You might look in your living room. You might actually start to think here that, hmm, maybe everybody's house is a little bit different, and there is maybe even a more general way to approach finding a book or finding any other kind of object. So this is maybe an example of re-factoring, something that maybe I will talk about in a future instalment, where you might find some recursive, some iterative way of saying, ‘If a domicile has N number of rooms, you are going to go from room to room, not going back to any room you have already been to, and search through in a systematic way’, yadda, yadda, yadda.
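One way that more general search might look, sketched in Python under the assumption that a home can be modelled as a mapping from room names to the things in each room (the `House` type and both function names are invented for this illustration):

```python
from typing import Optional

# Hypothetical model of a domicile: room name -> items lying around in it.
House = dict[str, list[str]]


def search_room(items: list[str], wanted: str) -> bool:
    """Lowest level in this sketch: is the thing in this one room?"""
    return wanted in items


def find_object(house: House, wanted: str) -> Optional[str]:
    """Go room to room, visiting each room once; return the room name or None."""
    for room, items in house.items():
        if search_room(items, wanted):
            return room
    return None


if __name__ == "__main__":
    home = {
        "bedroom": ["lamp", "slippers"],
        "living room": ["remote", "library book"],
    }
    print(find_object(home, "library book"))
```

Nothing in `find_object` is specific to books or to a particular house, and `search_room` is itself a candidate for another level of breakdown (shelves, drawers, under the couch) if the problem calls for it.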
But do you get my point that the simple task of returning a library book... You have one drop-down, you have one level that you can go deeper into that abstraction, to start breaking that down into the next set of tasks. And then for any one of those, you can break that down, and again, there is even another level ― I kind of implied it here with the finding a book ― that searching a room for that book could then become another level of decomposition that you can go down into.
The whole point of this, though, is that once you kind of grok all of this, you don't have to remember it. You just remember ‘Find library book’, and then there is all this implicit knowledge that is tied to this, an implicit experience. So functional decomposition in programming is trying to do the same thing. You do not want to have to explain the same procedure over, and over, and over again to somebody who is reading. You want these procedures to work well enough and to be structured in this kind of hierarchy consistently enough that once somebody has really kind of traversed that hierarchy, traversed enough of that map, then they are going to understand the high, the top-line code really, really well. And then, if there is a bug or a problem, they are going to go down into that, and they are going to be able to understand how to recurse down into that hierarchy that you built out of functions very easily to get at maybe where they think the problem is. Or they are going to be able at least to investigate it in a systematic way that makes sense.
So I hope that makes some sense to you. It made some sense when I was thinking about it, writing it up in the show notes. And all of what I am talking about, all of my talking notes, is usually there in the show notes, if you want to revisit and think about this some more.
Okay, last thing, I promise, and then we are going to wrap up. I will just go over some guidelines real quick, and then we will wrap this up.
First off, functional decomposition, perhaps more so than any other technique I am going to talk about as time goes on, can be applied to really any language, if you think about it, even scripting languages. So this is... before we even get into OO, or Aspect Oriented, or components, or any other kind of hujibu nonsense, any kind of programming that you set out to do, you write the simplest shell script, the simplest automation, functional decomposition is your friend. You know, anything that is more than a single page of code in your editor, functional decomposition is going to help you manage that. If you have to repeat anything, functional decomposition is a good technique for finding where those repeat points are, kind of saving yourself some work.
Start with the loosest description of the program. Start at that top line, ‘Return library book’, you know, start with a functional requirement or a business requirement. And then just kind of break that down into logical steps. Each of the routines that you break down, just a general rule of thumb, any routine that you write ― function, method, whatever programming unit ‘routine’ maps to in the programming language you are working with ― should not be more than a page of code in your editor. Any more than that, and you are not going to be able to read the whole thing, get at your variable declarations and all of your logic all at one go.
So this is a good rule of thumb. If you find a routine starting to get a bit longer than that, then you might want to start looking through that for logical steps that could be broken out of that and, in turn, turned into routines that are called off of that. Any code that starts to interrupt the logical flow at any particular level... So that top line, right? So you find that you are starting to write some iterative code to search through all the rooms in your domicile to find a library book, and that is taking up more room than any of the other steps of that top line. That is a clue that you want to break that out into a routine on its own. And then, use the language capabilities that you have to good effect, to help manage state as you are breaking these routines down.
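To make that clue concrete, here is a before-and-after sketch in Python. It assumes the same kind of made-up model as the library errand (a dict of rooms to items), and every name in it is hypothetical. Both versions are meant to behave identically; only the layout changes:

```python
def return_book_inline(house, title):
    """Before: the room search crowds out the other three steps."""
    found_in = None
    for room, items in house.items():  # search detail at the wrong level
        if title in items:
            found_in = room
            break
    if found_in is None:
        return ["give up: book not found"]
    steps = []
    steps.append(f"pick up book in {found_in}")
    steps.append("go to library")
    steps.append("put book in return slot")
    steps.append("return home")
    return steps


def locate(house, title):
    """The extracted routine: the loop now lives one level down."""
    for room, items in house.items():
        if title in items:
            return room
    return None


def return_book(house, title):
    """After: every line of the top-level routine sits at the same level."""
    found_in = locate(house, title)
    if found_in is None:
        return ["give up: book not found"]
    return [
        f"pick up book in {found_in}",
        "go to library",
        "put book in return slot",
        "return home",
    ]
```

The loop did not get any shorter. It just stopped interrupting the reader who only wants to know what the four steps are.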
So, if you are working in a structural or an object-oriented language, use parameter objects or parameter structures. Encapsulate that state that may need to be passed back and forth between different functional levels. I would really strongly recommend that you avoid globals in any form, whether those are static variables or they are true globals, like in C, unless you cannot avoid it, just because of the possibility of something else coming along and clobbering the state that is exposed to that global mechanism.
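A small sketch of the parameter-object idea, assuming Python's standard `dataclasses` module; the `Errand` class, its fields, and the function names are all invented for illustration. The state travels through the call chain as an explicit argument instead of living in a global that anything could clobber:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Errand:
    """Hypothetical parameter object: the state the steps need to share."""
    title: str
    library: str
    book_in_hand: bool = False
    returned: bool = False


def find_book(errand: Errand) -> Errand:
    # Returns a new Errand rather than changing the one passed in.
    return replace(errand, book_in_hand=True)


def drop_off(errand: Errand) -> Errand:
    if not errand.book_in_hand:
        raise ValueError("cannot return a book that was never found")
    return replace(errand, book_in_hand=False, returned=True)


def return_library_book(errand: Errand) -> Errand:
    """Top line: state goes in as an argument and comes out as a result."""
    return drop_off(find_book(errand))


if __name__ == "__main__":
    done = return_library_book(Errand(title="library book", library="main branch"))
    print(done.returned)
```

Making the dataclass frozen is a design choice for this sketch, not a requirement of the technique; a plain struct passed between routines gives the same encapsulation.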
I would also strongly encourage that you avoid side effects in languages that allow that. So any argument that is passed into your routine, I would say 80-90% of the time, don't muck with them. Don't reassign them, don't alter their state; copy them, create a new instance if you want to return that same type or you want to return some state based on an argument that is passed in. Because both of those, the use of global storage, global variables, and side effects, screw with that separation between functional levels. So it really works to your disadvantage when you are trying to manage and compartmentalise complexity. It is a way that that complexity, unfortunately, can leak out on you in ways that you did not intend.
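Here is a short Python sketch of the difference, with hypothetical function names. The first routine alters the list its caller handed in; the second leaves the argument alone and returns a new list:

```python
def cross_off_in_place(rooms: list[str], room: str) -> list[str]:
    # SIDE EFFECT: this removes the room from the caller's own list.
    # If a routine really has to do this, a comment like this one is
    # the place to say so.
    rooms.remove(room)
    return rooms


def without_room(rooms: list[str], room: str) -> list[str]:
    """No side effect: builds and returns a new list."""
    return [r for r in rooms if r != room]


if __name__ == "__main__":
    to_search = ["bedroom", "living room"]
    remaining = without_room(to_search, "bedroom")
    print(to_search)   # the caller's list is untouched
    print(remaining)
```

With the second form, whoever called the routine can keep reasoning about `to_search` without reading the routine's body, which is the separation between levels that the guideline is protecting.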
These are not absolute rules, of course. Like I said, they are guidelines or recommendations. There are going to be cases when you make a good case for going against these, but I would say: document those very clearly. That's what code commenting is for. It is for when your code needs to violate some rules of common sense, or best practices, as some people like to put it.
The last thing is maybe not as helpful a point, but more high-level. If you are having a hard time decomposing your problem, you may need to think about: is it due to a limitation in the language that you are using? And if so, is it maybe time to reassess your technology decisions? So say you are using a shell script to do something profoundly sophisticated, and the shell language that you are using ― BASH, SASH, ZASH, TC Shell, korn shell, whatever ― is limiting you in some way that maybe a more sophisticated language might not limit you, maybe it is time to re-evaluate, maybe it is time to kind of bite the bullet and say, maybe it was a mistake to do it that way.
I am a strong advocate for recognising when you have golden hammer syndrome, or ‘If all I have is a hammer, everything is a nail’ kind of thing. If you are fighting against your technology ― this is true of anything that I am going to talk about, as far as the practice of programming is concerned ― fighting against your technology, you have to be realistic about, maybe you made the wrong choice. So that is the last thing I have to say about that.