Why Procedural is the Wrong First Paradigm if OOP is the Goal

Joseph Bergin

Pace University

Abstract: Some educators [see Gibbons, for example] have been advocating teaching procedural programming before introducing object-oriented programming (OOP) into the curriculum. They seem to have the opinion that OOP is an add-on that can extend the procedural paradigm. This paper attempts to explain why that is not the case and why we do our students a great disservice by adopting this view. My conclusion is that procedural programming is no more a good introduction to object-oriented programming than it is to functional or logic programming. I will also try to give some practical advice on how to organize an object-oriented course for novices.

I will assume that for whatever reason, whether marketability of the students, or just the fact that by providing more opportunities for abstraction OOP is a better educational tool, you have decided that eventually your students should know OOP. Given that, I believe that you should start with OOP, using some object-oriented language, and more, that you should teach a smattering of object-oriented design along the way.

The implication of this is that you should teach classes very early (first), and that you should teach inheritance fairly early (somewhere in the first course). If you think that this is too complicated to do in a complex language like Ada or C++, then choose a simpler language like Java, Eiffel, or Smalltalk. (Beta, Oberon 2, and Modula-3 are other less well-known choices.)

The Challenge for Educators: In the past we taught procedural programming and many of us learned to do it well. The evidence from current textbooks is that we have not yet learned how to efficiently and effectively teach object oriented languages and systems. Doing so requires understanding of some underlying principles that greatly affect the pedagogy we must use. The ideas herein are intended to guide educators in developing curricula for the near term future in which we choose to use languages like C++ and Java as the primary programming languages for student projects.

Procedural Programming: Pascal and Modula-2 were designed to fit as well as possible with the procedural paradigm. C is less well suited. The procedural paradigm is also sometimes called the procedural top-down method, which I will now attempt to explain as a means of contrasting it with the object-oriented paradigm.

In top-down programming you conceive of the program as a whole as a process to be discovered. You think of the problem as a whole as a process to be decomposed. You decompose the problem into sub-problems with the following three characteristics:

  1. The sub-problem stands alone as a problem in its own right
  2. The sub-problem appears to be solvable, though you may not have an immediate solution, and
  3. If all of the sub-problems in the decomposition are solvable, then you know a composition that will put the solutions of the sub-problems together as a solution of the original problem.

The overall methodology is to apply this technique recursively to the sub-problems, until step 2 degenerates to "I have a solution to this problem". You then write a procedure or function to solve the problems at this level, and then apply the compositions implied by step 3 to get a procedure or function that solves the larger problem.

The nested function feature of Pascal and Modula-2 make these languages especially suited to this task. The solutions to the sub problems are themselves nested within the solutions to the larger problems. C does it less well, as functions can’t be nested in C and so the program structure doesn’t match the problem structure in quite as obvious a way. This means that C is less well suited to information hiding and encapsulation than Pascal and Modula-2. A consequence of this is that the solution is tree-like, with each node in the tree being a procedure or function. We shall see that an object-oriented program has a very different structure.

One of the things not emphasized by the above methodology, but critical to the development of good software, is that the amount of information communicated between the various functions and sub functions needs to be kept small. The information should also be passed explicitly as parameters, and there should be no information flows that do not match the links in the tree. In other words sibling nodes in the tree should have no communication. Since the procedures and functions are what you concentrate on in the process this matching of the data requirements may be hard to do in practice. When you find that you don’t have appropriate data flows between the parts, you sometimes need to rip out large parts of the proposed solution and replace it with one that has better data coupling.

In any case, the lesson to be learned from procedural programming is that the nature of the solution to a problem is that it is a process or a set of actions. But we know that a "program" takes more than processes. A program consists of both process and data. In the words of Niklaus Wirth, the developer of Pascal, Modula-2, and now Oberon: Algorithms + Data Structures = Programs. Procedural paradigm focuses primarily and first on the algorithms and then makes the data-structures fit the processes so developed. This sometimes works well and sometimes fails. We shall also see that OOP turns this focus around.

To look more deeply at top down programming, we can ask, where is the data in such a program. The correct answer to this, assuming that a language like Pascal is being used, is that the data are local to the functions and encapsulated within. That is how we achieve the proper flows of information along the arcs of the tree. At the global level of the program we define types and functions, but little or no data. To do so makes the data too easy to change from throughout the program, making error finding too difficult in practice. It also makes some names too visible, meaning that those same names can’t be effectively reused without hiding the data that they represent.

Maintenance is also a problem with this methodology as the most important reason for program modification is that the problem for which the solution was designed will change over time. It is difficult to predict what forms these changes will take and therefore difficult to design the original solution so that the effects of change will be limited to one or a few of the sub-problems. If we guess wrong, we may need to replace large parts of the solution, noting that whenever we replace one node, we probably need to replace or modify all of its sub-nodes in the tree. In object programming, when the class structure matches the system modeled, this is less of a problem as most system changes involve one or a small number of system components. Only these components of the model need to be modified. There is some evidence [Gabriel] that the standard waterfall model in which the problem solution proceeds from analysis to design to construction to maintenance is not successful in large projects and that such projects are best build by iteration from a small, well-designed core of functionality. This means that the original development cycles and the maintenance cycles are indistinguishable from one another.

One fairly typical component of a beginning course of programming using the procedural paradigm is a discussion of the von Neumann machine architecture. Students programming in this paradigm are encouraged to think of the RAM of the machine as holding both programs (procedures) and data. The program proceeds by transforming the data from an initial state into a final state. Emphasis is put on the state changing statements, namely reads and assignments. The RAM is monolithic. Data is sent to the CPU to be processed. This simple machine model fits well with the procedural paradigm, but less well with other, more abstract, ways of looking at computation. There is a simple relationship between the physical level provided by the von Neumann architecture and the virtual level provided by most procedural languages. This is just not the case with the functional or object-oriented paradigm. The functional paradigm, of course, completely hides the underlying physical architecture. The object-oriented one doesn’t hide it, but turns it on its head. Instead of the data being moved to the CPU for processing, a very common metaphor in OOP is that the CPU moves to the objects to provide processing power. As we examine OOP further, we see again and again that OOP is not an extension of the procedural paradigm, but rather it turns the thought processes completely around.

Object Programming : Now let’s look at object oriented programming, especially as exemplified in a pure OOP language like Smalltalk or Eiffel. When using these languages, you almost never think of a program as a process, and you almost never think about writing one function (method, technically) by decomposing it into simpler functions. Instead, while the procedural paradigm focuses first on the algorithms, the object paradigm focuses first on the data. However, you must interpret that correctly. I don’t mean that you design the low-level data elements first. Instead, what is meant is that the objects in the system themselves are thought of as data (high-level data) that encapsulate the algorithms as well as the low-level data. The encapsulated processes are designed as a secondary consideration. The program itself is seen as a web or graph of interacting elements, each of which acts as a server for some piece of information or other service. Each object can also act as a client of other objects when that object needs to collaborate with those others to achieve fulfillment of its own service. In a single processor system, the flow of control (the processor) moves with the service requests (messages) so that when an object makes a service request of another the requestor waits for the service to be completed and any results to be returned. The processor moves to the server so that it can fulfill the request. When the server completes, it passes any results along with the processor back to the client who requested the service.

The methodology for constructing an object-oriented program is to discover the objects first via a process of simulation. The software objects in the system should model real world objects in the system being modeled. This correspondence between the software system and the real system makes it easier to verify the correctness of the result than is typically the case in a procedural program. But note that what the programmer does is very different here. As an OOP designer, you are looking for elements to model, not procedures to decompose. One looks for objects that exhibit active behavior rather than being acted upon. These behaviors represent the services that the object must provide. If something has no behavior, it probably isn’t an object.

Once you have candidate objects, you assign them responsibilities. These again are the services. Once you know all of the interactions in which a given object will act as the server, you know the complete set of responsibilities of that object. If you have several objects with the same behavior, you have a class that will describe those objects. You are now in a position to define the responsibilities of your objects and to model them with methods. Note, however, that this step is not the first step, but the second. Contrast this with the procedural paradigm in which the processes come first.

As a third step, you need to decide, for each object, what information needs to be maintained within the object so that the service may be performed. Primarily this is the data that needs to be maintained while no service is in process, unlike the data within the methods that is only needed while that method is actually executing. This data represents the instance data of the class (the low-level data). If the objects of a class need to share information among themselves you also need to define this class level data, called static data in some languages like Java.

To conclude this part of the discussion, a procedural program is like a tree of functions and an object program is like a web of clients and servers. These are very different of course, and to be an effective programmer of one kind or the other, you need to be comfortable with the basic job you are trying to do. The view of the nature of the computation (tree vs. web) needs to be natural to your thinking processes so that your typical thought processes lead you to one set of solutions rather than to the other.

This, then, is the nature of the dreaded paradigm shift that procedural programmers go through when trying to become object programmers. There is nothing especially complex about OOP, any more than there is anything complex about procedural programming. It’s just that the world looks completely different in the two paradigms. The experience of the industry is that an experienced procedural programmer will take a year to 18 months to make the switch [Stroustrup, pg 172]. Lattanzi and Henry [Lattanzi] also report on the difficulty of teaching object-oriented principles to students experienced in the procedural paradigm. While the programmers are in this learning mode, they will naturally try to solve problems by decomposing functions and not by discovering objects. Whenever the going gets hard, they will fall back on what they know best–procedural programming. It takes a while for the mind to become re-wired to the new way of thinking. If fact, during this year, the practitioner is likely to build really ugly programs, mixing techniques in an awkward way.

This is why I think it is a mistake for educators to try to teach procedural programming first, if the goal is object programming. The better a job you do in the beginning, the harder it will be for your students later, because their natural problem solving skills will all be wired to look for solutions in the processes and not in the objects. If you do a really excellent job of teaching them to think like a procedural programmer, they will face this 12 to 18 month paradigm shift. I don’t know where to put this year of confusion in a four-year educational program.

I find that it is easier for students to learn pure functional programming (say in lisp) after procedural programming than it is to move them from procedural to object-oriented programming. This is because OOP languages all contain some procedural core, and some OOP languages like C++ actually permit you to avoid the object features altogether, so that you aren’t forced to train your mind. Lisp offers few comforts to the procedural programmer. You just have to plunge in and forget about variables and loops and think in terms of recursive functions. This also explains why C++ is probably not the right language for an experienced C programmer to use to learn OOP, in spite of the familiar syntax. The language does nothing to help guide your thought processes.

I also think that it is easier to move from the OO paradigm to the procedural than the other way. This is because you do learn about functions in OOP. Good procedural programs can be constructed by OOP programmers because they will naturally partition their programs into object-like units of functionality even though they don’t have the encapsulation tools that they are used to in the OOP world.

Course Suggestions: One benefit of OOP is that it gives the instructor tools with which to manage the conceptual load on his or her students. One can construct worlds with which the students interact and within which they program, and that reveal only the details that are important to the educational process at the current level of student understanding. Students don’t have to be exposed to all of the perhaps messy details of a complex language initially.

Let me give an example stolen from an excellent introduction to computing (using Smalltalk) that is delivered by the Open University in the United Kingdom [Woodman]. Before the students write or see a single line of Smalltalk, they interact with a program called FrogWorld. In this program you can create frogs (objects) and make them jump. You can make them jump (send them messages) by either clicking on a button (object) or by typing a line of text ("Freddy jump") into a textbox (object). This text is actually a statement in Smalltalk as is later pointed out. A later exercise lets you look into the implementation of a frog (its instance variables). The program was built with the Model-View-Controller idiom as is also later pointed out. This permits decoupling of the model (frog) from its views (GUI and textual). But the important thing here is that the instructor controls how much detail the student must deal with at any moment by making a world that requires thinking about a few things intensely, but not about everything simultaneously.

These "micro worlds" can also reinforce the mental model that is so important is helping a novice become familiar with a new and complex system. Here the mental model is the client-server interacting web of objects that was discussed earlier. This needs to be done in any paradigm, of course, but OOP gives us good modeling tools for assisting in this effort.

You can spend a bit of time at the beginning of a course having students use programs that you have built and which emphasize the interactions between objects. Students need to have clear ideas about objects, messages, behavior, and state. If the micro world is correctly built, it can also demonstrate inheritance, which is simply programming by extension and specialization. The FrogWorld discussed above has HoverFrogs in addition to the ordinary Frogs. The hover frogs have additional behaviors. This can be discussed long before you ever see a class, much less an inherited class.

If the instructor includes design documents such as class diagrams and sequence diagrams, along with programs shown to the students, they will get the idea that they are important and will begin to see the central nature of the objects, as all of these techniques emphasize some aspect of objects.

Next, as hinted above, you need to have a large kit bag of pre built code with which the students can interact. It is easier to read programs than it is to write them and students can benefit from reading well designed and well written programs. They can also read larger programs (presented with design documents) than they can be expected to write themselves. This is rather different from the typical procedural paradigm course in a language like Pascal, in which students typically write most of the code that they ever see. This is unlikely to be successful in the OOP world, and most OO programmers don’t work that way in any case.

Many of the exercises that you give students can be extensions of the code that you provide, rather than programs built from scratch. Some of these exercises can be simply to repair programs that you have broken in creative ways. Building programs by copying parts of other programs is also an officially sanctioned OOP technique. Finally, building by extension can be a good way to introduce inheritance. This provided code can either come from you or from standard libraries. The Java String class and C++ string class can be a very useful class to study early. You can show some or all of its protocol and have the students write interesting programs using it.

The exercises that you assign to students can also be designed in such a way that the students develop a tool kit over time. Indeed, one of the standard teaching techniques for apprentices in medieval guilds was to have fresh recruits build tools that they would need in the practice of the guild. Over time students could build up a set of classes that they would use in later projects. One of the early classes that I use is a timer that encapsulates the system clock. This implements the protocol of a stop watch and is used later to time programs when the discussions of program efficiency are undertaken. Another useful class at this level is a Die class, which encapsulates a discrete random number generator. This can be used in a number of situations, both to generate random data, and to provide the basis for probabilistic programming. If students have recently taken a discrete math course there are a lot of exercises that can reinforce the ideas learned there. As the course proceeds and you want to discuss a Stack class, for example, you can have the students build one that they can use later in the same course or other courses. Again this is unlike the typical procedural programming course in which the exercises do not build upon one another, but are simply discarded after being read by the instructor.

Finally, some of the low-level programming details can be postponed in an OO course. Operating with objects and sending messages around can result in quite sophisticated behavior without ever discussing loops, for example. Recursion is much more natural and much easier for students to learn, once they get used to objects sending messages, then sending messages to themselves, then sending the same message to themselves. It can actually be introduced without much discussion, and can be done long before loops are introduced. A simple if-then structure is often needed, but the more complex forms such as nested if’s and switch statements can be long postponed.

Again, since students are programming with interesting data (objects) you don’t necessarily need to introduce much of the detail of built-in data early. You don’t need to thoroughly discuss integer and floating data to give the students some material with which to program as they have more interesting things with which to work.

Finding a suitable textbook can be a problem as many of the current crop are not very good. Look for a book that does objects and classes early. Look for one that has ideas and not just programming early. The book should let students work with ideas before they need to implement those ideas in programs. Look for one that isn’t organized strictly along language feature lines. Those books may be good for reference, but are not good for learning. Look for a book that shows design and not just programming. Look for a book that gives the students lots of complete programs to read and steal from. And even then, be prepared to build a good bit of code yourself before the course begins.

My personal list of desirable features in a textbook include placing inheritance fairly early and also including some breadth topics. I think students embarking on a career should get an early feel for the range of possibilities. I also like to see some discussion of the ethical dilemmas that are faced by software developers.

Bibliography and Background Reading

Kent Beck and Ward Cunningham, "A Laboratory For Teaching Object-Oriented Thinking", OOPSLA 89 Conference Proceedings, ACM SIGPLAN Notices, Vol. 24, Number 10, Oct. 1989. (CRC cards)

David Bellin & Susan Suchman ,The CRC Card Book, Addison Wesley, 1997

Richard Gabriel, Patterns of Software, Oxford University Press, 1996

Jeremy Gibbons, "Structured Programming in Java," SIGPLAN Notices, V 33, N 4, April 1998

Craig Larman, Applying UML and Patterns: An Introduction to Object-Oriented Design, Prentice Hall, 1998.

Mark Lattanzi and Sallie Henry, "Teaching the Object-Oriented Paradigm and Software Reuse," Computer Science Education, V7, N1, p99, 1996

Bjarne Stroustrup, The Design and Evolution of C++, Addison Wesley, 1994

Rebecca Wirfs-Brock, Brian Wilkerson, and Lauren Wiener, Designing Object-Oriented Software, Prentice Hall, 1990. (Responsibility driven design. Highly recommended)

Mark Woodman, Rob Griffiths, Malcom Macgregor, Simon Holand, Hugh Robinson, "Exploiting Smaltalk Modules in a Customisable Programming Environment," OOPSLA ’98 submission.

Last Updated: March 31, 2000