(c) Copyright 1996, Joseph Bergin. All rights reserved.

The Object Is Computer Science-- C++ edition

Joseph Bergin

Chapter 1 Strings -- Programming With Objects

1.1 Preliminaries: What a C++ program looks like.

A C++ program is composed of a set of modules. These modules are either supplied by the C++ system itself, or written by the programmer and his or her colleagues. One gets access to the features of a module by "including its header." In the following, the header iostream.h is provided by the system and String.h was built by your author. Headers are more properly called interfaces. The purpose of an interface is to define what is usable in a module.

One of the modules must contain a "main function." This function directs all of the processing, usually by using features from the included headers.


#include <iostream.h>
#include "Strngs.h"

// ******************************
//	Output the message:  Hello World. 
// ******************************

void main( )
{	String greeting("Hello World.");
	cout << greeting << endl; 
}

The main function contains two sorts of things: declarations and statements. Declarations introduce the names of things and the properties of the names. In the above, the name "greeting" is declared in the line


	String greeting("Hello World.");

to be of type String with value "Hello World." This is a "variable" declaration and greeting is a variable. This means that it can hold different values at different times. Names can also stand for constants and other things. In the above, the name "main" stands for a function, and the name cout stands for an object. endl is a constant representing the end of a line of output.

The line


	cout << greeting << endl; 

is a statement. It sends the current contents of the variable greeting, namely "Hello World.", to the standard output device represented by the object cout. It then sends the end line value, endl. The effect of executing this program is to produce

Hello World.

on the standard output device, which is probably your computer screen. Notice that the quotes that enclosed the original value are not printed.

Our program above also includes a comment to describe what it does. It is hardly worthwhile for a program this simple, but will be very useful when programs get more complex. Comments can begin with the characters "//", in which case they extend to the end of the current line, or they can begin with "/*". In the latter case the comment extends, perhaps across several lines, until the next occurrence of "*/".

Let's try another example.


#include <iostream.h>
#include "Strngs.h"

// ******************************
//	Output a greeting
// ******************************

void main( )
{	String greeting("Hello World.");
	cout << greeting << endl; 
	greeting = "Bye now."; 
	cout << greeting << endl;
}

This program starts out just like the first one. It then changes the value of variable greeting to "Bye now." and then prints that on a separate line.

Syntax

	Syntax defines the structural rules of a language.  
	
	We shall use these boxes to set aside some of the rules of C++, 
	including syntax rules.  In these boxes we will use words contained 
	in angle brackets <words> to indicate general forms that must be 
	replaced with a legal alternative to make valid C++ code.  


Variable declaration

	The form of a variable declaration can be any of the following

	<typeExpression> <newIdentifier> ;
				e.g. 		String empty;
						int m;
		or
	<typeExpression> <newIdentifier>(<parameters>) ;
				e.g.		String myself("Joseph Bergin");
		or
	<typeExpression> <newIdentifier> = <initialValue> ;
				e.g.		String George = "Curious";
						String author = myself;
						int size = 200;

	where <typeExpression> can take many forms, one of which is a simple 
	name (identifier) like String;  <newIdentifier> is a name made up by 
	the programmer; <parameters> are values supplied by the programmer; 
	and <initialValue> is a single value supplied by the programmer.  

Some declarations are also definitions. A variable definition not only names the variable and its type, but it sets aside storage in the memory of the computer in which to hold a value. The forms above are all definitional forms.
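To see the difference, here is a small sketch of a declaration that is not a definition. It relies on the C++ keyword extern, which is not otherwise covered in this section, and the names visitors and countVisitor are invented for the illustration.

```cpp
// A declaration that is not a definition: it names the variable and its
// type, but sets aside no storage.
extern int visitors;

// A function may use the name as soon as it has been declared.
int countVisitor()
{
    visitors++;
    return visitors;
}

// The definition: storage is set aside here and given an initial value.
int visitors = 0;
```

The first call of countVisitor returns 1 and the second returns 2, since both refer to the single variable that was defined once.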

1.2 Preliminaries: How to run a C++ program

The method of running a C++ program depends on your computer system as well as on the particular version of C++ that you are using. Some versions use "projects" and some use "makefiles." In either case, you must guarantee that all needed modules are brought together to build an executable.

To start, let's assume you are using Borland C++ on a DOS or Windows PC. For the above program, the needed modules are main.cpp, which contains the program above, and Strngs.cpp, which is supplied with this book. The system modules, like iostream.cpp, are provided by the system (of course) and are known to it. If a C++ system uses an integrated environment, it will likely use projects. These are created with one or more menu selections such as "Create Project" and "Add File." You must first create a new project and then add the necessary files to it. The system will then save the details of the project on the disk, so you can return to your work without repeating these steps. After creating the project and adding the necessary files to it, the program is built with another menu command such as "Make" or "Build". Once the program has been built, you can "run" or "execute" it with another menu command.

A makefile is a description of a program and its parts. If you are using gnu C++ on a UNIX system, your makefile might look like the following. Note that this system expects module files to end in .cc rather than .cpp as was assumed above. This file should be named "makefile". The program is built by executing the shell command "make hello".


CC = /usr/local/bin/gcc

CPPFLAGS = -Wall

sources = 	main.cc \
		Strngs.cc 

objects = 	main.o \
		Strngs.o 

hello : $(objects) makefile
	$(CC)  $(CPPFLAGS) $(CFLAGS) -o hello $(objects) \
			/usr/local/lib/libiostream.a 

main.o : main.cc 
	$(CC) -c $(CPPFLAGS) $(CFLAGS) main.cc
Strngs.o : Strngs.cc 
	$(CC) -c $(CPPFLAGS) $(CFLAGS) Strngs.cc

clean  : 
	rm $(objects)

This file describes how to create your executable "hello" as well as its necessary parts. Once the program has been made, the shell command


hello

will execute your program.

Modules, by the way, are made up of two parts: the interface part, or header, and the implementation. The header is contained in a file whose extension, according to convention, is ".h" and the implementation is contained in a file with extension ".cpp" or ".cp" or ".cc" depending on the particular C++ system. Usually the two files have the same name other than the extension. For example, the String module is supplied in String.h and String.cpp.
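The two parts of a module can be sketched as follows. To keep the example self-contained, both parts are shown in one listing, with comments marking where each would normally live. The file names Greeter.h and Greeter.cpp and the function makeGreeting are hypothetical, and std::string is a standard-library stand-in for the String type of this book.

```cpp
#include <string>

// ---- Interface part: what would go in a header such as Greeter.h ----
// It only announces that the function exists and how to call it.
std::string makeGreeting(const std::string& name);

// ---- Implementation part: what would go in Greeter.cpp ----
// (In a real two-file layout this file would begin with #include "Greeter.h".)
std::string makeGreeting(const std::string& name)
{
    return "Hello " + name + ".";
}
```

A program that wants to use makeGreeting needs to include only the interface part; the implementation is brought in when the executable is built.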

You Try It: Get the hello program running on your computer.

You Try It: Write a C++ program that writes out your own name and address. Build the program and run it.

1.3 Preliminaries: Programming with Objects

Objects are a certain kind of value. In C++ they are usually relatively complex in structure. Other kinds of values are integers (int) and characters (char). Objects are built by programmers to solve specific problems. Objects provide services. The object cout, used above, provides an "output service." When we send it information using the << operator, cout displays that information on the standard output device.

Like other kinds of values, objects have types. The type of cout is ostream, provided in the iostream module. Types define the operations that can be applied to values. For example, int is a built-in type in C++. The integers 5 and -22 are values of this type. The operations defined by type int include + and *, standing for addition and multiplication.

String is also a type, but it is defined by a programmer, rather than built in. When we declared the variable greeting in the above example program, we also gave it a value: the object containing "Hello World." In fact, the declaration above created two things, the variable itself and the value that it holds. The object is the value. The variable gives us access to the object.

1.4 String Objects

We create string objects by declaring variables to hold them. For example


String blank;	// Create an empty string named blank.
String Tom("These are the times...");
String val(256);	// Create a string named val, with characters 2, 5, and 6.  

Strings provide a "character sequence" service: they hold a sequence of character (char) values. A character value is written by enclosing a printable symbol, such as a letter or digit, in single quote marks: 'a' is a character. Once we have created such an object, we want to manipulate it. What can be done with an object depends on the services provided by the object. These are defined by the programmer that created the object's type and they are described in the corresponding interface (header) file.

The first service provided by String objects is "write-ability." We used this above, when we sent our string greeting to the object cout. We can also compose strings with the catenation operator, +. To create a longer string out of parts we just compose them. For example


String first ("These are ");
String next ("the times...");

can be composed with


first + next

to form the longer string containing "These are the times...". We could save this result in a string variable and then print it out with


String result = first + next;
cout << result;

We can also compose a char and a string in either order as in


	'"' + Tom + '"' 

which will print as


"These are the times..." 

Composing two char values, on the other hand, has a different meaning. This is due to the low-level nature of the built-in type char. Type char is actually a numeric type, so + implies adding the internal code values of the characters, rather than composing them. This is rarely useful.
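The difference can be seen in a short sketch. It uses the standard std::string as a stand-in for our String type, and the function names addCodes and compose are made up for the illustration.

```cpp
#include <string>

// Adding two char values adds their internal codes and yields an int.
int addCodes(char a, char b)
{
    return a + b;
}

// To compose two characters into text, start from a string object instead.
std::string compose(char a, char b)
{
    std::string result;     // an empty string
    result += a;            // append the first character
    result += b;            // append the second character
    return result;
}
```

On a system that uses ASCII, addCodes('a', 'b') is 97 + 98, or 195, while compose('a', 'b') is the two-character string "ab".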

Some people have the mistaken idea that computers are smart. They are not. What we call the memory of a computer is just a large collection of variables. A computer "remembers" something by saving a piece of information in a variable.

Another reason that people think computers are smart is that they seem to be able to make decisions. What they really do is examine some value saved in a variable and use that to determine what instructions should be executed next. A program in a computer examines values by comparing values to other values. We can compare two strings for equality with the == operator. For example:


	Tom == first + next

is a comparison. The comparison can be either true or false. C++ uses the constant 0 for false. Any other value is considered to be true. If we examine the values given to Tom, first, and next, this should evaluate to true. There is a type named Boolean defined in this library that will often be used in place of int to indicate truth values. It is defined in the interface Boolean.h, which also defines constants true and false. The type is named after George Boole, an English mathematician who lived in the middle of the 19th century. His work forms the foundation of Boolean Algebra: the mathematics of truth values.

We can use the value of a comparison to choose what to do next. The following program fragment will print the string "Yep, the same." if the string Tom contains the same characters as the string first + next.


if(Tom == first + next)
	cout << "Yep, the same." << endl;


If Statement

	An if statement has the form

	if ( <condition> ) 
		<statement>

	where <condition> is any integer valued expression. 
	If the expression evaluates to false ( == 0 ) the <statement> 
	part is not executed. Otherwise it is.  Any statement may substitute 
	for the form <statement>.  Note that C++ uses integers to 
	represent truth values, with zero meaning false, and any other 
	value interpreted as true.  
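
A small sketch shows how an if statement treats an integer used as its condition. The function name describe is invented here, std::string is a standard-library stand-in for our String type, and the return statement it uses is explained later in this chapter.

```cpp
#include <string>

// Reports how an if statement treats an integer used as a condition.
std::string describe(int condition)
{
    if (condition)
        return "true";
    return "false";
}
```

Here describe(0) gives "false", while describe(1) and describe(-22) both give "true", since any value other than zero counts as true.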

This last example uses two different kinds of strings. First are our string objects: Tom, first, and next. The other kind of string is a lower level "C-string": "Yep, the same.". The real purpose of our string objects is to make the lower level strings easier to use and especially less error prone. We can mix them freely, however. The following is legal.


String now = "THIS is the time.";

It begins by creating an empty string and then giving it the value of the C-string on the right of the assignment operator.

By the way, the newline character denoted by '\n' is one way to make a string contain a "carriage return," which causes more than one line to be output. For example


if (speaker == "Tom Paine" ) 
	cout << "These are \n the times.";

produces


These are 
 the times.

if the string named speaker has value "Tom Paine".

The same result could also be produced by the fragment


if (speaker == "Tom Paine" ) 
{	cout << "These are" << endl;
	cout << " the times.";
}

In this example we want two statements in the body of the if statement. However, the form of the if statement allows only one. We achieve this by using a compound statement. A compound statement is a sequence of statements surrounded by braces.


Compound Statement

	A compound statement is a sequence of zero or more declarations and 
	statements enclosed in braces.  The form of a compound statement is

	{
		<declaration or statement>
		 ...
	} 

	It is possible to have a compound statement with no declarations or statements 
	in its body.  A compound statement may be used wherever a statement is required.  
	Note that compound statements are not terminated with semicolons.  

Another service provided by strings is an alphabetic comparison service, using the < operator. This ordering is also called lexicographic. This service could be used to sort lists of names into alphabetic order. In the next fragment we see how the computer can be made to make a choice between two alternatives.


if (first < next) 
	cout << first + ' ' + next;
else
	cout << next + ' ' + first;

If the value stored in string first comes alphabetically before the value stored in string next then we will print out the strings in the order first, then next, with a space in between. If string first does not come before string next, then next will be printed first. The computer will only perform one of the actions. Which it will perform depends on the comparison test in parentheses after the word if.

By the way, the ordering of strings is defined by the internal "collating sequence" on characters. This defines the order of characters. In this ordering the lowercase letters are properly sequenced, with 'a' < 'b', etc. and the uppercase letters are also properly sequenced. However, all uppercase letters are less than all lowercase letters, so 'Z' < 'a'. The digit characters, like '8', come earlier than the uppercase letters, and the other symbols like '*' are scattered about. Here we are assuming your computer uses the common code named ASCII (American Standard Code for Information Interchange). It might not actually, in which case the collating sequence may be different than described here. We will see a later exercise in which you can explore this.
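The effect of the collating sequence can be tried with a short sketch. It assumes an ASCII system and uses the standard std::string, whose < operator also compares character codes position by position, as a stand-in for our String type. The function name comesBefore is made up for the illustration.

```cpp
#include <string>

// True when a comes before b in the machine's collating sequence.
// std::string's < compares character codes, position by position.
bool comesBefore(const std::string& a, const std::string& b)
{
    return a < b;
}
```

Under ASCII, comesBefore("Zebra", "apple") is true because 'Z' is less than 'a', even though a dictionary would list "apple" first.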

Notice that we have provided == and <, and from these you can build all of the other tests, such as != and >. Those other tests are not part of String, however, though it would be useful (as well as an exercise later) to add them.
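As a hint of how this might go, here is a sketch that builds the remaining tests from == and < alone. It is written as ordinary functions on the standard std::string rather than as part of our String type. The function names are invented, and the ! operator, which reverses a truth value, has not been introduced yet.

```cpp
#include <string>

// Each test below is built only from == and <.
bool isNotEqual(const std::string& a, const std::string& b)
{
    return !(a == b);
}

bool isGreater(const std::string& a, const std::string& b)
{
    return b < a;           // a is greater exactly when b is less than a
}

bool isLessOrEqual(const std::string& a, const std::string& b)
{
    return !(b < a);        // a is not greater than b
}

bool isGreaterOrEqual(const std::string& a, const std::string& b)
{
    return !(a < b);        // a is not less than b
}
```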


Else Statement

	An else statement has the form

	else
		<statement>

	where any statement may replace the <statement> form.  
	Note that an else statement is legal ONLY after an if statement.  
	When an else statement follows an if statement, the <statement> 
	of the else is executed if and only if the <condition> of the if 
	statement is false when evaluated.  

Strings are not the only kind of data that we can manipulate. C++ provides a large number of data types. One of these is named int, and represents integer values, such as 55 and -231. Integer values have a large number of operators to perform arithmetic and comparisons.


int i = 5;
int j = 7;
if ( i <= j ) 
	cout << i << " is smaller. ";
else
	cout << j << " is smaller. ";

The next service of strings is the ability to get at and modify individual characters in a string. The characters in a string are indexed, starting with index 0 for the first character. Notice that spaces count as characters.

Figure 1.1 Indexing characters in a string.

We can therefore access the fourth character of string Tom with Tom[3], which is a lowercase s. We can actually change this character to another, thereby modifying the object Tom.


Tom[3] = ' ';

replaces this fourth character with a space.


Assignment statement

	An assignment statement has the form

	<variable> = <value> ;

	where <variable> has been defined to be a variable and 
	<value> has a type compatible with that of <variable>.  
	Note the required semicolon that terminates the assignment statement.  

The assignment can be used to make the computer "change its mind." If it has "remembered" one value of a variable, we can give it another value to replace the original by assigning the variable the new value. We can even make the computer do some things that might look foolish to a mathematician.


int x = 5;
x = x + 1;  // x is given the value 6

The assignment here looks like a comparison between a value x and a value one more than x. It isn't a comparison, however. We are changing the value of x from its old value, represented by the x on the right of the "=" symbol to the value one higher, x + 1. This new value is then made to be the value of variable x. This process of increasing the value of an integer variable is very common in a computer. It is the basis of counting things, which computers do quite well. C++ provides a short-hand notation for such an increase or "increment" of a variable.


	x ++;

is equivalent to


	x = x + 1;

If x originally had value 10, it will then have value 11.

Suppose that we have the following strings:


String boss("John is the boss.");
String Mary("Mary");

If Mary replaces John as the boss we would want to change the value of the string boss. To do this requires changing four characters in string boss. We can do this by repeating the execution of a statement four times. This process of repeating an instruction is called looping.


int i = 0;
while(i < 4)
{	boss[i] = Mary[i];
	i++;
}

While Statement

	A while statement causes repeated execution of a <statement> that
	 it contains.  A while statement has the form

	while ( <condition> )
			<statement>

	The <condition> is tested first and if it is false the <statement> is 
	not executed.  If the <condition> is true then the <statement> is 
	executed and the entire while statement is re-executed, starting with the 
	<condition>.  This implies that the <statement> is repeatedly 
	executed until the <condition> is false for the first time.  If nothing 
	occurs in <statement> that might change the truth of <condition> 
	then the while will be executed repeatedly without end.

This fragment first sets a new int variable i to value 0. It then checks to see that i < 4 (of course it is). Then it executes


	boss[i] = Mary[i]

but i has value 0, so this is the same as


	boss[0] = Mary[0]

So the string now has 'M' in place of the 'J'. Next, the computer will increment i and will test the new value of i with i < 4. Since i is now just 1, we again get true for this so we execute


	boss[i] = Mary[i]

again. This time it is equivalent to


	boss[1] = Mary[1]

so now we have 'a' replacing 'o'. This continues until i has value 4 after the increment step, because the test is i < 4.

Now if we write the string boss, we will see:


Mary is the boss. 

To reiterate: the statement above is called a while loop. It causes the statement


boss[i] = Mary[i];

to be executed four times, each time with a different value of i, between 0 and 3. It says: start the integer variable i at 0 and repeat until it just reaches 4, incrementing it each time (i++) after executing the assignment statement. A while loop is very useful when the programmer wants a statement to be executed repeatedly. There are other loop statements in C++, the do statement and the for statement. These will be discussed later.
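The whole loop can be packaged so that it is easy to try. The following is a sketch only: it uses the standard std::string as a stand-in for our String type, the function name overwriteStart is made up, and it assumes that both strings hold at least count characters.

```cpp
#include <string>

// Copies the first `count` characters of `from` over the start of `target`.
// Assumes both strings hold at least `count` characters.
std::string overwriteStart(std::string target, const std::string& from, int count)
{
    int i = 0;
    while (i < count)
    {
        target[i] = from[i];
        i++;
    }
    return target;
}
```

With these assumptions, overwriteStart("John is the boss.", "Mary", 4) gives back "Mary is the boss.".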

We need to be a bit careful when we index characters. If we try to use a negative index or one equal to or larger than the length of the string we will get an error and our program will halt with a message.
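One way to stay out of trouble is to test an index before using it. The sketch below does this for the standard std::string, used here as a stand-in for our String type. The names isLegalIndex and charAtOr are invented, and the && operator, meaning AND, is discussed later in this chapter.

```cpp
#include <string>

// True when i is a legal index for s, that is, 0 <= i < length.
bool isLegalIndex(const std::string& s, int i)
{
    return i >= 0 && i < static_cast<int>(s.length());
}

// Returns the character at index i, or the fallback when i is out of range.
char charAtOr(const std::string& s, int i, char fallback)
{
    if (isLegalIndex(s, i))
        return s[i];
    return fallback;
}
```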

You Try It: Build a small program to test the following code fragment. Remember that you need to write a complete program. Just make sure it contains this fragment in its "main".


String test("012345");
int i = 0 ;
while ( i < 10) // error here
{	cout << test[i];
	i++;
}

For what values of i is the above legal?

As an aid in avoiding this error, a string will tell you its length if you ask. We simply execute


int len;
len = boss.length();

The assignment instruction here is structurally quite complex. The value on the right hand side of the assignment operator is an example of a message. We say that we send the length message to the string object boss. The allowable messages for a given object type are defined in the interface for the type, here String.h. Integer variable len will now hold the length of the string boss. Most services of objects are provided when we send messages to the objects of which we request the service.

We can combine the declaration of len and giving it a value with


int len = boss.length();

This is technically not an assignment, but an initialization of len. This is an important distinction.
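The distinction can be made visible with a small experiment. The sketch below goes well beyond what we have covered so far, since it defines a new type with its own initialization and assignment operations. The type Tracked and both function names are hypothetical and exist only for this illustration.

```cpp
// Hypothetical type that records how many times it has been assigned to.
struct Tracked
{
    int value;
    int assignments;

    // Runs when a Tracked object is created (initialization).
    Tracked(int v) : value(v), assignments(0) {}

    // Runs when an existing Tracked object is given a new value (assignment).
    Tracked& operator=(int v)
    {
        value = v;
        assignments++;
        return *this;
    }
};

// Number of assignments performed by:  Tracked t = 5;
int assignmentsAfterInitialization()
{
    Tracked t = 5;
    return t.assignments;
}

// Number of assignments performed by:  Tracked t = 5;  t = 6;
int assignmentsAfterOneAssignment()
{
    Tracked t = 5;
    t = 6;
    return t.assignments;
}
```

The first function returns 0 and the second returns 1: the = in a declaration does not count as an assignment, even though it is written with the same symbol.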

You Try It: Before you run it, predict what will be printed by the following code fragment


String howMany("Apples\nOranges\nGrapes.");
cout << howMany << endl;
cout << howMany.length() << endl;

Now run the fragment and determine if you were correct. Justify any discrepancies.


Message

	A message has the form

	<anObject> . <serviceName> ( <actual parameters> )
	
	where <serviceName> is one of the services provided by the object, 
	and <actual parameters> are additional information required to perform 
	the service.   

You Try It: Write a program that will write out the characters at index 10 through 12 of the String Tom = "These are the times...". Use a while loop. The statement part of the while loop should print a single letter.

Another message understood by strings is the sub message. This permits us to extract substrings from a given string. This could have been used in the exercise just above. The sub message requires two parameters, however. The first is the index of the character at which we should start to extract and the second is the number of characters that should be extracted. For example we could extract the third word, "the," of Tom with

String third = Tom.sub(10, 3);

Note that the "t" of "the" is at index 10, starting with 0, from the beginning of the string Tom. Many messages require parameters. The order and types and meanings of the parameters are likewise defined by the programmer that provides the interface.
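Readers without the String module at hand can try the same idea with the standard std::string, whose substr operation takes a starting index and a count in the same order. The wrapper below is only a sketch of a stand-in for the sub message, not the String implementation itself.

```cpp
#include <string>

// Stand-in for the sub message: `count` characters of s, starting at
// index `start`.  The original string s is not changed.
std::string sub(const std::string& s, int start, int count)
{
    return s.substr(start, count);
}
```

With Tom holding "These are the times...", sub(Tom, 10, 3) gives "the" and leaves Tom as it was.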

You Try It: Write a program that will write out the characters at index 5 through 12 of the String Tom = "These are the times...". Use the sub message. Be careful. How many characters are there at indices 5 through 12? Hint: it isn't seven. After you have extracted these letters and printed them, you should also print Tom. Notice that extracting letters from a string doesn't change the string they are extracted from.

Sometimes we need to search a string for the occurrence of a character; perhaps because we want to extract some characters from that spot. Therefore strings also understand the search message. Search also takes two parameters. The first is the char to be searched for, and the second is the index at which to start the search. If we don't supply the second parameter at all, then the object will assume that we want to begin the search at index 0. The object will return the index of the next occurrence of the character. If the character can't be found at or after the starting index, it will return -1 to us, to tell us that it wasn't there.


int spaceIndex = Tom.search(' ', 10); 

This should return 13 to indicate the space between "the" and "times".
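The behavior just described can be sketched with the standard std::string, whose find operation reports a special value, npos, when nothing is found. The function searchChar below is a made-up stand-in for the search message. It converts that special value to -1 and lets the starting index default to 0.

```cpp
#include <string>

// Stand-in for the search message: index of the first occurrence of c at or
// after index `start`, or -1 when there is none.
int searchChar(const std::string& s, char c, int start = 0)
{
    std::string::size_type where = s.find(c, start);
    if (where == std::string::npos)
        return -1;
    return static_cast<int>(where);
}
```

With Tom holding "These are the times...", searchChar(Tom, ' ', 10) is 13, searchChar(Tom, ' ') is 5, and searchChar(Tom, 'z') is -1.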

The final service of strings that we discuss here is a "read-ability" service. We can read strings from any convenient istream object, such as the built-in object cin, provided by the interface iostream.h. If we execute


String sentence; 
cin >> sentence;

then an entire line of input will be read and placed into the string named sentence. When we read from a stream like cin, which is normally the keyboard, the computer will give no indication that reading is being done. This means that when we expect a user to type a value for a variable, we should first output a hint about what is required. This output string is called a prompt.


String sentence;
cout << "Please enter a line of type.  End with a <return>." << endl;
cin >> sentence;

You Try It. No-really: Write a program to read a line of input. Then, have your program analyze the input to find and print the index of the first space character. What will happen if the user doesn't include any spaces in the input?

1.5 Using Strings

Now we would like to do some interesting things with strings. Strings are pretty versatile, so there are many possibilities. To get started, let's suppose we have a string named word that contains a word, and we would like the word capitalized. The standard interface ctype.h contains an operation named toupper that will transform a character into uppercase, provided that it is already lowercase. It will leave all other characters unchanged. To get access to this function, we must


#include <ctype.h> 

at the beginning of the file in which we need to use toupper. We next need to decide whether we want to change the value of word itself or to generate a new string containing the capitalization of word without changing word. For the former we could use:


word[0] = toupper(word[0]);

If the initial value of word was "this", then its final value will be "This". If we want a new string we would need to do more, perhaps creating a new string named copy as in


String copy (word);
copy[0] = toupper(word[0]);

Before we get too far, it's time to give a few rules. Any variable that you use in your program must be declared before you can use it. To declare a variable you must give its type, and perhaps its initial value. Once it is declared, however, you do not repeat the type with each use. Therefore, in the two line fragment above, we introduce the new variable copy, so we give its type, String. Thereafter we may use it by just using its name. Many languages require that we separate all of our declarations and write them in a special place, usually before all of our statements. C++ is more flexible, permitting us to place declarations where we find it most convenient. Many programmers do collect declarations at the beginning of a program, as it sometimes helps to find them, but it is not necessary. We must place the declaration at or before the first use of the variable, however.

So far we have seen mostly program fragments. Program fragments are nice because they are short and easy to read and understand. Real programs, however, are usually quite long and they can be very hard to understand. One way to control complexity in programming is to package up a useful fragment of code and give it a name so that we can refer to it, and even execute it, by giving its name, rather than giving the details of the code fragment each time we need it. This is especially useful when we need to use the same actions several times in a program. In C++, a package of program code is called a function. It is similar to a mathematical function because it usually has inputs or parameters and it usually computes some kind of a return result. We have seen one function already. Every C++ program must contain a function named main. We have also used several functions, most recently toupper, exported from ctype.h.

Depending on the program we were writing, we might need to capitalize words quite often. In that case it would be advantageous to provide a function to perform the capitalization. In this way we would not need to remember the details of capitalization, but only the name of the capitalization function. It would also provide a less error-prone solution, since we wouldn't be repeating the same instructions several times, each repetition of which might contain an error.


String capitalize(String s)
{	String result(s);
	result[0] = toupper(result[0]);
	return result;
}

Function Definition

	The form of a function definition is

	<returnTypeExpression> <newIdentifier> ( <parameters>) 
		<compound statement>

	where <compound statement>  has the form
		{
		<list of zero or more statements and declarations> 
		}

	Note that a compound statement does NOT have a semicolon at its end.  

We pass information into functions via their parameters. Parameters are a special kind of variable that can be referenced from the compound statement inside a function definition. We get information back from a function via its returned value, whose type is listed first in the function definition.


Return Statement

	The return statement has the form

	return ;
		or
	return <value> ;

	The latter form is used in functions.  It causes the processing of the 
	function to be terminated and the value to be sent back as the 
	function result.  The first form is optionally used in procedures, 
	which are discussed below.  It causes the termination of processing 
	of the procedure.  

For the function named capitalize, the return type is String, so we will get some new string back from the function when we call (or execute) it. We also pass a string into capitalize through its parameter named s. The function makes a new string named result, as an exact copy of s. It then capitalizes the first character of the result string. Finally it terminates, passing back the string named result.

A programmer will use this function by "calling" it. To call a function you use its name and supply actual values for any parameters. These actual values supplied for the parameters are also called arguments. The function call may appear anywhere that a value of the return type may appear. Therefore we may call capitalize anywhere that a String may be used. For example


String aWord("this");
cout << capitalize(aWord);

is legal because it is legal to send a String to cout and capitalize returns a String. Some functions don't have any parameters. To call them we must still write the parentheses ( ) after the function name.

You Do It: There is an error in the above version of capitalize. A string of length zero doesn't need any change to get capitalized. What happens if we try with the above version? Write a better version using if.

A more interesting example is a function to produce "pig Latin." The simplest form of pig Latin simply moves the first letter of the word to the end and then appends the letters "ay". Thus "cat" becomes "atcay". An empty string is its own pig Latin.


String pig( String s)
{	String result;
	if(s.length() > 0)
		result = s.sub(1, s.length() - 1) + s[0] + "ay";
	return result;
}
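
For readers who want to try this without the String module, the same function can be sketched with the standard std::string, where substr plays the part of the sub message. This is a stand-in written under that assumption, not a replacement for the version above.

```cpp
#include <string>

// Simplest pig Latin: move the first letter to the end and append "ay".
std::string pig(const std::string& s)
{
    std::string result;                      // stays empty for an empty word
    if (s.length() > 0)
        result = s.substr(1, s.length() - 1) + s[0] + "ay";
    return result;
}
```

Here pig("cat") gives "atcay", and pig("") gives the empty string.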

A common problem in computer science is to count things. Suppose we have a string that is supposed to contain at least one sentence. We would like to count the characters in the first sentence in the string. Errors are possible, however, so we should decide what to do if we can't find a period in the string, indicating that it isn't really a sentence. Since the length of a sentence is an integer (number of characters) and this needs to be at least 0, we could decide to return -1 if we can't find a period. Another decision is whether to count the period itself: we won't. Finally, we should decide what to do about spaces, especially spaces before the beginning of the first word. We will take the simplest approach here and just count them. We pass in a string to be counted as a parameter, and we get back an int.

For this program we can't use the length of the string to help us, since we don't know how long the first sentence in the string is. It might be much shorter than the string as a whole. However, if we know that there is a period somewhere in the string we can use a while loop such as the following


int count = 0;
while (aString[ count] != '.' )
	count++;

This loop only exits when the character at index count of variable aString is equal to a period. The "loop condition," aString[count] != '.', compares for non-equality ( != ), and we continue to repeat count++ as long as this remains true. In the current problem, however, we can't guarantee that the string even has a period. Therefore, we need to be more careful.


int sentenceLength(String s)
{	int result = 0;
	while (result < s.length() && s[result] != '.') 
		result++;
	if(result == s.length()) return -1;
	return result;
}

Inside sentenceLength we initialize an integer variable to zero. Then we use a while loop to continually increment the variable result as long as the result stays less than the length of the parameter AND (&&) the current char (s[result]) is not equal to (!=) the period character. When we exit this while loop we will have counted the characters in a legal sentence, but in an illegal sentence we will have exited with the first condition (result < s.length()) false, but the other still unfulfilled. We thus use an if statement to test for this possibility and return -1 if it is so. If not we continue executing here and return the previously computed value of result.


String Tom("These are the times.");
cout << sentenceLength(Tom) << endl;

produces 19.
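
If you are working with the standard library rather than the book's String class, the same result can be sketched with the find member function of std::string in place of the explicit while loop. This is a different technique from the loop above, not the author's method. The name sentenceLengthStd is hypothetical.

```cpp
#include <string>

// Sketch only: std::string stands in for the book's String class.
// sentenceLengthStd is a hypothetical name used for illustration.
int sentenceLengthStd(const std::string& s)
{
	std::string::size_type pos = s.find('.');   // index of first period, or npos
	if (pos == std::string::npos) return -1;    // no period: not a sentence
	return static_cast<int>(pos);               // number of characters before the period
}
```

The index of the first period is also the number of characters in front of it, which is why no separate counter is needed.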

Notice that our first version of capitalize, above, leaves unchanged the string passed as a parameter and returns a new string instead as a result. Let's try a different approach and write a function to modify the string passed in. The first difference is that this new version doesn't need to return anything, since it will be modifying its parameter. We indicate this by supplying void for a return type. This just indicates that there is no returned value. Such a "void function" is also called a procedure. A procedure is called as if it were a statement. Our first try at this procedure is


void capitalize(String s)
{	s[0] = toupper(s[0]);
}

A call of this procedure might be something like


String aWord("this");
capitalize(aWord);
cout << aWord;

Unfortunately this reveals that our procedure had no effect on the variable aWord. It is still spelled "this". The problem is caused by the fact that C++ will pass a copy of the parameter string aWord to the procedure and not the object itself. It is the copy that was modified. If we put an output statement as the last statement in the procedure we will see this change in the parameter s, but not in the original string. In this case we need to pass the object itself, rather than a copy of it. We do this by using a reference parameter instead of the ordinary value parameter. A reference variable or parameter is a variable that doesn't "contain" an object but "refers to" an object. This can be described as having an alias (or alternate name) for the object. The correct form of this new version is as follows.


void capitalize(String& s)
{	s[0] = toupper(s[0]);
}

The ampersand that ends the type expression just before the parameter name indicates that the parameter is to be a reference. Nothing else is changed and the call is made in the same way, but now the string passed in is modified. Notice that a procedure (void function) does not need a return statement in its statement part. You may use one, and if it is executed, the procedure will immediately terminate and the control of the processing will return to the code that made the call. A procedure automatically returns when we come to its end. If you use a return statement in a void function it should not have any value. In a proper function, on the other hand, you always need a return statement that gives the value to be returned as the result.
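
To see the two kinds of parameter side by side, here is a small sketch that assumes std::string in place of the book's String class. Both function names are hypothetical. Each one also uses a return statement with no value to leave early when the string is empty.

```cpp
#include <cctype>
#include <string>

// Sketch only: std::string stands in for the book's String class,
// and both function names are hypothetical.

// Value parameter: s is a copy, so the caller's string is untouched.
void capitalizeCopy(std::string s)
{
	if (s.empty()) return;   // a return in a void function carries no value
	s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
}

// Reference parameter: s is an alias for the caller's string.
void capitalizeInPlace(std::string& s)
{
	if (s.empty()) return;
	s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
}
```

Calling capitalizeCopy on a variable holding "this" leaves it spelled "this", while capitalizeInPlace changes it to "This". The two bodies are identical; only the ampersand differs.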

It would be a good idea to try some of these examples. But where do you put your function and procedure definitions? There are a couple of choices. The first is to place them in the main module just before the main function. Another is to place them in the implementation part of some other module. If we want to do this we will need to provide an interface part for that module and give enough information about the functions in the interface to allow them to be called. We do this by providing a function declaration, also called a function prototype.


Function Prototype

	The form of a function prototype is

	<returnTypeExpression> <newIdentifier> ( <parameterTypes>) ;

	This gives enough information to call the function, since we have its name, 
	its parameters, and its return type. 

It is also possible to put function prototypes into the main module before the main function, with the actual definitions following the main function. This layout is preferred by many programmers. In a function prototype, the names of the parameters are optional, but the types must be given. For our second version of capitalize, we could use either


void capitalize(String& s) ;
or

void capitalize(String& ) ;

A sample main.cpp to test this function follows


#include <iostream.h>
#include <ctype.h>
#include "String.h"

void capitalize(String& ) ;

void main( )
{	String test;
	cin >> test;
	capitalize(test);
	cout << test << endl;
}

void capitalize(String& s)
{	s[0] = toupper(s[0]);
}

You Try It: Build and test the above program. Modify it so that it produces an appropriate prompt before the input line.

If you have tried numeric strings, like String(5566), perhaps you noticed that the string representation has several blank spaces to the left of the first digit character. Perhaps you would like to strip these blanks. The following function will serve


String strip(String &s)
{	String result;
	unsigned int i = 0;
	int len = s.length();
	while(i < len && s[i] == ' ') i++;
	while(i < len) result = result + s[i++];
	return result;
}

Here we see a new type, the "unsigned int". An unsigned int is just a non-negative integer, representing a value in the range from zero to some maximum that depends on your computer system. It will probably be the value 2^32 - 1, which is a bit above four billion. Ints themselves use about half this range for negative numbers and about half for positive numbers. Therefore the smallest int is usually about -2 billion and the largest a bit above +2 billion.
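
You do not have to guess at these limits. As a small sketch, the standard header <limits> will report them for your own system. The three helper names below are hypothetical, and the values they return depend on your compiler and machine.

```cpp
#include <limits>

// Sketch only: these helper names are hypothetical.
// The actual values depend on your compiler and machine.
unsigned int largestUnsigned() { return std::numeric_limits<unsigned int>::max(); }
int largestInt()  { return std::numeric_limits<int>::max(); }
int smallestInt() { return std::numeric_limits<int>::min(); }
```

On a system with 32-bit ints these report 4294967295, 2147483647, and -2147483648. Adding 1 to the largest unsigned int wraps around to zero.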

Here we are depending critically on a special feature of the && (and) and || (or) operators. Before we can assure that the expression s[i] is legal, we need to know that i is between 0 and s.length() - 1. Otherwise our program will be terminated. In the first while loop of function strip the condition is (i < len && s[i] == ' '). This is evaluated by first evaluating (i<len). If that is already false we stop evaluating and use the result false. Otherwise we evaluate (s[i] == ' '). Therefore we don't even look at s[i] until after we have already determined that i is legal. This is called short-circuit evaluation and it is used for both && and ||. Some other languages use complete evaluation of all terms before they evaluate the "and" and "or" conditions. In such languages we would need to write a test like this quite differently.
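
Short-circuit evaluation can be observed directly. The sketch below counts how many times the right-hand operand is evaluated; every name in it is hypothetical and exists only for this experiment.

```cpp
// Sketch only: all names here are hypothetical.
// evaluations counts how many times the right-hand operand was evaluated.
int evaluations = 0;

bool rightOperand()
{
	evaluations++;      // record that we were called
	return true;
}

// Returns the number of times rightOperand ran while evaluating left && right.
int countForAnd(bool left)
{
	evaluations = 0;
	bool unused = left && rightOperand();
	(void)unused;       // silence "unused variable" warnings
	return evaluations;
}

// Same experiment for left || right.
int countForOr(bool left)
{
	evaluations = 0;
	bool unused = left || rightOperand();
	(void)unused;
	return evaluations;
}
```

countForAnd(false) and countForOr(true) both return 0, because the left operand alone settles the answer. countForAnd(true) and countForOr(false) both return 1.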

In function strip we start with a new empty string. We then use a while loop to pass over all of the initial blanks of the parameter, being careful not to move past the end of the string. When we finish, i should either be the index of a "non-blank," or should be at the end of the string. We then append the rest of the characters of the input (if any) to the result. The statement


result = result + s[i++];

that forms the body of the second while loop is another complex statement. It starts by computing the char s[i] for the current value of i. It then increments i for the benefit of the next cycle of the loop, but it appends the character it previously found to the right end of result. The expression i++ should be distinguished from the expression ++i, which is also legal. The former uses the current value of i in whatever context the expression appears and then increments i: it is the old value that is used, however. The latter form, ++i, first increments the value of i and then uses the new value in the present context. This would give a very different (and incorrect) result if we employed it here. We can also decrement a variable, say c, with either c-- (post decrement) or --c (pre decrement). The name of the C++ language in fact is derived from the belief that it is an improvement (increment) over the language C.
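
The difference between the two forms is easiest to see on a plain int. In this small sketch, whose two function names are hypothetical, each function returns the value that the increment expression itself yields.

```cpp
// Sketch only: both function names are hypothetical.

// Returns the value the expression i++ yields; i is incremented afterwards.
int valueOfPostIncrement(int& i) { return i++; }

// Returns the value the expression ++i yields; i is incremented first.
int valueOfPreIncrement(int& i) { return ++i; }
```

Starting with i equal to 5, valueOfPostIncrement(i) returns 5 and leaves i at 6. A following call of valueOfPreIncrement(i) returns 7 and leaves i at 7.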

You Try It: Implement and test function strip. Then modify it so that it uses


result = result + s[++i];

in place of the


result = result + s[i++];

What is the effect?

Another useful function is one that will take a string and extract a word from it. Since a long string, perhaps a sentence, can contain many words, we need to tell this new function where to start searching. To do this we will pass in the string to be searched as well as the index at which we want it to start. We will also make the function ignore initial non-alphabetic characters, since they are not part of words. The function will stop extracting when it comes to the first non-alphabetic character after the first alphabetic one. We must also be careful not to process past the end of the string. Finally, we would like to know where the function stopped processing so that we could call the same function repeatedly and extract several words. Therefore we pass the start parameter by reference, so that a variable can be updated as the function proceeds.


String word(String &s, unsigned int & start )
{	String result;
	int len = s.length();
	while(start < len && !isalpha(s[start])) start++;
	while(start < len && isalpha(s[start])) 
		result = result + s[start++];
	return result;
}

When this returns, it will have produced a word of the input string. The input will not be modified, even though we passed it by reference. At the end, the start parameter will index the first character after the word produced. We might call this function (repeatedly) as in:


String Tom("These are the times.");
String temp;
unsigned int startVal = 0;
while(startVal < Tom.length())
{	temp = word(Tom, startVal );
	cout << temp<< endl;
}

This will produce


These
are
the 
times

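This leaves the value of startVal equal to 20. The loop actually makes a fifth call, which finds only the period and returns an empty string, so one blank line follows "times" in the output.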
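
For experimenting without the book's library, here is a sketch of the same idea that assumes std::string in place of String. The names wordStd and allWords are hypothetical. The second function gathers the words into a vector instead of printing them, and it skips the empty string that comes back when only non-alphabetic characters remain.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Sketch only: std::string stands in for the book's String class, and the
// names wordStd and allWords are hypothetical.

// Extracts the next word of s at or after index start; start is left at the
// first character after that word (or at s.length() if none remains).
std::string wordStd(const std::string& s, std::string::size_type& start)
{
	std::string result;
	const std::string::size_type len = s.length();
	while (start < len && !std::isalpha(static_cast<unsigned char>(s[start]))) start++;
	while (start < len && std::isalpha(static_cast<unsigned char>(s[start])))
		result += s[start++];
	return result;
}

// Collects every word of s, skipping the empty string that a call returns
// when only non-alphabetic characters are left.
std::vector<std::string> allWords(const std::string& s)
{
	std::vector<std::string> words;
	std::string::size_type start = 0;
	while (start < s.length())
	{
		std::string w = wordStd(s, start);
		if (!w.empty()) words.push_back(w);
	}
	return words;
}
```

For "These are the times." the vector holds the four words These, are, the, and times. Declaring the string parameter as a const reference makes the promise not to modify the input explicit.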

Examining function word we find a few new C++ features. The first is the not operator denoted by the exclamation point, "!". This is used to compute the negation of a logical expression. In function word, the not operator is applied to the result of the function call isalpha(s[start]). Function isalpha is exported by the interface ctype.h. It returns a nonzero value (true) or zero (false) depending on whether its parameter is an alphabetic character or not.

Finally, suppose that we would like to capitalize all of the alphabetic characters in a string. We will write a procedure to do this (it modifies its input), and name it shout, since one way to indicate in written work that a speaker is shouting is to completely capitalize the words spoken. Take a few moments and write a solution for this using while loops. The solution below doesn't use loops at all. Instead it uses a procedure that calls itself. Such a procedure is called recursive.


void shout(String &s, unsigned int start = 0) //  modifies s
{	if(start< s.length())
	{	s[start] = toupper(s[start]);
		shout(s, start+1);
	}
} 

Notice that we first check to see that we have at least one character at index start. If so we capitalize that one character and then call the same function again, but with the next larger index. The new call will execute the same code, but since we sent it a new parameter, it will capitalize the next character, and so on, until the last call finds that the value of start that it was passed is equal to s.length(), and the whole process terminates. Programming with recursion is a very powerful tool, but it must be used carefully to avoid an unterminated sequence of calls. Here we guarantee termination because each call takes us closer to the end of the string, and we make a specific check for the end, so we stop when we reach it.

In shout we have also used a default value for parameter start. This makes it possible to call shout giving a value for only the first parameter. If we call it this way then zero, the default value of start, will be assumed by shout. We can also call it with any other legal value (any unsigned int).
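
Here is a sketch of the same recursive procedure written against the standard library, assuming std::string in place of the book's String class. The name shoutStd is hypothetical.

```cpp
#include <cctype>
#include <string>

// Sketch only: std::string stands in for the book's String class, and
// shoutStd is a hypothetical name.
void shoutStd(std::string& s, std::string::size_type start = 0)  // modifies s
{
	if (start < s.length())
	{
		s[start] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[start])));
		shoutStd(s, start + 1);   // recursive call on the rest of the string
	}
}
```

Calling shoutStd(a) on a variable holding "hi there." uses the default start of zero and changes it to "HI THERE.". Calling shoutStd(b, 3) on the same text capitalizes only from index 3 onward, giving "hi THERE.".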

1.6 Summary

In this chapter we have learned some C++ basics such as if and while statements. We have also learned a few things about objects and a few more things about a specific class of objects called String. The following lists the important names and symbols introduced here. Be sure you understand how each is used.

Boolean
char
int
String
unsigned int
+
++
=
==
!=
<
[ ]
"A low level string."
556
-234
'a'
'\n'
( )
&
&&
!
{ }
compound statement
;
if
else
while
return
for
main
cin
cout
#include

Additionally we have seen the following ideas

declaration
default parameter
definition
function
index
message
object
parameter
procedure
prototype
reference parameter
short-circuit evaluation
value parameter
variable

1.7 Exercises and Projects

1.1. C++ has a large number of operators. We have seen several of them in this chapter. Many operators are used for several purposes. The ones we have seen so far are:

+ addition and string catenation
++ increment a variable
= assign a value to a variable OR initialize a new variable or constant
== compare two values for equality
!= compare two values for "not equal"
< compare two values for size
&& logical and
! logical not
<< output
>> input

Some other similar ones are:

- subtraction ( a-b ) and negation( -x )
-- decrement a variable
> greater than
>= greater than or equal to
<= less than or equal to
|| logical or
* multiplication
/ division
% remainder: a % b is the remainder when a is divided by b

Additionally it is often useful to think of certain punctuation marks as operators

[ ] indexing
( ) function calling
. messages

There are many other operators and some of the above have alternate meanings.

Modify the programs that we have used in this chapter or write your own programs to get a feel for what each of the operators above actually produces. Recall that only == and < were provided by strings. Suppose that you want to compare two strings for <=. How would you do it?

1.2. Write a function that will

a) read a sentence from standard input

b) transform each word in it into pig Latin

c) output the resulting string on standard output.

1.3. Write a function that will reverse the characters in a string. For example "This is it." will be transformed into ".ti si sihT"

1.4. Write a function that will transform a string into its coded Caesar cypher. The Caesar cypher replaces each character with one that is a fixed distance away in the alphabet. For example replacing a with d (distance = 3) and b with e, etc. Write another function that will decode the coded string. This program could make you rich. Just create an electronic "decoder ring" and convince the makers of "Cracker Jacks" to put one in every box and pay you a royalty. Simple.

1.5. This is another code problem, but a bit more sophisticated. Create a fixed string, the code, that is a permutation of the uppercase alphabet. A permutation of a string is another string with the same letters in a different order. (In the header "Auxiliary.h" you will find a function

void jumble (String &);

that will help create this permutation.) In the third position of such a permutation, the normal position of 'C', there might be a 'W', or whatever. Read an uppercase string to be coded, called the clear. Use the characters of the clear to index into the code string to find a replacement. To index using characters we must transform them into values in the range starting with zero. For an uppercase alphabetic character stored in a variable ch we can use the value ch - 'A', which will give a result in the range 0...25. Punctuation and other non-alphabetic characters in the clear can be transformed into themselves in the code. Use this to create a coded transformation of the clear. How do you decode such a coded string?

1.6. In the header "Auxiliary.h" you will find a function

String transmit(String &);

that simulates a noisy transmission line. Suppose that you are trying to transmit a message (String) over a noisy modem line. There is a chance that your message won't go through. It is advantageous to be able to determine if the message got through at the receiving end. One way to do this is to use a checksum. Append a "calculated character," called the checksum, to the end of the message before you transmit it. The character is calculated by adding up all of the character codes of the message and taking the remainder modulo 128. (The legal char codes are in the range 0...127.) Your computed character may not be a printing character, but that doesn't matter for this experiment. At the receiving end you recompute the checksum of all the characters but the last and compare it with the last character. If they are the same you have high confidence that the message got through. (How high?) In a realistic system, the receiver will request retransmission when an error is detected. Build the coder for the sending end and the checker for the receiving end.

1.7. What happens in the following?


String greet("hi");
cout << greet << endl;
greet = greet + greet;
cout << greet << endl; 

How long is the string greet after the following program fragment runs?


String greet ("hi");
int i = 0;
while ( i < 9)
{	greet = greet + greet;
	i++;
}
cout << greet.length() << endl;

Run the above program to see if you are right. Now change the value 9 to a larger value and run again. How big does this value have to be before your computer can't finish the program and "crashes" your program?

1.8. Use the idea in the previous exercise to determine how long a string will be accepted on your computer.

1.9. What should the following program produce? What does it actually do?


void main ( ) 
{	int i = 50;
	int x;
	cout << "Please type an integer" << endl;
	cin >> x;
	cout << i / x << ' ' << i % x << endl;
}

Try this for a few different input values, both positive and negative. Now try a zero for the input.

1.10. How large an integer can your C++ system handle? To get an idea, first try to execute the following fragment. Of course you must write a complete program to do so.


int val = 1; 
int i = 0; 
while (i < 50) 
{	cout << i << " : " << val << endl;
	val = val * 2;
	i++;
}

Does the program behave as you would expect? What happens when the computer can no longer hold a value as large as the one you ask it to compute? Keep in mind that a computer is finite. It must necessarily run out of capacity eventually. Think of integers as being similar to numbers on a clock. Immediately after the largest number (12 on a clock) is the smallest number (1 on a clock). For integers, if the largest integer is about +2 billion, then the smallest is about -2 billion. A few computers can handle all of the numbers implied by the above program. If yours can, try the program again, first replacing the 50 with 100.

1.11. A better version of pig Latin applies the rule given above only if the word begins with a consonant. If it begins with a vowel (a, e, i, o, or u), we leave the initial letter but append "yay" to the end. Write a function to produce pig Latin according to this new rule. Hint: Use if statements.

1.12. Some simple programs take an incredibly long time to execute. Consider the following function.


int slow( int n)
{	if(n < 3) return 1;
	return slow(n-3) - slow(n-2) + 2*slow(n-1);
}

This recursive function takes about twice as long to compute a value for n = 25 as it does for n = 24. If you execute the following fragment you will see what I mean.


int i = 0; 
while(i < 30 )
{	cout << i << ' ' << slow(i) << endl;
	i++;
}

The first few lines show up very quickly. For these the execution time is dominated by the time to do the output. But then the computation of function slow starts to take more and more time until this is the dominant part of the work. If it takes about 5 seconds to compute slow(24) and a bit less than 10 seconds to compute slow(25) and if this doubling continues, how long will it take to compute slow(30)? How about slow(45)? How long is ten million seconds? How many seconds have you been alive?

If you want to time the above program you can use a StopWatch object to do so. Include the interface "StopWatch.h" into your program and then create a StopWatch object named timer.


StopWatch timer;
timer.start();
timer.mark();
int i = 0; 
while(i < 30)
{	cout << i << ' ' << slow(i) << endl;
	timer.mark();
	i++;
}
timer.stop();

After a stopwatch is created and you start it, the mark method will print out the elapsed time since it was last started and the "lap time," which is the elapsed time since the last mark.
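
StopWatch.h belongs to the book's own library. If it is not available, a rough substitute can be sketched with the standard header <chrono> (C++11 or later). This is not the StopWatch class, only a stand-in, and secondsFor is a hypothetical helper name.

```cpp
#include <chrono>

// The function from the exercise, unchanged.
int slow(int n)
{
	if (n < 3) return 1;
	return slow(n - 3) - slow(n - 2) + 2 * slow(n - 1);
}

// Sketch only: secondsFor is a hypothetical helper, not the book's StopWatch.
// Returns the wall-clock seconds spent computing slow(n).
double secondsFor(int n)
{
	std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
	volatile int value = slow(n);   // volatile discourages the compiler from dropping the call
	(void)value;
	std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
	return std::chrono::duration<double>(end - begin).count();
}
```

Printing secondsFor(i) for successive values of i gives something like the "lap time" described above. The times you see will depend on your machine and compiler settings.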

1.13. The above exercise demonstrates that some things are difficult to compute. This one hopes to show that there are some things that it is wrong to compute. There are currently (1995) many converging technologies. One of these is wireless communication. You have certainly heard of beepers and cellular phones. One aspect of this is the so-called PIM, or Personal Information Manager. Another technology is that of digital money, which can take many forms. One of the forms is the credit card. Another that is about to appear is digital cash, which shares with ordinary cash the ability to make anonymous transactions. Credit card transactions are not anonymous, since the bank keeps a record of both the buyer and the seller.

It will soon be possible to create a PGD, or Personal Gambling Device. Such a small electronic device could use wireless communication to enable instant gambling of whatever kind the user wanted and the developer enabled; perhaps playing blackjack for money while riding the subway. Users could buy cash cards, similar to pre-paid phone cards, with which to enable the device. Winnings could be added to the card via the wireless network, and losses could be deducted. The user could cash in any value in the card at any time.

There are two aspects we need to consider beyond the technical ones. The first is the legality of such a device. Most States in the USA, and the nation itself, strictly regulate gambling. The device would probably be illegal. But suppose that you created it and licensed it to your home State. The State itself could then sell the devices and become the other gambling partner in each transaction. Then it would not be illegal. You could get a commission on the sale of each device, or even better, a commission on each gambling transaction. Any proceeds that the State earned could be used for some good purpose, such as is now done with many State lotteries, which are used to fund education.

But what of the morality? Remember that gambling addiction is a serious problem for some people. Should you, or the State, make it possible for some people to overindulge in this "vice"? What if a parent gambled and lost all? This would affect his or her dependents. Would the State be responsible for the consequences? Would you be? Discuss the morality of creating such a device. How would this be different if the transactions were not anonymous, but could be identified with an individual? Would this be better or worse?