Selected Reading: Design and Use of C++ by Bjarne Stroustrup
Reading
Talk given November 3, 1997 at Computer Literacy Bookstore in San
Jose, CA
Introduction
On behalf of Computer Literacy Bookshops and the San Francisco
Area Center for Advanced Technology, I'd like to welcome you to
tonight's event. I'd like to introduce Marian Corcoran, CEO and
Founder of the San Francisco Bay Area Center for Advanced
Technology, who will introduce tonight's speaker.
Slide #1
Design and Use of C++: Q&A session.
Bjarne Stroustrup
AT&T Labs
Florham Park, NJ 07932, USA
[email protected]
http://www.research.att.com/~bs
First, I'm going to talk for about twenty minutes, to set a few themes.
I'm going to say a little bit about what programming languages are for -
partly because I'm always befuddled about why people fight so much
over them. After all, they're just programming languages. Then, I'm
going to say a little bit about what I think C++ is and should be. Finally,
I'm going to show a few code samples of the sort you might see on the
first day of a C++ course because I think education is an absolutely
key issue in programming and system building today.
Slide #3: Why would anyone care what programming language
was used to implement a system?
User's view:
- I don't care how you build it as long as it works and is cheap.
- When can I get it?
System supplier/vendor's view:
- When can I get it?
- How do we test it?
- How do we maintain it?
- How do we install it?
- How do we predict things about it?
- Does it have lots of neat features?
Programmer's view:
- ???
- (why/how should/does a programmer's view matter?)
One question that you may like to think a little about is why would
anybody care which programming language you use? I mean, the
user can't see what language you used. Even if he could, he shouldn't,
in my opinion. I don't have to know in detail how the engine of my car
works. If, when driving a car, I can recognize what engine it has,
there's something wrong with that engine. It's intruding itself on my
consciousness in a way it's not supposed to. I just want to drive a car,
I don't want to know which programming language was used to
program its fuel injectors. I don't care; I want it to work; I want it to be
cheap; and I preferably want it delivered yesterday.
Programmers have a lot of opinions, as you all know, but why do their
opinions matter? Why, as non-programmers, should we care? I think
that, as programmers, we ought to have a good answer to that
question, because if it doesn't matter, then the managers are right in
just hiring the cheapest people they can find and telling them exactly
what to do and how to do it. We ought to have a higher degree of
professionalism. We can only do that if we can answer simple
questions like this.
I don't know if you're experienced with code generators that allow you
to work from a very high level but generate low-level code from which
you can't get back to the high level again. In that case, you have to
maintain some rather unpleasant machine-generated code in a low-
level language. I don't like that myself. Most of the time, I'm not
actually writing code. I'm trying to figure out what the code is doing,
either my code or somebody else's code.
Slide #5
Original idea: combine C's strengths as a systems programming
language with Simula's facilities for program organizations
Here's the original idea about C++. I wanted C's strength for a
systems programming language, and I wanted Simula's facilities for
program organization.
When C++ first came about, I called it "C with classes." C because C
was the best systems programming language around: it was efficient,
it was portable, it was very flexible. It was available, and it was known.
However, one of the first things I did to build C with Classes from C
was to improve the static type system because I don't really want
"sqrt(2)" to mean "segment violation." I would like to make sure that
run-time errors happen as infrequently as possible.
The other part of "C with Classes" that became C++ was the classes;
the Simula aspects of program organization; the notion that you write
code by figuring out what your concepts are, and mapping them into
classes in your program; taking the view that a class is a type, and
static checking.
Over the years, a fair number of rules came about for the design of
C++. You can't just sit down and design a language. As soon as you
have a little bit of success, everybody comes and wants the language
to be stable except for two things they absolutely want added to it. So,
you can't just design the language, sort of from day to day. You
have to slowly build up a set of rules by which you live. Rules allow
you to say "yes, your suggestions are very nice ideas, but the rules I
operate under are these, and those suggestions don't quite fit."
The net effect of all of this is that C++ is a better C - meaning that it is
a language where you do the things you usually do in C, in better
ways, without additional overhead and without restrictions on what you
can do. It supports data abstraction - the notion of having concepts
represented directly. It supports object-oriented programming - from
Simula in the form of class hierarchies and use of such hierarchies. And it
supports generic programming - the ability to parameterize types and
functions with other types, which is very useful.
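As a small illustration of what "parameterize types and functions with other types" can look like, here is a minimal sketch. The names `largest` and `Range` are invented for this example and do not come from the talk or from the standard library; the only requirement they place on the element type is that it supports `operator<`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Function template: works for any element type T that supports operator<.
// Precondition: the vector is not empty.
template <class T>
const T& largest(const std::vector<T>& v)
{
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i)
        if (v[best] < v[i]) best = i;
    return v[best];
}

// Class template: a closed interval [lo, hi] over any ordered type T.
// (Hypothetical helper, written only for this illustration.)
template <class T>
class Range {
public:
    Range(const T& lo, const T& hi) : lo_(lo), hi_(hi) {}

    // True if lo <= x <= hi, expressed using operator< only.
    bool contains(const T& x) const { return !(x < lo_) && !(hi_ < x); }

private:
    T lo_;
    T hi_;
};
```

The same source text serves `int`, `double`, and `std::string`; the compiler generates a separate, statically checked version for each type actually used.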
Slide #10
// Simple program:
int main()
{
// read numbers
// sort numbers
// output mean median
}
// Typical for introductory texts
// Common activity in real code
// read input doing something to each element
// do something with all elements
// the exact number of input values is unknown
I'll show you some very simple code. Here is the kind of exercise you
might be given in the second week of a programming course. If the
programming course has gone really well, it would be on the second
day. Basically, write a program that reads some numbers, sorts them
and outputs the mean and the median. In a real course, probably
you've had a few exercises leading up to this one. I think most of you
are professionals, so there's no problem. The only real problem is to
remember the time when you knew so little, that writing this tiny
program was a challenge. So, it's typical for introductory activities, and
it's actually also an example of activities common in real code. That
makes it a good example; it's a simple form of something real. We
read something from somewhere, check it to see if it is all right, and
we do something with all of the elements afterwards. As a constraint,
assume that we don't know how many of these there are. We could
get the input from a human typing it in, or from a file, or over a
network. This is a realistic problem. Think a little bit about how you
might like to do that in your favorite programming language.
Slide #11
// C-style solution:
int main()
{
double* buf; // use malloc() and realloc()
// or double buf[MAGIC_NUMBER];
double d;
double mean = 0;
double n = 0; // number of elements
// prevent overflow
// explain qsort() and write compare()
//
// - implies detailed discussion of pointers and casts
You have two choices. You can have a pointer to a buffer, then malloc
that buffer, and when you run out of space in it you realloc to get
more. That's sort of the professional way of doing it, I think. The
beginners' way of doing it is to simply put in an array of "enough"
elements - and cross your fingers. Of course, something like that will
never happen in commercial software [sad smile].
You have some variables, like the double you read in, the mean, and
the number of elements. You read them in and check them each time
and update the running mean. That's very easy. Then you sort the
array and output the median. Now, if you see this in the second week
of a first programming course, or the first week of a course in a
programming language you've never seen before, preventing overflow
is actually somewhat difficult. You have to figure out how memory is
managed, how to extend the input buffer or prevent it from
overflowing. The average novice does not get it right the first time. Or
the second time.
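To make the bookkeeping concrete, here is one possible way to fill in the C-style slide. It is a sketch under assumptions, not the code from the talk: the `Buffer` struct, the function names, the doubling growth policy, and the initial capacity of 16 are all invented for illustration, and it keeps a sum rather than a running mean. It is written in the C subset but compiled as C++, which is why the result of `realloc()` needs a cast.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// C-style growable buffer: the bookkeeping a vector would otherwise do.
struct Buffer {
    double* data;
    std::size_t size;      // elements in use
    std::size_t capacity;  // elements allocated
};

// Append one value, doubling the allocation when the buffer is full.
// Returns 0 on success, -1 if memory runs out (buffer left unchanged).
int buffer_append(Buffer* b, double d)
{
    if (b->size == b->capacity) {
        std::size_t new_cap = b->capacity ? 2 * b->capacity : 16;
        // realloc(NULL, n) behaves like malloc(n); the block may move.
        void* p = std::realloc(b->data, new_cap * sizeof(double));
        if (p == NULL) return -1;
        b->data = static_cast<double*>(p);  // cast required in C++
        b->capacity = new_cap;
    }
    b->data[b->size++] = d;
    return 0;
}

// Comparison callback for qsort(): receives untyped pointers to elements.
int compare_doubles(const void* a, const void* b)
{
    double x = *static_cast<const double*>(a);
    double y = *static_cast<const double*>(b);
    return (x > y) - (x < y);
}

// Read numbers from `in`, sort them, and store mean and median.
// Returns the number of values read, or -1 on allocation failure.
// If no values were read, *mean and *median are left untouched.
long read_sort_report(std::FILE* in, double* mean, double* median)
{
    assert(in != NULL && mean != NULL && median != NULL);
    Buffer b = { NULL, 0, 0 };
    double d;
    double sum = 0;
    while (std::fscanf(in, "%lf", &d) == 1) {
        if (buffer_append(&b, d) != 0) { std::free(b.data); return -1; }
        sum += d;
    }
    if (b.size > 0) {
        std::qsort(b.data, b.size, sizeof(double), compare_doubles);
        std::size_t mid = b.size / 2;
        *mean = sum / b.size;
        *median = (b.size % 2) ? b.data[mid]
                               : (b.data[mid - 1] + b.data[mid]) / 2;
    }
    long n = static_cast<long>(b.size);
    std::free(b.data);
    return n;
}
```

Calling `read_sort_report(stdin, &mean, &median)` covers the exercise. Note how much of the code is about capacity, failure paths, and untyped pointers rather than about numbers, which is the teaching problem described above.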
Slide #12
// C++-style solution:
int main()
{
vector<double> buf;
double d;
double mean = 0;
double n = 0;
while(cin>>d) {
n++;
// check d, update running mean
buf.push_back(d);
}
sort(buf.begin(),buf.end());
// output mean median
}
// Short, safe, simple, easy to explain
//
// What about efficiency?
First of all, you just grab a vector of doubles and read them in. Each
time we get a new one, we stick it at the end of the vector. It's the
vector's job to figure out how to grow big enough. And as a matter of
fact, that's what it will do, until you run out of either real or virtual
memory in your operating system. Roughly, what we would say is you
have a vector, you read things into it, you put elements in the vector at
the end (of course). Then you sort it - from the beginning to the end.
That's basically it. You have two examples. The first one,
unfortunately, is the traditional way in which we teach programmers in
languages such as C or Pascal. The second one is what we could do.
One of the problems we had is we could only teach really elegant,
easy style like this in languages that didn't quite scale. If you're a LISP
or a Smalltalk programmer, you just yawn and say "of course, we've
been doing this for a couple of decades". Yeah, but in Standard C++
this can be done within a framework that expands to do all the things
that C and C++ have been used for, and you don't pay in run time, so
you can actually afford to do things right. As I said before, I don't want
to force people to choose between elegance and efficiency.
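For readers who want to compile the slide, the following is one way the elided parts might be completed. It is a sketch, not the speaker's code: the `Summary` struct and the function names are invented here, the mean is computed from a sum rather than updated as a running mean, and the input is taken from any `std::istream` so that it can be tried on a string as well as on `std::cin`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Summary {
    std::size_t count;
    double mean;
    double median;
};

// Median of an already sorted, non-empty sequence.
double median_of_sorted(const std::vector<double>& v)
{
    assert(!v.empty());
    std::size_t mid = v.size() / 2;
    return (v.size() % 2) ? v[mid] : (v[mid - 1] + v[mid]) / 2;
}

// Read whitespace-separated numbers until input ends or a non-number
// appears, then report how many there were, their mean, and their median.
Summary summarize(std::istream& in)
{
    std::vector<double> buf;  // grows on demand; no size guess needed
    double d;
    double sum = 0;
    while (in >> d) {
        sum += d;
        buf.push_back(d);
    }

    Summary s = { buf.size(), 0.0, 0.0 };
    if (buf.empty()) return s;  // nothing to average

    std::sort(buf.begin(), buf.end());
    s.mean = sum / buf.size();
    s.median = median_of_sorted(buf);
    return s;
}

// Convenience wrapper for trying the function on a string.
Summary summarize_text(const std::string& text)
{
    std::istringstream in(text);
    return summarize(in);
}
```

With `#include <iostream>`, `summarize(std::cin)` handles the interactive case. There is no capacity check and no cast anywhere in the code.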
Slide #13
- Education is key - not just training
- Focus on concepts and techniques - not on language features
- Base initial teaching on higher-level data structures and algorithms,
e.g. the standard library - the specifics of pointers, arrays, and free
store come later.
I think education, not just training, is the key. And the way to deal with
that is to focus on concepts and techniques, teaching language
features later. When people come and say I'm drowning in language
features, usually what they mean is "I'm trying to use all these
features, and I can't figure out what they're supposed to be doing."
Well, if you don't know what the features are supposed to do, what are
you doing with them? First you have a problem, then you have the
concepts for the solution, and then you look for the tools to solve
them.
A: Well, it has a realloc, if you use malloc, which you shouldn't. But,
basically, you don't need realloc in C++ because the standard data
containers, such as vectors and lists, expand with the technique I just
showed you. So instead of declaring a simple array, figuring out you've
run out of space, and then trying to realloc, you simply use a vector that
keeps expanding as needed. This, of course, is implemented at a
lower level using something that looks rather like realloc. But it is far
less error-prone and usually more efficient. Most people don't seem to
realize that realloc moves every element, sometimes. They get bitten
when they have a pointer into the array, then they realloc the array,
and then they get surprised the array moved. So, yes, you can use
malloc/realloc in C++, but you shouldn't. And you needn't because this
is a better facility - there are safer and more convenient ways of
getting the same thing done.
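The "array moved" surprise has a vector counterpart: when a vector outgrows its capacity it allocates new storage, and pointers, references, and iterators obtained earlier stop being valid. The sketch below shows one way to observe that and one way to stay out of trouble by remembering positions as indices. Both function names are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `extra` zeros and reports whether the vector had to reallocate.
// A change in capacity() means the elements were moved to new storage, so
// any pointer, reference, or iterator taken earlier is no longer valid.
bool append_reallocates(std::vector<double>& v, std::size_t extra)
{
    std::size_t before = v.capacity();
    v.insert(v.end(), extra, 0.0);
    return v.capacity() != before;
}

// Remember a position as an index rather than a pointer: the index still
// names the same element after the vector has grown, wherever it now lives.
double value_after_growth(std::vector<double>& v, std::size_t index,
                          std::size_t extra)
{
    assert(index < v.size());
    v.insert(v.end(), extra, 0.0);  // may move every element
    return v[index];
}
```

Growth that fits within the current capacity does not reallocate; growth beyond it must. Code that cannot tolerate the move can call `reserve()` up front or, as here, use indices.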
A: I know what a closure is, and there are dialects of C with nested
functions that try to get to that idea. C++, standard C++, does not
have closures or nested functions and won't get them, at least not for
the next five or ten years. Part of the problem is how to define the
context well enough, and part of the problem is that you can get too
much context and get obscure code. That at least is the traditional
answer in the C++ and C world. And in a lot of areas that is a
reasonable answer. On the other hand, there are a lot of algorithms
where it's nice to have the equivalent of a closure. The simplest case is
a "for each" in which you do something to all elements of a sequence.
Another example is taking the sum of a set of elements; where do you get
the context to put the sum in and return it? Compare two sets. How do
you specify what the comparison criterion is? In C++, you generally use
an object that acts like a function when given to an algorithm such as
for_each, accumulate, or compare. This is directly supported in the
standard library through "function objects." Some people call them
"functors." There's a variety of names for them, but basically, you
define a class of which you can initialize objects. This initialization
explicitly gives an object its initial state. That is, you don't pick up the
context, where a context is everything that could affect you. You
initialize an object with a specific set of elements (from the context)
that defines what the object can refer to. Then, you go and apply the
call operator on the object, and at the end of the algorithm you can call
any of the functions that you have defined for that object to extract
information from it. That way you can pass a fair bit of context through
many iterations, for instance. So, there are no closures in C++, and no
nested functions. However, there are function objects that in many
areas serve the same purpose. Function objects allow us to
approximate some functional programming techniques. And actually
do it fairly elegantly and very efficiently.
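A minimal sketch of the two cases mentioned in this answer, summing a sequence and supplying a comparison criterion, might look like the following. The class names `Sum` and `WithinTolerance` and the two wrapper functions are invented for illustration; `std::for_each` and `std::equal` are the standard algorithms used to apply them.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Function object with explicit state: it is initialized with exactly the
// context it needs (a starting value) and accumulates as it is called.
class Sum {
public:
    explicit Sum(double initial = 0) : total_(initial), count_(0) {}

    void operator()(double x) { total_ += x; ++count_; }  // the call operator

    // Member functions used after the algorithm to extract the result.
    double total() const { return total_; }
    double mean() const { assert(count_ > 0); return total_ / count_; }

private:
    double total_;
    long count_;
};

// Comparison criterion as an object: the tolerance is its captured context.
class WithinTolerance {
public:
    explicit WithinTolerance(double tol) : tol_(tol) {}
    bool operator()(double a, double b) const { return std::fabs(a - b) <= tol_; }

private:
    double tol_;
};

double sum_of(const std::vector<double>& v, double initial)
{
    // for_each takes the function object by value and returns the copy it
    // applied to each element, which now holds the accumulated state.
    Sum s = std::for_each(v.begin(), v.end(), Sum(initial));
    return s.total();
}

bool roughly_equal(const std::vector<double>& a, const std::vector<double>& b,
                   double tol)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(), WithinTolerance(tol));
}
```

Because the call operator is an ordinary inline member function, a compiler can typically inline it into the algorithm's loop, which is where the efficiency claim comes from.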
A: Machines have become faster every year for about the last 40
years or so. My experience is that the only thing that grows faster than
hardware cycles is human expectation. I think that throwing away a
factor of 5, 10, 30, 50 of efficiency is acceptable in some areas. But
there are many areas where it is not acceptable. Traditionally, I have
been most interested in applications that are rather demanding on
time or space, and for those the basic efficiency of a programming
language matters.
Essentially, I think there are three parts to your question. One is the
machine efficiency, another is LISP, and the third is garbage collection.
I think they are separate issues. I am sure that today there are many
things that you could do in LISP that you couldn't get away with doing
in LISP ten years ago. LISP happens not to be my favorite language
for most of the things that I'm doing. But clearly the hardware
improvements help all languages including LISP. I think there's some
very nice aspects of LISP that have been quietly forgotten or maybe
people have been ruling out LISP for other reasons.
I tried to get the standards committee to write into the text the fact that
garbage collection could be used and document two or three positions
that have to be explicit for using garbage collection in C++. The third of
the C++ committee that really likes C had conniptions. Therefore the
fact that garbage collection is a valid implementation technique for
C++ is still implicit in the standard.
A: In the early years of C++, I used to go around with a slide that had
two lists: advantages and disadvantages. You found C prominent in
both columns. There's no doubt that some of the main problems that
people have had with C++ have to do with its C heritage. On the other
hand, there is no doubt that some of the things that attract people the
most - and are seen as most important by a lot of people - are part of the C
heritage. I am opposed to anything that decreases the compatibility
between C and C++. I hope that the C community will have a similar
attitude toward C++.
I think for real problems the conversion rules between the various
integer types rank very high on my list of annoyances in C and C++.
For sheer annoyance, the syntax also comes high. But the syntax you
get used to after a week. The only people I really worry about are
people who are proud that they can write things like the definition of a
function returning a pointer to a function without using a typedef or
looking in a manual. When things like that become an issue of pride,
there's something wrong. I think that C's model of arrays and
pointers is actually very fundamental, has a very good match to real
hardware, so I don't particularly want to throw that away. So I would
think very hard before doing something like that, but I don't have to
think very hard because I'm probably not going to touch it.
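For readers who have not met the declarator in question, here is the same function written both ways. The names `BinaryOp`, `add`, `mul`, `select_op`, and `select_op_raw` are made up for this illustration.

```cpp
// A name for the type "pointer to function taking two doubles, returning double".
typedef double (*BinaryOp)(double, double);

double add(double a, double b) { return a + b; }
double mul(double a, double b) { return a * b; }

// With the typedef: reads left to right like any other function.
BinaryOp select_op(char symbol)
{
    if (symbol == '*') return mul;
    return add;
}

// Without the typedef: the same type, but the function's own name and
// parameter list end up wrapped inside the declarator of what it returns.
double (*select_op_raw(char symbol))(double, double)
{
    return select_op(symbol);
}
```

Both functions have identical types, so `select_op_raw('*')(2, 3)` and `select_op('*')(2, 3)` call the same `mul`. The only difference is how long it takes a reader to see that.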
By the way, since I'm in a bookstore, I guess I should point out that I
have yet to answer a question that is not answered in this book (The
Design and Evolution of C++). I took the time a few years ago to
actually sit down and think what is this, why is it here, how did it come
about, and to write it down.
A: I will answer one question about Java. And I'll try to give a
reasonably exhaustive answer, instead of getting into all kinds of
lengthy discussions. Before there was Java there have been other
languages that offered similar facilities - Smalltalk, Modula-3, Eiffel
and such. So Java doesn't look as new to me as it does to some
people. I dislike hype rather strongly. And I think Java is floating on
hype. I dislike proprietary corporate languages rather strongly and
Java is one of those. So that might color my answer a bit.
If you read anything I've been writing for the last 5 years or so, even
as far back as 85, you see that Java doesn't have some of the things
that are really useful in programming languages. The ability to
efficiently provide new primitives. Try writing a new string class in
Java, that you could use to replace the existing one. Having true local
variables, having user-defined and built-in types work according to the
same rules. Having standard type-safe containers. Java is still
incomplete and growing. Java is only just starting down the road to
encompass some facilities provided by C++ - just as other languages
have done.
One of the design criteria for C++ was coexistence with other
languages. This is another criterion that Java hasn't considered - or
rather rejected.
Well, I'll tell you a story. I had an agent from one of the big publishers
phone me a couple of weeks ago, who said: "C++ book sales are
dramatically up, not just your book, and I was just thinking, could it be
refugees from Java?" I very sadly had to explain to him, no that could
not be the case because there certainly weren't enough of them.
A: It depends where you're starting from. For the things I think about
most, I still have too much code using pointers and C arrays. And I
want to clean that up. I have too much code that splatters all over the
global name space. I would like to experiment more with exceptions.
That's a little bit trickier for introducing into old code, just as I never did
find a really good way of building class hierarchies into an existing
system. It is rare that I get to write a system completely from scratch.
Copyright © 1996 - 1999 Computer
Literacy, Inc.