[Phy 405/905 Home Page] [ Lecturer ]

3. Pointers - Part 2

We continue our discussion of pointers, showing their use in array manipulation, and take several detours along the way to discuss internal/external linkage, the keyword static, and other useful things.


Pointers used to access arrays

A very important use of pointers is in efficient manipulation of arrays. First, some basics of the low-level arrays provided in C++.

One-dimensional arrays are defined as in:

int q[20];
int a[] = {3,4,1}; // size of a[] is implicitly given by initializer
int c[10] = {3};
I've included examples with and without initialization. Some details:

const

The size of the array, if given between the []'s, must be a positive integer constant expression, known at compile-time - as opposed to "dynamically-allocated", run-time arrays (see the new operator below). (Note: the GNU compiler includes an extension that allows variable-size array declarations; this feature is not portable and is not standard C++.) Rather than using a bare integer literal, I'd recommend using a const value for the array size:
void some_func(int q_things)
{
    double bad_array[q_things]; // **** illegal! **** use new[] instead

    int bad_size = 20; // bad_size is in principle not a constant expression
    double still_bad[bad_size]; // **** still illegal! ****

    double ok_array[20]; // legal, but 20's a "magic number"
    const int bufsize = 20;
    double good[bufsize]; // legal: bufsize is a compile-time constant
}
The const qualifier means that the object bufsize may not be changed - its value, defined at compile-time, is guaranteed to remain unchanged. Contrast this with our attempt to specify the size of still_bad[] at run-time: although bad_size's value is known at compile-time, in principle its value could change before the statement defining still_bad[]. The const qualifier is useful in many other circumstances, and in contrast to the parameter in Fortran, a const object's constness is made explicit at the point of declaration.
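To make the compile-time versus run-time distinction concrete, here is a minimal sketch. The function names sum_fixed and sum_dynamic are made up for this illustration; the only point is that a const size works for an ordinary array, while a size known only at run-time needs new[] (and a matching delete[]).

```cpp
#include <cassert>

const int bufsize = 20;          // compile-time constant: legal as an array size

// Fills a fixed-size array with 1.0 and returns the sum of its elements.
double sum_fixed()
{
    double good[bufsize];        // size known at compile time
    double total = 0.0;
    for (int i = 0; i < bufsize; ++i) {
        good[i] = 1.0;
        total += good[i];
    }
    return total;
}

// Same idea, but n is only known at run time, so the storage
// comes from new[] and must be released with delete[].
double sum_dynamic(int n)
{
    assert(n > 0);               // this sketch assumes a positive size
    double *p = new double[n];   // run-time sized array
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        p[i] = 1.0;
        total += p[i];
    }
    delete[] p;                  // matching release
    return total;
}
```

Calling sum_dynamic(q_things) is the portable replacement for the illegal bad_array[q_things] above.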

Initializers

The array size may be omitted if the definition includes initialization, in which case the size is calculated from the number of initializers. Initializers are given in braces {}. If the array size has been given explicitly and there are fewer initializers than elements, the remaining elements are initialized to zero. It is an error to have more initializers than elements. (Plain C is slightly more lenient in one case: a character array may be initialized by a string literal that exactly fills it, leaving no room for the terminating '\0'; C++ rejects this.)
double x[1] = {3,4,5}; // **** error **** too many initializers
int id_list[] = {1,2,3,4,5}; // same as int id_list[5] = {1,2,3,4,5};
double y[10000] = {1}; // *only* sets first element to 1, rest to 0.0!
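A small sketch of these initializer rules, using the common sizeof idiom to recover the number of elements. The helper names count_nonzero, demo_partial_init and deduced_size are hypothetical, chosen just for this example.

```cpp
#include <cassert>
#include <cstddef>

// Counts the non-zero elements among the first n elements of v.
std::size_t count_nonzero(const int *v, std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (v[i] != 0) ++count;
    return count;
}

// Two initializers for ten elements: the other eight are set to 0.
std::size_t demo_partial_init()
{
    int c[10] = {3, 4};
    const std::size_t n = sizeof(c) / sizeof(c[0]);   // element count
    assert(n == 10);
    return count_nonzero(c, n);
}

// With the size omitted, it is deduced from the initializer list.
std::size_t deduced_size()
{
    int id_list[] = {1, 2, 3, 4, 5};
    return sizeof(id_list) / sizeof(id_list[0]);
}
```

Note that the sizeof idiom only works where the array itself is in scope; inside count_nonzero, v is just a pointer, so the length has to be passed along.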

Some arrays are zero-initialized by default. The two kinds of declarations we know of so far - file scope (global) variables declared outside of all blocks, and local variables declared within a block (including the block defining a function) - differ in this respect. Locally defined arrays are not initialized by default, while externally defined arrays are. An extern declaration that is not a definition performs no initialization at all; that happens wherever the array is actually defined.

// stuff outside of any block
double ext_x[20]; // all elements initialized to 0.0
int ext_q[20]; // all elements initialized to 0
extern int not_defined_here[20]; // declared only - no initialization

void my_func()
{
double local_x[20]; // not initialized - elements remain
// undefined until assignment
}
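The difference can be checked directly, as in the sketch below. Reading an uninitialized local array is not something we can safely test, so the local case instead shows the usual remedy: give it an initializer. The helper all_zero is a name I made up for the example.

```cpp
#include <cassert>

double ext_x[20];   // file scope, static storage: starts out as all 0.0

// Returns true if the first n elements of v are all zero.
bool all_zero(const double *v, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        if (v[i] != 0.0) return false;
    return true;
}

// A local (automatic) array is not zeroed for us, but an explicit
// initializer, even a partial one, zero-fills the remaining elements.
bool local_zeroed_by_initializer()
{
    double local_x[20] = {0.0};
    return all_zero(local_x, 20);
}
```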

A short detour on different variable types (storage class, linkage)

Storage class

The feature differentiating these two declaration types is not their scope (file vs. local), but instead their storage class, a distinction that applies to all types (not just the derived array type). The local array definitions we've used so far create objects that have storage class automatic, while the extern defined arrays have storage class static. The storage class specifies the "lifetime" of the object:
static object
The memory for a static object is allocated once, and remains in existence for the duration of the program; once assigned a value, that value persists until the next assignment
automatic object
Is "created" (allocated) each time program execution enters its scope (i.e., each time the object's definition statement gets executed), and whose value is then defined only up to exit from the block defining its scope; at this point, the object is deallocated
Objects that are static are zero-initialized by default, while automatic objects are not. Because automatic objects are allocated/deallocated so frequently, they are generally allocated from a special region of memory known as the stack, while static objects live in a fixed data area that is set up when the program is loaded (not on the heap, which is the region used for objects created at run-time with new).
int i; // initialized to 0
void my_sub()
{
int j; // not initialized - undefined value!
cout << "i is "<<i<<" while j has some funky value:"<<j<<"\n";
}
Most compilers will warn you (at least with warnings turned on) that the variable j is used before being given a value.

Linkage

Since I'm on the topic of storage class, I might as well let you know of the two other kinds of static definitions, both confusingly defined using the static keyword. They differ from the extern and local automatic variables defined so far in their linkage.
internal linkage
a name that, if used in other files, signifies different objects
external linkage
a name that, if used in other files, signifies the same object
The file scope declarations we already know about have external linkage, because simple declarations like:
double q; // this line lies outside of any block
automatically give the variable external linkage.
// in mydefs.h //////////////////////////////////////////////////
extern double q; // must be *defined* exactly once in some other file

// in myprog1.cc //////////////////////////////////////////////////
#include "mydefs.h"
void my_sub()
{
double m = q; // m initialized with the value of q
}

// in myprog2.cc //////////////////////////////////////////////////
// for some reason, we don't include mydefs.h for this file
void other_func()
{
extern double q; // name of local scope, but external linkage
cout << "q is "<<q<<"\n"; // same q!
}
In the last function we used extern, which, for names of either local or global scope, tells the compiler that the name has external linkage. Since I tend to minimize use of global objects for clarity's sake, in my programs extern is primarily applied (in header files) to names of functions defined in one file but used in others. Be careful about using extern when you actually intend to define the object: without an initializer, the compiler treats an extern declaration as a declaration without a definition - which is exactly the form that should go in a header file:
// outside of all blocks
extern int i; // declaration but not definition of i
extern int j = 0; // declaration and definition (because explicitly initialized)
int k; // declaration/definition/default initialization to 0
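The declaration/definition split can even be seen inside a single file, as in this sketch (the names shared_total and read_shared_total are hypothetical). In a real program the extern line would sit in a header and the definition in exactly one source file.

```cpp
#include <cassert>

extern int shared_total;     // declaration only: no storage is set aside here

// The declaration above is enough to compile a use of the name,
// even though the definition has not been seen yet.
int read_shared_total()
{
    return shared_total;
}

int shared_total = 42;       // the one definition (it could live in another file)
```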
Anyways ... back to the static keyword. Applied to file scope names, it forces internal linkage:
// outside of all blocks
int k; // external linkage, file scope: can be referred to in other files
static int j; // file-scope, but not accessible outside this file
When I do use global variables, they're almost always static. In this way, they're conveniently accessible to all the functions in the file (file-scope, remember?), but there's no risk of conflict with the reuse of the name in any other files.
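Here is a sketch of the pattern I have in mind, with made-up names (event_count, record_events, events_so_far). The variable is private to the file; the two functions have external linkage, so other files can call them but cannot touch event_count directly.

```cpp
#include <cassert>

static int event_count = 0;   // file scope, internal linkage: private to this file

// Records n more events; n is assumed to be non-negative.
void record_events(int n)
{
    assert(n >= 0);
    event_count += n;
}

// Reports the running total to callers, including those in other files.
int events_so_far()
{
    return event_count;
}
```

Another file is free to define its own static event_count without any conflict at link time.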

Applied to local scope names, static produces a local-scope object with static storage class. This means that in:

int my_sub()
{
    static int num_calls;

    ++num_calls;
    // other code, maybe to print out num_calls every 100 calls
    return num_calls;
}
the variable num_calls is initialized exactly once (to 0 automatically, because it's static), and then updated with each call of my_sub(), with the updated value retained from call to call. With many older Fortran compilers, all variables were effectively static by default.
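A sketch that contrasts the two storage classes side by side. Both function names are invented for the example: count_calls keeps a static counter, while count_calls_automatic uses an ordinary automatic variable that is recreated on every call.

```cpp
#include <cassert>

// Returns how many times it has been called so far.
int count_calls()
{
    static int num_calls;   // static storage: zeroed once, value kept between calls
    ++num_calls;
    assert(num_calls > 0);  // the counter has been incremented at least once
    return num_calls;
}

// Same code with an automatic variable: it starts over on every call.
int count_calls_automatic()
{
    int num_calls = 0;      // automatic storage: must be initialized by hand
    ++num_calls;
    return num_calls;
}
```

Successive calls to count_calls() return 1, 2, 3, ..., whereas count_calls_automatic() returns 1 every time.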

Let's get back to arrays now.

Accessing array elements

The arrays in C++ are very low-level - aside from initialization, one cannot access more than one element of the array at a time. In particular, the following is illegal:
// ***** illegal code follows - there is no "array assignment" expression
const int bufsize=2;
double f[bufsize]={1,20}, g[bufsize];
g = f; // illegal!!!!
Instead, access to each element of an array is accomplished using the subscripting operator []:
int i[4] = {3,1};
cout <<"first element of i[] is "<<i[0]<<" and last is "<<i[3];
Very important note: the index of an array runs from 0...(# elements-1):
const int bufsize = 2;
double r[bufsize] = {3,4};
for (int i=0; i < bufsize; ++i) // i runs from 0 to bufsize-1
cout <<"r["<<i<<"] = "<<r[i]<<"\n";
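Since there is no array assignment, copying has to be done one element at a time. The sketch below does what the illegal g = f was trying to do; copy_array and copy_demo are names I've made up for the illustration.

```cpp
#include <cassert>

const int bufsize = 2;

// Copies n elements from src to dst, one element at a time.
void copy_array(const double *src, double *dst, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)   // i runs from 0 to n-1
        dst[i] = src[i];
}

// Builds f and g as in the text and returns g[1] after the copy.
double copy_demo()
{
    double f[bufsize] = {1, 20}, g[bufsize];
    copy_array(f, g, bufsize);    // replaces the illegal "g = f"
    return g[1];
}
```

The standard library's std::copy from <algorithm> performs the same loop, e.g. std::copy(f, f + bufsize, g).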
The subscript may be of any integral type and may even be negative, and there is no bounds checking!
double f[3];
f[-1] = 3; // unpredictable result
f[20] = 5; // unpredictable result
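If you want bounds checking, you have to supply it yourself. One possible sketch is a small accessor that throws on a bad index; the name checked_at and its interface are my own invention, not a language feature.

```cpp
#include <cassert>
#include <stdexcept>

// Returns a reference to v[i], or throws std::out_of_range if i is
// outside 0..n-1. v is assumed to point at an array of n elements.
double &checked_at(double *v, int n, int i)
{
    assert(v != 0);
    if (i < 0 || i >= n)
        throw std::out_of_range("array index out of range");
    return v[i];
}
```

With double f[3], the call checked_at(f, 3, 1) = 3.0 behaves like f[1] = 3.0, while checked_at(f, 3, -1) and checked_at(f, 3, 20) throw instead of silently scribbling on memory.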
Negative indices are allowed because of the way subscripting is defined - a[i] is simply shorthand for *(a+i) - together with the near-equivalence of pointers and array names:
double q[20];
double *p = q; // p == &q[0]
*(p+9) = 4.0; // assigns value to q[9]
double *r = q; // r == &q[0]
*r++ = 3; // q[0] = 3, then r == &q[1]
Putting this all together, along with the fact that pointer addition is commutative (so *(a+4) is the same as *(4+a)), we get the following result:
double a[20];
a[4] = 3.0;
4[a] = 6.0; // now a[4] == 6.0
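The equivalences can be checked by comparing addresses, as in this sketch (same_element and negative_index_demo are hypothetical names). It also shows the one situation where a negative subscript is perfectly valid: when the pointer refers to the middle of an array.

```cpp
#include <cassert>

// Returns true if a[i], *(a+i) and i[a] all name the same element.
bool same_element(double *a, int i)
{
    return &a[i] == a + i && &i[a] == a + i;
}

// p points at q[5], so p[-1] is *(p - 1), which is q[4].
double negative_index_demo()
{
    double q[20] = {0};
    double *p = &q[5];
    assert(&p[-1] == &q[4]);
    p[-1] = 4.0;              // writes q[4], still inside the array
    return q[4];
}
```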

Strings

Character arrays are also known as strings. In contrast to string literals, character arrays are not necessarily null-terminated (i.e., the last element isn't necessarily '\0'), but character arrays of static storage class usually are null-terminated: if default initialized, the whole array is zeroed out, and with explicit initialization by a string literal, the terminating '\0' of the string literal is copied into the character array. Don't forget that final '\0' when calculating the size of a string literal:
// outside of any block
char q[] = "dog"; // q has 4 elements
char r[3] = "dog"; // **** error *** too many initializers
char s[] = {'a','b','c','d'}; // clumsier, but doesn't include '\0';
Note that initializing with a brace-enclosed list of characters, the form used for arrays of other fundamental types, will not add a final null character automatically.
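A short sketch of the size bookkeeping, using std::strlen from <cstring>. The function names are made up; the point is that sizeof counts the terminating '\0' while strlen does not, and that a character list needs its '\0' added by hand before it can be treated as a string.

```cpp
#include <cassert>
#include <cstring>

// Length of a null-terminated character array, not counting the '\0'.
std::size_t visible_length(const char *s)
{
    assert(s != 0);
    return std::strlen(s);
}

// Number of array elements needed to hold the literal "dog".
std::size_t storage_for_dog()
{
    char q[] = "dog";          // 'd','o','g','\0'
    return sizeof(q);          // counts the terminating '\0'
}

// A character list has no implicit terminator, so add one explicitly
// before handing the array to anything that expects a string.
std::size_t terminated_by_hand()
{
    char s[] = {'a', 'b', 'c', 'd', '\0'};
    return visible_length(s);
}
```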
