[Phy 405/905 Home Page] [ Lecturer ]

4. References

learn about references


What is a reference?

A reference defines a "synonym" for an object; after its (mandatory) initialization, the reference is never acted upon again - instead, the object it references is acted upon:
int x[2] = {4,3};
int *xp = x; // pointer to x[0] (note: &x would be an int (*)[2], not an int *)
int & x_ref = x[0]; // reference to x[0] (an int)
int *& xp_ref = xp; // ref. to xp (int *)

x_ref += 3; // x[0] += 3
*(&x_ref+1) = 5; // x[1] = 5
*xp_ref *= 6; // *xp *= 6
++xp_ref; // ++ xp
xp_ref[0] += 10; // *xp += 10 (xp now points at x[1])
x_ref = x[1]; // x[0] = x[1], NOT reassigning x_ref (which is impossible)
The basic syntax for defining a reference to type T is:
T  x;
T &y = x; // y is a reference, initialized to refer to x
Note that if assignment (as opposed to initialization) of references were allowed, how would the following be interpreted?
int x, y, &xp = x;
xp = y; // is this "xp now refers to y" or "x=y"?
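The language settles the question in favor of "x=y". As a minimal sketch (the function name assignment_goes_through() is made up for this illustration), the following checks that assigning to a reference changes the referenced object and leaves the binding alone:

```cpp
#include <cassert>

// Hypothetical helper: returns true if assigning through a reference
// changes the referenced object rather than rebinding the reference.
bool assignment_goes_through()
{
    int x = 1, y = 2;
    int &xr = x;          // xr is bound to x, permanently

    xr = y;               // means x = y; xr still refers to x
    assert(&xr == &x);    // a reference shares the address of its object
    assert(x == 2);

    y = 7;                // changing y afterwards does not affect x or xr
    return xr == 2 && &xr != &y;
}
```

Calling assignment_goes_through() from main() should return true; none of the asserts should fire.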

What references are not

References are not objects, unlike the things they refer to. Several restrictions follow from this: there are no pointers to references, no references to references, and no arrays of references:
double x=4, &x_ref = x;
double &*y = &x_ref; // illegal left side (pointer to reference); anyway, right side IS "&x"
double *z = &x_ref; // z points to x, NOT x_ref
// using defs above
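double &&y1 = x_ref; // illegal - no reference to a reference; right side IS "x", not a reference
// (since C++11, && declares an rvalue reference, which cannot bind to the lvalue x either)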
double & y2 = x_ref; // again, r. side IS "x", not a reference
double x, y, z;
double &v[3] = {x,y,z}; // illegal - no arrays of references
The reason (given in the Annotated Reference Manual, §8.4.3) follows from the equivalence of the subscripting operator and dereferencing (a[i] is *(a+i)). In the last example, v[0]==*(v+0), which means that v would have to be a "double &*", which we've already disallowed.
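When several separate variables really do need to be reached through one array, an array of pointers is the usual substitute. The sketch below is only an illustration of that workaround (scaled_sum() and its values are made up here, not part of the lecture's examples):

```cpp
#include <cassert>

// Hypothetical helper: doubles three separate variables through an
// array of pointers, since "double &v[3]" is not allowed.
double scaled_sum()
{
    double x = 1, y = 2, z = 3;
    double *v[3] = { &x, &y, &z };   // array of pointers: legal

    for (int i = 0; i < 3; ++i)
        *v[i] *= 2;                  // v[i] is *(v+i), a double *

    assert(x == 2 && y == 4 && z == 6);  // the originals were modified
    return x + y + z;
}
```

The price is the explicit & when building the array and the explicit * on every use, which is exactly the bookkeeping a reference would have hidden.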

References for function call-by-reference

Call by value

Up till now, all function calls have been made "by value" - the actual arguments used in the explicit call to a function are copied to the formal arguments used in the body of the function. Because the latter are only copies of the former, changes made in the function to the formal arguments do not affect the actual arguments:
void zero(double x)
{
x=0; // try to zero actual argument, which MUST fail
// because we pass by value
}
void test()
{
double y=4;
zero(y);
if (y)
cout<<"ALWAYS executed\n";
else
cout<<"Never executed\n";
}
We could have simulated call-by-reference - where the formal arguments are the actual arguments - using pointers:
void zero(double *x) // x is a pointer, itself passed by value
{
*x=0; // this time, zero() works!
}
void test()
{
double y=4;
zero(&y); // pass pointer to y by value (but y "by ref.")
if (y)
cout<<"Never executed\n";
else
cout<<"ALWAYS executed\n";
}

Function arguments passed by reference

Using references, however, we can effect true call-by-reference:
void stat(double *x, int size, double &mean, double &max, double &min)
{
if (!x || size <= 0)
{
cerr << "either array is null or non-pos. size!\n";
exit(1);
}
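const int n = size; // remember the count; size is consumed by the loop
for (max=min=mean=*x++; --size; ++x) // start from x[0], then visit x[1]..x[n-1]
{
const double& val = *x; // note use of ref!

if (val > max) max = val;
if (val < min) min = val;
mean += val;
}
mean /= n; // convert the running sum into the mean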
}
void test_stat()
{
double samples[] = {30, 0.4, 6.0, -9.3e-4, 4.2};
double mean,max,min;
stat(samples,sizeof(samples)/sizeof(double),mean,max,min);
cout <<"mean="<<mean<<" max="<<max<<" min="<<min<<'\n';
}
A few things to note:
extern int &x; // this is not a definition (only a declaration)
// if it is defined (once and only once) somewhere, it must
// be simultaneously initialized
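Returning to argument passing: to put the pointer simulation and true call-by-reference side by side, here is a small sketch. The functions swap_ptr(), swap_ref() and swaps_agree() are invented for this comparison and do not appear elsewhere in these notes.

```cpp
#include <cassert>

// Pointer version: the caller must pass addresses explicitly.
void swap_ptr(double *a, double *b)
{
    double tmp = *a;
    *a = *b;
    *b = tmp;
}

// Reference version: the formal arguments are aliases for the actual ones.
void swap_ref(double &a, double &b)
{
    double tmp = a;
    a = b;
    b = tmp;
}

// Hypothetical check: swapping twice, once each way, restores the values.
bool swaps_agree()
{
    double p = 1.5, q = -2.0;
    swap_ptr(&p, &q);            // note the & at the call site
    assert(p == -2.0 && q == 1.5);
    swap_ref(p, q);              // no & needed; p and q are passed by reference
    return p == 1.5 && q == -2.0;
}
```

The two versions do the same work; the difference is purely in who writes the & and *, the caller and callee (pointers) or the compiler (references).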

References as function return values

What if we wanted to write our own subscript feature? A function that returns a reference can appear on the left side of an assignment:
double& sub(double *x, int n_col,int row, int col)
{ return x[row*n_col+col]; }

void test_sub()
{
int n_col, n_row;
cout << "Enter number of rows, columns for matrix:\n";
cin >>n_row>>n_col;
if (n_col <= 0 || n_row <= 0)
{
cerr << "Illegal sizes (must be positive): (ncol,nrow)="
<<n_col<<','<<n_row<<'\n';
exit(1);
}
double *x = new double[n_col*n_row]; // 1-dim array simulating
// double[n_row][n_col]
for (int i=0; i < n_row; ++i)
for (int j=0; j < n_col; ++j)
sub(x,n_col,i,j) = 0; // "x[i][j]=0"
delete [] x; // release the array allocated with new[]
}
If this example has started you salivating at the prospect of really writing your own subscripting operator[], don't worry - you can, at least for types that you define. We will discuss this facility - "operator overloading" - in a couple of lectures.

Incidentally, the reason that the function sub() didn't need n_row is the same reason that an array "double [1][2][3]" is automatically converted to "double (*)[2][3]".
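To see the row-major indexing rule at work, here is a small self-checking sketch. It repeats sub() so that it stands alone; fill_and_read(), the 2x3 size, and the use of std::vector for storage are choices made for this illustration only.

```cpp
#include <cassert>
#include <vector>

// Same indexing rule as sub() above: element (row, col) of a matrix
// stored row by row in a 1-dim array with n_col columns.
double& sub(double *x, int n_col, int row, int col)
{
    return x[row * n_col + col];
}

// Hypothetical check: fill a 2x3 matrix so that element (i,j) holds 10*i+j,
// then read one element back.
double fill_and_read(int row, int col)
{
    const int n_row = 2, n_col = 3;
    std::vector<double> storage(n_row * n_col);   // freed automatically
    double *x = &storage[0];

    for (int i = 0; i < n_row; ++i)
        for (int j = 0; j < n_col; ++j)
            sub(x, n_col, i, j) = 10 * i + j;     // call used as an lvalue

    assert(x[1 * n_col + 2] == 12);               // (1,2) lives at offset 5
    return sub(x, n_col, row, col);
}
```

Only n_col enters the offset row*n_col+col, which is why the number of rows never has to be passed.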

References for argument passing efficiency

Fundamental types as arguments

So far, typical function arguments have been fairly limited in size. Even
void foo(double q[]);
void test()
{
double q[2000000];
foo(q); // pass &q[0] by value
}
results in an object of size sizeof(double *) being passed to foo(), not the much larger sizeof(q)==sizeof(double)*2000000 ~ 16,000,000 bytes (depending on the size of double). Although here, as usual, the argument is passed by value, what's passed is a pointer, which is much smaller than the pointed-to array.
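The size difference can be checked directly with sizeof. In this sketch the function names and the array length of 2000 are arbitrary choices for illustration; the parameter is written as "double *q", which is the same type the compiler gives to "double q[]".

```cpp
#include <cassert>
#include <cstddef>

// A parameter declared "double q[]" is really "double *q", so inside the
// function sizeof reports the size of a pointer, not of the caller's array.
std::size_t size_seen_by_callee(double *q)
{
    return sizeof(q);
}

// Hypothetical caller: here q is still a real array, so sizeof sees all of it.
std::size_t size_seen_by_caller()
{
    static double q[2000];                    // static: kept off the stack
    std::size_t n = size_seen_by_callee(q);   // passes &q[0] by value
    assert(n == sizeof(double *));
    return sizeof(q);
}
```

On a typical machine the caller sees 2000*sizeof(double) bytes while the callee sees only the 4 or 8 bytes of a pointer.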

Structures

When we begin to use user-defined types, or classes, however, the picture changes. The remainder of the semester will be devoted to learning all about classes, but for now, I'll only briefly define a special type of class, the structure.

Structures are derived types which essentially combine one or more smaller objects into a single encompassing structure. A structure "declaration" actually defines a new type (though it doesn't define a new object). Here's a structure we could have used to simplify the stat() function defined above:

struct Stat // structure declaration - *defines* type Stat
{
double mean, max, min;
}; // the terminating semicolon is required
Stat x; // legal in C++ only - defines object of type Stat
struct Stat y; // legal in C and C++ - a bit wordier
typedef struct Stat StatType; // for C equivalent of simple C++ decl'n:
StatType z; // defines object of type Stat
The objects contained within a Stat are accessed using the class member access operator '.':
x.mean = 0;
x.max += 5;
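As a sketch of the simplification mentioned above, the three double& arguments of stat() can be collapsed into one Stat&. The name compute_stat() and the precondition check are assumptions of this illustration, not the lecture's own code.

```cpp
#include <cassert>

struct Stat
{
    double mean, max, min;
};

// Hypothetical variant of stat(): a single reference to a Stat replaces
// the three separate double& arguments.
void compute_stat(const double *x, int size, Stat &s)
{
    assert(x && size > 0);             // precondition assumed for this sketch

    s.mean = s.max = s.min = x[0];     // start from the first sample
    for (int i = 1; i < size; ++i)
    {
        if (x[i] > s.max) s.max = x[i];
        if (x[i] < s.min) s.min = x[i];
        s.mean += x[i];                // running sum for now
    }
    s.mean /= size;                    // convert the sum into the mean
}
```

A caller would define "Stat s;", call compute_stat(samples, n, s), and then read s.mean, s.max and s.min.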
The relevant point here is that structures, like all objects, are passed by value:
struct BigArray
{
double array[5000000];
};
void foo(BigArray r);
void test()
{
BigArray q;
foo(q); // pass q by value
}
In the function call to foo(), the actual argument q is copied (all 5000000 contained doubles) to the formal argument r. Clearly there's an efficiency issue here; in this case, since the size of the object passed by value is not "small", passing by reference is probably the way to go:
struct BigArray
{
double array[5000000];
};
void foo(BigArray& r);
void test()
{
BigArray q;
foo(q); // pass q by reference
}
In this way, what gets passed to the function foo() is only something like a pointer to BigArray (which is how compilers typically implement references), which is much smaller than a BigArray.
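The sketch below contrasts the three ways of passing a structure. It goes slightly beyond the text by adding a const reference, which avoids the copy while promising not to modify the argument. The array is shrunk to 5000 elements and all function names are invented for this illustration.

```cpp
#include <cassert>

struct BigArray
{
    double array[5000];   // smaller than in the text, to keep the demo light
};

// By value: r is a full copy, so the change is lost on return.
void clear_first_by_value(BigArray r) { r.array[0] = 0; }

// By reference: r is the caller's object.
void clear_first_by_ref(BigArray &r) { r.array[0] = 0; }

// By const reference: no copy, and the function cannot modify r.
double first_element(const BigArray &r) { return r.array[0]; }

// Hypothetical driver: returns the first element after both calls.
double big_array_demo()
{
    static BigArray q;    // static storage, so no large stack object
    q.array[0] = 4;
    clear_first_by_value(q);
    assert(first_element(q) == 4);   // unchanged: only the copy was cleared
    clear_first_by_ref(q);
    return first_element(q);         // now 0
}
```

A reasonable rule of thumb is to pass large objects by const reference unless the function is meant to change them.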

Of course, one can take the address of a structure (even a BigArray) and pass that to foo(), simulating call-by-reference, but I'll wait on discussing structure pointers for a little bit longer. Here's a short example just so you can see how things work:

BigArray q, *r = &q;
q.array[0] = 4;
(*r).array[0] = 4; // same thing as prev. line
r->array[0] = 4; // same thing as prev. line
The third line explains the utility of the syntax exemplified in the fourth line: because the dereferencing operator* has lower precedence than the class member access operator '.', the awkward parentheses are necessary. The fourth line shows the alternative operator->, which needs no extra parentheses and combines the two operations (* and .) into one.
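For completeness, here is a sketch of the simulated call-by-reference mentioned above, using a small structure instead of a BigArray. The Stat layout is the one from the structure section; reset() and arrow_matches_dot() are names made up for this example.

```cpp
#include <cassert>

struct Stat
{
    double mean, max, min;
};

// Simulated call-by-reference: the caller passes the address of a Stat.
void reset(Stat *s)
{
    (*s).mean = 0;   // parentheses needed: "." binds tighter than "*"
    s->max = 0;      // same kind of access on another member, via ->
    s->min = 0;
}

// Hypothetical check that both spellings reach the same member.
bool arrow_matches_dot()
{
    Stat x = { 1.0, 2.0, 3.0 };
    Stat *p = &x;
    assert(&(*p).max == &p->max);   // both name the same member of x
    reset(p);                       // only a pointer is copied
    return x.mean == 0 && x.max == 0 && x.min == 0;
}
```

Without the parentheses, "*s.mean" would parse as "*(s.mean)", which is an error here because s is a pointer and has no members.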
