Dereference an array pointer… UB? – C / C++

Q(Question):

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the mechanics of the
language, then we know that nothing "happens" when we dereference a pointer
to an array, because arrays are dealt with in terms of pointers.


Tomás Ó hÉilidhe

A(Answer):

"Tomás Ó hÉilidhe" <to*@lavabit.com> wrote in message

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the mechanics of the
language, then we know that nothing "happens" when we dereference a
pointer to an array, because arrays are dealt with in terms of pointers.

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.


Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

A(Answer):

"Malcolm McLean" <re*******@btinternet.com> writes:

"Tomás Ó hÉilidhe" <to*@lavabit.com> wrote in message

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the mechanics of the
language, then we know that nothing "happens" when we dereference a
pointer to an array, because arrays are dealt with in terms of pointers.

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.

But my_array + 1 and &my_array + 1 don’t. The word "resolve" allows you
to be right (since you can mean what you like by it) but it hides the
important difference between the two expressions — their type.
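The type difference shows up directly in the pointer arithmetic. A minimal sketch, assuming the 5-element int array from the question; the function names are made up for illustration and nothing here is dereferenced:

```c
#include <stddef.h>

/* Bytes skipped by `my_array + 1`: the array decays to int *,
   so the step is one element. */
size_t step_of_decayed_array(void)
{
    int my_array[5];
    return (size_t)((char *)(my_array + 1) - (char *)my_array);
}

/* Bytes skipped by `&my_array + 1`: the operand has type int (*)[5],
   so the step is the whole array. The result is only converted and
   subtracted here, never dereferenced. */
size_t step_of_array_pointer(void)
{
    int my_array[5];
    return (size_t)((char *)(&my_array + 1) - (char *)my_array);
}
```

The first function returns sizeof(int); the second returns 5 * sizeof(int).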


Ben.

A(Answer):

On Feb 11, 8:36 pm, "Malcolm McLean" <regniz…@btinternet.com> wrote:

"Tomás Ó hÉilidhe" <t…@lavabit.com> wrote in message

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the mechanics of the
language, then we know that nothing "happens" when we dereference a
pointer to an array, because arrays are dealt with in terms of pointers.

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.

Only in value context.
I believe it’s undefined behavior.
You dereference a pointer past the end of an object.
It is essentially the same with

int *foo;
int *bar = *(&foo+1);

Which is invalid.
foo is an object, which can be treated as an array with 1 element.
Therefore, &foo+1 is a valid pointer, which cannot be dereferenced,
however you do dereference it.

It is invalid.

I am, however, not 100% sure about this, but it appears to be logical
and correct.

A(Answer):

Malcolm McLean:

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.

I’m not sure what you mean by that.

my_array is an int[X] (and it decays to an int*)

&my_array is an int(*)[X] (and it DOESN’T decay to an int*)
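One way to check these types without dereferencing anything is C11's _Generic, which selects on the type of its operand after array-to-pointer conversion and does not evaluate it. A small sketch, assuming X is 5; KIND and the function names are hypothetical:

```c
/* File-scope array so the sketch has something to name; X is assumed to be 5. */
int my_array[5];

/* 1 for int *, 2 for int (*)[5], 0 for anything else.
   _Generic does not evaluate its controlling expression. */
#define KIND(e) _Generic((e), int *: 1, int (*)[5]: 2, default: 0)

int kind_of_array_name(void)     { return KIND(my_array); }      /* decays to int * */
int kind_of_array_address(void)  { return KIND(&my_array); }     /* stays int (*)[5] */
int kind_of_one_past_array(void) { return KIND(&my_array + 1); } /* still int (*)[5] */
```

Because nothing is evaluated, even the third function is harmless; it only reports the type of the expression.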


Tomás Ó hÉilidhe

A(Answer):

"Tomás Ó hÉilidhe" <to*@lavabit.com> wrote in message

Malcolm McLean:

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.

I’m not sure what you mean by that.

my_array is an int[X] (and it decays to an int*)

&my_array is an int(*)[X] (and it DOESN’T decay to an int*)

That was an error on my part.


Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

A(Answer):

vippstar:

It is essentially the same with

int *foo;
int *bar = *(&foo+1);

Which is invalid.

No no no, they’re not the same. Syntactically, yes, they’re the same,
but mechanically they’re not. The difference is that *(&foo+1) is an
actual value; it results in a value being read from memory.

foo is an object, which can be treated as an array with 1 element.
Therefore, &foo+1 is a valid pointer, which cannot be dereferenced,
however you do dereference it.

You’re correct.

It is invalid.

I’m not sure I agree, because an array doesn’t have a value. Its elements
do, but not the array itself.


Tomás Ó hÉilidhe

A(Answer):

On Feb 11, 9:52 pm, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

vippstar:

It is essentially the same with

int *foo;
int *bar = *(&foo+1);

Which is invalid.

No no no, they’re not the same. Syntactically, yes, they’re the same,
but mechanically they’re not. The difference is that *(&foo+1) is an
actual value; it results in a value being read from memory.

I am not sure what you are talking about; however, both &foo and
&your_array are pointers:
int ** and int (*)[X] respectively.
You point one past the end of what they point to, which is valid but
cannot be dereferenced.
*(&foo+1) is not valid.

foo is an object, which can be treated as an array with 1 element.
Therefore, &foo+1 is a valid pointer, which cannot be dereferenced,
however you do dereference it.

You’re correct.

And the same applies for &your_array. They are both pointers that
point to 1 valid thing. (foo and your_array respectively)

It is invalid.

I’m not sure I agree, because an array doesn’t have a value. Its elements
do, but not the array itself.

We are, however, not talking about arrays, but pointers.
I insist that my example is the same as what you are trying to do,
and they are both invalid.
I suggest thinking of another solution to your problem, and if that
is not possible, consider whether that is the _only_ way.

A(Answer):

vippstar:

int ** and int (*)[X] respectively.
You point one past the end of what they point to, which is valid but
cannot be dereferenced.

Dereference an int(*)[X] and you get an int[X], which doesn’t have a
value, and so it couldn’t result in an out-of-bounds memory access because
there shouldn’t be any memory access at all if arrays don’t have values.


Tomás Ó hÉilidhe

A(Answer):

On 2008-02-11, Malcolm McLean <re*******@btinternet.com> wrote:

"Tomás Ó hÉilidhe" <to*@lavabit.com> wrote in message

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the mechanics of the
language, then we know that nothing "happens" when we dereference a
pointer to an array, because arrays are dealt with in terms of pointers.

my_array and &my_array resolve to the same thing. It’s a quirk of the
language.

No.
6.3.2.1/3
"Except when it is the operand of the sizeof operator /or the
unary & operator/ […] an expression that has type "array of type"
is converted to an expression with type "pointer to type" that
points to the initial element of the array object".
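The two exceptions quoted above can be observed directly. A rough sketch, assuming the 5-element array from the question; the function names are hypothetical:

```c
#include <stddef.h>

int my_array[5];

/* Operand of sizeof: no conversion, so this is the size of the whole array. */
size_t size_as_array(void)   { return sizeof my_array; }

/* In any other expression the array is converted first,
   so this is the size of an int *. */
size_t size_as_pointer(void) { return sizeof (my_array + 0); }

/* Operand of unary &: no conversion either. The result has type int (*)[5]
   but holds the same address as the converted array name. */
int same_address(void)       { return (void *)&my_array == (void *)my_array; }
```

So the two expressions compare equal as addresses while still having different types, which is where the "resolve to the same thing" claim goes wrong.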

Marc Boyer

A(Answer):

On Feb 12, 7:21 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn’t matter what data type the pointer is.
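If the goal is just an end pointer, it can be obtained without applying * to the array pointer at all. A minimal sketch; ARRAY_LEN, sum_range and sum_example are hypothetical names, not from the thread:

```c
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Sums the elements in [begin, end). `end` is only compared,
   never dereferenced. */
long sum_range(const int *begin, const int *end)
{
    long total = 0;
    for (const int *p = begin; p != end; ++p)
        total += *p;
    return total;
}

long sum_example(void)
{
    int my_array[5] = {1, 2, 3, 4, 5};
    /* One past the last element, formed by ordinary int * arithmetic. */
    const int *const pend = my_array + ARRAY_LEN(my_array);
    return sum_range(my_array, pend);
}
```

Here pend has the same value the original expression was after, but it is computed from the decayed int * rather than from &my_array.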

A(Answer):

Old Wolf:

&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn’t matter what data type the pointer is.

That’s a very superficial way of looking at it.

The REASON why it’s UB to dereference a pointer to one-past-the-last is
because it could result in an out-of-bounds memory access.

With a pointer to an array, nothing happens when you dereference it — all
that happens is that you’ve got an expression of int[X] rather than
int(*)[X].


Tomás Ó hÉilidhe

A(Answer):

Tomás Ó hÉilidhe:

With a pointer to an array, nothing happens when you dereference it —
all that happens is that you’ve got an expression of int[X] rather
than int(*)[X].

In fact, I’d go one step further to say that the following should be legal:
int (*parr)[X] = (int(*)[X])798797; /* Some random address (but which
doesn’t cause a trap) */

*parr;

Tomás Ó hÉilidhe

A(Answer):

Tomás Ó hÉilidhe wrote:

Old Wolf:

&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn’t matter what data type the pointer is.

That’s a very superficial way of looking at it.

The REASON why it’s UB to dereference a pointer to one-past-the-last is
because it could result in an out-of-bounds memory access.

Perhaps your point is that the Standard /should/ have defined a behavior,
but didn’t. I agree with that.

My reading is that a unary * applied to a function pointer is defined. A
unary * applied to a pointer to an object is defined. There are no other
cases defined for the unary * operator. Since &X+1 technically isn’t a
pointer to an object, *(&X+1) is undefined by omission.


Thad

A(Answer):

"Tomás Ó hÉilidhe" <to*@lavabit.com> writes:

Old Wolf:

&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn’t matter what data type the pointer is.

That’s a very superficial way of looking at it.

The REASON why it’s UB to dereference a pointer to one-past-the-last is
because it could result in an out-of-bounds memory access.

The reason why it’s UB is that the standard doesn’t define the
behavior. (Though you’ve correctly described the rationale for what
the standard says.)

With a pointer to an array, nothing happens when you dereference it — all
that happens is that you’ve got an expression of int[X] rather than
int(*)[X].

An expression of array type is converted to a pointer. There has to
be something to convert in the first place.


Keith Thompson (The_Other_Keith) <ks***@mib.org>
Nokia
"We must do something. This is something. Therefore, we must do this."
— Antony Jay and Jonathan Lynn, "Yes Minister"

A(Answer):

Keith Thompson:

An expression of array type is converted to a pointer. There has to
be something to convert in the first place.

Yes but an array type isn’t a value — which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.


Tomás Ó hÉilidhe

A(Answer):

Tomás Ó hÉilidhe wrote:

Old Wolf:

&X + 1 is a pointer to one-past-the-end.
Dereferencing such a pointer causes UB.
Doesn’t matter what data type the pointer is.

That’s a very superficial way of looking at it.

The REASON why it’s UB to dereference a pointer to one-past-the-last is
because it could result in an out-of-bounds memory access.

I would say that the reason that the behavior is undefined is that the
committee didn’t realize (or appreciate) the potential utility of defining
the meaning of the unary * operator on pointer values derived from pointers
to objects, but not themselves a pointer to an object.


Thad

A(Answer):

On Feb 12, 9:56 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Keith Thompson:

An expression of array type is converted to a pointer. There has to
be something to convert in the first place.

Yes but an array type isn’t a value, which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.

And what value will you get from a nonexistent, imaginary array, after
decaying it to a pointer to a nonexistent first element?

A(Answer):

Kaz Kylheku <kk******@gmail.com> writes:

On Feb 12, 9:56 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Keith Thompson:

An expression of array type is converted to a pointer. There has to
be something to convert in the first place.

Yes but an array type isn’t a value, which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.

And what value will you get from a nonexistent, imaginary array, after
decaying it to a pointer to a nonexistent first element?

You would get exactly the pointer the OP wanted — to the int
immediately following the 1D array. I say "would" because I think the
example is UB, though only because the utility of applying * to an
array pointer (pointing "one past" a whole array) was missed when
drawing up the rule about applying * to these "one past" pointers.


Ben.

A(Answer):

Kaz Kylheku <kk******@gmail.com> writes:

On Feb 11, 10:21 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

You may have a pointer one element past the last element of an array
object. However, my_array as whole is not an element of an array. So
&myarray + 1 is invalid.

That is not a problem. There is explicit permission for this.
Anything that is not an array element is to be treated as if it were
an array of length one.

What you are doing is similar to computing p below:

int i, j[1];
int *p = &i + 1; // not right, i is not an array object

Expressly permitted. You can’t apply the * to this pointer, but you may
calculate the pointer value and store it.

int *q = &j + 1; // okay, since j is an array object

<snip>


Ben.

A(Answer):

Kaz Kylheku <kk******@gmail.com> writes:

On Feb 11, 10:21 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Do you think we can reach any kind of consensus on whether the
following code’s behaviour is undefined by the Standard?

int my_array[5];

int const *const pend = *(&my_array + 1);

You may have a pointer one element past the last element of an array
object. However, my_array as whole is not an element of an array. So
&myarray + 1 is invalid.

No, &myarray + 1 is valid. C99 6.5.6p7 (Additive operators):

For the purposes of these operators, a pointer to an object that
is not an element of an array behaves the same as a pointer to the
first element of an array of length one with the type of the
object as its element type.

&my_array + 1 is a valid pointer value of type int(*)[5], pointing
just past the end of my_array.

Since this pointer value doesn’t point to an object, attempting to
dereference it invokes UB, so *(&my_array + 1) is invalid.

What you are doing is similar to computing p below:

int i, j[1];
int *p = &i + 1; // not right, i is not an array object
int *q = &j + 1; // okay, since j is an array object

Again, &i + 1 is valid.
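A short sketch of what 6.5.6p7 permits for a single object; the function name is made up for illustration, and the pointer is formed and subtracted but never dereferenced:

```c
#include <stddef.h>

/* Forms a one-past-the-end pointer for a single int and measures
   its distance from the object. */
ptrdiff_t distance_to_one_past(void)
{
    int i = 0;
    int *p = &i + 1;   /* i behaves like the sole element of an int[1] */
    return p - &i;     /* both operands relate to the same one-element "array" */
}
```

The function returns 1, exactly as it would for the first element of an int[1].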

[snip]


Keith Thompson (The_Other_Keith) <ks***@mib.org>

A(Answer):

On Feb 13, 8:57 am, Ben Bacarisse <ben.use…@bsb.me.uk> wrote:

Kaz Kylheku <kkylh…@gmail.com> writes:

On Feb 12, 9:56 am, "Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

Keith Thompson:

An expression of array type is converted to a pointer. There has to
be something to convert in the first place.

Yes but an array type isn’t a value, which is the very reason why
arrays decay to a pointer to their first element, so that we can actually
get a value out of them.

And what value will you get from a nonexistent, imaginary array, after
decaying it to a pointer to a nonexistent first element?

You would get exactly the pointer the OP wanted — to the int
immediately following the 1D array.

Of course, you would only get that if the implementation didn’t throw
a diagnostic in your face and stop the program first. 🙂

I say "would" because I think the
example is UB, though only because the utility of applying * to an
array pointer (pointing "one past" a whole array) was missed when
drawing up the rule about applying * to these "one past" pointers.

Exactly. Correctness is not just about getting the right value, but
about how you got it. What is 64/16? Ah, numerator 6 cancels the
denominator 6 so we get 4/1 = 4. 🙂

A(Answer):

Peter Nilsson said:

<snip>

As quickly as you can, please tell me which of the
following functions has UB.

Both of them. (Both do illegal pointer comparisons.)


Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" – dmr 29 July 1999

A(Answer):

Richard Heathfield <r…@see.sig.invalid> wrote:

Peter Nilsson said:

<snip>

As quickly as you can, please tell me which of the
following functions has UB.

Both of them. (Both do illegal pointer comparisons.)

Are you saying that because that’s your "quickly as
you can" response, or because you genuinely think this
to be true?

If the latter, c&v for version 1 would be appreciated.


Peter

A(Answer):

Peter Nilsson said:

Richard Heathfield <r…@see.sig.invalid> wrote:

Peter Nilsson said:

<snip>

As quickly as you can, please tell me which of the
following functions has UB.

Both of them. (Both do illegal pointer comparisons.)

Are you saying that because that’s your "quickly as
you can" response, or because you genuinely think this
to be true?

No, you’re right – I read the article too quickly. Apologies.


Richard Heathfield <http://www.cpax.org.uk>

A(Answer):

Kaz Kylheku wrote:

"Tomás Ó hÉilidhe" <t…@lavabit.com> wrote:

... snip ...

Considering the syntax of the language, then we definitely do
dereference an invalid pointer… but if we consider the
mechanics of the language, then we know that nothing "happens"
when we dereference a pointer to an array, because arrays are
dealt with in terms of pointers.

We could also argue that “nothing” happens when you merely
increment a pointer out of bounds.

Piggybacking. Nonsense. Dereferencing an invalid pointer means
attempting to access memory that is not available to you. A system
that detects all errors should crash. Many won’t.


[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.



A(Answer):

In article <87************@kvetch.smov.org>
Keith Thompson <ks***@mib.org> wrote:

Here’s something to chew on. It probably says something about the
original question, but I’m not sure what.

int main(void)
{
    struct s {
        int x;
        int y[2];
    };
    volatile struct s obj = { 10, { 20, 30 } };

    obj; /* Computes and discards the value of obj.
            Must access obj.x, obj.y[0], and obj.y[1]. */

This seems reasonable, although I would be unsurprised to find
compilers that did not in fact access the three "int"s.

    obj.x; /* Computes and discards the value of obj.x.
              Must access obj.x. */

And must not access obj.y[0] and obj.y[1] (I believe).

    obj.y; /* Computes and discards the address of obj.y[0].
              Must this access obj.y[0] and obj.y[1]?
              *May* it do so?
              C&V? */

I think the answer to this is "no and no" but I cannot prove it.

If the answer *is* "no and no", I think this guarantees that the
OP’s construct (not included in this follow-up) is strictly conforming.

    return 0;
}


In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22’N, 111°50.29’W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
