Output using char * to point to int array [closed]

What is the output of the following program written in C?

Is it 2 0 or 0 2, and why?

#include <stdio.h>

int main(void)
{
    int arr[]={2,3,4};  // array of three ints
    char *p;
    p=(char *)arr;

    printf("%d\n",*p);
    printf("%p\n",p);
    p=p+1;

    printf("%d\n",*p);
    printf("%p\n",p);

    return 0;

}


Solution 1:[1]

The result depends on the endianness of your system as well as the size of an int (it also depends on the number of bits in a byte, but for now we'll assume it's 8).

Endianness dictates the ordering of bytes in multi-byte types such as integers. x86-based processors are little-endian, meaning the least significant byte is stored first (at the lowest address), while some other architectures are big-endian, meaning the most significant byte is stored first.

For example, for a variable of type int with the value 2, and assuming an int is 32 bits, the memory on a big-endian system looks like this:

-----------------
| 0 | 0 | 0 | 2 |
-----------------

While on a little-endian system it looks like this:

-----------------
| 2 | 0 | 0 | 0 |
-----------------

Moving on to what happens when you take a char * and point it to an int (or a member of an int array): normally, reading an object through a pointer to a different type is a strict aliasing violation, which invokes undefined behavior. However, the C standard makes an exception for character types, which lets you access the bytes of any object's representation. So in this case it's allowed.

When you do this:

p=(char *)arr;

It causes p to point to the first byte of the first member of the array arr.

On big endian systems:

-----
| . | p
-----
  |
  v
-------------------------------------------------
| 0 | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

On little endian:

-----
| . | p
-----
  |
  v
-------------------------------------------------
| 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

So when you read the value of *p you'll get 0 on big endian systems and 2 on little endian systems.

When you then perform p=p+1, you increase the address p points to by 1 character, i.e. 1 byte, so now it looks like this:

Big endian:

-----
| . | p
-----
  |----
      v
-------------------------------------------------
| 0 | 0 | 0 | 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

Little endian:

-----
| . | p
-----
  |----
      v
-------------------------------------------------
| 2 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | arr
-------------------------------------------------
|    arr[0]     |    arr[1]     |    arr[2]     |
-------------------------------------------------

Now *p is 0 on both big endian and little endian systems. This assumes, however, that an int is 32 bits. If an int is 16 bits, it instead looks like this:

Big endian:

-----
| . | p
-----
  |----
      v
-------------------------
| 0 | 2 | 0 | 3 | 0 | 4 | arr
-------------------------
|arr[0] |arr[1] |arr[2] |
-------------------------

Little endian:

-----
| . | p
-----
  |----
      v
-------------------------
| 2 | 0 | 3 | 0 | 4 | 0 | arr
-------------------------
|arr[0] |arr[1] |arr[2] |
-------------------------

In this case *p is 2 on big endian systems and 0 on little endian systems after incrementing.

Solution 2:[2]

This

int arr[]={2,3,4};

is laid out in memory as shown below on a little-endian system (assuming a 4-byte int). On a big-endian system the byte order, and therefore the output, differs.

 arr[2]      arr[1]   |------------arr[0]-----------------------------|  
 ----------------------------------------------------------------------
|     4      |   3    | 0000 0000 | 0000 0000 | 0000 0000 | 0000 0010  | 
 ----------------------------------------------------------------------
           0x108    0x104       0x103      0x102       0x101       0x100 -- assume arr base address starts from 0x100
                                                                     arr
MSB                                                                  LSB

Now when you do

char *p;
p=(char *)arr;

Here p is a char pointer and arr is cast to char *, so p accesses memory one byte at a time. Initially it points to the byte at address 0x100.

When the statement

printf("%d\n",*p);

executes, it prints the byte stored at address 0x100, which is 2.

And next when you do

p=p+1;

p advances by one byte, so it now points to address 0x101. When printf("%d\n",*p); executes, it prints the byte stored at 0x101, which is 0.

Also, when using %p you should cast the pointer argument to void *, since %p expects a void * argument:

printf("%p\n",(void*)p);

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Achal