'Why is the Index NOT out of bounds although it intuitively should?
I'm relatively new to C programming and I stumbled upon a for me unexplainable behaviour while running the following code and debugging it using gdb and lldb.
In short: When swapping the indices i and j (max i != max j) when accessing a value in a two-dimensional Array inside a double nested for-loop it does not seem to matter if I access the value using array[i][j] or array[j][i].
The two loops and arrays are mostly identical.
unsigned matrix[3][1] =
{
{3},
{4},
{5}
};
//Loop1
for (size_t i = 0; i < sizeof(matrix) / sizeof(*matrix); i++)
{
for (size_t j = 0; j < sizeof(matrix[i]) / sizeof(*matrix[i]); j++)
{
matrix[i][j] <<= 1;
printf("matrix[%zu][%zu] has the value: %d\n", i, j, matrix[i][j]);
}
}
//same two dimensional array as matrix
unsigned matrix2[3][1] =
{
{3},
{4},
{5}
};
//Loop2, basically the same loop as Loop1
for (size_t i = 0; i < sizeof(matrix2) / sizeof(*matrix2); i++)
{
for (size_t j = 0; j < sizeof(matrix2[i]) / sizeof(*matrix2[i]); j++)
{
//swapped i and j here
matrix2[j][i] <<= 1;
printf("matrix2[%zu][%zu] has the value: %d\n", j, i, matrix2[j][i]);
}
}
Am I missing here something?
In both cases i is passed the value 2 at the end of the outer loop and j the value 0 at the end of the inner loop.
Intuitively, matrix[0][2] should throw an exception as each row only has one element.
Solution 1:[1]
I will take a slightly different approach than the other respondents.
You are technically not reading outside of the array's boundary as far as the memory layout is concerned. Looking at it from a human perspective you are (the index [0][2] doesn't exist!), but the memory layout of the array is contiguous. Each of the "rows" of the matrix are stored next to each other.
In memory, your array is stored as: | ? | 3 | 4 | 5 | ? |
So when you index to matrix[1][0]
or matrix [0][1]
you are accessing the same position in memory. This would not be the case if your array was larger than 1 dimension wide.
For example, replace your array with the following one and experiment. You can access integer '4' either by indexing matrix[0][2]
, or matrix [1][0]
. The position [0][2] shouldn't exist, but it does because the memory is contiguous.
unsigned matrix[3][2] =
{
{3, 6},
{4, 8},
{5, 10}
};
Solution 2:[2]
Oops, matrix[0][2] should throw an exception as each row only has one element...
Some languages do warn the programmer by an exception if they try an out of bound access, but C does not. It just invokes Undefined Behaviour. On a technical point of view, it means that the compiler does not have to test the out of bound condition. On an operational point of view, it means that anything can happen, including expected behaviour... or an immediate crash... or a modification of an unrelated variable... or...
Solution 3:[3]
If my C skills aren't mega-rusty you're reading "unsafe memory".
Essentially your matrix is declared as a block of bytes. After that block of bytes there are more bytes. What are they? Usually more variables that are declared as your program's data. Once you reach the end of the program's data block you reach the user code memory block (encoded ASM instructions).
Most languages perform checks and throw an exception when you run out of bounds by somehow keeping track of the last index that is valid to access. C does not do that and doing such thing is your very own responsibility. If you aren't careful you might be overwriting important parts of your program's code.
There are attacks that one can perform on C programs that don't sanitize user input, like a buffer overrun; which exploits what it's been described.
Essentially if you declare a char[]
of length N and store a string that comes from outside and this string happens to be of length N+X you'll be overwriting program memory (instructions).
With the right sequence of characters you can inject your very own assembly code into a running program which doesn't sanitize user input
Solution 4:[4]
As your array is int and all elements are of the same size, i don't see any problem as your array is stored in contiguous space in RAM and you use a special case of matrix where inverting indexes has no side effect.
- In the first loop your indexes are [0][0], [1][0], [2][0]
- In the second loop your indexes are [0][0], [0][1], [0][2]
now try to linear the access, as your array is saved as linear array into the RAM.
address of element = row * NCOL + col
row: is row number
NCOL: number of columns into your matrix
col : the column number
so the linear index for :
[0][2] ==> 0 * 1 + 2 = 2 /* the third element*/
[2][0] ==> 2 * 1 + 0 = 2 /* always the third element */
But if you use a matrix of n x m , n >= 1 and m > 1 and n != m. if you inverse the indexes, the result will not be the same.
so if you take a 4 x 2 matrix
linear index of [3][1] = 3 * 2 + 1 = 7
linear index of [1][3] = 1 * 2 + 3 = 5 /* even [1][3] is out of the range of your matrix index */
[1][3] you will manipulate the element [2][1]
So be worry when manipulating matrix indexes.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Serge Ballesta |
Solution 3 | |
Solution 4 |