-1

I have been trying to write a function that will insert commas into a binary number.

Below is my best attempt. It does work if I do NOT try to free() the memory. If I try to free() the memory, I get an error.

I am puzzled. Please let me know what I am doing wrong.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
int insertCommasIntoBinaryNumber(char* outstring, char* instring)
{

    char *ptr, *optr;
    int i, length, commas;
 
    // move ptr to end of instring 
    for ( ptr = instring; *ptr; ptr++ ); 
 
    //calculate offset with commas
    length = ptr - instring;
    commas = ( length - 1 ) / 8;
    optr = outstring + length + commas;   
 
    //copy instring into outstring backwards inserting commas
    *optr-- = *ptr--;
    for ( i = 1; ptr >= instring; i++ )
    {
        *optr-- = *ptr--;
        if ( ( i % 8 ) == 0 )
            *optr-- = ',';
    }
}

int main (void)
{
    
    const int arrayDimension = 100;
    char* instring = (char*) malloc(sizeof(char) * arrayDimension);
    char* outstring = (char*) malloc(sizeof(char) * arrayDimension);
    
    strncpy(instring, "111111110101010100001100", arrayDimension-1);
    
    insertCommasIntoBinaryNumber(outstring, instring);
    
    /* show the result */
    printf ( "%s\n", outstring );
    
    free(instring);
    free(outstring);
}

Here is the output:

11111111,01010101,00001100
*** Error in `./a.out': free(): invalid next size (fast): 0x0000000000bc8010 ***

P.S. Many thanks for letting me know where the code was crashing on the 24th iteration. I soon realized that I was not calculating the number of commas needed correctly and not keeping track of the number of commas being inserted. After I did that, the code below appears to work fine now.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
 
int insertCommasIntoBinaryNumber(char* const outString, const char* const inString)
{

    char const *iptr;   // iptr will be a pointer to the 
                        // constant inString char array.
    char *optr;
    int i, commaCount;
 
    // move iptr to end of inString 
    for ( iptr = inString; *iptr; iptr++ );  
 
    // Calculate Number of Commas Needed
    const int inStringLength = iptr - inString;
    const double totalNumberOfCommasFP = (( inStringLength ) / 8.0) - 1.0;
    const int totalNumberOfCommas = (int) ceil(totalNumberOfCommasFP);
    
    // Set optr
    optr = outString + inStringLength + totalNumberOfCommas;   
 
    //copy inString into outString backwards inserting commas
    *optr-- = *iptr--;
    
    commaCount = 0;
    for ( i = 1; iptr >= inString; i++ )
    {
        *optr-- = *iptr--;
        if ( ( ( i % 8 ) == 0 ) && (commaCount < totalNumberOfCommas) )
        {
            *optr-- = ',';
            commaCount++;
        }
    }
}

int main (void)
{
    const char testString[] = "111111110101010100001100";
    const int inStringArrayDimension = strlen(testString) + 1;

    char * inString = (char*) malloc(sizeof(char) * inStringArrayDimension);
    strncpy(inString, testString, inStringArrayDimension);
    
    const int inStringLength = (int) strlen(inString);
    const double totalNumberOfCommasFP = (( inStringLength ) / 8.0) - 1.0;
    const int totalNumberOfCommas = (int) ceil(totalNumberOfCommasFP);
    const int outStringArrayDimension = inStringArrayDimension + totalNumberOfCommas;
    char* outString = (char*) malloc(sizeof(char) * outStringArrayDimension);
    
    insertCommasIntoBinaryNumber(outString, inString);
    
    /* show the result */
    printf ( "%s\n", outString );
    
    free(inString);
    free(outString);
    
    exit (EXIT_SUCCESS);
}

And here is the output:

11111111,01010101,00001100
RobK
  • 156
  • 1
  • 9
  • You're going outside (before) the bounds of your allocated memory, Step through it with a pad of paper to figure out how / why / when. When you're working your way backwards put in some sort of test similar to `if (optr == ostring)` or something to stop before you go outside the bound. – LEF Aug 31 '21 at 00:39

2 Answers2

2

Since the string length is 24, for ( i = 1; ptr >= instring; i++ ) iterates 24 times. On the 24th iteration, optr points to the first character of outstring. Since (i % 8) == 0 is true, *optr-- = ','; is executed. That puts a comma before outstring, writing outside array bounds and corrupting your program’s memory.

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
  • Many thanks for letting me know where the code was crashing on the 24th iteration. I soon realized that I was not calculating the number of commas needed correctly. And after I changed the >= to an > to avoid the issue with the last iteration, the code appears to work fine now. – RobK Aug 31 '21 at 01:16
1

I recommend that you do this

#include <assert.h>

//copy instring into outstring backwards inserting commas
assert (optr >= outstring);
*optr-- = *ptr--;


for ( i = 1; ptr >= instring; i++ )
{
    assert (optr >= outstring);
    *optr-- = *ptr--;
    if ( ( i % 8 ) == 0 ) {
        assert (optr >= outstring);
        *optr-- = ',';
    }
}

Then when the assertion goes off, debug it. You have almost certainly botched the space calculation, under-estimating how much space is needed to store the version of the datum with commas inserted.

Secondly, you're doing something that ISO C does not require to work: incrementing pointers below the start of an object. This is not the actual problem; even if you debug the malloc corruption, that issue is still there.

What I'm getting at is that this is not a correct idiom:

for (ptr = end_of_object; ptr >= start_of_object; ptr--)
{
   // ... loop body in which ptr is dereferenced
}

Here is why. When the last iteration of the loop occurs, ptr == start_of_object holds. The body of the loop is executed, and then, unconditionally, the ptr-- decrement is executed. Even though we do not execute the loop body any more, and therefore do not dereference ptr, it is still incorrect to be decrementing it. It is undefined behavior according to ISO C.

The idiom works in machine languages.

One way to avoid it is to use integer indexing.

for (i = num_elements - 1; i >= 0; i--)
{
   // work with array[i]
}

Here, i is assumed to be a signed integer type. At the top of the last iteration, i == 0 holds. Then, unconditionally, i is decremented to -1. Since array[i] is never accessed in this case, that is completely safe.

Lastly, a robust, production version of this type of code cannot just assume that you have all the space in the destination array. Your comma-inserting API needs to provide some means by which the caller can determine how much space is required. E.g.:

const char *str = "1110101101";

// determine how much space is needed for string with commas
size_t space = comma_insert_space_required(str);

// OK, allocate that much space
char *str_with_commas = malloc(space);

// now process the same string into that space 
comma_insert(str_with_commas, str);

This will work whether you have a five character input or five thousand.

If you choose an approach involving artificial limits in there like 100 bytes (the idea being no valid inputs will ever occur that come close), you still need defenses against that, so that you don't access an object out of bounds. "Bad guys" trying to break your software will look for ways to sneak in the "never occur" inputs.

Kaz
  • 55,781
  • 9
  • 100
  • 149
  • Thanks Kaz for the detailed response. I was able to figure out my issue. I was not calculating the number of commas needed correctly. And you were right I was decrementing a pointer to below the start of the character array. That cannot be good!! I changed the >= to just > to avoid that. And as you pointed out, it probably would have been easier to write the function using array indexing rather than pointers. But in my mind, the C language is all about pointers! – RobK Aug 31 '21 at 01:14
  • P.S. I am trying to add commas to binary numbers using my newly added %B printf specifier. (see https://stackoverflow.com/questions/68944842/alternatives-to-register-printf-specifier-to-print-numbers-in-binary-format-us). That code does check to see if the destination array is large enough. – RobK Aug 31 '21 at 01:14
  • @RobK But if you simply use `>` as your loop guard, then your loop body will never execute for the case when the pointer is aimed at the first byte. That can't be good, because you must fill in the first byte of the output string! – Kaz Aug 31 '21 at 01:37
  • I would build the output string left to right, in order of increasing address, using a decrementing counter of remaining digits to keep track of the comma insertion points. E.g. 6 characters: initialize `count = 6`. Suppose we want a comma every three digits: when `count % 3 == 0` insert a comma, except at the beginning. I.e. `if (ptr > out && count % d == 0) *ptr++ = ','`. Something like that. – Kaz Aug 31 '21 at 01:42
  • Thanks Kaz. I have updated my code. My code now keeps track of the number of commas being inserted. I even addressed your concerns about properly calculating the dimensions of the arrays. I also updated the data type declarations to make it easier for the compiler to catch any errors. – RobK Aug 31 '21 at 14:11