This started as a comment to @Poncho's fine answer, and grew over the 600-char limit. The point is: a careful choice of the definition of V2 messages can keep some of the original CRC's existing capabilities to always detect certain kinds of errors.
Foremost, we are interested in short error bursts (where all bits in error are within a small number of consecutive bits, a fair model of some errors likely to occur under many practical circumstances); and, marginally, errors affecting an odd number of bits (this is desirable in some communication contexts using a descrambler with the property that any one-bit error at the physical level expands to a fixed error burst with an odd number of bits; and, if all else was pointless, it at least ensures that any single-bit error is detected).
I'll assume the CRC for V1 is such that the remainder of the division of the polynomial representing the message (including the CRC portion thereof) by a binary reduction polynomial $P$ (with degree $d$ and constant term $1$) is some constant independent of message content (at least for a given message length, in the absence of error). Any textbook CRC, and most of those used in communication contexts, are built with this property. Such a CRC always detects error bursts that only affect bits within a segment of at most $d$ bits in the message and CRC. This holds because for every polynomial $Q$ of degree less than $d$ and constant term $1$ (representing the error burst), and for all $n\in \mathbb N$ (representing the position of the last bit in error, counted from the last bit of the CRC), $(Q\cdot x^n)\bmod P\ne 0$ (the proof is easy by induction). Also, such a CRC detects any error affecting an odd number of bits when the reduction polynomial $P$ has an even number of terms.
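Both properties are easy to check by brute force. Here is a minimal sketch, with polynomials packed into Python integers (bit $i$ is the coefficient of $x^i$). The helper name `poly_mod`, the choice of $P=x^8+x^2+x+1$ as the test case, and the caps on shift and error length are my own, for illustration only.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a by p, both polynomials over GF(2) packed into ints."""
    dp = p.bit_length() - 1              # degree of p
    while a.bit_length() - 1 >= dp:      # while deg(a) >= deg(p)
        a ^= p << (a.bit_length() - 1 - dp)
    return a

P = 0x107                    # x^8 + x^2 + x + 1: degree 8, constant term 1
d = P.bit_length() - 1

# Every burst Q (degree < d, constant term 1) at every shift n must leave
# a non-zero remainder. Shifts are capped at 64 to keep the run short.
bursts_detected = all(
    poly_mod(q << n, P) != 0
    for q in range(1, 1 << d, 2)         # odd q <=> constant term 1
    for n in range(64)
)

# P has an even number of terms, so odd-weight errors must be detected.
# Checked here for all error patterns spanning at most 16 bits.
odd_detected = all(
    poly_mod(e, P) != 0
    for e in range(1, 1 << 16)
    if bin(e).count("1") % 2 == 1
)

print(bursts_detected, odd_detected)     # prints: True True
```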
At a minimum, we want to ensure that, in the absence of transmission error, a V2 message cannot be confused with a V1 message, and vice versa. This is achieved simply if, in V2 messages, the field used as CRC for V1 is the exclusive-OR of the correct CRC for V1 and some non-zero $d$-bit constant $K$. This construction can be expressed in terms of polynomials: we add the polynomial of degree at most $d-1$ with binary coefficients representing $K$ to the polynomial representing the CRC. We'd like to choose $K$ so as to optimize the capability to detect our special kinds of errors.
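To make the construction concrete, here is a rough sketch of a sender and receiver. It assumes a plain CRC-8 (zero initial value, no final XOR, most significant bit first) with $P=x^8+x^2+x+1$, and a value of $K$ picked for illustration; the names `crc8`, `encode` and `classify` are hypothetical, and a real protocol may well use a different CRC convention.

```python
P = 0x107   # x^8 + x^2 + x + 1
K = 0xFD    # illustrative non-zero constant: x^7+x^6+x^5+x^4+x^3+x^2+1

def crc8(data: bytes, poly: int = P) -> int:
    """Plain CRC-8: remainder of data(x) * x^8 by poly (zero init, no final XOR)."""
    reg = 0
    for byte in data:
        reg ^= byte
        for _ in range(8):
            reg <<= 1
            if reg & 0x100:
                reg ^= poly          # also clears bit 8
    return reg

def encode(data: bytes, version: int) -> bytes:
    """Append the V1 CRC, XOR-ed with K when building a V2 message."""
    field = crc8(data) ^ (K if version == 2 else 0)
    return data + bytes([field])

def classify(frame: bytes):
    """Return 1 or 2 for a valid V1 or V2 frame, None if an error is detected."""
    syndrome = crc8(frame[:-1]) ^ frame[-1]
    return {0: 1, K: 2}.get(syndrome)

v1, v2 = encode(b"hello", 1), encode(b"hello", 2)
print(classify(v1), classify(v2))    # prints: 1 2
```

With these particular $P$ and $K$, flipping any single bit of `v1` or `v2` makes `classify` return `None`.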
This is sometimes possible. For example, if the reduction polynomial is $P=x^8+1$ (that is, if the CRC for V1 reduces to a bytewide exclusive-OR, up to some initialization value), we should choose $K=x^7+x^6+x^5+x^4+x^3+x^2+x+1$ (that is, use a CRC for V2 that is the bytewide complement of the CRC for V1). This will catch any error where all bits affected are within $7$ consecutive bits (versus $8$ consecutive bits without the introduction of V2). And, because both $P$ and $K$ have an even positive number of bits set, this also detects any error involving an odd number of bits.
With the reduction polynomial $P=x^8+x^2+x+1$ used by ATM communication, we should choose $K=x^7+x^6+x^5+x^4+x^3+x^2+1$; that will also catch any error where all bits affected are within $7$ consecutive bits. This is because $(K\cdot x)\bmod P=K$, hence for any polynomial $Q$ of degree $6$ or less with non-zero constant term (representing an error burst of length at most $7$), and any integer $n$ (representing the position of the error), $(Q\cdot x^n)\bmod P\ne K$ (proof by induction). Hence the error $Q$ cannot change a valid V1 message into a valid V2 message, or vice versa; while the aforementioned CRC property of always detecting short error bursts ensures that the error $Q$ can't go undetected without a version change.
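The two choices of $K$ above can be checked numerically. The sketch below uses the same integer encoding of polynomials as before (bit $i$ is the coefficient of $x^i$); the helper names and the cap on the shift $n$ are mine. It verifies that $(K\cdot x)\bmod P=K$, that no burst of at most $7$ bits ever has remainder $K$, and that this fails for bursts of $8$ bits (take $Q=K$, $n=0$).

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a by p, both polynomials over GF(2) packed into ints."""
    dp = p.bit_length() - 1
    while a.bit_length() - 1 >= dp:
        a ^= p << (a.bit_length() - 1 - dp)
    return a

def bursts_never_hit(p: int, k: int, max_len: int, max_shift: int = 64) -> bool:
    """True if no burst of at most max_len bits, at any shift below
    max_shift, has remainder k modulo p."""
    return all(
        poly_mod(q << n, p) != k
        for q in range(1, 1 << max_len, 2)   # odd q <=> constant term 1
        for n in range(max_shift)
    )

cases = {
    "x^8+1":       (0x101, 0xFF),   # K = x^7+x^6+x^5+x^4+x^3+x^2+x+1
    "x^8+x^2+x+1": (0x107, 0xFD),   # K = x^7+x^6+x^5+x^4+x^3+x^2+1
}
for name, (p, k) in cases.items():
    fixed = poly_mod(k << 1, p) == k         # (K*x) mod P == K ?
    print(name, fixed, bursts_never_hit(p, k, 7), bursts_never_hit(p, k, 8))
# each line prints: <name> True True False
```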
For primitive reduction polynomials $P$ and long messages, no matter how we choose $K$, a single-bit error can change a V2 message to V1, or vice versa. But we can still choose the constant $K$ such that, up to a certain message size, short error bursts are always detected. Unless I err, with the primitive reduction polynomial $P=x^8+x^4+x^3+x^2+1$ used by AES3 (no relation to the block cipher AES), we can use $K=x^7+x^3+x^2+1$, and messages up to 148 bits (including CRC) are fully protected against errors within a burst of at most 3 bits. This looks quite like what the question asks for.
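For readers who want to re-check a figure of this kind, here is a sketch of the search I have in mind. It looks for the smallest shift $n$ at which some burst $Q$ of at most `max_len` bits satisfies $(Q\cdot x^n)\bmod P=K$. The function name `first_collision` is hypothetical, and converting the reported $n$ into a maximum message length depends on how bits are numbered and on the length of the colliding burst, so I leave that step to the reader.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a by p, both polynomials over GF(2) packed into ints."""
    dp = p.bit_length() - 1
    while a.bit_length() - 1 >= dp:
        a ^= p << (a.bit_length() - 1 - dp)
    return a

def first_collision(p: int, k: int, max_len: int, max_shift: int = 1024):
    """Return (n, q) for the smallest shift n such that some burst q of at
    most max_len bits (constant term 1) has (q * x^n) mod p == k, or None
    if there is no such collision below max_shift."""
    for n in range(max_shift):
        for q in range(1, 1 << max_len, 2):  # odd q <=> constant term 1
            if poly_mod(q << n, p) == k:
                return n, q
    return None

P = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1
K = 0x8D    # x^7 + x^3 + x^2 + 1

print(first_collision(P, K, 1))   # first single-bit error turning V1 into V2
print(first_collision(P, K, 3))   # first burst of at most 3 bits doing so
```

Because $P$ is primitive, the single-bit search must succeed for some $n<255$; the search with bursts of up to $3$ bits can only succeed at the same $n$ or earlier.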
Further, we can build V2 messages by first appending some other error detection code (or MAC, to become topical for crypto.se), then appending the CRC for V1 XOR-ed with $K$ as above. This makes it unlikely that a V1 message is changed to V2, and the undetected error rate for long random errors can be back to almost that for V1 (with most undetected errors transforming a message of either version into another V1 message). Still further, once a source is known to use V2, perhaps a receiver can refuse V1 messages from that source, and then V2 becomes more robust than V1.