We don't really know exactly how hard it is to find preimages of most cryptographic hash functions. Even for MD5 it is only "easy" to create collisions, while finding preimages is still considered "hard".
That said, we can apply statistical tools to estimate a "worst case" if we model the hash function as a random oracle. In this case, we need on average $2^{255}$ tries to find a preimage for a 256-bit hash. Now, if we are looking only for a partial match (only 128 bits, it doesn't really matter which ones), we can expect a match after about $2^{127}$ tries on average.
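To make the partial-match search concrete, here is a minimal sketch in Python. It assumes SHA-256 as a stand-in for the random oracle and keeps only the first 16 bits so that it finishes quickly; the function names and the counter-based candidate inputs are hypothetical choices for illustration. The work grows exponentially with the number of bits kept, so the same loop with 128 bits would never finish in practice.

```python
import hashlib
from itertools import count

def truncated_digest(data: bytes, n_bits: int) -> int:
    """Return the first n_bits of SHA-256(data) as an integer."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - n_bits)

def find_partial_preimage(target: int, n_bits: int):
    """Try counter-derived inputs until the truncated digest equals target.

    Returns the matching input and the number of tries it took.
    """
    for tries in count(1):
        candidate = str(tries).encode()
        if truncated_digest(candidate, n_bits) == target:
            return candidate, tries

N_BITS = 16  # assumption: small enough to brute-force in well under a second
target = truncated_digest(b"original message", N_BITS)
preimage, tries = find_partial_preimage(target, N_BITS)
print(f"found {preimage!r} after {tries} tries")
```

Note that the input found this way is almost certainly not `b"original message"`; it is just some input that agrees with it on the bits that were kept.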
Now, hash functions take an arbitrary input and map it to a fixed-size output. There are no "preimages for M or K", there are just preimages (inputs which map to the same hash value). So once you have found a preimage, you have this input. If the HMAC scheme requires a certain format of the input values for key, message and possibly other information, then you should only test inputs which fit this format.
To your second question: the attacker cannot distinguish between the original message/key and any other preimage if he has only the hash value (since they all hash to the same value). A distinction is only possible with additional information about the message: for example, if the message is known to consist of English words in UTF-8, a preimage made of random bits is recognizably a "wrong" preimage. In that case, however, he would not even try such random inputs.
In general, the difference between HMAC and a plain cryptographic hash function is not that big. The difference becomes important if the message is chosen from a small set of possible messages. With a plain hash, the attacker could just test all possibilities and find the real one. With HMAC he cannot do that; he would have to test all messages with all keys (which should be larger than the hash value space). But on average, he will still find a preimage in $2^{k-1}$ tries for a $k$-bit hash value.
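The small-message-set case can be sketched as follows, using Python's standard `hashlib` and `hmac` modules. The candidate list, the key size and the helper names are assumptions made up for this illustration, not part of any particular protocol.

```python
import hashlib
import hmac
import secrets

CANDIDATES = [b"yes", b"no", b"maybe"]  # hypothetical small message set

def guess_from_plain_hash(digest: bytes):
    """Recover the message from an unkeyed SHA-256 digest by testing every candidate."""
    for m in CANDIDATES:
        if hashlib.sha256(m).digest() == digest:
            return m
    return None

def guess_from_hmac(tag: bytes, key_guess: bytes):
    """Test every candidate under one guessed key; only works if the key guess is right."""
    for m in CANDIDATES:
        expected = hmac.new(key_guess, m, hashlib.sha256).digest()
        if hmac.compare_digest(expected, tag):
            return m
    return None

secret_key = secrets.token_bytes(32)  # assumption: 256-bit random key
message = b"no"
plain = hashlib.sha256(message).digest()
tag = hmac.new(secret_key, message, hashlib.sha256).digest()

print(guess_from_plain_hash(plain))        # three hash evaluations suffice
print(guess_from_hmac(tag, bytes(32)))     # a wrong key guess reveals nothing
print(guess_from_hmac(tag, secret_key))    # succeeds only with the real key
```

Without the key, the attacker has to repeat the inner loop for every key guess, which is what pushes the effort back up to the generic brute-force estimate.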
As a final note: while $2^{127}$ possibilities are too many to brute-force, the security of this reduced hash value is lower than that of the original one. Each ignored bit of the hash value cuts the complexity of finding a preimage by brute force in half. If you consider cutting down to something like every 4th bit of a 256-bit hash value (or any other fixed 64 bits of it), you get into the "possible to find with today's tools" range. You have to ask yourself whether saving a couple of bits for a potentially longer message is worth this, especially if you consider the other message overhead (e.g. IPv4 and TCP headers are each 160+ bits).
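To get a feeling for the halving, here is a rough back-of-the-envelope sketch. It uses the $2^{k-1}$ estimate from above; the rate of $10^{12}$ hash evaluations per second is purely an assumed figure for illustration, not a measurement of any real hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
ASSUMED_RATE = 10**12  # hypothetical: hash evaluations per second

def expected_tries(kept_bits: int) -> int:
    """Average brute-force tries for a kept_bits-bit value, using the 2^(k-1) estimate."""
    return 2 ** (kept_bits - 1)

def years_needed(kept_bits: int, rate: int = ASSUMED_RATE) -> float:
    """Years of brute force at the assumed rate."""
    return expected_tries(kept_bits) / (rate * SECONDS_PER_YEAR)

for bits in (256, 128, 64):
    print(f"{bits:3d} bits kept: 2^{bits - 1} tries, roughly {years_needed(bits):.3g} years")
```

Under this assumed rate, 64 kept bits come out at well under a year, while 128 kept bits remain far out of reach.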