A compression function is necessary to fulfill the requirements of a hash function. This is because a hash function is expected to accept bit strings of arbitrary length as input and to output a bit string of constant length*.
The specifics of how the compression function compresses the input string are particular to the construction of the hash function in question. For example, the Merkle–Damgård construction is often instantiated with a compression function built from a block cipher. As to how it compresses, it typically uses one block of hash input at a time as the key to the block cipher, and uses this to successively encrypt the internal state. In the common Davies–Meyer variant, the previous state is also XORed into the cipher output, so that a step cannot simply be undone by decrypting with the same key.
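As a rough illustration of the block-cipher approach, here is a minimal Python sketch. Everything in it is an assumption made for the example: `toy_block_encrypt` is a small cipher modeled on XTEA (not checked against test vectors), `compress` uses a Davies–Meyer-style feed-forward, and the all-zero initial state and the padding in `md_pad` are simplified choices. It is not secure and does not match any real hash function.

```python
import struct

MASK32 = 0xFFFFFFFF
BLOCK_SIZE = 16  # bytes of hash input consumed per step (used as the cipher key)
STATE_SIZE = 8   # bytes of internal state (the cipher's block size)


def toy_block_encrypt(key: bytes, block: bytes, rounds: int = 32) -> bytes:
    """Toy 64-bit block cipher with a 128-bit key, modeled on XTEA. Illustration only."""
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    delta, total = 0x9E3779B9, 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (total + k[total & 3]))) & MASK32
        total = (total + delta) & MASK32
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (total + k[(total >> 11) & 3]))) & MASK32
    return struct.pack(">2I", v0, v1)


def md_pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 64-bit message bit length (simplified padding)."""
    padded = message + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % BLOCK_SIZE)
    return padded + (len(message) * 8).to_bytes(8, "big")


def compress(state: bytes, block: bytes) -> bytes:
    """One compression step: encrypt the state under the input block, then XOR the old state back in."""
    encrypted = toy_block_encrypt(block, state)
    return bytes(x ^ y for x, y in zip(encrypted, state))


def toy_md_hash(message: bytes) -> bytes:
    """Chain the compression function over all padded blocks."""
    state = bytes(STATE_SIZE)  # hypothetical all-zero initial value
    padded = md_pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state


print(toy_md_hash(b"hello").hex())
print(toy_md_hash(b"hello" * 1000).hex())  # much longer input, same 8-byte output size
```

Each call to `compress` takes 8 + 16 bytes in and gives 8 bytes out, which is the "compression"; the loop is what lets the hash accept inputs of any length.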
There is also the sponge construction, which compresses the input data in a different way. The sponge construction holds an internal state consisting of rate + capacity bytes. It "absorbs" rate bytes of hash input at a time into the rate section of the state, either via XOR or direct replacement, and then mixes the whole state using an invertible permutation.
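The absorb-then-permute loop can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `toy_permutation` is a made-up invertible mixing function built from ChaCha-style add/rotate/XOR steps, the sizes `RATE` and `CAPACITY` are arbitrary, and the padding is a simplified "0x80 then zeros" rule rather than the padding of any real sponge hash. It is not secure.

```python
import struct

MASK32 = 0xFFFFFFFF
RATE = 8       # bytes of input absorbed per step
CAPACITY = 8   # bytes of state that input never touches directly


def rotl32(x: int, n: int) -> int:
    return ((x << n) | (x >> (32 - n))) & MASK32


def toy_permutation(state: bytes, rounds: int = 12) -> bytes:
    """Hypothetical invertible permutation on a 16-byte state. Illustration only."""
    a, b, c, d = struct.unpack("<4I", state)
    for r in range(rounds):
        a ^= r + 1  # round constant, so the all-zero state is not a fixed point
        a = (a + b) & MASK32
        d = rotl32(d ^ a, 16)
        c = (c + d) & MASK32
        b = rotl32(b ^ c, 12)
        a = (a + b) & MASK32
        d = rotl32(d ^ a, 8)
        c = (c + d) & MASK32
        b = rotl32(b ^ c, 7)
    return struct.pack("<4I", a, b, c, d)


def toy_sponge_hash(message: bytes, out_len: int = 16) -> bytes:
    state = bytearray(RATE + CAPACITY)
    # Simplified padding: 0x80, then zeros up to a multiple of RATE.
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % RATE)
    # Absorb: XOR each input block into the rate section, then permute the whole state.
    for i in range(0, len(padded), RATE):
        for j in range(RATE):
            state[j] ^= padded[i + j]
        state = bytearray(toy_permutation(bytes(state)))
    # Squeeze: read the rate section, permuting again whenever more output is needed.
    out = b""
    while len(out) < out_len:
        out += bytes(state[:RATE])
        if len(out) < out_len:
            state = bytearray(toy_permutation(bytes(state)))
    return out[:out_len]


print(toy_sponge_hash(b"hello").hex())
print(toy_sponge_hash(b"hello", out_len=40).hex())  # longer output from the same input
```

Because the squeeze loop can keep running, the output length is a free parameter here, and a longer output starts with the same bytes as a shorter one for the same input.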
*Technically, there is a practical maximum on the size of the inputs, and the sponge construction is notable for being able to produce outputs of arbitrary length.