Inspired by this question, I'd like to know about binary codes that seek to maximize not the distance from each codeword to the nearest codeword, but the average distance from an arbitrary vector ($n$-tuple) to the nearest codeword (or whether the two goals are roughly equivalent).
In more detail: let $T=\{0,1\}^n$ be the set of all binary $n$-tuples. Call $C \subset T$, with $|C|=M<2^n$, a "codebook" (each element being a "codeword").
Let $$d_m^C=\min_{\substack{c_i, c_j \in C \\ i \ne j}} d(c_i,c_j)\tag1$$ where $d(\cdot,\cdot)$ is the Hamming distance.
In the theory of linear error-correcting codes, one classically wants to design an $(n,k)$ code (with $M=2^k$ and $C$ a vector subspace) that has a large $d_m^C$.
One could also be interested in the average distance from each codeword to the nearest other codeword $$d^C_a= \frac{1}{M}\sum_{c_i \in C} \min_{c_j \in C,j \ne i} d(c_i,c_j) \tag2$$
... but for a linear code, $d^C_a=d^C_m$, because all terms in the sum are equal.
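For small $n$, both $(1)$ and $(2)$ can be checked by brute force. A minimal sketch (my own illustration, not part of the question): codewords are stored as integers, and the standard $[7,4]$ Hamming code serves as a concrete linear example, so both quantities come out equal to its minimum distance $3$.

```python
from itertools import combinations

def hamming(a, b):
    # Hamming distance between two n-bit words stored as integers
    return bin(a ^ b).count("1")

# [7,4] Hamming code: the 16 codewords spanned by the generator rows [I_4 | P]
rows = [0b0001110, 0b0010101, 0b0100011, 0b1000111]
code = []
for m in range(16):
    w = 0
    for i in range(4):
        if (m >> i) & 1:
            w ^= rows[i]
    code.append(w)

# (1): minimum distance over all pairs of distinct codewords
d_min = min(hamming(a, b) for a, b in combinations(code, 2))

# (2): average, over codewords, of the distance to the nearest *other* codeword
d_avg = sum(min(hamming(c, o) for o in code if o != c) for c in code) / len(code)

print(d_min, d_avg)  # 3 and 3.0: for a linear code every term equals d_min
```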
Now suppose we are interested in computing the average distance to the nearest codeword, not from another codeword, but from each of the $2^n$ tuples:
$$d^T_a= \frac{1}{2^n}\sum_{x_i \in T} \min_{c_j \in C} d(x_i,c_j) \tag3$$
Again, we wish to design a $C$ (linear or not) that attains a large $d^T_a$, i.e., one that covers the whole space as evenly as possible (in this sense).
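Quantity $(3)$ can likewise be evaluated exhaustively for small $n$. A sketch (my own, assuming the $[7,4]$ Hamming code as the linear example and a uniformly random codebook of the same size $M=16$ for comparison): the Hamming code is perfect, so every $7$-tuple lies within distance $1$ of a codeword, giving $d^T_a = 112/128 = 0.875$.

```python
import random

def hamming(a, b):
    # Hamming distance between two n-bit words stored as integers
    return bin(a ^ b).count("1")

def d_T_avg(n, code):
    # (3): average over all 2^n binary n-tuples of the distance
    # to the nearest codeword in `code`
    return sum(min(hamming(x, c) for c in code) for x in range(2 ** n)) / 2 ** n

# [7,4] Hamming code (perfect: covering radius 1)
rows = [0b0001110, 0b0010101, 0b0100011, 0b1000111]
ham = []
for m in range(16):
    w = 0
    for i in range(4):
        if (m >> i) & 1:
            w ^= rows[i]
    ham.append(w)

print(d_T_avg(7, ham))  # 0.875: 16 tuples at distance 0, the other 112 at distance 1

# a random codebook of the same size, for comparison
random.seed(1)
rnd = random.sample(range(2 ** 7), 16)
print(d_T_avg(7, rnd))
```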
Has this been studied? Are there bounds or asymptotics? In particular: are the usual codes, designed for a high $d^C_m$, expected to perform well also with regard to $d^T_a$? Some preliminary (mostly numerical) work of mine (for the other question) seems to suggest that a random code performs better than a (say) BCH code, which surprised me a bit. It also seems that there is a computable asymptotic value, and that the random code attains it.
I'd appreciate any pointers or answers.