Your question is an elementary consequence of the fact that the site percolation model (as you defined it) satisfies the Fortuin-Kasteleyn-Ginibre (FKG) inequality for increasing events.
In your model, say that two colorings $\sigma$ and $\tau$ satisfy
$$\sigma \leq \tau$$
if and only if
$$\sigma(v) \leq \tau(v) \text{ for all gridpoints }v\,,$$
where we say that 'white < black'.
This defines a partial order on colorings. Call an event $A$ increasing if $\sigma \in A$ and $\sigma \leq \tau$ implies $\tau \in A$. The events $A$ and $B$ of diagonal black crossings you described are increasing events w.r.t. this partial order.
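To make the definitions concrete, here is a minimal sketch that encodes white as 0 and black as 1 and checks by brute force whether an event is increasing. The two events `at_least_two_black` and `exactly_two_black` are hypothetical toy examples on four vertices, not the crossing events from your question.

```python
from itertools import product

WHITE, BLACK = 0, 1  # encodes 'white < black'


def leq(sigma, tau):
    """Return True if sigma <= tau, i.e. sigma(v) <= tau(v) for all v."""
    return all(s <= t for s, t in zip(sigma, tau))


def is_increasing(event, n_vertices):
    """Brute-force check: sigma in A and sigma <= tau must imply tau in A."""
    colorings = list(product((WHITE, BLACK), repeat=n_vertices))
    return all(
        event(tau)
        for sigma in colorings if event(sigma)
        for tau in colorings if leq(sigma, tau)
    )


# Hypothetical toy events on 4 vertices (illustration only).
def at_least_two_black(coloring):
    return sum(coloring) >= 2


def exactly_two_black(coloring):
    return sum(coloring) == 2


print(is_increasing(at_least_two_black, 4))  # True
print(is_increasing(exactly_two_black, 4))   # False
```

The second event fails because adding a third black vertex leaves the event, which is exactly what "increasing" rules out.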
The FKG-inequality states that
$$P(A\cap B) \geq P(A) P(B) $$
for increasing events $A$ and $B$, or equivalently (when $P(A) > 0$)
$$ P(B\mid A) \geq P(B)\,.$$
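On a very small grid the inequality can also be checked by exact enumeration. The sketch below is only an illustration under assumptions of mine: a $3\times 3$ grid, nearest-neighbor adjacency, each site black independently with probability `P_BLACK`, and left-right and top-bottom black crossings standing in for the diagonal crossings of your question.

```python
from itertools import product

N = 3          # assumption: grid side, small enough to enumerate 2**9 colorings
P_BLACK = 0.5  # assumption: each site is black independently with this probability


def crosses(coloring, horizontal):
    """True if black sites connect left to right (or top to bottom)."""
    grid = [coloring[r * N:(r + 1) * N] for r in range(N)]
    if not horizontal:
        grid = [list(col) for col in zip(*grid)]  # transpose the grid
    stack = [(r, 0) for r in range(N) if grid[r][0]]
    seen = set(stack)
    while stack:  # depth-first search through black sites
        r, c = stack.pop()
        if c == N - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < N and 0 <= cc < N and grid[rr][cc] and (rr, cc) not in seen:
                seen.add((rr, cc))
                stack.append((rr, cc))
    return False


def weight(coloring, p):
    """Product-measure probability of a single coloring."""
    k = sum(coloring)
    return p ** k * (1 - p) ** (len(coloring) - k)


def fkg_terms(p=P_BLACK):
    """Return P(A), P(B), P(A and B) by summing over all colorings."""
    pa = pb = pab = 0.0
    for c in product((0, 1), repeat=N * N):
        a, b = crosses(c, True), crosses(c, False)
        w = weight(c, p)
        pa += w * a
        pb += w * b
        pab += w * (a and b)
    return pa, pb, pab


pa, pb, pab = fkg_terms()
print(f"P(A)={pa:.4f}  P(B)={pb:.4f}  P(A and B)={pab:.4f}  P(A)P(B)={pa * pb:.4f}")
```

This of course proves nothing in general; it only lets you see the two sides of the inequality for one small case.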
To see that this is true, consider the set of colorings in $A$ and define the following Markov chain on it (here $p$ is the probability that a gridpoint is black):
- For a given $\tau \in A$, choose a vertex $v$ uniformly at random.
- With probability $1-p$, recolor $\tau(v)$ white, provided the new configuration is still in $A$; in all other cases, recolor $\tau(v)$ black.
The stationary distribution of this Markov chain is exactly $P(\cdot \mid A)$. Call this Markov chain $(\tau^t)$, with $\tau^0$ the all-black coloring (which lies in $A$ because $A$ is increasing and nonempty). Now, define the chain $(\sigma^t)$ via the process
- For a given $\sigma$, choose a vertex $v$ randomly
- Recolor $\sigma(v)$ white with probability $1-p$, otherwise black.
with $\sigma^0$ the all-white coloring. The stationary distribution of this Markov chain obviously is $P(\cdot)$. Now, we have $\sigma^0\leq \tau^0$, and the Markov chains can be coupled to ensure $\sigma^t \leq \tau^t$ for all $t\geq 0$ (simply choose the same vertex in the first step, and base the choice in the second step on the same random number generated uniformly in $[0,1]$). Since $B$ is increasing, $\sigma^t \in B$ then implies $\tau^t \in B$, and as a consequence we get
$$P(B) = \lim_{t\to\infty} P(\sigma^t\in B) \leq \lim_{t\to\infty} P(\tau^t\in B) = P(B\mid A)\,.$$
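The coupling can be simulated directly. The following is a sketch under assumptions of mine rather than anything taken from your question: a $4\times 4$ grid, nearest-neighbor adjacency, $p = 1/2$, and left-right and top-bottom black crossings standing in for $A$ and $B$. Both chains use the same vertex and the same uniform number in every step, and the assertion inside the loop checks $\sigma^t \leq \tau^t$.

```python
import random

N = 4          # assumption: grid side
P_BLACK = 0.5  # assumption: probability p that a site is black


def crosses(c, horizontal):
    """True if black sites of the flat coloring c connect two opposite sides."""
    def black(i, j):
        # i indexes along the starting side, j is the crossing direction
        return c[i * N + j] if horizontal else c[j * N + i]

    stack = [(i, 0) for i in range(N) if black(i, 0)]
    seen = set(stack)
    while stack:  # depth-first search through black sites
        i, j = stack.pop()
        if j == N - 1:
            return True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < N and 0 <= jj < N and black(ii, jj) and (ii, jj) not in seen:
                seen.add((ii, jj))
                stack.append((ii, jj))
    return False


def in_A(c):
    return crosses(c, True)   # stand-in for event A


def in_B(c):
    return crosses(c, False)  # stand-in for event B


def coupled_step(sigma, tau, rng, p=P_BLACK):
    """One step of both chains with a shared vertex and a shared uniform."""
    v = rng.randrange(N * N)
    u = rng.random()
    if u < 1 - p:
        sigma[v] = 0
        tau[v] = 0
        if not in_A(tau):
            tau[v] = 1  # white would leave A, so the conditioned chain recolors black
    else:
        sigma[v] = 1
        tau[v] = 1


def run(steps=20000, seed=0):
    """Return the fraction of time each chain spends in B."""
    rng = random.Random(seed)
    sigma = [0] * (N * N)  # all-white start
    tau = [1] * (N * N)    # all-black start, which lies in A
    hits_sigma = hits_tau = 0
    for _ in range(steps):
        coupled_step(sigma, tau, rng)
        assert all(s <= t for s, t in zip(sigma, tau))  # monotone coupling holds
        hits_sigma += in_B(sigma)
        hits_tau += in_B(tau)
    return hits_sigma / steps, hits_tau / steps


freq_sigma, freq_tau = run()
print(f"time fraction in B: unconditioned {freq_sigma:.3f}, conditioned on A {freq_tau:.3f}")
```

The time averages are only rough estimates of $P(B)$ and $P(B\mid A)$, but because of the coupling the first can never exceed the second along any run.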