Problem (TL;DR): I'd like to know how to construct a carry-lookahead adder (CLA) that has $O(n)$ size and $O(\log n)$ depth using only fan-in 2 AND gates and XOR gates, as suggested in this answer and this answer.
Setting and motivation: I am doing a research project where we need to add or subtract two $n$-bit strings in two's complement representation. Specifically, I am working in the area of secure multiparty computation, where we generally express the function that we want to securely compute as a boolean circuit, and use some off-the-shelf "compiler" to "compile" such a circuit into a secure protocol that can be executed between the parties without leaking information.
In our work, we care about two things: communication complexity and round complexity. It turns out that the aforementioned compiler requires $O(1)$ communication and $O(1)$ rounds to compute an AND gate, and no communication or rounds to compute an XOR gate. We are only allowed to use these two types of gates, but they are universal. Therefore, the communication complexity scales roughly with the size of the circuit, and the round complexity scales roughly with the depth of the circuit.
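To make this cost model concrete, here is a minimal sketch that counts AND gates and AND-depth for a plain ripple-carry adder, treating XOR as free. The function names (`to_bits`, `from_bits`, `ripple_carry_cost`) are hypothetical, and the single-AND form of the carry is just one common way to write the majority function; it is a baseline for comparison, not the circuit I am after.

```python
def to_bits(x, n):
    """Little-endian list of the n low bits of x."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_cost(a_bits, b_bits):
    """Add two little-endian bit lists modulo 2^n.

    Returns (sum_bits, and_count, and_depth), counting only AND gates
    because XOR gates are assumed to be free in this model.
    """
    carry, and_count, and_depth = 0, 0, 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)  # sum bit: XOR only, no cost
        # carry-out = majority(a, b, carry), written with a single AND
        carry = carry ^ ((a ^ carry) & (b ^ carry))
        and_count += 1  # one AND per bit position
        and_depth += 1  # each AND waits for the previous carry
    return out, and_count, and_depth


sum_bits, ands, depth = ripple_carry_cost(to_bits(200, 8), to_bits(100, 8))
print(from_bits(sum_bits), ands, depth)  # 44 8 8  (300 mod 256 = 44)
```

So ripple-carry already has $O(n)$ size, but its depth is $n$, which is exactly the round cost I want to bring down to $O(\log n)$.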
Constraints: Only fan-in 2 AND gates and XOR gates can be used.
My attempts: I understand how to do it in $O(\log n)$ depth, but the problem is that my carry-lookahead unit is too large: its size is $O(n^2)$. I am looking at the following carry-lookahead formula from Wikipedia.
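For reference, a sketch of the standard expanded carry-lookahead equations, which is presumably the formula meant here. The notation is an assumption: $A_i, B_i$ are the input bits, $G_i = A_i B_i$ is the generate signal, $P_i = A_i \oplus B_i$ is the propagate signal, $C_0$ is the carry-in, juxtaposition denotes AND, and $+$ denotes OR.

$$
\begin{aligned}
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + G_0 P_1 + C_0 P_0 P_1 \\
C_3 &= G_2 + G_1 P_2 + G_0 P_1 P_2 + C_0 P_0 P_1 P_2 \\
C_{i+1} &= G_i + P_i C_i
\end{aligned}
$$

Written out this way, $C_i$ is an OR of $i+1$ summands.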
To compute $C_i$, we need to compute (at least) the $i$ ORs, which are just AND gates in disguise. Therefore, even if we can compute the individual summands in constant size, it does not seem that I can do better than roughly $n^2/2$ gates, i.e. $O(n^2)$, because of the ORs. This implies that we must reuse computation in some clever way. I've tried the work-efficient parallel prefix-sum network and a segment tree, but neither allowed me to eliminate these ORs or even just evaluate the individual summands efficiently.
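The counting argument above can be sketched as follows. This assumes the fully expanded form in which $C_i$ is an OR of $i+1$ summands, and that each two-input OR is built from one AND and two XORs; the function names are hypothetical.

```python
def or_from_and_xor(a, b):
    """Two-input OR using one AND and two XORs: a | b == a ^ b ^ (a & b)."""
    return a ^ b ^ (a & b)


def or_ands_in_expanded_lookahead(n):
    """AND gates spent on ORs alone when C_1..C_n are each expanded separately.

    C_i is an OR of i + 1 summands, i.e. i two-input ORs, each costing one AND.
    The cost of building the summands themselves is deliberately ignored here.
    """
    return sum(i for i in range(1, n + 1))  # n(n + 1) / 2


# Sanity check of the OR identity on all four input pairs.
assert all(or_from_and_xor(a, b) == (a | b) for a in (0, 1) for b in (0, 1))

for n in (4, 8, 64):
    print(n, or_ands_in_expanded_lookahead(n))  # 4 10, 8 36, 64 2080
```

The count is $n(n+1)/2$, which is where my $n^2/2$ estimate comes from.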
I must be missing some crucial insight here. After all, these expressions look so temptingly recursive, but I've tried everything I can think of. I would appreciate either a fully worked-out solution or just a hint. Thanks in advance for the help.

