The other answers provide insight into how to solve the question using elementary methods. I will try to add to the discussion by outlining how one may use standard tools from Markov chain theory to solve this problem. While the solution I present is overkill for this particular question, it generalizes easily to a much larger class of problems containing the one you presented.
We can consider the given process as an absorbing Markov chain with four states. The states are as follows:
- We have seen neither heads nor tails;
- We have seen only heads before;
- We have seen only tails before;
- We have seen both heads and tails before.
The matrix of transition probabilities for the four states is as follows:
$$A=\begin{pmatrix}
0 & \frac 12 & \frac 12 & 0 \\
0 & \frac 12 & 0 & \frac 12 \\
0 & 0 & \frac 12 & \frac 12 \\
0 & 0 & 0 & 1
\end{pmatrix}.$$
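In code, the chain can be set up and stepped as follows. This is a minimal sketch; the 0-based state indices and the helper `step` are my own naming, not part of the original problem:

```python
import random

# Transition matrix A, with states indexed 0..3 instead of 1..4:
A = [
    [0.0, 0.5, 0.5, 0.0],  # 0: neither heads nor tails seen yet
    [0.0, 0.5, 0.0, 0.5],  # 1: only heads seen
    [0.0, 0.0, 0.5, 0.5],  # 2: only tails seen
    [0.0, 0.0, 0.0, 1.0],  # 3: both seen (absorbing)
]

# Sanity check: every row is a probability distribution.
assert all(abs(sum(row) - 1.0) < 1e-12 for row in A)

def step(state: int, rng: random.Random) -> int:
    """Sample the next state according to row `state` of A."""
    return rng.choices(range(4), weights=A[state])[0]
```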
We want to calculate the expected time until we reach state 4 from state 1. Let $\tau$ be the time it takes to reach state 4 starting from state 1.
Consider now the matrix
$$B=\begin{pmatrix} 0 & \frac 12 & \frac 12 \\ 0 & \frac 12 & 0 \\ 0 & 0 & \frac 12\end{pmatrix}.$$
This matrix describes the transitions between the transient states. By the layer-cake formula, we have
$$\mathbb E(\tau) = \sum_{k=1}^\infty \mathbb P(\tau\ge k).$$
Now $\mathbb P(\tau\ge k)$ is just the probability that, having started in state 1, we have not yet been absorbed after $k-1$ steps, i.e. we are still in one of the transient states.
Thus $$\mathbb P(\tau \ge k) = \text{first entry of }B^{k-1} \begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}.$$
In particular, $$\mathbb E(\tau) = \text{first entry of }\left(\sum_{k=0}^\infty B^k\right)\begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix} =\text{first entry of } (\operatorname{Id} - B)^{-1} \begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix},$$ where the Neumann series converges because every eigenvalue of $B$ has modulus strictly less than $1$.
We calculate $$(\operatorname{Id} - B)^{-1} = \begin{pmatrix}1 & 1 & 1 \\ 0 & 2 & 0 \\ 0 & 0 & 2\end{pmatrix}$$ and thus $$\mathbb E(\tau) = 3.$$ In fact, by taking the second and third entry of $$(\operatorname{Id} - B)^{-1} \begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}$$ you get, respectively, the expected number of steps until you go to state 4 if you started in state 2 or state 3.
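As a quick numerical sanity check (purely illustrative, not part of the argument), the fundamental matrix and the expected absorption times can be computed with numpy:

```python
import numpy as np

# B: transitions among the three transient states (states 1, 2, 3 above).
B = np.array([
    [0.0, 0.5, 0.5],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
])

# Fundamental matrix (Id - B)^{-1}; entry (i, j) is the expected number
# of visits to transient state j when starting in transient state i.
N = np.linalg.inv(np.eye(3) - B)

# Expected number of steps until absorption from each transient state.
expected_steps = N @ np.ones(3)

print(N)               # ≈ [[1, 1, 1], [0, 2, 0], [0, 0, 2]]
print(expected_steps)  # ≈ [3, 2, 2]
```

The first entry of `expected_steps` is the answer $3$; the other two entries are the expected times from states 2 and 3, as noted above.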
In conclusion, the answer to question a) is $3$.
Let us move on to question b). Let $(Y_t)_{t\in\mathbb Z_{\ge 0}}$ denote the Markov chain of the states we are in. That is, we start with $Y_0=1$, jump to state $2$ once we have seen only heads, jump to state $3$ once we have seen only tails, and jump to state $4$ once we have seen both. Having heads on the last throw before we stopped is equivalent to $Y_{\tau - 1} = 3$. (Exercise: Why is that so?)
Therefore
$$\mathbb P(\text{last throw is heads}) = \mathbb P(Y_{\tau-1} = 3) = \sum_{n=1}^\infty \mathbb P(\tau=n\land Y_{n-1}=3) = \sum_{n=1}^\infty \mathbb P(Y_n=4\mid Y_{n-1}=3)\mathbb P(Y_{n-1}=3) = \frac 12\sum_{n=1}^\infty\mathbb P(Y_{n-1}=3).$$
But $\sum_{n=1}^\infty\mathbb P(Y_{n-1}=3)$ is just the expected number of times we visit state $3$ before getting absorbed when starting in state 1. This is the entry at position $(1,3)$ of the matrix $(\operatorname{Id}-B)^{-1}$, which equals $1$. We therefore conclude
$$\mathbb P(Y_{\tau-1}=3)=\frac 12,$$
which thus establishes that the answer to part b) is $\frac 12$.
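For the skeptical reader, a short Monte Carlo simulation (illustrative only; the seed and trial count are arbitrary) confirms both answers at once:

```python
import random

# Flip a fair coin until both sides have appeared. Record the number of
# flips (part a) and whether the final flip, the one that completed the
# pair, was heads (part b).
rng = random.Random(0)
trials = 200_000
total_flips = 0
last_flip_heads = 0

for _ in range(trials):
    seen = set()
    flips = 0
    while len(seen) < 2:
        flip = rng.choice("HT")
        seen.add(flip)
        flips += 1
    total_flips += flips
    last_flip_heads += (flip == "H")

print(total_flips / trials)      # ≈ 3
print(last_flip_heads / trials)  # ≈ 0.5
```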