This problem is a special case of Problem SS7 from Garey & Johnson, which gives us the reduction we need to sketch a proof that OP's problem is NP-complete.
[SS7] SEQUENCING TO MINIMIZE MAXIMUM CUMULATIVE COST
INSTANCE: Set $T$ of tasks, partial order $\lessdot$ on $T$, a "cost" $c(t) \in Z$ for each $t \in T$ (if $c(t) < 0$, it can be viewed as "profit"), and a constant $K \in Z$
QUESTION: Is there a one-processor schedule $\sigma$ for $T$ that obeys the precedence constraints and which has the property that, for every task $t\in T$, the sum of the costs for all tasks $t'$ with $\sigma(t') \leq \sigma(t)$ is at most $K$?
Reference: [Abdel-Wahab, 1976]. Transformation from REGISTER SUFFICIENCY.
Comment: Remains NP-complete even if $c(t) \in \{-1, 0, 1\}$ for all $t\in T$. Can be solved in polynomial time if $\lessdot$ is series-parallel [Abdel-Wahab and Kameda, 1978], [Monma and Sidney, 1977].
A "schedule" $\sigma$ in this context refers to a bijection $\sigma: T \to \{1,2,\ldots,|T|\}$ (i.e.: an ordering of $T$).
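To make the definition concrete, here is a minimal brute-force sketch of the SS7 decision problem. The function names and the encoding (tasks as arbitrary hashable labels, `precedes` as a set of pairs `(u, v)` meaning $u \lessdot v$) are my own assumptions for illustration; it tries every ordering, so it is only usable on tiny instances.

```python
from itertools import permutations

def max_cumulative_cost(order, cost):
    """Largest prefix sum of costs along the given task order."""
    total, worst = 0, float("-inf")
    for t in order:
        total += cost[t]
        worst = max(worst, total)
    return worst

def respects(order, precedes):
    """True if every pair (u, v) in `precedes` has u scheduled before v."""
    pos = {t: i for i, t in enumerate(order)}
    return all(pos[u] < pos[v] for u, v in precedes)

def ss7_brute_force(tasks, precedes, cost, K):
    """Decide SS7 by trying every ordering (exponential time)."""
    return any(
        respects(order, precedes) and max_cumulative_cost(order, cost) <= K
        for order in permutations(tasks)
    )

# Hypothetical toy instance: t1 must precede t3.
tasks = [1, 2, 3]
precedes = {(1, 3)}
cost = {1: 1, 2: -1, 3: 1}
print(ss7_brute_force(tasks, precedes, cost, 1))
print(ss7_brute_force(tasks, precedes, cost, 0))
```

On this toy instance the best schedule runs the profitable task $t_2$ first, and no schedule keeps the running total at or below 0.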
Given an instance of SS7 as above, we'll construct an instance of OP's problem as follows. If $T = \{t_1, t_2, \ldots, t_n\}$, then we let:$$
A = \{a_1, a_2,\ldots, a_n\}\\
B = \{b_1, b_2, \ldots, b_n\}\\
E = \{(a_i, b_j) | t_i \lessdot t_j\} \cup \{(a_i, b_i) | i=1,2,\ldots, n\} \\
G = (A \cup B, E) \\
\mbox{If $c(t_i) > 0$, then $w(a_i) = c(t_i), w(b_i) = 0$ }\\
\mbox{If $c(t_i) \leq 0$, then $w(a_i) = 0, w(b_i) = c(t_i)$}
$$
If the weight function $w$ is not allowed to take the value zero, then I think we can repair this by using sufficiently small epsilons, but I haven't worked out the details.
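The construction above can be sketched in a few lines of Python. The representation is an assumption of mine, not something fixed by the problem: tasks are numbered $1, \ldots, n$, the node $a_i$ is the tuple `("a", i)`, the node $b_i$ is `("b", i)`, and `precedes` holds the pairs `(i, j)` with $t_i \lessdot t_j$.

```python
def build_instance(n, precedes, cost):
    """Build (A, B, E, w) from an SS7 instance with tasks 1..n."""
    A = [("a", i) for i in range(1, n + 1)]
    B = [("b", i) for i in range(1, n + 1)]
    # One edge per precedence pair, plus the "diagonal" edges (a_i, b_i).
    E = {(("a", i), ("b", j)) for i, j in precedes}
    E |= {(("a", i), ("b", i)) for i in range(1, n + 1)}
    w = {}
    for i in range(1, n + 1):
        c = cost[i]
        # Positive costs go on the A-node, non-positive ones on the B-node.
        w[("a", i)] = c if c > 0 else 0
        w[("b", i)] = c if c <= 0 else 0
    return A, B, E, w

# Hypothetical toy instance: t1 must precede t3.
A, B, E, w = build_instance(3, {(1, 3)}, {1: 1, 2: -1, 3: 1})
print(sorted(E))
print(w)
```

Note that $w(a_i) + w(b_i) = c(t_i)$ for every $i$, which is what makes the cumulative sums line up later.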
Now, it would be nice if the linear extensions of $T$ (i.e., orderings of $T$ that respect $\lessdot$) were in bijection with the permissible orderings of $G$, say by reading off the order in which the $B$-nodes occur in the $G$-ordering. Unfortunately, the latter leaves us a little more freedom, as we can see by taking $G$ to be $2K_2$. For concreteness, let $G = ( \{a_1, a_2, b_1, b_2\}, \{ \{a_1, b_1\}, \{a_2, b_2\} \} )$. Then $(a_1, a_2, b_1, b_2)$ and $(a_1, b_1, a_2, b_2)$ are distinct permissible orderings with the $B$-nodes in the same order. Intuitively, we prefer the latter to the former since we're trying to minimize the maximum cumulative weight, which leads us to the following definition.
Definition: Let ${\bf{v}} = (v_1, v_2, \ldots)$ be some permissible ordering of $G$. Suppose further that the $B$-nodes occur at indices $i_1 < i_2 < \cdots$. Note that all the nodes $v_{i_j+1}, v_{i_j+2}, \ldots, v_{i_{j+1}-1}$ are $A$-nodes.
We will say that a permissible ordering is canonical if for every pair of successively appearing $B$-nodes, say, appearing at indices $i_j$ and $i_{j+1}$:
- The $A$-nodes that appear between $v_{i_j}$ and $v_{i_{j+1}}$ are in lexicographic order (i.e., in increasing order of their original indices).
- For every $A$-node $a$ that appears between $v_{i_j}$ and $v_{i_{j+1}}$, the edge $\{a, v_{i_{j+1}}\}$ is present in $E$.
We'll treat these criteria as vacuously satisfied if two $B$-nodes appear consecutively (i.e.: there are no $A$-nodes appearing between that pair). We'll denote the obvious map from any permissible ordering $\bf{v}$ to its associated canonical ordering as $\chi(\bf{v})$.
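One way to realize the map $\chi$ is sketched below. This is my reading of "the obvious map", under two assumptions: an ordering is permissible when every $B$-node appears after all of its $A$-neighbours, and every $A$-node has at least one $B$-neighbour (true in the construction, since $(a_i, b_i) \in E$). The $A$-nodes before the first $B$-node are treated the same way as the later blocks. Nodes are encoded as tuples `("a", i)` and `("b", i)`, which is purely an illustrative choice.

```python
def canonicalize(ordering, E):
    """Keep the B-nodes in their original order; before each B-node, emit
    its not-yet-emitted A-neighbours in increasing index order."""
    b_order = [v for v in ordering if v[0] == "b"]
    placed = set()
    result = []
    for b in b_order:
        needed = sorted(a for (a, b2) in E if b2 == b and a not in placed)
        result.extend(needed)
        placed.update(needed)
        result.append(b)
    return result

# The 2K_2 example: both permissible orderings map to the same canonical one.
a1, a2, b1, b2 = ("a", 1), ("a", 2), ("b", 1), ("b", 2)
E = {(a1, b1), (a2, b2)}
print(canonicalize([a1, a2, b1, b2], E))
print(canonicalize([a1, b1, a2, b2], E))
```

Each $A$-node is delayed until just before the first $B$-node that needs it, which is exactly what the second criterion asks for.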
Lemma 1: For any permissible ordering $\bf{v}$, the maximum cumulative weight of $\bf{v}$ is greater than or equal to the maximum cumulative weight of its associated canonical ordering $\chi(\bf{v})$.
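Proof Sketch: Criterion 1 just fixes one ordering among many equivalent ones: every $A$-node has nonnegative weight, so the cumulative weight within a block of $A$-nodes is largest at the end of the block regardless of how the block is ordered, and the maximum cumulative weight is unaffected. Criterion 2 encodes the intuition that if we're trying to minimize the maximum cumulative weight, we never need to include an $A$-node before it's needed. Again because $w(a) \geq 0$ for every $A$-node, postponing an $A$-node can only lower the cumulative weight of the prefixes it is removed from. $\square$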
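As a sanity check rather than a proof, Lemma 1 can be tested exhaustively on a small instance. The sketch below assumes that "permissible" means every $B$-node comes after all of its $A$-neighbours, uses the same canonicalization idea as described above (delay each $A$-node until the first $B$-node that needs it), and takes the maximum over non-empty prefixes. All names are hypothetical.

```python
from itertools import permutations

def canonicalize(ordering, E):
    """Delay each A-node until just before the first B-node adjacent to it."""
    placed, result = set(), []
    for b in (v for v in ordering if v[0] == "b"):
        needed = sorted(a for (a, b2) in E if b2 == b and a not in placed)
        result.extend(needed)
        placed.update(needed)
        result.append(b)
    return result

def is_permissible(ordering, E):
    """Every B-node must appear after all of its A-neighbours."""
    pos = {v: i for i, v in enumerate(ordering)}
    return all(pos[a] < pos[b] for a, b in E)

def max_cumulative_weight(ordering, w):
    total, worst = 0, float("-inf")
    for v in ordering:
        total += w[v]
        worst = max(worst, total)
    return worst

def lemma1_holds(nodes, E, w):
    """Check Lemma 1 on every permissible ordering (exponential time)."""
    return all(
        max_cumulative_weight(canonicalize(v, E), w) <= max_cumulative_weight(v, w)
        for v in permutations(nodes)
        if is_permissible(v, E)
    )

# Instance built from the toy SS7 input: costs (1, -1, 1), t1 before t3.
nodes = [(s, i) for s in "ab" for i in (1, 2, 3)]
E = {(("a", i), ("b", i)) for i in (1, 2, 3)} | {(("a", 1), ("b", 3))}
w = {("a", 1): 1, ("b", 1): 0, ("a", 2): 0, ("b", 2): -1, ("a", 3): 1, ("b", 3): 0}
print(lemma1_holds(nodes, E, w))
```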
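Lemma 2: For every linear ordering $\sigma$ of $T$ that respects $\lessdot$, the maximum cumulative cost of $\sigma$ is equal to the maximum cumulative weight of the canonical ordering of $G$ associated with $\sigma$.
Proof Sketch: If the linear ordering of $T$ is $(t_{\sigma(1)}, t_{\sigma(2)}, \ldots, t_{\sigma(n)})$, then the associated canonical ordering of $G$ is $(a_{\sigma(1)}, b_{\sigma(1)}, a_{\sigma(2)}, b_{\sigma(2)}, \ldots , a_{\sigma(n)}, b_{\sigma(n)})$. Since $w(a_i) + w(b_i) = c(t_i)$, the cumulative weight after each $b_{\sigma(k)}$ equals the cumulative cost after $t_{\sigma(k)}$, and the cumulative weight after each $a_{\sigma(k)}$ equals either that same value (if $c(t_{\sigma(k)}) > 0$) or the value after the previous task (otherwise). One caveat: if $c(t_{\sigma(1)}) \leq 0$, the prefix consisting of $a_{\sigma(1)}$ alone has weight 0, so strictly speaking the maximum cumulative weight is the larger of 0 and the maximum cumulative cost, and the two agree whenever the latter is nonnegative. I think everything we need should follow from that. $\square$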
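Lemma 2 can also be checked by brute force on small inputs. The sketch below (hypothetical names, tasks numbered $1, \ldots, n$) walks over every ordering of $T$ that respects the precedence pairs, builds the weights of the interleaved ordering $(a, b, a, b, \ldots)$ directly from the construction, and compares the two maxima. It compares against $\max(0, \cdot)$ because the prefix consisting of the first $A$-node alone has weight 0 when the first task has non-positive cost; for instances whose maximum cumulative cost is nonnegative this is plain equality.

```python
from itertools import permutations

def max_prefix_sum(values):
    """Largest sum over the non-empty prefixes of `values`."""
    total, worst = 0, float("-inf")
    for x in values:
        total += x
        worst = max(worst, total)
    return worst

def lemma2_holds(n, precedes, cost):
    """Compare task costs with interleaved node weights for every
    ordering of tasks 1..n that respects `precedes`."""
    for order in permutations(range(1, n + 1)):
        pos = {t: k for k, t in enumerate(order)}
        if any(pos[i] > pos[j] for i, j in precedes):
            continue  # not a linear extension
        task_costs = [cost[t] for t in order]
        node_weights = []
        for t in order:
            c = cost[t]
            # w(a_t), then w(b_t), as in the construction.
            node_weights += [c if c > 0 else 0, c if c <= 0 else 0]
        if max_prefix_sum(node_weights) != max(0, max_prefix_sum(task_costs)):
            return False
    return True

print(lemma2_holds(3, {(1, 3)}, {1: 1, 2: -1, 3: 1}))
```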
Abdel-Wahab, Hussein M. Scheduling with applications to register allocation and deadlock problems. Diss. University of Waterloo, 1976.
Abdel-Wahab, Hussein M., and Tiko Kameda. "Scheduling to minimize maximum cumulative cost subject to series-parallel precedence constraints." Operations Research 26.1 (1978): 141-158.
Monma, C., and J. Sidney. A general algorithm for optimal job sequencing with series-parallel precedence constraints. Cornell University Operations Research and Industrial Engineering, 1977.