
I need to re-solve a large linear optimization problem very often, after receiving updates to the problem data.

That is, I have a linear program: find an $x$ such that

$c_1 x_1 + \dots + c_n x_n$ is as small as possible

under the conditions that $Ax \# b$ (where $\#$ can be $\le$, $\ge$ or $=$ on each row).

At small time intervals, some of the $a_{ij}$, $c_i$ and $b_i$ receive updates and I need to recompute the solution. The updates touch only very few of the entries, and most of the time the changes are only by small amounts.

Some of the $x_i$ are integer variables, some are binary variables and some are real variables.
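
For concreteness, here is a minimal sketch of one (made-up) instance in this shape, using scipy.optimize.milp purely as a stand-in solver:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# minimize c.x  subject to  A x <= b, with mixed variable types
c = np.array([1.0, 2.0, -1.0])
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])
b = np.array([4.0, 6.0])

constraints = LinearConstraint(A, ub=b)        # rows with >= or = would set lb/ub accordingly
integrality = np.array([1, 1, 0])              # x0 integer, x1 binary (via its bounds), x2 real
bounds = Bounds(lb=[0, 0, 0], ub=[np.inf, 1, np.inf])

res = milp(c=c, constraints=constraints, integrality=integrality, bounds=bounds)
print(res.x, res.fun)
```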

Current idea:

  • find a sparse matrix data structure which allows for efficient updates (see the sketch after this list)
  • implement a branch-and-bound algorithm which works on that data structure
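
A sketch of the first idea (assuming SciPy's sparse formats; illustrative only, and the dimensions and helper are made up): keep the matrix in a dictionary-of-keys layout, which supports cheap single-entry updates, and convert to CSR only when handing it to the solver.

```python
import numpy as np
from scipy.sparse import dok_matrix

n_rows, n_cols = 1000, 2000
A = dok_matrix((n_rows, n_cols))   # dictionary-of-keys: cheap single-entry updates

# Initial fill: only the nonzero entries are stored.
A[0, 0] = 2.5
A[0, 10] = -1.0
A[999, 1999] = 4.0

def apply_updates(A, updates):
    """Apply a small batch of (row, col, new_value) updates in place."""
    for i, j, v in updates:
        A[i, j] = v

apply_updates(A, [(0, 0, 2.6), (5, 7, 1.0)])

# Solvers generally want a static format with fast row access,
# so convert just before each (re-)solve.
A_csr = A.tocsr()
```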

Question: which combination of sparse matrix representation and linear optimization algorithm is the current state of the art for such a problem?

Sub-question: is there a way to use the result from the previous run if I know only a few entries change?


2 Answers


In low dimension, Seidel's algorithm can be useful: if you have the optimal solution to $m$ constraints in $d$ dimensions and add one more constraint, then the amortized cost of finding the optimal solution to those $m+1$ constraints is $O(d!)$, assuming the constraints are presented in a random order.

(It is usually presented in the following form: if we have $m$ constraints and add them one by one, updating the optimal solution after each constraint is added, then the total running time is $O(d!\,m)$ using Seidel's method. That is equivalent, if we assume the constraints are added in a random order.)

In high dimension, I don't know of any good method. There is a trivial observation that if $x^*$ is the optimal solution to the constraints, and we add a new inequality, and $x^*$ satisfies this new inequality, then $x^*$ remains the optimal solution, so there's no need to re-solve the LP. This condition can be tested in $O(d)$ time. However, if $x^*$ doesn't satisfy the added inequality, I don't know if there's anything better you can do than solve the new LP from scratch. Same for changing or removing an existing inequality, or changing the objective function.
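
That $O(d)$ test is trivially cheap; a sketch (for a new constraint of the form $a \cdot x \le b$):

```python
import numpy as np

def previous_optimum_survives(x_star, a_new, b_new, tol=1e-9):
    """True if the old optimum x_star satisfies the new inequality
    a_new . x <= b_new, in which case it remains optimal and no
    re-solve is needed.  Costs O(d)."""
    return float(np.dot(a_new, x_star)) <= b_new + tol

x_star = np.array([1.0, 2.0, 0.5])       # optimum of the previous LP
if not previous_optimum_survives(x_star, np.array([1.0, 0.0, 2.0]), 3.0):
    pass  # re-solve the LP (from scratch, or with whatever warm start the solver offers)
```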


I'm not sure your question about data structures is well-defined. Data structures are chosen based on the set of operations you intend to perform on them. In this kind of problem, it's usually the algorithm that is the more important part: you start with the algorithm, then choose a suitable data structure. My impression is that with LP, the hard part is the algorithms, and the data structures tend to be comparatively simple. So asking about sparse matrix data structures for your problem seems to be putting the cart before the horse.

D.W.

The state of the art for a MILP like you describe is sophisticated software like CPLEX (or some other expensive proprietary package). Such packages do take sparse linear algebra into account, but that is as much a matter of numerical stability as of efficiency; see for example http://www.gurobi.com/resources/getting-started/lp-basics. Those packages may also contain a feature to repair an almost-feasible solution, which you could use if the previous optimal solution is no longer feasible.

The integrality constraints mean that the access times of the data structure used to feed the algorithm are dominated by the running time of the optimization itself. On top of that, efficient approaches to MILPs are not particularly amenable to reusing previous results. If you drop the integrality constraints (or they are redundant), the situation is different.

If you use the simplex method to solve an LP and then perturb the problem slightly, you can use the state of the algorithm at termination (the final basis) to warm-start the new problem, provided the optimal solution doesn't change too much. What helps is that a local optimum of an LP is also a global one, so starting close to the previous optimal solution is beneficial.
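
For example (a sketch assuming the gurobipy API; other packages expose similar modify-and-reoptimize workflows): for a pure LP, Gurobi keeps the final simplex basis around, so after small changes to a right-hand side or a matrix coefficient the re-optimization can typically start from that basis instead of from scratch.

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("perturbed-lp")
x = m.addVar(lb=0.0, name="x")
y = m.addVar(lb=0.0, name="y")
c1 = m.addConstr(x + 2 * y <= 4, name="c1")
c2 = m.addConstr(3 * x + y >= 1, name="c2")
m.setObjective(2 * x + y, GRB.MINIMIZE)
m.optimize()                 # first solve, from scratch

# A small update arrives: one RHS and one matrix coefficient change.
c1.RHS = 4.1
m.chgCoeff(c2, x, 2.9)
m.optimize()                 # re-solve; the solver can warm-start from the previous basis
```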

If you start branching, this is no longer the case. Although finding an optimal solution is somewhat similar, proving that it is optimal requires ruling out that a better solution exists in every other branch. Branch and bound typically uses some binary tree to compactly represent the branches that have been ruled out. The problem is that efficient branch-and-bound schemes do not retain information that could be used to adapt this tree if you change the problem coefficients slightly. Basically, since branches are ruled out based on results from solved sub-problems, changing the problem might invalidate the grounds on which a branch was ruled out. But we don't know where that happens, so we need to re-verify every ruled-out branch. This is done most efficiently with a branching method again, which means we're back to square one (though it definitely helps to have a good initial feasible solution).
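
To see why, here is a bare-bones branch-and-bound sketch (using scipy.optimize.linprog for the relaxations; purely illustrative). The pruning step, where a relaxation bound that cannot beat the incumbent rules a branch out, is only valid for the coefficients that were in effect when that relaxation was solved, and that is exactly the information an update can invalidate.

```python
import numpy as np
from scipy.optimize import linprog

def branch_and_bound(c, A_ub, b_ub, bounds, int_idx, best_val=np.inf, best_x=None, tol=1e-6):
    """Minimize c.x subject to A_ub x <= b_ub and per-variable bounds,
    with the variables indexed by int_idx required to be integral."""
    rel = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not rel.success:                    # sub-problem infeasible: branch ruled out
        return best_val, best_x
    if rel.fun >= best_val - tol:          # bound cannot beat the incumbent: branch ruled out
        return best_val, best_x
    frac = [i for i in int_idx if abs(rel.x[i] - round(rel.x[i])) > tol]
    if not frac:                           # relaxation already integral: new incumbent
        return rel.fun, rel.x
    i = frac[0]                            # branch on the first fractional integer variable
    lo, hi = bounds[i]
    down = list(bounds); down[i] = (lo, np.floor(rel.x[i]))
    best_val, best_x = branch_and_bound(c, A_ub, b_ub, down, int_idx, best_val, best_x, tol)
    up = list(bounds); up[i] = (np.ceil(rel.x[i]), hi)
    best_val, best_x = branch_and_bound(c, A_ub, b_ub, up, int_idx, best_val, best_x, tol)
    return best_val, best_x

# Tiny example: maximize x0 + x1 (minimize the negation) with both variables integer.
val, x = branch_and_bound(np.array([-1.0, -1.0]),
                          np.array([[1.0, 2.0], [3.0, 1.0]]),
                          np.array([4.0, 6.0]),
                          [(0, None), (0, None)], [0, 1])
print(val, x)
```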

Of course this is assuming a general MILP (i.e. not some special case that is polynomial-time solvable), and that you use a sensible data structure (i.e. one with polynomial-time mutations).

Thomas Bosman