7

I started studying hypergraph theory a few days ago.

I know that a hypergraph is a tuple $H = (X, E)$ in which $E \subseteq \mathcal{P}(X)$, and that it generalises the notion of a graph.

Still, I'm wondering why they're useful. I saw this example in this paper. They explain that in the first sample I can't tell whether an author wrote more than one article, whereas in the second one (with the hypergraph representation) I can easily get this information.

But this is not true, right? I can always attach information to the edges or nodes to compute that. In addition, from what I understood, I can always represent hyperedges $e \in E$ as cliques, right? Hence, I can always (?) transform a hypergraph into a graph. I must be wrong.

My questions are:

  • Is the notion of hypergraphs really necessary?
  • Do hypergraphs and graphs have the same expressivity?
  • Can I represent something with hypergraphs which I cannot with graphs?
danin
  • 95
  • 1
    If you represent hyperedges as cliques, how do you distinguish between hyperedges and cliques? For example, how would you represent the hypergraph on $3$ vertices with $1$ hyperedge of all $3$ elements as a graph? – Servaes Nov 06 '20 at 14:05

1 Answer

8

If you try really hard, you can express anything with graphs, especially if you let yourself attach information to vertices or edges to help.

The way you're suggesting - replacing each hyperedge by a clique - is not the best, because it loses information. For example, you can't distinguish the single hyperedge $\{1,2,3,4,5\}$ from the three hyperedges $\{1,2,3\}$, $\{3,4,5\}$, and $\{1,2,4,5\}$: both produce the complete graph on $5$ vertices. You would need an edge-labeling saying which hyperedge each clique comes from, and that's awkward.
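To see the collision concretely, here is a minimal sketch in Python. The helper name `clique_expansion` is made up for illustration; it just replaces each hyperedge by all pairs of its elements.

```python
from itertools import combinations

def clique_expansion(hyperedges):
    """Replace every hyperedge by a clique; return the resulting set of 2-element edges."""
    edges = set()
    for e in hyperedges:
        # every pair of vertices inside the hyperedge becomes an ordinary edge
        edges.update(frozenset(pair) for pair in combinations(sorted(e), 2))
    return edges

one_big = [{1, 2, 3, 4, 5}]
three_small = [{1, 2, 3}, {3, 4, 5}, {1, 2, 4, 5}]

graph_a = clique_expansion(one_big)
graph_b = clique_expansion(three_small)

print(graph_a == graph_b)  # True: both are the complete graph on 5 vertices
print(len(graph_a))        # 10
```

Two different hypergraphs map to the same graph, so the clique expansion cannot be inverted without extra labels.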

The standard way to represent hypergraphs as graphs is with the incidence graph. Given a hypergraph $(X,E)$, its incidence graph is the bipartite graph with vertices $X$ on one side, vertices $E$ on the other side, and an edge $xe$ if $x \in e$ in the hypergraph.

Via this representation, (multi-)hypergraphs are actually equivalent to bipartite graphs (with a designated "$X$" side and "$E$" side). Any multi-hypergraph gives a bipartite graph, and any bipartite graph gives a multi-hypergraph. Theorems about one can be turned into theorems about the other.
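As a rough sketch of this equivalence (the function names and the choice to identify each hyperedge by its position in a list are my own assumptions, not a standard API), the incidence graph can be stored as a set of (vertex, hyperedge index) pairs, and the hypergraph can be recovered from it:

```python
def incidence_graph(hyperedges):
    """Bipartite edge set: the pair (x, j) means vertex x lies in hyperedge number j."""
    return {(x, j) for j, e in enumerate(hyperedges) for x in e}

def hypergraph_from_incidence(num_edges, pairs):
    """Inverse direction: rebuild the list of hyperedges from the bipartite edge set."""
    edges = [set() for _ in range(num_edges)]
    for x, j in pairs:
        edges[j].add(x)
    return edges

one_big = [{1, 2, 3, 4, 5}]
three_small = [{1, 2, 3}, {3, 4, 5}, {1, 2, 4, 5}]

inc_big = incidence_graph(one_big)
inc_small = incidence_graph(three_small)

# Unlike the clique expansion, the two hypergraphs now give different graphs ...
print(inc_big == inc_small)  # False
# ... and each hypergraph is recovered exactly from its incidence graph.
print(hypergraph_from_incidence(len(three_small), inc_small) == three_small)  # True
```

Indexing hyperedges by position means repeated hyperedges stay distinct, which is why this matches multi-hypergraphs rather than only simple ones.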

Sometimes we use hypergraphs anyway, because a concept is easier to express for the hypergraph than it is for the incidence graph. Many theorems about graphs have natural generalizations to hypergraphs, and restating those generalizations in terms of incidence graphs is very unnatural.

Misha Lavrov
  • 159,700
  • Thank you very much. So basically, it's a matter of convenience. I guess running algorithms on hypergraphs is not as computationally expensive as running them on graphs under certain circumstances, am I right? For instance, in the example I provided I can count the cardinality of the hyperedge corresponding to one author, while in the incidence graph I have to visit the graph to compute the same information. Does this sound sensible? – danin Nov 06 '20 at 14:26
  • 1
    Computational complexity isn't necessarily affected, because both for graphs and for hypergraphs, we haven't specified how we encode the information. For example, we can represent a hypergraph with its incidence matrix ($A_{ve}=1$ if vertex $v$ lies on edge $e$, $A_{ve}=0$ otherwise) and a bipartite graph with its biadjacency matrix ($A_{vw}=1$ if vertex $v$ is adjacent to $w$, where rows correspond to one side of the bigraph and columns to the other side). For a hypergraph and its incidence graph, these two matrices will be the same! So in this model, computational complexity is unchanged. – Misha Lavrov Nov 06 '20 at 14:30
  • Alright, fair enough! Thank you for helping me with this :) – danin Nov 06 '20 at 14:33
  • Complexity will be affected, I believe. The node distance in a bipartite graph might be considerably larger than in the same structure represented as a hypergraph. So for the purpose of random walks, a hypergraph might hit deeper nodes quicker. Correct? – Jonathan Jan 12 '23 at 17:44
  • @Jonathan That should only be a factor-of-$2$ difference, which we usually ignore in such a situation. (It could easily get swamped by an implementation-specific difference that comes from how we represent the data in each case.) – Misha Lavrov Jan 12 '23 at 18:50
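The factor of $2$ in the last comment can be checked on a small example. This is only a sketch under my own assumptions: the example hypergraph is made up, one "step" in the hypergraph means moving to any vertex that shares a hyperedge with the current one, and incidence-graph nodes are tagged `("v", x)` or `("e", j)`.

```python
from collections import deque

def bfs_distances(neighbors, source):
    """Plain breadth-first search; `neighbors` is a function node -> iterable of nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

vertices = [1, 2, 3, 4, 5, 6]
hyperedges = [{1, 2, 3}, {3, 4}, {4, 5, 6}]  # hypothetical example data

def hyper_neighbors(x):
    # one step in the hypergraph: any other vertex sharing a hyperedge with x
    return sorted({y for e in hyperedges if x in e for y in e if y != x})

def incidence_neighbors(node):
    # one step in the incidence graph: vertex -> hyperedge or hyperedge -> vertex
    kind, label = node
    if kind == "v":
        return [("e", j) for j, e in enumerate(hyperedges) if label in e]
    return [("v", x) for x in sorted(hyperedges[label])]

hyper_dist = bfs_distances(hyper_neighbors, 1)
inc_dist = bfs_distances(incidence_neighbors, ("v", 1))

# every vertex-to-vertex distance is exactly doubled in the incidence graph
doubled = all(inc_dist[("v", x)] == 2 * hyper_dist[x] for x in vertices)
print(doubled)                            # True
print(hyper_dist[6], inc_dist[("v", 6)])  # 3 6
```

Each hypergraph step corresponds to a vertex, hyperedge, vertex detour in the incidence graph, which is where the doubling comes from.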