The Networks Behind the Microbiome
Analyzing the Phenomenon of Microbiome Variability
- Fig. 1: Illustration of the ESABO method using a randomly generated 15-species microbial interaction network with ten positive and ten negative interactions. The example (A) of two binary abundance vectors (for species B and G from the network in (B)) combined via a logical AND shows the main statistical signal evaluated by ESABO. In (C) the ESABO reconstruction of the network from (B) based on simulated abundance vectors is shown.
- Fig. 2: ESABO result for the large set of measured abundance patterns from [5].
There are many examples in which the number of events in each category and the pattern of presences and absences of categories provide very different types of information. Imagine a diplomatic meeting involving a multitude of actors (companies, government agencies, NGOs, etc.). The total number of delegates attending from each of these actors will most likely provide information on the size and internal diversity of the actor (agency): A government body with a multitude of departments is likely to contribute a larger set of delegates than a single-purpose citizen initiative. The presences and absences, on the other hand, provide insight into strategic decisions. The fact that a particular agency opts to participate or not to participate (independently of the size of the delegation) is a highly informative data set in its own right.
In transcriptomics, the absolute expression level of a gene is often indicative of the gene product’s function: Typically the expression levels of (genes encoding) transcription factors are much lower than those of metabolic enzymes, while the “on” and “off” pattern of gene expression is in many cases rather informative about the underlying regulatory networks [1–4].
In our recent investigation [5] we tried to adapt such a presence/absence analysis to the phenomenon of microbiome variability.
Similarly to the case of gene expression mentioned above, our question was: Which interaction network is most compatible with the pattern of presences and absences of microbial organisms in a large set of measured microbiome compositions?
Analyzing bacterial abundances has received widespread attention during the last decade [6]. In various disciplines, ranging from marine soil microbiology to human microbiology, sequencing-based abundance estimation of bacterial taxa has become a widespread tool for gathering information that was previously ignored. The immense clinical potential of microbiome analysis is possibly still at its promising onset [7,8].
Among the many severe diseases that are difficult to diagnose and treat, chronic inflammatory diseases seem particularly in need of additional means of data acquisition, data analysis and computational modeling.
The study presented here [5] is embedded in the interdisciplinary collaboration sysINFLAME, which focuses on the systems biology of chronic inflammatory diseases and is funded by the systems medicine initiative of the German Ministry of Education and Research (BMBF). The data analysis pilot study [5] is based on a dataset collected and preprocessed in the research groups headed by Andre Franke, John Baines and Wolfgang Lieb from the Christian-Albrechts University Kiel, the Max Planck Institute for Evolutionary Biology Plön, and the University Clinics Schleswig-Holstein (UKSH), which hosts the data within the database Biobank PopGen. Abundances were derived using standard techniques from 16S rRNA sequencing data for samples from 822 healthy subjects.
The ESABO method introduced in [5] (ESABO stands for Entropy Shift of Abundance Vectors under Boolean Operations) uses the ‘binary’ paradigm introduced above to reconstruct microbial interaction networks from microbiome abundance patterns.
Two additional methodological layers of exploration become accessible through this step to a binary representation of the microbiome: (1) For each pair of microbial organisms (or taxa) we can statistically measure slight preferences, e.g., for co-occurrence (both taxa present, (1, 1)) or mutual exclusion (one taxon present, the other absent, (1, 0) or (0, 1)) in the binary abundance vectors across a large number of samples. (2) We can also simulate such binary abundance patterns using a variant of a Random Boolean Network (RBN) model [2,4] and then test and calibrate our method on these simulated binary abundance patterns.
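The pairwise statistics described in point (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `binarize` and `pair_pattern_counts`, the presence threshold of zero, and the read counts for the two taxa are all hypothetical.

```python
from collections import Counter

def binarize(abundances, threshold=0):
    """Map abundances to 1 (present) or 0 (absent).

    The threshold of 0 is an assumption made for this sketch.
    """
    return [1 if a > threshold else 0 for a in abundances]

def pair_pattern_counts(x, y):
    """Count how often each joint state (x_i, y_i) occurs across samples."""
    if len(x) != len(y):
        raise ValueError("both vectors must cover the same samples")
    counts = Counter(zip(x, y))
    # Report all four states, including those that never occur.
    return {state: counts.get(state, 0)
            for state in [(1, 1), (1, 0), (0, 1), (0, 0)]}

# Hypothetical read counts for two taxa across eight samples.
taxon_a = [12, 0, 5, 0, 7, 0, 3, 0]
taxon_b = [0, 4, 0, 9, 0, 2, 1, 0]

counts = pair_pattern_counts(binarize(taxon_a), binarize(taxon_b))
print(counts)  # {(1, 1): 1, (1, 0): 3, (0, 1): 3, (0, 0): 1}
```

In this toy data the mutually exclusive states (1, 0) and (0, 1) dominate, which is the kind of slight preference that would hint at a competitive relationship between the two taxa.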
Such statistical shifts in the composition of binary pairs show up most clearly in how the two binary abundance vectors ‘simplify’ under Boolean operations.
Figure 1A illustrates this procedure: The presence/absence vectors of two taxa across multiple samples are shown. These taxa could be the microbial species B and G shown in the small, fictitious 15-species interaction network in figure 1B, which interact competitively. As a consequence of the negative interaction, in the binary abundance vectors the presence of B tends to coincide with an absence of G and vice versa. Combining these two vectors using a logical AND therefore leads to an almost constant binary vector consisting mostly of zeros. This simplification creates a powerful statistical signal, which is evaluated in the ESABO analysis.
Formally, the ESABO score quantifies this simplification: it is the z-score of the entropy of the resulting vector relative to the entropies obtained from shuffled abundance vectors. If this z-score is smaller than –1, a negative (competitive) link between the two taxa is assumed. If the z-score is larger than +1, a positive (synergistic) link is assumed.
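The scoring rule just described can be sketched as follows. This is a minimal sketch, not the reference implementation: the names `binary_entropy`, `esabo_score` and `classify`, the choice to shuffle only one of the two vectors, the number of shuffles, the use of the population standard deviation, and the two toy vector pairs are assumptions made for illustration.

```python
import math
import random

def binary_entropy(vec):
    """Shannon entropy (in bits) of the 0/1 frequencies in a binary vector."""
    p = sum(vec) / len(vec)
    if p == 0.0 or p == 1.0:
        return 0.0  # a constant vector carries no entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def esabo_score(x, y, n_shuffles=500, seed=0):
    """z-score of the entropy of (x AND y) against a shuffled null model."""
    if len(x) != len(y):
        raise ValueError("both vectors must cover the same samples")
    rng = random.Random(seed)
    observed = binary_entropy([a & b for a, b in zip(x, y)])
    y_shuffled = list(y)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(y_shuffled)  # assumption: shuffle only one vector
        null.append(binary_entropy([a & b for a, b in zip(x, y_shuffled)]))
    mean = sum(null) / len(null)
    std = math.sqrt(sum((h - mean) ** 2 for h in null) / len(null))
    if std == 0.0:
        return 0.0  # null model has no spread; no signal can be assigned
    return (observed - mean) / std

def classify(z, threshold=1.0):
    """Apply the -1 / +1 thresholds stated in the text."""
    if z < -threshold:
        return "negative"
    if z > threshold:
        return "positive"
    return "none"

# Toy pair 1: perfect mutual exclusion, so the AND vector is all zeros.
z_neg = esabo_score([1, 0] * 20, [0, 1] * 20)
# Toy pair 2: two rare taxa that always occur together.
rare = [1] * 10 + [0] * 30
z_pos = esabo_score(rare, rare)

print(classify(z_neg), classify(z_pos))
```

For the mutually exclusive pair the observed entropy is zero while shuffled pairs overlap by chance, so the score is strongly negative. For the co-occurring rare pair the AND vector contains more ones than chance overlap would produce, so its entropy, and hence the score, is shifted upward.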
Figure 1C shows the ESABO reconstruction of the network from figure 1B when binary abundance vectors have been simulated using the variant of an RBN model mentioned above.
Application of the same analysis to measured abundance patterns on a phylum level yields the microbial interaction network shown in figure 2. It is well known that the main interactions among high-abundance phyla (like Bacteroidetes and Firmicutes) in such networks are predominantly inhibitory. Remarkably, the ESABO analysis additionally reveals a dense, systematic network of positive interactions among low-abundance species.
The microbiome is a topic of high theoretical fascination and dramatic practical relevance due to its association with disease phenotypes and its general relevance for human health [9,10].
We believe that the binary view implemented by the ESABO method [5] can help uncover some of the systematic properties of the microbiome and therefore contribute to a deeper theoretical understanding of microbiome compositions. But even on a very practical level the ESABO method can be used to interpret clinical data by providing candidate interaction networks.
On the theoretical side, the ESABO method can be applied to study the compatibility of networks on different phylogenetic levels. In subsequent work we intend to extend this method to the interpretation of individual microbiomes in the context of such a candidate interaction network derived from ESABO scores. In [5], one of the surprising findings was the large number of systematic, positive interactions, which complement the dominant negative interactions previously reported in the literature. This systematic contribution of low-abundance taxa (the ‘rare biosphere’) to the interaction pattern of microbial organisms in the human gut emphasizes their relevance to the metabolic function of the whole system.
The authors gratefully acknowledge financial support from the German Ministry for Education and Research (Bundesministerium für Bildung und Forschung, BMBF) within the e:med program (sysINFLAME project; grant 01ZX1306D).
Jens Christian Claussen¹ and Marc-Thorsten Hütt¹
¹ Department of Life Sciences and Chemistry, Jacobs University, Bremen, Germany
Prof. Dr. Marc-Thorsten Hütt
Priv.-Doz. Dr. Jens Christian Claussen
Department of Life Sciences & Chemistry
[1] Liang, S., Fuhrman, S., & Somogyi, R. (1998). Reveal, a general reverse engineering algorithm for inference of genetic network architectures. In Pacific Symposium on Biocomputing (Vol. 3, pp. 18-29). PMID: 9697168
[2] Kauffman, S. A. (1969). Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology, 22(3), 437-467.
[3] Bornholdt, S. (2005). Less is more in modeling large genetic networks. Science, 310(5747), 449-451. doi: 10.1126/science.1119959
[4] Li, F., Long, T., Lu, Y., Ouyang, Q., & Tang, C. (2004). The yeast cell-cycle network is robustly designed. Proceedings of the National Academy of Sciences of the United States of America, 101(14), 4781-4786. doi: 10.1073/pnas.0305937101
[5] Claussen, J., Skiecevicene, J., Wang, J., Rausch, P., Karlsen, T., Lieb, W., Baines, J., Franke, A., & Hütt, M. (2017). Boolean analysis reveals systematic interactions among low-abundance species in the human gut microbiome. PLOS Computational Biology, 13(6), e1005361.
[6] Human Microbiome Project Consortium (2012). A framework for human microbiome research. Nature, 486, 215-221. doi: 10.1038/nature11209
[7] Brown, J., De Vos, W. M., DiStefano, P. S., Dore, J., Huttenhower, C., Knight, R., Lawley, T. D., Raes, J., & Turnbaugh, P. (2013). Translating the human microbiome. Nature Biotechnology, 31(4), 304-308. doi: 10.1038/nbt.2543
[8] Clemente, J. C., Ursell, L. K., Parfrey, L. W., & Knight, R. (2012). The impact of the gut microbiota on human health: an integrative view. Cell, 148, 1258-1270. doi: 10.1016/j.cell.2012.01.035
[9] Cho, I., & Blaser, M. J. (2012). The human microbiome: at the interface of health and disease. Nature Reviews Genetics, 13(4), 260-270. doi: 10.1038/nrg3182
[10] Heinken, A., & Thiele, I. (2015). Systematic prediction of health-relevant human microbial co-metabolism through a computational framework. Gut Microbes, 6, 120-130. doi: 10.1080/19490976.2015.1023494