Prevalence of processed foods in major US grocery stores

Babak Ravandi, Gordana Ispirova, Michael Sebek, Peter Mehler, Albert-László Barabási & Giulia Menichetti

Nature Food (2025)

The offering of grocery stores is a strong driver of consumer decisions. While highly processed foods such as packaged products, processed meat and sweetened soft drinks have been increasingly associated with unhealthy diets, information on the degree of processing characterizing an item in a store is not straightforward to obtain, limiting the ability of individuals to make informed choices. GroceryDB, a database with over 50,000 food items sold by Walmart, Target and Whole Foods, shows the degree of processing of food items and potential alternatives in the surrounding food environment. The extensive data gathered on ingredient lists and nutrition facts enables a large-scale analysis of ingredient patterns and degrees of processing, categorized by store, food category and price range. Furthermore, it allows the quantification of the individual contribution of over 1,000 ingredients to ultra-processing. GroceryDB makes this information accessible, guiding consumers toward less processed food choices.

NetMedPy: A Python package for Large-Scale Network Medicine Screening

Andrés Aldana, Michael Sebek, Gordana Ispirova, Rodrigo Dorantes-Gilardi, Albert-László Barabási, Joseph Loscalzo, Giulia Menichetti

bioRxiv, 2024

Summary: Network medicine leverages the quantification of information flow within sub-cellular networks to elucidate disease etiology and comorbidity, as well as to predict drug efficacy and identify potential therapeutic targets. However, current Network Medicine toolsets often lack computationally efficient data processing pipelines that support diverse scoring functions, network distance metrics, and null models. These limitations hamper their application in large-scale molecular screening, hypothesis testing, and ensemble modeling. To address these challenges, we introduce NetMedPy, a highly efficient and versatile computational package designed for comprehensive Network Medicine analyses.

Availability: NetMedPy is an open-source Python package under an MIT license. Source code, documentation, and installation instructions can be downloaded from https://github.com/menicgiulia/NetMedPy and https://pypi.org/project/NetMedPy. The package can run on any standard desktop computer or computing cluster.

Human-AI coevolution

Dino Pedreschi, Luca Pappalardo, Emanuele Ferragina, Ricardo Baeza-Yates, Albert-László Barabási, Frank Dignum, Virginia Dignum, Tina Eliassi-Rad, Fosca Giannotti, János Kertész, Alistair Knott, Yannis Ioannidis, Paul Lukowicz, Andrea Passarella, Alex Sandy Pentland, John Shawe-Taylor, Alessandro Vespignani

Artificial Intelligence, Volume 339, 104244, 0004-3702 (2025)

Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices through online platforms. The interaction between users and AI results in a potentially endless feedback loop, wherein users' choices generate data to train AI models, which, in turn, shape subsequent user preferences. This human-AI feedback loop has peculiar characteristics compared to traditional human-machine interaction and gives rise to complex and often “unintended” systemic outcomes. This paper introduces human-AI coevolution as the cornerstone for a new field of study at the intersection between AI and complexity science focused on the theoretical, empirical, and mathematical investigation of the human-AI feedback loop. In doing so, we: (i) outline the pros and cons of existing methodologies and highlight shortcomings and potential ways for capturing feedback loop mechanisms; (ii) propose a reflection at the intersection between complexity science, AI and society; (iii) provide real-world examples for different human-AI ecosystems; and (iv) illustrate challenges to the creation of such a field of study, conceptualising them at increasing levels of abstraction, i.e., scientific, legal and socio-political.

A Network-Based Framework to Discover Treatment-Response–Predicting Biomarkers for Complex Diseases

Uday S. Shanthamallu, Casey Kilpatrick, Alex Jones, Jonathan Rubin, Alif Saleh, Albert-László Barabási, Viatcheslav R. Akmaev, Susan D. Ghiassian

The Journal of Molecular Diagnostics, Vol. 26, No. 10, 917–930

The potential of precision medicine to transform complex autoimmune disease treatment is often challenged by limited data availability and inadequate sample size when compared with the number of molecular features found in high-throughput multi-omics data sets. To address this issue, the novel framework PRoBeNet (Predictive Response Biomarkers using Network medicine) was developed. PRoBeNet operates under the hypothesis that the therapeutic effect of a drug propagates through a protein-protein interaction network to reverse disease states. PRoBeNet prioritizes biomarkers by considering i) therapy-targeted proteins, ii) disease-specific molecular signatures, and iii) an underlying network of interactions among cellular components (the human interactome). PRoBeNet helped discover biomarkers predicting patient responses to both an established autoimmune therapy (infliximab) and an investigational compound (a mitogen-activated protein kinase 3/1 inhibitor). The predictive power of PRoBeNet biomarkers was validated with retrospective gene-expression data from patients with ulcerative colitis and rheumatoid arthritis and prospective data from tissues from patients with ulcerative colitis and Crohn disease. Machine-learning models using PRoBeNet biomarkers significantly outperformed models using either all genes or randomly selected genes, especially when data were limited. These results illustrate the value of PRoBeNet in reducing features and for constructing robust machine-learning models when data are limited. PRoBeNet may be used to develop companion and complementary diagnostic assays, which may help stratify suitable patient subgroups in clinical trials and improve patient outcomes.

A key promise of precision medicine is the ability to match patient subgroups with the most appropriate treatments.1 This is achieved by discovering biomarkers that connect a patient's biological status with therapeutic outcomes for a specific therapy. In precision medicine, biomarkers can be discovered using machine-learning models that unveil complex, generalizable patterns from large molecular and clinical data sets, usually comprising data from hundreds to thousands of patients. For example, analyzing extensive molecular and clinical data sets from patients with cancer, machine-learning models found biomarkers that predict response to treatment in patients with diverse cancers.2–8 These models have substantially improved outcomes and survival rates for many cancer subtypes and greatly reduced the financial burden on health care payers.

Decoding the Foodome: Molecular Networks Connecting Diet and Health

Giulia Menichetti, Albert-László Barabási, and Joseph Loscalzo

Annual Review of Nutrition, 44(1), 257–288 (2024)

Diet, a modifiable risk factor, plays a pivotal role in most diseases, from cardiovascular disease to type 2 diabetes mellitus, cancer, and obesity. However, our understanding of the mechanistic role of the chemical compounds found in food remains incomplete. In this review, we explore the “dark matter” of nutrition, going beyond the macro- and micronutrients documented by national databases to unveil the exceptional chemical diversity of food composition. We also discuss the need to explore the impact of each compound in the presence of associated chemicals and relevant food sources and describe the tools that will allow us to do so. Finally, we discuss the role of network medicine in understanding the mechanism of action of each food molecule. Overall, we illustrate the important role of network science and artificial intelligence in our ability to reveal nutrition's multifaceted role in health and disease.

Measuring Entanglement in Physical Networks

Cory Glover and Albert-László Barabási

Physical Review Letters 133(7), 077401 (2024)

The links of a physical network cannot cross, which often forces the network layout into nonoptimal entangled states. Here we define a network fabric as a two-dimensional projection of a network and propose the average crossing number as a measure of network entanglement. We analytically derive the dependence of the average crossing number on network density, average link length, degree heterogeneity, and community structure and show that the predictions accurately estimate the entanglement of both network models and of real physical networks.
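As a rough illustration of the crossing-count idea in the abstract above, the sketch below counts how many pairs of straight links cross in a two-dimensional projection of a three-dimensional layout. It rests on several assumptions that are not taken from the paper: links are straight segments, only the three axis-aligned projections are averaged, and the function names (`count_crossings`, `project`, `mean_crossings_over_axes`) are hypothetical. The paper's own definition and averaging procedure may differ.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1, -1, or 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def count_crossings(pos, links):
    """Count crossing link pairs in a 2D layout.

    pos: dict mapping node -> (x, y); links: list of (u, v) node pairs.
    """
    total = 0
    for (u1, v1), (u2, v2) in combinations(links, 2):
        if {u1, v1} & {u2, v2}:
            continue  # links that share a node meet there; not counted as a crossing
        if segments_cross(pos[u1], pos[v1], pos[u2], pos[v2]):
            total += 1
    return total


def project(pos3d, axis):
    """Drop one coordinate (axis 0, 1, or 2) to obtain a 2D projection."""
    return {n: tuple(c for i, c in enumerate(p) if i != axis)
            for n, p in pos3d.items()}


def mean_crossings_over_axes(pos3d, links):
    """Average the crossing count over the three axis-aligned projections
    (an assumption made for illustration, not the paper's averaging)."""
    return sum(count_crossings(project(pos3d, ax), links) for ax in range(3)) / 3


# Toy layout: two links that cross only when viewed along the z axis.
pos3d = {"a": (0, 0, 0), "b": (1, 1, 0), "c": (0, 1, 1), "d": (1, 0, 1)}
links = [("a", "b"), ("c", "d")]
print(count_crossings(project(pos3d, 2), links))
print(mean_crossings_over_axes(pos3d, links))
```

The pairwise check is quadratic in the number of links, which is fine for small examples but would need a sweep-line or spatial index for large networks.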

Improving the performance and interpretability on medical datasets using graphical ensemble feature selection

Enzo Battistella, Dina Ghiassian, Albert-László Barabási

Bioinformatics, 40 (6) btae341 (2024)

Abstract

Motivation

A major hindrance towards using Machine Learning on medical datasets is the discrepancy between a large number of variables and small sample sizes. While multiple feature selection techniques have been proposed to avoid the resulting overfitting, overall ensemble techniques offer the best selection robustness. Yet, current methods designed to combine different algorithms generally fail to leverage the dependencies identified by their components. Here, we propose Graphical Ensembling (GE), a graph-theory-based ensemble feature selection technique designed to improve the stability and relevance of the selected features.

Results

Relying on four datasets, we show that GE increases classification performance with fewer selected features. For example, on rheumatoid arthritis patient stratification, GE outperforms the baseline methods by 9% Balanced Accuracy while relying on fewer features. We use data on sub-cellular networks to show that the selected features (proteins) are closer to the known disease genes, and the uncovered biological mechanisms are more diversified. By successfully tackling the complex correlations between biological variables, we anticipate that GE will improve the medical applications of machine learning.

Hidden citations obscure true impact in science

Xiangyi Meng, Onur Varol, Albert-László Barabási

PNAS nexus, 3(5) pgae155 (2024)

References, the mechanism scientists rely on to signal previous knowledge, lately have turned into widely used and misused measures of scientific impact. Yet, when a discovery becomes common knowledge, citations suffer from obliteration by incorporation. This leads to the concept of hidden citation, representing a clear textual credit to a discovery without a reference to the publication embodying it. Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations. We find that for influential discoveries hidden citations outnumber citation counts, emerging regardless of publishing venue and discipline. We show that the prevalence of hidden citations is not driven by citation counts, but rather by the degree of the discourse on the topic within the text of the manuscripts, indicating that the more discussed is a discovery, the less visible it is to standard bibliometric analysis. Hidden citations indicate that bibliometric measures offer a limited perspective on quantifying the true impact of a discovery, raising the need to extract knowledge from the full text of the scientific corpus.

Mapping philanthropic support of science

Louis M. Shekhtman, Alexander J. Gates & Albert-László Barabási

Scientific Reports 14, 9397 (2024)

While philanthropic support for science has increased in the past decade, there is limited quantitative knowledge about the patterns that characterize it and the mechanisms that drive its distribution. Here, we map philanthropic funding to universities and research institutions based on IRS tax forms from 685,397 non-profit organizations. We identify nearly one million grants supporting institutions involved in science and higher education, finding that in volume and scope, philanthropy is a significant source of funds, reaching an amount that rivals some of the key federal agencies like the NSF and NIH. Our analysis also reveals that philanthropic funders tend to focus locally, indicating that criteria beyond research excellence play an important role in funding decisions, and that funding relationships are stable, i.e. once a grant-giving relationship begins, it tends to continue in time. Finally, we show that the bipartite funder-recipient network displays a highly overrepresented motif indicating that funders who share one recipient also share other recipients and we show that this motif contains predictive power for future funding relationships. We discuss the policy implications of our findings on inequality in science, scientific progress, and the role of quantitative approaches to philanthropy.

Reproducible science of science at scale: pySciSci

Alexander J. Gates, Albert-László Barabási

Quantitative Science Studies 4 (3) 700–710 (2023)

Science of science (SciSci) is a growing field encompassing diverse interdisciplinary research programs that study the processes underlying science. The field has benefited greatly from access to massive digital databases containing the products of scientific discourse—including publications, journals, patents, books, conference proceedings, and grants. The subsequent proliferation of mathematical models and computational techniques for quantifying the dynamics of innovation and success in science has made it difficult to disentangle universal scientific processes from those dependent on specific databases, data-processing decisions, field practices, etc. Here we present pySciSci, a freely available and easily adaptable package for the analysis of large-scale bibliometric data. The pySciSci package standardizes access to many of the most common data sets in SciSci and provides efficient implementations of common and advanced analytical techniques.

The clinical trials puzzle: How network effects limit drug discovery

Kishore Vasan, Deisy Morselli Gysi, Albert-László Barabási

iScience 26(12), 108361 (2023)

The depth of knowledge offered by post-genomic medicine has carried the promise of new drugs, and cures for multiple diseases. To explore the degree to which this capability has materialized, we extract meta-data from 356,403 clinical trials spanning four decades, aiming to offer mechanistic insights into the innovation practices in drug discovery. We find that convention dominates over innovation, as over 96% of the recorded trials focus on previously tested drug targets, and the tested drugs target only 12% of the human interactome. If current patterns persist, it would take 170 years to target all druggable proteins. We uncover two network-based fundamental mechanisms that currently limit target discovery: preferential attachment, leading to the repeated exploration of previously targeted proteins; and local network effects, limiting exploration to proteins interacting with highly explored proteins. We build on these insights to develop a quantitative network-based model to enhance drug discovery in clinical trials.

A network-based normalized impact measure reveals successful periods of scientific discovery across disciplines

Qing Ke, Alexander J. Gates, and Albert-László Barabási

PNAS 120 (48) e2309378120 (2023)

The impact of a scientific publication is often measured by the number of citations it receives from the scientific community. However, citation count is susceptible to well-documented variations in citation practices across time and discipline, limiting our ability to compare different scientific achievements. Previous efforts to account for citation variations often rely on a priori discipline labels of papers, assuming that all papers in a discipline are identical in their subject matter. Here, we propose a network-based methodology to quantify the impact of an article by comparing it with locally comparable research, thereby eliminating the discipline label requirement. We show that the developed measure is not susceptible to discipline bias and follows a universal distribution for all articles published in different years, offering an unbiased indicator for impact across time and discipline. We then use the indicator to identify science-wide high impact research in the past half century and quantify its temporal production dynamics across disciplines, helping us identify breakthroughs from diverse, smaller disciplines, such as geosciences, radiology, and optics, as opposed to citation-rich biomedical sciences. Our work provides insights into the evolution of science and paves a way for fair comparisons of the impact of diverse contributions across many fields.

Who Supports American Art Museums? Introducing a New Dataset and Data Sources about Museum Funding

Albert-László Barabási, Louis Shekhtman

Panorama: Journal of the Association of Historians of American Art 9, no. 2 (Fall 2023)

“New Scrutiny of Museum Boards Takes Aim at World of Wealth and Status.” “Warren Kanders Quits Whitney Board after Tear Gas Protests.” “Julie Mehretu Becomes Third Artist Ever to Join Whitney Board.” These are all headlines that have run in the New York Times since 2019.1 Whether condemning how trustees have made their money or celebrating new and diverse perspectives added to boards, they are exemplary of the ways in which the funding of art museums in the United States is, of late, a divisive topic. In many other countries—especially in Europe—governments serve as the main source of support for the arts. In the United States, governmental support largely takes a back seat to funding from private individuals and foundations. Private donors, in particular, play a significant role not only as sources of financial support but also in taking on major governance roles as trustees of institutions.

This funding structure leads to important questions about what roles these donors play in museums and how they influence which works are displayed, institutional priorities, and myriad other issues—in addition to ethical questions about the sources of funds used to support art museums.2 For all the discussion of this topic, however, there is a paucity of data available to inform the conversation. This essay seeks to start rectifying that by showing the ways in which public tax filings of both museums and foundations that donate to museums (often called institutional donors) can create a dataset that allows scholars and cultural commentators to understand better who funds and governs art institutions in the United States. To supplement the tax data, we also use a corpus of museum annual reports that have been published online.

As network scientists, we often seek to bring large datasets to bear on subjects that may not have previously had significant quantitative data available as part of their analytical toolkit.3 We came to the topic of museum funding through another project that used crowdsourced data from the LittleSis database to understand how billionaires and their families were connected to a range of not-for-profits, including arts institutions.4 As figure 1 shows, certain institutions, such as the Museum of Modern Art (MoMA) in New York City and the Kennedy Center in Washington, DC, attract many billionaires, serving as the center of an elite network of wealthy donors, while others, like Pérez Art Museum Miami, are supported by just one billionaire—in this case the billionaire for whom the museum is named. This essay builds on that initial work on studying networks of billionaires and their philanthropic giving by focusing on philanthropic giving to art museums in the United States in particular. In line with Panorama’s focus on American art, we center our attention on the funding of “American art” by using a sample of museums that articulate their support of American art in their mission statements.

Impact of physicality on network structure

Márton Pósfai, Balázs Szegedy, Iva Bačić, Luka Blagojević, Miklós Abért, János Kertész, László Lovász & Albert-László Barabási

Nature Physics 20, 142–149 (2024)

The emergence of detailed maps of physical networks, such as the brain connectome, vascular networks or composite networks in metamaterials, whose nodes and links are physical entities, has demonstrated the limits of the current network science toolset. Link physicality imposes a non-crossing condition that affects both the evolution and the structure of a network, in a way that the adjacency matrix alone—the starting point of all graph-based approaches—cannot capture. Here, we introduce a meta-graph that helps us to discover an exact mapping between linear physical networks and independent sets, which is a central concept in graph theory. The mapping allows us to analytically derive both the onset of physical effects and the emergence of a jamming transition, and to show that physicality affects the network structure even when the total volume of the links is negligible. Finally, we construct the meta-graphs of several real physical networks, which allows us to predict functional features, such as synapse formation in the brain connectome, that agree with empirical data. Overall, our results show that, to understand the evolution and behaviour of real complex networks, the role of physicality must be fully quantified.

Non-Coding RNAs Improve the Predictive Power of Network Medicine

Deisy Morselli Gysi and Albert-László Barabási

PNAS 120 (45) e2301342120 (2023)

Network medicine has improved the mechanistic understanding of disease, offering quantitative insights into disease mechanisms, comorbidities, and novel diagnostic tools and therapeutic treatments. Yet, most network-based approaches rely on a comprehensive map of protein–protein interactions (PPI), ignoring interactions mediated by noncoding RNAs (ncRNAs). Here, we systematically combine experimentally confirmed binding interactions mediated by ncRNA with PPI, constructing a comprehensive network of all physical interactions in the human cell. We find that the inclusion of ncRNA expands the number of genes in the interactome by 46% and the number of interactions by 107%, significantly enhancing our ability to identify disease modules. Indeed, we find that 132 diseases lacked a statistically significant disease module in the protein-based interactome but have a statistically significant disease module after inclusion of ncRNA-mediated interactions, making these diseases accessible to the tools of network medicine. We show that the inclusion of ncRNAs helps unveil disease–disease relationships that were not detectable before and expands our ability to predict comorbidity patterns between diseases. Taken together, we find that including noncoding interactions improves both the breadth and the predictive accuracy of network medicine.

Quantifying hierarchy and prestige in US ballet academies as social predictors of career success

Yessica Herrera-Guzmán, Alexander J. Gates, Cristian Candia & Albert-László Barabási

Scientific Reports 13, 18594 (2023)

In the recent decade, we have seen major progress in quantifying the behaviors and the impact of scientists, resulting in a quantitative toolset capable of monitoring and predicting the career patterns of the profession. It is unclear, however, if this toolset applies to other creative domains beyond the sciences. In particular, while performance in the arts has long been difficult to quantify objectively, research suggests that professional networks and prestige of affiliations play a similar role to those observed in science, hence they can reveal patterns underlying successful careers. To test this hypothesis, here we focus on ballet, as it allows us to investigate in a quantitative fashion the interplay of individual performance, institutional prestige, and network effects. We analyze data on competition outcomes from 6363 ballet students affiliated with 1603 schools in the United States, who participated in the Youth America Grand Prix (YAGP) between 2000 and 2021. Through multiple logit models and matching experiments, we provide evidence that schools’ strategic network position bridging between communities captures social prestige and predicts the placement of students into jobs in ballet companies. This work reveals the importance of institutional prestige on career success in ballet and showcases the potential of network science approaches to provide quantitative viewpoints for the professional development of careers beyond science.

Network medicine framework reveals generic herb-symptom effectiveness of traditional Chinese medicine

Xiao Gan, Zixin Shu, Xinyan Wang, Dengying Yan, Jun Li, Shany Ofaim, Réka Albert, Xiaodong Li, Baoyan Liu, Xuezhong Zhou, and Albert-László Barabási

Science Advances 9, eadh0215 (2023)

Understanding natural and traditional medicine can lead to world-changing drug discoveries. Despite the therapeutic effectiveness of individual herbs, traditional Chinese medicine (TCM) lacks a scientific foundation and is often considered a myth. In this study, we establish a network medicine framework and reveal the general TCM treatment principle as the topological relationship between disease symptoms and TCM herb targets on the human protein interactome. We find that proteins associated with a symptom form a network module, and the network proximity of an herb’s targets to a symptom module is predictive of the herb’s effectiveness in treating the symptom. These findings are validated using patient data from a hospital. We highlight the translational value of our framework by predicting herb-symptom treatments with therapeutic potential. Our network medicine framework reveals the scientific foundation of TCM and establishes a paradigm for understanding the molecular basis of natural medicine and predicting disease treatments.
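To make "network proximity of an herb's targets to a symptom module" concrete, the sketch below computes one commonly used proximity measure in network medicine: the average, over targets, of the shortest-path distance to the nearest module protein. This is an assumption for illustration; the abstract does not state which proximity variant the paper uses, and the function names and the toy network are hypothetical. In practice such a raw distance is usually compared against randomized node sets before drawing conclusions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def closest_proximity(adj, targets, module):
    """Mean over targets of the distance to the nearest module node.

    Targets that cannot reach any module node are skipped; returns
    float('inf') if no target can reach the module.
    """
    module = set(module)
    total = 0
    counted = 0
    for t in targets:
        dist = bfs_distances(adj, t)
        reachable = [dist[m] for m in module if m in dist]
        if reachable:
            total += min(reachable)
            counted += 1
    return total / counted if counted else float("inf")


# Toy interactome: a path A - B - C - D - E (hypothetical protein names).
adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
herb_targets = ["A", "C"]       # A is 3 steps from D, C is 1 step from D
symptom_module = ["D", "E"]
print(closest_proximity(adj, herb_targets, symptom_module))
```

A smaller value means the targets sit closer to the module in the network; a target that belongs to the module contributes a distance of zero.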

Philanthropy in art: locality, donor retention, and prestige

Louis Michael Shekhtman & Albert-László Barabási

Scientific Reports 13, 12157 (2023)

A significant portion of funding for art comes from foundations, representing a key revenue stream for most art organizations. Little is known, however, about the quantitative patterns that govern art funding, limiting the fundraising efficiency of organizations in need of resources, as well as optimal funding allocation of donors. To address these shortcomings, here we relied on the IRS e-file dataset to identify $36B in grants from 46,643 foundations to 48,766 art recipients between 2010 and 2019, allowing us to quantify donor-recipient relationships in art. We find that philanthropic giving is broadly distributed, following a stable power-law distribution, indicating that some funders give considerably and predictably more than others. Giving is highly localized, with 60% of grants and funds going to recipients in the donor’s state. Furthermore, donors often support multiple local organizations that offer distinct artforms, rather than advancing a particular subarea within art. Donor retention is strong, with nearly 70% of relationships continuing the next year. Finally, we explored the role of institutional prestige in foundation giving, finding that funding does correlate with prestige, with notable exceptions. Our results present the largest and most comprehensive data-driven exploration of giving by foundations to art to date, unveiling multiple insights that could benefit both donors and recipients.
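The locality and retention figures quoted above are simple ratios over donor-recipient grant records. The sketch below shows one plausible way to compute them from a list of `(donor, recipient, year)` tuples. The record layout, the function names, and the exact definitions (for example, pooling retention over all years except the last, and counting grants rather than dollar amounts) are assumptions for illustration, not the paper's procedure, and the toy data are invented.

```python
def retention_rate(grants):
    """Fraction of donor-recipient pairs active in year y that are also
    active in year y + 1, pooled over every year except the last one."""
    pairs_by_year = {}
    for donor, recipient, year in grants:
        pairs_by_year.setdefault(year, set()).add((donor, recipient))
    if not pairs_by_year:
        return 0.0
    last_year = max(pairs_by_year)
    kept = 0
    total = 0
    for year, pairs in pairs_by_year.items():
        if year == last_year:
            continue  # no following year to compare against
        next_pairs = pairs_by_year.get(year + 1, set())
        total += len(pairs)
        kept += len(pairs & next_pairs)
    return kept / total if total else 0.0


def local_share(grants, state_of):
    """Fraction of grants whose donor and recipient are in the same state."""
    grants = list(grants)
    if not grants:
        return 0.0
    same = sum(1 for donor, recipient, _ in grants
               if state_of[donor] == state_of[recipient])
    return same / len(grants)


# Invented toy records: (donor, recipient, year).
grants = [
    ("F1", "A", 2010),
    ("F1", "B", 2010),
    ("F2", "A", 2010),
    ("F1", "A", 2011),
    ("F2", "C", 2011),
]
state_of = {"F1": "MA", "F2": "NY", "A": "MA", "B": "NY", "C": "NY"}

print(retention_rate(grants))          # one of three 2010 pairs recurs in 2011
print(local_share(grants, state_of))   # three of five grants stay in-state
```

Weighting each grant by its dollar amount instead of counting grants would give the share of funds rather than the share of grants.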

Assessment of community efforts to advance network-based prediction of protein–protein interactions

Xu-Wen Wang, Lorenzo Madeddu, Kerstin Spirohn, Leonardo Martini, Adriano Fazzone, Luca Becchetti, Thomas P. Wytock, István A. Kovács, Olivér M. Balogh, Bettina Benczik, Mátyás Pétervári, Bence Ágg, Péter Ferdinandy, Loan Vulliard, Jörg Menche, Stefania Colonnese, Manuela Petti, Gaetano Scarano, Francesca Cuomo, Tong Hao, Florent Laval, Luc Willems, Jean-Claude Twizere, Marc Vidal, Michael A. Calderwood, Enrico Petrillo, Albert-László Barabási, Edwin K. Silverman, Joseph Loscalzo, Paola Velardi & Yang-Yu Liu

Nature Communications 14, 1582 (2023)

Comprehensive understanding of the human protein-protein interaction (PPI) network, aka the human interactome, can provide important insights into the molecular mechanisms of complex biological processes and diseases. Despite the remarkable experimental efforts undertaken to date to determine the structure of the human interactome, many PPIs remain unmapped. Computational approaches, especially network-based methods, can facilitate the identification of previously uncharacterized PPIs. Many such methods have been proposed. Yet, a systematic evaluation of existing network-based methods in predicting PPIs is still lacking. Here, we report community efforts initiated by the International Network Medicine Consortium to benchmark the ability of 26 representative network-based methods to predict PPIs across six different interactomes of four different organisms: A. thaliana, C. elegans, S. cerevisiae, and H. sapiens. Through extensive computational and experimental validations, we found that advanced similarity-based methods, which leverage the underlying network characteristics of PPIs, show superior performance over other general link prediction methods in the interactomes we considered.

Machine learning prediction of the degree of food processing

Giulia Menichetti, Babak Ravandi, Dariush Mozaffarian & Albert-László Barabási

Nature Communications 14, 2312 (2023)

Despite the accumulating evidence that increased consumption of ultra-processed food has adverse health implications, it remains difficult to decide what constitutes processed food. Indeed, the current processing-based classification of food has limited coverage and does not differentiate between degrees of processing, hindering consumer choices and slowing research on the health implications of processed food. Here we introduce a machine learning algorithm that accurately predicts the degree of processing for any food, indicating that over 73% of the US food supply is ultra-processed. We show that the increased reliance of an individual’s diet on ultra-processed food correlates with higher risk of metabolic syndrome, diabetes, angina, elevated blood pressure and biological age, and reduces the bio-availability of vitamins. Finally, we find that replacing foods with less processed alternatives can significantly reduce the health implications of ultra-processed food, suggesting that access to information on the degree of processing, currently unavailable to consumers, could improve population health.

Improving the generalizability of protein-ligand binding predictions with AI-Bind

Ayan Chatterjee, Robin Walters, Zohair Shafi, Omair Shafi Ahmed, Michael Sebek, Deisy Gysi, Rose Yu, Tina Eliassi-Rad, Albert-László Barabási & Giulia Menichetti

Nature Communications 14, 1989 (2023)

Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.

Accelerating network layouts using graph neural networks

Csaba Both, Nima Dehmamy, Rose Yu & Albert-László Barabási

Nature Communications 14, 1560 (2023)

Graph layout algorithms used in network visualization represent the first and the most widely used tool to unveil the inner structure and the behavior of complex networks. Current network visualization software relies on the force directed layout (FDL) algorithm, whose high computational complexity makes the visualization of large real networks computationally prohibitive and traps large graphs into high energy configurations, resulting in hard-to-interpret “hairball” layouts. Here we use Graph Neural Networks (GNN) to accelerate FDL, showing that deep learning can address both limitations of FDL: it offers a 10 to 100 fold improvement in speed while also yielding layouts which are more informative. We analytically derive the speedup offered by GNN, relating it to the number of outliers in the eigenspectrum of the adjacency matrix, predicting that GNNs are particularly effective for networks with communities and local regularities. Finally, we use GNN to generate a three-dimensional layout of the Internet, and introduce additional measures to assess the layout quality and its interpretability, exploring the algorithm’s ability to separate communities and the link-length distribution. The novel use of deep neural networks can help accelerate other network-based optimization problems as well, with applications from reaction-diffusion systems to epidemics.

Genomics and phenomics of body mass index reveals a complex disease network

Huang, J., Huffman, JE., Huang, Y., Do Valle, I., Assimes, TL., Raghavan, S., Voight, B.F., Liu, C., Barabasi, A.-L., Huang, RDL., Hui, Q., Nguyen, X-M T., Ho, Y.-L., Djousse, L., Lynch, J.A., Vujkovic, M., Tcheandjieu, C., Tang, H., Damrauer, SM., Reaven, P.D., Miller, D., Phillips, L.S., Ng, MCY., Graff, M., Haiman, C.A., Loos, RJF., North, KE., Yengo, L., Smith, GD., Saleheen, D., Gaziano, JM., Rader, DJ., Tsao, PS., Cho, K., Chang, K-M., Wilson, PWF., VA Million Veteran Program, Sun, Y.V., O’Donnell, CJ.

Nature Communications 13, 7973 (2022)

Elevated body mass index (BMI) is heritable and associated with many health conditions that impact morbidity and mortality. The study of the genetic association of BMI across a broad range of common disease conditions offers the opportunity to extend current knowledge regarding the breadth and depth of adiposity-related diseases. We identify 906 (364 novel) and 41 (6 novel) genome-wide significant loci for BMI among participants of European (N~1.1 million) and African (N~100,000) ancestry, respectively. Using a BMI genetic risk score including 2446 variants, 316 diagnoses are associated in the Million Veteran Program, with 96.5% showing increased risk. A co-morbidity network analysis reveals seven disease communities containing multiple interconnected diseases associated with BMI as well as extensive connections across communities. Mendelian randomization analysis confirms numerous phenotypes across a breadth of organ systems, including conditions of the circulatory (heart failure, ischemic heart disease, atrial fibrillation), genitourinary (chronic renal failure), respiratory (respiratory failure, asthma), musculoskeletal and dermatologic systems that are deeply interconnected within and across the disease communities. This work shows that the complex genetic architecture of BMI associates with a broad range of major health conditions, supporting the need for comprehensive approaches to prevent and treat obesity.

Research gaps and opportunities in precision nutrition: an NIH workshop report

Bruce Y Lee, José M Ordovás, Elizabeth J Parks, Cheryl A M Anderson, Albert-László Barabási, Steven K Clinton, Kayla de la Haye, Valerie B Duffy, Paul W Franks, Elizabeth M Ginexi, Kristian J Hammond, Erin C Hanlon, Michael Hittle, Emily Ho, Abigail L Horn, Richard S Isaacson, Patricia L Mabry, Susan Malone, Corby K Martin, Josiemer Mattei, Simin Nikbin Meydani, Lorene M Nelson, Marian L Neuhouser, Brendan Parent, Nicolaas P Pronk, Helen M Roche, Suchi Saria, Frank A J L Scheer, Eran Segal, Mary Ann Sevick, Tim D Spector, Linda Van Horn, Krista A Varady, Venkata Saroja Voruganti, Marie F Martinez

The American Journal of Clinical Nutrition, 116 (6), P1877-1900 (2022)

Precision nutrition is an emerging concept that aims to develop nutrition recommendations tailored to different people's circumstances and biological characteristics. Responses to dietary change and the resulting health outcomes from consuming different diets may vary significantly between people based on interactions between their genetic backgrounds, physiology, microbiome, underlying health status, behaviors, social influences, and environmental exposures. On 11–12 January 2021, the National Institutes of Health convened a workshop entitled “Precision Nutrition: Research Gaps and Opportunities” to bring together experts to discuss the issues involved in better understanding and addressing precision nutrition. The workshop proceeded in 3 parts: part I covered many aspects of genetics and physiology that mediate the links between nutrient intake and health conditions such as cardiovascular disease, Alzheimer disease, and cancer; part II reviewed potential contributors to interindividual variability in dietary exposures and responses such as baseline nutritional status, circadian rhythm/sleep, environmental exposures, sensory properties of food, stress, inflammation, and the social determinants of health; part III presented the need for systems approaches, with new methods and technologies that can facilitate the study and implementation of precision nutrition, and workforce development needed to create a new generation of researchers. The workshop concluded that much research will be needed before more precise nutrition recommendations can be achieved. This includes better understanding and accounting for variables such as age, sex, ethnicity, medical history, genetics, and social and environmental factors. The advent of new methods and technologies and the availability of considerably more data bring tremendous opportunity. 
However, the field must proceed with appropriate levels of caution and make sure the factors listed above are all considered, and systems approaches and methods are incorporated. It will be important to develop and train an expanded workforce with the goal of reducing health disparities and improving precision nutritional advice for all Americans.

Fragmentation of outage clusters during the recovery of power distribution grids

H Wu, X Meng, MM Danziger, SP Cornelius, H Tian, AL Barabási.

Nature Communications 13, 7372 (2022)

The understanding of recovery processes in power distribution grids is limited by the lack of realistic outage data, especially large-scale blackout datasets. By analyzing data from three electrical companies across the United States, we find that the recovery duration of an outage is connected with the downtime of its nearby outages and blackout intensity (defined as the peak number of outages during a blackout), but is independent of the number of customers affected. We present a cluster-based recovery framework to analytically characterize the dependence between outages, and interpret the dominant role blackout intensity plays in recovery. The recovery of blackouts is not random and has a universal pattern that is independent of the disruption cause, the post-disaster network structure, and the detailed repair strategy. Our study reveals that suppressing blackout intensity is a promising way to speed up restoration.

Maximizing Brain Networks engagement via Individualized Connectome-wide Target Search

Menardi, A., Momi, D., Vallesi, A., Barabasi, A.-L., Towlson, E.K., Santarnecchi, E.

Brain Stimulation 15(6), 1418-1431 (2022)

Background

In recent years, the possibility to noninvasively interact with the human brain has led to unprecedented diagnostic and therapeutic opportunities. However, the vast majority of approved interventions and approaches still rely on anatomical landmarks and rarely on the individual structure of networks in the brain, drastically reducing the potential efficacy of neuromodulation.

Objective

Here we implemented a target search algorithm leveraging mathematical tools from Network Control Theory (NCT) and whole brain connectomics analysis. By means of computational simulations, we aimed to identify the optimal stimulation target(s), at the individual brain level, capable of reaching maximal engagement of the stimulated networks’ nodes.

Results

At the model level, in silico predictions suggest that stimulation of NCT-derived cerebral sites might induce significantly higher network engagement, compared to traditionally employed neuromodulation sites, demonstrating NCT to be a useful tool in guiding brain stimulation. Indeed, NCT allows us to computationally model different stimulation scenarios tailored to the individual structural connectivity profiles and initial brain states.

Conclusions

The use of NCT to computationally predict TMS pulse propagation suggests that individualized targeting is crucial for more successful network engagement. Future studies will be needed to verify such prediction in real stimulation scenarios.

MilkyBase, a database of human milk composition as a function of maternal-, infant- and measurement conditions

Tünde Pacza, Mayara L. Martins, Maha Rockaya, Katalin Müller, Ayan Chatterjee, Albert-László Barabási & József Baranyi

Scientific Data 9, 557 (2022)

This study describes the development of a database, called MilkyBase, of the biochemical composition of human milk. The data were selected, digitized and curated partly by machine-learning, partly manually from publications. The database can be used to find patterns in the milk composition as a function of maternal-, infant- and measurement conditions and as a platform for users to put their own data in the format shown here. The database is an Excel workbook of linked sheets, making it easy to input data by non-computationally minded nutritionists. The hierarchical organisation of the fields makes sure that statistical inference methods can be programmed to analyse the data. Uncertainty quantification and recording dynamic (time-dependent) compositions offer predictive potentials.

Visualizing Novel Connections and Genetic Similarities Across Diseases Using a Network Medicine Based Approach

Ferolito, B., Do Valle, I.F., Gerlovin, H., Costa, L., Casas, JP, Gaziano, J.M., Gagnon, D.R., Begoli, E. B., Barabasi, A.-L., Cho, K.

Scientific Reports 12, 14914 (2022)

Understanding the genetic relationships between human disorders could lead to better treatment and prevention strategies, especially for individuals with multiple comorbidities. A common resource for studying genetic-disease relationships is the GWAS Catalog, a large and well curated repository of SNP-trait associations from various studies and populations. Some of these populations are contained within mega-biobanks such as the Million Veteran Program (MVP), which has enabled the genetic classification of several diseases in a large well-characterized and heterogeneous population. Here we aim to provide a network of the genetic relationships among diseases and to demonstrate the utility of quantifying the extent to which a given resource such as MVP has contributed to the discovery of such relations. We use a network-based approach to evaluate shared variants among thousands of traits in the GWAS Catalog repository. Our results indicate many more novel disease relationships that did not exist in early studies and demonstrate that the network can reveal clusters of diseases mechanistically related. Finally, we show novel disease connections that emerge when MVP data is included, highlighting methodology that can be used to indicate the contributions of a given biobank.
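The shared-variant idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `shared_variant_network`, the input format, and the toy SNP-trait pairs are all hypothetical, and no statistical filtering of edges is attempted.

```python
from collections import defaultdict
from itertools import combinations

def shared_variant_network(associations):
    """Link traits that share at least one associated variant.

    associations: iterable of (variant_id, trait) pairs.
    Returns {(trait_a, trait_b): number_of_shared_variants}, with each
    pair stored once in alphabetical order.
    """
    # Group traits by the variant they are associated with.
    traits_by_variant = defaultdict(set)
    for variant, trait in associations:
        traits_by_variant[variant].add(trait)

    # Every pair of traits sharing a variant gains one unit of edge weight.
    weights = defaultdict(int)
    for traits in traits_by_variant.values():
        for pair in combinations(sorted(traits), 2):
            weights[pair] += 1
    return dict(weights)

# Hypothetical SNP-trait pairs, for illustration only.
toy = [
    ("rs1", "type 2 diabetes"), ("rs1", "obesity"),
    ("rs2", "type 2 diabetes"), ("rs2", "obesity"),
    ("rs2", "hypertension"), ("rs3", "asthma"),
]
network = shared_variant_network(toy)
```

In this toy input, traits linked only through a private variant (here "asthma") stay isolated, while edge weights count how many variants a pair of traits has in common.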

Identification of potent inhibitors of SARS-CoV-2 infection by combined pharmacological evaluation and cellular network prioritization

J.J. Patten, Patrick T. Keiser, Deisy Morselli-Gysi, Giulia Menichetti, Hiroyuki Mori, Callie J. Donahue, Xiao Gan, Italo do Valle, Kathleen Geoghegan-Barek, Manu Anantpadma, RuthMabel Boytz, Jacob L. Berrigan, Sarah H. Stubbs, Tess Ayazika, Colin O’Leary, Sallieu Jalloh, Florence Wagner, Seyoum Ayehunie, Stephen J. Elledge, Deborah Anderson, Joseph Loscalzo, Marinka Zitnik, Suryaram Gummuluru, Mark N. Namchuk, Albert-László Barabási and Robert A. Davey

iScience 25(9) 104925 (2022)

Pharmacologically active compounds with known biological targets were evaluated for inhibition of SARS-CoV-2 infection in cell and tissue models to help identify potent classes of active small molecules and to better understand host-virus interactions. We evaluated 6,710 clinical and preclinical compounds targeting 2,183 host proteins by immunocytofluorescence-based screening to identify SARS-CoV-2 infection inhibitors. Computationally integrating relationships between small molecule structure, dose-response antiviral activity, host target, and cell interactome produced cellular networks important for infection. This analysis revealed 389 small molecules with micromolar to low nanomolar activities, representing >12 scaffold classes and 813 host targets. Representatives were evaluated for mechanism of action in stable and primary human cell models with SARS-CoV-2 variants and MERS-CoV. One promising candidate, obatoclax, significantly reduced SARS-CoV-2 viral lung load in mice. Ultimately, this work establishes a rigorous approach for future pharmacological and computational identification of host factor dependencies and treatments for viral diseases.

Network-medicine framework for studying disease trajectories in U.S. veterans

Do Valle, I.F., Ferolito, B., Gerlovin, H., Costa, L., Demissie, S., Linares, F., Cohen, J., Gagnon, D.R., Gaziano, J.M., Begoli, E., Cho, K., Barabasi, A.-L.

Scientific Reports 12, 12018 (2022)

A better understanding of the sequential and temporal aspects in which diseases occur in patients’ lives is essential for developing improved intervention strategies that reduce burden and increase the quality of health services. Here we present a network-based framework to study disease relationships using Electronic Health Records from > 9 million patients in the United States Veterans Health Administration (VHA) system. We create the Temporal Disease Network, which maps the sequential aspects of disease co-occurrence among patients, and demonstrate that network properties reflect clinical aspects of the respective diseases. We use the Temporal Disease Network to identify disease groups that reflect patterns of disease co-occurrence and the flow of patients among diagnoses. Finally, we define a strategy for the identification of trajectories that lead from one disease to another. The framework presented here has the potential to offer new insights for disease treatment and prevention in large health care systems.
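As a rough illustration of the sequential co-occurrence idea above, the sketch below counts how often one diagnosis precedes another within the same patient's record and keeps those counts as weighted directed edges. The record format, the function name `temporal_edges`, and the toy diagnoses are hypothetical; the paper's actual network construction and statistical filtering are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def temporal_edges(records):
    """Count directed disease pairs (earlier -> later) across patients.

    records: dict mapping a patient id to a list of (day, diagnosis) tuples.
    Each ordered pair is counted at most once per patient.
    """
    edges = Counter()
    for visits in records.values():
        # First day on which each diagnosis appears for this patient.
        first_seen = {}
        for day, dx in sorted(visits):
            first_seen.setdefault(dx, day)
        ordered = sorted(first_seen.items(), key=lambda item: item[1])
        for (dx_a, day_a), (dx_b, day_b) in combinations(ordered, 2):
            if day_a < day_b:  # skip same-day diagnoses: no clear order
                edges[(dx_a, dx_b)] += 1
    return edges

# Hypothetical toy data, for illustration only.
toy = {
    "p1": [(0, "hypertension"), (40, "diabetes"), (90, "kidney disease")],
    "p2": [(5, "hypertension"), (5, "obesity"), (60, "diabetes")],
}
edges = temporal_edges(toy)
```

Counting each ordered pair once per patient keeps frequent visitors from dominating the edge weights; a real analysis would also need to test each edge against a null model before interpreting it.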

Nutrient concentrations in food display universal behaviour

Giulia Menichetti and Albert-László Barabási

Nature Food 3, 375–382 (2022)

Extensive programmes around the world endeavour to measure and catalogue the composition of food. Here we analyse the nutrient content of the full US food supply and show that the concentration of each nutrient follows a universal single-parameter scaling law that accurately captures the eight orders of magnitude in nutrient content variability. We show that the universality is rooted in the biochemical constraints obeyed by the metabolic pathways responsible for nutrient modulation, allowing us to confirm the empirically observed scaling law and to predict its variability in agreement with the data. We propose that the natural nutrient variability in food can be quantitatively formalized. This provides a mathematical rationale for imputing missing values in food composition databases and paves the way towards a quantitative understanding of the impact of food processing on nutrient balance and health effects.

Dynamics of ranking

Gerardo Iñiguez, Carlos Pineda, Carlos Gershenson, & Albert-László Barabási

Nature Communications 13, 1646 (2022)

Virtually anything can be and is ranked: people, institutions, countries, words, genes. Rankings reduce complex systems to ordered lists, reflecting the ability of their elements to perform relevant functions, and are being used from socioeconomic policy to knowledge extraction. A century of research has found regularities when temporal rank data is aggregated. Far less is known, however, about how rankings change in time. Here we explore the dynamics of 30 rankings in natural, social, economic, and infrastructural systems, comprising millions of elements and timescales from minutes to centuries. We find that the flux of new elements determines the stability of a ranking: for high flux only the top of the list is stable, otherwise top and bottom are equally stable. We show that two basic mechanisms, displacement and replacement of elements, capture empirical ranking dynamics. The model uncovers two regimes of behavior: fast and large rank changes, or slow diffusion. Our results indicate that the balance between robustness and adaptability in ranked systems might be governed by simple random processes irrespective of system details.
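The two mechanisms named above, displacement and replacement, can be mimicked with a toy simulation. The sketch below is an illustrative assumption rather than the paper's exact model: the parameter names `p_displace` and `p_replace`, the uniform choice of positions, and the function names are all hypothetical.

```python
import random

def step(ranking, next_id, p_displace, p_replace, rng):
    """Apply one update to a ranked list (index 0 is the top rank).

    Displacement: a random element moves to a random position.
    Replacement: a random element is swapped out for a brand-new one.
    Returns the next unused element id.
    """
    if rng.random() < p_displace:
        element = ranking.pop(rng.randrange(len(ranking)))
        ranking.insert(rng.randrange(len(ranking) + 1), element)
    if rng.random() < p_replace:
        ranking[rng.randrange(len(ranking))] = next_id
        next_id += 1
    return next_id

def simulate(size, steps, p_displace, p_replace, seed=0):
    """Run the toy dynamics and return the final ranking."""
    rng = random.Random(seed)
    ranking = list(range(size))  # ids 0..size-1 are the original elements
    next_id = size
    for _ in range(steps):
        next_id = step(ranking, next_id, p_displace, p_replace, rng)
    return ranking

final = simulate(size=20, steps=500, p_displace=0.5, p_replace=0.1)
# Original elements still present after the run.
survivors = sum(1 for element in final if element < 20)
```

Raising `p_replace` increases the flux of new elements into the list, while `p_displace` only reshuffles existing ones, which makes it easy to explore how the two knobs affect the stability of the list.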

Recovery coupling in multilayer networks

Michael M. Danziger & Albert-László Barabási

Nature Communications 13, 955 (2022)

The increased complexity of infrastructure systems has resulted in critical interdependencies between multiple networks—communication systems require electricity, while the normal functioning of the power grid relies on communication systems. These interdependencies have inspired an extensive literature on coupled multilayer networks, assuming a hard interdependence, where a component failure in one network causes failures in the other network, resulting in a cascade of failures across multiple systems. While empirical evidence of such hard failures is limited, the repair and recovery of a network requires resources typically supplied by other networks, resulting in documented interdependencies induced by the recovery process. In this work, we explore recovery coupling, capturing the dependence of the recovery of one system on the instantaneous functional state of another system. If the support networks are not functional, recovery will be slowed. Here we collected data on the recovery time of millions of power grid failures, finding evidence of universal nonlinear behavior in recovery following large perturbations. We develop a theoretical framework to address recovery coupling, predicting quantitative signatures different from the multilayer cascading failures. We then rely on controlled natural experiments to separate the role of recovery coupling from other effects like resource limitations, offering direct evidence of how recovery coupling affects a system’s functionality.

Quantifying NFT-driven networks in crypto art

Kishore Vasan, Milán Janosov & Albert-László Barabási

Scientific Reports 12, 2769 (2022)

The evolution of the art ecosystem is driven by largely invisible networks, defined by undocumented interactions between artists, institutions, collectors and curators. The emergence of cryptoart, and the NFT-based digital marketplace around it, offers unprecedented opportunities to examine the mechanisms that shape the evolution of networks that define artistic practice. Here we mapped the Foundation platform, identifying over 48,000 artworks through the associated NFTs listed by over 15,000 artists, allowing us to characterize the patterns that govern the networks that shape artistic success. We find that NFT adoption by both artists and collectors has undergone major changes, starting with a rapid growth that peaked in March 2021 and the emergence of a new equilibrium in June. Despite significant changes in activity, the average price of the sold art remained largely unchanged, with the price of an artist’s work fluctuating in a range that determines his or her reputation. The artist invitation network offers evidence of rich and poor artist clusters, driven by homophily, indicating that the newly invited artists develop similar engagement and sales patterns as the artist who invited them. We find that successful artists receive disproportional, repeated investment from a small group of collectors, underscoring the importance of artist–collector ties in the digital marketplace. These reproducible patterns allow us to characterize the features, mechanisms, and the networks enabling the success of individual artists, a quantification necessary to better understand the emerging NFT ecosystem.

Network medicine framework for identifying drug-repurposing opportunities for COVID-19

Deisy Morselli Gysi, Ítalo do Valle, Marinka Zitnik, Asher Ameli, Xiao Gan, Onur Varol, Susan Dina Ghiassian, J. J. Patten, Robert A. Davey, Joseph Loscalzo, and Albert-László Barabási

PNAS 118 (19) e2025581118 (2021)

The COVID-19 pandemic has highlighted the need to quickly and reliably prioritize clinically approved compounds for their potential effectiveness for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs experimentally screened in VeroE6 cells, as well as the list of drugs in clinical trials that capture the medical community’s assessment of drugs with potential COVID-19 efficacy. We find that no single predictive algorithm offers consistently reliable outcomes across all datasets and metrics. This outcome prompted us to develop a multimodal technology that fuses the predictions of all algorithms, finding that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We screened in human cells the top-ranked drugs, obtaining a 62% success rate, in contrast to the 0.8% hit rate of nonguided screenings. Of the six drugs that reduced viral infection, four could be directly repurposed to treat COVID-19, proposing novel treatments for COVID-19. We also found that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these network drugs rely on network-based mechanisms that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.

Network medicine framework shows that proximity of polyphenol targets and disease proteins predicts therapeutic effects of polyphenols

Italo F. do Valle, Harvey G. Roweth, Michael W. Malloy, Sofia Moco, Denis Barron, Elisabeth Battinelli, Joseph Loscalzo & Albert-László Barabási

Nature Food 2, 143–155 (2021)

Polyphenols, natural products present in plant-based foods, play a protective role against several complex diseases through their antioxidant activity and by diverse molecular mechanisms. Here we develop a network medicine framework to uncover mechanisms for the effects of polyphenols on health by considering the molecular interactions between polyphenol protein targets and proteins associated with diseases. We find that the protein targets of polyphenols cluster in specific neighbourhoods of the human interactome, whose network proximity to disease proteins is predictive of the molecule’s known therapeutic effects. The methodology recovers known associations, such as the effect of epigallocatechin-3-O-gallate on type 2 diabetes, and predicts that rosmarinic acid has a direct impact on platelet function, representing a novel mechanism through which it could affect cardiovascular health. We experimentally confirm that rosmarinic acid inhibits platelet aggregation and α-granule secretion through inhibition of protein tyrosine phosphorylation, offering direct support for the predicted molecular mechanism. Our framework represents a starting point for mechanistic interpretation of the health effects underlying food-related compounds, allowing us to integrate into a predictive framework knowledge on food metabolism, bioavailability and drug interaction.
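To make the notion of network proximity concrete, the sketch below computes one commonly used variant, the "closest" distance: for each compound target, take the shortest-path distance to the nearest disease protein, then average over targets. This is a minimal sketch on a hypothetical toy graph; the function names are illustrative, and the paper's exact proximity measure and its significance testing against randomized protein sets are not reproduced here.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closest_distance(graph, targets, disease_proteins):
    """Mean, over targets, of the distance to the nearest disease protein."""
    total = 0
    for target in targets:
        dist = bfs_distances(graph, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if not reachable:
            raise ValueError(f"{target} cannot reach any disease protein")
        total += min(reachable)
    return total / len(targets)

# Hypothetical toy interactome: a simple chain A - B - C - D - E.
toy_graph = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"],
}
proximity = closest_distance(toy_graph, targets=["A", "C"], disease_proteins=["D", "E"])
```

A raw distance like this depends on how well connected the proteins are, so in practice it is usually compared with the same quantity computed for randomly chosen protein sets before drawing conclusions.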

A wealth of discovery built on the Human Genome Project — by the numbers

Alexander J. Gates, Deisy Morselli Gysi, Manolis Kellis & Albert-László Barabási

Nature 590, 212-215 (2021)

The 20th anniversary of the publication of the first draft of the human genome offers an opportunity to track how the project has empowered research into the genetic roots of human disease, changed drug discovery and helped to revise the idea of the gene itself.

Here we distill these impacts and trends. We combined several data sets to quantify the different types of genetic element that have been discovered and that generated publications, and how the pattern of discovery and publishing has changed over the years. Our analysis linked together data including RNA transcripts; around 1 million single nucleotide polymorphisms (SNPs); human diseases with documented genetic roots; approved and experimental pharmaceuticals; and scientific publications between 1900 and 2017.

Social network structure and composition in former NFL football players

Amar Dhand, Liam McCafferty, Rachel Grashow, Ian M. Corbin, Sarah Cohan, Alicia J. Whittington, Ann Connor, Aaron Baggish, Mark Weisskopf, Ross Zafonte, Alvaro Pascual-Leone & Albert-László Barabási

Scientific Reports 11, 1630 (2021)

Social networks have broad effects on health and quality of life. Biopsychosocial factors may also modify the effects of brain trauma on clinical and pathological outcomes. However, social network characterization is missing in studies of contact sports athletes. Here, we characterized the personal social networks of former National Football League players compared to non-football US males. In 303 former football players and 269 US males, we found that network structure (e.g., network size) did not differ, but network composition (e.g., proportion of family versus friends) did differ. Football players had more men than women, and more friends than family in their networks compared to US males. Black players had more racially diverse networks than White players and US males. These results are unexpected because brain trauma and chronic illnesses typically cause diminished social relationships. We anticipate our study will inform more multi-dimensional study of, and treatment options for, contact sports athletes. For example, the strong allegiances of former athletes may be harnessed in the form of social network interventions after brain trauma. Because preserving health of contact sports athletes is a major goal, the study of social networks is critical to the design of future research and treatment trials.

Uncovering the genetic blueprint of the C. elegans nervous system

István A. Kovács, Dániel L. Barabási, and Albert-László Barabási

PNAS 117 (52) 33570-33577 (2020)

A fundamental question of neuroscience is how the brain wires itself. Here, we propose a modeling framework that explains how cellular connectivity emerges from neuronal identity, allowing us to offer experimentally falsifiable predictions on the genetic encoding of the connectome. The rapid advances in brain science require quantitative frameworks to integrate genetic and connectome information. The proposed model responds to this need, helping us unveil the genetically driven mechanisms that govern the formation of individual links in the brain.

A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease

Soodabeh Milanlouei, Giulia Menichetti, Yanping Li, Joseph Loscalzo, Walter C. Willett & Albert-László Barabási

Nature Communications 11, 6074 (2020)

Environmental factors, and in particular diet, are known to play a key role in the development of Coronary Heart Disease. Many of these factors were unveiled by detailed nutritional epidemiology studies, focusing on the role of a single nutrient or food at a time. Here, we apply an Environment-Wide Association Study approach to Nurses’ Health Study data to explore comprehensively and agnostically the association of 257 nutrients and 117 foods with coronary heart disease risk (acute myocardial infarction and fatal coronary heart disease). After accounting for multiple testing, we identify 16 food items and 37 nutrients that show statistically significant association – while adjusting for potential confounding and control variables such as physical activity, smoking, calorie intake, and medication use – among which 38 associations were validated in Nurses’ Health Study II. Our implementation of Environment-Wide Association Study successfully reproduces prior knowledge of diet-coronary heart disease associations in the epidemiological literature, and helps us detect new associations that were only marginally studied, opening potential avenues for further extensive experimental validation. We also show that Environment-Wide Association Study allows us to identify a bipartite food-nutrient network, highlighting which foods drive the associations of specific nutrients with coronary heart disease risk.

Isotopy and energy of physical networks

Yanchen Liu, Nima Dehmamy & Albert-László Barabási

Nature Physics (2020)

While the structural characteristics of a network are uniquely determined by its adjacency matrix, in physical networks, such as the brain or the vascular system, the network’s three-dimensional layout also affects the system’s structure and function. We lack, however, the tools to distinguish physical networks with identical wiring but different geometrical layouts. To address this need, here we introduce the concept of network isotopy, representing different network layouts that can be transformed into one another without link crossings, and show that a single quantity, the graph linking number, captures the entangledness of a layout, defining distinct isotopy classes. We find that a network’s elastic energy depends linearly on the graph linking number, indicating that each local tangle offers an independent contribution to the total energy. This finding allows us to formulate a statistical model for the formation of tangles in physical networks. We apply the developed framework to a diverse set of real physical networks, finding that the mouse connectome is more entangled than expected based on optimal wiring.

Exploring food contents in scientific literature with FoodMine

Forrest Hooton, Giulia Menichetti & Albert-László Barabási

Scientific Reports 10, 16191 (2020)

Thanks to the many chemical and nutritional components it carries, diet critically affects human health. However, the currently available comprehensive databases on food composition cover only a tiny fraction of the total number of chemicals present in our food, focusing on the nutritional components essential for our health. Indeed, thousands of other molecules, many of which have well documented health implications, remain untracked. To explore the body of knowledge available on food composition, we built FoodMine, an algorithm that uses natural language processing to identify papers from PubMed that potentially report on the chemical composition of garlic and cocoa. After extracting from each paper information on the reported quantities of chemicals, we find that the scientific literature carries extensive information on the detailed chemical components of food that is currently not integrated in databases. Finally, we use unsupervised machine learning to create chemical embeddings, finding that the chemicals identified by FoodMine tend to have direct health relevance, reflecting the scientific community’s focus on health-related chemicals in our food.

 

Science, advocacy, and quackery in nutritional books: an analysis of conflicting advice and purported claims of nutritional best-sellers

Rebecca M. Marton, Xindi Wang, Albert-László Barabási & John P. A. Ioannidis

Palgrave Communications volume 6, Article number: 43 (2020)

Nutritional decisions may be important for health, and yet identifying trustworthy sources of advice can be difficult. Many people turn to books for nutritional advice, making the contents of these books and the expertise of their authors relevant to public health. Here, the top 100 best-selling books were identified and assessed for both the claims they make in their summaries and the credentials of the authors. Weight loss was a common theme in the summaries of nutritional best-selling books. In addition to weight loss, 31 of the books promised to cure or prevent a host of diseases, including diabetes, heart disease, cancer, and dementia; however, the nutritional advice given to achieve these outcomes varied widely in terms of which types of foods should be consumed or avoided, and this information was often contradictory between books. Recommendations regarding the consumption of carbohydrates, dairy, proteins, and fat in particular differed greatly between books. To determine the qualifications of each author in making nutritional claims, the highest earned degree and listed occupations of each author were researched and analyzed. Out of 83 unique authors, 33 had an M.D. or Ph.D. degree. Twenty-eight of the authors were physicians, three were dietitians, and other authors held a wide range of jobs, including personal trainers, bloggers, and actors. Of 20 authors who had or claimed university affiliations, seven had a current university appointment that could be verified online in university directories. This study illuminates the range of the incongruous information being dispersed to the public and emphasizes the need for future efforts to improve the dissemination of sound nutritional advice.

Historical comparison of gender inequality in scientific careers across countries and disciplines

Junming Huang, Alexander J. Gates, Roberta Sinatra, and Albert-László Barabási

PNAS March 3, 2020 117 (9) 4609-4616

There is extensive, yet fragmented, evidence of gender differences in academia suggesting that women are underrepresented in most scientific disciplines and publish fewer articles throughout a career, and their work acquires fewer citations. Here, we offer a comprehensive picture of longitudinal gender differences in performance through a bibliometric analysis of academic publishing careers by reconstructing the complete publication history of over 1.5 million gender-identified authors whose publishing career ended between 1955 and 2010, covering 83 countries and 13 disciplines. We find that, paradoxically, the increase of participation of women in science over the past 60 years was accompanied by an increase of gender differences in both productivity and impact. Most surprisingly, though, we uncover two gender invariants, finding that men and women publish at a comparable annual rate and have equivalent career-wise impact for the same size body of work. Finally, we demonstrate that differences in publishing career lengths and dropout rates explain a large portion of the reported career-wise differences in productivity and impact, although productivity differences still remain. This comprehensive picture of gender inequality in academia can help rephrase the conversation around the sustainability of women’s careers in academia, with important consequences for institutions and policy makers.

The exposome and health: Where chemistry meets biology

Roel Vermeulen, Emma L. Schymanski, Albert-László Barabási, Gary W. Miller

Science 24 Jan 2020: 367, 6476, 392-396

Despite extensive evidence showing that exposure to specific chemicals can lead to disease, current research approaches and regulatory policies fail to address the chemical complexity of our world. To safeguard current and future generations from the increasing number of chemicals polluting our environment, a systematic and agnostic approach is needed. The “exposome” concept strives to capture the diversity and range of exposures to synthetic chemicals, dietary constituents, psychosocial stressors, and physical factors, as well as their corresponding biological responses. Technological advances such as high-resolution mass spectrometry and network science have allowed us to take the first steps toward a comprehensive assessment of the exposome. Given the increased recognition of the dominant role that nongenetic factors play in disease, an effort to characterize the exposome at a scale comparable to that of the human genome is warranted.

Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

Nima Dehmamy, Albert-László Barabási, Rose Yu

NeurIPS 32 2019

To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding paths of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We analyze theoretically the expressiveness of GCNs, concluding that a modular GCN design, using different propagation rules with residual connections, could significantly improve the performance of GCN. We demonstrate that such modular designs are capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that depth is much more influential than width, with deeper GCNs being more capable of learning higher order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs.
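
As a rough illustration of what a graph moment can look like, the sketch below assumes one common reading: the order-p moment of node i is the number of walks of length p starting at i, that is, the i-th row sum of the p-th power of the adjacency matrix. This definition and the helper name `node_moments` are assumptions made for illustration; the paper may define or normalize moments differently.

```python
import numpy as np

def node_moments(adj, max_order):
    """Return an array of shape (max_order, n) whose row p-1 holds,
    for every node, the number of walks of length p starting there
    (row sums of adj**p). Hypothetical helper, not the paper's code."""
    adj = np.asarray(adj, dtype=float)
    vec = np.ones(adj.shape[0])
    moments = []
    for _ in range(max_order):
        vec = adj @ vec          # A^p 1, built up one power at a time
        moments.append(vec.copy())
    return np.stack(moments)

# Path graph 0 - 1 - 2
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
moments = node_moments(path, 2)
```

Here the first-order moments are simply the node degrees, while higher orders probe progressively larger neighborhoods.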

The unmapped chemical complexity of our diet

Albert-László Barabási, Giulia Menichetti & Joseph Loscalzo

Nature Food 1, 33-37 (2019)

Our understanding of how diet affects health is limited to 150 key nutritional components that are tracked and catalogued by the United States Department of Agriculture and other national databases. Although this knowledge has been transformative for health sciences, helping unveil the role of calories, sugar, fat, vitamins and other nutritional factors in the emergence of common diseases, these nutritional components represent only a small fraction of the more than 26,000 distinct, definable biochemicals present in our food—many of which have documented effects on health but remain unquantified in any systematic fashion across different individual foods. Using new advances such as machine learning, a high-resolution library of these biochemicals could enable the systematic study of the full biochemical spectrum of our diets, opening new avenues for understanding the composition of what we eat, and how it affects health and disease.

A Genetic Model of the Connectome

Dániel L. Barabási, Albert-László Barabási

Neuron 105, 1-11 2019

The connectomes of organisms of the same species show remarkable architectural and often local wiring similarity, raising the question: where and how is neuronal connectivity encoded? Here, we start from the hypothesis that the genetic identity of neurons guides synapse and gap-junction formation and show that such genetically driven wiring predicts the existence of specific biclique motifs in the connectome. We identify a family of large, statistically significant biclique subgraphs in the connectomes of three species and show that within many of the observed bicliques the neurons share statistically significant expression patterns and morphological characteristics, supporting our expectation of common genetic factors that drive the synapse formation within these subgraphs. The proposed connectome model offers a self-consistent framework to link the genetics of an organism to the reproducible architecture of its connectome, offering experimentally falsifiable predictions on the genetic factors that drive the formation of individual neuronal circuits.

Synthetic ablations in the C. elegans nervous system

Emma K. Towlson and Albert-László Barabási

Network Neuroscience 2020, pp. 1–17

Synthetic lethality, the finding that the simultaneous knockout of two or more individually nonessential genes leads to cell or organism death, has offered a systematic framework to explore cellular function, and also offered therapeutic applications. Yet the concept lacks its parallel in neuroscience—a systematic knowledge base on the role of double or higher order ablations in the functioning of a neural system. Here, we use the framework of network control to systematically predict the effects of ablating neuron pairs and triplets on the gentle touch response. We find that surprisingly small sets of 58 pairs and 46 triplets can reduce muscle controllability in this context, and that these sets are localized in the nervous system in distinct groups. Further, they lead to highly specific experimentally testable predictions about mechanisms of loss of control, and which muscle cells are expected to experience this loss.

Nature’s reach: narrow work has broad impact

Alexander J. Gates, Qing Ke, Onur Varol & Albert-László Barabási

Nature 575, 32-34 (2019)

How knowledge informs and alters disciplines is itself an enlightening and vibrant field. This type of meta research into new findings, insights, conceptual frameworks and techniques is important, among other things, for policymakers who fund research in the hope of tackling society’s most pressing challenges, which inevitably span disciplines.

Since its founding in 1869, Nature has offered a venue for publishing major advances from many fields. To mark its anniversary, we track here how papers cite and are cited across disciplines, using data on tens of millions of scientific articles indexed in Clarivate Analytics’ Web of Science (WoS), a bibliometric database that encompasses many thousands of research journals starting from 1900. We pay particular attention to articles that appeared in Nature. In our view, this snapshot, for all its idiosyncrasies, reveals how scientific work is ever more becoming a mixture of disciplines.

Success in books: predicting book sales before publication

Xindi Wang, Burcu Yucesoy, Onur Varol, Tina Eliassi-Rad, Albert-László Barabási

EPJ Data Science 8: 31 (2019)

Reading remains a preferred leisure activity fueling an exceptionally competitive publishing market: among more than three million books published each year, only a tiny fraction are read widely. It is largely unpredictable, however, which book that will be, and how many copies it will sell. Here we aim to unveil the features that affect the success of books by predicting a book’s sales prior to its publication. We do so by employing the Learning to Place machine learning approach, which can predict sales for both fiction and nonfiction books as well as explain the predictions by comparing and contrasting each book with similar ones. We analyze features contributing to the success of a book by feature importance analysis, finding that a strong driving factor of book sales across all genres is the publishing house. We also uncover differences between genres: for thrillers and mystery, the publishing history of an author (as measured by previous book sales) is highly important, while in literary fiction and religion, the author’s visibility plays a more central role. These observations provide insights into the driving forces behind success within the current publishing industry, as well as how individuals choose what books to read.

Network-based prediction of protein interactions

István A. Kovács, Katja Luck, Kerstin Spirohn, Yang Wang, Carl Pollis, Sadie Schlabach, Wenting Bian, Dae-Kyum Kim, Nishka Kishore, Tong Hao, Michael A. Calderwood, Marc Vidal & Albert-László Barabási

Nature Communications 10, Article number: 1240 (2019)

Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other’s partners. This approach, that mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer mechanistic insights into disease mechanisms and can complement future experimental efforts to complete the human interactome.
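
The idea of scoring candidate interactions by paths of length three can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an undirected, unweighted network given as a dense adjacency matrix, and it assumes a degree normalization in which each path X-U-V-Y is weighted by 1/sqrt(k_U k_V), where k is the node degree. The function name `l3_scores` is hypothetical.

```python
import numpy as np

def l3_scores(adj):
    """Score every non-adjacent node pair by its degree-normalized
    count of length-3 paths (illustrative sketch)."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    # Middle edge U-V weighted by 1 / sqrt(k_U * k_V)
    middle = adj * inv_sqrt[:, None] * inv_sqrt[None, :]
    scores = adj @ middle @ adj
    # Keep only candidate (currently unconnected, distinct) pairs.
    # For such pairs every length-3 walk is a genuine path.
    scores[adj > 0] = 0.0
    np.fill_diagonal(scores, 0.0)
    return scores

# Path graph 0 - 1 - 2 - 3: the only length-3 path joins nodes 0 and 3
chain = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
scores = l3_scores(chain)
```

Ranking the unconnected pairs by this score yields the candidate list; in the toy chain, only the pair (0, 3) receives a nonzero score.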

Network-based prediction of drug combinations

Feixiong Cheng, István A. Kovács & Albert-László Barabási

Nature Communications 10, 1197 (2019)

Drug combinations, offering increased therapeutic efficacy and reduced toxicity, play an important role in treating multiple complex diseases. Yet, our ability to identify and validate effective combinations is limited by a combinatorial explosion, driven by both the large number of drug pairs and the number of dosage combinations. Here we propose a network-based methodology to identify clinically efficacious drug combinations for specific diseases. By quantifying the network-based relationship between drug targets and disease proteins in the human protein–protein interactome, we show the existence of six distinct classes of drug–drug–disease combinations. Relying on approved drug combinations for hypertension and cancer, we find that only one of the six classes correlates with therapeutic effects: if the targets of the drugs both hit the disease module, but target separate neighborhoods. This finding allows us to identify and validate antihypertensive combinations, offering a generic, powerful network methodology to identify efficacious combination therapies in drug development.

Taking Census of Physics

Federico Battiston, Federico Musciotto, Dashun Wang, Albert-László Barabási, Michael Szell, and Roberta Sinatra

Nature Reviews Physics 1, 89-97 (2019)

Over the past decades, the diversity of areas explored by physicists has exploded, encompassing new topics from biophysics and chemical physics to network science. However, it is unclear how these new subfields emerged from the traditional subject areas and how physicists explore them. To map out the evolution of physics subfields, here, we take an intellectual census of physics by studying physicists’ careers. We use a large-scale publication data set, identify the subfields of 135,877 physicists and quantify their heterogeneous birth, growth and migration patterns among research areas. We find that the majority of physicists began their careers in only three subfields, branching out to other areas at later career stages, with different rates and transition times. Furthermore, we analyse the productivity, impact and team sizes across different subfields, finding drastic changes attributable to the recent rise in large-scale collaborations. This detailed, longitudinal census of physics can inform resource allocation policies and provide students, editors and scientists with a broader view of the field’s internal dynamics.

The Universal Decay of Collective Memory and Attention

Cristian Candia, C. Jara-Figueroa, Carlos Rodriguez-Sickert, Albert-László Barabási, and César A. Hidalgo

Nature Human Behaviour 3, 82–91 (2019)

Collective memory and attention are sustained by two channels: oral communication (communicative memory) and the physical recording of information (cultural memory). Here, we use data on the citation of academic articles and patents, and on the online attention received by songs, movies and biographies, to describe the temporal decay of the attention received by cultural products. We show that, once we isolate the temporal dimension of the decay, the attention received by cultural products decays following a universal biexponential function. We explain this universality by proposing a mathematical model based on communicative and cultural memory, which fits the data better than previously proposed log-normal and exponential models. Our results reveal that biographies remain in our communicative memory the longest (20–30 years) and music the shortest (about 5.6 years). These findings show that the average attention received by cultural products decays following a universal biexponential function.
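
For readers unfamiliar with the term, a biexponential function is a sum of two exponentials with different rates, so attention drops quickly at first and more slowly later. The sketch below uses a generic parameterization (two amplitudes and two rates); it is an assumption for illustration and not the specific form or parameter values derived in the paper.

```python
import numpy as np

def biexponential(t, a_fast, rate_fast, a_slow, rate_slow):
    """Generic biexponential decay: a fast component plus a slow one.
    Parameter names are illustrative, not taken from the paper."""
    t = np.asarray(t, dtype=float)
    return a_fast * np.exp(-rate_fast * t) + a_slow * np.exp(-rate_slow * t)

# Early on both terms contribute; at late times only the slow term survives
start = biexponential(0.0, 3.0, 1.0, 2.0, 0.1)
late = biexponential(50.0, 3.0, 1.0, 2.0, 0.1)
```

On a logarithmic attention axis such a curve shows two roughly linear segments with different slopes, which is how the two regimes are usually told apart.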

The Chaperone Effect in Scientific Publishing

Vedran Sekara, Pierre Deville, Sebastian E. Ahnert, Albert-László Barabási, Roberta Sinatra, and Sune Lehmann

PNAS 115:50, 12603-12607 (2018)

Experience plays a critical role in crafting high-impact scientific work. This is particularly evident in top multidisciplinary journals, where a scientist is unlikely to appear as senior author if he or she has not previously published within the same journal. Here, we develop a quantitative understanding of author order by quantifying this “chaperone effect,” capturing how scientists transition into senior status within a particular publication venue. We illustrate that the chaperone effect has a different magnitude for journals in different branches of science, being more pronounced in medical and biological sciences and weaker in natural sciences. Finally, we show that in the case of high-impact venues, the chaperone effect has significant implications, specifically resulting in a higher average impact relative to papers authored by new principal investigators (PIs). Our findings shed light on the role played by experience in publishing within specific scientific journals, on the paths toward acquiring the necessary experience and expertise, and on the skills required to publish in prestigious venues.

A Structural Transition in Physical Networks

Nima Dehmamy, Soodabeh Milanlouei & Albert-László Barabási

Nature 563, 676–680 (2018)

In many physical networks, including neurons in the brain, three-dimensional integrated circuits and underground hyphal networks, the nodes and links are physical objects that cannot intersect or overlap with each other. To take this into account, non-crossing conditions can be imposed to constrain the geometry of networks, which consequently affects how they form, evolve and function. However, these constraints are not included in the theoretical frameworks that are currently used to characterize real networks. Most tools for laying out networks are variants of the force-directed layout algorithm, which assumes dimensionless nodes and links, and are therefore unable to reveal the geometry of densely packed physical networks. Here we develop a modelling framework that accounts for the physical sizes of nodes and links, allowing us to explore how non-crossing conditions affect the geometry of a network. For small link thicknesses, we observe a weakly interacting regime in which link crossings are avoided via local link rearrangements, without altering the overall geometry of the layout compared to the force-directed layout. Once the link thickness exceeds a threshold, a strongly interacting regime emerges in which multiple geometric quantities, such as the total link length and the link curvature, scale with the link thickness. We show that the crossover between the two regimes is driven by the non-crossing condition, which allows us to derive the transition point analytically and show that networks with large numbers of nodes will ultimately exist in the strongly interacting regime. We also find that networks in the weakly interacting regime display a solid-like response to stress, whereas in the strongly interacting regime they behave in a gel-like fashion. Networks in the weakly interacting regime are amenable to 3D printing and so can be used to visualize network geometry, and the strongly interacting regime provides insights into the scaling of the sizes of densely packed mammalian brains.

Quantifying Reputation and Success in Art

Samuel P. Fraiberger, Roberta Sinatra, Magnus Resch, Christoph Riedl, Albert-László Barabási

Science 08 Nov 2018: eaau7224 DOI: 10.1126/science.aau7224

In areas of human activity where performance is difficult to quantify in an objective fashion, reputation and networks of influence play a key role in determining access to resources and rewards. To understand the role of these factors, we reconstructed the exhibition history of half a million artists, mapping out the coexhibition network that captures the movement of art between institutions. Centrality within this network captured institutional prestige, allowing us to explore the career trajectory of individual artists in terms of access to coveted institutions. Early access to prestigious central institutions offered life-long access to high-prestige venues and reduced dropout rate. By contrast, starting at the network periphery resulted in a high dropout rate, limiting access to central institutions. A Markov model predicts the career trajectory of individual artists and documents the strong path and history dependence of valuation in art.

Functional Structures for US state governments

Stephen Kosack, Michele Coscia, Evann Smith, Kim Albrecht, Albert-László Barabási, and Ricardo Hausmann

Proceedings of the National Academy of Sciences Oct 2018, 201803228; DOI: 10.1073/pnas.1803228115

Governments in modern societies undertake an array of complex functions that shape politics and economics, individual and group behavior, and the natural, social, and built environment. How are governments structured to execute these diverse responsibilities? How do those structures vary, and what explains the differences? To examine these longstanding questions, we develop a technique for mapping the Internet “footprint” of government with network science methods. We use this approach to describe and analyze the diversity in functional scale and structure among the 50 US state governments reflected in the webpages and links they have created online: 32.5 million webpages and 110 million hyperlinks among 47,631 agencies. We first verify that this extensive online footprint systematically reflects known characteristics: 50 hierarchically organized networks of state agencies that scale with population and are specialized around easily identifiable functions in accordance with legal mandates. We also find that the footprint reflects extensive diversity among these state functional hierarchies. We hypothesize that this variation should reflect, among other factors, state income, economic structure, ideology, and location. We find that government structures are most strongly associated with state economic structures, with location and income playing more limited roles. Voters’ recent ideological preferences about the proper roles and extent of government are not significantly associated with the scale and structure of their state governments as reflected online. We conclude that the online footprint of governments offers a broad and comprehensive window on how they are structured that can help deepen understanding of those structures.

Caenorhabditis elegans and the network control framework—FAQs

Emma K. Towlson, Petra E. Vértes, Gang Yan, Yee Lian Chew, Denise S. Walker, William R. Schafer, and Albert-László Barabási

Philosophical Transactions of the Royal Society B 373: 20170372

Control is essential to the functioning of any neural system. Indeed, under healthy conditions the brain must be able to continuously maintain a tight functional control between the system’s inputs and outputs. One may therefore hypothesize that the brain’s wiring is predetermined by the need to maintain control across multiple scales, maintaining the stability of key internal variables, and producing behaviour in response to environmental cues. Recent advances in network control have offered a powerful mathematical framework to explore the structure–function relationship in complex biological, social and technological networks, and are beginning to yield important and precise insights on neuronal systems. The network control paradigm promises a predictive, quantitative framework to unite the distinct datasets necessary to fully describe a nervous system, and provide mechanistic explanations for the observed structure and function relationships. Here, we provide a thorough review of the network control framework as applied to Caenorhabditis elegans (Yan et al. 2017 Nature 550, 519–523. (doi:10.1038/nature24056)), in the style of Frequently Asked Questions. We present the theoretical, computational and experimental aspects of network control, and discuss its current capabilities and limitations, together with the next likely advances and improvements. We further present the Python code to enable exploration of control principles in a manner specific to this prototypical organism. This article is part of a discussion meeting issue ‘Connectome to behaviour: modelling C. elegans at cellular resolution’.

Network-based approach to prediction and population-based validation of in silico drug repurposing

Feixiong Cheng, Rishi J. Desai, Diane E. Handy, Ruisheng Wang, Sebastian Schneeweiss, Albert-László Barabási & Joseph Loscalzo

Nature Communications vol. 9, Article number: 2691 (2018)

Here we identify hundreds of new drug-disease associations for over 900 FDA-approved drugs by quantifying the network proximity of disease genes and drug targets in the human (protein–protein) interactome. We select four network-predicted associations to test their causal relationship using large healthcare databases with over 220 million patients and state-of-the-art pharmacoepidemiologic analyses. Using propensity score matching, two of four network-based predictions are validated in patient-level data: carbamazepine is associated with an increased risk of coronary artery disease (CAD) [hazard ratio (HR) 1.56, 95% confidence interval (CI) 1.12–2.18], and hydroxychloroquine is associated with a decreased risk of CAD (HR 0.76, 95% CI 0.59–0.97). In vitro experiments show that hydroxychloroquine attenuates pro-inflammatory cytokine-mediated activation in human aortic endothelial cells, supporting mechanistically its potential beneficial effect in CAD. In summary, we demonstrate that a unique integration of protein-protein interaction network proximity and large-scale patient-level longitudinal data complemented by mechanistic in vitro studies can facilitate drug repurposing.

Predicting Perturbation Patterns from the Topology of Biological Networks

Marc Santolini and Albert-Laszlo Barabasi

PNAS | vol. 115 | no. 27 | E6375–E6383

High-throughput technologies, offering an unprecedented wealth of quantitative data underlying the makeup of living systems, are changing biology. Notably, the systematic mapping of the relationships between biochemical entities has fueled the rapid development of network biology, offering a suitable framework to describe disease phenotypes and predict potential drug targets. However, our ability to develop accurate dynamical models remains limited, due in part to the limited knowledge of the kinetic parameters underlying these interactions. Here, we explore the degree to which we can make reasonably accurate predictions in the absence of the kinetic parameters. We find that simple dynamically agnostic models are sufficient to recover the strength and sign of the biochemical perturbation patterns observed in 87 biological models for which the underlying kinetics are known. Surprisingly, a simple distance-based model achieves 65% accuracy. We show that this predictive power is robust to topological and kinetic parameter perturbations, and we identify key network properties that can increase the recovery rate of the true perturbation patterns up to 80%. We validate our approach using experimental data on the chemotactic pathway in bacteria, finding that a network model of perturbation spreading predicts with ∼80% accuracy the directionality of gene expression and phenotype changes in knock-out and overproduction experiments. These findings show that the steady advances in mapping out the topology of biochemical interaction networks open avenues for accurate perturbation spread modeling, with direct implications for medicine and drug development.

Success In Books: A Big Data Approach to Bestsellers

Burcu Yucesoy, Xindi Wang, Junming Huang, Albert-Laszlo Barabasi

EPJ Data Science 7:7

Reading remains the preferred leisure activity for most individuals, continuing to offer a unique path to knowledge and learning. As such, books remain an important cultural product, consumed widely. Yet, while over 3 million books are published each year, very few are read widely and fewer than 500 make it to the New York Times bestseller lists. And once there, only a handful of authors can command the lists for more than a few weeks. Here we bring a big data approach to book success by investigating the properties and sales trajectories of bestsellers. We find that there are seasonal patterns to book sales with more books being sold during holidays, and even among bestsellers, fiction books sell more copies than nonfiction books. General fiction and biographies make the list more often than any other genre books, and the higher a book’s initial place in the rankings, the longer the book stays on the list as well. Looking at patterns characterizing authors, we find that fiction writers are more productive than nonfiction writers, commonly achieving bestseller status with multiple books. Additionally, there is no gender disparity among bestselling fiction authors, but in nonfiction most bestsellers are written by male authors. Finally, we find that there is a universal pattern to book sales. Using this universality we introduce a statistical model to explain the time evolution of sales. This model not only reproduces the entire sales trajectory of a book but also predicts the total number of copies it will sell in its lifetime, based on its early sales numbers. The analysis of the bestseller characteristics and the discovery of the universal nature of sales patterns with its driving forces are crucial for our understanding of the book industry, and more generally, of how we as a society interact with cultural products.

Science of Science

Santo Fortunato, Carl T. Bergstrom, Katy Borner, James A. Evans, Dirk Helbing, Stasa Milojevic, Alexander M. Petersen, Filippo Radicchi, Roberta Sinatra, Brian Uzzi, Alessandro Vespignani, Ludo Waltman, Dashun Wang, Albert-Laszlo Barabasi

Science 359: 6379 (2018)

The science of science (SciSci) is based on a transdisciplinary approach that uses large data sets to study the mechanisms underlying the doing of science, from the choice of a research problem to career trajectories and progress within a field. In a Review, Fortunato et al. explain that the underlying rationale is that with a deeper understanding of the precursors of impactful science, it will be possible to develop systems and policies that improve each scientist's ability to succeed and enhance the prospects of science as a whole.

The Fundamental Advantages of Temporal Networks

A. Li, S. P. Cornelius, Y.-Y. Liu, L. Wang, A.-L. Barabasi

Science 358:6366, 1042-1046 (2017).

Most networked systems of scientific interest are characterized by temporal links, meaning the network’s structure changes over time. Link temporality has been shown to hinder many dynamical processes, from information spreading to accessibility, by disrupting network paths. Considering the ubiquity of temporal networks in nature, we ask: Are there any advantages of the networks’ temporality? We use an analytical framework to show that temporal networks can, compared to their static counterparts, reach controllability faster, demand orders of magnitude less control energy, and have control trajectories that are considerably more compact than those characterizing static networks. Thus, temporality ensures a degree of flexibility that would be unattainable in static networks, enhancing our ability to control them.

Network Control Principles Predict Neuron Function in the Caenorhabditis elegans Connectome

G. Yan, P. E. Vertes, E. K. Towlson, Y. L. Chew, S. Walker, W. R. Schafer, A.-L. Barabasi

Nature 550, 519–523 (2017)

Recent studies on the controllability of complex systems offer a powerful mathematical framework to systematically explore the structure–function relationship in biological, social, and technological networks 1, 2, 3. Despite theoretical advances, we lack direct experimental proof of the validity of these widely used control principles. Here we fill this gap by applying a control framework to the connectome of the nematode Caenorhabditis elegans 4, 5, 6, allowing us to predict the involvement of each C. elegans neuron in locomotor behaviours. We predict that control of the muscles or motor neurons requires 12 neuronal classes, which include neuronal groups previously implicated in locomotion by laser ablation 7, 8, 9, 10, 11, 12, 13, as well as one previously uncharacterized neuron, PDB. We validate this prediction experimentally, finding that the ablation of PDB leads to a significant loss of dorsoventral polarity in large body bends. Importantly, control principles also allow us to investigate the involvement of individual neurons within each neuronal class. For example, we predict that, within the class of DD motor neurons, only three (DD04, DD05, or DD06) should affect locomotion when ablated individually. This prediction is also confirmed; single cell ablations of DD04 or DD05 specifically affect posterior body movements, whereas ablations of DD02 or DD03 do not. Our predictions are robust to deletions of weak connections, missing connections, and rewired connections in the current connectome, indicating the potential applicability of this analytical framework to larger and less well-characterized connectomes.

The Elegant Law that Governs Us All

A.-L. Barabasi

Science 357:6347 (2017)

A physicist probes a phenomenon seen in cells, cities, and almost everything in between.

Academia Under Fire in Hungary

A.-L. Barabasi

Science 356: 6338 (2017)

On 10 April, Hungarian President Janos Ader signed into law an amendment to the National Higher Education Law that would outlaw the Central European University (CEU). Although portrayed by the government as a purely administrative step, the "Lex-CEU" law is a strident attempt to curtail academic freedom and limit the independence of academic institutions.

Identifying and modeling the structural discontinuities of human interactions

S. Grauwin, M. Szell, S. Sobolevsky, P. Hovel, F. Simini, M. Vanhoof, Z. Smoreda, A.-L. Barabasi & C. Ratti

Scientific Reports 7: 46677 (2017)

The idea of a hierarchical spatial organization of society lies at the core of seminal theories in human geography that have strongly influenced our understanding of social organization. Along the same line, the recent availability of large-scale human mobility and communication data has offered novel quantitative insights hinting at a strong geographical confinement of human interactions within neighboring regions, extending to local levels within countries. However, models of human interaction largely ignore this effect. Here, we analyze several country-wide networks of telephone calls - both mobile and landline - and in either case uncover a systematic decrease of communication induced by borders we identify as the missing variable in state-of-the-art models. Using this empirical evidence, we propose an alternative modeling framework that naturally stylizes the damping effect of borders. We show that this new notion substantially improves the predictive power of widely used interaction models. This increases our ability to understand, model and predict social activities and to plan the development of infrastructures across multiple scales.

Integrating Personalized Gene Expression Profiles into Predictive Disease-associated Gene Pools

J. Menche, E. Guney, A. Sharma, P. J. Branigan, M. J. Loza, F. Baribaud, R. Dobrin, A.-L. Barabasi

Systems Biology and Applications 3:10 (2017)

Gene expression data are routinely used to identify genes that on average exhibit different expression levels between a case and a control group. Yet, very few of such differentially expressed genes are detectably perturbed in individual patients. Here, we develop a framework to construct personalized perturbation profiles for individual subjects, identifying the set of genes that are significantly perturbed in each individual. This allows us to characterize the heterogeneity of the molecular manifestations of complex diseases by quantifying the expression-level similarities and differences among patients with the same phenotype. We show that despite the high heterogeneity of the individual perturbation profiles, patients with asthma, Parkinson and Huntington's disease share a broad pool of sporadically disease-associated genes, and that individuals with statistically significant overlap with this pool have an 80-100% chance of being diagnosed with the disease. The developed framework opens up the possibility to apply gene expression data in the context of precision medicine, with important implications for biomarker identification, drug development, diagnosis and treatment.

From Comorbidities of Chronic Obstructive Pulmonary Disease to Identification of Shared Molecular Mechanisms by Data Integration

D. Gomez-Cabrero, J. Menche, C. Vargas, I. Cano, D. Maier, A.-L. Barabasi, J. Tegner, J. Roca (Synergy-COPD Consortia)

BMC Bioinformatics 17: 1291 (2016)

Background Deep mining of healthcare data has provided maps of comorbidity relationships between diseases. In parallel, integrative multi-omics investigations have generated high-resolution molecular maps of putative relevance for understanding disease initiation and progression. Yet, it is unclear how to advance an observation of comorbidity relations (one disease to others) to a molecular understanding of the driver processes and associated biomarkers. Results Since Chronic Obstructive Pulmonary disease (COPD) has emerged as a central hub in temporal comorbidity networks, we developed a systematic integrative data-driven framework to identify shared disease-associated genes and pathways, as a proxy for the underlying generative mechanisms inducing comorbidity. We integrated records from approximately 13 M patients from the Medicare database with disease-gene maps that we derived from several resources including a semantic-derived knowledge-base. Using rank-based statistics we not only recovered known comorbidities but also discovered a novel association between COPD and digestive diseases. Furthermore, our analysis provides the first set of COPD co-morbidity candidate biomarkers, including IL15, TNF and JUP, and characterizes their association to aging and life-style conditions, such as smoking and physical activity. Conclusions The developed framework provides novel insights in COPD and especially COPD co-morbidity associated mechanisms. The methodology could be used to discover and decipher the molecular underpinning of other comorbidity relationships and furthermore, allow the identification of candidate co-morbidity biomarkers.

Quantifying the Evolution of Individual Scientific Impact

R. Sinatra, D. Wang, P. Deville, C. Song, A.-L. Barabasi

Science 4: 354, 6312 (November 2016)

Despite the frequent use of numerous quantitative indicators to gauge the professional impact of a scientist, little is known about how scientific impact emerges and evolves in time. Here, we quantify the changes in impact and productivity throughout a career in science, finding that impact, as measured by influential publications, is distributed randomly within a scientist’s sequence of publications. This random-impact rule allows us to formulate a stochastic model that uncouples the effects of productivity, individual ability, and luck and unveils the existence of universal patterns governing the emergence of scientific success. The model assigns a unique individual parameter Q to each scientist, which is stable during a career, and it accurately predicts the evolution of a scientist’s impact, from the h-index to cumulative citations, and independent recognitions, such as prizes.

Controllability of multiplex, multi-time-scale networks

M. Posfai, J. Gao, S. P. Cornelius, A.-L. Barabasi, R. D'Souza

Physical Review E 94: 3, 032316 (2016)

The paradigm of layered networks is used to describe many real-world systems, from biological networks to social organizations and transportation systems. While recently there has been much progress in understanding the general properties of multilayer networks, our understanding of how to control such systems remains limited. One fundamental aspect that makes this endeavor challenging is that each layer can operate at a different time scale; thus, we cannot directly apply standard ideas from structural control theory of individual networks. Here we address the problem of controlling multilayer and multi-time-scale networks focusing on two-layer multiplex networks with one-to-one interlayer coupling. We investigate the practically relevant case when the control signal is applied to the nodes of one layer. We develop a theory based on disjoint path covers to determine the minimum number of inputs (Ni) necessary for full control. We show that if both layers operate on the same time scale, then the network structure of both layers equally affects controllability. In the presence of time-scale separation, controllability is enhanced if the controller interacts with the faster layer: Ni decreases as the time-scale difference increases up to a critical time-scale difference, above which Ni remains constant and is completely determined by the faster layer. We show that the critical time-scale difference is large if layer I is easy and layer II is hard to control in isolation. In contrast, control becomes increasingly difficult if the controller interacts with the layer operating on the slower time scale and increasing time-scale separation leads to increased Ni, again up to a critical value, above which Ni still depends on the structure of both layers. This critical value is largely determined by the longest path in the faster layer that does not involve cycles. By identifying the underlying mechanisms that connect time-scale difference and controllability for a simplified model, we provide crucial insight into disentangling how our ability to control real interacting complex systems is affected by a variety of sources of complexity.

Control Principles of Complex Systems

Y.-Y. Liu and A.-L. Barabasi

Reviews of Modern Physics 88: 3, 035006-035064 (2016)

A reflection of our ultimate understanding of a complex system is our ability to control its behavior. Typically, control has multiple prerequisites: it requires an accurate map of the network that governs the interactions between the system’s components, a quantitative description of the dynamical laws that govern the temporal behavior of each component, and an ability to influence the state and temporal behavior of a selected subset of the components. With deep roots in dynamical systems and control theory, notions of control and controllability have taken a new life recently in the study of complex networks, inspiring several fundamental questions: What are the control principles of complex systems? How do networks organize themselves to balance control with functionality? To address these questions, recent advances on the controllability and the control of complex networks are reviewed here, exploring the intricate interplay between the network topology and dynamical laws. The pertinent mathematical results are matched with empirical findings and applications. Uncovering the control principles of complex systems can help us explore and ultimately understand the fundamental laws that govern their behavior.

Control of Fluxes in Metabolic Networks

G. Basler, Z. Nikoloski, A. Larhlimi, A.-L. Barabasi, and Y.-Y. Liu

Genome Research 26: 7, 956-968 (2016)

Understanding the control of large-scale metabolic networks is central to biology and medicine. However, existing approaches either require specifying a cellular objective or can only be used for small networks. We introduce new coupling types describing the relations between reaction activities, and develop an efficient computational framework, which does not require any cellular objective for systematic studies of large-scale metabolism. We identify the driver reactions facilitating control of 23 metabolic networks from all kingdoms of life. We find that unicellular organisms require a smaller degree of control than multicellular organisms. Driver reactions are under complex cellular regulation in Escherichia coli, indicating their preeminent role in facilitating cellular control. In human cancer cells, driver reactions play pivotal roles in malignancy and represent potential therapeutic targets. The developed framework helps us gain insights into regulatory principles of diseases and facilitates design of engineering strategies at the interface of gene regulation, signaling, and metabolism.

Scaling Identity Connects Human Mobility and Social Interactions

P. Deville, C. Song, N. Eagle, V. D. Blondel, A.-L. Barabasi, D. Wang

PNAS 113: 26, 7047-7052 (2016)

Both our mobility and communication patterns obey spatial constraints: Most of the time, our trips or communications occur over a short distance, and occasionally, we take longer trips or call a friend who lives far away. These spatial dependencies, best described as power laws, play a consequential role in broad areas ranging from how an epidemic spreads to diffusion of ideas and information. Here we established the first formal link, to our knowledge, between mobility and communication patterns by deriving a scaling relationship connecting them. The uncovered scaling theory not only allows us to derive human movements from communication volumes, or vice versa, but it also documents a new degree of regularity that helps deepen our quantitative understanding of human behavior. Massive datasets that capture human movements and social interactions have catalyzed rapid advances in our quantitative understanding of human behavior during the past years. One important aspect affecting both areas is the critical role space plays. Indeed, growing evidence suggests both our movements and communication patterns are associated with spatial costs that follow reproducible scaling laws, each characterized by its specific critical exponents. Although human mobility and social networks develop concomitantly as two prolific yet largely separated fields, we lack any known relationships between the critical exponents explored by them, despite the fact that they often study the same datasets. Here, by exploiting three different mobile phone datasets that capture simultaneously these two aspects, we discovered a new scaling relationship, mediated by a universal flux distribution, which links the critical exponents characterizing the spatial dependencies in human mobility and social networks. Therefore, the widely studied scaling laws uncovered in these two areas are not independent but connected through a deeper underlying reality.

Untangling performance from success

B. Yucesoy, A.-L. Barabási

EPJ Data Science 5 (1), 17

Fame, popularity and celebrity status, frequently used tokens of success, are often loosely related to, or even divorced from professional performance. This dichotomy is partly rooted in the difficulty to distinguish performance, an individual measure that captures the actions of a performer, from success, a collective measure that captures a community’s reactions to these actions. Yet, finding the relationship between the two measures is essential for all areas that aim to objectively reward excellence, from science to business. Here we quantify the relationship between performance and success by focusing on tennis, an individual sport where the two quantities can be independently measured. We show that a predictive model, relying only on a tennis player’s performance in tournaments, can accurately predict an athlete’s popularity, both during a player’s active years and after retirement. Hence the model establishes a direct link between performance and momentary popularity. The agreement between the performance-driven and observed popularity suggests that in most areas of human achievement exceptional visibility may be rooted in detectable performance measures.

Controllability Analysis of the Directed Human Protein Interaction Network Identifies Disease Genes and Drug Targets

A. Vinayagam, T. E. Gibson, H.-J. Lee, B. Yilmazel, C. Roesel, Y. Hu, Y. Kwon, A. Sharma, Y.-Y. Liu, N. Perrimon, A.-L. Barabasi

Proceedings of the National Academy of Sciences 10.1073/pnas.1603992113, 1-6 (2016)

The protein-protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as "indispensable," "neutral," or "dispensable," which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network's control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.
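
The node classification described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the standard structural-controllability result that the number of driver nodes of a directed network equals max(N - |M|, 1), where |M| is the size of a maximum matching; the function names (`max_matching_size`, `driver_count`, `classify`) and the toy graphs are hypothetical, and the simple augmenting-path matching is only suitable for small examples.

```python
def max_matching_size(nodes, edges):
    """Size of a maximum matching of a directed graph (each node used at most
    once as an edge source and at most once as an edge target).
    Assumes every edge endpoint is listed in `nodes`."""
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_source = {}  # target node -> source node of its matched edge

    def augment(u, seen):
        # Try to match source u, re-routing earlier matches if necessary.
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_source or augment(matched_source[v], seen):
                matched_source[v] = u
                return True
        return False

    return sum(augment(u, set()) for u in nodes)


def driver_count(nodes, edges):
    """Minimum number of driver nodes, assumed here to be max(N - |M|, 1)."""
    nodes = list(nodes)
    if not nodes:
        return 0
    return max(len(nodes) - max_matching_size(nodes, edges), 1)


def classify(nodes, edges):
    """Label each node by how its removal changes the driver-node count."""
    nodes = list(nodes)
    base = driver_count(nodes, edges)
    labels = {}
    for x in nodes:
        rest = [n for n in nodes if n != x]
        kept = [(u, v) for u, v in edges if x not in (u, v)]
        after = driver_count(rest, kept)
        if after > base:
            labels[x] = "indispensable"  # removal increases driver nodes
        elif after < base:
            labels[x] = "dispensable"    # removal decreases driver nodes
        else:
            labels[x] = "neutral"        # removal leaves the count unchanged
    return labels


if __name__ == "__main__":
    # Toy chain a -> b -> c (hypothetical data, not from the paper).
    print(classify(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

On the three-node chain a → b → c, removing the middle node b leaves two isolated nodes that each need their own input, so b is labelled indispensable, while a and c are neutral.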

The Network Behind the Cosmic Web

B.C. Coutinho, S. Hong, K. Albrecht, A. Day, A.-L. Barabasi, P. Torrey, M. Vogelsberger, L. Hernquist

arXiv:1604.03236v2 (13 April 2016)

The concept of the cosmic web, viewing the universe as a set of discrete galaxies held together by gravity, is deeply ingrained in cosmology. Yet, little is known about the most effective construction and the characteristics of the underlying network. Here we explore seven network construction algorithms that use various galaxy distributions provided by both simulations and observations. We find that a model relying only on spatial proximity offers the best correlations between the physical characteristics of the connected galaxies. We show that the properties of the networks generated from simulations and observations are identical, unveiling a deep universality of the cosmic web.

Universal resilience patterns in complex networks

J. Gao, B. Barzel, A.-L. Barabási

Nature 530, 307-312 (2016)

Resilience, a system’s ability to adjust its activity to retain its basic functionality when errors, failures and environmental changes occur, is a defining property of many complex systems. Despite widespread consequences for human health, the economy and the environment, events leading to loss of resilience—from cascading failures in technological systems to mass extinctions in ecological networks—are rarely predictable and are often irreversible. These limitations are rooted in a theoretical gap: the current analytical framework of resilience is designed to treat low-dimensional models with a few interacting components, and is unsuitable for multi-dimensional systems consisting of a large number of components that interact through a complex network. Here we bridge this theoretical gap by developing a set of analytical tools with which to identify the natural control and state parameters of a multi-dimensional complex system, helping us derive effective one-dimensional dynamics that accurately predict the system’s resilience. The proposed analytical framework allows us systematically to separate the roles of the system’s dynamics and topology, collapsing the behaviour of different networks onto a single universal resilience function. The analytical results unveil the network characteristics that can enhance or diminish resilience, offering ways to prevent the collapse of ecological, biological or economic systems, and guiding the design of technological systems resilient to both internal failures and environmental changes.

Network-based in silico drug efficacy screening

E. Guney, J. Menche, M. Vidal, A.-L. Barabási

Nature Communications 7:10331, 1-13 (2016)

The increasing cost of drug development together with a significant drop in the number of new drug approvals raises the need for innovative approaches for target identification and efficacy prediction. Here, we take advantage of our increasing understanding of the network-based origins of diseases to introduce a drug-disease proximity measure that quantifies the interplay between drug targets and diseases. By correcting for the known biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well as to distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78 diseases indicates that the therapeutic effect of drugs is localized in a small network neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson and several inflammatory disorders. Finally, network-based proximity allows us to predict novel drug-disease associations that offer unprecedented opportunities for drug repurposing and the detection of adverse effects.

Tissue Specificity of Human Disease Module

M. Kitsak, A. Sharma, J. Menche, E. Guney, S. D. Ghiassian, J. Loscalzo, A.-L. Barabasi

Scientific Reports 6: 35241 (2016)

Genes carrying mutations associated with genetic diseases are present in all human cells; yet, clinical manifestations of genetic diseases are usually highly tissue-specific. Although some disease genes are expressed only in selected tissues, the expression patterns of disease genes alone cannot explain the observed tissue specificity of human diseases. Here we hypothesize that for a disease to manifest itself in a particular tissue, a whole functional subnetwork of genes (disease module) needs to be expressed in that tissue. Driven by this hypothesis, we conducted a systematic study of the expression patterns of disease genes within the human interactome. We find that genes expressed in a specific tissue tend to be localized in the same neighborhood of the interactome. By contrast, genes expressed in different tissues are segregated in distinct network neighborhoods. Most important, we show that it is the integrity and the completeness of the expression of the disease module that determines disease manifestation in selected tissues. This approach allows us to construct a disease-tissue network that confirms known and predicts unexpected disease-tissue associations.

Endophenotype Network Models: Common Core of Complex Diseases

S. D. Ghiassian, J. Menche, D. I. Chasman, F. Giulianini, R. Wang, P. Ricchiuto, M. Aikawa, H. Iwata, C. Muller, T. Zeller, A. Sharma, P. Wild, K. Lackner, S. Singh, P. M. Ridker, S. Blankenberg, A.-L. Barabasi, J. Loscalzo

Scientific Reports 6: 27414, 1-13 (2016)

Historically, human diseases have been differentiated and categorized based on the organ system in which they primarily manifest. Recently, an alternative view is emerging that emphasizes that different diseases often have common underlying mechanisms and shared intermediate pathophenotypes, or endo(pheno)types. Within this framework, a specific disease’s expression is a consequence of the interplay between the relevant endophenotypes and their local, organ-based environment. Important examples of such endophenotypes are inflammation, fibrosis, and thrombosis and their essential roles in many developing diseases. In this study, we construct endophenotype network models and explore their relation to different diseases in general and to cardiovascular diseases in particular. We identify the local neighborhoods (module) within the interconnected map of molecular components, i.e., the subnetworks of the human interactome that represent the inflammasome, thrombosome, and fibrosome. We find that these neighborhoods are highly overlapping and significantly enriched with disease-associated genes. In particular they are also enriched with differentially expressed genes linked to cardiovascular disease (risk). Finally, using proteomic data, we explore how macrophage activation contributes to our understanding of inflammatory processes and responses. The results of our analysis show that inflammatory responses initiate from within the cross-talk of the three identified endophenotypic modules.

Canonical genetic signatures of the adult human brain

M. Hawrylycz, J. A. Miller, V. Menon, D. Feng, T. Dolbeare, A. L. Guillozet-Bongaarts, A. G. Jegga, B. J. Aronow, C.-K. Lee, A. Bernard, M. F. Glasser, D. L. Dierker, J. Menche, A. Szafer, F. Collman, P. Grange, K. A. Berman, S. Mihalas, Z. Yao, L. Stewart, A.-L. Barabási, J. Schulkin, J. Phillips, L. Ng, C. Dang, D. R. Haynor, A. Jones, D. C. Van Essen, C. Koch, D. Lein

Nature Neuroscience 4171, 1-15 (2015)

The structure and function of the human brain are highly stereotyped, implying a conserved molecular program responsible for its development, cellular structure and function. We applied a correlation-based metric called differential stability to assess reproducibility of gene expression patterning across 132 structures in six individual brains, revealing mesoscale genetic organization. The genes with the highest differential stability are highly biologically relevant, with enrichment for brain-related annotations, disease associations, drug targets and literature citations. Using genes with high differential stability, we identified 32 anatomically diverse and reproducible gene expression signatures, which represent distinct cell types, intracellular components and/or associations with neurodevelopmental and neurodegenerative disorders. Genes in neuron-associated compared to non-neuronal networks showed higher preservation between human and mouse; however, many diversely patterned genes displayed marked shifts in regulation between species. Finally, highly consistent transcriptional architecture in neocortex is correlated with resting state functional connectivity, suggesting a link between conserved gene expression and functionally relevant circuitry.

Returners and explorers dichotomy in human mobility

L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi, F. Giannotti, A.-L. Barabási

Nature Communications 6:8166, 1-8 (2015)

The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.

Spectrum of controlling and observing complex networks

G. Yan, G. Tsekenis, B. Barzel, J.-J. Slotine, Y.-Y. Liu, A.-L. Barabási

Nature Physics 11, 779-796 (2015)

Recent studies have made important advances in identifying sensor or driver nodes, through which we can observe or control a complex system. But the observational uncertainty induced by measurement noise and the energy required for control continue to be significant challenges in practical applications. Here we show that the variability of control energy and observational uncertainty for different directions of the state space depend strongly on the number of driver nodes. In particular, we find that if all nodes are directly driven, control is energetically feasible, as the maximum energy increases sub-linearly with the system size. If, however, we aim to control a system through a single node, control in some directions is energetically prohibitive, increasing exponentially with the system size. For the cases in between, the maximum energy decays exponentially when the number of driver nodes increases. We validate our findings in several model and real networks, arriving at a series of fundamental laws to describe the control energy that together deepen our understanding of complex systems.

Constructing minimal models for complex system dynamics

B. Barzel, Y.-Y. Liu, A.-L. Barabási

Nature Communications 6:7186, 1-8 (2015)

One of the strengths of statistical physics is the ability to reduce macroscopic observations into microscopic models, offering a mechanistic description of a system’s dynamics. This paradigm, rooted in Boltzmann’s gas theory, has found applications from magnetic phenomena to subcellular processes and epidemic spreading. Yet, each of these advances was the result of decades of meticulous model building and validation, which are impossible to replicate in most complex biological, social or technological systems that lack accurate microscopic models. Here we develop a method to infer the microscopic dynamics of a complex system from observations of its response to external perturbations, allowing us to construct the most general class of nonlinear pairwise dynamics that are guaranteed to recover the observed behavior. The result, which we test against both numerical and empirical data, is an effective dynamic model that can predict the system’s behavior and provide crucial insights into its inner workings.

The observation that disease associated proteins often interact with each other has fueled the development of network-based approaches to elucidate the molecular mechanisms of human disease. Such approaches build on the assumption that protein interaction networks can be viewed as maps in which diseases can be identified with localized perturbation within a certain neighborhood. The identification of these neighborhoods, or disease modules, is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While numerous heuristic methods exist that successfully pinpoint disease associated modules, the basic underlying connectivity patterns remain largely unexplored. In this work we aim to fill this gap by analyzing the network properties of a comprehensive corpus of 70 complex diseases. We find that disease associated proteins do not reside within locally dense communities and instead identify connectivity significance as the most predictive quantity. This quantity inspires the design of a novel Disease Module Detection (DIAMOnD) algorithm to identify the full disease module around a set of known disease proteins. We study the performance of the algorithm using well-controlled synthetic data and systematically validate the identified neighborhoods for a large corpus of diseases.

Uncovering disease-disease relationships through the incomplete interactome

J. Menche, A. Sharma, M. Kitsak, D. Ghiassian, M. Vidal, J. Loscalzo, A.-L. Barabási

Science 347:6224, 1257601-1 (2015)

According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes.

A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma

A. Sharma, J. Menche, C. C. Huang, T. Ort, X. Zhou, M. Kitsak, N. Sahni, D. Thibault, L. Voung, F. Guo, S. D. Ghiassian, N. Gulbahce, F. Baribaud, J. Tocker, R. Dobrin, E. Barnathan, H. Liu, R. A. Panettieri Jr., K. G. Tantisira, W. Qiu, B. A. Raby, E. K. Silverman, M. Vidal, S. T. Weiss, and A.-L. Barabási

Human Molecular Genetics 101093, 1-16 (2015)

Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance, using both computational and experimental approaches. We find that the asthma disease module is enriched with modest GWAS P-values against the background of random variation, and with differentially expressed genes from normal and asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics, gene-expression, drug response) data, we identify the GAB1 signaling pathway as an important novel modulator in asthma. The wiring diagram of the uncovered asthma module suggests a relatively close link between GAB1 and glucocorticoids (GCs), which we experimentally validate, observing an increase in the level of GAB1 after GC treatment in BEAS-2B bronchial epithelial cells. The siRNA knockdown of GAB1 in the BEAS-2B ce

Destruction perfected

I. A. Kovács, A.-L. Barabási

Nature (News & Views) 524, 38-39 (2015)

Pinpointing the nodes whose removal most effectively disrupts a network has become a lot easier with the development of an efficient algorithm. Potential applications might include cybersecurity and disease control. See Letter p.65, by F. Morone and H. A. Makse (Supplementary 1).

A proteome-scale map of the human interactome network

T. Rolland, M. Tasan, B. Charloteaux, S. J. Pevzner, Q. Zhong, N. Sahni, S. Yi, I. Lemmens, C. Fontanillo, R. Mosca, A. Kamburov, S. D. Ghiassian, X. Yang, L. Ghamsari, D. Balcha, B. E. Begg, P. Braun, M. Brehm, M. P. Froly, A.-R. Carvunis, D. Convery-Zupan, R. Carominas, J. Coulombe-Huntington, E. Dann, M. Dreze, A. Dricot, C. Fan, E. Franzosa, F. Gebrea, B. J. Gutierrez, M. F. Hardy, M. Jin, S. Kang, R. Kiros, G. Lin, K. Luck, A. MacWilliams, J. Menche, R. R. Murray, A. Palagi, M. M. Poulin, X. Rambout, J. Rasla, P. Reichert, V. Romero, E. Ruyssinck, J. M. Sahalie, plus 20 more co-authors

Cell 159:5, 1212-1226 (2014)

Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ∼14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ∼30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant interconnectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help “connect the dots” of the genomic revolution.

Collective credit allocation in science

H.-W. Shen, A.-L. Barabási

Proceedings of the National Academy of Sciences 10.1073/pnas.1401992111, 1-6 (2014)

Collaboration among researchers is an essential component of the modern scientific enterprise, playing a particularly important role in multidisciplinary research. However, we continue to wrestle with allocating credit to the coauthors of publications with multiple authors, because the relative contribution of each author is difficult to determine. At the same time, the scientific community runs an informal field-dependent credit allocation process that assigns credit in a collective fashion to each work. Here we develop a credit allocation algorithm that captures the coauthors’ contribution to a publication as perceived by the scientific community, reproducing the informal collective credit allocation of science. We validate the method by identifying the authors of Nobel-winning papers that are credited for the discovery, independent of their positions in the author list. The method can also compare the relative impact of researchers working in the same field, even if they did not publish together. The ability to accurately measure the relative credit of researchers could affect many aspects of credit allocation in science, potentially impacting hiring, funding, and promotion decisions.

A network framework of cultural history

M. Schich, C. Song, Y. Y. Ahn, A. Mirsky, M. Martino, A.-L. Barabási, D. Helbing

Science 345, 558-562 (2014)

The emergent processes driving cultural history are a product of complex interactions among large numbers of individuals, determined by difficult-to-quantify historical conditions. To characterize these processes, we have reconstructed aggregate intellectual mobility over two millennia through the birth and death locations of more than 150,000 notable individuals. The tools of network and complexity theory were then used to identify characteristic statistical patterns and determine the cultural and historical relevance of deviations. The resulting network of locations provides a macroscopic perspective of cultural history, which helps us to retrace cultural narratives of Europe and North America using large-scale visualization and quantitative dynamical tools and to derive historical trends of cultural centers beyond the scope of specific events or narrow time intervals.

A genetic epidemiology approach to cyber-security

S. Gil, A. Kott, A.-L. Barabási

Scientific Reports 4:5659, 1-7 (2014)

While much attention has been paid to the vulnerability of computer networks to node and link failure, there is limited systematic understanding of the factors that determine the likelihood that a node (computer) is compromised. We therefore collect threat log data in a university network to study the patterns of threat activity for individual hosts. We relate this information to the properties of each host as observed through network-wide scans, establishing associations between the network services a host is running and the kinds of threats to which it is susceptible. We propose a methodology to associate services to threats inspired by the tools used in genetics to identify statistical associations between mutations and diseases. The proposed approach allows us to determine probabilities of infection directly from observation, offering an automated high-throughput strategy to develop comprehensive metrics for cyber-security.

Human symptoms–disease network

X. Z. Zhou, J. Menche, A.-L. Barabási, A. Sharma

Nature Communications 5:4212, 1-10 (2014)

In the post-genomic era, the elucidation of the relationship between the molecular origins of diseases and their resulting phenotypes is a crucial task for medical research. Here, we use a large-scale biomedical literature database to construct a symptom-based human disease network and investigate the connection between clinical manifestations of diseases and their underlying molecular interactions. We find that the symptom-based similarity of two diseases correlates strongly with the number of shared genetic associations and the extent to which their associated proteins interact. Moreover, the diversity of the clinical manifestations of a disease can be related to the connectivity patterns of the underlying protein interaction network. The comprehensive, high-quality map of disease–symptom relations can further be used as a resource helping to address important questions in the field of systems medicine, for example, the identification of unexpected associations between diseases, disease etiology research or drug design.

Career on the move: Geography, stratification, and scientific impact

P. Deville, D. Wang, R. Sinatra, C. Song, V. Blondel, A.-L. Barabási

Scientific Reports 4, 1-7 (2014)

Changing institutions is an integral part of academic life. Yet little is known about the mobility patterns of scientists at an institutional level and how these career choices affect scientific outcomes. Here, we examine over 420,000 papers to track the affiliation information of individual scientists, allowing us to reconstruct their career trajectories over decades. We find that career movements are not only temporally and spatially localized, but also characterized by a high degree of stratification in institutional ranking. When cross-group movement occurs, we find that while going from elite to lower-rank institutions is on average associated with a modest decrease in scientific performance, transitioning into elite institutions does not result in a subsequent performance gain. These results offer empirical evidence on institutional-level career choices and movements and have potential implications for science policy.

A diVIsive Shuffling Approach (VIStA) for gene expression analysis to identify subtypes in Chronic Obstructive Pulmonary Disease

J. Menche, A. Sharma, M. H. Cho, R. J. Mayer, S. I. Rennard, B. Celli, B. E. Miller, N. Locantore, R. Tal-Singer, S. Ghosh, C. Larminie, G. Bradley, J. H. Riley, A. Agusti, E. K. Silverman, A.-L. Barabási

BMC Systems Biology 8, 1-13 (2014)

Background: An important step toward understanding the biological mechanisms underlying a complex disease is a refined understanding of its clinical heterogeneity. Relating clinical and molecular differences may allow us to define more specific subtypes of patients that respond differently to therapeutic interventions. Results: We developed a novel unbiased method called diVIsive Shuffling Approach (VIStA) that identifies subgroups of patients by maximizing the difference in their gene expression patterns. We tested our algorithm on 140 subjects with Chronic Obstructive Pulmonary Disease (COPD) and found four distinct, biologically and clinically meaningful combinations of clinical characteristics that are associated with large gene expression differences. The dominant characteristic in these combinations was the severity of airflow limitation. Other frequently identified measures included emphysema, fibrinogen levels, phlegm, BMI and age. A pathway analysis of the differentially expressed genes in the identified subtypes suggests that VIStA is capable of capturing specific molecular signatures within each group. Conclusions: The introduced methodology allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences. The resulting subtypes for COPD contribute to a better understanding of its heterogeneity.

Bordering Fiction

Barabasi, A.-L.

Science 343: 6169 (2014)

Eggers portrays a world--in which an omnipotent social networking company encourages everyone to monitor everybody everywhere--that feels eerily everyday.

Quantifying information flow during emergencies

L. Gao, C. Song, Z. Gao, A.-L. Barabási, J. P. Bagrow, D. Wang

Scientific Reports 4, 1-6 (2014)

Recent advances in human dynamics have focused on the normal patterns of human activities, with the quantitative understanding of human behavior under extreme events remaining a crucial missing chapter. This has a wide array of potential applications, ranging from emergency response and detection to traffic control and management. Previous studies have shown that human communications are both temporally and spatially localized following the onset of emergencies, indicating that social propagation is a primary means to propagate situational awareness. We study real anomalous events using country-wide mobile phone data, finding that information flow during emergencies is dominated by repeated communications. We further demonstrate that the observed communication patterns cannot be explained by inherent reciprocity in social networks, and are universal across different demographics.

Modeling and predicting popularity dynamics via reinforced poisson processes

H. Shen, D. Wang, C. Song, A.-L. Barabási

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 291-297 (2014)

An ability to predict the popularity dynamics of individual items within a complex evolving system has important implications in an array of areas. Here we propose a generative probabilistic framework using a reinforced Poisson process to explicitly model the process through which individual items gain their popularity. This model distinguishes itself from existing models via its capability of modeling the arrival process of popularity and its remarkable power at predicting the popularity of individual items. It possesses the flexibility of applying Bayesian treatment to further improve the predictive power using a conjugate prior. Extensive experiments on a longitudinal citation dataset demonstrate that this model consistently outperforms existing popularity prediction methods.

Target control of complex networks

Jianxi Gao, Y.-Y. Liu, R. M. D'Souza, A.-L. Barabási

Nature Communications 5:5415, 1-7 (2014)

Controlling large natural and technological networks is an outstanding challenge. It is typically neither feasible nor necessary to control the entire network, prompting us to explore target control: the efficient control of a preselected subset of nodes. We show that the structural controllability approach used for full control overestimates the minimum number of driver nodes needed for target control. Here we develop an alternate ‘k-walk’ theory for directed tree networks, and we rigorously prove that one node can control a set of target nodes if the path length to each target node is unique. For more general cases, we develop a greedy algorithm to approximate the minimum set of driver nodes sufficient for target control. We find that degree heterogeneous networks are target controllable with higher efficiency than homogeneous networks and that the structure of many real-world networks is suitable for efficient target control.
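The unique-path-length condition stated above for directed trees can be checked with a breadth-first search. The sketch below is only an illustration of that sufficient condition: the function names and the example tree are hypothetical, and a False result means the condition is not met, not that control is impossible.

```python
from collections import deque

def depths_from(root, children):
    """Path length from `root` to every reachable node of a directed tree."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth

def satisfies_unique_path_condition(root, children, targets):
    """True if every target is reachable and all target depths are distinct."""
    depth = depths_from(root, children)
    if any(t not in depth for t in targets):
        return False
    target_depths = [depth[t] for t in targets]
    return len(set(target_depths)) == len(target_depths)

# Hypothetical tree: r -> a -> c -> d, and r -> b.
tree = {"r": ["a", "b"], "a": ["c"], "c": ["d"]}
print(satisfies_unique_path_condition("r", tree, ["a", "c", "d"]))  # True
print(satisfies_unique_path_condition("r", tree, ["a", "b"]))       # False
```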

Network-based analysis of genome wide association data provides novel candidate genes for lipid and lipoprotein traits

A. Sharma, N. Gulbahce, S. J. Pevzner, J. Menche, C. Ladenvall, L. Folkersen, P. Eriksson, M. Orho-Melander, A.-L. Barabási

Molecular & Cellular Proteomics 12, 3398-3408 (2013)

Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10−5 and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS studies, serving as a broadly applicable tool for the investigation of other complex disease phenotypes.

Uncovering the role of elementary processes in network evolution

G. Ghoshal, L. Chi, A.-L. Barabási

Scientific Reports 3, 1-8 (2013)

The growth and evolution of networks has elicited considerable interest from the scientific community and a number of mechanistic models have been proposed to explain their observed degree distributions. Various microscopic processes have been incorporated in these models, among them node and edge addition, vertex fitness and the deletion of nodes and edges. The existing models, however, focus on specific combinations of these processes and parameterize them in a way that makes it difficult to elucidate the role of the individual elementary mechanisms. We therefore formulated and solved a model that incorporates the minimal processes governing network evolution. Some contribute to growth, such as the formation of connections between an existing pair of vertices, while others capture deletion: the removal of a node with its corresponding edges, or the removal of an edge between a pair of vertices. We distinguish between these elementary mechanisms, identifying their specific role in network evolution.

Quantifying Long-Term Scientific Impact

D. Wang, C. Song, A.-L. Barabási

Science 342, 127-131 (2013)

The lack of predictability of citation-based measures frequently used to gauge impact, from impact factors to short-term citations, raises a fundamental question: Is there long-term predictability in citation patterns? Here, we derive a mechanistic model for the citation dynamics of individual papers, allowing us to collapse the citation histories of papers from different journals and disciplines into a single curve, indicating that all papers tend to follow the same universal temporal pattern. The observed patterns not only help us uncover basic mechanisms that govern scientific impact but also offer reliable measures of influence that may have potential policy implications.
Controlling complex systems is a fundamental challenge of network science. Recent advances indicate that control over the system can be achieved through a minimum driver node set (MDS). The existence of multiple MDS's suggests that nodes do not participate in control equally, prompting us to quantify their participations. Here we introduce control capacity quantifying the likelihood that a node is a driver node. To efficiently measure this quantity, we develop a random sampling algorithm. This algorithm not only provides a statistical estimate of the control capacity, but also bridges the gap between multiple microscopic control configurations and macroscopic properties of the network under control. We demonstrate that the possibility of being a driver node decreases with a node's in-degree and is independent of its out-degree. Given the inherent multiplicity of MDS's, our findings offer tools to explore control in various complex systems.

Network Science

Albert-László Barabási

Philosophical Transactions of The Royal Society 371, 1-3 (2013)

Professor Barabási's talk described how the tools of network science can help understand the Web's structure, development and weaknesses. The Web is an information network, in which the nodes are documents (at the time of writing over one trillion of them), connected by links. Other well-known network structures include the Internet, a physical network where the nodes are routers and the links are physical connections, and organizations, where the nodes are people and the links represent communications.

Observability of complex systems

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Proceedings of the National Academy of Sciences 110, 1-6 (2013)

A quantitative description of a complex system is inherently limited by our ability to estimate the system’s internal state from experimentally accessible outputs. Although the simultaneous measurement of all internal variables, like all metabolite concentrations in a cell, offers a complete description of a system’s state, in practice experimental access is limited to only a subset of variables, or sensors. A system is called observable if we can reconstruct the system’s complete internal state from its outputs. Here, we adopt a graphical approach derived from the dynamical laws that govern a system to determine the sensors that are necessary to reconstruct the full internal state of a complex system. We apply this approach to biochemical reaction systems, finding that the identified sensors are not only necessary but also sufficient for observability. The developed approach can also identify the optimal sensors for target or partial observability, helping us reconstruct selected state variables from appropriately chosen outputs, a prerequisite for optimal biomarker design. Given the fundamental role observability plays in complex systems, these results offer avenues to systematically explore the dynamics of a wide range of natural, technological and socioeconomic systems.

Network link prediction by global silencing of indirect correlations

B. Barzel, A.-L. Barabási

Nature Biotechnology 31: Num 8, 1-8 (2013)

Predictions of physical and functional links between cellular components are often based on correlations between experimental measurements, such as gene expression. However, correlations are affected by both direct and indirect paths, confounding our ability to identify true pairwise interactions. Here we exploit the fundamental properties of dynamical correlations in networks to develop a method to silence indirect effects. The method receives as input the observed correlations between node pairs and uses a matrix transformation to turn the correlation matrix into a highly discriminative silenced matrix, which enhances only the terms associated with direct causal links. Against empirical data for Escherichia coli regulatory interactions, the method enhanced the discriminative power of the correlations by twofold, yielding >50% predictive improvement over traditional correlation measures and 6% over mutual information. Overall this silencing method will help translate the abundant correlation data into insights about a system's interactions, with applications ranging from link prediction to inferring the dynamical mechanisms governing biological networks.

Effect of correlations on network controllability

M. Pósfai, Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Scientific Reports 3:1067, 1-7 (2013)

A dynamical system is controllable if by imposing appropriate external signals on a subset of its nodes, it can be driven from any initial state to any desired state in finite time. Here we study the impact of various network characteristics on the minimal number of driver nodes required to control a network. We find that clustering and modularity have no discernible impact, but the symmetries of the underlying matching problem can produce linear, quadratic or no dependence on degree correlation coefficients, depending on the nature of the underlying correlations. The results are supported by numerical simulations and help narrow the observed gap between the predicted and the observed number of driver nodes in real networks.

Emergence of bimodality in controlling complex networks

T. Jia, Y.-Y. Liu, E. Csóka, M. Pósfai, J.-J. Slotine, A.-L. Barabási

Nature Communications 4:2002, 1-6 (2013)

Our ability to control complex systems is a fundamental challenge of contemporary science. Recently introduced tools to identify the driver nodes, nodes through which we can achieve full control, predict the existence of multiple control configurations, prompting us to classify each node in a network based on their role in control. Accordingly a node is critical, intermittent or redundant if it acts as a driver node in all, some or none of the control configurations. Here we develop an analytical framework to identify the category of each node, leading to the discovery of two distinct control modes in complex systems: centralized versus distributed control. We predict the control mode for an arbitrary network and show that one can alter it through small structural perturbations. The uncovered bimodality has implications from network security to organizational research and offers new insights into the dynamics and control of complex systems.
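The critical, intermittent and redundant categories can be illustrated by brute force on a very small network. The sketch below assumes, following the structural controllability framework, that control configurations correspond to maximum matchings of the directed network and that driver nodes are the nodes left unmatched (no matched incoming link). It enumerates all maximum matchings, which is only feasible for toy graphs and is not the analytical framework described above; it also ignores the special case of a perfectly matched network. All names are hypothetical.

```python
from itertools import combinations

def is_matching(edges):
    """A set of directed links is a matching if no two share a source or a target."""
    sources = [u for u, _ in edges]
    targets = [v for _, v in edges]
    return len(set(sources)) == len(sources) and len(set(targets)) == len(targets)

def maximum_matchings(edges):
    """All matchings of maximum size, found by exhaustive search (toy graphs only)."""
    for size in range(len(edges), 0, -1):
        found = [set(c) for c in combinations(edges, size) if is_matching(c)]
        if found:
            return found
    return [set()]

def classify_nodes(nodes, edges):
    """Label each node by how often it is unmatched across maximum matchings."""
    matchings = maximum_matchings(edges)
    roles = {}
    for node in nodes:
        times_driver = sum(1 for m in matchings if all(v != node for _, v in m))
        if times_driver == len(matchings):
            roles[node] = "critical"
        elif times_driver == 0:
            roles[node] = "redundant"
        else:
            roles[node] = "intermittent"
    return roles

# Hypothetical star network: node 1 points to nodes 2 and 3.
print(classify_nodes([1, 2, 3], [(1, 2), (1, 3)]))
# {1: 'critical', 2: 'intermittent', 3: 'intermittent'}
```

In the star, node 1 has no incoming link and so must be driven in every configuration, while nodes 2 and 3 alternate; in a chain 1 -> 2 -> 3 the same code labels node 1 critical and the others redundant.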

Universality in network dynamics

B. Barzel, A.-L. Barabási

Nature Physics 9, 673-681 (2013)

Despite significant advances in characterizing the structural properties of complex networks, a mathematical framework that uncovers the universal properties of the interplay between the topology and the dynamics of complex systems continues to elude us. Here we develop a self-consistent theory of dynamical perturbations in complex systems, allowing us to systematically separate the contribution of the network topology and dynamics. The formalism covers a broad range of steady-state dynamical processes and offers testable predictions regarding the system’s response to perturbations and the development of correlations. It predicts several distinct universality classes whose characteristics can be derived directly from the continuum equation governing the system’s dynamics and which are validated on several canonical network-based dynamical systems, from biochemical dynamics to epidemic spreading. Finally, we collect experimental data pertaining to social and biological systems, demonstrating that we can accurately uncover their universality class even in the absence of an appropriate continuum theory that governs the system’s dynamics.

Handful of papers dominates citation

A.-L. Barabási, C. Song, D. Wang

Nature 491, 40 (2012)

An ‘impact disparity’ is emerging in science — only a few papers earn the largest share of citations. This is comparable to the income disparity in the United States, known as the 1% phenomenon, where 1% of the population earns a disproportionate 17.4% of total income.

Control centrality and hierarchical structure in complex networks

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

PLoS One 7, e44459 (2012)

We introduce the concept of control centrality to quantify the ability of a single node to control a directed weighted network. We calculate the distribution of control centrality for several real networks and find that it is mainly determined by the network’s degree distribution. We show that in a directed network without loops the control centrality of a node is uniquely determined by its layer index or topological position in the underlying hierarchical structure of the network. Inspired by the deep relation between control centrality and hierarchical structure in a general directed network, we design an efficient attack strategy against the controllability of malicious networks.
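For a directed network without loops, the layer index mentioned above can be computed by repeatedly peeling off nodes that have no outgoing links. The sketch below assumes that convention (nodes with no outgoing links form layer 1, and each later layer consists of nodes whose outgoing links all lead to earlier layers); the abstract does not spell out the construction, so treat the function and the example as hypothetical.

```python
def layer_indices(nodes, edges):
    """Layer index of each node in a directed acyclic graph.

    Assumed convention: sinks are layer 1; after removing them, the new
    sinks are layer 2, and so on.
    """
    out = {n: set() for n in nodes}
    for u, v in edges:
        out[u].add(v)
    layer = {}
    remaining = set(nodes)
    current = 1
    while remaining:
        # Nodes whose outgoing links all point to already-removed nodes.
        sinks = {n for n in remaining if not (out[n] & remaining)}
        if not sinks:
            raise ValueError("graph contains a cycle")
        for n in sinks:
            layer[n] = current
        remaining -= sinks
        current += 1
    return layer

# Hypothetical hierarchy: 1 -> 2 -> 3 and 1 -> 3.
layers = layer_indices([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
print(layers[1], layers[2], layers[3])  # 3 2 1
```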

Network science: Luck or reason

Albert-László Barabási

Nature 489, 1-2 (2012)

The concept of preferential attachment is behind the hubs and power laws seen in many networks. New results fuel an old debate about its origin, and beg the question of whether it is based on randomness or optimization.

Dynamics of ranking processes in complex systems

N. Blumm, G. Ghoshal, Z. Forro, M. Schich, G. Bianconi, J.-P. Bouchaud, A.-L. Barabási

Physical Review Letters 109, 128701:1-5 (2012)

The world is addicted to ranking: everything, from the reputation of scientists, journals, and universities to purchasing decisions is driven by measured or perceived differences between them. Here, we analyze empirical data capturing real time ranking in a number of systems, helping to identify the universal characteristics of ranking dynamics. We develop a continuum theory that not only predicts the stability of the ranking process, but shows that a noise-induced phase transition is at the heart of the observed differences in ranking regimes. The key parameters of the continuum theory can be explicitly measured from data, allowing us to predict and experimentally document the existence of three phases that govern ranking stability.

Interpreting cancer genomes using systematic host network perturbations by tumour virus proteins

O. Rozenblatt-Rosen, R. C. Deo, M. Padi, G. Adelmant, T. Rolland, M. Grace, A. Dricot, M. Askenazi, M. Tavares, S. J. Pevzner, F. Abderazzaq, D. Byrdsong, A.-R. Carvunis, A. A. Chen, J. Cheng, M. Correll, M. Durate, C. Fan, M. C. Feltkamp, S. B. Ficarro, R. Franchi, B. K. Garg, N. Gulbahce, T. Hao, A. M. Holthaus, R. James, A. Korkhin, L. Litovchick, J. C. Mar, T. R. Pak, S. Rabello, R. Rubio, Y. Shen, S. Singh, J. M. Spangle, M. Tasan, S. Wanamakter, J. T. Webber, J. Roecklein-Canfield, E. Johannsen, A.-L. Barabási, R. Beroukhim, E. Kieff, M. E. Cusick, D. E. Hill, K. Munger, J. A. Marto, J. Quackenbush, F. P. Roth, J. A. DeCaprio, M. Vidal

Nature 487, 491-495 (2012)

Genotypic differences greatly influence susceptibility and resistance to disease. Understanding genotype–phenotype relationships requires that phenotypes be viewed as manifestations of network properties, rather than simply as the result of individual genomic variations. Genome sequencing efforts have identified numerous germline mutations, and large numbers of somatic genomic alterations, associated with a predisposition to cancer. However, it remains difficult to distinguish background, or ‘passenger’, cancer mutations from causal, or ‘driver’, mutations in these data sets. Human viruses intrinsically depend on their host cell during the course of infection and can elicit pathological phenotypes similar to those arising from mutations. Here we test the hypothesis that genomic variations and tumour viruses may cause cancer through related mechanisms, by systematically examining host interactome and transcriptome network perturbations caused by DNA tumour virus proteins. The resulting integrated viral perturbation data reflects rewiring of the host cell networks, and highlights pathways, such as Notch signalling and apoptosis, that go awry in cancer. We show that systematic analyses of host targets of viral proteins can identify cancer genes with a success rate on a par with their identification through functional genomics and large-scale cataloguing of tumour mutations. Together, these complementary approaches increase the specificity of cancer gene identification. Combining systems-level studies of pathogen-encoded gene products with genomic approaches will facilitate the prioritization of cancer causing driver genes to advance the understanding of the genetic basis of human cancer.

A universal model for mobility and migration patterns

Albert-László Barabási

Nature 484, 96-100 (2012)

Reductionism, as a paradigm, is expired, and complexity, as a field, is tired. Data-based mathematical models of complex systems are offering a fresh perspective, rapidly developing into a new discipline: network science.

Sex differences in intimate relationships

V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabási, R. Dunbar

Scientific Reports 2:370, 105 (2012)

Social networks based on dyadic relationships are fundamentally important for understanding of human sociality. However, we have little understanding of the dynamics of close relationships and how these change over time. Evolutionary theory suggests that, even in monogamous mating systems, the pattern of investment in close relationships should vary across the lifespan when post-weaning investment plays an important role in maximizing fitness. Mobile phone data sets provide a unique window into the structure and dynamics of relationships. We here use data from a large mobile phone dataset to demonstrate striking sex differences in the gender-bias of preferred relationships that reflect the way the reproductive investment strategies of both sexes change across the lifespan, i.e. women’s shifting patterns of investment in reproduction and parental care. These results suggest that human social strategies may have more complex dynamics than previously assumed and a life-history perspective is crucial for understanding them.

Graph theory properties of cellular networks (Chapter 9)

B. Barzel, A. Sharma, A.-L. Barabási

Handbook of Systems Biology – Concepts and Insights (Academic Press, Elsevier), 177-193 (2013)

Flavor network and the principles of food pairing

Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási

Scientific Reports 1:196 (2011)

The cultural diversity of culinary practice, as illustrated by the variety of regional cuisines, raises the question of whether there are any general patterns that determine the ingredient combinations used in food today or principles that transcend individual tastes and recipes. We introduce a flavor network that captures the flavor compounds shared by culinary ingredients. Western cuisines show a tendency to use ingredient pairs that share many flavor compounds, supporting the so-called food pairing hypothesis. By contrast, East Asian cuisines tend to avoid compound sharing ingredients. Given the increasing availability of information on food preparation, our data-driven investigation opens new avenues towards a systematic understanding of culinary practice.

Systems biology and the future of medicine

J. Loscalzo, A.-L. Barabási

WIREs Systems Biology and Medicine 3, 619-627 (2011)

Contemporary views of human disease are based on simple correlation between clinical syndromes and pathological analysis dating from the late 19th century. Although this approach to disease diagnosis, prognosis, and treatment has served the medical establishment and society well for many years, it has serious shortcomings for the modern era of genomic medicine that stem from its reliance on reductionist principles of experimentation and analysis. Quantitative, holistic systems biology applied to human disease offers a unique approach for diagnosing established disease, defining disease predilection, and developing individualized (personalized) treatment strategies that can take full advantage of modern molecular pathobiology and the comprehensive data sets that are rapidly becoming available for populations and individuals. In this way, systems pathobiology offers the promise of redefining our approach to disease and the field of medicine.

Few inputs can reprogram biological networks (reply by Liu et al.)

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Nature 473, 167-173 (2011)

Reply to Franz-Josef Muller and Andreas Schuppert (Nature 478, Pg. E4, Oct. 2011)

Human Mobility, Social Ties, and Link Prediction

D. Wang, D. Pedreschi, C. Song, F. Giannotti, A.-L. Barabási

ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2011)

Our understanding of how individual mobility patterns shape and impact the social network is limited, but is essential for a deeper understanding of network dynamics and evolution. This question is largely unexplored, partly due to the difficulty in obtaining large-scale society-wide data that simultaneously capture the dynamical information on individual movements and social interactions. Here we address this challenge for the first time by tracking the trajectories and communication records of 6 million mobile phone users. We find that the similarity between two individuals' movements strongly correlates with their proximity in the social network. We further investigate how the predictive power hidden in such correlations can be exploited to address a challenging problem: which new links will develop in a social network. We show that mobility measures alone yield surprising predictive power, comparable to traditional network-based measures. Furthermore, the prediction accuracy can be significantly improved by learning a supervised classifier based on combined mobility and network measures. We believe our findings on the interplay of mobility patterns and social ties offer new perspectives on not only link prediction but also network dynamics.

Ranking stability and super-stable nodes in complex networks

G. Ghoshal, A.-L. Barabási

Nature Communications 2, 1-7 (2011)

PageRank, a network-based diffusion algorithm, has emerged as the leading method to rank web content, ecological species and even scientists. Despite its wide use, it remains unknown how the structure of the network on which it operates affects its performance. Here we show that for random networks the ranking provided by PageRank is sensitive to perturbations in the network topology, making it unreliable for incomplete or noisy systems. In contrast, in scale-free networks we predict analytically the emergence of super-stable nodes whose ranking is exceptionally stable to perturbations. We calculate the dependence of the number of super-stable nodes on network characteristics and demonstrate their presence in real networks, in agreement with the analytical predictions. These results not only deepen our understanding of the interplay between network topology and dynamical processes but also have implications in all areas where ranking has a role, from science to marketing.

Controllability of complex networks

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Nature 473, 167-173 (2011)

The ultimate proof of our understanding of natural or technological systems is reflected in our ability to control them. Although control theory offers mathematical tools for steering engineered and natural systems towards a desired state, a framework to control complex self-organized systems is lacking. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system’s entire dynamics. We apply these tools to several real networks, finding that the number of driver nodes is determined mainly by the network’s degree distribution. We show that sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes. Counterintuitively, we find that in both model and real systems the driver nodes tend to avoid the high-degree nodes.

Geographic Constraints on Social Network Groups

J. P. Onnela, S. Arbesman, M. C. González, A.-L. Barabási, N. A. Christakis

PLoS One 6:4, 1-7 (2011)

Social groups are fundamental building blocks of human societies. While our social interactions have always been constrained by geography, it has been impossible, due to practical difficulties, to evaluate the nature of this restriction on social group structure. We construct a social network of individuals whose most frequent geographical locations are also known. We also classify the individuals into groups according to a community detection algorithm. We study the variation of geographical span for social groups of varying sizes, and explore the relationship between topological positions and geographic positions of their members. We find that small social groups are geographically very tight, but become much more clumped when the group size exceeds about 30 members. Also, we find no correlation between the topological positions and geographic positions of individuals within network communities. These results suggest that spreading processes face distinct structural and spatial constraints.

Collective response of human populations to large-scale emergencies

J. P. Bagrow, D. Wang, A.-L. Barabási

PLoS One 6:3, 1-8 (2011)

Despite recent advances in uncovering the quantitative features of stationary human activity patterns, many applications, from pandemic prediction to emergency response, require an understanding of how these patterns change when the population encounters unfamiliar conditions. To explore societal response to external perturbations we identified real-time changes in communication and mobility patterns in the vicinity of eight emergencies, such as bomb attacks and earthquakes, comparing these with eight non-emergencies, like concerts and sporting events. We find that communication spikes accompanying emergencies are both spatially and temporally localized, but information about emergencies spreads globally, resulting in communication avalanches that engage in a significant manner the social network of eyewitnesses. These results offer a quantitative view of behavioral changes in human activity under extreme conditions, with potential long-term impact on emergency detection and response.

Interactome Networks and Human Disease

M. Vidal, M. E. Cusick, A.-L. Barabási

Cell 144, 986-995 (2011)

Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.

Small but slow world: How network topology and burstiness slow down spreading

M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki

Physical Review E 83, 1-4 (2011)

While communication networks show the small-world property of short paths, the spreading dynamics in them turns out slow. Here, the time evolution of information propagation is followed through communication networks by using empirical data on contact sequences and the susceptible-infected model. Introducing null models where event sequences are appropriately shuffled, we are able to distinguish between the contributions of different impeding effects. The slowing down of spreading is found to be caused mainly by weight-topology correlations and the bursty activity patterns of individuals.

Comparison of an expanded ataxia interactome with patient medical records reveals a relationship between macular degeneration and ataxia

J. J. Kahle, N. Gulbahce, C. A. Shaw, J. Lim, D. E. Hill, A.-L. Barabási, H. Y. Zoghbi

Human Molecular Genetics 20, 510-527 (2011)

Spinocerebellar ataxias 6 and 7 (SCA6 and SCA7) are neurodegenerative disorders caused by expansion of CAG repeats encoding polyglutamine (polyQ) tracts in CACNA1A, the alpha1A subunit of the P/Q-type calcium channel, and ataxin-7 (ATXN7), a component of a chromatin-remodeling complex, respectively. We hypothesized that finding new protein partners for ATXN7 and CACNA1A would provide insight into the biology of their respective diseases and their relationship to other ataxia-causing proteins. We identified 118 protein interactions for CACNA1A and ATXN7 linking them to other ataxia-causing proteins and the ataxia network. To begin to understand the biological relevance of these protein interactions within the ataxia network, we used OMIM to identify diseases associated with the expanded ataxia network. We then used Medicare patient records to determine if any of these diseases co-occur with hereditary ataxia. We found that patients with ataxia are at 3.03-fold greater risk of these diseases than Medicare patients overall. One of the diseases comorbid with ataxia is macular degeneration (MD). The ataxia network is significantly (P = 7.37 × 10⁻⁵) enriched for proteins that interact with known MD-causing proteins, forming an MD subnetwork. We found that at least two of the proteins in the MD subnetwork have altered expression in the retina of Ataxin-7(266Q/+) mice, suggesting an in vivo functional relationship with ATXN7. Together these data reveal novel protein interactions and suggest potential pathways that can contribute to the pathophysiology of ataxia, MD, and diseases comorbid with ataxia.

Network medicine: a network-based approach to human disease

A.-L. Barabási, N. Gulbahce, J. Loscalzo

Nature Reviews Genetics 12, 56-68 (2011)

Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular and intercellular network that links tissue and organ systems. The emerging tools of network medicine offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships among apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying new disease genes, for uncovering the biological significance of disease-associated mutations identified by genome-wide association studies and full-genome sequencing, and for identifying drug targets and biomarkers for complex diseases.

Information Spreading in Context

D. Wang, Z. Wen, H. Tong, C.-Y. Lin, C. Song, A.-L. Barabási

Proceedings of the 20th International World Wide Web Conference, 1-10 (2011)

Information spreading processes are central to human interactions. Despite recent studies in online domains, little is known about factors that could affect the dissemination of a single piece of information. In this paper, we address this challenge by combining two related but distinct datasets, collected from a large scale privacy-preserving distributed social sensor system. We find that the social and organizational context significantly impacts to whom and how fast people forward information. Yet the structures within spreading processes can be well captured by a simple stochastic branching model, indicating surprising independence of context. Our results build the foundation of future predictive models of information flow and provide significant insights towards design of communication platforms.

Modelling the scaling properties of human mobility

C. Song, Z. Qu, N. Blumm, A.-L. Barabási

Nature Physics 7, 713- (2010)

A range of applications, from predicting the spread of human and electronic viruses to city planning and resource management in mobile communications, depend on our ability to foresee the whereabouts and mobility of individuals, raising a fundamental question: To what degree is human behavior predictable? Here we explore the limits of predictability in human dynamics by studying the mobility patterns of anonymized mobile phone users. By measuring the entropy of each individual’s trajectory, we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.

Blueprint for antimicrobial hit discovery targeting metabolic networks

Y. Shen, L. Liu, G. Estiu, B. Isin, Y.-Y. Ahn, D.-S. Lee, A.-L. Barabási, V. Kapatral, O. Wiest, Z. N. Oltvai

Proceedings of the National Academy of Sciences of the United States of America 10.1073, 1-6 (2010)

Advances in genome analysis, network biology, and computational chemistry have the potential to revolutionize drug discovery by combining system-level identification of drug targets with the atomistic modeling of small molecules capable of modulating their activity. To demonstrate the effectiveness of such a discovery pipeline, we deduced common antibiotic targets in Escherichia coli and Staphylococcus aureus by identifying shared tissue-specific or uniformly essential metabolic reactions in their metabolic networks. We then predicted through virtual screening dozens of potential inhibitors for several enzymes of these reactions and showed experimentally that a subset of these inhibited both enzyme activities in vitro and bacterial cell viability. This blueprint is applicable for any sequenced organism with high-quality metabolic reconstruction and suggests a general strategy for strain-specific antiinfective therapy.

Cancer metastasis networks and the prediction of progression patterns

L. L. Chen, N. Blumm, N. A. Christakis, A.-L. Barabási, T. S. Deisboeck

British Journal of Cancer 101, 749-758 (2009)

Background: Metastasis patterns in cancer vary both spatially and temporally. Network modeling may allow the incorporation of the temporal dimension in the analysis of these patterns. Methods: We used Medicare claims of 2 265 167 elderly patients aged ≥65 years to study the large-scale clinical pattern of metastases. We introduce the concept of a cancer metastasis network, in which nodes represent the primary cancer site and the sites of subsequent metastases, connected by links that measure the strength of co-occurrence. Results: These cancer metastasis networks capture both temporal and subtle relational information, the dynamics of which differ between cancer types. Using these networks as entities on which the metastatic disease of individual patients may evolve, we show that they may be used, for certain cancer types, to make retrograde predictions of a primary cancer type given a sequence of metastases, as well as anterograde predictions of future sites of metastasis.

Comparative Genome-Scale Metabolic Reconstruction and Flux Balance Analysis of Multiple Staphylococcus aureus Genomes Identify Novel Antimicrobial Drug Targets

D.-S. Lee, H. Burd, J. Liu, E. Almaas, O. Wiest, A.-L. Barabási, Z. N. Oltvai, V. Kapatral

Journal of Bacteriology 191:12, 4015–4024 (2009)

Mortality due to multidrug-resistant Staphylococcus aureus infection is predicted to surpass that of human immunodeficiency virus/AIDS in the United States. Despite the various treatment options for S. aureus infections, it remains a major hospital- and community-acquired opportunistic pathogen. With the emergence of multidrug-resistant S. aureus strains, there is an urgent need for the discovery of new antimicrobial drug targets in the organism. To this end, we reconstructed the metabolic networks of multidrug-resistant S. aureus strains using genome annotation, functional-pathway analysis, and comparative genomic approaches, followed by flux balance analysis-based in silico single and double gene deletion experiments. We identified 70 single enzymes and 54 pairs of enzymes whose corresponding metabolic reactions are predicted to be unconditionally essential for growth. Of these, 44 single enzymes and 10 enzyme pairs proved to be common to all 13 S. aureus strains, including many that had not been previously identified as being essential for growth by gene deletion experiments in S. aureus. We thus conclude that metabolic reconstruction and in silico analyses of multiple strains of the same bacterial species provide a novel approach for potential antibiotic target identification.

Understanding the spreading patterns of mobile phone viruses

P. Wang, M. Gonzalez, C. A. Hidalgo, A.-L. Barabási

Science 324, 1071-1076 (2009)

We modeled the mobility of mobile phone users in order to study the fundamental spreading patterns that characterize a mobile virus outbreak. We find that although Bluetooth viruses can reach all susceptible handsets with time, they spread slowly because of human mobility, offering ample opportunities to deploy antiviral software. In contrast, viruses using multimedia messaging services could infect all users in hours, but currently a phase transition on the underlying call graph limits them to only a small fraction of the susceptible users. These results explain the lack of a major mobile virus breakout so far and predict that once a mobile operating system’s market share reaches the phase transition point, viruses will pose a serious threat to mobile communications.

A dynamic network approach for the study of human phenotypes

C. A. Hidalgo, N. Blumm, A.-L. Barabási, N. A. Christakis

PLoS Computational Biology 5:4, 1-11 (2009)

The use of networks to integrate different genetic, proteomic, and metabolic datasets has been proposed as a viable path toward elucidating the origins of specific diseases. Here we introduce a new phenotypic database summarizing correlations obtained from the disease history of more than 30 million patients in a Phenotypic Disease Network (PDN). We present evidence that the structure of the PDN is relevant to the understanding of illness progression by showing that (1) patients develop diseases close in the network to those they already have; (2) the progression of disease along the links of the network is different for patients of different genders and ethnicities; (3) patients diagnosed with diseases which are more highly connected in the PDN tend to die sooner than those affected by less connected diseases; and (4) diseases that tend to be preceded by others in the PDN tend to be more connected than diseases that precede other illnesses, and are associated with higher degrees of mortality. Our findings show that disease progression can be represented and studied using network methods, offering the potential to enhance our understanding of the origin and evolution of human diseases. The dataset introduced here, released concurrently with this publication, represents the largest relational phenotypic resource publicly available to the research community.

Scale-Free Networks: A Decade and Beyond

A.-L. Barabási

Science 325, 412-413 (2009)

For decades, we tacitly assumed that the components of such complex systems as the cell, the society, or the Internet are randomly wired together. In the past decade, an avalanche of research has shown that many real networks, independent of their age, function, and scope, converge to similar architectures, a universality that allowed researchers from different disciplines to embrace network theory as a common paradigm. The decade-old discovery of scale-free networks was one of those events that had helped catalyze the emergence of network science, a new research field with its distinct set of challenges and accomplishments.

The impact of cellular networks on disease comorbidity

J. Park, D. S. Lee, N. A. Christakis, A.-L. Barabási

Molecular Systems Biology 5:262, 1-7 (2009)

The impact of disease-causing defects is often not limited to the products of a mutated gene but, thanks to interactions between the molecular components, may also affect other cellular functions, resulting in potential comorbidity effects. By combining information on cellular interactions, disease-gene associations, and population-level disease patterns extracted from Medicare data, we find statistically significant correlations between the underlying structure of cellular networks and disease comorbidity patterns in the human population. Our results indicate that such a combination of population-level data and cellular network information could help build novel hypotheses about disease mechanisms.

Computational Social Science

D. Lazer, A. Pentland, L. Adamic, S. Aral, A.-L. Barabási, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, T. Jebara, G. King, M. Macy, D. Roy, M. Van Alstyne

Science 323, 721-724 (2009)

We live life in the network. We check our e-mails regularly, make mobile phone calls from almost any location, swipe transit cards to use public transportation, and make purchases with credit cards. Our movements in public places may be captured by video cameras, and our medical records stored as digital files. We may post blog entries accessible to anyone, or maintain friendships through online social networks. Each of these transactions leaves digital traces that can be compiled into comprehensive pictures of both individual and group behavior, with the potential to transform our understanding of our lives, organizations, and societies.

Time to CARE: a collaborative engine for practical disease prediction

D. A. Davis, N. V. Chawla, N. A. Christakis, A.-L. Barabási

Data Mining and Knowledge Discovery 30:3, 388-41 (2009)

The monumental cost of health care, especially for chronic disease treatment, is quickly becoming unmanageable. This crisis has motivated the drive towards preventative medicine, where the primary concern is recognizing disease risk and taking action at the earliest signs. However, universal testing is neither time nor cost efficient. We propose CARE, a Collaborative Assessment and Recommendation Engine, which relies only on a patient’s medical history using ICD-9-CM codes in order to predict future disease risks. CARE uses collaborative filtering methods to predict each patient’s greatest disease risks based on their own medical history and that of similar patients. We also describe an iterative version, ICARE, which incorporates ensemble concepts for improved performance. Also, we apply time-sensitive modifications which make the CARE framework practical for realistic long-term use. These novel systems require no specialized information and provide predictions for medical conditions of all kinds in a single run. We present experimental results on a large Medicare dataset, demonstrating that CARE and ICARE perform well at capturing future disease risks.

An empirical framework for binary interactome mapping

Kavitha Venkatesan, Jean-François Rual, Alexei Vazquez, Ulrich Stelzl, Irma Lemmens, Tomoko Hirozane-Kishikawa, Tong Hao, Martina Zenkner, Xiaofeng Xin, Kwang-Il Goh, Muhammed A Yildirim, Nicolas Simonis, Kathrin Heinzmann, Fana Gebreab, Julie M Sahalie, Sebiha Cevik, Christophe Simon, Anne-Sophie de Smet, Elizabeth Dann, Alex Smolyar, Arunachalam Vinayagam, Haiyuan Yu, David Szeto, Heather Borick, Amélie Dricot, Niels Klitgord, Ryan R Murray, Chenwei Lin, Maciej Lalowski, Jan Timm, Kirstin Rau, Charles Boone, Pascal Braun, Michael E Cusick, Frederick P Roth, David E Hill, Jan Tavernier, Erich E Wanker, Albert-László Barabási & Marc Vidal

Nature Methods 6, 83-90 (2009)

Several attempts have been made to systematically map protein-protein interaction, or 'interactome', networks. However, it remains difficult to assess the quality and coverage of existing data sets. Here we describe a framework that uses an empirically-based approach to rigorously dissect quality parameters of currently available human interactome maps. Our results indicate that high-throughput yeast two-hybrid (HT-Y2H) interactions for human proteins are more precise than literature-curated interactions supported by a single publication, suggesting that HT-Y2H is suitable to map a significant portion of the human interactome. We estimate that the human interactome contains ∼130,000 binary interactions, most of which remain to be mapped. Similar to estimates of DNA sequence data quality and genome size early in the Human Genome Project, estimates of protein interaction data quality and interactome size are crucial to establish the magnitude of the task of comprehensive human interactome mapping and to elucidate a path toward this goal.

Impact of Limited Solvent Capacity on Metabolic Rate, Enzyme Activities, and Metabolite Concentrations of S. cerevisiae Glycolysis

A. Vazquez, M. A. de Menezes, A.-L. Barabási, Z. N. Oltvai

PLoS Computational Biology 4:10, 1-6 (2008)

The cell’s cytoplasm is crowded by its various molecular components, resulting in a limited solvent capacity for the allocation of new proteins, thus constraining various cellular processes such as metabolism. Here we study the impact of the limited solvent capacity constraint on the metabolic rate, enzyme activities, and metabolite concentrations using a computational model of Saccharomyces cerevisiae glycolysis as a case study. We show that given the limited solvent capacity constraint, the optimal enzyme activities and the metabolite concentrations necessary to achieve a maximum rate of glycolysis are in agreement with their experimentally measured values. Furthermore, the predicted maximum glycolytic rate determined by the solvent capacity constraint is close to that measured in vivo. These results indicate that the limited solvent capacity is a relevant constraint acting on S. cerevisiae at physiological growth conditions, and that a full kinetic model together with the limited solvent capacity constraint can be used to predict both metabolite concentrations and enzyme activities in vivo.

High-Quality Binary Protein Interaction Map of the Yeast Interactome Network

H. Yu, P. Braun, M. A. Yildirim, I. Lemmens, K. Venkatesan, J. Sahalie, T. Hirozane-Kishikawa, F. Gebreab, N. Li, N. Simonis, T. Hao, J.-F. Rual, A. Dricot, A. Vazquez, R. R. Murray, C. Simon, L. Tardivo, S. Tam, N. Svrzikapa, C. Fan, A.-S. de Smet, A. Motyl, M. E. Hudson, J. Park, X. Xin, M. E. Cusick, T. Moore, C. Boone, M. Snyder, F. P. Roth, A.-L. Barabási, J. Tavernier, D. E. Hill, M. Vidal

Science 322, 104-110 (2008)

Current yeast interactome network maps contain several hundred molecular complexes with limited and somewhat controversial representation of direct binary interactions. We carried out a comparative quality assessment of current yeast interactome data sets, demonstrating that high-throughput yeast two-hybrid (Y2H) screening provides high-quality binary interaction information. Because a large fraction of the yeast binary interactome remains to be mapped, we developed an empirically controlled mapping framework to produce a "second-generation" high-quality, high-throughput Y2H data set covering ~20% of all yeast binary interactions. Both Y2H and affinity purification followed by mass spectrometry (AP/MS) data are of equally high quality but of a fundamentally different and complementary nature, resulting in networks with different topological and biological properties. Compared to co-complex interactome models, this binary map is enriched for transient signaling interactions and intercomplex connections with a highly significant clustering between essential proteins. Rather than correlating with essentiality, protein connectivity correlates with genetic pleiotropy.

The implications of human metabolic network topology for disease comorbidity

D.-S. Lee, J. Park, K. A. Kay, N. A. Christakis, Z. N. Oltvai, A.-L. Barabási

Proceedings of the National Academy of Sciences 105, 9880-9885 (2008)

Most diseases are the consequence of the breakdown of cellular processes, but the relationships among genetic/epigenetic defects, the molecular interaction networks underlying them, and the disease phenotypes remain poorly understood. To gain insights into such relationships, here we constructed a bipartite human disease association network in which nodes are diseases and two diseases are linked if mutated enzymes associated with them catalyze adjacent metabolic reactions. We find that connected disease pairs display higher correlated reaction flux rate, corresponding enzyme-encoding gene coexpression, and higher comorbidity than those that have no metabolic link between them. Furthermore, the more connected a disease is to other diseases, the higher is its prevalence and associated mortality rate. The network topology-based approach also helps to

Understanding individual human mobility patterns

M. C. González, C. A. Hidalgo, A.-L. Barabási

Nature 453, 779-782 (2008)

Despite their importance for urban planning, traffic forecasting and the spread of biological and mobile viruses, our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Lévy flight and random walk models, human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modeling.

Uncovering individual and collective human dynamics from mobile phone records

J. Candia, M. C. Gonzalez, P. Wang, T. Schoenharl, G. Madey, A.-L. Barabási

Journal of Physics A: Mathematical and Theoretical 41, 1-11 (2008)

Novel aspects of human dynamics and social interactions are investigated by means of mobile phone data. Using extensive phone records resolved in both time and space, we study the mean collective behavior at large scales and focus on the occurrence of anomalous events. We discuss how these spatiotemporal anomalies can be described using standard percolation theory tools. We also investigate patterns of calling activity at the individual level and show that the interevent time of consecutive calls is heavy-tailed. This finding, which has implications for dynamics of spreading phenomena in social networks, agrees with results previously reported on other human activities.

Burstiness and memory in complex systems

K.-I. Goh, A.-L. Barabási

Europhysics Letters 81, 48002 (2008)

The dynamics of a wide range of real systems, from email patterns to earthquakes, display a bursty, intermittent nature, characterized by short timeframes of intense activity followed by long times of no or reduced activity. The understanding of the origin of such bursty patterns is hindered by the lack of tools to compare different systems using a common framework. Here we propose to characterize the bursty nature of real signals using orthogonal measures quantifying two distinct mechanisms leading to burstiness: the interevent time distribution and the memory. We find that while the burstiness of natural phenomena is rooted in both the interevent time distribution and memory, for human dynamics memory is weak, and the bursty character is due to the changes in the interevent time distribution. Finally, we show that current models lack in their ability to reproduce the activity pattern observed in real systems, opening up avenues for future work.
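
The abstract names two measures, one based on the interevent time distribution and one on memory, without giving formulas. A minimal sketch of one plausible formulation follows. It assumes a burstiness coefficient B = (σ − μ)/(σ + μ) computed from the mean μ and standard deviation σ of the interevent times, and a memory coefficient M taken as the correlation between consecutive interevent times. Both definitions and the function names are assumptions made for illustration, not taken from the text above.

```python
from statistics import mean, pstdev


def burstiness(intervals):
    """Assumed burstiness coefficient B = (sigma - mu) / (sigma + mu).

    B is -1 for a perfectly regular signal and approaches +1 when the
    spread of interevent times dominates their mean.
    """
    mu = mean(intervals)
    sigma = pstdev(intervals)
    return (sigma - mu) / (sigma + mu)


def memory(intervals):
    """Assumed memory coefficient: Pearson correlation between each
    interevent time and the one that follows it.

    Undefined (division by zero) when either shifted series is constant.
    """
    first, second = intervals[:-1], intervals[1:]
    m1, m2 = mean(first), mean(second)
    s1, s2 = pstdev(first), pstdev(second)
    cov = mean((a - m1) * (b - m2) for a, b in zip(first, second))
    return cov / (s1 * s2)


if __name__ == "__main__":
    regular = [2, 2, 2, 2]
    alternating = [1, 3, 1, 3, 1]
    print(burstiness(regular))     # perfectly periodic signal
    print(memory(alternating))     # short always follows long
```

Keeping the two quantities separate mirrors the abstract's point that a signal can be bursty because of its interevent time distribution, its memory, or both.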

Predicting synthetic rescues in metabolic networks

A. Motter, N. Gulbahce, E. Almaas, A.-L. Barabási

Molecular Systems Biology 4:168, 1-10 (2008)

An important goal of medical research is to develop methods to recover the loss of cellular function due to mutations and other defects. Many approaches based on gene therapy aim to repair the defective gene or to insert genes with compensatory function. Here, we propose an alternative, network-based strategy that aims to restore biological function by forcing the cell to either bypass the functions affected by the defective gene, or to compensate for the lost function. Focusing on the metabolism of single-cell organisms, we computationally study mutants that lack an essential enzyme, and thus are unable to grow or have a significantly reduced growth rate. We show that several of these mutants can be turned into viable organisms through additional gene deletions that restore their growth rate. In a rather counterintuitive fashion, this is achieved via additional damage to the metabolic network. Using flux balance-based approaches, we identify a number of synthetically viable gene pairs, in which the removal of one enzyme-encoding gene results in a non-viable phenotype, while the deletion of a second enzyme-encoding gene rescues the organism. The systematic network-based identification of compensatory rescue effects may open new avenues for genetic interventions.

Impact of the solvent capacity constraint on E. coli metabolism

A. Vazquez, Q. K. Beg, M. A. de Menezes, J. Ernst, Z. Bar-Joseph, A.-L. Barabási, L. G. Boros, Z. N. Oltvai

BMC Systems Biology 2:7, 1-10 (2008)

Obtaining quantitative predictions for cellular metabolic activities requires the identification and modeling of the physicochemical constraints that are relevant at physiological growth conditions. Molecular crowding in a cell's cytoplasm is one such potential constraint, as it limits the solvent capacity available to metabolic enzymes.

Drug-target network

M. A. Yildirim, K.-I. Goh, M. E. Cusick, A.-L. Barabási, M. Vidal

Nature Biotechnology 25:10, 1119-1126 (2007)

The global set of relationships between protein targets of all drugs and all disease-gene products in the human protein–protein interaction or ‘interactome’ network remains uncharacterized. We built a bipartite graph composed of US Food and Drug Administration–approved drugs and proteins linked by drug–target binary associations. The resulting network connects most drugs into a highly interlinked giant component, with strong local clustering of drugs of similar types according to Anatomical Therapeutic Chemical classification. Topological analyses of this network quantitatively showed an overabundance of ‘follow-on’ drugs, that is, drugs that target already targeted proteins. By including drugs currently under investigation, we identified a trend toward more functionally diverse targets improving polypharmacology. To analyze the relationships between drug targets and disease-gene products, we measured the shortest distance between both sets of proteins in current models of the human interactome network. Significant differences in distance were found between etiological and palliative drugs. A recent trend toward more rational drug design was observed.

The architecture of complexity

A.-L. Barabási

IEEE Control Systems Magazine 27:4, 33-42 (2007)

We are surrounded by complex systems, from cells made of thousands of molecules to society, a collection of billions of interacting individuals. These systems display signatures of order and self-organization. Understanding and quantifying this complexity is a grand challenge for science. Kinetic theory, developed at the end of the 19th century, shows that the measurable properties of gases, from pressure to temperature, can be reduced to the random motion of atoms and molecules. In the 1960s and 1970s, researchers developed systematic approaches to quantifying the transition from disorder to order in material systems such as magnets and liquids. Chaos theory dominated the quest to understand complex behavior in the 1980s with the message that unpredictable behavior can emerge from the nonlinear interactions of a few components. The 1990s was the decade of fractals, quantifying the geometry of patterns emerging in self-organized systems, from leaves to snowflakes.

Intracellular crowding defines the mode and sequence of substrate uptake by Escherichia coli and constrains its metabolic activity

Q. K. Beg, A. Vazquez, J. Ernst, M. A. de Menezes, Z. Bar-Joseph, A.-L. Barabási, Z. N. Oltvai

Proceedings of the National Academy of Sciences 104, 31 (2007)

The influence of the high intracellular concentration of macromolecules on cell physiology is increasingly appreciated, but its impact on system-level cellular functions remains poorly quantified. To assess its potential effect, here we develop a flux balance model of Escherichia coli cell metabolism that takes into account a systems-level constraint for the concentration of enzymes catalyzing the various metabolic reactions in the crowded cytoplasm. We demonstrate that the model’s predictions for the relative maximum growth rate of wild-type and mutant E. coli cells in single substrate-limited media, and the sequence and mode of substrate uptake and utilization from a complex medium are in good agreement with subsequent experimental observations. These results suggest that molecular crowding represents a bound on the achievable functional states of a metabolic network, and they indicate that models incorporating this constraint can systematically identify alterations in cellular metabolism activated in response to environmental change.

The product space conditions the development of nations

C.A. Hidalgo, R. B. Klinger, A.-L. Barabási, R. Hausmann

Science 317, 482 (2007)

Economies grow by upgrading the products they produce and export. The technology, capital, institutions, and skills needed to make newer products are more easily adapted from some products than from others. Here, we study this network of relatedness between products, or “product space,” finding that more-sophisticated products are located in a densely connected core whereas less-sophisticated products occupy a less-connected periphery. Empirically, countries move through the product space by developing goods close to those they currently produce. Most countries can reach the core only by traversing empirically infrequent distances, which may help explain why poor countries have trouble developing more competitive exports and fail to converge to the income levels of rich countries.

Network Medicine — From Obesity to the “Diseasome”

A.-L. Barabási

New England Journal of Medicine 357, 404-407 (2007)

Analysis of a large-scale weighted network of one-to-one human communication

J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, M. A. de Menezes, K. Kaski, A.-L. Barabási, J. Kertész

New Journal of Physics 9, 1-27 (2007)

We construct a connected network of 3.9 million nodes from mobile phone call records, which can be regarded as a proxy for the underlying human communication network at the societal level. We assign two weights on each edge to reflect the strength of social interaction, which are the aggregate call duration and the cumulative number of calls placed between the individuals over a period of 18 weeks. We present a detailed analysis of this weighted network by examining its degree, strength, and weight distributions, as well as its topological assortativity and weighted assortativity, clustering and weighted clustering, together with correlations between these quantities. We give an account of motif intensity and coherence distributions and compare them to a randomized reference system. We also use the concept of link overlap to measure the number of common neighbours any two adjacent nodes have, which serves as a useful local measure for identifying the interconnectedness of communities. We report a positive correlation between the overlap and weight of a link, thus providing strong quantitative evidence for the weak ties hypothesis, a central concept in social network analysis. The percolation properties of the network are found to depend on the type and order of removed links, and they can help understand how the local structure of the network manifests itself at the global level. We hope that our results will contribute to modelling weighted large-scale social networks, and believe that the systematic approach followed here can be adopted to study other weighted networks.
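
The link overlap mentioned above can be illustrated with a short sketch. The abstract only says that overlap measures the number of common neighbours of two adjacent nodes. The normalization used here (shared neighbours divided by all other neighbours of either endpoint) is an assumption, as are the function name and the toy graph.

```python
def link_overlap(adj, i, j):
    """Overlap of the edge (i, j) in an undirected graph.

    adj maps each node to the set of its neighbours. The value is the
    number of neighbours shared by i and j divided by the number of
    distinct neighbours of i or j, not counting i and j themselves
    (an assumed normalization).
    """
    if j not in adj[i]:
        raise ValueError("overlap is defined only for adjacent nodes")
    common = len(adj[i] & adj[j])
    # Each endpoint's degree includes the other endpoint, hence the -1 terms.
    total = (len(adj[i]) - 1) + (len(adj[j]) - 1) - common
    return common / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy graph: a triangle a-b-c with a tail a-d-e.
    adj = {
        "a": {"b", "c", "d"},
        "b": {"a", "c"},
        "c": {"a", "b"},
        "d": {"a", "e"},
        "e": {"d"},
    }
    print(link_overlap(adj, "b", "c"))  # edge inside the triangle
    print(link_overlap(adj, "a", "d"))  # bridge to the tail
```

In this toy graph the edge inside the triangle has a high overlap while the bridge to the tail has none, which is the kind of contrast the abstract uses to identify links inside and between communities.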

The human disease network

K.-I. Goh, M. E. Cusick, D. Valle, B. Childs, M. Vidal, A.-L. Barabási

Proceedings of the National Academy of Sciences 104:21, 8685 (2007)

A network of disorders and disease genes linked by known disorder–gene associations offers a platform to explore in a single graph-theoretic framework all known phenotype and disease gene associations, indicating the common genetic origin of many diseases. Genes associated with similar disorders show both higher likelihood of physical interactions between their products and higher expression profiling similarity for their transcripts, supporting the existence of distinct disease-specific functional modules. We find that essential human genes are likely to encode hub proteins and are expressed widely in most tissues. This suggests that disease genes also would play a central role in the human interactome. In contrast, we find that the vast majority of disease genes are nonessential and show no tendency to encode hub proteins, and their expression pattern indicates that they are localized in the functional periphery of the network. A selection-based model explains the observed difference between essential and disease genes and also suggests that diseases caused by somatic mutations should not be peripheral, a prediction we confirm for cancer genes.

Genome-scale analysis of in vivo spatiotemporal promoter activity in Caenorhabditis elegans

Denis Dupuy, Nicolas Bertin, César A Hidalgo, Kavitha Venkatesan, Domena Tu, David Lee, Jennifer Rosenberg, Nenad Svrzikapa, Aurélie Blanc, Alain Carnec, Anne-Ruxandra Carvunis, Rock Pulak, Jane Shingles, John Reece-Hoyes, Rebecca Hunt-Newbury, Ryan Viveiros, William A Mohler, Murat Tasan, Frederick P Roth, Christian Le Peuch, Ian A Hope, Robert Johnsen, Donald G Moerman, Albert-László Barabási, David Baillie & Marc Vidal

Differential regulation of gene expression is essential for cell fate specification in metazoans. Characterizing the transcriptional activity of gene promoters, in time and in space, is therefore a critical step toward understanding complex biological systems. Here we present an in vivo spatiotemporal analysis for ∼900 predicted C. elegans promoters (∼5% of the predicted protein-coding genes), each driving the expression of green fluorescent protein (GFP). Using a flow-cytometer adapted for nematode profiling, we generated 'chronograms', two-dimensional representations of fluorescence intensity along the body axis and throughout development from early larvae to adults. Automated comparison and clustering of the obtained in vivo expression patterns show that genes coexpressed in space and time tend to belong to common functional categories. Moreover, integration of this data set with C. elegans protein-protein interactome data sets enables prediction of anatomical and temporal interaction territories between protein partners.

Structure and tie strengths in mobile communication networks

J.-P. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski, J. Kertesz, A.-L. Barabási

Proceedings of the National Academy of Sciences 104 (18), 7332-7336 (2007)

Electronic databases, from phone to e-mail logs, currently provide detailed records of human communication patterns, offering novel avenues to map and explore the structure of social and communication networks. Here we examine the communication patterns of millions of mobile phone users, allowing us to simultaneously study the local and the global structure of a society-wide communication network. We observe a coupling between interaction strengths and the network's local structure, with the counterintuitive consequence that social networks are robust to the removal of the strong ties but fall apart after a phase transition if the weak ties are removed. We show that this coupling significantly slows the diffusion process, resulting in dynamic trapping of information in communities and find that, when it comes to information diffusion, weak and strong ties are both simultaneously ineffective.

Impact of non-Poissonian activity patterns on spreading processes

A. Vazquez, B. Rácz, A. Lukács, A.-L. Barabási

Physical Review Letters 98:15, 158702 (2007)

Halting a computer or biological virus outbreak requires a detailed understanding of the timing of the interactions between susceptible and infected individuals. While current spreading models assume that users interact uniformly in time, following a Poisson process, a series of recent measurements indicates that the intercontact time distribution is heavy tailed, corresponding to a temporally inhomogeneous bursty contact process. Here we show that the non-Poisson nature of the contact dynamics results in prevalence decay times significantly larger than predicted by the standard Poisson process based models. Our predictions are in agreement with the detailed time resolved prevalence data of computer viruses, which, according to virus bulletins, show a decay time close to a year, in contrast with the 1 day decay predicted by the standard Poisson process based models.

Quantifying social group evolution

G. Palla, A.-L. Barabási, T. Vicsek

Nature 446:7136, 664-667 (2007)

Our focus is on networks capturing the collaboration between scientists and the calls between mobile phone users. We find that large groups persist for longer if they are capable of dynamically altering their membership, suggesting that an ability to change the group composition results in better adaptability. The behaviour of small groups displays the opposite tendency—the condition for stability is that their composition remains unchanged. We also show that knowledge of the time commitment of members to a given community can be used for estimating the community’s lifetime. These findings offer insight into the fundamental differences between the dynamics of small groups and large institutions.

Complex networks - From data to models

M. C. Gonzalez, A.-L. Barabási

Nature Physics 3, 224-225 (2007)

Data on the movement of people becomes ever more detailed, but robust models explaining the observed patterns are still needed. Mapping the problem onto a 'network of networks' could be a promising approach.

Distribution of node characteristics in complex networks

J. Park, A.-L. Barabási

Proceedings of the National Academy of Sciences 104, 17916-17920 (2007)

Our enhanced ability to map the structure of various complex networks is increasingly accompanied by the possibility of independently identifying the functional characteristics of each node. Although this led to the observation that nodes with similar characteristics have a tendency to link to each other, in general we lack the tools to quantify the interplay between node properties and the structure of the underlying network. Here we show that when nodes in a network belong to two distinct classes, two independent parameters are needed to capture the detailed interplay between the network structure and node properties. We find that the network structure significantly limits the values of these parameters, requiring a phase diagram to uniquely characterize the configurations available to the system. The phase diagram shows a remarkable independence from the network size, a finding that, together with a proposed heuristic algorithm, allows us to determine its shape even for large networks. To test the usefulness of the developed methods, we apply them to biological and socioeconomic systems, finding that protein functions and mobile phone usage occupy distinct regions of the phase diagram, indicating that the proposed parameters have a strong discriminating power.

Human disease classification in the postgenomic era: A complex systems approach to human pathobiology

J. Loscalzo, I. Kohane, A.-L. Barabási

Molecular Systems Biology 3:124, 1-11 (2007)

Contemporary classification of human disease derives from observational correlation between pathological analysis and clinical syndromes. Characterizing disease in this way established a nosology that has served clinicians well to the current time, and depends on observational skills and simple laboratory tools to define the syndromic phenotype. Yet, this time-honored diagnostic strategy has significant shortcomings that reflect both a lack of sensitivity in identifying preclinical disease, and a lack of specificity in defining disease unequivocally. In this paper, we focus on the latter limitation, viewing it as a reflection both of the different clinical presentations of many diseases (variable phenotypic expression), and of the excessive reliance on Cartesian reductionism in establishing diagnoses. The purpose of this perspective is to provide a logical basis for a new approach to classifying human disease that uses.

Transcription factor modularity in a Gene-Centered C. elegans Protein-DNA interaction network

V. Vermeirssen, M. Inmaculada Barrasa, C. Hidalgo, J.-A. B. Babon, R. Sequerra, L. Doucette-Stamm, A.-L. Barabási, A. J.M. Walhout

Genome Research 17, 1061-1071 (2007)

Transcription regulatory networks play a pivotal role in the development, function, and pathology of metazoan organisms. Such networks are comprised of protein–DNA interactions between transcription factors (TFs) and their target genes. An important question pertains to how the architecture of such networks relates to network functionality. Here, we show that a Caenorhabditis elegans core neuronal protein–DNA interaction network is organized into two TF modules. These modules contain TFs that bind to a relatively small number of target genes and are more systems specific than the TF hubs that connect the modules. Each module relates to different functional aspects of the network. One module contains TFs involved in reproduction and target genes that are expressed in neurons as well as in other tissues. The second module is enriched for paired homeodomain TFs and connects to target genes.

Dynamics of information access on the web

Z. Dezso, E. Almaas, A. Lukacs, B. Racz, I. Szakadat, A.-L. Barabási

Physical Review E 73, 066132 (2006)

While current studies on complex networks focus on systems that change relatively slowly in time, the structure of the most visited regions of the web is altered at the time scale from hours to days. Here we investigate the dynamics of visitation of a major news portal, representing the prototype for such a rapidly evolving network. The nodes of the network can be classified into stable nodes, which form the time-independent skeleton of the portal, and news documents. The visitations of the two node classes are markedly different, the skeleton acquiring visits at a constant rate, while a news document’s visitation peaks after a few hours. We find that the visitation pattern of a news document decays as a power law, in contrast with the exponential prediction provided by simple models of site visitation. This is rooted in the inhomogeneous nature of the browsing pattern characterizing individual users: the time interval between consecutive visits by the same user to the site follows a power-law distribution, in contrast to the exponential expected for Poisson processes. We show that the exponent characterizing the individual user’s browsing patterns determines the power-law decay in a document’s visitation. Finally, our results document the fleeting quality of news and events: while fifteen minutes of fame is still an exaggeration in the online media, we find that access to most news items significantly decays after 36 hours of posting.

WIPER: the integrated wireless phone based emergency response system

G. Madey, G. Szabo, A.-L. Barabási

Lecture Notes in Computer Science 3993, 417-424 (2006)

We describe a prototype emergency response system. This dynamic data driven application system (DDDAS) uses wireless call data, including call volume, who calls whom, call duration, services in use, and cell phone location information. Since all cell phones (that are powered on) maintain contact with one or more local cell towers, location data about each phone is updated periodically and available throughout the cellular phone network. This permits the cell phones of a city to serve as an ad hoc mobile sensor net, measuring the movement and calling patterns of the population. Social network theory and statistical analysis on normal call activity and call locations establish a baseline. A detection and alert system monitors streaming summary cell phone call data. Abnormal call patterns or population movements trigger a simulation and prediction system. Hypotheses about the anomaly are generated by a rule-based system, each initiating an agent-based simulation. Automated dynamic validation of the simulations against incoming streaming data is used to test each hypothesis. A validated simulation is used to predict the evolution of the anomaly and made available to an emergency response decision support system.

A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration

J. Lim, T. Hao, C. Shaw, A.J. Patel, G. Szabo, J.F. Rual, C.J. Fisk, N. Li, A. Smolyar, D.E. Hill, A.-L. Barabási, M. Vidal, H.Y. Zoghbi

Cell 125, 801-814 (2006)

Many human inherited neurodegenerative disorders are characterized by loss of balance due to cerebellar Purkinje cell (PC) degeneration. Although the disease-causing mutations have been identified for a number of these disorders, the normal functions of the proteins involved remain, in many cases, unknown. To gain insight into the function of proteins involved in PC degeneration, we developed an interaction network for 54 proteins involved in 23 inherited ataxias and expanded the network by incorporating literature-curated and evolutionarily conserved interactions. We identified 770 mostly novel protein–protein interactions using a stringent yeast two-hybrid screen; of 75 pairs tested, 83% of the interactions were verified in mammalian cells. Many ataxia-causing proteins share interacting partners, a subset of which have been found to modify neurodegeneration in animal models. This interactome thus provides a tool for understanding pathogenic mechanisms common for this class of neurodegenerative disorders and for identifying candidate genes for inherited ataxias.

Modeling bursts and heavy tails in human dynamics

A. Vazquez, J.G. Oliveira, Z. Dezso, K.I. Goh, I. Kondor, A.-L. Barabási

Physical Review E 73, 036127 (2006)

The dynamics of many social, technological and economic phenomena are driven by individual human actions, turning the quantitative understanding of human behavior into a central question of modern science. Current models of human dynamics, used from risk assessment to communications, assume that human actions are randomly distributed in time and thus well approximated by Poisson processes. Here we provide direct evidence that for five human activity patterns, such as email and letter based communications, web browsing, library visits and stock trading, the timing of individual human actions follows non-Poisson statistics, characterized by bursts of rapidly occurring events separated by long periods of inactivity. We show that the bursty nature of human behavior is a consequence of a decision based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, most tasks being rapidly executed, while a few experience very long waiting times. In contrast, priority blind execution is well approximated by uniform interevent statistics. We discuss two queuing models that capture human activity. The first model assumes that there are no limitations on the number of tasks an individual can handle at any time, predicting that the waiting time of the individual tasks follows a heavy tailed distribution $P(\tau_w) \sim \tau_w^{-\alpha}$ with $\alpha = 3/2$. The second model imposes limitations on the queue length, resulting in a heavy tailed waiting time distribution characterized by $\alpha = 1$. We provide empirical evidence supporting the relevance of these two models to human activity patterns, showing that while emails, web browsing and library visitation display $\alpha = 1$, the surface mail based communication belongs to the $\alpha = 3/2$ universality class. Finally, we discuss possible extension of the proposed queuing models and outline some future challenges in exploring the statistical mechanics of human dynamics.
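
The decision based queuing process with a limited queue length can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: priorities are drawn uniformly from [0, 1), the queue has a fixed length, and at each step the highest-priority task is executed with probability p (otherwise a random task is chosen). The names `simulate_priority_queue`, `queue_len` and `p` are hypothetical.

```python
import random


def simulate_priority_queue(steps, queue_len=2, p=0.999, seed=0):
    """Return the waiting time (in steps) of every executed task.

    Assumed rules: the queue always holds queue_len tasks; each step one
    task is executed and immediately replaced by a new task with a fresh
    uniform priority. With probability p the highest-priority task is
    executed, otherwise a task is picked uniformly at random.
    """
    rng = random.Random(seed)
    # Each task is stored as (priority, step at which it joined the queue).
    queue = [(rng.random(), 0) for _ in range(queue_len)]
    waits = []
    for t in range(1, steps + 1):
        if rng.random() < p:
            idx = max(range(queue_len), key=lambda k: queue[k][0])
        else:
            idx = rng.randrange(queue_len)
        _, arrived = queue[idx]
        waits.append(t - arrived)
        queue[idx] = (rng.random(), t)
    return waits


if __name__ == "__main__":
    waits = simulate_priority_queue(steps=10_000)
    immediate = sum(1 for w in waits if w == 1) / len(waits)
    print(f"executed after one step: {immediate:.3f}")
    print(f"longest wait: {max(waits)} steps")
```

Setting p close to 1 gives priority-driven execution, where most tasks leave the queue immediately and a few low-priority tasks wait for a long time. Setting p = 0 gives the priority blind case for comparison.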

Stable evolutionary signal in a Yeast protein interaction network

S. Wuchty, A.-L. Barabási, M.T. Ferdig

BMC Evolutionary Biology 6, 8 (2006)

Background: The recently emerged protein interaction network paradigm can provide novel and important insights into the inner workings of a cell. Yet, the heavy burden of both false positive and false negative protein-protein interaction data casts doubt on the broader usefulness of these interaction sets. Approaches focusing on one-protein-at-a-time have been powerfully employed to demonstrate the high degree of conservation of proteins participating in numerous interactions; here, we expand this 'node' focused paradigm to investigate the relative persistence of 'link' based evolutionary signals in a protein interaction network of S. cerevisiae and point out the value of this relatively untapped source of information. Results: The trend for highly connected proteins to be preferably conserved in evolution is stable, even in the context of tremendous noise in the underlying protein interactions as well as in the assignment of orthology among five higher eukaryotes. We find that local clustering around interactions correlates with preferred evolutionary conservation of the participating proteins; furthermore the correlation between high local clustering and evolutionary conservation is accompanied by a stable elevated degree of coexpression of the interacting proteins. We use this conserved interaction data, combined with P. falciparum/Yeast orthologs, as proof-of-principle that high-order network topology can be used comparatively to deduce local network structure in nonmodel organisms. Conclusion: High local clustering is a criterion for the reliability of an interaction and coincides with preferred evolutionary conservation and significant coexpression. These strong and stable correlations indicate that evolutionary units go beyond a single protein to include the interactions among them. In particular, the stability of these signals in the face of extreme noise suggests that empirical protein interaction data can be integrated with orthologous clustering around these protein interactions to reliably infer local network structures in non-model organisms.

Correspondence patterns - mechanisms and models of human dynamics - Reply

J. G. Oliveira, A.-L. Barabási

Nature 441, E5-E6 (2006)

Kentsis notes that the response time to an email or a letter depends on the semantic content of the correspondence, as well as the social context in which the communication arises [1]. We would add that it also depends on deadlines, the time dependence of priorities and the dropping of past-deadline messages [2], making human response dynamics sufficiently complicated that no simple model could fully account for it [3–6]. However, the advantage of the proposed modelling framework is that most of these effects can be incorporated into it, and their impact on the queuing process can be systematically evaluated. Addressing some of these additional mechanisms, including those suggested by Kentsis, requires information that is beyond reach for most researchers at this point.

The activity reaction core and plasticity of metabolic networks

E. Almaas, Z.N. Oltvai, A.-L. Barabási

PLOS Computational Biology 1, 557-563 (2005)

Understanding the system-level adaptive changes taking place in an organism in response to variations in the environment is a key issue of contemporary biology. Current modeling approaches, such as constraint-based flux-balance analysis, have proved highly successful in analyzing the capabilities of cellular metabolism, including its capacity to predict deletion phenotypes, the ability to calculate the relative flux values of metabolic reactions, and the capability to identify properties of optimal growth states. Here, we use flux-balance analysis to thoroughly assess the activity of Escherichia coli, Helicobacter pylori, and Saccharomyces cerevisiae metabolism in 30,000 diverse simulated environments. We identify a set of metabolic reactions forming a connected metabolic core that carry non-zero fluxes under all growth conditions, and whose flux variations are highly correlated. Furthermore, we find that the enzymes catalyzing the core reactions display a considerably higher fraction of phenotypic essentiality and evolutionary conservation than those catalyzing noncore reactions. Cellular metabolism is characterized by a large number of species-specific conditionally active reactions organized around an evolutionarily conserved, but always active, metabolic core. Finally, we find that most current antibiotics interfering with bacterial metabolism target the core enzymes, indicating that our findings may have important implications for antimicrobial drug-target discovery.

Minimum spanning trees of weighted scale-free networks

P.J. Macdonald, E. Almaas, A.-L. Barabási

Europhysics Letters 72, 308-314 (2005)

A complete characterization of real networks requires us to understand the consequences of the uneven interaction strengths between a system’s components. Here we use minimum spanning trees (MSTs) to explore the effect of correlations between link weights and network topology on scale-free networks. Solely by changing the nature of the correlations between weights and network topology, the structure of the MSTs can change from scale-free to exponential. Additionally, for some choices of weight correlations, the efficiency of the MSTs increases with increasing network size, a result with potential implications for the design and scalability of communication networks.

Topological units of environmental signal processing in the transcriptional regulatory network of Escherichia coli

G. Balazsi, A.-L. Barabási, Z.N. Oltvai

Proceedings of the National Academy of Sciences 102, 7841-7846 (2005)

Recent evidence indicates that potential interactions within metabolic, protein–protein interaction, and transcriptional regulatory networks are used differentially according to the environmental conditions in which a cell exists. However, the topological units underlying such differential utilization are not understood. Here we use the transcriptional regulatory network of Escherichia coli to identify such units, called origons, representing regulatory subnetworks that originate at a distinct class of sensor transcription factors. Using microarray data, we find that specific environmental signals affect mRNA expression levels significantly only within the origons responsible for their detection and processing. We also show that small regulatory interaction patterns, called subgraphs and motifs, occupy distinct positions in and between origons, offering insights into their dynamical role in information processing. The identified features are likely to represent a general framework for environmental signal processing in prokaryotes.
Here I show that the bursty nature of human behaviour is a consequence of a decision-based queuing process: when individuals execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed, with most tasks being rapidly executed, whereas a few experience very long waiting times. In contrast, random or priority blind execution is well approximated by uniform inter-event statistics. These findings have important implications, ranging from resource management to service allocation, in both communications and retail.

Network theory-the emergence of creative enterprise

A.-L. Barabási

Science 308, 639 (2005)

Inhomogeneous evolution of subgraphs and cycles in complex networks

A. Vazquez, J. G. Oliveira, A.-L. Barabási

Physical Review E 71, 025103 (2005)

Subgraphs and cycles are often used to characterize the local properties of complex networks. Here we show that the subgraph structure of real networks is highly time dependent: as the network grows, the density of some subgraphs remains unchanged, while the density of others increases at a rate that is determined by the network’s degree distribution and clustering properties. This inhomogeneous evolution process, supported by direct measurements on several real networks, leads to systematic shifts in the overall subgraph spectrum and to an inevitable overrepresentation of some subgraphs and cycles.

Emergence of large-scale vorticity during diffusion in a random potential under an alternating bias

M. A. Makeev, I. Derenyi, A.-L. Barabási

Physical Review E 71, 026112 (2005)

Conventional wisdom indicates that the presence of an alternating driving force will not change the long-term behavior of a Brownian particle moving in a random potential. Although this is true in one dimension, here we offer direct evidence that the inevitable local symmetry breaking present in a two-dimensional random potential leads to the emergence of a local ratchet effect that generates large-scale vorticity patterns consisting of steady-state net diffusive currents. For small fields the spatial correlation function of the current follows a logarithmic distance dependence, while for large external fields both the vorticity and the correlations gradually disappear. We uncover the scaling laws characterizing this unique pattern formation process, and discuss their potential relevance to real systems.

Multiscaling and non-universality in fluctuations of driven complex systems

Z. Eisler, J. Kertesz, S.-H. Yook, A.-L. Barabási

Europhysics Letters 69, 664-670 (2005)

For many externally driven complex systems neither the noisy driving force, nor the internal dynamics are a priori known. Here we focus on systems for which the time-dependent activity of a large number of components can be monitored, allowing us to separate each signal into a component attributed to the external driving force and one to the internal dynamics. We propose a formalism to capture the potential multiscaling in the fluctuations and apply it to the high-frequency trading records of the New York Stock Exchange. We find that on the time scale of minutes the dynamics is governed by internal processes, while on a daily or longer scale the external factors dominate. This transition from internal to external dynamics induces systematic changes in the scaling exponents, offering direct evidence of non-universality in the system.

Taming complexity

A.-L. Barabási

Nature Physics 1, 68-70 (2005)

The science of networks is experiencing a boom. But despite the necessary multidisciplinary approach to tackle the theory of complexity, scientists remain largely compartmentalized in their separate disciplines. Can they find a common voice?

Darwin and Einstein correspondence patterns

J. G. Oliveira, A.-L. Barabási

Nature 437, 1251 (2005)

These scientists prioritized their replies to letters in the same way that people rate their e-mails today.

The topological relationship between the large-scale attributes and local interactions patterns of complex networks

A. Vazquez, R. Dobrin, D. Sergi, J.-P. Eckmann, Z. N. Oltvai, A.-L. Barabási

Proceedings of the National Academy of Sciences 101, 17940-17945 (2004)

Recent evidence indicates that the abundance of recurring elementary interaction patterns in complex networks, often called subgraphs or motifs, carries significant information about their function and overall organization. Yet, the underlying reasons for the variable quantity of different subgraph types, their propensity to form clusters, and their relationship with the networks’ global organization remain poorly understood. Here we show that a network’s large-scale topological organization and its local subgraph structure mutually define and predict each other, as confirmed by direct measurements in five well-studied cellular networks. We also demonstrate the inherent existence of two distinct classes of subgraphs, and show that, in contrast to the low-density type II subgraphs, the highly abundant type I subgraphs cannot exist in isolation but must naturally aggregate into subgraph clusters. The identified topological framework may have important implications for our understanding of the origin and function of subgraphs in all complex networks.

Reverse engineering of linking preferences from network restructuring

G. Palla, I. Farkas, I. Derenyi, A.-L. Barabási, T. Vicsek

Physical Review E 70, 046115 (2004)

We provide a method to deduce the preferences governing the restructuring dynamics of a network from the observed rewiring of the edges. Our approach is applicable for systems in which the preferences can be formulated in terms of a single-vertex energy function with f(k) being the contribution of a node of degree k to the total energy, and the dynamics obeys the detailed balance. The method is first tested by Monte Carlo simulations of restructuring graphs with known energies; then it is used to study variations of real network systems ranging from the coauthorship network of scientific publications to the asset graphs of the New York Stock Exchange. The empirical energies obtained from the restructuring can be described by a universal function f(k) ∼ −k ln k, which is consistent with and justifies the validity of the preferential attachment rule proposed for growing networks.

Effect of surface morphology on the sputtering yields: II. Ion sputtering from rippled surfaces

M.A. Makeev, A.-L. Barabási

Nuclear Instruments & Methods In Physics Research Section B 222, 335-354 (2004)

Off-normal ion bombardment of solid targets with energetic particles often leads to development of periodically modulated structures on the surfaces of eroded materials. Ion-induced surface roughening, in turn, causes sputtering yield changes. We report on a comprehensive theoretical study of the effect of rippled surface morphology on the sputtering yields. The yield is computed as a function of the parameters characterizing the surface morphology and the incident ion beam, using Sigmund’s theory of ion sputtering. We find that the surface morphology development may cause substantial variations in the sputter yields, depending on a complex interplay between the parameters characterizing the ripple structure and the incident ion beam. For certain realizations of the ripple structure, the surface morphology is found to induce enhanced, relative to the flat surface value, sputtering yields. On the other hand, there exist regimes in which the sputtering yield is suppressed by the surface roughness below the flat surface result. We confront the obtained theoretical results with available experimental data and find that our model provides an excellent qualitative and, in some cases, quantitative agreement with the results of experimental studies.

Effect of surface morphology on the sputtering yields: I. Ion sputtering from self-affine surfaces

M. A. Makeev, A.-L. Barabási

Nuclear Instruments & Methods In Physics Research Section B 222, 316-334 (2004)

As extensive experimental studies have shown, under certain conditions, ion bombardment of solid targets induces a random (self-affine) morphology on the ion-eroded surfaces. The rough morphology development is known to cause substantial variations in the sputtering yields. In this article, we present a theoretical model describing the sputter yields from random, self-affine surfaces subject to energetic ion bombardment. We employ Sigmund’s theory of ion sputtering, modified for the case of self-affine surfaces, to compute the sputter yields. We find that the changes in the sputtering yield, associated with the non-planar surface morphology, are strongly dependent on the parameters characterizing the surface roughness (such as the saturation width and the correlation length) and the incident ion beam (such as the incident ion energy and the deposited energy widths). It is shown that, for certain ranges of parameter variation, surface roughness leads to substantial enhancements in the yield, with the magnitude of the effect being more than 100%, as compared to the flat surface value. Furthermore, we find that, depending on the interplay between these parameters, the surface roughness can both enhance and suppress the sputter yields.

Separating the internal and external dynamics of complex systems

M. A. de Menezes, A.-L. Barabási

Physical Review Letters 93, 68701 (2004)

The observable behavior of a complex system reflects the mechanisms governing the internal interactions between the system’s components and the effect of external perturbations. Here we show that by capturing the simultaneous activity of several of the system’s components we can separate the internal dynamics from the external fluctuations. The method allows us to systematically determine the origin of fluctuations in various real systems, finding that while the Internet and the computer chip have robust internal dynamics, highway and Web traffic are driven by external demand. As multichannel measurements are becoming the norm in most fields, the method could help uncover the collective dynamics of a wide array of complex systems.

Functional and topological characterization of protein interaction networks

S. Y. Yook, Z. N. Oltvai, A.-L. Barabási

Proteomics 4, 928-942 (2004)

The elucidation of the cell’s large-scale organization is a primary challenge for post-genomic biology, and understanding the structure of protein interaction networks offers an important starting point for such studies. We compare four available databases that approximate the protein interaction network of the yeast, Saccharomyces cerevisiae, aiming to uncover the network’s generic large-scale properties and the impact of the proteins’ function and cellular localization on the network topology. We show how each database supports a scale-free topology with hierarchical modularity, indicating that these features represent a robust and generic property of the protein interaction network. We also find strong correlations between the network’s structure and the functional role and subcellular localization of its protein constituents, concluding that most functional and/or localization classes appear as relatively segregated subnetworks of the full protein interaction network. The uncovered systematic differences between the four protein interaction databases reflect their relative coverage for different functional and localization classes and provide a guide for their utility in various bioinformatics studies.

Hot spots and universality in network dynamics

A.-L. Barabási, M. A. de Menezes, S. Balensiefer, J. Brockman

European Physical Journal B 38, 169-175 (2004)

Most complex networks serve as conduits for various dynamical processes, ranging from mass transfer by chemical reactions in the cell to packet transfer on the Internet. We collected data on the time dependent activity of five natural and technological networks, finding evidence of orders of magnitude differences in the fluxes of individual nodes. This dynamical inhomogeneity reflects the emergence of localized high flux regions or “hot spots”, carrying an overwhelming fraction of the network’s activity. We find that each system is characterized by a unique scaling law, coupling the flux fluctuations with the total flux on individual nodes, a result of the competition between the system’s internal collective dynamics and changes in the external environment. We propose a method to separate these two components, allowing us to predict the relevant scaling exponents. As high fluctuations can lead to dynamical bottlenecks and jamming, these findings have a strong impact on the predictability and failure prevention of complex transportation networks.

Global organization of metabolic fluxes in the bacterium Escherichia coli

E. Almaas, B. Kovacs, T. Vicsek, Z.N. Oltvai, A.-L. Barabási

Nature 427, 839-843 (2004)

Cellular metabolism, the integrated interconversion of thousands of metabolic substrates through enzyme-catalysed biochemical reactions, is the most investigated complex intracellular web of molecular interactions. Although the topological organization of individual reactions into metabolic networks is well understood, the principles that govern their global functional use under different growth conditions raise many unanswered questions. By implementing a flux balance analysis of the metabolism of Escherichia coli strain MG1655, here we show that network use is highly uneven. Whereas most metabolic reactions have low fluxes, the overall activity of the metabolism is dominated by several reactions with very high fluxes. E. coli responds to changes in growth conditions by reorganizing the rates of selected fluxes predominantly within this high-flux backbone. This behaviour probably represents a universal feature of metabolic activity in all cells, with potential implications for metabolic engineering.

Network biology: understanding the cell’s functional organization

A.-L. Barabási, Z. N. Oltvai

Nature Reviews Genetics 5, 101-113 (2004)

A key aim of postgenomic biomedical research is to systematically catalogue all molecules and their interactions within a living cell. There is a clear need to understand how these molecules and the interactions between them determine the function of this enormously complex machinery, both in isolation and when surrounded by other cells. Rapid advances in network biology indicate that cellular networks are governed by universal laws and offer a new conceptual framework that could potentially revolutionize our view of biology and disease pathologies in the twenty-first century.

Aggregation of topological motifs in the Escherichia coli transcriptional regulatory networks

R. Dobrin, Q. K. Beg, A.-L. Barabási

BMC Bioinformatics 5, 10 (2004)

Background: Transcriptional regulation of cellular functions is carried out through a complex network of interactions among transcription factors and the promoter regions of genes and operons regulated by them. To better understand the system-level function of such networks, simplification of their architecture was previously achieved by identifying the motifs present in the network, which are small, overrepresented, topologically distinct regulatory interaction patterns (subgraphs). However, the interaction of such motifs with each other, and their form of integration into the full network have not been previously examined. Results: By studying the transcriptional regulatory network of the bacterium, Escherichia coli, we demonstrate that the two previously identified motif types in the network (i.e., feed-forward loops and bi-fan motifs) do not exist in isolation, but rather aggregate into homologous motif clusters that largely overlap with known biological functions. Moreover, these clusters further coalesce into a supercluster, thus establishing distinct topological hierarchies that show global statistical properties similar to the whole network. Targeted removal of motif links disintegrates the network into small, isolated clusters, while random disruptions of an equal number of links do not cause such an effect. Conclusion: Individual motifs aggregate into homologous motif clusters and a supercluster forming the backbone of the E. coli transcriptional regulatory network and play a central role in defining its global topological organization.

Fluctuations in network dynamics

M. A. de Menezes, A.-L. Barabási

Physical Review Letters 92, 28701 (2004)

Most complex networks serve as conduits for various dynamical processes, ranging from mass transfer by chemical reactions in the cell to packet transfer on the Internet. We collected data on the time dependent activity of five natural and technological networks, finding that for each the coupling of the flux fluctuations with the total flux on individual nodes obeys a unique scaling law. We show that the observed scaling can explain the competition between the system’s internal collective dynamics and changes in the external environment, allowing us to predict the relevant scaling exponents.

Experimental determination and system level analysis of essential genes in Escherichia coli MG1655

S. Y. Gerdes, M. D. Scholle, J. W. Campbell, G. Balazsi, E. Ravasz, M. D. Daugherty, A. L. Somera, N. C. Kyrpides, I. Anderson, M. S. Gelfand, A. Bhattacharya, V. Kapatral, M. D'Souza, M. V. Baev, Y. Grechkin, F. Mseeh, M. Y. Fonstein, R. Overbeek, A.-L. Barabási, Z. N. Oltvai, A. L. Osterman

Journal of Bacteriology 185, 5673-5684 (2003)

Defining the gene products that play an essential role in an organism’s functional repertoire is vital to understanding the system level organization of living cells. We used a genetic footprinting technique for a genome-wide assessment of genes required for robust aerobic growth of Escherichia coli in rich media. We identified 620 genes as essential and 3,126 genes as dispensable for growth under these conditions. Functional context analysis of these data allows individual functional assignments to be refined. Evolutionary context analysis demonstrates a significant tendency of essential E. coli genes to be preserved throughout the bacterial kingdom. Projection of these data over metabolic subsystems reveals topologic modules with essential and evolutionarily preserved enzymes with reduced capacity for error tolerance.

Evolutionary conservation of motif constituents in the yeast protein interaction network

S. Wuchty, Z. N. Oltvai, A.-L. Barabási

Nature Genetics 35, 176-179 (2003)

Understanding why some cellular components are conserved across species but others evolve rapidly is a key question of modern biology1-3. Here we show that in Saccharomyces cerevisiae, proteins organized in cohesive patterns of interactions are conserved to a substantially higher degree than those that do not participate in such motifs. We find that the conservation of proteins in distinct topological motifs correlates with the interconnectedness and function of that motif and also depends on the structure of the overall interactome topology. These findings indicate that motifs may represent evolutionary conserved topological units of cellular networks molded in accordance with the specific biological function in which they participate.

Spurious spatial periodicity of co-expression in microarray data due to printing design

G. Balazsi, K. A. Kay, A.-L. Barabási, Z. N. Oltvai

Nucleic Acids Research 31, 4425-4433 (2003)

Global transcriptome data is increasingly combined with sophisticated mathematical analyses to extract information about the functional state of a cell. Yet the extent to which the results reflect experimental bias at the expense of true biological information remains largely unknown. Here we show that the spatial arrangement of probes on microarrays and the particulars of the printing procedure significantly affect the log-ratio data of mRNA expression levels measured during the Saccharomyces cerevisiae cell cycle. We present a numerical method that filters out these technology-derived contributions from the existing transcriptome data, leading to improved functional predictions. The example presented here underlines the need to routinely search and compensate for inherent experimental bias when analyzing systematically collected, internally consistent biological data sets.

Emerging behavior in electronic bidding

I. Yang, H. Jeong, B. Kahng, A.-L. Barabási

Physical Review E 68, 016102 (2003)

We characterize the statistical properties of a large number of agents on two major online auction sites. The measurements indicate that the total number of bids placed in a single category and the number of distinct auctions frequented by a given agent follow power-law distributions, implying that a few agents are responsible for a significant fraction of the total bidding activity on the online market. We find that these agents exert a disproportionate influence on the final price of the auctioned items. This domination of online auctions by an unusually active minority may be a generic feature of all online mercantile processes.

The topology of the transcription regulatory network in the yeast Saccharomyces cerevisiae

I. Farkas, H. Jeong, T. Vicsek, A.-L. Barabási, Z. N. Oltvai

Physica A 318, 601-612 (2003)

A central goal of postgenomic biology is the elucidation of the regulatory relationships among all cellular constituents that together comprise the ‘genetic network’ of a cell or microorganism. Experimental manipulation of gene activity coupled with the assessment of perturbed transcriptome (i.e., global mRNA expression) patterns represents one approach toward this goal, and may provide a backbone into which other measurements can be later integrated. We use microarray data on 287 single gene deletion Saccharomyces cerevisiae mutant strains to elucidate generic relationships among perturbed transcriptomes. Their comparison with a method that preferentially recognizes distinct expression subpatterns allows us to pair those transcriptomes that share localized similarities. Analyses of the resulting transcriptome similarity network identify a continuum hierarchy among the deleted genes, and in the frequency of local similarities that establishes the links among their reorganized transcriptomes. We also find a combinatorial utilization of shared expression subpatterns within individual links, with increasing quantitative similarity among those that connect transcriptome states induced by the deletion of functionally related gene products. This suggests a distinct hierarchical and combinatorial organization of the S. cerevisiae transcriptional activity, and may represent a pattern that is generic to the transcriptional organization of all eukaryotic organisms.

Hierarchical organization in complex networks

E. Ravasz, A.-L. Barabási

Physical Review E 67, 026112 (2003)

Many real networks in nature and society share two generic properties: they are scale-free and they display a high degree of clustering. We show that these two features are the consequence of a hierarchical organization, implying that small groups of nodes organize in a hierarchical manner into increasingly large groups, while maintaining a scale-free topology. In hierarchical networks, the degree of clustering characterizing the different groups follows a strict scaling law, which can be used to identify the presence of a hierarchical organization in real networks. We find that several real networks, such as the World Wide Web, actor network, the Internet at the domain level, and the semantic web obey this scaling law, indicating that hierarchy is a fundamental characteristic of many complex systems.

Measuring preferential attachment in evolving networks

H. Jeong, Z. Neda, A.-L. Barabási

Europhysics Letters 61, 567-572 (2003)

A key ingredient of many current models proposed to capture the topological evolution of complex networks is the hypothesis that highly connected nodes increase their connectivity faster than their less connected peers, a phenomenon called preferential attachment. Measurements on four networks, namely the science citation network, Internet, actor collaboration and science coauthorship network indicate that the rate at which nodes acquire links depends on the node’s degree, offering direct quantitative support for the presence of preferential attachment. We find that for the first two systems the attachment rate depends linearly on the node degree, while for the last two the dependence follows a sublinear power law.

Prediction of protein essentiality based on genomic data

H. Jeong, Z. N. Oltvai, A.-L. Barabási

ComPlexUs 1, 19-28 (2003)

A major goal of pharmaceutical bioinformatics is to develop computational tools for systematic in silico molecular target identification. Here we demonstrate that in the yeast Saccharomyces cerevisiae the phenotypic effect of single gene deletions simultaneously correlates with fluctuations in mRNA expression profiles, the functional categorization of the gene products, and their connectivity in the yeast’s protein-protein interaction network. Building on these quantitative correlations, we developed a computational method for predicting the phenotypic effect of a given gene’s functional disabling or removal. Our subsequent analyses were in good agreement with the results of systematic gene deletion experiments, allowing us to predict the deletion phenotype of a number of untested yeast genes. The results underscore the utility of large genomic databases for in silico systematic drug target identification in the postgenomic era.

Scale-Free Networks

A.-L. Barabási, E. Bonabeau

Scientific American 288, 50-59 (2003)

Scientists have recently discovered that various complex systems have an underlying architecture governed by shared organizing principles. This insight has important implications for a host of applications, from drug development to Internet security.

Bioinformatics analysis of experimentally determined protein complexes in the yeast Saccharomyces cerevisiae

Z. Dezso, Z. N. Oltvai, A.-L. Barabási

Genome Research 13, 2450-2454 (2003)

Many important cellular functions are implemented by protein complexes that act as sophisticated molecular machines of varying size and temporal stability. Here we demonstrate quantitatively that protein complexes in the yeast Saccharomyces cerevisiae are comprised of a core in which subunits are highly coexpressed, display the same deletion phenotype (essential or nonessential), and share identical functional classification and cellular localization. This core is surrounded by a functionally mixed group of proteins, which likely represent short-lived or spurious attachments. The results allow us to define the deletion phenotype and cellular task of most known complexes, and to identify with high confidence the biochemical role of hundreds of proteins with yet unassigned functionality.

Morphology of ion-sputtered surfaces

M. Makeev, R. Cuerno, A.-L. Barabási

Nuclear Instruments & Methods In Physics Research Section B 197, 185-227 (2002)

We derive a stochastic nonlinear continuum equation to describe the morphological evolution of amorphous surfaces eroded by ion bombardment. Starting from Sigmund’s theory of sputter erosion, we calculate the coefficients appearing in the continuum equation in terms of the physical parameters characterizing the sputtering process. We analyze the morphological features predicted by the continuum theory, comparing them with the experimentally reported morphologies. We show that for short time scales, where the effect of nonlinear terms is negligible, the continuum theory predicts ripple formation. We demonstrate that in addition to relaxation by thermal surface diffusion, the sputtering process can also contribute to the smoothing mechanisms shaping the surface morphology. We explicitly calculate an effective surface diffusion constant characterizing this smoothing effect and show that it is responsible for the low temperature ripple formation observed in various experiments. At long time scales the nonlinear terms dominate the evolution of the surface morphology. The nonlinear terms lead to the stabilization of the ripple wavelength and we show that, depending on the experimental parameters, such as angle of incidence and ion energy, different morphologies can be observed: asymptotically, sputter eroded surfaces could undergo kinetic roughening, or can display novel ordered structures with rotated ripples. Finally, we discuss in detail the existing experimental support for the proposed theory and uncover novel features of the surface morphology and evolution that could be directly tested experimentally.

Nanoscale wire formation on sputter eroded surface

J. Kim, B. Kahng, A.-L. Barabási

Applied Physics Letters 81, 3654-3656 (2002)

Rotated ripple structures (RRS) on sputter-eroded surfaces are potential candidates for nanoscale wire fabrication. We show that the RRS can form when the width of the collision cascade in the longitudinal direction is larger than that in the transverse direction and the incident angle of ion beam is chosen in a specific window. By calculating the structure factor for the RRS, we find that they are more regular and their amplitude is more enhanced compared to the much studied ripple structure forming in the linear regime of sputter erosion.

Life’s complexity pyramid

Z. N. Oltvai, A.-L. Barabási

Science 298, 763-764 (2002)

Cells and microorganisms have an impressive capacity for adjusting their intracellular machinery in response to changes in their environment, food availability, and developmental state. Add to this an amazing ability to correct internal errors — battling the effects of such mistakes as mutations or misfolded proteins — and we arrive at a major issue of contemporary cell biology: our need to comprehend the staggering complexity, versatility, and robustness of living systems. Although molecular biology offers many spectacular successes, it is clear that the detailed inventory of genes, proteins, and metabolites is not sufficient to understand the cell’s complexity (1). As demonstrated by two papers in this issue—Lee et al. (2) on page 799 and Milo et al. (3) on page 824—viewing the cell as a network of genes and proteins offers a viable strategy for addressing the complexity of living systems.

Modeling the internet’s large-scale topology

S. H. Yook, H. Jeong, A.-L. Barabási

Proceedings of the National Academy of Sciences 99, 13382-13386 (2002)

Network generators that capture the Internet’s large-scale topology are crucial for the development of efficient routing protocols and modeling Internet traffic. Our ability to design realistic generators is limited by the incomplete understanding of the fundamental driving forces that affect the Internet’s evolution. By combining several independent databases capturing the time evolution, topology, and physical layout of the Internet, we identify the universal mechanisms that shape the Internet’s router and autonomous system level topology. We find that the physical layout of nodes forms a fractal set, determined by population density patterns around the globe. The placement of links is driven by competition between preferential attachment and linear distance dependence, a marked departure from the currently used exponential laws. The universal parameters that we extract significantly restrict the class of potentially correct Internet models and indicate that the networks created by all available topology generators are fundamentally different from the current Internet.

Two degrees of separation in complex food webs

R. J. Williams, N. D. Martinez, E. L. Berlow, J. A. Dunne, A.-L. Barabási

Proceedings of the National Academy of Sciences 99, 12913-12916 (2002)

Feeding relationships can cause invasions, extirpations, and population fluctuations of a species to dramatically affect other species within a variety of natural habitats. Empirical evidence suggests that such strong effects rarely propagate through food webs more than three links away from the initial perturbation. However, the size of these spheres of potential influence within complex communities is generally unknown. Here, we show that species within large communities from a variety of aquatic and terrestrial ecosystems are on average two links apart, with >95% of species typically within three links of each other. Species are drawn even closer as network complexity and, more unexpectedly, species richness increase. Our findings are based on seven of the largest and most complex food webs available as well as a food-web model that extends the generality of the empirical results. These results indicate that the dynamics of species within ecosystems may be more highly interconnected and that biodiversity loss and species invasions may affect more species than previously thought.

Hierarchical organization of modularity in metabolic networks

E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, A.-L. Barabási

Science 297, 1551-1555 (2002)

Spatially or chemically isolated functional modules composed of several cellular components and carrying discrete functions are considered fundamental building blocks of cellular organization, but their presence in highly integrated biochemical networks lacks quantitative support. Here, we show that the metabolic networks of 43 distinct organisms are organized into many small, highly connected topologic modules that combine in a hierarchical manner into larger, less cohesive units, with their number and degree of clustering following a power law. Within Escherichia coli, the uncovered hierarchical modularity closely overlaps with known metabolic functions. The identified network architecture may be generic to system-level cellular organization.

Evolution of the social network of scientific collaborations

A.-L. Barabási, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, T. Vicsek

Physica A 311, 590-614 (2002)

The co-authorship network of scientists represents a prototype of complex evolving networks. In addition, it offers one of the most extensive databases to date on social networks. By mapping the electronic database containing all relevant journals in mathematics and neuro-science for an 8-year period (1991–98), we infer the dynamic and the structural mechanisms that govern the evolution and topology of this complex system. Three complementary approaches allow us to obtain a detailed characterization. First, empirical measurements allow us to uncover the topological measures that characterize the network at a given moment, as well as the time evolution of these quantities. The results indicate that the network is scale-free, and that the network evolution is governed by preferential attachment, affecting both internal and external links. However, in contrast with most model predictions the average degree increases in time, and the node separation decreases. Second, we propose a simple model that captures the network’s time evolution. In some limits the model can be solved analytically, predicting a two-regime scaling in agreement with the measurements. Third, numerical simulations are used to uncover the behavior of quantities that could not be predicted analytically. The combined numerical and analytical results underline the important role internal links play in determining the observed scaling behavior and network topology. The results and methodologies developed in the context of the co-authorship network could be useful for a systematic study of other complex evolving networks as well, such as the world wide web, Internet, or other social networks.

Percolation in directed scale-free networks

N. Schwartz, R. Cohen, D. ben-Avraham, A.-L. Barabási, S. Havlin

Physical Review E 66, 015104 (2002)

Many complex networks in nature have directed links, a property that affects the network’s navigability and large-scale topology. Here we study the percolation properties of such directed scale-free networks with correlated in and out degree distributions. We derive a phase diagram that indicates the existence of three regimes, determined by the values of the degree exponents. In the first regime we regain the known directed percolation mean field exponents. In contrast, the second and third regimes are characterized by anomalous exponents, which we calculate analytically. In the third regime the network is resilient to random dilution, i.e., the percolation threshold is p_c → 1.

Halting viruses in scale-free networks

Z. Dezso, A.-L. Barabási

Physical Review E 65, 055103 (2002)

The vanishing epidemic threshold for viruses spreading on scale-free networks indicates that traditional methods, aiming to decrease a virus’ spreading rate, cannot succeed in eradicating an epidemic. We demonstrate that policies that discriminate between the nodes, curing mostly the highly connected nodes, can restore a finite epidemic threshold and potentially eradicate a virus. We find that the more biased a policy is towards the hubs, the more chance it has to bring the epidemic threshold above the virus’ spreading rate. Furthermore, such biased policies are more cost effective, requiring fewer cures to eradicate the virus.

Statistical mechanics of complex networks

R. Albert, A.-L. Barabási

Reviews of Modern Physics 74, 47-97 (2002)

Complex networks describe a wide range of systems in nature and society. Frequently cited examples include the cell, a network of chemicals linked by chemical reactions, and the Internet, a network of routers and computers connected by physical links. While traditionally these systems have been modeled as random graphs, it is increasingly recognized that the topology and evolution of real networks are governed by robust organizing principles. This article reviews the recent advances in the field of complex networks, focusing on the statistical mechanics of network topology and dynamics. After reviewing the empirical data that motivated the recent interest in networks, the authors discuss the main models and analytical tools, covering random graphs, small-world and scale-free networks, the emerging theory of evolving networks, and the interplay between topology and the network’s robustness against failures and attacks.

Monte Carlo simulation of sinusoidally modulated superlattice growth

H. Jeong, B. Kahng, S. Lee, C.Y. Kwak, A.-L. Barabási, J.K. Furdyna

Physical Review E 65, 031602 (2002)

The fabrication of ZnSe/ZnTe superlattices grown by the process of rotating the substrate in the presence of an inhomogeneous flux distribution instead of the successively closing and opening of source shutters is studied via Monte Carlo simulations. It is found that the concentration of each compound is sinusoidally modulated along the growth direction, caused by the uneven arrival of Se and Te atoms at a given point of the sample, and by the variation of the Te/Se ratio at that point due to the rotation of the substrate. In this way we obtain a ZnSe₁₋ₓTeₓ alloy in which the composition x varies sinusoidally along the growth direction. The period of the modulation is directly controlled by the rate of the substrate rotation. The amplitude of the compositional modulation is monotonic for small angular velocities of the substrate rotation, but is itself modulated for large angular velocities. The average amplitude of the modulation pattern decreases as the angular velocity of substrate rotation increases and the measurement position approaches the center of rotation. The simulation results are in good agreement with previously published experimental measurements on superlattices fabricated in this manner.

Networks in life: scaling properties and eigenvalue spectra

I. Farkas, I. Derenyi, H. Jeong, Z. Neda, Z. N. Oltvai, E. Ravasz, A. Schubert, A.-L. Barabási, T. Vicsek

Physica A 314, 25-34 (2002)

We analyze growing networks ranging from collaboration graphs of scientists to the network of similarities defined among the various transcriptional profiles of living cells. For the explicit demonstration of the scale-free nature and hierarchical organization of these graphs, a deterministic construction is also used. We demonstrate the use of determining the eigenvalue spectra of sparse random graph models for the categorization of small measured networks.

Modeling relaxation and jamming in granular media

B. Kahng, I. Albert, P. Schiffer, A.-L. Barabási

Physical Review E 64, 051303 (2001)

We introduce a stochastic microscopic model to investigate the jamming and reorganization of grains induced by an object moving through a granular medium. The model reproduces the experimentally observed periodic sawtooth fluctuations in the jamming force and predicts the period and the power spectrum in terms of the controllable physical parameters. It also predicts that the avalanche sizes, defined as the number of displaced grains during a single advance of the object, follow a power law P(s) ∼ s^(−τ), where the exponent is independent of the physical parameters.

Deterministic scale-free networks

A.-L. Barabási, E. Ravasz, T. Vicsek

Physica A 299, 559–564 (2001)

Scale-free networks are abundant in nature and society, describing such diverse systems as the world wide web, the web of human sexual contacts, or the chemical network of a cell. All models used to generate a scale-free topology are stochastic, that is, they create networks in which the nodes appear to be randomly connected to each other. Here we propose a simple model that generates scale-free networks in a deterministic fashion. We solve the model exactly, showing that the tail of the degree distribution follows a power law.

Nanoscale structure formation on sputter eroded surface

B. Kahng, H. Jeong, A.-L. Barabási

Journal of the Korean Physical Society 39, 421-424 (2001)

We investigate the morphological features of sputter eroded surfaces, demonstrating that while at short times ripple formation is described by the linear theory, after a characteristic time, the nonlinear terms determine the surface morphology, by monitoring the surface width and the erosion velocity. Furthermore, we show that sputtering under normal incidence leads to the formation of spatially ordered uniform nanoscale islands or holes. We find that while the size of these nanostructures is independent of flux and temperature, it can be controlled by ion beam energy.

Spectra of “real-world” graphs: beyond the semicircle law

I. J. Farkas, I. Derenyi, A.-L. Barabási, T. Vicsek

Physical Review E 64, 026704 (2001)

Many natural and social systems develop complex networks that are usually modeled as random graphs. The eigenvalue spectrum of these graphs provides information about their structural properties. While the semicircle law is known to describe the spectral densities of uncorrelated random graphs, much less is known about the spectra of real-world graphs, describing such complex systems as the Internet, metabolic pathways, networks of power stations, scientific collaborations, or movie actors, which are inherently correlated and usually very sparse. An important limitation in addressing the spectra of these systems is that the numerical determination of the spectra for systems with more than a few thousand nodes is prohibitively time and memory consuming. Making use of recent advances in algorithms for spectral characterization, here we develop methods to determine the eigenvalues of networks comparable in size to real systems, obtaining several surprising results on the spectra of adjacency matrices corresponding to models of real-world graphs. We find that when the number of links grows as the number of nodes, the spectral density of uncorrelated random matrices does not converge to the semicircle law. Furthermore, the spectra of real-world graphs have specific features, depending on the details of the corresponding models. In particular, scale-free graphs develop a trianglelike spectral density with a power-law tail, while small-world graphs have a complex spectral density consisting of several sharp peaks. These and further results indicate that the spectra of correlated graphs represent a practical tool for graph classification and can provide useful insight into the relevant structural properties of real networks.

Stick-slip fluctuations in granular drag

I. Albert, P. Tegzes, R. Albert, J. G. Sample, A.-L. Barabási, T. Vicsek, B. Kahng, P. Schiffer

Physical Review E 64, 031307 (2001)

We study fluctuations in the drag force experienced by an object moving through a granular medium. The successive formation and collapse of jammed states give a stick-slip nature to the fluctuations which are periodic at small depths but become ‘‘stepped’’ at large depths, a transition that we interpret as a consequence of the long-range nature of the force chains and the finite size of our experiment. Another important finding is that the mean force and the fluctuations appear to be independent of the properties of the contact surface between the grains and the dragged object. These results imply that the drag force originates in the bulk properties of the granular sample.

The physics of the web

A.-L. Barabási

Physics World 14, 33-38 (2001)

Statistical mechanics is offering new insights into the structure and dynamics of the Internet, the World Wide Web and other complex interacting systems. The Internet appears to have taken on a life of its own ever since the National Science Foundation in the US gave up stewardship of the network in 1995. New lines and routers are added continually by thousands of companies, none of which require permission from anybody to do so, and none of which are obliged to report their activity. This uncontrolled and decentralized growth has turned network designers into scientific explorers. All previous Internet-related research concentrated on designing better protocols and faster components. More recently, an increasing number of scientists have begun to ask an unexpected question: what exactly did we create?

Weighted evolving networks

S. H. Yook, H. Jeong, A.-L. Barabási, Y. Tu

Physical Review Letters 86, 5835-5838 (2001)

Many biological, ecological, and economic systems are best described by weighted networks, as the nodes interact with each other with varying strength. However, most evolving network models studied so far are binary, the link strength being either 0 or 1. In this paper we introduce and investigate the scaling properties of a class of models which assign weights to the links as the network evolves. The combined numerical and analytical approach indicates that asymptotically the total weight distribution converges to the scaling behavior of the connectivity distribution, but this convergence is hampered by strong logarithmic corrections.

Bose-Einstein condensation in complex networks

G. Bianconi, A.-L. Barabási

Physical Review Letters 86, 5632–5635 (2001)

The evolution of many complex systems, including the World Wide Web, business, and citation networks, is encoded in the dynamic web describing the interactions between the system’s constituents. Despite their irreversible and nonequilibrium nature these networks follow Bose statistics and can undergo Bose-Einstein condensation. Addressing the dynamical properties of these nonequilibrium systems within the framework of equilibrium quantum gases predicts that the “first-mover-advantage,” “fit-get-rich,” and “winner-takes-all” phenomena observed in competitive systems are thermodynamically distinct phases of the underlying evolving networks.

Lethality and centrality in protein networks

H. Jeong, S. P. Mason, A.-L. Barabási, Z. N. Oltvai

Nature 411, 41-42 (2001)

The most highly connected proteins in the cell are the most important for its survival. Proteins are traditionally identified on the basis of their individual actions as catalysts, signalling molecules, or building blocks in cells and microorganisms. But our post-genomic view is expanding the protein’s role into an element in a network of protein–protein interactions as well, in which it has a contextual or cellular function within functional modules1,2. Here we provide quantitative support for this idea by demonstrating that the phenotypic consequence of a single gene deletion in the yeast Saccharomyces cerevisiae is affected to a large extent by the topological position of its protein product in the complex hierarchical web of molecular interactions.

Spatial ordering of stacked quantum dots

B. Kahng, H. Jeong, A.-L. Barabási

Applied Physics Letters 78, 805–807 (2001)

We investigate the growth conditions necessary to form an ordered quantum dot crystal by capping spatially ordered quantum dots and growing a new layer of dots on top of the capping layer. Performing Monte Carlo simulations and developing analytic arguments based on the stress energy function, we demonstrate the existence of an optimal capping layer thickness, external flux, and temperature for the formation of quantum dot crystals.

Quantum dot and hole formation in sputter erosion

B. Kahng, H. Jeong, A.-L. Barabási

Applied Physics Letters 78, 805–807 (2001)

Recently, it was experimentally demonstrated that sputtering under normal incidence leads to the formation of spatially ordered uniform nanoscale islands or holes. Here, we show that these nanostructures have inherently nonlinear origin, first appearing when the nonlinear terms start to dominate the surface dynamics. Depending on the sign of the nonlinear terms, determined by the shape of the collision cascade, the surface can develop regular islands or holes with identical dynamical features, and while the size of these nanostructures is independent of flux and temperature, it can be modified by tuning the ion energy.

Granular drag on a discrete object: shape effects on jamming

I. Albert, J.G. Sample, A.J. Morss, S. Rajagopalan, A.-L. Barabási, P. Schiffer

Physical Review E 64, 061303 (2001)

We study the drag force on discrete objects with circular cross section moving slowly through a spherical granular medium. Variations in the geometry of the dragged object change the drag force only by a small fraction relative to shape effects in fluid drag. The drag force depends quadratically on the object’s diameter as expected. We do observe, however, a deviation above the expected linear depth dependence, and the magnitude of the deviation is apparently controlled by geometrical factors.

Parasitic computing

A.-L. Barabási, V. W. Freeh, H. Jeong, J. Brockman

Nature 412, 894-897 (2001)

Reliable communication on the Internet is guaranteed by a standard set of protocols, used by all computers. Here we show that these protocols can be exploited to compute with the communication infrastructure, transforming the Internet into a distributed computer in which servers unwittingly perform computation on behalf of a remote node. In this model, which we call ‘parasitic computing’, one machine forces target computers to solve a piece of a complex computational problem merely by engaging them in standard communication. Consequently, the target computers are unaware that they have performed computation for the benefit of a commanding node. As experimental evidence of the principle of parasitic computing, we harness the power of several web servers across the globe, which, unknown to them, work together to solve an NP complete problem.

Competition and multiscaling in evolving networks

G. Bianconi, A.-L. Barabási

Europhysics Letters 54, 436-442 (2001)

The rate at which nodes in a network increase their connectivity depends on their fitness to compete for links. For example, in social networks some individuals acquire more social links than others, or on the www some webpages attract considerably more links than others. We find that this competition for links translates into multiscaling, i.e. a fitness-dependent dynamic exponent, allowing fitter nodes to overcome the more connected but less fit ones. Uncovering this fitter-gets-richer phenomenon can help us understand in quantitative terms the evolution of many competitive systems in nature and society.

Comparable system-level organization of Archaea and Eukaryotes

J. Podani, Z. N. Oltvai, H. Jeong, B. Tombor, A.-L. Barabási, E. Szathmary

Nature Genetics 29, 54-56 (2001)

A central and long-standing issue in evolutionary theory is the origin of the biological variation upon which natural selection acts1. Some hypotheses suggest that evolutionary change represents an adaptation to the surrounding environment within the constraints of an organism’s innate characteristics1–3. Elucidation of the origin and evolutionary relationship of species has been complemented by nucleotide sequence4 and gene content5 analyses, with profound implications for recognizing life’s major domains4. Understanding of evolutionary relationships may be further expanded by comparing systemic higher-level organization among species. Here we employ multivariate analyses to evaluate the biochemical reaction pathways characterizing 43 species. Comparison of the information transfer pathways of Archaea and Eukaryotes indicates a close relationship between these domains. In addition, whereas eukaryotic metabolic enzymes are primarily of bacterial origin6, the pathway-level organization of archaeal and eukaryotic metabolic networks is more closely related. Our analyses therefore suggest that during the symbiotic evolution of eukaryotes, 7–9 incorporation of bacterial metabolic enzymes into the proto-archaeal proteome was constrained by the host’s pre-existing metabolic architecture.

Topology of evolving networks: local events and universality

R. Albert, A.-L. Barabási

Physical Review Letters 85, 5234-5237 (2000)

Networks grow and evolve by local events, such as the addition of new nodes and links, or rewiring of links from one node to another. We show that depending on the frequency of these processes two topologically different networks can emerge, the connectivity distribution following either a generalized power law or an exponential. We propose a continuum theory that predicts these two regimes as well as the scaling function and the exponents, in good agreement with numerical results. Finally, we use the obtained predictions to fit the connectivity distribution of the network describing the professional links between movie actors.

The large-scale organization of metabolic networks

H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, A.-L. Barabási

Nature 407, 651–655 (2000)

Here we present a systematic comparative mathematical analysis of the metabolic networks of 43 organisms representing all three domains of life. We show that, despite significant variation in their individual constituents and pathways, these metabolic networks have the same topological scaling properties and show striking similarities to the inherent organization of complex non-biological systems. This may indicate that metabolic organization is not only identical for all living organisms, but also complies with the design principles of robust and error-tolerant scale-free networks, and may represent a common blueprint for the large-scale organization of interactions among all cellular constituents.

Error and attack tolerance of complex networks

R. Albert, H. Jeong, A.-L. Barabási

Nature 406, 378–382 (2000)

Here we demonstrate that error tolerance is not shared by all redundant systems: it is displayed only by a class of inhomogeneously wired networks, called scale-free networks, which include the World-Wide Web, the Internet, social networks and cells. We find that such networks display an unexpected degree of robustness, the ability of their nodes to communicate being unaffected even by unrealistically high failure rates. However, error tolerance comes at a high price in that these networks are extremely vulnerable to attacks (that is, to the selection and removal of a few nodes that play a vital role in maintaining the network’s connectivity). Such error tolerance and attack vulnerability are generic properties of communication networks.
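
The contrast between random failures and targeted attacks can be illustrated with a small simulation. The following is a minimal sketch, not the procedure used in the paper: it assumes a network grown by degree-proportional attachment, uses the size of the largest connected component as a stand-in for the ability of nodes to communicate, and all function names and parameter values (2,000 nodes, 5% removal) are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

def scale_free_graph(n, m=2, seed=0):
    """Grow a graph in which each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    endpoints = []  # each node appears once per link it holds
    for i in range(m + 1):          # small fully connected seed (assumption)
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            endpoints.extend((i, j))
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj

def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        best = max(best, size)
    return best

adj = scale_free_graph(2000)
k = 100  # remove 5% of the nodes in both scenarios

# "Error": nodes fail at random.
failed = random.Random(1).sample(sorted(adj), k)
# "Attack": the k most connected nodes are removed.
attacked = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]

after_failure = largest_component(adj, failed)
after_attack = largest_component(adj, attacked)
print("largest component after random failures:", after_failure)
print("largest component after targeted attack:", after_attack)
```

In this sketch the targeted removal of hubs leaves a smaller connected core than the same number of random failures, which is the qualitative asymmetry the abstract describes.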

Scale-free characteristics of random networks: the topology of the world wide web

A.-L. Barabási, R. Albert, H. Jeong

Physica A 281, 69-77 (2000)

The world-wide web forms a large directed graph, whose vertices are documents and edges are links pointing from one document to another. Here we demonstrate that despite its apparent random character, the topology of this graph has a number of universal scale-free characteristics. We introduce a model that leads to a scale-free network, capturing in a minimal fashion the self-organization processes governing the world-wide web.

Dynamics of complex systems: scaling laws for the period of boolean networks

R. Albert, A.-L. Barabási

Physical Review Letters 84, 5660-5663 (2000)

Boolean networks serve as models for complex systems, such as social or genetic networks, where each vertex, based on inputs received from selected vertices, makes its own decision about its state. Despite their simplicity, little is known about the dynamical properties of these systems. Here we propose a method to calculate the period of a finite Boolean system, by identifying the mechanisms determining its value. The proposed method can be applied to systems of arbitrary topology, and can serve as a roadmap for understanding the dynamics of large interacting systems in general.

Physics of the rhythmic applause

Z. Neda, E. Ravasz, T. Vicsek, Y. Brechet, A.-L. Barabási

Physical Review E 61, 6987-6992 (2000)

We report on a series of measurements aimed to characterize the development and the dynamics of the rhythmic applause in concert halls. Our results demonstrate that while this process shares many characteristics of other systems that are known to synchronize, it also has features that are unexpected and unaccounted for in many other systems. In particular, we find that the mechanism lying at the heart of the synchronization process is the period doubling of the clapping rhythm. The characteristic interplay between synchronized and unsynchronized regimes during the applause is the result of a frustration in the system. All results are understandable in the framework of the Kuramoto model.
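
For readers unfamiliar with the Kuramoto model mentioned above, the following is a minimal sketch of its generic mean-field form, in which each oscillator's phase obeys dθ_i/dt = ω_i + K r sin(ψ − θ_i), with r and ψ the amplitude and phase of the average of exp(iθ_j). It does not reproduce the period-doubling mechanism reported in the paper; the Gaussian frequency distribution, the coupling values, and the function names are assumptions chosen for illustration.

```python
import math
import random

def mean_field(phases):
    """Return (r, psi): amplitude and phase of the average of exp(i*theta)."""
    re = sum(math.cos(p) for p in phases) / len(phases)
    im = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(re, im), math.atan2(im, re)

def simulate(coupling, n=100, steps=2000, dt=0.01, seed=1):
    """Euler-integrate n Kuramoto oscillators and return the final order parameter r."""
    rng = random.Random(seed)
    omega = [rng.gauss(0.0, 1.0) for _ in range(n)]          # natural frequencies
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        r, psi = mean_field(theta)
        theta = [t + dt * (w + coupling * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
    return mean_field(theta)[0]

r_uncoupled = simulate(0.0)   # no interaction: phases stay incoherent
r_coupled = simulate(4.0)     # strong coupling: phases lock together
print("order parameter without coupling:", round(r_uncoupled, 2))
print("order parameter with coupling:   ", round(r_coupled, 2))
```

The order parameter r is close to 0 for incoherent clapping and approaches 1 when the oscillators synchronize, so comparing the two runs gives a rough picture of the unsynchronized and synchronized regimes.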

Jamming and fluctuations in granular drag

I. Albert, P. Tegzes, B. Kahng, R. Albert, J.G. Sample, M.A. Pfeifer, A.-L. Barabási, T. Vicsek, P. Schiffer

Physical Review Letters 84, 5122–5125 (2000)

We investigate the dynamic evolution of jamming in granular media through fluctuations in the granular drag force. The successive collapse and formation of jammed states give a stick-slip nature to the fluctuations which is independent of the contact surface between the grains and the dragged object, thus implying that the stress-induced collapse is nucleated in the bulk of the granular sample. We also find that while the fluctuations are periodic at small depths, they become “stepped” at large depths, a transition which we interpret as a consequence of the long-range nature of the force chains.

Power-law distribution of the world wide web

A.-L. Barabási, R. Albert, H. Jeong, G. Bianconi

Science 287, 2115 (2000)

Barabasi and Albert propose an improved version of the Erdos-Renyi theory of random networks to account for the scaling properties of a number of systems, including the link structure of the World Wide Web (WWW). The theory they present, however, is inconsistent with empirically observed properties of the Web link structure.

The sound of many hands clapping

Z. Neda, E. Ravasz, Y. Brechet, T. Vicsek, A.-L. Barabási

Nature 403, 849-850 (2000)

An audience expresses appreciation for a good performance by the strength and nature of its applause. The thunder of applause at the start often turns quite suddenly into synchronized clapping, and this synchronization can disappear and reappear several times during the applause. The phenomenon is a delightful expression of social self-organization that provides an example on a human scale of the synchronization processes that occur in numerous natural systems, ranging from flashing Asian fireflies to oscillating chemical reactions 1–3. Here we explain the dynamics of this rhythmic applause.

Thermodynamic and kinetic mechanisms in self-assembled quantum dot formation

A.-L. Barabási

Materials Science and Engineering B 67, 23–30 (1999)

Heteroepitaxial growth of highly strained structures offers the possibility to fabricate islands with very narrow size distribution, coined self-assembling quantum dots (SAQD). In spite of the high experimental interest, the mechanism of SAQD formation is not well understood. We will show that equilibrium theories can successfully predict the island sizes and densities, the nature and the magnitude of the critical thickness needed to be deposited for SAQD formation, as well as the onset of ripening. Furthermore, the flux and temperature dependence of the SAQDs is described using kinetic Monte Carlo simulations.

Dynamics of ripple formation in sputter erosion: nonlinear phenomena

S. Park, B. Kahng, H. Jeong, A.-L. Barabási

Physical Review Letters 83, 3486–3489 (1999)

Many morphological features of sputter eroded surfaces are determined by the balance between ion-induced linear instability and surface diffusion. However, the impact of the nonlinear terms on the morphology is less understood. We demonstrate that, while at short times ripple formation is described by the linear theory, after a characteristic time the nonlinear terms determine the surface morphology by either destroying the ripples or generating a new rotated ripple structure. We show that the morphological transitions induced by the nonlinear effects can be detected by monitoring the surface width and the erosion velocity.

Emergence of scaling in random networks

A.-L. Barabási, R. Albert

Science 286, 509–512 (1999)

Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
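
As a rough illustration of the two mechanisms named in this abstract (growth and preferential attachment), the following Python sketch grows a network one vertex at a time. The function name `grow_network`, the parameters `n`, `m` and `seed`, and the fully connected starting core are assumptions made for this illustration, not details taken from the paper.

```python
import random
from collections import Counter

def grow_network(n, m, seed=0):
    """Grow a network to n vertices; each new vertex adds m edges."""
    rng = random.Random(seed)
    # Assumed starting point: a small fully connected core of m + 1 vertices.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every edge endpoint appears once in this list, so a uniform draw from it
    # picks a vertex with probability proportional to its degree.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = grow_network(n=2000, m=2)
degree = Counter(v for e in edges for v in e)
print("largest degree:", max(degree.values()))
```

Plotting a histogram of `degree.values()` on log-log axes is one way to inspect whether the tail looks like a power law for a given run.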

Mean-field theory for scale-free random networks

A.-L. Barabási, R. Albert, H. Jeong

Physica A 272, 173–187 (1999)

Random networks with complex topology are common in Nature, describing systems as diverse as the world wide web or social and business networks. Recently, it has been demonstrated that most large networks for which topological information is available display scale-free features. Here we study the scaling properties of the recently introduced scale-free model, which can account for the observed power-law distribution of the connectivities. We develop a mean-field method to predict the growth dynamics of the individual vertices, and use this to calculate analytically the connectivity distribution and the scaling exponents. The mean-field method can be used to address the properties of two variants of the scale-free model that do not display power-law scaling.

Reducing vortex density in superconductors using the ratchet effect

C.-S. Lee, B. Janko, I. Derenyi, A.-L. Barabási

Nature 400, 337-340 (1999)

A serious obstacle impeding the application of low- and high-temperature superconductor devices is the presence of trapped magnetic flux [1,2]: flux lines or vortices can be induced by fields as small as the Earth’s magnetic field. Once present, vortices dissipate energy and generate internal noise, limiting the operation of numerous superconducting devices [2,3]. Methods used to overcome this difficulty include the pinning of vortices by the incorporation of impurities and defects [4], the construction of flux ‘dams’ [5], slots and holes [6], and magnetic shields [2,3] which block the penetration of new flux lines in the bulk of the superconductor or reduce the magnetic field in the immediate vicinity of the superconducting device. The most desirable method would be to remove the vortices from the bulk of the superconductor, but there was hitherto no known phenomenon that could form the basis for such a process. Here we show that the application of an alternating current to a superconductor patterned with an asymmetric pinning potential can induce vortex motion whose direction is determined only by the asymmetry of the pattern. The mechanism responsible for this phenomenon is the so-called ‘ratchet effect’ [7–10], and its working principle applies to both low- and high-temperature superconductors. We demonstrate theoretically that, with an appropriate choice of pinning potential, the ratchet effect can be used to remove vortices from low-temperature superconductors in the parameter range required for various applications.

Molecular-dynamics investigation of the surface stress distribution in a GeSi quantum dot superlattice

I. Daruka, A.-L. Barabási, S. J. Zhou, T. C. Germann, P. S. Lomdahl, A. R. Bishop

Physical Review B 60, R2150-R2153 (1999)

The surface stress distribution in an ordered quantum dot superlattice is investigated using classical molecular dynamics simulations. We find that the surface stress field induced by various numbers (from 1 to 9) of Ge islands embedded in a Si(001) substrate is in good agreement with analytical expressions based on pointlike embedded force dipoles, explaining the tendency of layered arrays to form vertically aligned columns. The short-ranged nature of this stress field implies that only the uppermost layers affect the surface growth and that their influence decreases rapidly with layer depth.

The physics of sandcastles: maximum angle of stability in wet and dry granular media

A.-L. Barabási, R. Albert, P. Schiffer

Physica A 266, 366-371 (1999)

We demonstrate that stability criteria can be used to calculate the maximum angle of stability, θ_m, of a granular medium composed of spherical particles in three dimensions and circular discs in two dimensions. We apply the results to wet granular material by calculating the dependence of θ_m on the liquid content of the material. The results are in good agreement with our experimental data.

Shape transition in growth of strained islands

I. Daruka, J. Tersoff, A.-L. Barabási

Physical Review Letters 82, 2753–2756 (1999)

Strained islands formed in heteroepitaxy sometimes change shape during growth. Here we show that there is typically a first-order shape transition with island size, with the discontinuous introduction of steeper facets at the island edge. We present a phase diagram for island shape as a function of volume and surface energy, showing how surface energy controls the sequence of island shapes with increasing volume. The discontinuous chemical potential at the shape transition drastically affects island coarsening and size distributions.

Collective motion of self-propelled particles: kinetic phase transition in one dimension

A. Czirok, A.-L. Barabási, T. Vicsek

Physical Review Letters 82, 209–212 (1999)

We demonstrate that a system of self-propelled particles exhibits spontaneous symmetry breaking and self-organization in one dimension, in contrast with previous analytical predictions. To explain this surprising result we derive a new continuum theory that can account for the development of the symmetry broken state and belongs to the same universality class as the discrete self-propelled particle model.

Slow drag in a granular medium

R. Albert, M.A. Pfeifer, P. Schiffer, A.-L. Barabási

Physical Review Letters 82, 205–208 (1999)

We have studied the drag force acting on an object moving with low velocity through a granular medium. Although the drag force is a dynamic quantity, its behavior in this regime is dominated by the inhomogeneous distribution of stress in static granular media. We find experimentally that the drag force on a vertical cylinder is linearly dependent on the cylinder diameter, quadratically dependent on the depth of insertion, and independent of velocity. An accompanying analytical calculation based on the static distribution of forces arrives at the same result, demonstrating that the local theory of stress propagation in static granular media can be used to predict this bulk dynamic property.

Liquid-induced transitions in granular media

P. Tegzes, R. Albert, M. Paskvan, A.-L. Barabási, T. Vicsek, P. Schiffer

Physical Review E 60, 5823–5826 (1999)

We investigate the effect of interstitial liquid on the physical properties of granular media by measuring the angle of repose as a function of the liquid content. The resultant adhesive forces lead to three distinct regimes in the observed behavior as the liquid content is increased: a granular regime in which the grains move individually, a correlated regime in which the grains move in correlated clusters, and a plastic regime in which the grains flow coherently. We discuss these regimes in terms of two proposed theories describing the effects of liquid on the physical properties of granular media.

Diameter of the world wide web

R. Albert, H. Jeong, A.-L. Barabási

Nature 401, 130-131 (1999)

Despite its increasing role in communication, the World-Wide Web remains uncontrolled: any individual or institution can create a website with any number of documents and links. This unregulated growth leads to a huge and complex web, which becomes a large directed graph whose vertices are documents and whose edges are links (URLs) that point from one document to another. The topology of this graph determines the web’s connectivity and consequently how effectively we can locate information on it.

Spatial ordering of islands grown on patterned surfaces

C. Lee, A.-L. Barabási

Applied Physics Letters 73, 2651-2653 (1998)

We demonstrate that growth on a sample patterned with an ordered defect array can lead to islands with rather narrow size distribution. However, improvement in the size distribution is achieved only if the growth conditions (flux and temperature) have optimal values, determined by the patterning length scale. Since the scanning tunneling and the atomic force microscopes are capable of inducing surface perturbations that act as potential preferential nucleation sites, our work demonstrates that nanoscale surface patterning can improve the ordering of platelets and self-assembled quantum dots.

Effect of surface roughness on the secondary ion yield in ion sputtering

M. A. Makeev, A.-L. Barabási

Applied Physics Letters 73, 1445–1447 (1998)

There is extensive experimental evidence that, at low temperatures, surface erosion by ion bombardment roughens the sputtered substrate, leading to a self-affine surface. These changes in the surface morphology also modify the secondary ion yield. Here, we calculate analytically the secondary ion yield in terms of parameters characterizing the sputtering process and the interface roughness.

Driven interfaces in disordered media: determination of universality classes from experimental data

R. Albert, A.-L. Barabási, N Carle, A. Dougherty

Physical Review Letters 81, 2926-2929 (1998)

While there have been important theoretical advances in understanding the universality classes of interfaces moving in porous media, the developed tools cannot be directly applied to experiments. Here we introduce a method that can distinguish the isotropic and directed percolation universality classes from snapshots of the interface profile. We test the method on discrete models whose universality class is well known, and use it to identify the universality class of interfaces obtained in experiments on fluid flow in porous media.

Secondary ion yield changes on rippled interfaces

M. A. Makeev, A.-L. Barabási

Applied Physics Letters 72, 906–908 (1998)

Sputter erosion often leads to the development of surface ripples. Here we investigate the effect of the ripples on the secondary ion yield, by calculating the yield as a function of the microscopic parameters characterizing the ion cascade (such as penetration depth, widths of the deposited energy distribution) and the ripples (ripple amplitude, wavelength). We find that ripples can strongly enhance the yield, with the magnitude of the effect depending on the interplay between the ion and ripple characteristics. Furthermore, we compare our predictions with existing experimental results.

Equilibrium phase diagrams for dislocation free self-assembled quantum dots

I. Daruka, A.-L. Barabási

Applied Physics Letters 72, 2102–2104 (1998)

The equilibrium theory of self-assembled quantum dot (SAQD) formation can account for many of the experimentally observed growth modes. Here, we show that despite the large number of material constants entering the free energy of strained islands, there are only four topologically different phase diagrams describing the SAQD formation process. We derive each of these phase diagrams and discuss the physical properties of the predicted growth modes.

Dynamics of ripening of self-assembled II-VI semiconductor quantum dots

S. Lee, I. Daruka, C.S. Kim, A.-L. Barabási, J.L. Merz, J.K. Furdyna

Physical Review Letters 81, 3479-3482 (1998)

We report the systematic investigation of ripening of CdSe self-assembled quantum dots (QDs) on ZnSe. We investigate the size and density of the QDs as a function of time after deposition of CdSe has stopped. The dynamics of the ripening process is interpreted in terms of the theory of Ostwald ripening. Furthermore, the experimental results allow us to identify the growth mode of the QD formation process.

Ratchet effect in surface electromigration: smoothing surfaces by an ac field

I. Derenyi, C.-S. Lee, A.-L. Barabási

Physical Review Letters 80, 1473–1476 (1998)

We demonstrate that for surfaces that have a nonzero Schwoebel barrier the application of an ac field parallel to the surface induces a net electromigration current that points in the descending step direction. The magnitude of the current is calculated analytically and compared with Monte Carlo simulations. Since a downhill current smoothes the surface, our results imply that the application of ac fields can aid the smoothing process during annealing and can slow or eliminate the Schwoebel-barrier-induced mound formation during growth.

Irregularities and power law distributions in the breathing pattern in preterm and term infants

U. Frey, M. Silverman, A.-L. Barabási, B. Suki

Journal of Applied Physiology 85, 789–797 (1998)

Unlike older children, young infants are prone to develop unstable respiratory patterns, suggesting important differences in their control of breathing. We examined the irregular breathing pattern in infants by measuring the time interval between breaths (“interbreath interval”; IBI) assessed from abdominal movement during 2 h of sleep in 25 preterm infants at a postconceptional age of 40.5 ± 5.2 (SD) wk and in 14 term healthy infants at a postnatal age of 8.2 ± 4 wk. In 10 infants we performed longitudinal measurements on two occasions. We developed a threshold algorithm for the detection of a breath so that an IBI included an apneic period and potentially some periods of insufficient tidal breathing excursions (hypopneas). The probability density distribution (P) of IBIs follows a power law, P(IBI) ~ IBI^(-α), with the exponent α providing a statistical measurement of the relative risk of insufficient breathing. With maturation, α increased from 2.62 ± 0.4 at 41.2 ± 3.6 wk to 3.22 ± 0.4 at 47.3 ± 6.4 wk postconceptional age, indicating a decrease in long hypopneas (for paired data P = 0.002). The statistical properties of IBI were well reproduced in a model of the respiratory oscillator on the basis of two hypotheses: 1) tonic neural inputs to the respiratory oscillator are noisy; and 2) the noise explores a critical region where IBI diverges with decreasing tonic inputs. Accordingly, maturation of infant respiratory control can be explained by the tonic inputs moving away from this critical region. We conclude that breathing irregularities in infants can be characterized by α, which provides a link between clinically accessible data and the neurophysiology of the respiratory oscillator.

Self-assembled growth of II-VI quantum dots

J. K. Furdyna, S. Lee, I. Daruka, C.S. Kim, A.-L. Barabási, M. Dobrowolska, J.L. Merz

Nonlinear Optics 18, 85–92 (1997)

Maximum angle of stability in wet and dry spherical granular media

R. Albert, I. Albert, D. Hornbaker, P. Schiffer, A.-L. Barabási

Physical Review E 56, R6271–R6274 (1997)

We demonstrate that stability criteria can be used to calculate the maximum angle of stability θ_m of a granular medium composed of spherical particles in three dimensions and circular disks in two dimensions. The predicted angles are in good agreement with the experimental results. Furthermore, we determine the dependence of θ_m on cohesive forces, applying the results to wet granular material by calculating the dependence of θ_m on the liquid content of the material. We have also studied wet granular media experimentally and find good agreement between the theory and our experimental results.

Dislocation free island formation in heteroepitaxial growth: a study at equilibrium

I. Daruka, A.-L. Barabási

Physical Review Letters 79, 3708–3711 (1997)

We investigate the equilibrium properties of strained heteroepitaxial systems, incorporating the formation and the growth of a wetting film, dislocation-free island formation, and ripening. The derived phase diagram provides a detailed characterization of the possible growth modes in terms of the island density, equilibrium island size, and wetting layer thickness. Comparing our predictions with experimental results we discuss the growth conditions that can lead to stable islands as well as ripening.

Ion-induced surface diffusion in ion sputtering

M. A. Makeev, A.-L. Barabási

Applied Physics Letters 71, 2800–2802 (1997)

Ion bombardment is known to enhance surface diffusion and affect the surface morphology. Here we demonstrate that preferential erosion during ion sputtering can lead to a physical phenomenon reminiscent of surface diffusion, which we call effective surface diffusion (ESD), that does not imply mass transport along the surface and is independent of the temperature. We calculate the ion-induced ESD constant and its dependence on the ion energy, flux and angle of incidence, showing that sputtering can both enhance and suppress surface diffusion. The influence of ion-induced ESD on ripple formation and roughening of ion-sputtered surfaces is discussed and summarized in a morphological phase diagram.

What keeps sandcastles standing?

D. J. Hornbaker, R. Albert, I. Albert, A.-L. Barabási, P. Schiffer

Nature 387, 765 (1997)

Any child playing on the beach knows that the physical properties of wet and dry sand are very different. Wet sand can be used to build sharp-featured sandcastles that would be unstable in dry sand. We have now quantified the effect of adding small quantities of liquid to a granular medium. Nanometre-scale layers of liquid on millimetre-scale grains dramatically increase the repose angle (the steepest stable slope that the substance can form) and allow the development of long-range correlations, or clumps.

Self-assembled island formation in heteroepitaxial growth

A.-L. Barabási

Applied Physics Letters 70, 2565-2567 (1997)

We investigate island formation during heteroepitaxial growth using an atomistic model that incorporates deposition, activated diffusion, and stress relaxation. For high misfit the system naturally evolves into a state characterized by a narrow island size distribution. The simulations indicate the existence of a strain assisted kinetic mechanism responsible for the self-assembling process, involving enhanced detachment of atoms from the edge of large islands and biased adatom diffusion.

Island formation and critical thickness in heteroepitaxy

I. Daruka, A.-L. Barabási

Physical Review Letters 78, 3027 (1997)

In a recent Letter Chen and Washburn [1] proposed a mechanism for island nucleation in large-mismatch heteroepitaxy. The predicted coverage (Θ) dependence of the 3D island density ρ_i(Θ) reproduces the fast increase in the island density near the critical coverage Θ_c ≈ 1.6 ML [2]. Here we show that the critical coverage predicted by Ref. [1] depends strongly on the growth rate, thus contradicting, among others, the experimental results of Refs. [2,3].

Self-organized superlattice formation in II-VI and III-V semiconductors

A.-L. Barabási

Applied Physics Letters 70, 764–767 (1997)

There is extensive recent experimental evidence of spontaneous superlattice (SL) formation in various II–VI and III–V semiconductors. Here we propose an atomistic mechanism responsible for SL formation, and derive a relation predicting the temperature, flux, and miscut dependence of the SL layer thickness. Moreover, the model explains the existence of a critical miscut angle below which no SL is formed, in agreement with results on ZnSeTe, and predicts the formation of a platelet structure for deposition onto high symmetry surfaces, similar to that observed in InAsSb.

We present an analytical study of the interaction of two nonequilibrium conservative fields. Due to the conservative character of the relaxation mechanism, the scaling exponents can be obtained exactly using dynamic renormalization group. We apply our results to surfactant-mediated growth of semiconductors. We find that the coupling between the surfactant thickness and the interface height cannot account for the experimentally observed layered growth, implying that reduced diffusion of the embedded atoms is a key mechanism in surfactant-mediated growth.

Roughening of growing surfaces: kinetic models and continuum theories

A.-L. Barabási

Computational Materials Science 6, 127-134 (1996)

The use of scaling concepts in understanding growth by molecular beam epitaxy (MBE) is increasingly important these days. Here we present a critical discussion on the advantages and disadvantages of kinetic theories and continuum models, two main methods frequently used to study the roughening and scaling of surfaces grown by MBE. Finally, some open problems faced by these approaches are also discussed.

Ballistic random walker

P. Molinas-Mata, M.A. Munoz, D.O. Martinez, A.-L. Barabási

Physical Review E 54, 968–971 (1996)

We introduce and investigate the scaling properties of a random walker that moves ballistically on a two-dimensional square lattice. The walker is scattered (changes direction randomly) every time it reaches a previously unvisited site, and follows ballistic trajectories between two scattering events. The asymptotic properties of the density of unvisited sites and the diffusion exponent can be calculated using a mean-field theory. The obtained predictions are in good agreement with the results of extensive numerical simulations. In particular, we show that this random walk is subdiffusive.
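
The walker rule described in this abstract can be sketched in a few lines of Python. This is only a minimal reading of the rule as stated: the unbounded lattice, the choice of four equally likely directions at each scattering event (including reversal), and the names `ballistic_walk`, `steps` and `seed` are assumptions for illustration and may differ from the setup used in the paper.

```python
import random

def ballistic_walk(steps, seed=0):
    """Walk on a square lattice, scattering only at first visits to a site."""
    rng = random.Random(seed)
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    pos = (0, 0)
    visited = {pos}
    heading = rng.choice(directions)
    for _ in range(steps):
        # Ballistic move: keep the current heading.
        pos = (pos[0] + heading[0], pos[1] + heading[1])
        if pos not in visited:
            # Scattering event: a new site randomizes the direction (assumed uniform).
            visited.add(pos)
            heading = rng.choice(directions)
    return pos, visited

pos, visited = ballistic_walk(1000)
print("final position:", pos, "distinct sites visited:", len(visited))
```

Averaging the squared displacement over many seeds as a function of `steps` would be one way to estimate a diffusion exponent from such a simulation.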

Elastic string in a random medium

H. A. Makse, A.-L. Barabási, H.E. Stanley

Physical Review E 53, 6573–6576 (1996)

We consider a one-dimensional elastic string as a set of massless beads interacting through springs characterized by anisotropic elastic constants. The string, driven by an external force, moves in a medium with quenched disorder. We find that longitudinal fluctuations lead to nonlinear behavior in the equation of motion that is kinematically generated by the motion of the string. The strength of the nonlinear effects depends on the anisotropy of the medium and the distance from the depinning transition. On the other hand, the consideration of restricted solid-on-solid conditions imposed on the string leads to a nonlinear term with a diverging coefficient at the depinning transition.

Invasion percolation and global optimization

A.-L. Barabási

Physical Review Letters 76, 3750–3753 (1996)

Invasion bond percolation (IBP) is mapped exactly into Prim’s algorithm for finding the shortest spanning tree of a weighted random graph. Exploring this mapping, which is valid for arbitrary dimensions and lattices, we introduce a new IBP model that belongs to the same universality class as IBP and generates the minimal energy tree spanning the IBP cluster.
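
For readers unfamiliar with Prim's algorithm, the sketch below shows the textbook version in Python, phrased in invasion terms: starting from one site, the cluster repeatedly invades the weakest bond on its perimeter that leads to a new site. It is not the new IBP model introduced in the paper; the function name `prim_invasion`, the adjacency format and the small lattice example are assumptions made for this illustration.

```python
import heapq
import random

def prim_invasion(adj, start):
    """Return tree edges in the order they are invaded from `start`.

    adj maps each node to a list of (weight, neighbor) pairs.
    """
    visited = {start}
    frontier = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(frontier)
    tree = []
    while frontier:
        w, u, v = heapq.heappop(frontier)  # weakest bond on the perimeter
        if v in visited:
            continue  # bond would close a loop, so skip it
        visited.add(v)
        tree.append((u, v, w))
        for w2, x in adj[v]:
            if x not in visited:
                heapq.heappush(frontier, (w2, v, x))
    return tree

# Small example: a 4 x 4 square lattice with random bond weights (assumed setup).
rng = random.Random(1)
L = 4
adj = {(i, j): [] for i in range(L) for j in range(L)}
for i in range(L):
    for j in range(L):
        for di, dj in ((1, 0), (0, 1)):
            ni, nj = i + di, j + dj
            if ni < L and nj < L:
                w = rng.random()
                adj[(i, j)].append((w, (ni, nj)))
                adj[(ni, nj)].append((w, (i, j)))

tree = prim_invasion(adj, (0, 0))
print(len(tree), "bonds invaded; total weight", round(sum(w for _, _, w in tree), 3))
```

The order of `tree` records the sequence in which bonds are invaded, which is the part of the output most directly comparable to an invasion process.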

Avalanches in the lung: a statistical mechanical approach

A.-L. Barabási, S.V. Buldyrev, H.E. Stanley, B. Suki

Physical Review Letters 76, 2192–2195 (1996)

We study a statistical mechanical model for the dynamics of lung inflation which incorporates recent experimental observations on the opening of individual airways by a cascade or avalanche mechanism. Using an exact mapping of the avalanche problem onto percolation on a Cayley tree, we analytically derive the exponents describing the size distribution of the first avalanches and test the analytical solution by numerical simulations. We find that the treelike structure of the airways, together with the simplest assumptions concerning opening threshold pressures of each airway, is sufficient to explain the existence of power-law distributions observed experimentally.

Directed surfaces in disordered media

A.-L. Barabási, G. Grinstein, M.A. Munoz

Physical Review Letters 76, 1481–1484 (1996)

A fractal model for the first stages of thin film growth

P. Jensen, A.-L. Barabási, H. Larralde, S. Havlin, H.E. Stanley

Fractals 4, 321–329 (1996)

In this paper, we briefly review a model that describes the diffusion-controlled aggregation exhibited by particles as they are deposited on a surface. This model allows us to understand many experiments of thin film deposition. In Sec. 1, we describe the model, which incorporates deposition, particle and cluster diffusion, and aggregation. In Sec. 2, we study the dynamical evolution of the model. Finally, we analyze the effects of small cluster mobility and show that the introduction of cluster diffusion dramatically affects the dynamics of film growth. Some of these effects can be tested experimentally.

Avalanches in the directed percolation depinning and self-organized depinning models of interface roughening

S. V. Buldyrev, L.A.N. Amaral, A.-L. Barabási, S.T. Harrington, S. Havlin, R. Sadr-Lahijani, H.E. Stanley

Fractals 4, 307–319 (1996)

We review the recently introduced Directed Percolation Depinning (DPD) and Self-Organized Depinning (SOD) models for interface roughening with quenched disorder. The differences in the dynamics of the invasion process in these two models are discussed and different avalanche definitions are presented. The scaling properties of the avalanche size distribution and the properties of active cells are discussed.

Growth and percolation of thin films: a model incorporating deposition, diffusion and aggregation

P. Jensen, A.-L. Barabási, H. Larralde, S. Havlin, H.E. Stanley

Chaos, Solitons and Fractals 6, 227–232 (1995)

We propose a model for describing diffusion-controlled aggregation of particles that are continually deposited on a surface. The model, which incorporates deposition, diffusion and aggregation, is motivated by recent thin film deposition experiments. We find that the diffusion and aggregation of randomly deposited particles “builds” a wide variety of fractal structures, all characterized by a common length scale L1. This length L1 scales as the ratio of the diffusion constant over the particle flux to the power 1/4. We compare our results with several recent experiments on two-dimensional nanostructures formed by diffusion-controlled aggregation on surfaces.

Dynamics of Random Networks: Connectivity and First Order Phase Transitions

Albert-László Barabási

arXiv:cond-mat/9511052

The connectivity of individual neurons of large neural networks determines both the steady state activity of the network and its answer to external stimulus. Highly diluted random networks have zero activity. We show that, on increasing the network connectivity, the activity changes discontinuously from zero to a finite value as a critical value in the connectivity is reached. Theoretical arguments and extensive numerical simulations indicate that the origin of this discontinuity in the activity of random networks is a first order phase transition from an inactive to an active state as the connectivity of the network is increased.

Dynamic scaling of ion-sputtered surfaces

R. Cuerno, A.-L. Barabási

Physical Review Letters 74, 4746–4749 (1995)

We derive a stochastic nonlinear equation to describe the evolution and scaling properties of surfaces eroded by ion bombardment. The coefficients appearing in the equation can be calculated explicitly in terms of the physical parameters characterizing the sputtering process. We find that transitions may take place between various scaling behaviors when experimental parameters, such as the angle of incidence of the incoming ions or their average penetration depth, are varied.

Avalanches and the directed percolation depinning model: experiments, simulations and theory

L.A.N. Amaral, A.-L. Barabási, S.V. Buldyrev, S.T. Harrington, S. Havlin, R. Sadr-Lahijani, H.E. Stanley

Physical Review E 51, 4655–4673 (1995)

Scaling properties of driven interfaces in disordered media

L.A.N. Amaral, A.-L. Barabási, H.A. Makse, H.E. Stanley

Physical Review E 52, 4087–5005 (1995)

Lung tissue viscoelasticity: a mathematical framework and its molecular basis

B. Suki, A.-L. Barabási, K. Lutchen

Journal of Applied Physiology 76, 2749–2759 (1994)

Recent studies indicated that lung tissue stress relaxation is well represented by a simple empirical equation involving a power law, t^(-β) (where t is time). Likewise, tissue impedance is well described by a model having a frequency-independent (constant) phase with impedance proportional to ω^(-α) (where ω is angular frequency and α is a constant). These models provide superior descriptions over conventional spring-dashpot systems. Here we offer a mathematical framework and explore its mechanistic basis for using the power law relaxation function and constant-phase impedance. We show that replacing ordinary time derivatives with fractional time derivatives in the constitutive equation of conventional spring-dashpot systems naturally leads to power law relaxation function, the Fourier transform of which is the constant-phase impedance with α = 1 - β. We further establish that fractional derivatives have a mechanistic basis with respect to the viscoelasticity of certain polymer systems. This mechanistic basis arises from molecular theories that take into account the complexity and statistical nature of the system at the molecular level. Moreover, because tissues are composed of long flexible biopolymers, we argue that these molecular theories may also apply for soft tissues. In our approach a key parameter is the exponent β, which is shown to be directly related to dynamic processes at the tissue fiber and matrix level. By exploring statistical properties of various polymer systems, we offer a molecular basis for several salient features of the dynamic passive mechanical properties of soft tissues.

Connectivity of diffusing particles continually deposited on a surface: relation to LECBD experiments

P. Jensen, A.-L. Barabási, H. Larralde, S. Havlin, H.E. Stanley

Physica A 207, 219–227 (1994)

We generalize the conventional model of two-dimensional site percolation by including both (1) continuous deposition of particles on a two-dimensional substrate, and (2) diffusion of these particles in two-dimensions. This new model is motivated by recent thin film deposition experiments using the low-energy cluster beam deposition (LECBD) technique. Depending on various parameters such as deposition flux, diffusion constant, and system size, we find a rich range of fractal morphologies including diffusion limited aggregation (DLA), cluster-cluster aggregation (CCA), and percolation.

Avalanches and power law behavior in lung inflation

B. Suki, A.-L. Barabási, Z. Hantos, F. Petak, H.E. Stanley

Nature 368, 615–618 (1994)

When lungs are emptied during exhalation, peripheral airways close up. For people with lung disease, they may not reopen for a significant portion of inhalation, impairing gas exchange. A knowledge of the mechanisms that govern reinflation of collapsed regions of lungs is therefore central to the development of ventilation strategies for combating respiratory problems. Here we report measurements of the terminal airway resistance, Rt, during the opening of isolated dog lungs. When inflated by a constant flow, Rt decreases in discrete jumps. We find that the probability distribution of the sizes of the jumps and of the time intervals between them exhibit power-law behavior over two decades. We develop a model of the inflation process in which 'avalanches' of airway openings are seen--with power-law distributions of both the size of avalanches and the time intervals between them--which agree quantitatively with those seen experimentally, and are reminiscent of the power-law behavior observed for self-organized critical systems. Thus power-law distributions, arising from avalanches associated with threshold phenomena propagating down a branching tree structure, appear to govern the recruitment of terminal airspaces.
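
The idea of threshold-driven avalanches on a branching tree can be illustrated with a small simulation. The sketch below is a simplified toy version under assumptions that are not taken from the abstract: a complete binary tree, independent opening thresholds drawn uniformly from [0, 1), and a pressure that is raised just enough to open the weakest reachable airway. The function name `simulate_inflation` and its parameters are hypothetical.

```python
import heapq
import random

def simulate_inflation(depth=10, seed=0):
    """Return the avalanche sizes produced while inflating a binary airway tree.

    Airways are stored heap-style: the children of airway i are 2*i + 1 and 2*i + 2.
    """
    rng = random.Random(seed)
    n_airways = 2 ** (depth + 1) - 1
    # Assumed: each airway has an independent, uniformly distributed threshold.
    threshold = [rng.random() for _ in range(n_airways)]

    # Closed airways whose parent is open, ordered by threshold.
    frontier = [(threshold[0], 0)]
    avalanche_sizes = []
    while frontier:
        # Raise the pressure to the smallest threshold that is reachable.
        pressure, first = heapq.heappop(frontier)
        stack, size = [first], 0
        while stack:
            airway = stack.pop()
            size += 1
            for child in (2 * airway + 1, 2 * airway + 2):
                if child >= n_airways:
                    continue
                if threshold[child] <= pressure:
                    stack.append(child)      # opens at once: part of this avalanche
                else:
                    heapq.heappush(frontier, (threshold[child], child))
        avalanche_sizes.append(size)
    return avalanche_sizes

sizes = simulate_inflation(depth=12, seed=1)
print("avalanches:", len(sizes), "largest:", max(sizes))
```

Each airway opens exactly once, so the avalanche sizes always sum to the number of airways; a histogram of `sizes` on log-log axes is one way to look for the broad distribution the abstract describes.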

New exponent characterizing the effect of evaporation on imbibition experiments

L.A.N. Amaral, A.-L. Barabási, S.V. Buldyrev, S. Havlin, H.E. Stanley

Physical Review Letters 72, 641–644 (1994)

We report imbibition experiments investigating the effect of evaporation on the interface roughness and mean interface height. We observe a new exponent characterizing the scaling of the saturated surface width. Further, we argue that evaporation can be usefully modeled by introducing a gradient in the strength of the disorder, in analogy with the gradient percolation model of Sapoval et al. By incorporating this gradient we predict a new critical exponent and a novel scaling relation for the interface width. Both the exponent value and the form of the scaling agree with the experimental results.

Universality classes for interface growth with quenched disorder

L.A.N. Amaral, A.-L. Barabási, H.E. Stanley

Physical Review Letters 73, 62–65 (1994)

Controlling nanostructures

P. Jensen, A.-L. Barabási, H. Larralde, S. Havlin, H.E. Stanley

Nature 368, 22 (1994)

Roder et al. report nanometre-scale structures built by deposition of diffusing particles that aggregate on surfaces. We have developed a microscopic model that mimics the same process and produces morphologies that remarkably resemble the experimental structures.

Surfactant-mediated surface growth: nonequilibrium theory

A.-L. Barabási

Fractals 1, 846–859 (1993)

A number of recent experiments have shown that surfactants can modify the growth mode of an epitaxial film, suppressing islanding and promoting layer-by-layer growth. Here, a set of coupled equations is introduced to describe the coupling between a growing interface and a thin surfactant layer deposited on the top of the nonequilibrium surface. The equations are derived using the main experimentally backed characteristics of the system and basic symmetry principles. The system is studied using a dynamic-renormalization-group scheme, which provides scaling relations between the roughness exponents. It is found that the surfactant may drive the system into a novel phase, in which the surface roughness is negative, corresponding to a flat surface.

Surfactant-mediated growth of nonequilibrium interfaces

A.-L. Barabási

Physical Review Letters 70, 4102–4105 (1993)

A number of recent experiments have shown that surfactants can modify the growth mode of an epitaxial film, suppressing islanding and promoting layer-by-layer growth. Here I introduce a set of coupled equations to describe the nonequilibrium roughening of an interface covered with a thin surfactant layer. The surfactant may drive the system into a novel phase, in which the surface roughness is negative, corresponding to a flat surface.

Anomalous interface roughening: the role of a gradient in the density of pinning sites

L.A.N. Amaral, A.-L. Barabási, S.V. Buldyrev, S. Havlin, H.E. Stanley

Fractals 1, 818–826 (1993)

Anomalous interface roughening in 3D porous media: experiment and model

S.V. Buldyrev, A.-L. Barabási, S. Havlin, J. Kertesz, H.E. Stanley, H.S. Xenias

Physica A 191, 220–226 (1992)

Dynamic scaling of coupled nonequilibrium interfaces

A.-L. Barabási

Physical Review A 46, R2977–R2980 (1992)

We propose a simple discrete model to study the nonequilibrium fluctuations of two locally coupled (1+1)-dimensional systems (interfaces). Measuring numerically the tilt-dependent velocity, we construct a set of stochastic continuum equations describing the fluctuations in the model. The scaling predicted by the equations is studied analytically using dynamic-renormalization-group theory and compared with simulation results.

Three-dimensional Toom model: connection to the Anisotropic Kardar-Parisi-Zhang Equation

A.-L. Barabási, M. Araujo, H.E. Stanley

Physical Review Letters 68, 3729–3732 (1992)

A three-dimensional Toom model is defined and the properties of the interface separating the two stable phases are investigated. Using symmetry arguments we show that in the zero-noise limit the model has only nonequilibrium fluctuations and that the scaling is described by the anisotropic Kardar-Parisi-Zhang equation. The scaling exponents are determined numerically and good agreement with the theoretical predictions is found.

Anomalous interface roughening in porous media: experiment and model

S.V. Buldyrev, A.-L. Barabási, F. Caserta, S. Havlin, H.E. Stanley, T. Vicsek

Physical Review A 45, R8313–R8316 (1992)

We report measurements of the interface formed when a wet front propagates in paper by imbibition and we find anomalous roughening with exponent α=0.63±0.04. We also formulate an imbibition model that agrees with the experimental morphology. The main ingredient of the model is the propagation and pinning of a self-affine interface in the presence of quenched disorder, with erosion of overhangs. By relating our model to directed percolation, we find α~0.63.
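
One way to see where a value near 0.63 can come from is the usual directed-percolation argument, sketched below. It assumes that the pinned interface follows a directed percolation path, so that its roughness exponent is the ratio of the transverse to the longitudinal correlation-length exponents. The numerical values are commonly quoted estimates for (1+1)-dimensional directed percolation and are not taken from the abstract.

```latex
\begin{equation}
  % Assumed mapping: width ~ \xi_\perp, lateral size ~ \xi_\parallel
  \alpha = \frac{\nu_\perp}{\nu_\parallel}
         \approx \frac{1.097}{1.734}
         \approx 0.63
\end{equation}
```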

Multifractality of growing surfaces

A.-L. Barabási, R. Bourbonnais, M. Jensen, J. Kertesz, T. Vicsek, Y.-C. Zhang

Physical Review A 45, R6951–R6954 (1992)

We have carried out large-scale computer simulations of experimentally motivated (1+1)-dimensional models of kinetic surface roughening with power-law-distributed amplitudes of uncorrelated noise. The appropriately normalized qth-order correlation function of the height differences, C_q(x) = <|h(x+x') - h(x')|^q>, shows strong multifractal scaling behavior up to a crossover length depending on the system size, i.e. C_q(x) ~ x^(qH_q), where H_q is a continuously changing nontrivial function. Beyond the crossover length conventional scaling is found.
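
The qth-order height correlation function and the exponents H_q can be estimated from any sampled profile h. The sketch below is a minimal illustration, not the authors' analysis code: the function names `height_correlation` and `estimate_Hq`, the choice of lags, and the plain least-squares fit are assumptions. It is applied to an ordinary Gaussian random walk, a simple test profile for which H_q is expected to stay close to 1/2 for every q (no multifractality).

```python
import math
import random

def height_correlation(h, x, q):
    """C_q(x): average of |h(x + x') - h(x')|**q over all valid origins x'."""
    diffs = [abs(h[i + x] - h[i]) ** q for i in range(len(h) - x)]
    return sum(diffs) / len(diffs)

def estimate_Hq(h, q, lags=(4, 8, 16, 32, 64)):
    """Slope of log C_q(x) against log x, divided by q (least-squares fit)."""
    xs = [math.log(x) for x in lags]
    ys = [math.log(height_correlation(h, x, q)) for x in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
             / sum((a - mean_x) ** 2 for a in xs))
    return slope / q

# Test profile: a Gaussian random walk (an assumption for illustration only).
rng = random.Random(0)
h = [0.0]
for _ in range(20000):
    h.append(h[-1] + rng.gauss(0.0, 1.0))

for q in (1, 2, 4):
    print("q =", q, "H_q ~", round(estimate_Hq(h, q), 2))
```

A profile with multifractal (multi-affine) scaling would instead give estimates of H_q that drift systematically with q over the range of lags below the crossover length.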

Lee et al. reply to Dynamics of ripening of self-assembled II-VI semiconductor quantum dots

S. Lee, I. Daruka, C.S. Kim, A.-L. Barabási, J.K. Furdyna, J.L. Merz

Physical Review Letters 83, 240 (1999)

Despite extensive investigation, little is still known about the physical mechanisms responsible for quantum dot (QD) formation in II-VI semiconductor systems, especially when compared to their group-IV or III-V counterparts. However, the distinct chemical and microscopic features characteristic of the various materials make these diverse systems rather exciting to study and compare. We therefore welcome the Comment by Kratzert et al. [1] that sheds new light on the CdSe island formation on ZnSe. The method used by them—in situ ultrahigh vacuum atomic force microscopy (AFM)—provides valuable information that was not accessible before: it allows one to probe the dynamics of QD formation without external influences (such as the influence of the atmosphere), and it offers minimum delay between QD formations and their characterization. Specifically, and in contrast with our findings [2], these new results do not manifest room temperature ripening of CdSe islands. These new observations, combined with a number of other results recently reported (see below), suggest the existence of three distinct island types:

A stochastic one-dimensional map is introduced to model the steady-state fluctuations of the surface width in far-from-equilibrium surface roughening. The dynamics of the map and the correlations in the time sequence are investigated. In particular, for power law distributed noise a non-trivial multi-affine behaviour is observed.

Multi-affine model for the velocity distribution in fully turbulent flows

T. Vicsek, A.-L. Barabási

Journal of Physics A 24, L845–L851 (1991)

A simple multi-affine model for the velocity distribution in fully developed turbulent flows is introduced to capture the essential features of the underlying geometry of the velocity field. We show that in this model the various relevant quantities characterizing different aspects of turbulence can be readily calculated. A simultaneous good agreement is found with the available experimental data for the velocity structure functions, the D_q spectra obtained from studies of the velocity derivatives, and the exponent describing the scaling of the spectrum of the kinetic energy fluctuations. Our results are obtained analytically assuming a single free parameter. The fractal dimension of the region where the dominating contribution to dissipation comes from is estimated to be D=2.88.

Multifractality of self-affine fractals

A.-L. Barabási, T. Vicsek

Physical Review A 44, 2730–2733 (1991)

The concept of multifractality is extended to self-affine fractals in order to provide a more complete description of fractal surfaces. We show that for a class of iteratively constructed self-affine functions there exists an infinite hierarchy of exponents H_q describing the scaling of the qth-order height-height correlation function C_q(x) ~ x^(qH_q). Possible applications to random walks and turbulent flows are discussed. It is demonstrated on the example of random walks along a chain that for stochastic lattice models leading to self-affine fractals H_q exhibits phase-transition-like behavior.

Multifractal spectra of multi-affine functions

A.-L. Barabási, P. Szepfalusy, T. Vicsek

Physica A 178, 17–28 (1991)

Self-affine functions F(x) with multiscaling height correlations C_q(x) ~ x^(qH_q) are described in terms of the standard multifractal formalism with a modified assumption for the partition. The corresponding quantities and expressions are shown to exhibit some characteristic differences from the standard ones. According to our calculations the f(α)-type spectra are not uniquely determined by the H_q spectrum, but depend on the particular choice which is made for the dependence of N on x, where N is the number of points over which the average is taken. Our results are expected to be relevant in the analysis of signal-type data obtained in experiments on systems with an underlying multiplicative process.

Self-similarity of the loop structure of diffusion-limited aggregates

T. Vicsek, A.-L. Barabási

Journal of Physics A 23, L845–L851 (1990)

The structure of fjords in diffusion-limited aggregation (DLA) clusters can be described in terms of the loop size distribution n_R(x), which is the normalised number of loops with a neck to depth ratio x within a circle of radius R centered at the origin of the cluster. We find from the numerical study of very large off-lattice aggregates that n_R(x) converges quickly to a limiting distribution with a well-defined smallest ratio x_min larger than zero, indicating the self-similarity of the loop structures. Consequently, one does not expect a phase transition in the multifractal spectrum of growth probabilities of typical DLA clusters generated on the plane. Our study is essentially statistical and we cannot rule out the possibility of such 'rare events' (e.g. the occurrence of a few loops with anomalously small x) which may result in a qualitatively different behavior concerning the multifractal spectrum.

Tracing a diffusion-limited-aggregate: self-affine versus self-similar scaling

A.-L. Barabási, T. Vicsek

Physical Review A 41, 6881–6883 (1990)

The geometry of diffusion-limited aggregation clusters is mapped into single-valued functions by tracing the surface of the aggregate and recording the X (or Y) coordinate of the position of a walker moving along the perimeter of the cluster as a function of the arc length. Our numerical results and scaling arguments show that the related plots can be considered as self-affine functions whose scaling behavior is determined by the exponent H=1/D, where D is the fractal dimension of the aggregates.

Supertracks and nth order windows in the chaotic regime

A.-L. Barabási, L. Nitsch, I.A. Dorobantu

Physics Letters A 139, 53–56 (1989)

The purpose of this paper is to generalize the concept of supertrack functions (STF), to sketch the main lines of a renormalization theory of STF and to obtain a scaling relation yielding nth order windows in the chaotic domain for a large class of one-dimensional maps.
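
For readers unfamiliar with the term, supertrack functions are commonly defined as the successive images of a map's critical point, viewed as functions of the control parameter. The sketch below illustrates that common definition only; it is not the generalization proposed in the paper. The logistic map, the indexing convention (s_0 is the critical point itself) and the function name `supertrack` are assumptions made for illustration.

```python
def logistic(r, x):
    """Logistic map f_r(x) = r * x * (1 - x), used here as a stand-in example."""
    return r * x * (1.0 - x)

def supertrack(r, n, f=logistic, critical_point=0.5):
    """n-th supertrack function s_n(r): the n-th iterate of the critical point.

    Assumed convention: s_0(r) = critical_point and s_{k+1}(r) = f(r, s_k(r)).
    """
    x = critical_point
    for _ in range(n):
        x = f(r, x)
    return x

# Tabulate the first few supertrack functions at some parameter values.
for r in (3.6, 3.8, 4.0):
    print(r, [round(supertrack(r, n), 4) for n in range(1, 5)])
```

Plotting s_n(r) against r for several n on top of a bifurcation diagram shows these curves tracing the sharp density ridges inside the chaotic bands, and their crossings mark parameter values where the critical orbit becomes eventually periodic.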

On crises and supertracks: an attempt of a unified theory

A.-L. Barabási, L. Nitsch, I. A. Dorobantu

Revue Roumaine de Physique 34, 353–357 (1989)

An attempt is presented to study crises and supertracks from a unified point of view. The concept of n-th order crises is introduced and used to establish a general frame for describing the crises of one-dimensional maps.