Data, Knowledge, and the Web

The advent of large-scale data on the Web and elsewhere poses new challenges and opportunities. Concepts, models, and algorithms from several fields, including database systems, information retrieval, natural language processing, statistical learning, and data mining, can help us analyze and learn from this data.

Groups and Researchers in this Field

Algorithms & Inequality

Rediet Abebe is a junior fellow at the Harvard Society of Fellows and an Andrew Carnegie Fellow. Her research examines the interaction of algorithms and inequality, with a focus on contributing to the scientific foundations of this area. Abebe has also co-founded numerous organizations, including the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), and the associated international research initiative. Abebe is the recipient of numerous awards and honors, including the Hector Endowed Fellowship by the European Laboratory for Learning and Intelligent Systems (ELLIS), MIT Technology Review's 35 Innovators Under 35, the ACM SIGKDD Dissertation Award, and an honorable mention for the ACM SIGecom Dissertation Award. Abebe is currently leading several large-scale evaluations of ML systems used in commercial, legal, and policy contexts.

Rediet Abebe

MPI-IS, Adjunct Faculty
Personal Website

Responsible Computing

Asia Biega is a tenure-track faculty member at the MPI for Security and Privacy. Through interdisciplinary collaborations, she designs ethically, socially, and legally responsible information and social computing systems and studies how they interact with and influence their users. Before joining Microsoft Research, she completed her PhD summa cum laude at the Max Planck Institute for Informatics and Saarland University. Her doctoral work focused on the issues of privacy and fairness in search systems. She has published her work in leading information retrieval, Web, and data mining venues. Beyond academia, her perspectives and methodological approaches are informed by industry experience, including work on privacy infrastructure at Google and consulting for Microsoft product teams on issues related to FATE (Fairness, Accountability, Transparency, and Ethics) and privacy.

Asia Biega

MPI-SP, Faculty
Personal Website

Data Systems

Laurent Bindschaedler is a Research Group Leader at the Max Planck Institute for Software Systems, where he leads the Data Systems Group (DSG). Focused on applications, his group explores a wide range of topics at the intersection of systems, data management, and machine learning, such as systems for big data and machine learning, machine learning for systems, real-time analytics systems, and decentralized systems like blockchains. Laurent is known for building the Chaos graph processing system, which holds a record for the largest graph processed on a small cluster of commodity servers. The Data Systems Group is dedicated to advancing the field of data systems by developing innovative methods, tools, and technologies to manage and analyze large-scale data sets, thereby empowering organizations and researchers to unlock the full potential of their data for innovation, improved decision-making, and complex problem-solving.

Laurent Bindschaedler

MPI-SWS, Research Group Leader
Personal Website

Data Science for Humanity

Meeyoung Cha is a scientific director of MPI-SP in Bochum, Germany. Her interests include data science and computational social science, with a focus on understanding social information and human-machine interactions. Meeyoung’s research on misinformation, poverty mapping, fraud detection, and long-tail content has been widely cited and has received best paper awards. She is the recipient of the Korean Young Information Scientist Award 2019, the AAAI ICWSM Test-of-Time Award 2020, and the ACM IMC Test-of-Time Award 2022. Prior to joining MPI, Meeyoung was a chief investigator at IBS (2019-current), a faculty member at KAIST (2010-current), a visiting professor at Facebook (2015-2016), and a postdoctoral researcher at MPI-SWS (2008-2010). She received her Ph.D. in computer science from KAIST in 2008.

Meeyoung Cha

MPI-SP, Scientific Director
Personal Website

Human-Centric Machine Learning

Manuel Gomez Rodriguez is interested in developing machine learning and large-scale data mining methods for the analysis and modeling of large real-world networks and the processes that take place over them. His research comprises several dimensions: developing models of these networks and processes and assessing their theoretical properties and limitations; developing machine learning algorithms to fit the models and computational methods to influence processes over networks; and validating models and methods on gigabyte- and terabyte-scale real-world datasets. Ultimately, he aims to provide computational tools with applications in a variety of domains, e.g., social and information sciences, economics, decision theory, causality, and epidemiology.

Manuel Gomez Rodriguez

MPI-SWS, Faculty
Personal Website

Social Computing

Krishna Gummadi heads the Social Computing research group at the Max Planck Institute for Software Systems. He is broadly interested in understanding and building networked and distributed computer systems. Currently, the group's research focuses on social computing systems: an emerging class of societal-scale human-computer systems that facilitate interactions and knowledge exchange between individuals, organizations, and governments in our society. A few examples include social networking sites like Facebook, blogging and microblogging sites like LiveJournal and Twitter, and content sharing sites like YouTube, among many others. By conducting user studies, analyzing data, and building systems, the group aims to understand, predict, and control the behavior of these systems' constituent human users and computer systems.

Krishna Gummadi

MPI-SWS, Faculty
Personal Website

Algorithms and Society

Celestine Mendler-Dünner is a research group leader at MPI-IS, a Principal Investigator at the ELLIS Institute Tübingen, and a faculty member of the Tübingen AI Center. Her research spans machine learning, prediction and algorithmic decision-making with a focus on the societal embedding of technology, broadly scoped. She pursues theoretical as well as empirical questions that shed light on how data-driven systems interact with society, and how to build reliable systems in dynamic environments. She obtained her PhD from ETH Zurich in computer science and before moving to Tübingen she spent two years as a SNSF postdoctoral fellow at UC Berkeley. Her research contributions have been recognized with the ETH Medal, the Fritz Kutter Prize and the IBM Eminence and Excellence award. She is a fellow of the Elisabeth Schiemann Kolleg, and a member of the Tübingen Cluster of Excellence on ML for Science.

Celestine Mendler-Dünner

MPI-IS, Research Group Leader
Personal Website

Bridging AI and Neuroscience

Mariya Toneva’s research is at the intersection of Machine Learning, Natural Language Processing, and Neuroscience. Her group bridges language in machines with language in the brain, with a focus on building computational models of language processing in the brain that can also improve natural language processing systems. Prior to joining MPI-SWS, she conducted research as a C.V. Starr Fellow at the Princeton Neuroscience Institute. She received her Ph.D. in a joint program between Machine Learning and Neural Computation from Carnegie Mellon University.

Mariya Toneva

MPI-SWS, Faculty
Personal Website

SPRING: Security and Privacy Engineering

Carmela Troncoso heads the SPRING Lab. Her work focuses on building and deploying secure and privacy-preserving systems that minimize societal harms and on critically analyzing technologies with respect to the protection they provide to social values. She received her PhD from KU Leuven in 2011. Her work on privacy has received multiple awards, including the CNIL-INRIA Privacy Protection Award in 2017, and she was named to Fortune's 40 Under 40 in technology in 2020.

Carmela Troncoso

MPI-SP, Scientific Director
Personal Website

Digital and Computational Demography

Emilio Zagheni is a scientific director at the Max Planck Institute for Demographic Research (MPIDR), where he heads the Department of Digital and Computational Demography. Zagheni is best known for his work on combining digital trace data and traditional sources to track and understand migrations and to advance population science. The main goal of the Department of Digital and Computational Demography is to advance fundamental population science, through the lens of digital and computational perspectives, for the benefit of everyone. Thematically, a first primary focal area, addressed by the Laboratory of Migration and Mobility, is on measuring, understanding, and predicting the causes and consequences of migration. A second primary focal area, addressed by the Laboratory of Population Dynamics and Sustainable Well-Being, is on monitoring, understanding, and predicting the factors that shape people’s well-being across space, time, and demographic characteristics, and as they relate to mortality and health, fertility, social and economic processes, and sustainable development.

Emilio Zagheni

MPI-DR, Scientific Director
Personal Website

Research at Partner Universities

Data Engineering Group

Database Systems Research Group

Big Data Research Area

Database and Information Systems Group