Project acronym ACCORD
Project Algorithms for Complex Collective Decisions on Structured Domains
Researcher (PI) Edith Elkind
Host Institution (HI) THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
Call Details Starting Grant (StG), PE6, ERC-2014-STG
Summary Algorithms for Complex Collective Decisions on Structured Domains.
The aim of this proposal is to substantially advance the field of Computational Social Choice by developing new tools and methodologies that can be used for making complex group decisions in rich and structured environments. We consider settings where each member of a decision-making body has preferences over a finite set of alternatives, and the goal is to synthesise a collective preference over these alternatives, which may take the form of a partial order over the set of alternatives with a predefined structure: examples include selecting a fixed-size set of alternatives, a ranking of the alternatives, a winner and up to two runners-up, etc. We will formulate desiderata that apply to such preference aggregation procedures, design specific procedures that satisfy as many of these desiderata as possible, and develop efficient algorithms for computing them. As the latter step may be infeasible on general preference domains, we will focus on identifying the least restrictive domains that enable efficient computation, and use real-life preference data to verify whether the associated restrictions are likely to be satisfied in realistic preference aggregation scenarios. We will also determine whether our preference aggregation procedures are computationally resistant to malicious behavior. To lower the cognitive burden on the decision-makers, we will extend our procedures to accept partial rankings as inputs. Finally, to further contribute towards bridging the gap between theory and practice of collective decision making, we will provide open-source software implementations of our procedures, and reach out to potential users to obtain feedback on their practical applicability.
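As a concrete illustration of the aggregation setting described above (and not one of the procedures the project will design), the sketch below selects a fixed-size set of alternatives from full rankings using a simple Borda-style scoring rule; the preference profile and committee size are invented for the example.

```python
# Illustrative sketch only: aggregating ranked preferences into a fixed-size
# set of alternatives with a Borda-style scoring rule. The preference profile
# and committee size below are invented for the example.
from collections import defaultdict

def k_borda(profile, k):
    """Return the k alternatives with the highest total Borda score.

    profile: list of rankings, each a list of alternatives from best to worst.
    """
    scores = defaultdict(int)
    m = len(profile[0])                      # number of alternatives
    for ranking in profile:
        for position, alternative in enumerate(ranking):
            scores[alternative] += m - 1 - position
    return sorted(scores, key=scores.get, reverse=True)[:k]

profile = [["a", "b", "c", "d"],
           ["b", "a", "c", "d"],
           ["c", "b", "a", "d"]]
print(k_borda(profile, k=2))                 # ['b', 'a'] for this profile
```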
Max ERC Funding
1 395 933 €
Duration
Start date: 2015-07-01, End date: 2020-06-30
Project acronym ALEXANDRIA
Project Large-Scale Formal Proof for the Working Mathematician
Researcher (PI) Lawrence PAULSON
Host Institution (HI) THE CHANCELLOR MASTERS AND SCHOLARS OF THE UNIVERSITY OF CAMBRIDGE
Call Details Advanced Grant (AdG), PE6, ERC-2016-ADG
Summary Mathematical proofs have always been prone to error. Today, proofs can be hundreds of pages long and combine results from many specialisms, making them almost impossible to check. One solution is to deploy modern verification technology. Interactive theorem provers have demonstrated their potential as vehicles for formalising mathematics through achievements such as the verification of the Kepler Conjecture. Proofs done using such tools reach a high standard of correctness.
However, existing theorem provers are unsuitable for mathematics. Their formal proofs are unreadable. They struggle to do simple tasks, such as evaluating limits. They lack much basic mathematics, and the material they do have is difficult to locate and apply.
ALEXANDRIA will create a proof development environment attractive to working mathematicians, utilising the best technology available across computer science. Its focus will be the management and use of large-scale mathematical knowledge, both theorems and algorithms. The project will employ mathematicians to investigate the formalisation of mathematics in practice. Our already substantial formalised libraries will serve as the starting point. They will be extended and annotated to support sophisticated searches. Techniques will be borrowed from machine learning, information retrieval and natural language processing. Algorithms will be treated similarly: ALEXANDRIA will help users find and invoke the proof methods and algorithms appropriate for the task.
ALEXANDRIA will provide (1) comprehensive formal mathematical libraries; (2) search within libraries, and the mining of libraries for proof patterns; (3) automated support for the construction of large formal proofs; (4) sound and practical computer algebra tools.
ALEXANDRIA will be based on legible structured proofs. Formal proofs should not be mere code, but a machine-checkable form of communication between mathematicians.
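To make the idea of a legible, machine-checkable proof concrete, here is a minimal sketch written in Lean 4; it is purely illustrative, is not drawn from the project's own prover or libraries, and assumes a recent toolchain in which the omega tactic is available.

```lean
-- A minimal sketch of what a legible, machine-checkable proof can look like,
-- written in Lean 4 purely for illustration (the project's own prover and
-- libraries may differ). Assumes a recent toolchain with the `omega` tactic.
theorem even_add_even (m n : Nat)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  cases hm with
  | intro a ha =>
    cases hn with
    | intro b hb =>
      -- Witness: a + b; linear arithmetic discharges the equation.
      exact ⟨a + b, by omega⟩
```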
Max ERC Funding
2 430 140 €
Duration
Start date: 2017-09-01, End date: 2022-08-31
Project acronym ALGAME
Project Algorithms, Games, Mechanisms, and the Price of Anarchy
Researcher (PI) Elias Koutsoupias
Host Institution (HI) THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
Call Details Advanced Grant (AdG), PE6, ERC-2012-ADG_20120216
Summary The objective of this proposal is to bring together a local team of young researchers who will work closely with international collaborators to advance the state of the art of Algorithmic Game Theory and open new avenues of research at the interface of Computer Science, Game Theory, and Economics. The proposal consists mainly of three intertwined research strands: algorithmic mechanism design, price of anarchy, and online algorithms.
Specifically, we will attempt to resolve some outstanding open problems in algorithmic mechanism design: characterizing the incentive-compatible mechanisms for important domains, such as the domain of combinatorial auctions, and resolving the approximation ratio of mechanisms for scheduling unrelated machines. More generally, we will study centralized and distributed algorithms whose inputs are controlled by selfish agents that are interested in the outcome of the computation. We will investigate new notions of mechanisms with strong truthfulness and limited susceptibility to externalities that can facilitate modular design of mechanisms for complex domains.
We will expand the current research on the price of anarchy to time-dependent games where the players can select not only how to act but also when to act. We also plan to resolve outstanding questions on the price of stability and to build a robust approach to these questions, similar to smoothed analysis. For repeated games, we will investigate convergence of simple strategies (e.g., fictitious play), online fairness, and strategic considerations (e.g., metagames). More generally, our aim is to find a productive formulation of playing unknown games by drawing on the fields of online algorithms and machine learning.
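As a small worked illustration of the price of anarchy (a textbook two-player congestion game, not a result of the project), the sketch below enumerates all strategy profiles, finds the pure Nash equilibria, and compares the worst equilibrium cost with the social optimum.

```python
# Illustrative sketch only: the (pure) price of anarchy of a tiny congestion
# game, computed by brute force. The game below (two players, two parallel
# links) is a standard textbook example, not one studied in the project.
from itertools import product

def link_cost(link, load):
    # Link "A" costs its load; link "B" has constant cost 2.
    return load if link == "A" else 2

def social_cost(profile):
    loads = {l: profile.count(l) for l in ("A", "B")}
    return sum(link_cost(l, loads[l]) for l in profile)

def player_cost(profile, i):
    loads = {l: profile.count(l) for l in ("A", "B")}
    return link_cost(profile[i], loads[profile[i]])

def is_pure_nash(profile):
    for i in range(len(profile)):
        for alt in ("A", "B"):
            deviated = list(profile)
            deviated[i] = alt
            if player_cost(tuple(deviated), i) < player_cost(profile, i):
                return False
    return True

profiles = list(product(("A", "B"), repeat=2))
optimum = min(social_cost(p) for p in profiles)
worst_equilibrium = max(social_cost(p) for p in profiles if is_pure_nash(p))
print(worst_equilibrium / optimum)   # 4/3 for this game
```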
Max ERC Funding
2 461 000 €
Duration
Start date: 2013-04-01, End date: 2019-03-31
Project acronym ALUNIF
Project Algorithms and Lower Bounds: A Unified Approach
Researcher (PI) Rahul Santhanam
Host Institution (HI) THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
Call Details Consolidator Grant (CoG), PE6, ERC-2013-CoG
Summary One of the fundamental goals of theoretical computer science is to understand the possibilities and limits of efficient computation. This quest has two dimensions. The theory of algorithms focuses on finding efficient solutions to problems, while computational complexity theory aims to understand when and why problems are hard to solve. These two areas have different philosophies and use different sets of techniques. However, in recent years there have been indications of deep and mysterious connections between them.
In this project, we propose to explore and develop the connections between algorithmic analysis and complexity lower bounds in a systematic way. On the one hand, we plan to use complexity lower bound techniques as inspiration to design new and improved algorithms for Satisfiability and other NP-complete problems, as well as to analyze existing algorithms better. On the other hand, we plan to strengthen implications yielding circuit lower bounds from non-trivial algorithms for Satisfiability, and to derive new circuit lower bounds using these stronger implications.
This project has potential for massive impact in both the areas of algorithms and computational complexity. Improved algorithms for Satisfiability could lead to improved SAT solvers, and the new analytical tools would lead to a better understanding of existing heuristics. Complexity lower bound questions are fundamental but notoriously difficult, and new lower bounds would open the way to unconditionally secure cryptographic protocols and derandomization of probabilistic algorithms. More broadly, this project aims to initiate greater dialogue between the two areas, with an exchange of ideas and techniques which leads to accelerated progress in both, as well as a deeper understanding of the nature of efficient computation.
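For readers unfamiliar with the algorithmic side, the sketch below shows a textbook DPLL-style Satisfiability procedure; it is only a baseline illustration of the kind of algorithm whose improvement and analysis the project targets, and the example formula is invented.

```python
# Illustrative sketch only: a textbook DPLL-style Satisfiability solver, the
# kind of baseline that improved algorithms and sharper analyses would target.
# Formulas are in CNF: a list of clauses, each a list of non-zero integers
# (positive = variable, negative = its negation).
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})
    # Simplify under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue                      # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None                   # clause falsified: backtrack
        simplified.append(rest)
    if not simplified:
        return assignment                 # all clauses satisfied
    # Prefer a unit literal, otherwise branch on an arbitrary literal.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    literal = unit if unit is not None else simplified[0][0]
    for value in ((literal > 0), not (literal > 0)):
        result = dpll(simplified, {**assignment, abs(literal): value})
        if result is not None:
            return result
        if unit is not None:
            break                         # a unit literal has only one option
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
```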
Max ERC Funding
1 274 496 €
Duration
Start date: 2014-03-01, End date: 2019-02-28
Project acronym ASAP
Project Adaptive Security and Privacy
Researcher (PI) Bashar Nuseibeh
Host Institution (HI) THE OPEN UNIVERSITY
Call Details Advanced Grant (AdG), PE6, ERC-2011-ADG_20110209
Summary With the prevalence of mobile computing devices and the increasing availability of pervasive services, ubiquitous computing (Ubicomp) is a reality for many people. This reality is generating opportunities for people to interact socially in new and richer ways, and to work more effectively in a variety of new environments. More generally, Ubicomp infrastructures – controlled by software – will determine users’ access to critical services.
With these opportunities come higher risks of misuse by malicious agents. Therefore, the role and design of software for managing use and protecting against misuse is critical, and the engineering of software that is functionally effective while safeguarding user assets from harm is a key challenge. Indeed, the very nature of Ubicomp means that software must adapt to the changing needs of users and their environment, and, more critically, to the different threats to users’ security and privacy.
ASAP proposes to radically re-conceptualise software engineering for Ubicomp in ways that are cognisant of the changing functional needs of users, of the changing threats to user assets, and of the changing relationships between them. We propose to deliver adaptive software capabilities for supporting users in managing their privacy requirements, and adaptive software capabilities to deliver secure software that underpin those requirements. A key novelty of our approach is its holistic treatment of security and human behaviour. To achieve this, it draws upon contributions from requirements engineering, security & privacy engineering, and human-computer interaction. Our aim is to contribute to software engineering that empowers and protects Ubicomp users. Underpinning our approach will be the development of representations of security and privacy problem structures that capture user requirements, the context in which those requirements arise, and the adaptive software that aims to meet those requirements.
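Purely as an illustration of software that adapts its behaviour to changing context and threats (and in no way the ASAP framework itself), the toy sketch below tightens the set of permitted actions when the observed threat level rises; the contexts and actions are invented.

```python
# Purely illustrative sketch (not the ASAP framework): a toy policy that
# adapts what it permits as the observed context and threat level change.
from dataclasses import dataclass

@dataclass
class Context:
    location: str        # e.g. "home", "public"
    threat_level: int    # e.g. 0 = none, 2 = active attack suspected

def permitted_actions(context):
    """Return the set of actions allowed under the current context."""
    actions = {"read_calendar"}
    if context.location == "home":
        actions |= {"share_location", "sync_health_data"}
    if context.threat_level >= 2:
        # Under elevated threat, fall back to the most restrictive policy.
        actions = {"read_calendar"}
    return actions

print(permitted_actions(Context(location="home", threat_level=0)))
print(permitted_actions(Context(location="public", threat_level=2)))
```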
Max ERC Funding
2 499 041 €
Duration
Start date: 2012-10-01, End date: 2018-09-30
Project acronym BAYES-KNOWLEDGE
Project Effective Bayesian Modelling with Knowledge before Data
Researcher (PI) Norman Fenton
Host Institution (HI) QUEEN MARY UNIVERSITY OF LONDON
Call Details Advanced Grant (AdG), PE6, ERC-2013-ADG
Summary This project aims to improve evidence-based decision-making. What makes it radical is that it plans to do this in situations (common for critical risk assessment problems) where there is little or even no data, and hence where traditional statistics cannot be used. To address this problem, Bayesian analysis, which enables domain experts to supplement observed data with subjective probabilities, is normally used. As real-world problems typically involve multiple uncertain variables, Bayesian analysis is extended using a technique called Bayesian networks (BNs). But, despite many great benefits, BNs have been under-exploited, especially in areas where they offer the greatest potential for improvements (law, medicine and systems engineering). This is mainly because of widespread resistance to relying on subjective knowledge. To address this problem, much current research assumes sufficient data are available to make the expert’s input minimal or even redundant; with such data it may be possible to ‘learn’ the underlying BN model. But this approach offers nothing when there is limited or no data. Even when ‘big’ data are available the resulting models may be superficially objective but fundamentally flawed as they fail to capture the underlying causal structure that only expert knowledge can provide.
Our solution is to develop a method to systemize the way expert-driven causal BN models can be built and used effectively either in the absence of data or as a means of determining what future data is really required. The method involves a new way of framing problems and extensions to BN theory, notation and tools. Working with relevant domain experts, along with cognitive psychologists, our methods will be developed and tested experimentally on real-world critical decision-problems in medicine, law, forensics, and transport. As the work complements current data-driven approaches, it will lead to improved BN modelling both when there is extensive data and when there is none.
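As a minimal illustration of Bayesian reasoning with expert-supplied probabilities in place of data (not the project's method, and with invented numbers), the sketch below queries a two-node Bayesian network using Bayes' rule.

```python
# Minimal sketch, not the project's method: a two-node Bayesian network whose
# probabilities come from expert judgement rather than data, queried with
# Bayes' rule. All numbers below are invented for the example.
p_risk = 0.10                                  # expert prior: P(high risk)
p_incident_given = {True: 0.60, False: 0.05}   # expert: P(incident | risk?)

def posterior_risk(incident_observed=True):
    """P(high risk | evidence about the incident), by Bayes' rule."""
    joint_high = p_risk * p_incident_given[True]
    joint_low = (1 - p_risk) * p_incident_given[False]
    if incident_observed:
        return joint_high / (joint_high + joint_low)
    # Otherwise condition on the incident *not* occurring.
    miss_high = p_risk * (1 - p_incident_given[True])
    miss_low = (1 - p_risk) * (1 - p_incident_given[False])
    return miss_high / (miss_high + miss_low)

print(round(posterior_risk(True), 3))   # ~0.571: one incident shifts belief a lot
```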
Max ERC Funding
1 572 562 €
Duration
Start date: 2014-04-01, End date: 2018-03-31
Project acronym BIGBAYES
Project Rich, Structured and Efficient Learning of Big Bayesian Models
Researcher (PI) Yee Whye Teh
Host Institution (HI) THE CHANCELLOR, MASTERS AND SCHOLARS OF THE UNIVERSITY OF OXFORD
Call Details Consolidator Grant (CoG), PE6, ERC-2013-CoG
Summary As datasets grow ever larger in scale, complexity and variety, there is an increasing need for powerful machine learning and statistical techniques that are capable of learning from such data. Bayesian nonparametrics is a promising approach to data analysis that is increasingly popular in machine learning and statistics. Bayesian nonparametric models are highly flexible models with infinite-dimensional parameter spaces that can be used to directly parameterise and learn about functions, densities, conditional distributions etc, and have been successfully applied to regression, survival analysis, language modelling, time series analysis, and visual scene analysis among others. However, to successfully use Bayesian nonparametric models to analyse the high-dimensional and structured datasets now commonly encountered in the age of Big Data, we will have to overcome a number of challenges. Namely, we need to develop Bayesian nonparametric models that can learn rich representations from structured data, and we need computational methodologies that can scale effectively to the large and complex models of the future. We will ground our developments in relevant applications, particularly to natural language processing (learning distributed representations for language modelling and compositional semantics) and genetics (modelling genetic variations arising from population, genealogical and spatial structures).
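As a small illustration of the Bayesian nonparametric idea that model complexity need not be fixed in advance (and not one of the models the project will develop), the sketch below samples cluster assignments from a Chinese restaurant process, the combinatorial structure underlying the Dirichlet process; the concentration parameter and sample size are arbitrary.

```python
# Illustrative sketch only: sampling cluster assignments from a Chinese
# restaurant process, the combinatorial structure underlying the Dirichlet
# process, one of the basic Bayesian nonparametric building blocks. The
# concentration parameter and sample size are arbitrary.
import random

def chinese_restaurant_process(n, alpha, seed=0):
    rng = random.Random(seed)
    assignments, table_sizes = [], []
    for i in range(n):
        # Existing table k is chosen with probability size_k / (i + alpha),
        # a new table with probability alpha / (i + alpha).
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(0)
        table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes

assignments, sizes = chinese_restaurant_process(n=20, alpha=1.0)
print(sizes)   # the number of clusters is not fixed in advance; it grows with n
```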
Max ERC Funding
1 918 092 €
Duration
Start date: 2014-05-01, End date: 2019-04-30
Project acronym BIMPC
Project Biologically-Inspired Massively-Parallel Computation
Researcher (PI) Stephen Byram Furber
Host Institution (HI) THE UNIVERSITY OF MANCHESTER
Call Details Advanced Grant (AdG), PE6, ERC-2012-ADG_20120216
Summary "We aim to establish a world-leading research capability in Europe for advancing novel models of asynchronous computation based upon principles inspired by brain function. This work will accelerate progress towards an understanding of how the potential of brain-inspired many-core architectures may be harnessed. The results will include new brain-inspired models of asynchronous computation and new brain- inspired approaches to fault-tolerance and reliability in complex computer systems.
Many-core processors are now established as the way forward for computing from embedded systems to supercomputers. An emerging problem with leading-edge silicon technology is a reduction in the yield and reliability of modern processors due to high variability in the manufacture of the components and interconnect as transistor geometries shrink towards atomic scales. We are faced with the longstanding problem of how to make use of a potentially large array of parallel processors, but with the new constraint that the individual elements are the system are inherently unreliable.
The human brain remains as one of the great frontiers of science – how does this organ upon which we all depend so critically actually do its job? A great deal is known about the underlying technology – the neuron – and we can observe large-scale brain activity through techniques such as magnetic resonance imaging, but this knowledge barely starts to tell us how the brain works. Something is happening at the intermediate levels of processing that we have yet to begin to understand, but the essence of the brain's massively-parallel information processing capabilities and robustness to component failure lies in these intermediate levels.
These two issues draw us towards two high-level research questions:
• Can our growing understanding of brain function point the way to more efficient parallel, fault-tolerant computing?
• Can massively parallel computing resources accelerate our understanding of brain function?
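For illustration only, the sketch below simulates a leaky integrate-and-fire neuron, one of the simple spiking abstractions commonly run on brain-inspired many-core hardware; the constants are generic textbook values and the model is not necessarily one the project will adopt.

```python
# Illustrative sketch only: a leaky integrate-and-fire neuron, one of the
# simple abstractions commonly simulated on brain-inspired many-core hardware.
# The constants are generic textbook values, not the project's models.
def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0):
    """Return spike times for a leaky integrate-and-fire cell driven by input_current."""
    v, spikes = v_rest, []
    for step, current in enumerate(input_current):
        # Membrane potential leaks towards rest and integrates the input.
        v += dt * (-(v - v_rest) + current) / tau
        if v >= v_thresh:
            spikes.append(step * dt)
            v = v_rest            # reset after a spike
    return spikes

print(simulate_lif([1.2] * 200))  # a handful of regularly spaced spike times
```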
Max ERC Funding
2 399 761 €
Duration
Start date: 2013-03-01, End date: 2018-02-28
Project acronym BIONET
Project Network Topology Complements Genome as a Source of Biological Information
Researcher (PI) Natasa Przulj
Host Institution (HI) UNIVERSITY COLLEGE LONDON
Call Details Starting Grant (StG), PE6, ERC-2011-StG_20101014
Summary Genetic sequences have had an enormous impact on our understanding of biology. The expectation is that biological network data will have a similar impact. However, progress is hindered by a lack of sophisticated graph theoretic tools that will mine these large networked datasets.
In recent breakthrough work at the boundary of computer science and biology supported by my USA NSF CAREER award, I developed sensitive network analysis, comparison and embedding tools which demonstrated that protein-protein interaction networks of eukaryotes are best modeled by geometric graphs. Also, they established phenotypically validated, unprecedented link between network topology and biological function and disease. Now I propose to substantially extend these preliminary results and design sensitive and robust network alignment methods that will lead to uncovering unknown biology and evolutionary relationships. The potential ground-breaking impact of such network alignment tools could be parallel to the impact the BLAST family of sequence alignment tools that have revolutionized our understanding of biological systems and therapeutics. Furthermore, I propose to develop additional sophisticated graph theoretic techniques to mine network data and hence complement biological information that can be extracted from sequence. I propose to exploit these new techniques for biological applications in collaboration with experimentalists at Imperial College London: 1. aligning biological networks of species whose genomes are closely related, but that have very different phenotypes, in order to uncover systems-level factors that contribute to pronounced differences; 2. compare and contrast stress response pathways and metabolic pathways in bacteria in a unified systems-level framework and exploit the findings for: (a) bioengineering of micro-organisms for industrial applications (production of bio-fuels, bioremediation, production of biopolymers); (b) biomedical applications.
Max ERC Funding
1 638 175 €
Duration
Start date: 2012-01-01, End date: 2017-12-31
Project acronym BroadSem
Project Induction of Broad-Coverage Semantic Parsers
Researcher (PI) Ivan Titov
Host Institution (HI) THE UNIVERSITY OF EDINBURGH
Call Details Starting Grant (StG), PE6, ERC-2015-STG
Summary In the last one or two decades, language technology has achieved a number of important successes, for example, producing functional machine translation systems and beating humans in quiz games. The key bottleneck which prevents further progress in these and many other natural language processing (NLP) applications (e.g., text summarization, information retrieval, opinion mining, dialog and tutoring systems) is the lack of accurate methods for producing meaning representations of texts. Accurately predicting such meaning representations on an open domain with an automatic parser is a challenging and unsolved problem, primarily because of language variability and ambiguity. The reason for the unsatisfactory performance is reliance on supervised learning (learning from annotated resources), with the amounts of annotation required for accurate open-domain parsing exceeding what is practically feasible. Moreover, representations defined in these resources typically do not provide abstractions suitable for reasoning.
In this project, we will induce semantic representations from large amounts of unannotated data (i.e. text which has not been labeled by humans) while guided by information contained in human-annotated data and other forms of linguistic knowledge. This will allow us to scale our approach to many domains and across languages. We will specialize meaning representations for reasoning by modeling relations (e.g., facts) appearing across sentences in texts (document-level modeling), across different texts, and across texts and knowledge bases. Learning to predict this linked data is closely related to learning to reason, including learning the notions of semantic equivalence and entailment. We will jointly induce semantic parsers (e.g., log-linear feature-rich models) and reasoning models (latent factor models) relying on this data, thus ensuring that the semantic representations are informative for applications requiring reasoning.
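To show the kind of output a semantic parser produces (the project aims to induce far richer, statistically learned parsers; this hand-written pattern is only a toy), the sketch below maps a simple subject-verb-object sentence to a predicate-argument meaning representation.

```python
# Toy illustration only: a hand-written pattern that maps a simple sentence to
# a predicate-argument meaning representation. The project aims to *induce*
# far richer parsers from unannotated text; this merely shows the kind of
# output a semantic parser produces.
import re

PATTERN = re.compile(r"^(?P<subj>\w+) (?P<verb>\w+) (?P<obj>\w+)$")

def parse(sentence):
    match = PATTERN.match(sentence.lower().rstrip("."))
    if not match:
        return None
    # A minimal predicate-argument structure.
    return {"predicate": match["verb"],
            "arg0": match["subj"],
            "arg1": match["obj"]}

print(parse("Mary founded Google."))
# {'predicate': 'founded', 'arg0': 'mary', 'arg1': 'google'}
```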
Max ERC Funding
1 457 185 €
Duration
Start date: 2016-05-01, End date: 2021-04-30