Project acronym ABCTRANSPORT
Project Minimalist multipurpose ATP-binding cassette transporters
Researcher (PI) Dirk Jan Slotboom
Host Institution (HI) RIJKSUNIVERSITEIT GRONINGEN
Call Details Starting Grant (StG), LS1, ERC-2011-StG_20101109
Summary Many Gram-positive (pathogenic) bacteria are dependent on the uptake of vitamins from the environment or from the infected host. We have recently discovered the long-elusive family of membrane protein complexes catalyzing such transport. The vitamin transporters have an unprecedented modular architecture consisting of a single multipurpose energizing module (the Energy Coupling Factor, ECF) and multiple exchangeable membrane proteins responsible for substrate recognition (S-components). The S-components have characteristics of ion-gradient driven transporters (secondary active transporters), whereas the energizing modules are related to ATP-binding cassette (ABC) transporters (primary active transporters).
The aim of the proposal is threefold: First, we will address the question of how properties of primary and secondary transporters are combined in ECF transporters to yield a novel transport mechanism. Second, we will study the fundamental and unresolved question of how protein-protein recognition takes place in the hydrophobic environment of the lipid bilayer. The modular nature of the ECF proteins offers a natural system for studying the driving forces behind membrane protein interaction. Third, we will assess whether the ECF transport systems could become targets for antibacterial drugs. ECF transporters are found exclusively in prokaryotes, and their activity is often essential for the viability of Gram-positive pathogens. Thus they could turn out to be an Achilles’ heel for these organisms.
Structural and mechanistic studies (X-ray crystallography, microscopy, spectroscopy and biochemistry) will reveal how the different transport modes are combined in a single protein complex, how transport is energized and catalyzed, and how protein-protein recognition takes place. Microbiological screens will be developed to search for compounds that inhibit prokaryote-specific steps of the mechanism of ECF transporters.
Max ERC Funding
1 500 000 €
Duration
Start date: 2012-01-01, End date: 2017-12-31
Project acronym ALGILE
Project Foundations of Algebraic and Dynamic Data Management Systems
Researcher (PI) Christoph Koch
Host Institution (HI) ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
Call Details Starting Grant (StG), PE6, ERC-2011-StG_20101014
Summary "Contemporary database query languages are ultimately founded on logic and feature an additive operation – usually a form of (multi)set union or disjunction – that is asymmetric in that additions or updates do not always have an inverse. This asymmetry puts a greater part of the machinery of abstract algebra for equation solving outside the reach of databases. However, such equation solving would be a key functionality that problems such as query equivalence testing and data integration could be reduced to: In the current scenario of the presence of an asymmetric additive operation they are undecidable. Moreover, query languages with a symmetric additive operation (i.e., which has an inverse and is thus based on ring theory) would open up databases for a large range of new scientific and mathematical applications.
The goal of the proposed project is to reinvent database management systems with a foundation in abstract algebra and specifically in ring theory. The presence of an additive inverse allows to cleanly define differences between queries. This gives rise to a database analog of differential calculus that leads to radically new incremental and adaptive query evaluation algorithms that substantially outperform the state of the art techniques. These algorithms enable a new class of systems which I call Dynamic Data Management Systems. Such systems can maintain continuously fresh query views at extremely high update rates and have important applications in interactive Large-scale Data Analysis. There is a natural connection between differences and updates, motivating the group theoretic study of updates that will lead to better ways of creating out-of-core data processing algorithms for new storage devices. Basing queries on ring theory leads to a new class of systems, Algebraic Data Management Systems, which herald a convergence of database systems and computer algebra systems."
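The incremental evaluation idea can be made concrete with a small sketch (our illustration, not the project's actual system). Relations are modeled as generalized multisets whose multiplicities live in the ring of integers, so a deletion is simply a tuple with negative multiplicity, and a join view can be maintained with the classical delta rule; all names below are ours.

```python
# Sketch of incremental view maintenance over Z-relations: generalized
# multisets whose multiplicities live in the ring of integers, so every
# update has an additive inverse and a deletion is a negative multiplicity.

from collections import defaultdict

def add(r, s):
    """Ring addition: sum multiplicities tuple-wise, dropping zeros."""
    out = defaultdict(int)
    for rel in (r, s):
        for t, m in rel.items():
            out[t] += m
    return {t: m for t, m in out.items() if m != 0}

def join(r, s):
    """Natural join on the first attribute; multiplicities multiply."""
    out = defaultdict(int)
    for (a, b), m in r.items():
        for (a2, c), n in s.items():
            if a == a2:
                out[(a, b, c)] += m * n
    return dict(out)

def delta_join(r, dr, s, ds):
    """Delta rule: d(R join S) = dR join S + R join dS + dR join dS."""
    return add(add(join(dr, s), join(r, ds)), join(dr, ds))

R = {(1, 'x'): 1, (2, 'y'): 1}
S = {(1, 'u'): 1}
dR = {(2, 'y'): -1}                 # delete (2, 'y') from R
dS = {(2, 'v'): 1}                  # insert (2, 'v') into S

view = join(R, S)
view = add(view, delta_join(R, dR, S, dS))    # incremental maintenance
assert view == join(add(R, dR), add(S, dS))   # agrees with recomputation
```

Because every multiplicity has an additive inverse, the view is updated by applying a small delta instead of recomputing the join from scratch.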
Max ERC Funding
1 480 548 €
Duration
Start date: 2012-01-01, End date: 2016-12-31
Project acronym ANIMETRICS
Project Measurement-Based Modeling and Animation of Complex Mechanical Phenomena
Researcher (PI) Miguel Angel Otaduy Tristan
Host Institution (HI) UNIVERSIDAD REY JUAN CARLOS
Call Details Starting Grant (StG), PE6, ERC-2011-StG_20101014
Summary Computer animation has traditionally been associated with applications in virtual-reality-based training, video games or feature films. However, interactive animation is gaining relevance in a more general scope, as a tool for early-stage analysis, design and planning in many applications in science and engineering. The user can get quick and visual feedback of the results, and then proceed by refining the experiments or designs. Potential applications include nanodesign, e-commerce or tactile telecommunication, but they also reach as far as, e.g., the analysis of ecological, climate, biological or physiological processes.
The application of computer animation is extremely limited in comparison to its potential reach, due to a trade-off between accuracy and computational efficiency. This trade-off is induced by inherent sources of complexity such as nonlinear or anisotropic behaviors, heterogeneous properties, and high dynamic ranges of effects.
The Animetrics project proposes a modeling and animation methodology that consists of a multi-scale decomposition of complex processes, the description of the process at each scale through combinations of simple local models, and the fitting of those local models' parameters to large amounts of data from example effects. The methodology will be explored on specific problems arising in complex mechanical phenomena, including viscoelasticity of solids and thin shells, multi-body contact, granular and liquid flow, and fracture of solids.
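The "fit local model parameters to measured data" step can be illustrated with a minimal sketch (our own, using synthetic data, not the project's models): estimating the stiffness k of a linear spring model f = kx by closed-form least squares from example (displacement, force) measurements.

```python
# Estimate the stiffness k of a linear spring model f = k * x by
# closed-form least squares from example (displacement, force) pairs.
# The measurements below are synthetic, with small added error.

def fit_stiffness(samples):
    """Least-squares k for f = k*x: k = sum(x*f) / sum(x*x)."""
    sxx = sum(x * x for x, _ in samples)
    sxf = sum(x * f for x, f in samples)
    return sxf / sxx

data = [(0.1, 0.31), (0.2, 0.59), (0.3, 0.92), (0.4, 1.18)]
k = fit_stiffness(data)
assert abs(k - 3.0) < 0.1     # recovers the true stiffness k = 3.0
```

Measurement-based animation fits far richer nonlinear, anisotropic models, but the principle is the same: choose local model parameters that best reproduce observed example effects.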
Max ERC Funding
1 277 969 €
Duration
Start date: 2012-01-01, End date: 2016-12-31
Project acronym ANTICS
Project Algorithmic Number Theory in Computer Science
Researcher (PI) Andreas Enge
Host Institution (HI) INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE
Call Details Starting Grant (StG), PE6, ERC-2011-StG_20101014
Summary "During the past twenty years, we have witnessed profound technological changes, summarised under the terms of digital revolution or entering the information age. It is evident that these technological changes will have a deep societal impact, and questions of privacy and security are primordial to ensure the survival of a free and open society.
Cryptology is a main building block of any security solution, and at the heart of projects such as electronic identity and health cards, access control, digital content distribution or electronic voting, to mention only a few important applications. During the past decades, public-key cryptology has established itself as a research topic in computer science; tools of theoretical computer science are employed to “prove” the security of cryptographic primitives such as encryption or digital signatures and of more complex protocols. It is often forgotten, however, that all practically relevant public-key cryptosystems are rooted in pure mathematics, in particular, number theory and arithmetic geometry. In fact, the socalled security “proofs” are all conditional to the algorithmic untractability of certain number theoretic problems, such as factorisation of large integers or discrete logarithms in algebraic curves. Unfortunately, there is a large cultural gap between computer scientists using a black-box security reduction to a supposedly hard problem in algorithmic number theory and number theorists, who are often interested in solving small and easy instances of the same problem. The theoretical grounds on which current algorithmic number theory operates are actually rather shaky, and cryptologists are generally unaware of this fact.
The central goal of ANTICS is to rebuild algorithmic number theory on the firm grounds of theoretical computer science."
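To make the hardness assumptions concrete, the following toy sketch (our illustration, not part of the proposal) solves the discrete-logarithm problem in (Z/pZ)* with baby-step giant-step. The algorithm takes on the order of sqrt(n) group operations, which is fine at toy parameter sizes and hopeless at real cryptographic key sizes.

```python
# Baby-step giant-step for the discrete logarithm: given g, h, p,
# find x with g**x == h (mod p). Writes x = i*n + j and looks for a
# collision between the "baby steps" g^j and the "giant steps" h*g^(-i*n).

from math import isqrt

def discrete_log(g, h, p):
    """Return x with pow(g, x, p) == h, or None if no x < n*n exists."""
    n = isqrt(p - 1) + 1
    baby = {pow(g, j, p): j for j in range(n)}   # g^j for j = 0..n-1
    giant = pow(g, -n, p)                        # g^(-n) mod p (Python 3.8+)
    y = h
    for i in range(n):
        if y in baby:
            return i * n + baby[y]
        y = y * giant % p
    return None

p, g = 101, 2          # 2 generates the multiplicative group mod 101
x = 57
h = pow(g, x, p)
assert discrete_log(g, h, p) == x
```

At 2048-bit primes the same search space is astronomically large, which is exactly the gap between easy small instances and the conjectured intractability on which the security "proofs" rest.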
Max ERC Funding
1 453 507 €
Duration
Start date: 2012-01-01, End date: 2016-12-31
Project acronym ARCA
Project Analysis and Representation of Complex Activities in Videos
Researcher (PI) Juergen Gall
Host Institution (HI) RHEINISCHE FRIEDRICH-WILHELMS-UNIVERSITAT BONN
Call Details Starting Grant (StG), PE6, ERC-2015-STG
Summary The goal of the project is to automatically analyse human activities observed in videos. Any solution to this problem will allow the development of novel applications. It could be used to create short videos that summarize daily activities to support patients suffering from Alzheimer's disease. It could also be used for education, e.g., by providing a video analysis for a trainee in the hospital that shows if the tasks have been correctly executed.
The analysis of complex activities in videos, however, is very challenging since activities vary in temporal duration between minutes and hours, involve interactions with several objects that change their appearance and shape, e.g., food during cooking, and are composed of many sub-activities, which can happen at the same time or in various orders.
While the majority of recent works in action recognition focus on developing better feature encoding techniques for classifying sub-activities in short video clips of a few seconds, this project moves forward and aims to develop a higher-level representation of complex activities that overcomes the limitations of current approaches. This includes handling large temporal variations and the ability to recognize and locate complex activities in videos. To this end, we aim to develop a unified model that provides detailed information about the activities and sub-activities in terms of time and spatial location, as well as the involved pose motion, objects and their transformations.
Another aspect of the project is to learn a representation from videos that is not tied to a specific source of videos or limited to a specific application. Instead, we aim to learn a representation that is invariant to a change of perspective, e.g., from a third-person to an egocentric perspective, and that can be applied to various modalities such as videos or depth data without the need to collect massive training data for all modalities. In other words, we aim to learn the essence of activities.
Max ERC Funding
1 499 875 €
Duration
Start date: 2016-06-01, End date: 2021-05-31
Project acronym ASAP
Project Thylakoid membrane in action: acclimation strategies in algae and plants
Researcher (PI) Roberta Croce
Host Institution (HI) STICHTING VU
Call Details Starting Grant (StG), LS1, ERC-2011-StG_20101109
Summary Life on earth is sustained by the process that converts sunlight into chemical energy: photosynthesis. This process operates near the boundary between life and death: if the absorbed energy exceeds the capacity of the metabolic reactions, it can result in photo-oxidation events that can cause the death of the organism. Over-excitation happens quite often: oxygenic organisms are exposed to (drastic) changes in environmental conditions (light intensity, light quality and temperature), which influence the physical (light-harvesting) and chemical (enzymatic reactions) parts of the photosynthetic process to different extents, leading to severe imbalances. However, daily experience tells us that plants are able to deal with most of these situations, surviving and happily growing. How do they manage? The photosynthetic membrane is highly flexible: it is able to change its supramolecular organization and composition, and even the function of some of its components, on a time scale as fast as a few seconds, thereby regulating its light-harvesting capacity. However, the structural and functional changes in the membrane are far from fully characterized, and the molecular mechanisms of their regulation are far from understood. This is because all these mechanisms require the simultaneous presence of various factors, so the system must be analyzed at a high level of complexity; yet obtaining molecular details of a system as complex as the thylakoid membrane in action has not been possible so far. Over the last years we have developed and optimized a range of methods that now allow us to take up this challenge. This involves a high level of integration of biological and physical approaches, ranging from plant transformation and in vivo knock-out of individual pigments to ultrafast spectroscopy, in a mix that is rather unique to my laboratory and will allow us to unravel the photoprotective mechanisms of algae and plants.
Max ERC Funding
1 696 961 €
Duration
Start date: 2011-12-01, End date: 2017-11-30
Project acronym ATMINDDR
Project ATMINistrating ATM signalling: exploring the significance of ATM regulation by ATMIN
Researcher (PI) Axel Behrens
Host Institution (HI) THE FRANCIS CRICK INSTITUTE LIMITED
Call Details Starting Grant (StG), LS1, ERC-2011-StG_20101109
Summary ATM is the protein kinase that is mutated in the hereditary autosomal recessive disease ataxia telangiectasia (A-T). A-T patients display immune deficiencies, cancer predisposition and radiosensitivity. The molecular role of ATM is to respond to DNA damage by phosphorylating its substrates, thereby promoting repair of the damage or arresting the cell cycle. Following the induction of double-strand breaks (DSBs), the NBS1 protein is required for activation of ATM. But ATM can also be activated in the absence of DNA damage: treatment of cultured cells with hypotonic stress leads to the activation of ATM, presumably due to changes in chromatin structure. We have recently described a second ATM cofactor, ATMIN (ATM INteractor). ATMIN is dispensable for DSB-induced ATM signalling, but ATM activation following hypotonic stress is mediated by ATMIN. While the biological role of ATM activation by DSBs and NBS1 is well established, the significance, if any, of ATM activation by ATMIN and changes in chromatin has until now been completely enigmatic.
ATM is required for class switch recombination (CSR) and the suppression of translocations in B cells. In order to determine whether ATMIN is required for any of the physiological functions of ATM, we generated a conditional knock-out mouse model for ATMIN. ATM signalling was dramatically reduced following osmotic stress in ATMIN-mutant B cells. ATMIN deficiency led to impaired CSR, and consequently ATMIN-mutant mice developed B cell lymphomas. Thus ablation of ATMIN resulted in a severe defect in ATM function. Our data strongly argue for the existence of a second, NBS1-independent mode of ATM activation that is physiologically relevant. While a large amount of scientific effort has gone into characterising ATM signalling triggered by DSBs, essentially nothing is known about NBS1-independent ATM signalling. The experiments outlined in this proposal aim to identify and understand the molecular pathway of ATMIN-dependent ATM signalling.
Max ERC Funding
1 499 881 €
Duration
Start date: 2012-02-01, End date: 2018-01-31
Project acronym BIGCODE
Project Learning from Big Code: Probabilistic Models, Analysis and Synthesis
Researcher (PI) Martin Vechev
Host Institution (HI) EIDGENOESSISCHE TECHNISCHE HOCHSCHULE ZUERICH
Call Details Starting Grant (StG), PE6, ERC-2015-STG
Summary The goal of this proposal is to fundamentally change the way we build and reason about software. We aim to develop new kinds of statistical programming systems that provide probabilistically likely solutions to tasks that are difficult or impossible to solve with traditional approaches.
These statistical programming systems will be based on probabilistic models of massive codebases (also known as “Big Code”) built via a combination of advanced programming languages and powerful machine learning and natural language processing techniques. To solve a particular challenge, a statistical programming system will query a probabilistic model, compute the most likely predictions, and present those to the developer.
Based on probabilistic models of “Big Code”, we propose to investigate new statistical techniques in the context of three fundamental research directions: i) statistical program synthesis where we develop techniques that automatically synthesize and predict new programs, ii) statistical prediction of program properties where we develop new techniques that can predict important facts (e.g., types) about programs, and iii) statistical translation of programs where we investigate new techniques for statistical translation of programs (e.g., from one programming language to another, or to a natural language).
We believe the research direction outlined in this interdisciplinary proposal opens a new and exciting area of computer science. This area will combine sophisticated statistical learning and advanced programming language techniques for building the next-generation statistical programming systems.
We expect the results of this proposal to have an immediate impact upon millions of developers worldwide, triggering a paradigm shift in the way tomorrow's software is built, as well as a long-lasting impact on scientific fields such as machine learning, natural language processing, programming languages and software engineering.
Max ERC Funding
1 500 000 €
Duration
Start date: 2016-04-01, End date: 2021-03-31
Project acronym BIONET
Project Network Topology Complements Genome as a Source of Biological Information
Researcher (PI) Natasa Przulj
Host Institution (HI) UNIVERSITY COLLEGE LONDON
Call Details Starting Grant (StG), PE6, ERC-2011-StG_20101014
Summary Genetic sequences have had an enormous impact on our understanding of biology. The expectation is that biological network data will have a similar impact. However, progress is hindered by a lack of sophisticated graph theoretic tools that can mine these large networked datasets.
In recent breakthrough work at the boundary of computer science and biology, supported by my US NSF CAREER award, I developed sensitive network analysis, comparison and embedding tools which demonstrated that protein-protein interaction networks of eukaryotes are best modeled by geometric graphs. These tools also established an unprecedented, phenotypically validated link between network topology and biological function and disease. Now I propose to substantially extend these preliminary results and design sensitive and robust network alignment methods that will lead to uncovering unknown biology and evolutionary relationships. The ground-breaking impact of such network alignment tools could parallel that of the BLAST family of sequence alignment tools, which has revolutionized our understanding of biological systems and therapeutics. Furthermore, I propose to develop additional sophisticated graph theoretic techniques to mine network data and hence complement the biological information that can be extracted from sequence. I propose to exploit these new techniques for biological applications in collaboration with experimentalists at Imperial College London: 1. align biological networks of species whose genomes are closely related but that have very different phenotypes, in order to uncover systems-level factors that contribute to the pronounced differences; 2. compare and contrast stress response pathways and metabolic pathways in bacteria in a unified systems-level framework and exploit the findings for: (a) bioengineering of micro-organisms for industrial applications (production of bio-fuels, bioremediation, production of biopolymers); (b) biomedical applications.
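To give a flavor of topology-driven network alignment, the toy sketch below assigns each node a signature built from its degree and its neighbours' degrees, then greedily matches nodes of two networks by signature similarity. The networks, signatures, and greedy strategy are illustrative assumptions only, not the proposal's actual graphlet-based methods.

```python
def signature(adj, node):
    """Degree of `node` plus the sorted degrees of its neighbours."""
    return (len(adj[node]), tuple(sorted(len(adj[n]) for n in adj[node])))

def align(adj1, adj2):
    """Greedily pair each node of net1 with the closest-signature node of net2."""
    sigs2 = {v: signature(adj2, v) for v in adj2}
    mapping, used = {}, set()
    for u in sorted(adj1, key=lambda u: -len(adj1[u])):  # high-degree nodes first
        s1 = signature(adj1, u)
        # Prefer an exact signature match; break ties by degree difference.
        best = min((v for v in adj2 if v not in used),
                   key=lambda v: (sigs2[v] != s1, abs(sigs2[v][0] - s1[0])))
        mapping[u] = best
        used.add(best)
    return mapping

# Two small isomorphic "interaction networks" with different node labels.
net1 = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
net2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
print(align(net1, net2))  # the hub "a" maps to the hub "y"
```

Production methods such as graphlet-based alignment use much richer topological signatures, but the principle is the same: match nodes across networks by local topology rather than by sequence.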
Max ERC Funding
1 638 175 €
Duration
Start date: 2012-01-01, End date: 2017-12-31
Project acronym BroadSem
Project Induction of Broad-Coverage Semantic Parsers
Researcher (PI) Ivan Titov
Host Institution (HI) THE UNIVERSITY OF EDINBURGH
Call Details Starting Grant (StG), PE6, ERC-2015-STG
Summary Over the last two decades, language technology has achieved a number of important successes, for example, producing functional machine translation systems and beating humans in quiz games. The key bottleneck which prevents further progress in these and many other natural language processing (NLP) applications (e.g., text summarization, information retrieval, opinion mining, dialog and tutoring systems) is the lack of accurate methods for producing meaning representations of texts. Accurately predicting such meaning representations on an open domain with an automatic parser is a challenging and unsolved problem, primarily because of language variability and ambiguity. The reason for the unsatisfactory performance is reliance on supervised learning (learning from annotated resources), with the amounts of annotation required for accurate open-domain parsing exceeding what is practically feasible. Moreover, representations defined in these resources typically do not provide abstractions suitable for reasoning.
In this project, we will induce semantic representations from large amounts of unannotated data (i.e., text that has not been labeled by humans) while guided by information contained in human-annotated data and other forms of linguistic knowledge. This will allow us to scale our approach to many domains and across languages. We will specialize meaning representations for reasoning by modeling relations (e.g., facts) appearing across sentences in texts (document-level modeling), across different texts, and across texts and knowledge bases. Learning to predict this linked data is closely related to learning to reason, including learning the notions of semantic equivalence and entailment. We will jointly induce semantic parsers (e.g., log-linear feature-rich models) and reasoning models (latent factor models) relying on this data, thus ensuring that the semantic representations are informative for applications requiring reasoning.
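The summary mentions log-linear feature-rich models for semantic parsing. The minimal sketch below, with hand-set weights and hypothetical (word, predicate) features, scores candidate meaning representations with a log-linear model and returns the most probable one; an induced parser would learn these weights and use far richer features.

```python
import math

# Hypothetical (content word, predicate) features with hand-set weights.
weights = {
    ("capital", "capital_of"): 2.0,
    ("capital", "located_in"): 0.3,
    ("born", "birthplace"): 1.5,
}

def features(sentence, meaning):
    """Fire one feature per (word, predicate) pair present in the weight table."""
    return [(w, meaning) for w in sentence.split() if (w, meaning) in weights]

def parse(sentence, candidates):
    """Return the candidate meaning with the highest model probability."""
    scores = [sum(weights[f] for f in features(sentence, m)) for m in candidates]
    z = sum(math.exp(s) for s in scores)       # partition function
    probs = [math.exp(s) / z for s in scores]  # softmax over candidates
    return max(zip(candidates, probs), key=lambda cp: cp[1])

print(parse("what is the capital of France",
            ["capital_of", "located_in", "birthplace"]))
```

The softmax over feature scores is what makes the model log-linear: the log-probability of each candidate is a linear function of its feature weights, up to the shared normalizer.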
Max ERC Funding
1 457 185 €
Duration
Start date: 2016-05-01, End date: 2021-04-30