|
Current Projects
|
about |
The morphological interpretation of histologic sections
forms the basis of diagnosis and prognostication for cancer. In the
diagnosis of carcinomas, pathologists perform a semiquantitative
analysis of a small set of morphological
features to determine the cancer’s histologic grade. Physicians use
histologic grade to inform their assessment of a carcinoma’s
aggressiveness and a patient’s prognosis. Nevertheless, the
determination of grade in breast cancer examines only a small set of
morphological features of breast cancer epithelial cells, which has
been largely unchanged since the 1920s. A comprehensive analysis of
automatically quantitated morphological features could identify
characteristics of prognostic relevance and provide an accurate and
reproducible means for assessing prognosis from microscopic image
data. We developed the C-Path (Computational Pathologist) system to
measure a rich quantitative feature set from the breast cancer
epithelium and stroma (6642 features), including both standard
morphometric descriptors of image objects and higher-level contextual,
relational, and global image features. These measurements were used to
construct a prognostic model. We applied the C-Path system to
microscopic images from two independent cohorts of breast cancer
patients [from the Netherlands Cancer Institute (NKI) cohort, n = 248,
and the Vancouver General Hospital (VGH) cohort, n = 328]. The
prognostic model score generated by our system was strongly associated
with overall survival in both the NKI and the VGH cohorts (both
log-rank P < 0.001). This association was independent of clinical,
pathological, and molecular factors. Three stromal
features were significantly associated with survival, and this
association was stronger than the association of survival with
epithelial characteristics in the model. These findings implicate
stromal morphologic structure as a previously unrecognized prognostic
determinant for breast cancer.
|
people |
Andrew H. Beck, Ankur R. Sangoi, Samuel Leung, Robert J. Marinelli, Torsten O. Nielsen, Marc J. van de Vijver, Robert B. West, Matt van de Rijn, Daphne Koller
|
about |
Many works in computer vision attempt to tackle problems that form key
components in the scene interpretation task: object recognition, image
segmentation and 3D reconstruction. However, the vast majority of the
work has focused on solving each component in isolation. This
approach has allowed researchers to focus their efforts on engineering
solutions to each of these key problems, resulting in dramatic
improvements in our ability to tackle them. However, this
divide-and-conquer approach has two main limitations. First,
because there is no global understanding of the scene structure, many
methods make errors that appear ridiculous to a human, such as
segmenting a person's head as part of the background, or detecting a
cow in the shadows in the grass. Second, it is only by providing a
consistent set of answers to all of these problems that we can provide
a coherent interpretation for an entire scene.
In this project, we develop an integrated probabilistic model that
provides a consistent, semantic interpretation of all
components of an outdoor scene.
|
people |
Stephen Gould,
Tianshi Gao,
Pawan Kumar,
Ben Packer,
Daphne Koller |
about |
Significant insight about biological networks arises from the study of network motifs: small wiring patterns that are overly abundant in the network. However, wiring patterns, like a street map, only reflect the set of potential routes within a cellular network, but not when and how they are used within different cellular processes. Here, we introduce activity motifs, which, like traffic flow, reflect dynamic patterns that are abundant relative to the given network, and use them to study the timing of transcriptional regulation in Saccharomyces cerevisiae metabolism. Specific timing activity motifs, reflecting ordered transcription, are enriched in cellular responses to changing conditions. Linear pathways are enriched for forward activation patterns, which produce metabolic compounds efficiently; for backward activation, which rapidly initiates the production of a critical substrate; and for backward shutoff, which rapidly stops production of a detrimental product. Branching pathways are enriched for synchronized activation of dependent co-production. We validate our model by measuring protein abundance over a time course, showing that our inferred mRNA timing motifs also occur at the protein level. We also find binding activity motifs, where the genes in a linear chain have ordered binding strength to a particular transcription factor; these binding activity motifs overlap significantly with the timing activity motifs, suggesting a specific biochemical mechanism for ordered transcription. The results show that finely timed transcriptional regulation is abundant in the yeast metabolic network, and is likely to play a role in its adaptation to new environmental conditions. More generally, the framework of activity motifs is applicable for analyzing a variety of biological networks and functional data, and may be useful in elucidating a broad range of cellular functions.
See the accompanying webpage
|
people |
Gal Chechik, Daphne Koller |
about |
The set of cellular metabolic reactions forms a complex network of interactions, but even in well-studied organisms the resulting pathways contain many unidentified enzymes. We study how 'structural' relations between genes in the yeast metabolic pathway are manifested in functional properties of genes and their products, including mRNA expression, protein domain content and cellular localizations. We develop compact and interpretable probabilistic models for representing protein-domain co-occurrences and gene expression time courses. We use our models to complete unidentified enzymes in the pathways, achieving accuracy that is significantly superior to existing state-of-the-art approaches. |
people |
Gal Chechik, Daphne Koller |
about |
Protein-protein interactions are central to all cellular processes. Discovering the mechanisms underlying protein interaction networks will allow for meaningful predictions about the functions of cellular proteins, with possible applications to drug design. We are using probabilistic models to extract patterns from genomic data and make accurate predictions of protein-protein interactions. |
people |
Haidong Wang, Daphne Koller |
more info |
Protein-protein interactions project page
|
about |
We consider the important challenge of recognizing a variety of deformable object classes in images. Of fundamental importance and particular difficulty in this setting is the problem of "outlining" an object, rather than simply deciding on its presence or absence. A major obstacle in learning a model that will allow us to address this task is the need for hand-segmented training images. In this paper we present a novel landmark-based, piecewise-linear model of the shape of an object class. We then formulate a learning approach that allows us to learn this model with minimal user supervision. We circumvent the need for hand-segmentation by transferring the shape "essence" of an object from drawings to complex images. We show that our method is able to automatically and effectively learn, detect and localize a variety of object classes. |
people |
Geremy Heitz, Gal Elidan, Daphne Koller |
|
Past Projects
|
Acting Rationally with Incomplete Utility Information
about |
Traditional decision theory assumes a probability distribution over possible states and full knowledge of the user's utility function over these states. In many problems, however, the utility information is unavailable or too complex to be elicited fully. We extend the notion of rational decision making to deal with such cases. |
people |
Urszula Chajewska, Daphne Koller |
more info |
Urszula's home page
|
about |
Active learning gives the learner the flexibility to choose the data instances that it considers most relevant for learning a particular task. We are investigating how active learning can substantially reduce the need for large quantities of data in classification, density estimation and the discovery of causal structure. |
people |
Simon Tong, Daphne Koller |
more info |
DAGS Active Learning Page
Simon Tong's Research Page
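As a toy illustration of how letting the learner choose its queries can cut labeling cost, the sketch below learns a one-dimensional threshold classifier. It is not the method used in this project: the function name `active_learn_threshold`, the `oracle` callback, and the assumption that labels are 0 below an unknown threshold and 1 at or above it are all hypothetical choices made for the example.

```python
def active_learn_threshold(pool, oracle, budget):
    """Locate the first positive point in a pool using as few labels as possible.

    pool:   unlabeled numbers (hypothetical data).
    oracle: function returning the true label, assumed to be 0 below an
            unknown threshold and 1 at or above it.
    budget: maximum number of labels the learner may request.
    Returns (sorted_points, index_of_first_positive_estimate, labels_used).
    """
    points = sorted(pool)
    # Every index in [left, right] is still a consistent "first positive" index.
    left, right = 0, len(points)
    queries = 0
    while left < right and queries < budget:
        # Query the point that splits the remaining hypotheses in half:
        # this is the instance the learner is currently most uncertain about.
        mid = (left + right) // 2
        if oracle(points[mid]) == 1:
            right = mid
        else:
            left = mid + 1
        queries += 1
    return points, left, queries
```

On a pool of 1000 points this needs at most 10 labels, whereas labeling instances in arbitrary order could require close to 1000 to pin down the same boundary.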
|
Continuous Time Bayesian Networks
about |
Continuous time Bayesian networks describe structured stochastic processes that evolve over continuous time. The state of the system is decomposed into a set of local variables whose values change over time. The dynamics of the system are described by specifying the behavior of each local variable as a function of its parents in a directed (possibly cyclic) graph. The model specifies, at any given point in time, the distribution over two aspects: when a local variable changes its value and the next value it takes. These distributions are determined by the variable's current value and the current values of its parents in the graph. |
people |
Uri Nodelman, Christian Shelton, Daphne Koller |
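The two aspects that a continuous time Bayesian network specifies, when a local variable changes and which value it takes next, can be made concrete with a small forward sampler. This is a minimal sketch under stated assumptions: the two binary variables, the cyclic graph between them, and every number in `RATES` are made up for illustration, and the table layout is a hypothetical encoding of conditional transition intensities rather than the group's own representation.

```python
import random

# Hypothetical two-variable network with a cyclic graph: A depends on B, B on A.
# RATES[var][parent_value][current_value] maps each possible next value to its
# transition intensity given the parent's current value.
RATES = {
    "A": {0: {0: {1: 0.5}, 1: {0: 2.0}},
          1: {0: {1: 3.0}, 1: {0: 0.2}}},
    "B": {0: {0: {1: 1.0}, 1: {0: 1.0}},
          1: {0: {1: 4.0}, 1: {0: 0.1}}},
}
PARENT = {"A": "B", "B": "A"}


def sample_trajectory(initial_state, t_end, rng):
    """Return [(time, state), ...] for one sampled trajectory on [0, t_end)."""
    state = dict(initial_state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        # Outgoing intensities of each variable given its parent's current value.
        out = {v: RATES[v][state[PARENT[v]]][state[v]] for v in RATES}
        exit_rate = {v: sum(out[v].values()) for v in out}
        total = sum(exit_rate.values())
        if total == 0:
            break
        # When: the waiting time until any variable changes is exponential.
        t += rng.expovariate(total)
        if t >= t_end:
            break
        # Which variable changes: chosen in proportion to its exit rate.
        names = list(exit_rate)
        var = rng.choices(names, weights=[exit_rate[v] for v in names])[0]
        # Next value: chosen in proportion to the transition intensities.
        values = list(out[var])
        state[var] = rng.choices(values, weights=[out[var][x] for x in values])[0]
        trajectory.append((t, dict(state)))
    return trajectory
```

Calling `sample_trajectory({"A": 0, "B": 0}, 10.0, random.Random(0))` yields a list of time-stamped states in which exactly one variable changes at each step.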
about |
Game theory is a framework for describing the interrelated behavior of multiple agents acting rationally. We are interested in compact representations for structured games, including Multi-Agent Influence Diagrams (MAIDs). We are developing algorithms to exploit this structure in order to compute equilibria efficiently for large games, of the sort that might occur in real-world settings. |
people |
Ben Blum, Daphne Koller, Christian Shelton |
more info |
Game Tracer Software
|
Hybrid Bayesian Networks
about |
Many real-world problems are naturally described as hybrid systems, which contain both discrete and continuous components. Examples include fault diagnostics in physical systems, tracking human motions and more. We are exploring methods to deal with the challenging problems of representation, inference and learning that come up in these systems. |
people |
Uri Lerner, Daphne Koller |
more info |
Uri Lerner's Publications Page
|
about |
We are developing probabilistic models for analyzing biological data
using probabilistic relational models (PRMs), an extension of Bayesian
networks to a relational setting, where we have multiple interdependent
objects. Using PRMs, we can incorporate multiple sources of data, such
as gene expression patterns, experimental or clinical data, cellular
phenotypes, sequence data, protein 3D structural information, functional
information and more, into the analysis. This enables us to build richer
models that are more suitable for this complex domain.
|
more info |
DAGS Learning Models of Biological and Medical Data Page
GeneXPress
|
about |
Markov Decision Processes are formal models for problems in planning, control, and sequential decision making under uncertainty. In our work, we are mainly concerned with learning optimal controls from data and with exploiting structure for efficient computation. Our focus is on multi-agent systems, partial observability, and continuous states. |
people |
Carlos Guestrin, Christian Shelton, Daphne Koller |
more info |
DAGS MDP Page
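For readers unfamiliar with Markov Decision Processes, the sketch below shows the textbook value iteration algorithm on a small, fully observed, finite MDP. It is a generic illustration, not the structured, multi-agent, or partially observable methods this project targets; the function name and the dictionary layout of `transitions` and `rewards` are assumptions made for the example.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Compute optimal state values and a greedy policy for a finite MDP.

    transitions[s][a] is a list of (probability, next_state) pairs;
    rewards[s][a] is the immediate reward for taking action a in state s.
    """
    values = {s: 0.0 for s in transitions}

    def q_value(s, a):
        # Expected return of taking action a in state s, then acting greedily.
        return rewards[s][a] + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])

    while True:
        delta = 0.0
        for s in transitions:
            best = max(q_value(s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        # Stop once no state value changed by more than the tolerance.
        if delta < tol:
            break

    policy = {s: max(transitions[s], key=lambda a: q_value(s, a)) for s in transitions}
    return values, policy
```

The returned `policy` maps each state to the action with the highest expected discounted return under the converged values.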
|
about |
Probabilistic Relational Models (PRMs) are a language based on relational logic for describing statistical models of structured data. PRMs model complex domains in terms of entities, their properties, and the relations between them. These models represent the uncertainty over the properties of an entity, capturing its probabilistic dependence both on other properties of that entity and on properties of related entities. PRMs can also represent uncertainty over the relational structure itself. |
people |
Nir Friedman, Lise Getoor, Daphne Koller, Uri Nodelman, Avi Pfeffer, Eran Segal, Ben Taskar |
more info |
DAGS PRMs Page
|