Dr Uri Keich

F07 - Carslaw Building
The University of Sydney

Telephone 9351 2307
Fax 9351 4534

Website Personal web page

Research interests

Bioinformatics. My current research is motivated by, and mostly concentrated in, biological sequence analysis. It ranges from collaborative research on actual biological problems, such as the mapping of sequence motifs involved in DNA replication initiation, through the development of tools for the discovery and analysis of sequence motifs, to the design of novel computational approaches for the analysis of classical statistical tests. The following examples of projects I am working on will give you a better idea of what I mean.

Motif Finding - The identification of transcription factor binding sites is an important step in understanding the regulation of gene expression. To address this need, many motif-finding tools (or finders) have been described that can find short sequence motifs given, for example, only an input set of sequences. The motifs returned by these tools are evaluated and ranked according to some measure of statistical over-representation, the most popular of which is based on the information content, or entropy. Our interest here lies mainly in analyzing the statistical significance of a finder's output, an important area that has lagged considerably behind the extensive development of the finders. While our goal is to design a reliable and usable significance analysis, we also show how such analysis can be leveraged to improve the actual motif finding process. Joint work with my former PhD students Niranjan Nagarajan (now a Senior Research Scientist at the Genome Institute of Singapore) and Patrick Ng, and my current PhD student Emi Tanaka.

Study of Replication Origins - DNA replication is a fundamental process essential for cell proliferation. While the proteins involved in initiating DNA replication are essentially conserved from yeast to humans, the implicated sequence motifs that these conserved factors interact with are poorly understood outside of S. cerevisiae (baker's yeast). In collaboration with Cornell molecular biologist Bik Tye and her post-doctoral researcher Ivan Liachko, we are mapping replication origins in other yeast species, hoping to gain an understanding of the evolution of DNA replication origins.

Computational Statistics - Our search for an efficient and accurate computation of motif significance led us to develop a new approach to exact tests (tests whose significance is evaluated directly from the underlying distribution rather than through an approximation). Borrowing ideas from large-deviation theory, the underlying mechanism of our approach is the exact numerical calculation of the exponentially shifted characteristic function of the test statistic. So far we have used this approach to develop faster exact algorithms for the classical multinomial goodness-of-fit test and the Mann-Whitney test. Joint work with my former PhD student Niranjan Nagarajan.

Uri Keich is a member of the Statistics Research Group.
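To make the notion of an exact test mentioned under Computational Statistics concrete, here is a minimal sketch that computes a one-sided exact p-value for the Mann-Whitney U statistic by counting arrangements with a textbook recurrence. It is not the shifted characteristic function approach described above; it assumes small samples with no ties, and the function names are illustrative only.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(m, n, u):
    """Number of orderings of m x's and n y's (no ties) whose U statistic equals u.

    Recurrence on the largest observation: if it is an x it exceeds all n y's
    (adding n to U); if it is a y it adds nothing.
    """
    if u < 0 or u > m * n:
        return 0
    if m == 0 or n == 0:
        return 1  # only u == 0 reaches this point
    return count_u(m - 1, n, u - n) + count_u(m, n - 1, u)


def u_statistic(xs, ys):
    """U = number of pairs (x, y) with x > y (assumes no ties)."""
    return sum(1 for x in xs for y in ys if x > y)


def exact_p_upper(xs, ys):
    """One-sided exact p-value P(U >= observed) under the null hypothesis,
    where all C(m+n, m) orderings are equally likely."""
    m, n = len(xs), len(ys)
    u = u_statistic(xs, ys)
    tail = sum(count_u(m, n, k) for k in range(u, m * n + 1))
    return tail / comb(m + n, m)


if __name__ == "__main__":
    print(exact_p_upper([4, 5, 6], [1, 2, 3]))
```

The p-value here comes directly from the null distribution of U rather than from a normal approximation, which is what makes the test exact. The cost of this naive counting grows quickly with sample size, which is the kind of limitation faster exact algorithms aim to address.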

Teaching and supervision

Timetable

Selected publications

Journals

  • Liachko, I., Youngblood, R., Keich, U., Dunham, M. (2013). High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast. Genome Research, 23(4), 698-704.
  • Ng, P., Keich, U. (2011). Alignment Constrained Sampling. Journal of Computational Biology, 18(2), 155-168.
  • Tanaka, E., Bailey, T., Grant, C., Noble, W., Keich, U. (2011). Improved similarity scores for comparing motifs. Bioinformatics, 27(12), 1603-1609.
  • Liachko, I., Tanaka, E., Cox, K., Claire, S., Yang, L., Seher, A., Hallas, L., Cha, E., Kang, G., Pace, H., Keich, U., et al (2011). Novel Features of ARS Selection in Budding Yeast Lachancea kluyveri. BMC Genomics, 12(633), 1-18.
  • Gupta, N., Bandeira, N., Keich, U., Pevzner, P. (2011). Target-Decoy Approach and False Discovery Rate: When Things May Go Wrong. Journal of the American Society for Mass Spectrometry, 22, 1111-1120.
  • Liachko, I., Bhaskar, A., Lee, C., Chung, S., Tye, B., Keich, U. (2010). A Comprehensive Genome-Wide Map of Autonomously Replicating Sequences in a Naive Genome. PLoS Genetics, 6(5), 1-12.
  • Bhaskar, A., Keich, U. (2010). Confidently Estimating the Number of DNA Replication Origins. Statistical Applications in Genetics and Molecular Biology, 9(1), 1-21.
  • Oliver, H., Orsi, R., Ponnala, L., Keich, U., Wang, W., Sun, Q., Cartinhour, S., Filiatrault, M., Wiedmann, M., Boor, K. (2009). Deep RNA sequencing of L. monocytogenes reveals overlapping and extensive stationary phase and sigma B-dependent transcriptomes, including multiple highly transcribed noncoding RNAs. BMC Genomics, 10, 641-1-641-22.
  • Nagarajan, N., Keich, U. (2009). Reliability and efficiency of algorithms for computing the significance of the Mann-Whitney test. Computational Statistics and Data Analysis, 24(4), 605-622.
  • Keich, U., Gao, H., Garretson, J., Bhaskar, A., Liachko, I., Donato, J., Tye, B. (2008). Computational detection of significant variation in binding affinity across two sets of sequences with application to the analysis of replication origins in yeast. BMC Bioinformatics, 9, 372-1-372-12.
  • Ng, P., Keich, U. (2008). Factoring local sequence composition in motif significance analysis. Genome Informatics, 21, 15-26.
  • Nagarajan, N., Keich, U. (2008). FAST: Fourier transform based Algorithms for significance testing of ungapped multiple alignments. Bioinformatics, 24(4), 577-578.
  • Ng, P., Keich, U. (2008). GIMSAN: a Gibbs motif finder with significance analysis. Bioinformatics, 24(19), 2256-2257.
  • Keich, U., Ng, P. (2007). A conservative parametric approach to motif significance analysis. Genome Informatics, 19, 61-72.
  • Zhi, D., Keich, U., Pevzner, P., Heber, S., Tang, H. (2007). Correcting base-assignment errors in repeat regions of shotgun assembly. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4(1), 54-64.
  • Keich, U., Nagarajan, N. (2006). A fast and numerically robust method for exact multinomial goodness-of-fit test. Journal of Computational and Graphical Statistics, 15(4), 779-802.
  • Ng, P., Nagarajan, N., Jones, N., Keich, U. (2006). Apples to apples: improving the performance of motif finders and their significance analysis in the Twilight Zone. Bioinformatics, 22(14), e393-e401.
  • Nagarajan, N., Jones, N., Keich, U. (2005). Computing the P-value of the information content from an alignment of multiple sequences. Bioinformatics, 21(Supplement 1), i311-i318.
  • Keich, U. (2005). sFFT: A Faster Accurate Computation of the p-Value of the Entropy Score. Journal of Computational Biology, 12(4), 416-430.
  • Keich, U., Li, M., Ma, B., Tromp, J. (2004). On spaced seeds for similarity search. Discrete Applied Mathematics, 138(3), 253-263.
  • Keich, U. (2003). Stationary Tangent - the discrete and non-smooth case. Journal of Time Series Analysis, 24(2), 173-192.
  • Keich, U., Pevzner, P. (2002). Finding motifs in the twilight zone. Bioinformatics, 18(10), 1374-1381.
  • Keich, U., Pevzner, P. (2002). Subtle motifs: defining the limits of motif finding algorithms. Bioinformatics, 18(10), 1382-1390.

Conferences

  • Nagarajan, N., Ng, P., Keich, U. (2006). Refining motif finders with E-value calculations. 3rd RECOMB Satellite Workshop on Regulatory Genomics, ---: Imperial College Press.
  • Keich, U., Nagarajan, N. (2004). A faster reliable algorithm to estimate the p-value of the multinomial llr statistic. 4th International Workshop on Algorithms in Bioinformatics, Germany: Springer.
  • Buhler, J., Keich, U., Sun, Y. (2003). Designing seeds for similarity search in genomic DNA. Seventh Annual International Conference on Research in Computational Molecular Biology, Germany: Springer.
  • Eskin, E., Keich, U., Gelfand, M., Pevzner, P. (2003). Genome-wide analysis of bacterial promoter regions. The Pacific Symposium on Biocomputing, USA: World Scientific Publishing.
  • Keich, U., Pevzner, P. (2002). Finding motifs in the twilight zone. Sixth Annual International Conference on Research in Computational Molecular Biology, New York: Association for Computing Machinery (ACM).
