Person:
Falls, Kathleen

Last Name

Falls

First Name

Kathleen

Name

Falls, Kathleen

Search Results

  • Publication
    Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data
    (Genetics Society of America, 2015) Matthews, Beverley; dos Santos, Gilberto; Crosby, Madeline; Emmert, David; St. Pierre, Susan E.; Gramates, L. Sian; Zhou, Pinglei; Schroeder, Andrew; Falls, Kathleen; Strelets, Victor; Russo, Susan M.; Gelbart, William M.
    We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low-confidence and low-frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase (http://flybase.org). Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3′ UTRs (up to 15–18 kb), and a stunning mismatch in the number of male-specific genes (approximately 13% of all annotated gene models) vs. female-specific genes (less than 1%). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts.
  • Publication
    Gene Model Annotations for Drosophila melanogaster: The Rule-Benders
    (Genetics Society of America, 2015) Crosby, Madeline; Gramates, L. Sian; dos Santos, Gilberto; Matthews, Beverley; St. Pierre, Susan E.; Zhou, Pinglei; Schroeder, Andrew; Falls, Kathleen; Emmert, David; Russo, Susan M.; Gelbart, William M.
    In the context of the FlyBase annotated gene models in Drosophila melanogaster, we describe the many exceptional cases we have curated from the literature or identified in the course of FlyBase analysis. These range from atypical but common examples such as dicistronic and polycistronic transcripts, noncanonical splices, trans-spliced transcripts, noncanonical translation starts, and stop-codon readthroughs, to single exceptional cases such as ribosomal frameshifting and HAC1-type intron processing. In FlyBase, exceptional genes and transcripts are flagged with Sequence Ontology terms and/or standardized comments. Because some of the rule-benders create problems for handlers of high-throughput data, we discuss plans for flagging these cases in bulk data downloads.
  • Publication
    The Framingham Heart Study 100K SNP Genome-Wide Association Study Resource: Overview of 17 Phenotype Working Group Reports
    (BioMed Central, 2007) Cupples, L Adrienne; Arruda, Heather T; Benjamin, Emelia J; D'Agostino, Ralph B; Demissie, Serkalem; DeStefano, Anita L; Dupuis, Josée; Govindaraju, Diddahally R; Heard-Costa, Nancy L; Hwang, Shih-Jen; Kathiresan, Sekar; Laramie, Jason M; Larson, Martin G; Liu, Chun-Yu; Lunetta, Kathryn L; Mailman, Matthew D; Manning, Alisa K; Murabito, Joanne M; O'Connor, George T; Pandey, Mona; Seshadri, Sudha; Vasan, Ramachandran S; Wilk, Jemma B; Wolf, Philip A; Yang, Qiong; Atwood, Larry D; Falls, Kathleen; Fox, Caroline; Gottlieb, Daniel; Guo, Chao-yu; Kiel, Douglas; Levy, Daniel; Meigs, James; Newton-Cheh, Christopher; O'Donnell, Christopher; Wang, Zhen
    Background: The Framingham Heart Study (FHS), founded in 1948 to examine the epidemiology of cardiovascular disease, is among the most comprehensively characterized multi-generational studies in the world. Many collected phenotypes have substantial genetic contributors; yet most genetic determinants remain to be identified. Using single nucleotide polymorphisms (SNPs) from a 100K genome-wide scan, we examine the associations of common polymorphisms with phenotypic variation in this community-based cohort and provide a full-disclosure, web-based resource of results for future replication studies. Methods: Adult participants (n = 1345) of the largest 310 pedigrees in the FHS, many biologically related, were genotyped with the 100K Affymetrix GeneChip. These genotypes were used to assess their contribution to 987 phenotypes collected in FHS over 56 years of follow up, including: cardiovascular risk factors and biomarkers; subclinical and clinical cardiovascular disease; cancer and longevity traits; and traits in pulmonary, sleep, neurology, renal, and bone domains. We conducted genome-wide variance components linkage and population-based and family-based association tests. Results: The participants were white of European descent and from the FHS Original and Offspring Cohorts (examination 1 Offspring mean age 32 ± 9 years, 54% women). This overview summarizes the methods, selected findings and limitations of the results presented in the accompanying series of 17 manuscripts. The presented association results are based on 70,897 autosomal SNPs meeting the following criteria: minor allele frequency ≥ 10%, genotype call rate ≥ 80%, Hardy-Weinberg equilibrium p-value ≥ 0.001, and satisfying Mendelian consistency. Linkage analyses are based on 11,200 SNPs and short-tandem repeats. 
Results of phenotype-genotype linkages and associations for all autosomal SNPs are posted on the NCBI dbGaP website at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. Conclusion: We have created a full-disclosure resource of results, posted on the dbGaP website, from a genome-wide association study in the FHS. Because we used three analytical approaches to examine the association and linkage of 987 phenotypes with thousands of SNPs, our results must be considered hypothesis-generating and need to be replicated. Results from the FHS 100K project with NCBI web posting provide a resource for investigators to identify high priority findings for replication.
  • Publication
    FlyBase: enhancing Drosophila Gene Ontology annotations
    (Oxford University Press, 2009) Tweedie, Susan; Ashburner, Michael; Leyland, Paul; McQuilton, Peter; Marygold, Steven; Millburn, Gillian; Osumi-Sutherland, David; Seal, Ruth; Falls, Kathleen; Schroeder, Andrew; Zhang, Haiyan; The FlyBase Consortium
    FlyBase (http://flybase.org) is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the FlyBase GO annotation strategy that are improving the quality of the GO annotation data. Many of these changes stem from our participation in the GO Reference Genome Annotation Project—a multi-database collaboration producing comprehensive GO annotation sets for 12 diverse species.
  • Publication
    Multiplex Sequencing of 1.5 Mb of the Mycobacterium leprae Genome
    (Cold Spring Harbor Laboratory, 1997-08-01) Smith, DR; Richterich, P; Rubenfield, M; Rice, PW; Butler, Carolyn; Lee, HM; Kirst, S; Gundersen, Kathryn; Abendschan, K; Xu, Qiang; Chung, Ming-Kei; Deloughery, C; Aldredge, T; Maher, Janet; Lundstrom, R; Tulig, C; Falls, Kathleen; Imrich, Joann; Torrey, D; Engelstein, M; Breton, G; Madan, D; Nietupski, R; Seitz, B; Connelly, Sheila; McDougall, S; Safer, H; Gibson, R; Doucette-Stamm, L; Eiglmeier, K; Bergh, S; Cole, ST; Robison, K; Richterich, L; Johnson, Jacob; Church, George; Mao, Jialin
    The nucleotide sequence of 1.5 Mb of genomic DNA from Mycobacterium leprae was determined using computer-assisted multiplex sequencing technology. This brings the 2.8-Mb M. leprae genome sequence to ∼66% completion. The sequences, derived from 43 recombinant cosmids, contain 1046 putative protein-coding genes, 44 repetitive regions, 3 rRNAs, and 15 tRNAs. The gene density of one per 1.4 kb is slightly lower than that of Mycoplasma (1.2 kb). Of the protein-coding genes, 44% have significant matches to genes with well-defined functions. Comparison of 1157 M. leprae and 1564 Mycobacterium tuberculosis proteins shows a complex mosaic of homologous genomic blocks with up to 22 adjacent proteins in conserved map order. Matches to known enzymatic, antigenic, membrane, cell wall, cell division, multidrug resistance, and virulence proteins suggest therapeutic and vaccine targets. Unusual features of the M. leprae genome include large polyketide synthase (pks) operons, inteins, and highly fragmented pseudogenes.
  • Publication
    Multiplex Sequencing of 1.5 Mb of the Mycobacterium leprae Genome
    (Cold Spring Harbor Laboratory, 1997-08) Smith, Douglas R.; Richterich, Peter; Rubenfield, Marc; Rice, Philip W.; Butler, Carol; Lee, Hong-Mei; Kirst, Susan; Gundersen, Kristin; Abendschan, Kari; Xu, Qinxue; Chung, Maria; Deloughery, Craig; Aldredge, Tyler; Maher, James; Lundstrom, Ronald; Tulig, Craig; Falls, Kathleen; Imrich, Joan; Torrey, Dana; Engelstein, Marcy; Breton, Gary; Madan, Deepika; Nietupski, Raymond; Seitz, Bruce; Connelly, Steven; McDougall, Steven; Safer, Hershel; Gibson, Rene; Doucette-Stamm, Lynn; Eiglmeier, Karin; Bergh, Staffan; Cole, Stewart T.; Robison, Keith; Richterich, Laura; Johnson, Jason; Church, George; Mao, Jen-i
    The nucleotide sequence of 1.5 Mb of genomic DNA from Mycobacterium leprae was determined using computer-assisted multiplex sequencing technology. This brings the 2.8-Mb M. leprae genome sequence to approximately 66% completion. The sequences, derived from 43 recombinant cosmids, contain 1046 putative protein-coding genes, 44 repetitive regions, 3 rRNAs, and 15 tRNAs. The gene density of one per 1.4 kb is slightly lower than that of Mycoplasma (1.2 kb). Of the protein-coding genes, 44% have significant matches to genes with well-defined functions. Comparison of 1157 M. leprae and 1564 Mycobacterium tuberculosis proteins shows a complex mosaic of homologous genomic blocks with up to 22 adjacent proteins in conserved map order. Matches to known enzymatic, antigenic, membrane, cell wall, cell division, multidrug resistance, and virulence proteins suggest therapeutic and vaccine targets. Unusual features of the M. leprae genome include large polyketide synthase (pks) operons, inteins, and highly fragmented pseudogenes.