A Software Framework for Exploring 1,000 Genomes of African Descent
Basic Information
- Grant number: 9301024
- Amount: $447,400
- Host institution country: United States
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-07-01 to 2019-06-30
- Project status: Completed
- Keywords: Africa; African; Algorithms; Americas; Architecture; Asthma; Authorization documentation; Bacteria; Caribbean region; Catalogs; Central America; Communities; Computational algorithm; Computer software; DNA Sequence; DNA Sequence Databases; Data; Data Set; Databases; Devices; Disease; Genes; Genetic Variation; Genetic study; Genome; Genomics; Goals; Human; Human Genome; Hypersensitivity; Individual; Investigation; Length; Licensing; Location; Maps; Methods; Modeling; Mutation; Mutation Detection; National Heart, Lung, and Blood Institute; Nucleic Acid Regulatory Sequences; Nucleotides; Population; Process; Protocols documentation; RNA Splicing; Research Personnel; Resources; Retrieval; Risk; Scheme; Scientist; Secure; Sepsis; Sequence Alignment; Site; Software Framework; Software Tools; South America; Speed; System; Time; United States; Variant; base; data sharing; database of Genotypes and Phenotypes; design; fusion gene; gene discovery; genome database; high risk; human data; human subject; indexing; interest; microbial; next generation; novel; open source; prevent; programs; public health relevance; reference genome; sample collection; search engine; software development; terabyte; tool; trait
Project Abstract
DESCRIPTION (provided by applicant): We propose to create new software and analysis methods designed to make possible the exploration of a unique dataset, the 1,004 genomes sequenced by the Consortium on Asthma among African-Ancestry Populations in the Americas (CAAPA). The size of this dataset, over 130 terabytes, currently prevents it from being explored with alignment-based tools, and researchers are instead limited to using the much smaller files containing single-nucleotide variants. Our proposed software will make this dataset and others like it available for real-time searching, a capability that is not yet possible for any genomic database of this size. Since the early 1990s, scientists have used DNA sequence databases to study a wide range of problems, including novel gene discovery, mutation detection, the investigation of larger structural variants, and evolutionary processes. The ability to search all known genes and genomes using BLAST and similar programs has long been assumed, and sequence search engines throughout the world provide this ability. However, the vast size of the CAAPA dataset makes it impossible to search the data itself using current tools. One cannot look for specific mutations, extract and re-analyze data for any particular gene or regulatory region, or look for structural variants. Newer next-generation sequence alignment programs such as Bowtie, originally developed in our group, allow far faster alignment of NGS reads to the genome, but even these programs cannot search data on the scale of CAAPA in real time. Different architectures need to be designed and built to accommodate these very large datasets. The CAAPA exploration system (CESYS) will use a combination of a highly efficient database, very fast storage, and fast search algorithms to achieve our goals. This project aims to accomplish several goals that will dramatically enhance the value of CAAPA.
First, the data will be made available to a very large community of researchers, who can use it not only to study the genetics of asthma and allergy in the CAAPA populations, but also to compare these subjects to other groups. The data currently resides on hard drives and is available only to a small number of the project's PIs, a situation that limits its value. Second, by creating an authentication system consistent with dbGaP, we will create a data sharing model that other projects can use and that will remove some of the technical barriers to sharing genome data from human subjects. Third, as part of building the database, we will re-call all the SNPs using the newly released human genome build (hg20), creating a consistent set of variants that we will also share freely through the project database. Fourth, we will identify all bacterial contaminants, including those in a subset of subjects known to have bloodstream infections at the time of sample collection. Fifth, we will identify structural variants unique to the CAAPA population, which we can then explore for any association with the risk of asthma.
Project Outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Kathleen C Barnes
The CD14(−159) polymorphism is not associated with circulating sCD14 nor total serum IgE in an asthmatic population of African descent
- DOI: 10.1016/s0091-6749(02)81809-7
- Published: 2002-01-01
- Authors: April Zambelli-Weiner; Bernadatte Gray; Paul N Levett; Raana P Naidu; Kathleen C Barnes
- Corresponding author: Kathleen C Barnes
Body mass index associates with asthma and respiratory symptoms but is not explained by diet in a caucasian isolate
- DOI: 10.1016/s0091-6749(02)81811-5
- Published: 2002-01-01
- Authors: Kathyrn B Held; Rasika A Mathias; Kathleen C Barnes
- Corresponding author: Kathleen C Barnes
Other grants by Kathleen C Barnes
PRIDE Academy: Impact of Ancestry and Gender to omics of lung diseases
- Grant number: 10077882
- Fiscal year: 2019
- Funding amount: $447,400
PRIDE Academy: Impact of Ancestry and Gender to omics of lung diseases
- Grant number: 10378108
- Fiscal year: 2019
- Funding amount: $447,400
Multi-omic studies of asthma severity in an African ancestry population
- Grant number: 10094181
- Fiscal year: 2018
- Funding amount: $447,400
Multi-omic studies of asthma severity in an African ancestry population
- Grant number: 10331294
- Fiscal year: 2018
- Funding amount: $447,400
Multi-omic studies of asthma severity in an African ancestry population
- Grant number: 9522470
- Fiscal year: 2018
- Funding amount: $447,400
New Approaches for Empowering Studies of Asthma in Populations of African Descent
- Grant number: 9256781
- Fiscal year: 2016
- Funding amount: $447,400
A Software Framework for Exploring 1,000 Genomes of African Descent
- Grant number: 9096211
- Fiscal year: 2015
- Funding amount: $447,400
Integrative Genomics in Asthmatics of African Descent
- Grant number: 9230688
- Fiscal year: 2014
- Funding amount: $447,400
The autophagic pathway and atopic asthma: role of IL-33 and ST2
- Grant number: 8811919
- Fiscal year: 2014
- Funding amount: $447,400
The autophagic pathway and atopic asthma: role of IL-33 and ST2
- Grant number: 8677159
- Fiscal year: 2014
- Funding amount: $447,400
Similar overseas grants
Tracing the African roots of Sri-Lanka Portuguese
- Grant number: AH/Z505717/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
Commercialisation of African Youth Enterprise Programme
- Grant number: ES/Y010752/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
Evaluating the effectiveness and sustainability of integrating helminth control with seasonal malaria chemoprevention in West African children
- Grant number: MR/X023133/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Fellowship
Resilient and Equitable Nature-based Pathways in Southern African Rangelands (REPAiR)
- Grant number: NE/Z503459/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
Bovine herpesvirus 4 as a vaccine platform for African swine fever virus antigens in pigs
- Grant number: BB/Y006224/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
Understanding differences in host responses to African swine fever virus
- Grant number: BB/Z514457/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Fellowship
The impact on human health of restoring degraded African drylands
- Grant number: MR/Y019806/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
CAREER: Habitability of the Hadean Earth - A South African perspective
- Grant number: 2336044
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Continuing Grant
Nowcasting with Artificial Intelligence for African Rainfall: NAIAR
- Grant number: NE/Y000420/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant
South African Modernism (Follow-on-Funding): Decolonising English Literary Studies In and Beyond the Classroom
- Grant number: AH/Z50581X/1
- Fiscal year: 2024
- Funding amount: $447,400
- Project category: Research Grant