SemiSynBio: Highly scalable random access DNA data storage with nanopore-based reading
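Basic information
- Award number: 1807371
- Principal investigator:
- Amount: $1.125 million
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2018
- Funding country: United States
- Project period: 2018-08-01 to 2022-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract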
The rapid accumulation of information stored as computer files, such as images and videos, requires a great deal of computer and internet data storage. The ways we maintain computer data today can cost a lot of money and energy (especially for cooling). In addition, the materials on which the files are written are not very stable: they degrade with time, so data may eventually be lost within a few decades. To solve this issue, a new technology for writing and reading digital data in molecular strings will be developed, based on DNA, the molecule that also carries the genetic code. All living cells rely on DNA molecules to store the instructions that run cells and tissues, and these molecules are more stable than magnetic tape or paper. If successful, this DNA-based storage of computer data would readily retain all of the world's current electronic data.
To develop a DNA-based data storage technology, a coding scheme that can reliably write and read back data in segments of DNA is proposed. One approach will use combinatorial molecular barcodes for addressing and random access, and also to generate the data blocks. Synthesis schemes to write long segments of DNA will be employed, with millions of such segments generated in parallel. Longer segments allow one to divide large files into fewer segments and thus require shorter index and random access barcodes. The use of nanopore DNA sequencers that generate long sequences will permit reading this data. Sophisticated mathematical coding techniques will be used to robustly reconstruct a written message after accounting for errors specific to the write and read platforms, with a special emphasis on nanopore technology. The coding techniques are tailored to the higher error rates of nanopore sequencing, which is the most promising technology for a scalable sequencing scheme. A DNA storage simulator will also be developed that will allow researchers to model specific application needs, using predefined or custom error models for DNA write and read platforms and a variety of coding models for addresses and data. This will allow the trade-offs between cost, robustness and efficiency to be simulated at data scales from gigabytes to exabytes. Several cohorts of students will be trained at the interface of genetics, biochemistry, electrical engineering and coding theory in the course of the proposed work.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
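The abstract's claim that longer segments need shorter index barcodes can be checked with simple arithmetic. The sketch below is only an illustration, not the project's coding scheme: it assumes an idealized density of 2 bits per nucleotide, no error correction or sequence constraints, and a plain positional index rather than combinatorial barcodes. The function name `index_length_nt` and its parameters are hypothetical.

```python
def index_length_nt(file_bytes: int, payload_nt: int) -> tuple[int, int]:
    """Return (number of segments, index length in nucleotides).

    Assumes an idealized 2 bits per nucleotide (A, C, G, T), so one
    byte maps to 4 nucleotides. Real schemes add redundancy for error
    correction and avoid problematic sequences; both are ignored here.
    """
    file_nt = file_bytes * 4                      # 8 bits / 2 bits per nt
    n_segments = -(-file_nt // payload_nt)        # ceiling division
    # Smallest k such that 4**k distinct indices cover every segment.
    k = 0
    while 4 ** k < n_segments:
        k += 1
    return n_segments, k


if __name__ == "__main__":
    one_gb = 10 ** 9
    for payload in (100, 1000):
        segments, index_nt = index_length_nt(one_gb, payload)
        print(f"payload {payload} nt: {segments} segments, index {index_nt} nt")
```

Under these assumptions, a 1 GB file split into 100-nucleotide payloads needs a 13-nucleotide index, while 1000-nucleotide payloads need 11, and the index also takes up a much smaller fraction of each segment.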
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Impact of lossy compression of nanopore raw signal data on basecalling and consensus accuracy
- DOI: 10.1093/bioinformatics/btaa1017
- Publication date: 2020-12-01
- Journal:
- Impact factor: 5.8
- Authors: Chandak, Shubham; Tatwawadi, Kedar; Weissman, Tsachy
- Corresponding author: Weissman, Tsachy
Other publications by Hanlee Ji
81. RESOLVING THE EXACT BREAKPOINTS AND SEQUENCE REARRANGEMENTS OF LARGE NEUROPSYCHIATRIC COPY NUMBER VARIATIONS (CNVS) AT SINGLE BASE-PAIR RESOLUTION USING CRISPR-TARGETED ULTRA-LONG READ SEQUENCING (CTLR-SEQ)
- DOI: 10.1016/j.euroneuro.2022.07.166
- Publication date: 2022-10-01
- Journal:
- Impact factor:
- Authors: Bo Zhou; GiWon Shin; Lisanne Vervoort; Stephanie Greer; Yiling Huang; Tanmoy Roychowdhury; Reenal Pattni; Alexej Abyzov; Joris Vermeesch; Hanlee Ji; Alexander Urban
- Corresponding author: Alexander Urban
Single-Cell Transcriptomic Analysis of a Patient with Metastatic Appendiceal Adenocarcinoma: A Stem or Crypt Cell-Like Neoplasm?
- DOI: 10.1016/j.jamcollsurg.2021.07.497
- Publication date: 2021-11-01
- Journal:
- Impact factor:
- Authors: Carlos Ayala; Susan M. Grimes; Byrne Lee; Hanlee Ji
- Corresponding author: Hanlee Ji
Structure of synthetic peptide analogues of an eggshell protein of Schistosoma mansoni
- DOI: 10.1002/pro.5560020604
- Publication date: 1993
- Journal:
- Impact factor: 8
- Authors: C. Middaugh; J. Ryan; C. J. Burke; H. Mach; A. M. Naylor; M. Bogusky; S. Pitzenberger; Hanlee Ji; J. S. Cordingley; J. Thomson
- Corresponding author: J. Thomson
Sa1345 THE GASTRIC PRECANCEROUS CONDITIONS STUDY (GAPS)
- DOI: 10.1016/s0016-5085(20)31512-2
- Publication date: 2020-05-01
- Journal:
- Impact factor:
- Authors: Robert J. Huang; Sungho Park; Nicole S. Kwon; Tanvi Chitre; Hanlee Ji; Joo Ha Hwang
- Corresponding author: Joo Ha Hwang
284: ORAL FLORA CHARACTERIZE THE GASTRIC PRECANCEROUS MICROBIOME IN THE ABSENCE OF HELICOBACTER PYLORI
- DOI: 10.1016/s0016-5085(22)60164-1
- Publication date: 2022-05-01
- Journal:
- Impact factor:
- Authors: Robert J. Huang; Jiamin Chen; Sung Eun Kim; Summer S. Han; Joo Ha Hwang; Hanlee Ji
- Corresponding author: Hanlee Ji
Similar overseas grants
I-Corps: Highly Scalable Differential Power Processing Architecture
- Award number: 2348571
- Fiscal year: 2024
- Funding amount: $1.125 million
- Project type: Standard Grant
CAREER: A Highly Effective, Usable, Performant, Scalable Data Reduction Framework for HPC Systems and Applications
- Award number: 2232120
- Fiscal year: 2023
- Funding amount: $1.125 million
- Project type: Standard Grant
PFI-TT: Highly Efficient, Scalable, and Stable Carbon-based Perovskite Solar Modules
- Award number: 2329871
- Fiscal year: 2023
- Funding amount: $1.125 million
- Project type: Continuing Grant
Novel Highly Regenerative and Scalable Progenitor Cell Exosomes for Treating Peripheral Artery Disease
- Award number: 10759902
- Fiscal year: 2023
- Funding amount: $1.125 million
- Project type:
CAREER: A Highly Effective, Usable, Performant, Scalable Data Reduction Framework for HPC Systems and Applications
- Award number: 2312673
- Fiscal year: 2023
- Funding amount: $1.125 million
- Project type: Standard Grant
SBIR Phase I: A highly-scalable, rapid, in-season approach to tune a nitrogen model for accurate prediction of a corn crop’s remaining nitrogen need
- Award number: 2127096
- Fiscal year: 2022
- Funding amount: $1.125 million
- Project type: Standard Grant
Highly Scalable Graph Processing
- Award number: RGPIN-2019-04061
- Fiscal year: 2022
- Funding amount: $1.125 million
- Project type: Discovery Grants Program - Individual
Collaborative Research: Scalable Manufacturing Enabled by Highly Tunable Multiphase Liquid Metal Pastes with Solid and Fluid Capsule Additives
- Award number: 2032409
- Fiscal year: 2021
- Funding amount: $1.125 million
- Project type: Standard Grant
Highly Integrated, Scalable Motor and Inverter Module with Flywheel Energy Storage and E-Axle applications
- Award number: 94722
- Fiscal year: 2021
- Funding amount: $1.125 million
- Project type: BEIS-Funded Programmes
Highly scalable and sensitive spatial transcriptomic and epigenomic sequencing of brain tissues from human and non-human primate
- Award number: 10370074
- Fiscal year: 2021
- Funding amount: $1.125 million
- Project type: