Collaborative Research: Selection Methods for Algebraic Design of Experiments
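Basic information
- Award number: 1720335
- Principal investigator:
- Amount: $100,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-08-15 to 2021-10-31
- Status: Completed
- Source:
- Keywords: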
Project abstract
Data science has emerged as an important field for making decisions based on data collected from sectors as varied as healthcare and housing. Though data are plentiful, thanks to phone apps, merchant loyalty cards, and social media accounts, there is still a question of whether more data translates to more knowledge. Furthermore, collection and storage can be problematic, especially when data are sensitive, as is often the case with clinical trials and genetic experiments. The problem of selecting information-rich data becomes crucial for creating models that can reliably predict the outcome of future experiments. Few results have been published on the amount of necessary data, and currently there are no guidelines for generating specific data sets that would unambiguously identify a predictive model. As a first step towards developing a complete theory, the PIs will focus on models described by finite-valued nonlinear polynomial functions. (For example, the internal 'function' in WebMD's Symptom Checker returns medical conditions according to symptoms input by the user.) They will construct the smallest data sets that have a single associated polynomial model and study the properties of such data sets. From these computational experiments, they will build the appropriate theory, design algorithms, and generate code that can later be developed into software complete with a graphical user interface. Graduate students will participate at the appropriate level in each component of the project. Such an experience will provide them with possible topics for an MS or PhD dissertation and will very likely inspire a career-long involvement in the STEM disciplines. The theoretical results will advance the fields of design of experiments, network inference, and finite dynamical systems through the determination of criteria for selecting data sets to uniquely identify models.
The algorithms will serve as a guide for experimentalists in determining the data that are needed to identify the structure of a network of interest. Such knowledge has the potential to drastically reduce wasted resources that arise from too much data with too little information.
While this is the age of big data, there is still a question of whether more data translates to more knowledge. Particularly when collecting data is expensive or time-consuming, as is often the case with clinical trials and biomolecular experiments, the problem of selecting information-rich data becomes crucial for creating relevant models. Finite-state multivariate polynomial functions have successfully been used to model complex networks from discretized data; however, few results have been published on the amount of data necessary for such models, with the majority applying to Boolean models only. It is still unknown which data points explicitly identify such discrete models, and as a consequence, there are no methods for generating the specific data sets that would unambiguously identify the model. The PIs will address the issue of the minimality and specificity of data to uniquely identify discrete polynomial models by developing the appropriate theory, designing algorithms, and generating code that can later be built into software. Graduate students will participate at the appropriate level in each component of the project. This project will resolve some important computational issues in network inference and will improve experimental design and model selection by eliminating the effect of computational artifacts that arise when working with nonlinear multivariate polynomials. The theoretical results will advance the fields of design of experiments and network inference through the establishment of criteria to select data sets to uniquely identify models.
The proposed work will also increase the utility of polynomial dynamical systems as models of complex networks by establishing the minimal amount of data needed for unique model identification. The algorithms will serve as a guide for experimentalists in determining the data that are needed to identify the structure of a network of interest. Such knowledge has the potential to drastically reduce the number of experiments performed and to eliminate the generation of data with little intrinsic value.
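As a toy illustration of the identification problem the abstract describes, the sketch below counts how many polynomial models over a finite field agree with a given data set. It is not the PIs' method: the Boolean field F_2, the two-variable setting, the XOR-like data, and the names `evaluate` and `consistent_models` are all assumptions made for the example. It relies only on the standard fact that every function from F_p^n to F_p is represented by exactly one polynomial whose exponents are all below p.

```python
from itertools import product

P = 2        # field size (assumption: the Boolean case F_2)
N_VARS = 2   # number of variables (assumption)

# Exponent vectors of all monomials whose exponents are below P.
# For P = 2, N_VARS = 2 the order is: 1, x2, x1, x1*x2.
MONOMIALS = list(product(range(P), repeat=N_VARS))


def evaluate(coeffs, point):
    """Evaluate the polynomial with the given coefficients at a point, mod P."""
    total = 0
    for c, exps in zip(coeffs, MONOMIALS):
        term = c
        for x, e in zip(point, exps):
            term *= x ** e  # Python defines 0 ** 0 as 1
        total += term
    return total % P


def consistent_models(data):
    """Return every coefficient vector whose polynomial fits all data points."""
    return [
        coeffs
        for coeffs in product(range(P), repeat=len(MONOMIALS))
        if all(evaluate(coeffs, pt) == out for pt, out in data.items())
    ]


# Hypothetical data: three of the four possible inputs have been observed.
partial = {(0, 0): 0, (0, 1): 1, (1, 0): 1}
# The same data with the remaining input added.
full = dict(partial)
full[(1, 1)] = 0

print(len(consistent_models(partial)))  # more than one model fits
print(consistent_models(full))          # exactly one model remains
```

With no restriction on the model class, k observed inputs out of p^n leave p^(p^n - k) consistent polynomials, so brute force only yields a unique model once every input has been observed. The question posed in the abstract is which smaller data sets still single out one model once a model-selection procedure is fixed.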
Project outcomes
Journal articles (4)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Small Gröbner fans of ideals of points
- DOI: 10.1142/s0219498820500875
- Published: 2020
- Journal:
- Impact factor: 0.8
- Authors: Dimitrova, Elena; He, Qijun; Robbiano, Lorenzo; Stigler, Brandilyn
- Corresponding author: Stigler, Brandilyn
Algebraic model selection and experimental design in biological data science
- DOI: 10.1016/j.aam.2021.102282
- Published: 2022
- Journal:
- Impact factor: 1.1
- Authors: Dimitrova, Elena; Hu, Jingzhen; Liang, Qingzhong; Stigler, Brandilyn; Zhang, Anyu
- Corresponding author: Zhang, Anyu
The Number of Gröbner Bases in Finite Fields
- DOI: 10.1007/978-3-030-42687-3_9
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Stigler, Brandilyn; Zhang, Anyu
- Corresponding author: Zhang, Anyu
Other publications by Brandilyn Stigler
The Number of Gröbner Bases in Finite Fields (Research)
- DOI:
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Anyu Zhang; Brandilyn Stigler
- Corresponding author: Brandilyn Stigler
Polynomial Dynamical Systems in Systems Biology
- DOI:
- Published: 2007
- Journal:
- Impact factor: 0
- Authors: Brandilyn Stigler; J. Whitmarsh
- Corresponding author: Brandilyn Stigler
Algebraic and Geometric Methods in Statistics: Design of experiments and biochemical network inference
- DOI: 10.1017/cbo9780511642401.011
- Published: 2009
- Journal:
- Impact factor: 0
- Authors: R. Laubenbacher; Brandilyn Stigler
- Corresponding author: Brandilyn Stigler
An Algebraic Approach to Reverse Engineering with an Application to Biochemical Networks
- DOI:
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Brandilyn Stigler
- Corresponding author: Brandilyn Stigler
Inferring the Topology of Gene Regulatory Networks: An Algebraic Approach to Reverse Engineering
- DOI:
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: Brandilyn Stigler; Elena S. Dimitrova
- Corresponding author: Elena S. Dimitrova
Other grants held by Brandilyn Stigler
Collaborative Research: Data selection for unique model identification
- Award number: 1419023
- Fiscal year: 2015
- Amount: $100,000
- Award type: Standard Grant
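Similar grants from the National Natural Science Foundation of China (NSFC)
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Year approved: 2024
- Amount: CNY 0
- Award type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Year approved: 2012
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 31024804
- Year approved: 2010
- Amount: CNY 240,000
- Award type: Special fund project
Cell Research
- Award number: 30824808
- Year approved: 2008
- Amount: CNY 240,000
- Award type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Year approved: 2007
- Amount: CNY 450,000
- Award type: General Program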
Similar overseas grants
Collaborative Research: Burrows as buffers: do microhabitat selection and behavior mediate desert tortoise resilience to climate change?
- Award number: 2301677
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: Intermediaries and Product Selection in the Municipal Bond Market
- Award number: 2404669
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: RAPID: Genomic and phenotypic responses to hurricane-mediated selection in an invasive lizard: does epistasis constrain evolution?
- Award number: 2349094
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: EDGE FGT: Development of a Comprehensive Selection Library to Reconcile Core Metabolic Knowledge Gaps
- Award number: 2319733
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: EDGE FGT: Development of a Comprehensive Selection Library to Reconcile Core Metabolic Knowledge Gaps
- Award number: 2319732
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: Burrows as buffers: do microhabitat selection and behavior mediate desert tortoise resilience to climate change?
- Award number: 2402001
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: Burrows as buffers: do microhabitat selection and behavior mediate desert tortoise resilience to climate change?
- Award number: 2301676
- Fiscal year: 2023
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: Design-Based Optimal Subdata Selection Using Mixture-of-Experts Models to Account for Big Data Heterogeneity
- Award number: 2210576
- Fiscal year: 2022
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: From Molecules to Communities: How Levels of Selection Integrate to Tame Selfish Elements
- Award number: 2151033
- Fiscal year: 2022
- Amount: $100,000
- Award type: Standard Grant
Collaborative Research: Visual adaptations in hydrothermal vent shrimp and the role in feeding modalities and habitat selection
- Award number: 2154168
- Fiscal year: 2022
- Amount: $100,000
- Award type: Continuing Grant