Social Science Gateway to TeraGrid
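Basic information
- Award number: 0922005
- Principal investigator:
- Amount: $393,500
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2009
- Funding country: United States
- Period: 2009-07-01 to 2013-06-30
- Status: Completed
- Source:
- Keywords:
Project abstract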
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
The Virtual Research Data Center at Cornell University has been a successful research support tool for users of many of the Census Bureau's large-scale confidential data products including, but not limited to, those that are accessible via the Census Research Data Center network. Over 200 computational users and 600 download users have benefited from the VirtualRDC resources. Their scientific publications cite the NSF grants that supported the development of the VirtualRDC. The proposed activity seeks to keep this support network flourishing.
In addition, most social science researchers face substantial hurdles when they wish to harness the power of large-scale computational clusters, in particular when using new, very large synthetic data sets with their unprecedented detail on people, jobs, and firms. The proposed activity seeks to extend the VirtualRDC model to allow support of tera-scale social science computing via the NSF-sponsored TeraGrid resources. The most widespread statistical software packages used by social scientists, i.e., SAS, Stata, and SPSS, are not available on the TeraGrid itself or on any of the servers at the borders of the TeraGrid with fast connections to it. When viewing the problem through the lens of the typical data-driven research process (extract, edit, and transform data; transfer data to a computational location; and perform analysis), social science researchers are typically constrained in at least one of these steps when approaching the high-performance computing clusters on the TeraGrid. For most data preparation, and for much analysis, the lack of standard statistical analysis and data preparation software packages is a serious impediment. However, the typical social scientist workstation or university-provided computational infrastructure does not have the resources to handle these very large data sets. Furthermore, the social science workstation and the university-provided infrastructure do not have sufficiently fast data connectivity to transfer any large prepared data files to the TeraGrid for processing there.
This project aims to remedy bottlenecks in the first and second steps, with a focused expansion of resources at a critical location resulting in a highly useful gateway to the TeraGrid for the social sciences. The project builds a social science TeraGrid gateway that allows researchers (i) to perform the data preparation step using the software packages they are comfortable with, speeding up the data preparation phase, and (ii) to do so on servers that have a fast connection to the TeraGrid, thus greatly speeding up the data-transfer process. The third bottleneck, the absence of social statistics packages on the TeraGrid, is not addressed by this proposal, since it would require resources, in particular licensing resources, an order of magnitude larger than our proposed budget. This step is left to future proposals.
Broader impacts: Tera-scale social science data are underutilized. Initially, serious confidentiality issues prevented most researchers from accessing these data. Significant research effort on projects that solve most of these confidentiality issues, in combination with an expansion of the restricted-access model via Census Research Data Centers, has begun to address this underutilization. Now that an increasing number of previously confidential data sources are finding their way into the public domain, the quantity of social science public-use data is once again expanding dramatically. This project proposes a method of unlocking those recently released data sources to allow much broader access by the research community. Research strategies such as very large scale resampling and synthesis, which were previously proposed but not technically feasible, will be implemented. The expected explosion of use will lead to new results in a multitude of social sciences.
The knowledge gained from running the Social Science TeraGrid Gateway will be leveraged and applied to future proposals in which the third identified bottleneck, the absence of familiar software for social scientists on large-scale computing resources, will be addressed. The PIs on this proposal are actively involved with other research teams that are moving forward with the development of such proposals. The long-term goal of this proposal is that the tools put together for the research community through this proposal will be the building blocks for bigger and more transparent mechanisms for granting social scientists easy access to large-scale computational facilities.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Lars Vilhuber
A guide for social science journal editors on easing into open science
- DOI: 10.1186/s41073-023-00141-5
- Published: 2024-02-16
- Journal:
- Impact factor: 10.700
- Authors: Priya Silverstein; Colin Elman; Amanda Montoya; Barbara McGillivray; Charlotte R. Pennington; Chase H. Harrison; Crystal N. Steltenpohl; Jan Philipp Röer; Katherine S. Corker; Lisa M. Charron; Mahmoud Elsherif; Mario Malicki; Rachel Hayes-Harb; Sandra Grinschgl; Tess Neal; Thomas Rhys Evans; Veli-Matti Karhulahti; William L. D. Krenzer; Anabel Belaus; David Moreau; Debora I. Burin; Elizabeth Chin; Esther Plomp; Evan Mayo-Wilson; Jared Lyle; Jonathan M. Adler; Julia G. Bottesini; Katherine M. Lawson; Kathleen Schmidt; Kyrani Reneau; Lars Vilhuber; Ludo Waltman; Morton Ann Gernsbacher; Paul E. Plonski; Sakshi Ghai; Sean Grant; Thu-Mai Christian; William Ngiam; Moin Syed
- Corresponding author: Moin Syed
Escaping Low Earnings: The Role of Employer Characteristics and Changes
- DOI: 10.1177/001979390405700405
- Published: 2004
- Journal:
- Impact factor: 0
- Authors: Harry J. Holzer; Julia I. Lane; Lars Vilhuber
- Corresponding author: Lars Vilhuber
La spécificité de la formation en milieu de travail : un survol des contributions théoriques et empiriques récentes
(English: The specificity of workplace training: a survey of recent theoretical and empirical contributions)
- DOI: 10.7202/602347ar
- Published: 2009
- Journal:
- Impact factor: 3.7
- Authors: Lars Vilhuber
- Corresponding author: Lars Vilhuber
Assessing Utility of Differential Privacy for RCTs
- DOI: 10.48550/arxiv.2309.14581
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Soumya Mukherjee; Aratrika Mustafi; Aleksandra B. Slavkovic; Lars Vilhuber
- Corresponding author: Lars Vilhuber
Other grants by Lars Vilhuber
Collaborative Research: Elements: TRAnsparency CErtified (TRACE): Trusting Computational Research Without Repeating It
- Award number: 2209629
- Fiscal year: 2022
- Amount: $393,500
- Award type: Standard Grant
Conferences on Reproducibility and Replicability in Economics and the Social Sciences (CRRESS)
- Award number: 2217493
- Fiscal year: 2022
- Amount: $393,500
- Award type: Standard Grant
RCN: Coordination of the NSF-Census Research Network
- Award number: 1507241
- Fiscal year: 2014
- Amount: $393,500
- Award type: Standard Grant
RCN: Coordination of the NSF-Census Research Network
- Award number: 1237602
- Fiscal year: 2012
- Amount: $393,500
- Award type: Standard Grant
NCRN-MN: Cornell Census-NSF Research Node: Integrated Research Support, Training and Data Documentation
- Award number: 1131848
- Fiscal year: 2011
- Amount: $393,500
- Award type: Standard Grant
Synthetic Data User Testing and Dissemination
- Award number: 1042181
- Fiscal year: 2010
- Amount: $393,500
- Award type: Standard Grant
The economics of mass layoffs: displaced workers, displacing firms, and causes and consequences
- Award number: 0820349
- Fiscal year: 2008
- Amount: $393,500
- Award type: Continuing Grant
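Similar NSFC (National Natural Science Foundation of China) grants
Science communication: building a new AI-for-science outreach platform based on the large scientific facility "China Sky Eye"
- Award number: T2241020
- Approval year: 2022
- Amount: CNY 100,000
- Award type: Special Project
SCIENCE CHINA: Earth Sciences
- Award number: 41224003
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special Fund Project
SCIENCE CHINA Chemistry
- Award number: 21224001
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special Fund Project
Research on e-Science-based integration and semantic retrieval of ethnic information resources
- Award number: 61262071
- Approval year: 2012
- Amount: CNY 460,000
- Award type: Regional Science Fund Project
Frontiers of Environmental Science & Engineering
- Award number: 51224004
- Approval year: 2012
- Amount: CNY 200,000
- Award type: Special Fund Project
Science China-Physics, Mechanics & Astronomy
- Award number: 11224804
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special Fund Project
Journal of Computer Science and Technology
- Award number: 61224001
- Approval year: 2012
- Amount: CNY 200,000
- Award type: Special Fund Project
SCIENCE CHINA Information Sciences
- Award number: 61224002
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special Fund Project
SCIENCE CHINA Technological Sciences
- Award number: 51224001
- Approval year: 2012
- Amount: CNY 240,000
- Award type: Special Fund Project
SCIENCE CHINA Life Sciences (中国科学 生命科学)
- Award number: 81024803
- Approval year: 2010
- Amount: CNY 240,000
- Award type: Special Fund Project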
Similar overseas grants
Collaborative Research: EAGER: A High Throughput Science Gateway for the Event Horizon Telescope
- Award number: 2324672
- Fiscal year: 2023
- Amount: $393,500
- Award type: Standard Grant
Collaborative Research: EAGER: A High Throughput Science Gateway for the Event Horizon Telescope
- Award number: 2324673
- Fiscal year: 2023
- Amount: $393,500
- Award type: Standard Grant
Sustaining: A Bridge to Sustainability for the CIPRES Science Gateway
- Award number: 2211631
- Fiscal year: 2022
- Amount: $393,500
- Award type: Standard Grant
Collaborative Research: ELEMENTS: The LROSE Science Gateway LIDAR/RADAR Analysis In The Cloud
- Award number: 2103776
- Fiscal year: 2021
- Amount: $393,500
- Award type: Standard Grant
Collaborative Research: ELEMENTS: The LROSE Science Gateway LIDAR/RADAR Analysis In The Cloud
- Award number: 2103785
- Fiscal year: 2021
- Amount: $393,500
- Award type: Standard Grant
CC* Compute: GP-ARGO: The Great Plains Augmented Regional Gateway to the Open Science Grid
- Award number: 2018766
- Fiscal year: 2020
- Amount: $393,500
- Award type: Standard Grant
RAISE: A Materials Science Gateway for X-ray Imaging and Modeling of Microstructures
- Award number: 2037773
- Fiscal year: 2020
- Amount: $393,500
- Award type: Standard Grant
Science Undergraduate Research Gateway Experience (SURGE)
- Award number: 10684904
- Fiscal year: 2019
- Amount: $393,500
- Award type:
Science Undergraduate Research Gateway Experience (SURGE)
- Award number: 10457931
- Fiscal year: 2019
- Amount: $393,500
- Award type:
CICI: SSC: Securing Science Gateway Cyberinfrastructure with Custos
- Award number: 1840003
- Fiscal year: 2018
- Amount: $393,500
- Award type: Standard Grant