SEER RRSS #5 - Constructing Geographic Areas in GIS for Cancer Data Analysis
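Basic information
- Grant number: 7952665
- Principal investigator:
- Amount: $63,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2005
- Funding country: United States
- Project period: 2005-08-01 to 2010-07-31
- Project status: Completed
- Source:
- Keywords:
Project abstract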
Compared with other diseases such as cardiovascular disease and diabetes, cancer is a relatively rare disease. The analysis of cancer incidence often suffers from the small population problem, manifested in unreliable rate estimates, sensitivity to missing data and other data errors, and data suppression in sparsely populated areas. When creating maps of cancer incidence, the choice of areal unit of analysis (e.g., county or parish, zip code, census tract) and the geographic region of interest determine whether there will be sufficient numbers of cases in each area. For example, on the State Cancer Profiles website, cancer rates are mapped at the county or parish level. A map of Louisiana's parish-level incidence rates for cancer of the brain and other nervous system (ONS) would have rates suppressed for 43 (67%) of 64 parishes, while a map of childhood cancer incidence would have rates suppressed for 53 (80%) parishes (see companion proposal from the Louisiana Tumor Registry (LTR)). In contrast, for California, brain/ONS and childhood cancer rates would be suppressed in only 13 (22%) and 21 (36%) of the state's 58 counties, respectively. Meanwhile, rate variations within the largest counties or parishes such as Orleans, Jefferson, and East Baton Rouge in Louisiana and Los Angeles, San Diego, Alameda, and Santa Clara in California are not revealed. Rates in these areas have limited value to researchers and concerned citizens interested in describing cancer incidence patterns at finer geographic scales. Furthermore, within these county boundaries are areas with distinct concentrations of racial/ethnic groups and high and low socioeconomic status that may have different rates of cancer. Incidence rates may be generated for smaller and more homogeneous geographic units such as census tracts.
The total population in a census tract (year 2000), however, ranges between 1,500 and 8,000 with an optimal size of 4,000, which makes these geographic units too small to yield reliable tract-level incidence rates without jeopardizing patients' privacy and confidentiality.
Several geographic strategies have been proposed to mitigate the problem. Spatial smoothing computes average rates for each area of interest by incorporating rates in adjacent areas. Spatial smoothing methods include the floating catchment area method, kernel density estimation, empirical Bayes estimation, locally-weighted-average approaches, and adaptive spatial filtering. While spatial smoothing helps reveal the overall trend of spatial patterns (see www.uiowa.edu/iowacancermaps for an example), the result is an estimate of the average rate derived from the area of interest and surrounding areas, and may not reflect the true rate for the area of interest.
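As a rough illustration of the smoothing idea, the sketch below pools each area's cases and population with those of its adjacent areas and reports the pooled rate. This is only a simple population-weighted local average, not the floating catchment, kernel, or empirical Bayes methods named above; the function name, the adjacency dictionary, and the toy data are hypothetical.

```python
from typing import Dict, List


def smoothed_rates(cases: Dict[str, int],
                   population: Dict[str, int],
                   neighbors: Dict[str, List[str]],
                   per: int = 100_000) -> Dict[str, float]:
    """Pool each area's cases and population with those of its neighbors."""
    rates = {}
    for area in cases:
        window = [area] + neighbors.get(area, [])
        total_cases = sum(cases[a] for a in window)
        total_pop = sum(population[a] for a in window)
        # Rate per `per` persons over the area plus its adjacent areas.
        rates[area] = per * total_cases / total_pop if total_pop else float("nan")
    return rates


# Hypothetical toy data: three areas in a row, A - B - C.
cases = {"A": 2, "B": 10, "C": 0}
population = {"A": 1_000, "B": 4_000, "C": 5_000}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
rates = smoothed_rates(cases, population, neighbors)
```

In this toy case area C has no cases of its own, yet its smoothed rate is positive because it borrows from B, which shows why a smoothed value may not reflect the true rate for the area of interest.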
This proposal seeks to construct larger geographic areas from smaller areas so that the total base population is sufficiently large for generating reliable incidence rates. Geography has a long tradition of grouping areas together for the purposes of "regionalization" or identifying "spatial clustering". Traditional methods place the first priority on attribute (e.g., sociodemographic characteristics) similarity within areas, and most are implemented manually or semi-automatically. In these methods, attribute information was first used to form initial regions, and several subjective rules and local knowledge were then applied to further adjust the region boundaries. Advancements in geographic information systems (GIS) technology have enabled researchers to develop methods automating the process. Two other earlier methods emphasized spatial proximity: one used space-filling curves to measure the nearness or spatial order of areal units and then grouped areas consecutively until a capacity constraint was reached; the other constructed regions of approximately equal population size by beginning with an area and adding the nearest areas until each region reached the desired threshold population. Neither of these methods, however, accounts for within-area homogeneity of the attribute.
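The second proximity-based approach described above can be sketched as a greedy procedure: start from a seed area and keep adding the nearest unassigned area until the region reaches the threshold population. The sketch below is an assumption-laden illustration, not the original authors' algorithm: it measures nearness by straight-line distance between hypothetical centroids, picks seeds in alphabetical order, does not enforce contiguity, and folds an undersized leftover region into the previous one.

```python
from math import hypot
from typing import Dict, List, Tuple


def build_regions(centroids: Dict[str, Tuple[float, float]],
                  population: Dict[str, int],
                  threshold: int) -> List[List[str]]:
    """Greedily grow regions from seed areas until each reaches `threshold`."""
    unassigned = set(centroids)
    regions: List[List[str]] = []
    for seed in sorted(centroids):  # seed order is an arbitrary assumption
        if seed not in unassigned:
            continue
        unassigned.remove(seed)
        region, total = [seed], population[seed]
        sx, sy = centroids[seed]
        while total < threshold and unassigned:
            # Nearest remaining area to the seed's centroid (ties broken by name).
            nearest = min(
                sorted(unassigned),
                key=lambda a: hypot(centroids[a][0] - sx, centroids[a][1] - sy),
            )
            unassigned.remove(nearest)
            region.append(nearest)
            total += population[nearest]
        if total < threshold and regions:
            regions[-1].extend(region)  # fold an undersized leftover into the last region
        else:
            regions.append(region)
    return regions


# Hypothetical toy data: five areas along a line.
centroids = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (10, 0), "E": (11, 0)}
population = {"A": 3_000, "B": 4_000, "C": 2_000, "D": 5_000, "E": 1_000}
regions = build_regions(centroids, population, threshold=6_000)
```

Note that C is grouped with the distant areas D and E purely on population grounds; nothing in the procedure looks at attribute similarity, which is the limitation noted above.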
Most recent work aims to develop GIS-based automated methods by accounting for spatial contiguity and attribute homogeneity within the derived areas. A preliminary assessment has identified two promising methods. The first is a family of methods, termed "Regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP)", developed to identify clusters of areas. Combining three distance definitions to measure attribute dissimilarity with two constraining strategies to account for spatial contiguity, REDCAP comprises six methods. REDCAP allows users to specify the desired spatial contiguity, attribute dissimilarity, number of derived regions, and other parameters. The second is a modified scale-space clustering (MSSC) method devised to form a series of geographic areas. Scale-space theory is based on the notion that an image contains structures at different scales, and that its more significant structures can be preserved as the scale of observation becomes coarser. Similar to this operation on an image, the MSSC method merges or melts areas of higher value with surrounding areas of lower values but similar structure to form larger areas. The process is guided by a clear objective of minimizing loss of information. The method does not depend on any probability distribution of the data and is robust for unsupervised hierarchical classification. Like REDCAP, the MSSC method does not guarantee that newly formed areas have a minimum population.
Both the REDCAP and MSSC methods account for attribute similarity when grouping contiguous areas together. The major difference lies in the objective functions to be optimized during the clustering process. REDCAP minimizes the total heterogeneity value (i.e., the sum of squared deviations over all regions), while MSSC attempts to preserve the overall spatial structure by grouping around local maxima. Both methods have demonstrated advantages over other existing ones when evaluated for total heterogeneity, region size balance, internal variation, preservation of data distribution, and spatial compactness. However, neither method has been applied to cancer studies. Analysis of cancer data requires special attention to data confidentiality and privacy, and poses unique challenges in the form of additional constraints (e.g., creating areas above a threshold population and respecting important geopolitical boundaries).
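The total heterogeneity value mentioned above can be illustrated with a small sketch that sums, over all regions, the squared deviations of each member area's attribute from its region mean. A single numeric attribute per area is assumed here for simplicity; the function name and the toy values are hypothetical, and the actual REDCAP implementation may define dissimilarity differently.

```python
from typing import Dict, List


def total_heterogeneity(values: Dict[str, float],
                        regions: List[List[str]]) -> float:
    """Sum of squared deviations from the region mean, over all regions."""
    total = 0.0
    for members in regions:
        mean = sum(values[a] for a in members) / len(members)
        total += sum((values[a] - mean) ** 2 for a in members)
    return total


# Hypothetical attribute values for four areas.
values = {"A": 1.0, "B": 3.0, "C": 10.0, "D": 12.0}
similar_grouping = total_heterogeneity(values, [["A", "B"], ["C", "D"]])
mixed_grouping = total_heterogeneity(values, [["A", "C"], ["B", "D"]])
```

Grouping similar areas together gives the lower value, which is the direction a heterogeneity-minimizing method would favor.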
The proposed project plans to evaluate and modify these two methods to enhance the presentation and visualization of cancer surveillance data by geographic area. The study will combine adjacent similar small areas to mask identity while keeping intact those areas that already have a sufficient number of cancer cases (e.g., ≥ 15) and a sufficient population (≥ 50,000).
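The threshold rule in the last sentence can be sketched as a simple screening step that separates areas to keep intact from areas that are candidates for merging. The sketch assumes both conditions (at least 15 cases and at least 50,000 residents) must hold for an area to stay intact, which is one reading of the text; the function name and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_threshold(cases: Dict[str, int],
                       population: Dict[str, int],
                       min_cases: int = 15,
                       min_population: int = 50_000) -> Tuple[List[str], List[str]]:
    """Return (areas kept intact, areas that are candidates for merging)."""
    intact, to_merge = [], []
    for area in sorted(cases):
        # Assumption: an area stays intact only if it meets both thresholds.
        if cases[area] >= min_cases and population[area] >= min_population:
            intact.append(area)
        else:
            to_merge.append(area)
    return intact, to_merge


# Hypothetical toy data for three areas.
cases = {"A": 40, "B": 9, "C": 20}
population = {"A": 120_000, "B": 60_000, "C": 30_000}
intact, to_merge = split_by_threshold(cases, population)
```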
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by VIVIEN CHEN
Quality Control of Electronic Pathology (E-Path) Results; Period of Performance: September 15, 2014 - September 14, 2015
- Grant number: 8947591
- Fiscal year: 2014
- Funding amount: $63,500
- Project category:
Patterns of Care (POC) Quality of Care Dx Yr 2011
- Grant number: 8565158
- Fiscal year: 2012
- Funding amount: $63,500
- Project category:
SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
- Grant number: 8481440
- Fiscal year: 2012
- Funding amount: $63,500
- Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
- Grant number: 8317507
- Fiscal year: 2011
- Funding amount: $63,500
- Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
- Grant number: 8317508
- Fiscal year: 2011
- Funding amount: $63,500
- Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
- Grant number: 8163677
- Fiscal year: 2010
- Funding amount: $63,500
- Project category:
TAS::75 0849::TAS SURVEILLANCE, EPIDEMIOLOGY, AND END RESULTS (SEER) PROGRAM
- Grant number: 8131534
- Fiscal year: 2010
- Funding amount: $63,500
- Project category:
Surveillance, Epidemiology and End Results (SEER) Program - LSU
- Grant number: 7824258
- Fiscal year: 2005
- Funding amount: $63,500
- Project category:
RRSS #9 - Patterns of Care - Dx 2006 Feasibility Adolescent and Young Adult - LSU
- Grant number: 7824260
- Fiscal year: 2005
- Funding amount: $63,500
- Project category:
Similar overseas grants
IGF::OT::IGF SEER RRSS IMPROVING OUTPATIENT REPORTING OF CANCER OCCURRENCE AND TREATMENT; 9/19/16-9/18/17
- Grant number: 9361196
- Fiscal year: 2016
- Funding amount: $63,500
- Project category:
IGF::OT::IGF SEER RRSS IMPROVING OUTPATIENT REPORTING OF CANCER OCCURRENCE AND TREATMENT; 9/19/16-9/18/17
- Grant number: 9361195
- Fiscal year: 2016
- Funding amount: $63,500
- Project category:
RRSS Evaluate Completeness Liver Cancer Reporting Under New Clinical Guidelines
- Grant number: 8351018
- Fiscal year: 2011
- Funding amount: $63,500
- Project category:
SEER RRSS Improving SES Data: Linkage State Vital Records, Birth Certificate Data
- Grant number: 8351002
- Fiscal year: 2011
- Funding amount: $63,500
- Project category:
RRSS Improving SES Data: Linkage w State Vital Records, Birth Certificate Data
- Grant number: 8351016
- Fiscal year: 2011
- Funding amount: $63,500
- Project category:
SEER RRSS Using Multiple Imputation to Enhance the Utility of SEER Summary Stage
- Grant number: 8351006
- Fiscal year: 2011
- Funding amount: $63,500
- Project category: