We need better tools for C, such as source browsers, bug finders, and automated refactorings. The problem is that large C systems such as Linux are software product lines, containing thousands of configuration variables that control every aspect of the software, from architecture features to file systems and drivers. The challenge of such configurability is how software tools can accurately analyze all configurations of the source without the exponential explosion of trying each one separately. To this end, we focus on two key subproblems, parsing and the build system. The contributions of this thesis are the following: (1) a configuration-preserving preprocessor and parser called SuperC that preserves configurations in its output syntax tree; (2) a configuration-preserving Makefile evaluator called Kmax that collects Linux’s compilation units and their configurations; and (3) a framework for configuration-aware analyses of source code using these tools.

C tools need to process two languages: C itself and the preprocessor. The latter improves expressivity through file includes, macros, and static conditionals. But it operates only on tokens, which makes it hard even to parse the two languages together. SuperC is a complete, performant solution to parsing all of C. First, a configuration-preserving preprocessor resolves includes and macros yet leaves static conditionals intact, thus preserving a program’s variability. To ensure completeness, we analyze all interactions between preprocessor features and identify techniques for correctly handling them. Second, a configuration-preserving parser generates a well-formed AST with static choice nodes for conditionals. It forks new subparsers when encountering static conditionals and merges them again after the conditionals. To ensure performance, we present a simple algorithm for table-driven Fork-Merge LR parsing and four novel optimizations. We demonstrate SuperC’s effectiveness on the x86 Linux kernel.
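To illustrate why static conditionals make parsing hard, the following is a minimal sketch of a conditional that does not align with a syntactic unit. The configuration variable `CONFIG_STRICT` and the function `set_flag` are hypothetical names chosen for illustration and do not come from Linux or from SuperC.

```c
/* Hypothetical example: CONFIG_STRICT is an assumed configuration
 * variable, not a real Linux option. */
static int set_flag(int x)
{
    int flag = 0;
#ifdef CONFIG_STRICT
    if (x > 0)          /* present only when CONFIG_STRICT is defined */
#endif
        flag = 1;       /* guarded by the if in one configuration,
                           unconditional in the other */
    return flag;
}
```

An ordinary compiler sees only one of the two configurations: with `CONFIG_STRICT` defined the assignment is the body of the `if`, and without it the assignment is a standalone statement. A configuration-preserving parser has to represent both readings at once, which is the role the static choice nodes in the AST play.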
Large-scale C codebases like Linux are software product families, with complex build systems that