
SSD06 Exercise04: Personal Solution

Profiling Lab: Understanding Program Performance

In this exercise, you will modify an existing program to make it run faster. The program is not written for speed. You will find many things to optimize, but you should concentrate on optimizations that will make a significant impact on run time.

The program is all in one file called substitute.cpp. The purpose of the program is to perform string substitutions on a list of files. The program input is specified on the command line, for example:

substitute.exe replacements.txt file1.txt file2.txt ... fileN.txt

The first file, replacements.txt in this example, contains a list of substitutions to perform. Each string substitution is specified in 3 lines: The first line specifies a string to search for, the next line specifies the replacement string, followed by a blank line. The following is an example substitution file:

the
that

his
her
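
The three-line format above can be read with a short loop. The following is only a sketch of one way to parse it, not the code in substitute.cpp; the function name readReplacements and the use of the standard library instead of MFC are assumptions made for illustration.

```cpp
#include <istream>
#include <sstream>  // only needed for the std::istringstream usage shown below
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reads (search, replacement) pairs from a stream in the
// three-line format described above (search line, replacement line, blank line).
std::vector<std::pair<std::string, std::string>> readReplacements(std::istream& in) {
    std::vector<std::pair<std::string, std::string>> pairs;
    std::string from, to, separator;
    while (std::getline(in, from) && std::getline(in, to)) {
        pairs.emplace_back(from, to);
        // Consume the blank separator line; it may be missing after the last pair.
        std::getline(in, separator);
    }
    return pairs;
}
```

For example, passing a std::istringstream that holds the five lines shown above yields two pairs: ("the", "that") and ("his", "her").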

The remaining files on the command line are the files to be modified. The program reads each file, performs the substitutions one line at a time, and then writes the file. To perform a substitution, the program looks for an exact match to the first string within the file. The matching characters are replaced by the second string. Then the file is searched for another match to the first string. This match is performed on the new state of the file, so it may include characters from any previous substitution. In fact, if the replacement string contains the search string, the program will go into an infinite loop. When no more matches are found, the program moves on to the next line in the substitution file.
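
The search-replace-search-again behavior described above can be sketched as follows. This is an illustration of the described semantics, not the original substitute.cpp; the name substituteRescan is hypothetical, and the two assertions are guards added for this sketch (the second rejects only the infinite-loop case the text names, so it does not prove termination for every input).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: applies one search/replace pair to `text`, restarting
// the search from the beginning after every replacement, so a later match may
// include characters produced by an earlier replacement.
// Returns the number of replacements made.
std::size_t substituteRescan(std::string& text,
                             const std::string& from,
                             const std::string& to) {
    assert(!from.empty());                       // an empty search string always matches
    assert(to.find(from) == std::string::npos);  // the case the text says loops forever
    std::size_t count = 0;
    std::size_t pos;
    while ((pos = text.find(from)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        ++count;
    }
    return count;
}
```

With the search string "ab" and the replacement "b", the text "aab" first becomes "ab" and then "b": the second match uses a character left over from the first replacement. Restarting the search at position 0 each time is also the kind of repeated work a profiler may point to.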

A good programming style would be to avoid reading in a whole file at a time because the files might be very large. For this exercise, however, you can assume that there is always enough memory to read the whole file.

Running substitute.exe

Before running substitute.exe, you need some data to run it on. Download this data, create a folder for it, and extract the files from the .zip archive. Since the program will modify some of these files, you will want to either save the .zip archive or save a copy of the files. (You will be running the program many times on this test data.)

To run substitute.exe, you can move it to a directory of your choice, create an MS-DOS command prompt window, and type in the program and command line:

substitute.exe replace.txt call.cpp compiler.cpp driver.cpp getopt.cpp jnk.cpp mach.cpp math.cpp semantics.cpp test.cpp

Profiling substitute.exe

Once everything is running (check the test files to see that the substitutions were applied), you are ready to start optimizing. The first step will be to use a profiler to find out where the program is spending its time and what it is doing with that time. Consult Appendix A, Profiler Customized for SSD6, for more information about the profiler.

Your Assignment

You should make a new version of substitute.exe and demonstrate, using profiling output, that it runs faster. You should be able to obtain at least a factor of 2 speedup (old run time divided by new run time). You do not have to use Microsoft Foundation Class objects, but given that these are well written and probably correct, you should only replace code that is doing unnecessary work as reflected in profiler measurements.
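
The speedup figure is the old run time divided by the new run time. The assignment expects the numbers to come from the profiler; the sketch below is only a supplementary wall-clock check using std::chrono, and the names secondsFor and speedup are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
double secondsFor(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Speedup as defined in the assignment: old run time divided by new run time.
double speedup(double oldSeconds, double newSeconds) {
    assert(newSeconds > 0.0);
    return oldSeconds / newSeconds;
}
```

An old run of 4 seconds and a new run of 2 seconds give a speedup of exactly 2, which is the minimum the assignment asks for.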

Submit two files:

  1. Your modified substitute.cpp
  2. A file that contains the following
    1. a clear but concise description of what you observed before optimization. It should be substantiated by empirical evidence from the profiler output.
    2. the bottlenecks you noticed
    3. the actions you took to address the bottlenecks, and the improvements you observed (again substantiated by empirical evidence)
    4. if you had decided to continue with the next most promising code segment for optimization, what would it be? Provide reasons for why you did NOT attempt to pursue that optimization.

To verify the correctness of your solution, compare the output files produced before making any source code changes with the output files produced after making source code changes. Since your optimization should not change the external behavior of the program, the corresponding output files should be identical. If the output files differ, your solution is incorrect. You can use the "comp" command in Windows to check whether the contents of two files differ.
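
If you prefer to script the comparison rather than run "comp" by hand, a byte-for-byte check is easy to write. The following is a minimal sketch under that assumption; the function name filesIdentical is hypothetical and the code is not part of the assignment.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns true only if both files can be opened and
// contain exactly the same bytes.
bool filesIdentical(const std::string& pathA, const std::string& pathB) {
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    if (!a || !b) {
        return false;  // a file that cannot be opened counts as a mismatch
    }
    std::istreambuf_iterator<char> itA(a), itB(b), end;
    while (itA != end && itB != end) {
        if (*itA != *itB) {
            return false;
        }
        ++itA;
        ++itB;
    }
    // Identical only if both files ended at the same point.
    return itA == end && itB == end;
}
```

Opening the files in binary mode keeps line-ending translation from hiding a difference.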

Note: If any of the files you use for this assessment are stored on a network drive, the network I/O time may dwarf the CPU time consumed by the program and may seriously affect your measurements. Make sure all your files are on the hard drive of your local machine.

Final Word

This project is for educational value only. However, you might be intrigued by the power of a string replacement engine like substitute.exe. Often, in software engineering and even web page maintenance, you will need to perform global replacement of variables, classes, and even misspellings. This is so common in large projects that there are many special tools to facilitate this job. Unix tools such as find and sed and languages such as Awk and Perl make this kind of job simple. In fact, a sed script to replace strings in a file is not much more complicated than the "replace.txt" file read in by this project. With most tools, you can match patterns, allowing you to do more powerful things such as substituting only when the search string is followed by a non-alphanumeric character, that is, when the search string is not a prefix of some other identifier.
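
The pattern-matching idea mentioned above can also be tried in C++ with std::regex. This is a rough sketch, not something the assignment requires: the name replaceUnlessPrefix is hypothetical, the search string is assumed to contain no regular-expression metacharacters, and the replacement is assumed to contain no "$". The negative lookahead also accepts a match at the very end of the text, where nothing follows the search string.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces `from` with `to` only where `from` is not
// immediately followed by a letter or digit, i.e. where it is not a prefix
// of a longer identifier.
std::string replaceUnlessPrefix(const std::string& text,
                                const std::string& from,
                                const std::string& to) {
    // Default ECMAScript grammar; (?!...) is a negative lookahead.
    const std::regex pattern(from + "(?![A-Za-z0-9])");
    return std::regex_replace(text, pattern, to);
}
```

For example, replacing "count" with "total" in "count = counter + count;" leaves "counter" untouched and changes the other two occurrences.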

Few programmers are experts at all of these tools. Novices tend to ignore them and do things the hard way. Experienced programmers know they can learn to use another programming language; they teach themselves what they need to know on an as-needed basis and get the job done quickly.

 



 

Comments
#2 lipanpally 2011-10-13
How do I profile my own program?
Thanks!
#1 qianjigui 2009-02-08
Feedback for Program Profiling (v2.0)
Total Score: 89.36170212765957/100

    * Optimization
      Score: 42/47

          o Pre-Optimization - Description of what you observed before optimizing.
            Score: 5/5

          o Explanation of bottlenecks
            Score: 0/5
            Incorrect assessment of problem areas.

          o Optimization - steps taken to alleviate bottlenecks and their impact.
            Score: 15/15

          o Post-Optimization - reasons for stopping further optimizations
            Score: 10/10

          o Implementation of optimization
            Score: 12/12
