External libraries. B-trees,priority queues.
Symbols and libraries
• External libraries provide a wealth of functionality –example: C standard library
• Programs access libraries’ functions and variables viaidentifiers known as symbols
• Header file declarations/prototypes mapped to symbols atcompile time
• Symbols linked to definitions in external libraries duringlinking
• Our own program produces symbols, too
所谓Symbol,应该就是方法签名吧,由于以后链接的需要,在二进制库文件中应该还有对应的方法签名,这个能通过反编译得到。
Functions and variables as symbols
• Consider the simple hello world program written below:
#include <stdio.h>
const char msg [] = "Hello, world.";
int main(void){
puts (msg);
return 0;
}
• What variables and functions are declared globally?
msg, main(), puts(), others in stdio.h
我认为C程序的执行就是对全局变量、静态变量和方法(实际上也可以看作个变量,毕竟方法执行的最终结果就是一个数)的操作。他们是程序最顶层的部分,因此用Symbol来表示。
• Let’s compile, but not link, the file hello.c to createhello.o:
athena% gcc -Wall -c hello.c -ohello.o
• -c: compile, but do not linkhello.c; result will compile the code into machine instructions butnot make the program executable
• addresses for lines of code andstatic and global variables not yet assigned
• need to perform link step onhello.o (using gcc or ld) to assign memory to each symbol
• linking resolves symbols definedelsewhere (like the C standard library) and makes the code executable
• Let’s look at the symbols in the compiled file hello.o:
athena% nm hello.o
• Output:
0000000000000000 T main
0000000000000000 R msg
U puts
• ’T’ – (text) code; ’R’ – read-only memory; ’U’- undefined symbol
• Addresses all zero before linking; symbols not allocatedmemory yet
• Undefined symbols are defined externally, resolved duringlinking
• Why aren’t symbols listed for other declarations instdio.h?
• Compiler doesn’t bother creating symbols for unused functionprototypes (saves space)
• What happens when we link?
athena% gcc -Wall hello.o -o hello
• Memory allocated for defined symbols
• Undefined symbols located in external libraries (like libcfor C standard library)
• Let’s look at the symbols now:
athena% nm hello
• Output:
(other default symbols)
.
0000000000400524 T main
000000000040062c R msg
U puts@@GLIBC_2.2.5
• Addresses for static (allocated at compile time) symbols
• Symbol puts located in shared library GLIBC_2.2.5 (GNU Cstandard library)
• Shared symbol puts not assigned memory until run time
link阶段主要是symbols赋地址,我觉得底层的变量应该是用相对位置来算的。
Static and dynamic linkage
• Functions, global variables must be allocated memory beforeuse
• Can allocate at compile time (static) or at run time (shared)
• Advantages/disadvantages to both
• Symbols in same file, other .o files, or static libraries(archives, .a files) – static linkage
• Symbols in shared libraries (.so files) – dynamic linkage
• gcc links against shared libraries by default, can forcestatic linkage using -static flag
个人更喜欢static,原因是以前遇到n多库版本不兼容引发的惨案。。。
Loading shared libraries
• Shared library located during compile-time linkage, but needsto be located again during run-time loading
• Shared libraries located at run-time using linker libraryld.so
• Whenever shared libraries on system change, need to runldconfig to update links seen by ld.so
• During loading, symbols in dynamic library are allocatedmemory and loaded from shared library file
现在知道为什么以前在解决链接库的版本问题时总是要用ldconfig,也怪我太懒,那会儿man一下就什么都知道了。
Loading shared libraries on demand
• In Linux, can load symbols from shared libraries on demandusing functions in dlfcn.h
• Open a shared library for loading:
void ∗ dlopen(const char ∗file, int mode);
values for mode: combination ofRTLD_LAZY (lazy loading of library), RTLD_NOW (load now), RTLD_GLOBAL(make symbols in library available to other libraries yet to be
loaded), RTLD_LOCAL (symbols loadedare accessible only to your code)
• Get the address of a symbol loaded from the library:
void ∗ dlsym(void ∗ handle, constchar ∗ symbol_name);
handle from call to dlopen; returnedaddress is pointer to variable or function identified by symbol_name
• Need to close shared library file handle after done withsymbols in library:
int dlclose(void ∗ handle);
• These functions are not part of C standard library; need tolink against library libdl: -ldl compiler flag
Symbol resolution issues
• Symbols can be defined in multiple places
• Suppose we define our own puts() function
• But, puts() defined in C standard library
• When we call puts(), which one gets used?
• Our puts() gets used since ours is static, and puts() in Cstandard library not resolved until run-time
• If statically linked against C standard library, linker findstwo puts() definitions and aborts (multiple definitions notallowed)
Symbol resolution issues
• How about if we define puts() in a shared library and attemptto use it within our programs?
• Symbols resolved in order they areloaded
• Suppose our library containingputs() is libhello.so, located in a standard library directory (like/usr/lib), and we compile our hello.c code against this library:
athena% gcc -g -Wall hello.c -lhello-o hello.o
• Libraries specified using -l flagare loaded in order specified, and before C standard library
• Which puts() gets used here?
athena% gcc -g -Wall hello.c -lc-lhello -o hello.o
C中静态链接的一个问题就是命名冲突,动态链接可以通过顺序来解决这个问题,有重载的味道。
Creating libraries
• Libraries contain C code like any other program
• Static or shared libraries compiled from (un-linked) objectfiles created using gcc
• Compiling a static library:
• compile, but do not link source files:
athena% gcc -g -Wall -c infile.c -ooutfile.o
• collect compiled (unlinked) files into an archive:
athena% ar -rcs libname.a outfile1.ooutfile2.o …
Creating shared libraries
• Compile and do not link files using gcc:
athena% gcc -g -Wall -fPIC -c infile.c-o outfile.o
• -fPIC option: create position-independent code, since codewill be repositioned during loading
• Link files using ld to create a shared object (.so) file:
athena% ld -shared -soname libname.so-o libname.so.version -lc outfile1.o outfile2.o ...
• If necessary, add directory to LD_LIBRARY_PATH environmentvariable, so ld.so can find file when loading at run-time
• Configure ld.so for new (or changed) library:
athena% ldconfig -v
不明白PIC的作用,在编译阶段Symbol不是没有地址吗?
B-tree structure
• Binary search tree with variable number of children (at leastt, up to 2t)
• Tree is balanced – all leaves at same level
• Node contains list of “keys” – divide range of elementsin children
Initializing a B-tree
• Initially, B-tree contains root node with no children (leafnode), no keys
• Note: root node exempt from minimum children requirement
B树中要求每个节点的子元素数目在t-1和2t-1之间,根节点在生长初期显然不能满足这个要求,因此它可以不受这个规则的约束。
Inserting elements
• Insertion complicated due to maximum number of keys
• At high level:
1. traverse tree down to leaf node
2. if leaf already full, split intotwo leaves:
(a) move median key element intoparent (splitting parent already full)
(b)split remaining keys into twoleaves (one with lower, one with higher elements)
3. add element to sorted list of keys
• Can accomplish in one pass, splitting full parent nodesduring traversal in step 1
B树节点中子元素的个数在t-1和2t-1之间,这种要求可以保证在插入和删除元素时节点的合并和分裂都符合这个要求。2t-1保证了中间元素只有一个,分裂之后的两个节点大小都为t-1,再加入一个元素则变为一个t和一个t-1。在节点合并时,如果从一个含t-1个元素的节点删除一个元素,删除后这个节点要和含2t-1个元素的相邻节点以及他们两个在上层节点中的父元素,共3t-3个元素合并,合并后超过2t-1,继续分裂,其平均值1.5t-1.5一定符合t-1和2t-1之间这个条件。如果两个都是t-1,合并后为2t-2,也满足这个条件。
Searching a B-tree
• Search like searching a binary search tree:
1. start at root.
2. if node empty, element not in tree
3. search list of keys for element(using linear or binary search)
4. if element in list, return element
5. otherwise, element between keys,and repeat search on child node for that range
• Tree is balanced – search takes O(log n) time
B树的搜索本质上与二叉树搜索没有什么区别,都是通过比较来进行搜索。不过B树中一个节点可以有多个元素,访问这些元素是随机访存,而二叉树则需要通过指针来访问,这方面B树效率要高一些。另外,B树在分裂和分裂时都会选节点中的中间元素,从而使得树更为平衡。当然,保证B树平衡最重要的兄弟节点间的合并和分裂,以及这些合并和分裂背后的最小元素数和最大元素数。
Deletion
• Deletion complicated by minimum children restriction
• When traversing tree to find element, need to ensure childnodes to be traversed have enough keys
• if adjacent child node has atleast t keys, move separating key from parent to child and closestkey in adjacent child to parent
• if no adjacent child nodes haveextra keys, merge child node with adjacent child
• When removing a key from a node with children, need torearrange keys again
• if child before or after removedkey has enough keys, move closest key from child to parent
• if neither child has enough keys,merge both children
• if child not a leaf, have torepeat this process
Delte是最复杂的部分,记得6.087的最后一个课后练习就是实现一个B树,当时B树的Detle搞的我压力很大,后来发现那个作业中不需要实现Delete。。。。
Priority queue
• Abstract data structure ordering elements by priority
• Elements enqueued with priority, dequeued in order of highestpriority
• Common implementations: heap or binary search tree
• Operations: insertion, peek/extract max-priority element,increase element priority
线性的队列,容易实现,感觉最简单就是使用插入排序算法实现。往往涉及大量的插入和删除,因此使用链表实现比较好。
Heaps
• Heap - tree with heap-ordering property: priority(child)≤priority(parent)
• More sophisticated heaps exist – e.g. binomial heap,Fibonacci heap
• We’ll focus on simple binary heaps
• Usually implemented as an array with top element at beginning
• Can sort data using a heap – O(n log n) worst case in-placesort!
Extracting data
• Heap-ordering property ⇒ maximum priority element at top ofheap
• Can peek by looking at top element
• Can remove top element, move last element to top, and swaptop element down with its children until it satisfies heap-orderingproperty:
1. start at top
2. find largest of element and leftand right child; if element is largest, we are done
3. otherwise, swap element withlargest child and repeat with element in new position
Inserting data/increasing priority
• Insert element at end of heap, set to lowest priority −∞
• Increase priority of element to real priority:
1. start at element
2. if new priority less thanparent’s, we are done
-
otherwise, swap element with parent and repeat
树、堆等层次结构相对于线性结构的最大优势就是跳转距离比较大,访问效率高。缺点是实现稍微复杂。插入元素时两者的搜索顺序有很大不同,树的结构是从上到下,逐层寻找元素的位置;而最大堆则是将新入元素放到最底部,然后通过从底向上搜索,这样可以保证完美的树形结构。最大堆可以这么玩是因为里面的元素位置不是绝对固定,他们不需要保持特定顺序,只需要满足子节点比父节点小就可以,这样最多到达顶部就停止,而不会有从上到下的过程,因此其计算效率也是很高的。
分享到:
相关推荐
资料目录.bat Advice to next year student.doc lec1.ppt lec10.ppt lec11.ppt lec12.ppt lec13.ppt lec14.ppt lec15.ppt lec16.ppt lec17.ppt lec18.ppt lec19.ppt lec2.ppt lec20.ppt ...lec9.ppt
In the previous lecture, we leant about impedance spectroscopy. Electrochemical impedance spectroscopy is the technique where the cell or electrode impedance is platted versus frequency. Thus, the ...
programming in computing 10a lec2
EI374 高级算法-全套 PPT 课件-笔记 lec1-slides.pdf lec1.pdf lec2-slides.pdf lec2.pdf lec3-slides.pdf lec3.pdf lec4-slides.pdf ...lec6.pdf lec7.pdf lec8.pdf lec9.pdf lec10.pdf lec11.pdf
6. **划分风险等级**:根据LEC分值将危险源分为不同的风险等级,如低风险、中风险和高风险。 总的来说,"lec.rar_LEC"提供的工具是进行LEC风险评估的重要助手,它使用MATLAB语言编写,能够客观地计算和分析工作场所...
Lec 00 Introduction and Course Overview Lec 01 Bezier Curves and Splines Assignment 0 Lec 02 Curves Properties and Conversion, Surface Representation Lec 03 Coordinates and Transformations Lec 04 ...
EI338 计算机系统工程-Computer Systems Engineering-全套 PPT 课件 CA-lec1.pdf ...lec6-OS.pdf lec7-OS.pdf lec8-OS.pdf lec9-OS.pdf lec10-OS.pdf lec11-OS.pdf lec12-OS.pdf Study-Guide.pdf Summary.pdf
《LEC培训(完整版).pdf》是一份关于逻辑等效检查的详细教程,重点介绍了使用Conformal工具进行逻辑等效验证的方法和技术。Conformal是一款强大的逻辑等效检查工具,广泛应用于芯片设计的验证阶段,确保设计的逻辑...
在Linux系统中,头文件是C语言编程的关键部分,它们允许程序员在源代码(如lec.c)中使用特定的函数和数据类型,而无需在每个文件中包含完整的实现。lec.c很可能是这个LAN Emulation客户端的实现文件,包含实际的...
麻省理工matlab课件-MIT6_094IAP10_lec04.pdf 本帖最后由 sunchy11 于 2012-2-8 15:46 编辑 分享个MIT的matlab 教程,属于初级入门,希望对大家有帮助哈。
由于提供的文件内容主要是一些网站链接、电子邮箱地址和数字的排列,没有提供实际的HOLLiAS-LEC G3 PLC选型手册的详细内容,我无法直接从中提取相关的知识点。然而,我可以根据HOLLiAS-LEC G3 PLC这个主题,根据一般...
Week1—4_Note_Lec1—6.pdf
总结而言,“lec.zip_LEC _点名_点名系统_点名系统C_随机点名”提供的是一款基于C语言实现的随机点名系统,它集成了随机数生成和数据管理技术,为教学环境带来了便捷和公平。通过理解和掌握这样的系统,不仅可以提升...
9. 硬件可靠性:LEC-3210具备内置的蜂鸣器和实时时钟(RTC),并支持看门狗时钟(WTD),提供1-255级时间间隔的系统重启设置。 10. 定制按钮与指示灯:设备配备有Reset按钮和CTR按钮,允许用户自定义其功能。Run和...
demo_Lec.m
"Lec 1 SDLC.pptx"这个文件很可能包含了对SDLC和UML的详细介绍,涵盖了以上各个阶段的理论知识和实际应用案例。通过学习这个讲座材料,可以深入理解这两个概念如何在实际项目中协同工作,提升软件开发的效率和质量。...
Lec1-Introduction.pdf.zip
安全风险源识别LEC评分.pdf
麻省理工matlab课件-MIT6_094IAP10_lec05.pdf 本帖最后由 sunchy11 于 2012-2-8 15:46 编辑 分享个MIT的matlab 教程,属于初级入门,希望对大家有帮助哈。
立华公司推出的LEC-3010 Din-Rail嵌入式通讯管理机是一款基于Intel Atom N450处理器的高性能通讯控制设备,其设计注重低功耗、嵌入式应用,具备多种硬件接口,满足工业现场对通讯和控制的严格需求。 LEC-3010主要...