http://www.embedded.com/electronics-blogs/programming-pointers/4026076/Why-size-t-matters
Using size_t appropriately can improve the portability, efficiency, or readability of your code. Maybe even all three.
Numerous functions in the Standard C library accept arguments or return values that represent object sizes in bytes. For example, the lone argument in malloc(n) specifies the size of the object to be allocated, and the last argument in memcpy(s1, s2, n) specifies the size of the object to be copied. The return value of strlen(s) yields the length of (the number of characters in) null-terminated character array s excluding the null character, which isn't exactly the size of s, but it's in the ballpark.
You might reasonably expect these parameters and return types that represent sizes to be declared with type int (possibly long and/or unsigned), but they aren't. Rather, the C standard declares them as type size_t. According to the standard, the declaration for malloc should appear in <stdlib.h> as something equivalent to:
void *malloc(size_t n);
and the declarations for memcpy and strlen should appear in <string.h> looking much like:
void *memcpy(void *s1, void const *s2, size_t n);
size_t strlen(char const *s);
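To see the three declarations working together, here is a minimal sketch of a string-copying helper. The name duplicate_string is hypothetical and not part of the standard library; the point is only that the value returned by strlen can be handed to malloc and memcpy without any conversion, because all three use size_t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a standard function): returns a heap-allocated
   copy of the null-terminated string s, or NULL if allocation fails. */
char *duplicate_string(char const *s)
{
    assert(s != NULL);

    size_t n = strlen(s) + 1;   /* length plus the terminating null character */
    char *copy = malloc(n);     /* malloc takes a size_t */
    if (copy != NULL) {
        memcpy(copy, s, n);     /* memcpy's third parameter is a size_t */
    }
    return copy;
}
```

The caller owns the returned storage and should release it with free.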
The type size_t also appears throughout the C++ standard library. In addition, the C++ library uses a related symbol size_type, possibly even more than it uses size_t.
In my experience, most C and C++ programmers are aware that the standard libraries use size_t, but they really don't know what size_t represents or why the libraries use size_t as they do. Moreover, they don't know if and when they should use size_t themselves.
In this column, I'll explain what size_t is, why it exists, and how you should use it in your code.
A portability problem
Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem, illustrated by the following example.
Let's examine the problem of writing a portable declaration for the standard memcpy function. We'll look at a few different declarations and see how well they work when compiled for different architectures with different-sized address spaces and data paths.
Recall that calling memcpy(s1, s2, n) copies the first n bytes from the object pointed to by s2 to the object pointed to by s1, and returns s1. The function can copy objects of any type, so the pointer parameters and return type should be declared as "pointer to void." Moreover, memcpy doesn't modify the source object, so the second parameter should really be "pointer to const void." None of this poses a problem.
The real concern is how to declare the function's third parameter, which represents the size of the source object. I suspect many programmers would choose plain int, as in:
void *memcpy(void *s1, void const *s2, int n);
which works fine most of the time, but it's not as general as it could be. Plain int is signed--it can represent negative values. However, sizes are never negative. Using unsigned int instead of int as the type of the third parameter lets memcpy copy larger objects, at no additional cost.
On most machines, the largest unsigned int value is roughly twice the largest positive int value. For example, on a 16-bit twos-complement machine, the largest unsigned int value is 65,535 and the largest positive int value is 32,767. Using an unsigned int as memcpy's third parameter lets you copy objects roughly twice as big as when using int.
Although the size of an int varies among C implementations, on any given implementation int objects are always the same size as unsigned int objects. Thus, passing an unsigned int argument is always the same cost as passing an int.
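You can check both claims on your own compiler with a short sketch like the one below. The function name report_int_limits is an assumption made for illustration. The standard guarantees only that int and unsigned int occupy the same amount of storage; the factor of roughly two is what typical twos-complement implementations provide, not something the standard promises.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: prints the limits of int and unsigned int and
   returns UINT_MAX divided by INT_MAX, rounded down. */
unsigned int report_int_limits(void)
{
    /* Corresponding signed and unsigned types use the same amount of storage. */
    assert(sizeof(int) == sizeof(unsigned int));

    printf("sizeof(int) = %zu bytes\n", sizeof(int));
    printf("INT_MAX     = %d\n", INT_MAX);
    printf("UINT_MAX    = %u\n", UINT_MAX);

    return UINT_MAX / (unsigned int)INT_MAX;
}
```

On a typical implementation, where UINT_MAX equals 2 * INT_MAX + 1, the function returns 2.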
Using unsigned int as the parameter type, as in:
void *memcpy(void *s1, void const *s2, unsigned int n);
works just dandy on any platform in which an unsigned int can represent the size of the largest data object. This is generally the case on any platform in which integers and pointers have the same size, such as IP16, in which both integers and pointers occupy 16 bits, or IP32, in which both occupy 32 bits. (See the sidebar on C data model notation.)
Sidebar: C data model notation
Of late, I've run across several articles that employ a compact notation for describing the C language data representation on different target platforms. I have yet to find the origins of this notation, a formal syntax, or even a name for it, but it appears to be simple enough to be usable without a formal definition. The general form of the notation appears to be:
I nI L nL LL nLL P nP
where each capital letter (or pair thereof) represents a C data type, and each corresponding n is the number of bits that the type occupies. I stands for int, L stands for long, LL stands for long long, and P stands for pointer (to data, not pointer to function). Each letter and number is optional. Letters that share a size are written together ahead of that size, so IP32 means that int and pointers both occupy 32 bits.
Unfortunately, this declaration for memcpy comes up short on an I16LP32 processor (16 bits for int and 32 bits for long and pointers), such as the first generation Motorola 68000. In this case, the processor can copy objects larger than 65,535 bytes, but this memcpy can't because parameter n can't handle values that large.
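The effect can be simulated on any machine by standing in uint16_t for a 16-bit unsigned int. The sketch below is an illustration under that assumption, and narrow_to_16_bits is a hypothetical name. It shows what happens to a byte count when it is converted to a 16-bit unsigned parameter: the high-order bits are silently discarded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models passing a 32-bit byte count to a parameter
   declared as a 16-bit unsigned int, as on an I16LP32 target. */
uint16_t narrow_to_16_bits(uint32_t size)
{
    uint16_t narrowed = (uint16_t)size;

    /* Unsigned narrowing is well defined: the result is size modulo 65,536. */
    assert(narrowed == size % UINT32_C(65536));
    return narrowed;
}
```

A request to copy 70,000 bytes, for instance, would arrive as a request to copy 4,464 bytes (70,000 minus 65,536).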
Easy to fix, you say? Just change the type of memcpy's third parameter:
void *memcpy(void *s1, void const *s2, unsigned long n);
You can use this declaration to write a memcpy for an I16LP32 target, and it will be able to copy large objects. It will also work on IP16 and IP32 platforms, so it does provide a portable declaration for memcpy. Unfortunately, on an IP16 platform, the machine code you get from using unsigned long here is almost certainly a little less efficient (the code is both bigger and slower) than what you get from using an unsigned int.
In Standard C, a long (whether signed or unsigned) must occupy at least 32 bits. Thus, an IP16 platform that supports Standard C really must be an IP16L32 platform. Such platforms typically implement each 32-bit long as a pair of 16-bit words. In that case, moving a 32-bit long usually requires two machine instructions, one to move each 16-bit chunk. In fact, almost all 32-bit operations on these platforms require at least two instructions, if not more.
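To make the extra work concrete, here is a rough model in C of what such a platform does when it adds two 32-bit values. The type long32 and the functions to_u32 and add32 are hypothetical names invented for this sketch; a real compiler would emit an add instruction followed by an add-with-carry instruction rather than call anything like this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit long stored as a pair of 16-bit words. */
typedef struct {
    uint16_t lo;   /* low-order 16 bits */
    uint16_t hi;   /* high-order 16 bits */
} long32;

/* Reassembles the two words into one 32-bit value (used only for checking). */
uint32_t to_u32(long32 v)
{
    return ((uint32_t)v.hi << 16) | v.lo;
}

/* Adds two 32-bit values one 16-bit word at a time. */
long32 add32(long32 a, long32 b)
{
    long32 r;

    r.lo = (uint16_t)(a.lo + b.lo);            /* step 1: add the low words */
    uint16_t carry = (uint16_t)(r.lo < a.lo);  /* 1 if the low-word sum wrapped */
    r.hi = (uint16_t)(a.hi + b.hi + carry);    /* step 2: add high words plus carry */

    /* The word-by-word result matches ordinary 32-bit addition. */
    assert(to_u32(r) == (uint32_t)(to_u32(a) + to_u32(b)));
    return r;
}
```

Every 32-bit addition costs two word-sized steps here, where a 16-bit addition costs one.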
Thus, declaring memcpy's third parameter as an unsigned long in the name of portability exacts a performance toll on some platforms, something we'd like to avoid. Using size_t avoids that toll.
Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform.
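Because size_t is unsigned and has a known maximum (SIZE_MAX, which C99 defines in <stdint.h>), you can test whether a computed size is representable before you use it. The following is a minimal sketch; size_fits is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Hypothetical helper: returns nonzero if count elements of elem_size bytes
   each have a total size that size_t can represent. */
int size_fits(size_t count, size_t elem_size)
{
    assert(elem_size != 0);

    /* Dividing avoids the wraparound that count * elem_size could cause. */
    return count <= SIZE_MAX / elem_size;
}
```

Since size_t is unsigned, converting -1 to size_t yields SIZE_MAX, which is a quick way to confirm that the type has no negative values.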
Using size_t
The definition for size_t appears in several Standard C headers, namely, <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>, <time.h>, and <wchar.h>. It also appears in the corresponding C++ headers, <cstddef>, <cstdio>, and so on. You should include at least one of these headers in your code before referring to size_t.
Including any of the C headers (in a program compiled as either C or C++) declares size_t as a global name. Including any of the C++ headers (something you can do only in C++) defines size_t as a member of namespace std.
By definition, size_t is the result type of the sizeof operator. Thus, the appropriate way to declare n to make the assignment:
n = sizeof(thing);
both portable and efficient is to declare n with type size_t. Similarly, the appropriate way to declare a function foo to make the call:
foo(sizeof(thing));
both portable and efficient is to declare foo's parameter with type size_t. Functions with parameters of type size_t often have local variables that count up to or down from that size and index into arrays, and size_t is often a good type for those variables.
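The sketch below shows both patterns, counting up and counting down, with hypothetical functions sum and find_last. The one thing to watch when counting down is that a size_t can never be negative, so a loop condition such as i >= 0 is always true. Testing i-- > 0 is one common way around that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts up from 0 to n to sum the n elements of a. */
long sum(int const *a, size_t n)
{
    assert(a != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += a[i];
    }
    return total;
}

/* Hypothetical helper: counts down from n to find the last element equal
   to key. Returns its index, or n if key does not appear. */
size_t find_last(int const *a, size_t n, int key)
{
    assert(a != NULL || n == 0);

    /* i-- > 0 tests before decrementing, so the body sees n-1 down to 0. */
    for (size_t i = n; i-- > 0; ) {
        if (a[i] == key) {
            return i;
        }
    }
    return n;
}
```

A caller would typically compute the element count as sizeof(a) / sizeof(a[0]), which already has type size_t.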
Using size_t appropriately makes your source code a little more self-documenting. When you see an object declared as a size_t, you immediately know it represents a size in bytes or an index, rather than an error code or a general arithmetic value.
Expect to see me using size_t in other examples in upcoming columns.
Dan Saks is president of Saks & Associates, a C/C++ training and consulting company. For more information about Dan Saks, visit his website at www.dansaks.com. Dan also welcomes your feedback: e-mail him at dsaks@wittenberg.edu.