C++ type sizes in bytes:

Type | Size (bytes) |
---|---|
bool | 4 |
char | 1 |
short | 2 |
int | 4 |
long | 4 |
long long | 8 |
float | 4 |
double | 8 |
void* | 4 |
EmptyClass | 1 |
EmptyVirtualClass | 4 |
So a boolean is mostly wasted space! The minimum "Object" size is 4 bytes (the pointer to the vtable). The empty class has no virtual methods, so its size is the sum of its members (plus any padding), with a lower bound of 1 byte. Certainly most C++ programs have lots of unused memory in little fragments all over the place.
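For anyone who wants to reproduce the numbers, here is a minimal sketch. The definitions of `EmptyClass` and `EmptyVirtualClass` are my guess at what was measured (they are not shown in the post), and `print_type_sizes` is a made-up helper name. The results are implementation-defined: the table above looks like a 32-bit build, while on a 64-bit build `void*` and the vtable pointer are typically 8 bytes, and many compilers make `bool` 1 byte.

```cpp
#include <iostream>
#include <ostream>

// ASSUMPTION: stand-ins for the two classes in the table; the original
// definitions are not given in the post.
class EmptyClass {};

class EmptyVirtualClass {
public:
    // A single virtual member is enough to make the compiler add a vtable pointer.
    virtual ~EmptyVirtualClass() = default;
};

// Writes one "name: size" line per type, in the same order as the table.
void print_type_sizes(std::ostream& out) {
    out << "bool: " << sizeof(bool) << '\n'
        << "char: " << sizeof(char) << '\n'
        << "short: " << sizeof(short) << '\n'
        << "int: " << sizeof(int) << '\n'
        << "long: " << sizeof(long) << '\n'
        << "long long: " << sizeof(long long) << '\n'
        << "float: " << sizeof(float) << '\n'
        << "double: " << sizeof(double) << '\n'
        << "void*: " << sizeof(void*) << '\n'
        << "EmptyClass: " << sizeof(EmptyClass) << '\n'
        << "EmptyVirtualClass: " << sizeof(EmptyVirtualClass) << '\n';
}
```

Calling `print_type_sizes(std::cout)` from `main` prints the list for whatever compiler and target you build with.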
Smalltalk OTOH uses "a variable-length header format which seldom requires more than a single 32-bit word of header information per object. The format is given in Tables 1 and 2.
Table 1: Format of a Squeak object header

offset | contents | occurrence |
---|---|---|
-8 | size in words (30 bits), header type (2 bits) | 1% |
-4 | full class pointer (30 bits), header type (2 bits) | 18% |
0 | base header, as follows: storage management (3 bits); object hash (12 bits); compact class index (5 bits); object format field (4 bits, see below); size in words (6 bits); header type (2 bits) | 100% |
Table 2: Encoding of the object format field in a Squeak object header

value | meaning |
---|---|
0 | no fields |
1 | fixed pointer fields |
2 | indexable pointer fields |
3 | both fixed and indexable pointer fields |
4 | unused |
5 | unused |
6 | indexable word fields (no pointers) |
7 | unused |
8-11 | indexable byte fields (no pointers): low 2 bits are low 2 bits of size in bytes |
12-15 | compiled methods: low 2 bits are low 2 bits of size in bytes. The number of literals is specified in method header, followed by the indexable bytes that store byte codes. |
Our design is based on the fact that most objects in a typical Smalltalk image are small instances of a relatively small number of classes. The 5-bit compact class index field, if non-zero, is an index into a table of up to 31 classes that are designated as having compact instances; the programmer can change which classes these are. The 6-bit size field, if non-zero, specifies the size of the object in words, accommodating sizes up to 256 bytes (i.e., 64 words, with the additional 2 bits needed to resolve the length of byte-indexable objects encoded in the format field). With only 12 classes designated as compact in the 1.18 Squeak release, around 81% of the objects have only this single word of overhead. Most of the rest need one additional word to store a full class pointer. Only a few remaining objects (1%) are large enough to require a third header word to encode their size, and this extra word of overhead is a tiny fraction of their size."
From here: [link|ftp://st.cs.uiuc.edu/Smalltalk/Squeak/docs/OOPSLA.Squeak.html|ftp://st.cs.uiuc.edu...OPSLA.Squeak.html]
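To make the base header concrete, here is a rough C++ sketch of pulling the six fields out of one 32-bit word. The field widths come straight from Table 1 (3 + 12 + 5 + 4 + 6 + 2 = 32 bits). The bit positions are an assumption on my part: I am reading the table as listing fields from the most significant bits down, so the header type lands in the low 2 bits. `BaseHeader`, `decode_base_header`, `encode_base_header`, and `format_name` are names I made up for illustration, not anything from the Squeak VM source.

```cpp
#include <cstdint>

// The six fields of the base header word, with widths from Table 1.
struct BaseHeader {
    std::uint32_t storage_management;   // 3 bits
    std::uint32_t object_hash;          // 12 bits
    std::uint32_t compact_class_index;  // 5 bits, 0 means "no compact class"
    std::uint32_t format;               // 4 bits, see Table 2
    std::uint32_t size_in_words;        // 6 bits
    std::uint32_t header_type;          // 2 bits
};

// ASSUMPTION: fields are packed most-significant-first in the order the
// table lists them, which puts header_type in bits 0-1.
constexpr BaseHeader decode_base_header(std::uint32_t word) {
    return BaseHeader{
        (word >> 29) & 0x7u,
        (word >> 17) & 0xFFFu,
        (word >> 12) & 0x1Fu,
        (word >> 8) & 0xFu,
        (word >> 2) & 0x3Fu,
        word & 0x3u,
    };
}

// Inverse of decode_base_header, under the same layout assumption.
constexpr std::uint32_t encode_base_header(const BaseHeader& h) {
    return ((h.storage_management & 0x7u) << 29) |
           ((h.object_hash & 0xFFFu) << 17) |
           ((h.compact_class_index & 0x1Fu) << 12) |
           ((h.format & 0xFu) << 8) |
           ((h.size_in_words & 0x3Fu) << 2) |
           (h.header_type & 0x3u);
}

// Maps a 4-bit format value to its description from Table 2.
const char* format_name(std::uint32_t format) {
    switch (format) {
        case 0: return "no fields";
        case 1: return "fixed pointer fields";
        case 2: return "indexable pointer fields";
        case 3: return "both fixed and indexable pointer fields";
        case 6: return "indexable word fields (no pointers)";
        case 8: case 9: case 10: case 11:
            return "indexable byte fields (no pointers)";
        case 12: case 13: case 14: case 15:
            return "compiled method";
        default: return "unused";  // 4, 5, 7, and anything out of range
    }
}
```

The point is that every object carries enough bits to say what it is and how big it is, and reading them back is a handful of shifts and masks.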
So which language is more type-safe at runtime: the one where you can tell what the bits you are looking at mean, or the one that struggles to prevent the creation of instructions that misinterpret the bits, and thus makes assumptions with every memory access?
C++ is a naive hack that got way out of hand.
Edit - fixed html table