Sunday, September 14, 2008

[0x03]. Notes on Assembly - Memory from a process' point of view

Virtual memory gives users of a modern operating system features that today's IT world could not do without. It wasn't always like that. On older operating systems a program could use most of the memory available to the machine (not much by today's standards, quite a lot in those days), but the price was running only one application at a time. The advent of virtual memory and protected mode changed a lot. As long as the OS allows it, a process can now behave as if the entire main memory belonged to it alone. Multiple processes are tricked into this belief at the same time, which makes backwards compatibility possible and relatively painless.
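A small experiment makes this "private memory" illusion visible. The sketch below is not part of the original post: it assumes a POSIX system with fork(), and the names counter and parent_value_after_child_write are made up for illustration. Parent and child report the same virtual address for counter, yet the child's write never reaches the parent.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo variable: one virtual address, two private copies. */
static int counter = 1;

/* Forks, lets the child overwrite `counter`, and returns the value the
 * parent still sees afterwards (-1 if fork or waitpid fails). */
int parent_value_after_child_write(void)
{
    int status;
    pid_t pid;

    counter = 1;
    fflush(stdout);              /* avoid duplicating buffered output */

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {              /* child: write to its own copy */
        counter = 42;
        printf("child:  &counter = %p, counter = %d\n",
               (void *)&counter, counter);
        fflush(stdout);
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* parent: same address, but the child's write is not visible here */
    printf("parent: &counter = %p, counter = %d\n",
           (void *)&counter, counter);
    return counter;
}
```

Calling parent_value_after_child_write() from main() should print two lines with identical addresses and different values, which is the per-process view of memory this post is about.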

In-depth memory layout is specific to both the CPU architecture and the OS itself, and at this stage it is beyond my understanding. The way a process sees its share of memory during execution is much easier to grasp, and this is what I'm going to describe here.

Memory Layout from a process perspective

When a program is executed it is read into memory*, where it resides and is read from until it finishes. Around the code, a number of special-purpose memory blocks are allocated for different kinds of data. A very common scheme, but not the only one, is depicted in the following table.

*The claim that the storage size of your binary does not influence memory use is not true: the program's static code is read into the lower part of memory.



Stack: a very dynamic kind of memory located at the top (high addresses) and growing downwards.

Memory not allocated yet: memory that will be claimed either by the stack, which grows down, or by the heap, which grows up from underneath. When the stack meets the heap we run out of memory. Typically this is the biggest chunk and most of it remains unused.

Heap: some people say this is the most dynamic part of memory. It is dynamically allocated and freed in big chunks. The allocation process is rather complex (slab/buddy system) and is more time-consuming than putting things on the stack.

BSS: memory containing uninitialized (zero-filled) global and static variables of known (predeclared) size.

Constant data: all constants used in a program.

Static program code

Reserved / other stuff

In order to prove that things really work this way (on many systems anyway) I wrote a C program, mem_sequence.c, that allocates 5 types of data, finds their (virtual) memory addresses, sorts them in descending order and then displays them in a layout similar to the table above. mem_sequence.c was tested on Linux, FreeBSD, MacOS X, WinXP and DOS. All UNIX-like systems follow a similar model with slight differences in address thresholds; the output from Microsoft systems is different and thus interesting. In my opinion the Unix way makes more sense, because it leaves the unallocated space between stack and heap and lets the program adjust to its own needs.
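The source of mem_sequence.c is not reproduced here, so the following is only a guess at its general shape, not the original program. It takes one sample address from each of the five kinds of memory, sorts them from highest to lowest and prints them in the same format as the session below. All identifiers (struct region, collect_regions, print_regions and the sample variables) are invented for this sketch.

```c
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

struct region {
    const char *name;   /* label printed next to the address */
    uintptr_t   addr;   /* sample address inside that region */
};

int bss_var;                            /* uninitialized global: BSS */
const char const_var[] = "constant";    /* read-only constant data   */

/* qsort comparator: highest address first. */
static int by_addr_desc(const void *a, const void *b)
{
    const struct region *ra = a, *rb = b;

    if (ra->addr < rb->addr) return 1;
    if (ra->addr > rb->addr) return -1;
    return 0;
}

/* Fills out[0..4] with one address per region, sorted descending.
 * Returns 0 on success, -1 if malloc fails. */
int collect_regions(struct region out[5])
{
    int stack_var = 0;              /* local variable: stack */
    void *heap_var = malloc(16);    /* dynamic allocation: heap */

    if (heap_var == NULL)
        return -1;

    out[0] = (struct region){ "stack",   (uintptr_t)&stack_var };
    out[1] = (struct region){ "heap",    (uintptr_t)heap_var };
    out[2] = (struct region){ "bss",     (uintptr_t)&bss_var };
    out[3] = (struct region){ "const's", (uintptr_t)const_var };
    out[4] = (struct region){ "code",    (uintptr_t)&collect_regions };

    free(heap_var);                 /* only the address is needed */
    qsort(out, 5, sizeof out[0], by_addr_desc);
    return 0;
}

/* Prints the sorted list, one "N.(0xADDRESS) name" line per region. */
void print_regions(void)
{
    struct region r[5];
    int i;

    if (collect_regions(r) != 0) {
        fprintf(stderr, "malloc failed\n");
        return;
    }
    for (i = 0; i < 5; i++)
        printf("%d.(0x%08" PRIxPTR ") %s\n", i + 1, r[i].addr, r[i].name);
}
```

Calling print_regions() from main() gives a listing in the same format. The order of the lines depends on the operating system and on how the binary was built, so no particular output is promised here.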

This is how I played with mem_sequence:
$ gcc mem_sequence.c -o mem_sequence
$ ./mem_sequence
1.(0xbf828124) stack
2.(0x0804a008) heap
3.(0x080497d4) bss
4.(0x08048688) const's
5.(0x08048557) code
^Z
[1]+ Stopped ./mem_sequence
$ cat /proc/`pidof mem_sequence`/maps
08048000-08049000 r-xp 00000000 fd:01 313781 mem_sequence
08049000-0804a000 rw-p 00000000 fd:01 313781 mem_sequence
0804a000-0806b000 rw-p 0804a000 00:00 0 [heap]
b7dda000-b7ddb000 rw-p b7dda000 00:00 0
b7ddb000-b7efe000 r-xp 00000000 fd:01 4872985 /lib/libc-2.5.so
b7efe000-b7eff000 r--p 00123000 fd:01 4872985 /lib/libc-2.5.so
b7eff000-b7f01000 rw-p 00124000 fd:01 4872985 /lib/libc-2.5.so
b7f01000-b7f04000 rw-p b7f01000 00:00 0
b7f18000-b7f1b000 rw-p b7f18000 00:00 0
b7f1b000-b7f35000 r-xp 00000000 fd:01 4872978 /lib/ld-2.5.so
b7f35000-b7f36000 r--p 00019000 fd:01 4872978 /lib/ld-2.5.so
b7f36000-b7f37000 rw-p 0001a000 fd:01 4872978 /lib/ld-2.5.so
bf816000-bf82b000 rw-p bffeb000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
$
Numbers don't lie! Analyze it:
  • The code (5) and constants (4) fall into the readable and executable (non-writable!) mapping of the binary.
  • BSS (3) sits in the read-write but not executable mapping.
  • Heap sits on top of them and is denoted by "[heap]". Bingo!
  • ...long long nothing...
  • Stack at the very top, described as "[stack]". Yahtzee!
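The matching of addresses to mappings done by eye above can also be done in code. This is a minimal, Linux-only sketch under the assumption that /proc/self/maps is available and uses the line format shown in the listing; line_contains and print_mapping_of are hypothetical helper names, not part of mem_sequence.c.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parses one line of a /proc/<pid>/maps listing.
 * Returns 1 and copies the permission string (e.g. "r-xp") into perms
 * if addr lies in [start, end); returns 0 otherwise or on a parse error. */
int line_contains(const char *line, unsigned long addr, char perms[5])
{
    unsigned long start, end;
    char p[5];

    if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
        return 0;
    if (addr < start || addr >= end)
        return 0;
    strcpy(perms, p);
    return 1;
}

/* Prints the mapping of the current process that holds ptr.
 * Returns 1 if found, 0 if not, -1 if /proc/self/maps cannot be opened. */
int print_mapping_of(const void *ptr)
{
    char line[4352];    /* room for a long path after the fixed fields */
    char perms[5];
    int found = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (line_contains(line, (unsigned long)(uintptr_t)ptr, perms)) {
            printf("%p is in a %s mapping: %s", ptr, perms, line);
            found = 1;
        }
    }
    fclose(f);
    return found;
}
```

Fed with the lines from the listing, line_contains places 0x08048557 (code) in the r-xp mapping and 0x080497d4 (bss) in the rw-p mapping of mem_sequence, which agrees with the analysis in the list.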
It works, great news, but it works differently on different x86-based operating systems. Check it out yourself and please let me know if you make an interesting discovery on some other exotic system.
top   Linux   FreeBSD   MacOSX x86 / PPC   Win32   DOS
1     stack   stack     stack              heap    heap
2     heap    heap      heap               bss     stack
3     bss     bss       bss                const   bss
4     const   const     const              code    const
5     code    code      code               stack   code

Thanks to Harald Monihart for providing MacOSX PPC data.

7 comments:

vasa said...

The information very good and useful

Sudheer said...

Hi,

Could you please tell which variables are stored in Data Segment ?

Sudheer

naresh said...

Data segment contains
1) Uninitialized global/static variables.
2) Global/static variables initialized with non-zero values.

naresh said...

since uninitialized global/static variables take "0" value, we do not store them in data segment. They get stored in bss segment. Data segment contains only global/static variables which are initialized to non-zero value.

Avenging said...

Thank for this very useful information.
Could you please include into your analyse (program) the data segment?

Anonymous said...

Amiga OS 4.1 shows:

1. code
2. heap
3. bss
4. const's
5. stack

;-)

Anonymous said...

Consider a 32 bit system which can address 4G of memory. Note that the layout here talks about 0-3G used by the program. Memory over 3G is used by the kernel. Also note that none of the addresses printed by this program are also over the 3G limit.