Visualizations of the 8th generations of this Mandelbrot are available here.
All systems use the same algorithm, which calculates 128x256 dots of the Mandelbrot set, and they visualize it in almost the same way. Every dot is encoded with 4 bits, so every system has to output exactly 16 KB of graphical data for each picture. The algorithm implementations for all systems are heavily optimized. The graphics are implemented via direct hardware access, but that code is not as well optimized as the main Mandelbrot computation. The following systems have been tested.
# | System | Year | OS |
---|---|---|---|
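
The 16 KB figure quoted above follows directly from the picture geometry: 128x256 dots at 4 bits each. The sketch below only illustrates that arithmetic. The packing order (first dot in the high nibble) and the names `frame_bytes` and `pack_dots` are assumptions made for this example; each real system uses whatever layout its video hardware requires.

```python
WIDTH, HEIGHT = 128, 256   # dots per picture, as stated in the text
BITS_PER_DOT = 4           # every dot is encoded with 4 bits


def frame_bytes(width=WIDTH, height=HEIGHT, bits=BITS_PER_DOT):
    """Return the number of bytes needed for one picture."""
    return width * height * bits // 8


def pack_dots(dots):
    """Pack 4-bit dot values two per byte.

    Assumption for illustration: the first dot of each pair goes
    into the high nibble. Real video memory layouts differ.
    """
    if len(dots) % 2:
        raise ValueError("need an even number of dots")
    out = bytearray()
    for hi, lo in zip(dots[0::2], dots[1::2]):
        out.append(((hi & 0x0F) << 4) | (lo & 0x0F))
    return bytes(out)
```

Here `frame_bytes()` gives 16384 bytes, which is the 16 KB that every system has to write per picture.
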
The Mandelbrot algorithm uses the following parameters for the first 16 visualizations.
# | iterations | x-interval | y-interval |
---|---|---|---|
1 | 7 | [-4.64, 4.29] | [-4.5, 4.5] |
2 | 8 | [-4.09, 3.60] | [-3.75, 3.75] |
3 | 9 | [-3.69, 3] | [-3.25, 3.25] |
4 | 10 | [-3.21, 2.5] | [-2.75, 2.75] |
5 | 11 | [-2.89, 2.07] | [-2.5, 2.5] |
6 | 12 | [-2.77, 1.70] | [-2, 2] |
7 | 13 | [-2.83, 1.38] | [-1.5, 1.5] |
8 | 14 | [-2.60, 1.12] | [-1, 1] |
9 | 15 | [-2.34, 0.89] | [-0.75, 0.75] |
10 | 16 | [-2.03, 0.70] | [-0.75, 0.75] |
11 | 17 | [-1.95, 0.53] | [-0.75, 0.75] |
12 | 18 | [-2.10, 0.38] | [-0.75, 0.75] |
13 | 19 | [-2.22, 0.26] | [-0.75, 0.75] |
14 | 20 | [-2.33, 0.15] | [-0.75, 0.75] |
15 | 21 | [-2.43, 0.05] | [-0.75, 0.75] |
16 | 22 | [-2.51, -0.03] | [-0.75, 0.75] |
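
To show how one row of this table turns into a picture, here is a minimal floating-point sketch. It is not the tested code, which uses 16-bit signed arithmetic and a look-up table. The assumptions are that the 128 dots run along the x-interval and the 256 dots along the y-interval, that each dot is sampled at its center, and that a point counts as escaped when |z|² exceeds 4. The names `escape_count`, `render` and `PARAMS` are made up for this example, and how the counts are reduced to 4-bit colors is left out.

```python
# First three rows of the parameter table: (iterations, x-interval, y-interval).
PARAMS = [
    (7, (-4.64, 4.29), (-4.5, 4.5)),
    (8, (-4.09, 3.60), (-3.75, 3.75)),
    (9, (-3.69, 3.0), (-3.25, 3.25)),
]


def escape_count(cx, cy, max_iter):
    """Number of iterations of z -> z*z + c before |z|^2 exceeds 4."""
    x = y = 0.0
    for n in range(max_iter):
        if x * x + y * y > 4.0:
            return n
        x, y = x * x - y * y + cx, 2.0 * x * y + cy
    return max_iter


def render(max_iter, x_interval, y_interval, cols=128, rows=256):
    """Return rows x cols iteration counts for the given intervals."""
    x0, x1 = x_interval
    y0, y1 = y_interval
    dx = (x1 - x0) / cols
    dy = (y1 - y0) / rows
    return [
        [escape_count(x0 + (c + 0.5) * dx, y0 + (r + 0.5) * dy, max_iter)
         for c in range(cols)]
        for r in range(rows)
    ]
```

Under these assumptions, `render(*PARAMS[0])` produces 256 rows of 128 counts for picture #1. Each later row of the table adds one iteration and narrows the intervals.
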
All systems also provide timing information. The next table shows the timings for drawing pictures #1-16. The algorithm uses 16-bit signed arithmetic, so 16/32-bit systems have an advantage. The "Gr%" value is the share of the total time that is spent on the graphic output. The number in parentheses after @ is the approximate effective CPU frequency.
The color writing mode for the Corvette writes data for all 3 graphic planes simultaneously, so it actually updates 24 KB of video RAM on each screen in this mode.
Write modes 0 and 2 were used for the EGA. Both produce the same picture. I expect that the results for the VGA will be the same.
The results for the Amiga 500 with fast RAM are only about 1% faster so I haven't included them.
Some systems (the Apple IIgs, Atari ST, MSX, Geneve 9640, CoCo 3) have to use a slower way of drawing (rotated images) because their graphics hardware cannot show 256 raster lines as most of the other computers can. I estimate that this lowers the graphics performance of these systems by up to 50%.
The Amstrad CPC/PCW uses faster main Mandelbrot computational code than the MSX or Commodore 128/Z80 because the Amstrad can set up a memory layout that allows a faster way of working with the look-up table.
The BK and Geneve take advantage of their CPU's ability to ignore the low bit of the address when working with words. Other architectures have to use special instructions to clear this bit. This gives the BK and Geneve a speed boost of about 20%. It is possible to slightly reduce the accuracy of the calculations and to make the value of the bit insignificant, but the first Mandelbrot program of this series was for the BK and therefore the use of the bit remained the same.
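The low-bit handling described above can be pictured with a small model. This is only an illustration, not the code from any of the tested programs: it assumes a table of 16-bit signed words addressed by byte offset, where an odd offset has to resolve to the word at the even offset just below it. The names `make_word_table` and `read_word_masked` and the table contents are made up for the example.

```python
import struct


def make_word_table(values):
    """Store 16-bit signed values as consecutive little-endian words."""
    return struct.pack("<%dh" % len(values), *values)


def read_word_masked(table, byte_offset):
    """Read the word containing byte_offset.

    The explicit `& ~1` models the extra instruction that most
    architectures need to clear the low address bit. According to
    the text, the BK and Geneve CPUs ignore that bit on word
    accesses, so they can skip this step.
    """
    aligned = byte_offset & ~1
    return struct.unpack_from("<h", table, aligned)[0]
```

With a hypothetical table built by `make_word_table([0, 1, 4, 9, 16])`, byte offsets 6 and 7 both return the same word, 9. Dropping the mask is only safe on a CPU that discards the low bit itself, which is where the quoted speed boost comes from.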
The BBC Master Turbo uses OSWORD 6 to draw pixels, and this is not the fastest way possible.
The Commodore +4 results can be about 5% faster if we turn on the NTSC mode during vertical retrace time.
Qemulator appears to be about 7% faster than real hardware, so the QL results are adjusted according to the results provided by mk79. It seems that pcem is also about 7% faster than a real IBM PC XT/AT, but I have only indirect information about this, so I didn't apply any correction to the data from pcem.
The next table contains approximate values of the efficiency reciprocal (ER) for the tested CPUs at their effective frequencies. Each value is calculated by multiplying the total time of the Mandelbrot calculations for the first 16 pictures by the effective CPU frequency. The ER value reflects the efficiency of the CPU electronics: it gives the reciprocal of the CPU performance at 1 MHz.
Rank | Processor | Year | ER |
---|---|---|---|
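
The ER definition above amounts to a single multiplication. The sketch below assumes that the time is measured in seconds and the effective frequency in MHz, so that the ER can be read as the time the CPU would need at 1 MHz. The function names and the two sample machines are hypothetical and are not measured results.

```python
def efficiency_reciprocal(total_time_s, effective_mhz):
    """ER = total calculation time multiplied by effective frequency."""
    return total_time_s * effective_mhz


def rank_by_er(results):
    """Sort {name: (total_time_s, effective_mhz)} by ascending ER.

    A lower ER means more work done per clock cycle.
    """
    pairs = (
        (name, efficiency_reciprocal(t, f))
        for name, (t, f) in results.items()
    )
    return sorted(pairs, key=lambda pair: pair[1])


# Hypothetical numbers, for illustration only.
sample = {"CPU A": (100.0, 2.0), "CPU B": (60.0, 4.0)}
ranking = rank_by_er(sample)
```

In this made-up sample, CPU B finishes sooner in absolute terms, but CPU A ranks first because it needs fewer clock cycles for the same work.
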
It is also interesting to compare the code density for this task. Two values are provided for this: the total program size and the size of the main loop. The results are sorted by the size of the main loop.
Rank | Platform | CPU | Program size | Main loop size, bytes | Main loop size, LOC |
---|---|---|---|---|---|
1 | БK | T-11 | 902 | 32 | 13 |
2 | Geneve 9640, rotated | TMS9995 | 1720 | 36 | 14 |
3 | Geneve 9640, interlaced | TMS9995 | 1702 | 36 | 14 |
4 | Atari ST, mono | 68000 | 1090 | 42 | 18 |
5 | Atari ST, rotated | 68000 | 1209 | 42 | 18 |
6 | Macintosh | 68000 | 1337 | 42 | 18 |
7 | QL | 68008 | 2241 | 42 | 18 |
8 | Amiga | 68000, 68020 | 2385 | 42 | 18 |
9 | Pro-380, rotated | J-11 | 1219 | 46 | 17 |
10 | Pro-380 | J-11 | 1221 | 46 | 17 |
11 | IBM PC, mode 2 | 8088, 80286 | 919 | 46 | 20 |
12 | IBM PC, mode 0 | 8088, 80286 | 1019 | 46 | 20 |
13 | Tandy Coco 3 | 6809 | 1105 | 52 | 25 |
14 | Tandy Coco 3 | 6309 | 1109 | 54 | 24 |
15 | Amstrad CPC, 16c | Z80 | 1040 | 58 | 41 |
16 | Amstrad CPC, 4c | Z80 | 1064 | 58 | 41 |
17 | Amstrad PCW | Z80 | 1702 | 58 | 41 |
18 | MSX2, rotated | Z80 | 1432 | 63 | 44 |
19 | MSX2, interlaced | Z80 | 1481 | 63 | 44 |
20 | Commodore 128 | Z80 | 1601 | 63 | 44 |
21 | Archimedes | ARM2 | 1349 | 64 | 16 |
22 | Apple IIgs | 65816 | 1362 | 73 | 39 |
23 | Corvette, color | 8080 | 1121 | 81 | 63 |
24 | Corvette, planar | 8080 | 1162 | 81 | 63 |
25 | Vector-06C | 8080 | 1178 | 81 | 63 |
26 | BBC Micro, 16c | 6502 | 1376 | 131 | 81 |
27 | BBC Micro, 4c | 6502 | 1408 | 131 | 81 |
28 | BBC Master Turbo, 16c | 6502 | 1422 | 131 | 81 |
29 | Commodore 128 | 6502 | 1648 | 131 | 81 |
30 | Plus4, interlaced | 6502 | 1768 | 131 | 81 |
31 | Plus4, flashing | 6502 | 1807 | 131 | 81 |
The QL code is a Basic program which generates and uses ML code.
Sources for all these programs are available on GitHub. You can also download their executables there.
If anybody finds a way to speed up these implementations of the Mandelbrot calculations, or just creates new implementations, please inform me and I will update this page. Send your reports to zliztwr@yzandex.ru but remove every z from the address. Reports may also be sent directly to the project's GitHub page.
Many thanks to the people who helped: stasmas, reddie, mk79, BigEd, RichTW, MMS, stanp, leegleason, Hunta, ... and the staff of Yandex Museum.