(*
FastMM4-AVX (efficient synchronization and AVX1/AVX2/AVX512/ERMS/FSRM support for FastMM4)
- Copyright (C) 2017-2020 Ritlabs, SRL. All rights reserved.
- Copyright (C) 2020-2023 Maxim Masiutin. All rights reserved.
Written by Maxim Masiutin <[email protected]>
Version 1.0.7
This is a fork of the "Fast Memory Manager" (FastMM) v4.993 by Pierre le Riche
(see below for the original FastMM4 description)
What was added to FastMM4-AVX in comparison to the original FastMM4:
- Efficient synchronization
- improved synchronization between the threads; proper synchronization
techniques are used depending on context and availability, i.e., spin-wait
loops, umonitor / umwait, SwitchToThread, critical sections, etc.;
- used the "test, test-and-set" technique for the spin-wait loops; this
technique is recommended by Intel (see Section 11.4.3 "Optimization with
Spin-Locks" of the Intel 64 and IA-32 Architectures Optimization Reference
Manual) to determine the availability of the synchronization variable;
according to this technique, the first "test" is done via the normal
(non-locking) memory load to prevent excessive bus locking on each
iteration of the spin-wait loop; if the variable is available upon
the normal memory load of the first step ("test"), proceed to the
second step ("test-and-set") which is done via the bus-locking atomic
"xchg" instruction; however, this two-step approach of using "test" before
"test-and-set" can increase the cost of the uncontended case compared
to a single-step "test-and-set"; this may explain why the speed benefits
of FastMM4-AVX are more pronounced when the memory manager is called
from multiple threads in parallel, while in single-threaded use
there may be no benefit compared to the original FastMM4;
- the number of iterations of "pause"-based spin-wait loops is 5000,
before relinquishing to SwitchToThread();
- see https://stackoverflow.com/a/44916975 for more details on the
implementation of the "pause"-based spin-wait loops;
- using a normal memory store to release a lock:
FastMM4-AVX uses a normal memory store, i.e., the "mov" instruction, rather
than the bus-locking "xchg" instruction to write into the synchronization
variable (LockByte) to "release a lock" on a data structure,
see https://stackoverflow.com/a/44959764
for discussion on releasing a lock;
you may define "InterlockedRelease" to get the old behavior of the original
FastMM4.
- implemented dedicated lock and unlock procedures that operate with
synchronization variables (LockByte);
before that, locking operations were scattered throughout the code;
now the locking functions have meaningful names:
AcquireLockByte and ReleaseLockByte;
the values of the lock byte are now checked for validity when
FullDebugMode or DEBUG is defined, to detect cases when the same lock is
released twice, and other improper use of the lock bytes;
- added compile-time options "SmallBlocksLockedCriticalSection",
"MediumBlocksLockedCriticalSection" and "LargeBlocksLockedCriticalSection"
which are set by default (inside the FastMM4Options.inc file) as
conditional defines. If you undefine these options, you will get the
old locking mechanism of the original FastMM4 based on loops of Sleep() or
SwitchToThread().
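The "test, test-and-set" loop and the plain-store release described above can
be sketched in plain Free Pascal as follows. This is only an illustration under
stated assumptions, not the FastMM4-AVX code: AcquireSketch, ReleaseSketch and
the constants are hypothetical names; a LongInt is used instead of the one-byte
LockByte because the RTL InterlockedExchange works on 32-bit values; and the
"pause" instruction is left out because it would need inline assembly.

```pascal
program TtasSketch;
{$mode objfpc}

const
  cLockFree  = 0;
  cLockHeld  = 1;
  cSpinLimit = 5000; // iteration count quoted in the text above

// Hypothetical helper; it only mirrors the idea behind AcquireLockByte.
procedure AcquireSketch(var LockWord: LongInt);
var
  Spins: Integer;
begin
  Spins := 0;
  while True do
  begin
    // Step 1 ("test"): ordinary load, no bus lock.
    if LockWord = cLockFree then
      // Step 2 ("test-and-set"): atomic exchange; the returned old value
      // tells us whether this thread is the one that took the lock.
      if InterlockedExchange(LockWord, cLockHeld) = cLockFree then
        Exit;
    Inc(Spins);
    if Spins >= cSpinLimit then
    begin
      // Give up the time slice, then start spinning again.
      // (A multi-threaded Unix program would also need the cthreads unit.)
      ThreadSwitch;
      Spins := 0;
    end;
  end;
end;

// Hypothetical helper; it only mirrors the idea behind ReleaseLockByte.
procedure ReleaseSketch(var LockWord: LongInt);
begin
  // Ordinary store instead of an atomic exchange.
  LockWord := cLockFree;
end;

var
  DemoLock: LongInt = cLockFree;
begin
  AcquireSketch(DemoLock);
  WriteLn('after acquire: ', DemoLock);
  ReleaseSketch(DemoLock);
  WriteLn('after release: ', DemoLock);
end.
```

The sketch runs single-threaded, so the lock is taken on the first attempt and
the yield branch is never reached. The plain-store release assumes x86 store
ordering; a portable program would likely need an explicit barrier there.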
- AVX, AVX2 or AVX512 instructions for faster memory copy
- if the CPU supports AVX or AVX2, use the 32-byte YMM registers
for faster memory copy, and if the CPU supports AVX-512,
use the 64-byte ZMM registers for even faster memory copy;
- please note that the speed improvement from using AVX instructions is
negligible compared to the effect of efficient synchronization;
sometimes AVX instructions can even slow down the program because of AVX-SSE
transition penalties and reduced CPU frequency caused by AVX-512
instructions in some processors; use DisableAVX to turn AVX off completely,
or use DisableAVX1/DisableAVX2/DisableAVX512 to exclude a particular
AVX-related instruction set from compilation;
- if EnableAVX is defined, all memory blocks are aligned by 32 bytes, but
you can also use Align32Bytes define without AVX; please note that the memory
overhead is higher when the blocks are aligned by 32 bytes, because some
memory is lost by padding; however, if your CPU supports
"Fast Short REP MOVSB" (Ice Lake or newer), you can disable AVX, and align
by just 8 bytes, and this may even be faster because less memory is wasted
on alignment;
- with AVX, memory copy is secure - all XMM/YMM/ZMM registers used to copy
memory are cleared by vxorps/vpxor, so the leftovers of the copied memory
are not exposed in the XMM/YMM/ZMM registers;
- the code attempts to handle AVX-SSE transitions properly so as not to incur
the transition penalties: it calls vzeroupper only under AVX1, but not under
AVX2, since it slows down subsequent SSE code under Skylake / Kaby Lake;
- on AVX-512, writing to xmm16-xmm31 registers will not affect the turbo
clocks, and will not impose AVX-SSE transition penalties; therefore, when we
have AVX-512, we now only use x(y/z)mm16-31 registers.
- Speed improvements due to code optimization and proper techniques
- if the CPU supports Enhanced REP MOVSB/STOSB (ERMS), use this feature
for faster memory copy (under 32-bit or 64-bit) (see the EnableERMS define,
on by default, use DisableERMS to turn it off);
- if the CPU supports Fast Short REP MOVSB (FSRM), use this feature instead
of AVX;
- branch target alignment in assembly routines is only used when
EnableAsmCodeAlign is defined; Delphi incorrectly encodes conditional
jumps, i.e., it uses long, 6-byte instructions instead of short, 2-byte
ones, and this may affect branch prediction, so the benefits of branch target
alignment may not outweigh the disadvantage of affected branch prediction,
see https://stackoverflow.com/q/45112065;
- compare instructions + conditional jump instructions are put together
to allow macro-op fusion (which happens since Core2 processors, when
the first instruction is a CMP or TEST instruction and the second
instruction is a conditional jump instruction);
- multiplication and division by a constant that is a power of 2 are
replaced with shl/shr, because the Delphi64 compiler doesn't replace such
multiplications and divisions with shl/shr processor instructions,
and, according to the Intel Optimization Reference Manual, shl/shr is
faster than imul/idiv, at least for some processors.
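As a small illustration of the shl/shr point above, the following Free Pascal
sketch checks that shifting gives the same result as multiplying or dividing by
a power of two. The constant names and values are hypothetical and are not
taken from FastMM4-AVX. The equivalence holds for unsigned values; for negative
signed values "shr" and "div" do not agree.

```pascal
program ShiftSketch;
{$mode objfpc}

const
  cBlockSize  = 16; // hypothetical power-of-two constant
  cBlockShift = 4;  // 16 = 1 shl 4

var
  Offset, Index: Cardinal;
begin
  Offset := 4096;
  // Division by 16 written as a right shift by 4.
  Index := Offset shr cBlockShift;
  WriteLn(Index = Offset div cBlockSize);
  // Multiplication by 16 written as a left shift by 4.
  WriteLn((Index shl cBlockShift) = (Index * cBlockSize));
end.
```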
- Safer, cleaner code with stricter type adherence and better compatibility
- names assigned to some constants that used to be "magic constants",
i.e., unnamed numerical constants - plenty of them were present
throughout the whole code;
- removed some typecasts; the code is stricter to let the compiler
do the job, check everything and mitigate probable errors. You can
even compile the code with "integer overflow checking" and
"range checking", as well as with "typed @ operator" - for safer
code. Also added parentheses in the places where the typed @ operator
was used, to make clear whose address is taken;
- the compiler environment is more flexible now: you can now compile FastMM4
with, for example, typed "@" operator or any other option. Almost all
externally-set compiler directives are honored by FastMM except a few
(currently just one) - look for the "Compiler options for FastMM4" section
below to see what options cannot be externally set and are always
redefined by FastMM4 for itself - even if you set up these compiler options
differently outside FastMM4, they will be silently
redefined, and the new values will be used for FastMM4 only;
- the type of one-byte synchronization variables (accessed via "lock cmpxchg"
or "lock xchg") changed from Boolean to Byte for stricter type checking;
- those fixed-block-size memory move procedures that are not needed
(under the current bitness and alignment combinations) are
explicitly excluded from compiling, to not rely on the compiler
that is supposed to remove these functions after compilation;
- added a length parameter to what were the dangerous null-terminated string
operations via PAnsiChar, to prevent potential stack buffer overruns
(or maybe even stack-based exploitation?); there are still some Pascal
functions left where the argument is not yet checked. See the "todo" comments
to figure out where the length is not yet checked. Anyway, since these
memory functions are only used in Debug mode, i.e., in a development
environment, not in Release (production), the impact of this
"vulnerability" is minimal (albeit this is a questionable statement);
- removed all non-US-ASCII characters, to avoid using UTF-8 BOM, for
better compatibility with very early versions of Delphi (e.g., Delphi 5),
thanks to Valts Silaputnins;
- support for Lazarus 1.6.4 with FreePascal (the original FastMM4 4.992
requires modifications; it doesn't work under Lazarus 1.6.4 with FreePascal
out-of-the-box); also tested under Lazarus 1.8.2 / FPC 3.0.4 with the Win32
target; later versions should also be supported.
Here is the comparison of the original FastMM4 version 4.992, with default
options compiled for Win64 by Delphi 10.2 Tokyo (Release with Optimization),
and the current FastMM4-AVX branch ("AVX-br."). Under some multi-threading
scenarios, the FastMM4-AVX branch is more than twice as fast as the
original FastMM4. The tests were run on two different computers: one
with Xeon E5-2543v2 CPUs in 2 sockets, each with 6 physical cores
(12 logical threads), with only 5 physical cores per socket enabled for the
test application; the other with an i7-7700K CPU.
The "Multi-threaded allocate, use and free" and "NexusDB"
test cases from the FastCode Challenge Memory Manager test suite were used,
modified to run under 64-bit.
                      Xeon E5-2543v2 2*CPU           i7-7700K CPU
                     (allocated 20 logical       (8 logical threads,
                      threads, 10 physical        4 physical cores),
                      cores, NUMA), AVX-1               AVX-2

                     Orig.  AVX-br.    Ratio    Orig.  AVX-br.    Ratio
                    ------  -------  -------   ------  -------  -------
02-threads realloc   96552    59951   62.09%    65213    49471   75.86%
04-threads realloc   97998    39494   40.30%    64402    47714   74.09%
08-threads realloc   98325    33743   34.32%    64796    58754   90.68%
16-threads realloc  116273    45161   38.84%    70722    60293   85.25%
31-threads realloc  122528    53616   43.76%    70939    62962   88.76%
64-threads realloc  137661    54330   39.47%    73696    64824   87.96%
NexusDB 02 threads  122846    90380   73.72%    79479    66153   83.23%
NexusDB 04 threads  122131    53103   43.77%    69183    43001   62.16%
NexusDB 08 threads  124419    40914   32.88%    64977    33609   51.72%
NexusDB 12 threads  181239    55818   30.80%    83983    44658   53.18%
NexusDB 16 threads  135211    62044   43.61%    59917    32463   54.18%
NexusDB 31 threads  134815    48132   33.46%    54686    31184   57.02%
NexusDB 64 threads  187094    57672   30.25%    63089    41955   66.50%
The above tests have been run on 14-Jul-2017.
Here are some more test results (Compiled by Delphi 10.2 Update 3):
                      Xeon E5-2667v4 2*CPU           i9-7900X CPU
                     (allocated 32 logical       (20 logical threads,
                      threads, 16 physical       10 physical cores),
                      cores, NUMA), AVX-2              AVX-512

                     Orig.  AVX-br.    Ratio    Orig.  AVX-br.    Ratio
                    ------  -------  -------   ------  -------  -------
02-threads realloc   80544    60025   74.52%    66100    55854   84.50%
04-threads realloc   80751    47743   59.12%    64772    40213   62.08%
08-threads realloc   82645    32691   39.56%    62246    27056   43.47%
12-threads realloc   89951    43270   48.10%    65456    25853   39.50%
16-threads realloc   95729    56571   59.10%    67513    27058   40.08%
31-threads realloc  109099    97290   89.18%    63180    28408   44.96%
64-threads realloc  118589   104230   87.89%    57974    28951   49.94%
NexusDB 01 thread   160100   121961   76.18%    93341    95807  102.64%
NexusDB 02 threads  115447    78339   67.86%    77034    70056   90.94%
NexusDB 04 threads  107851    49403   45.81%    73162    50039   68.39%
NexusDB 08 threads  111490    36675   32.90%    70672    42116   59.59%
NexusDB 12 threads  148148    46608   31.46%    92693    53900   58.15%
NexusDB 16 threads  111041    38461   34.64%    66549    37317   56.07%
NexusDB 31 threads  123496    44232   35.82%    62552    34150   54.60%
NexusDB 64 threads  179924    62414   34.69%    83914    42915   51.14%
The above tests (on Xeon E5-2667v4 and i9) have been done on 03-May-2018.
Here is the single-threaded performance comparison in some selected
scenarios between FastMM v5.03 dated May 12, 2021 and FastMM4-AVX v1.05
dated May 20, 2021. FastMM4-AVX is compiled with default options. This
test was run on May 20, 2021, on an Intel Core i7-1065G7 CPU, Ice Lake
microarchitecture, base frequency: 1.3 GHz, max turbo frequency: 3.90 GHz,
4 cores, 8 threads. Compiled under Delphi 10.3 Update 3, 64-bit target.
Please note that these are the selected scenarios where FastMM4-AVX is
faster than FastMM5. In other scenarios, especially multi-threaded ones
with heavy contention, FastMM5 is faster.
                                         FastMM5  AVX-br.   Ratio
                                         -------  -------  ------
ReallocMem Small (1-555b) benchmark         1425     1135  79.65%
ReallocMem Medium (1-4039b) benchmark       3834     3309  86.31%
Block downsize                             12079    10305  85.31%
Address space creep benchmark              13283    12571  94.64%
Address space creep (larger blocks)        16066    13879  86.39%
Single-threaded reallocate and use          4395     3960  90.10%
Single-threaded tiny reallocate and use     8766     7097  80.96%
Single-threaded allocate, use and free     13912    13248  95.23%
You can find the program, used to generate the benchmark data,
at https://github.com/maximmasiutin/FastCodeBenchmark
FastMM4-AVX is released under a dual license, and you may choose to use it
under either the Mozilla Public License 2.0 (MPL 2.0, available from
https://www.mozilla.org/en-US/MPL/2.0/) or the GNU Lesser General Public
License Version 3, dated 29 June 2007 (LGPL 3, available from
https://www.gnu.org/licenses/lgpl.html).
FastMM4-AVX is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
FastMM4-AVX is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with FastMM4-AVX (see license_lgpl.txt and license_gpl.txt).
If not, see <http://www.gnu.org/licenses/>.
FastMM4-AVX Version History:
- 1.0.7 (21 March 2023) - implemented the use of umonitor/umwait instructions;
thanks to TetzkatLipHoka for the updated FullDebugMode to v1.64
of the original FastMM4.
- 1.0.6 (25 August 2021) - it can now be compiled with any alignment (8, 16, 32)
regardless of the target (x86, x64) and whether inline assembly is used
or not; the "PurePascal" conditional define disables inline assembly
altogether; however, in this case, efficient locking will not work since it
uses inline assembly; FreePascal now uses the original FreePascal compiler
mode, rather than the Delphi compatibility mode as before; resolved many
FreePascal compiler warnings; supported branch target alignment
in FreePascal inline assembly; small block types now always have
block sizes of 1024 and 2048 bytes, while in previous versions
instead of 1024-byte blocks there were 1056-byte blocks,
and instead of 2048-byte blocks there were 2176-byte blocks;
fixed Delphi compiler hints for 64-bit Release mode; Win32 and Win64
versions compiled under Delphi and FreePascal passed all the FastCode
validation suites.
- 1.05 (20 May 2021) - improved speed of releasing memory blocks on higher thread
contention. It is also possible to compile FastMM4-AVX without a single line of
inline assembly code. Renamed some conditional defines to be self-explanatory.
Rewrote some comments to be meaningful. Made it compile under FreePascal
for Linux 64-bit and 32-bit. Also made it compile under FreePascal for
Windows 32-bit and 64-bit. Memory move functions for 152, 184 and 216 bytes
were incorrect under Linux. The Move216AVX1 and Move216AVX2 Linux
implementations had invalid opcodes. Added support for GetFPCHeapStatus().
Optimizations of single-threaded performance. If you define
DisablePauseAndSwitchToThread, it will use
EnterCriticalSection/LeaveCriticalSection. An attempt to free a
memory block twice was not caught under 32-bit Delphi. Added SSE fixed block
copy routines for 32-bit targets. Added support for the "Fast Short REP MOVSB"
CPU feature. Removed redundant SSE code from 64-bit targets.
- 1.04 (06 October 2020) - improved use of AVX-512 instructions to avoid turbo
clock reduction and SSE/AVX transition penalty; made explicit order of
parameters for GetCPUID to avoid calling convention ambiguity that could
lead to incorrect use of registers and eventually to crashes, e.g., under Linux;
improved explanations and comments, e.g., about the use of the
synchronization techniques.
- 1.03 (04 May 2018) - minor fixes for the debug mode, FPC compatibility
and code readability cosmetic fixes.
- 1.02 (07 November 2017) - added and tested support for the AVX-512
instruction set.
- 1.01 (10 October 2017) - made the source code compile under Delphi5,
thanks to Valts Silaputnins.
- 1.00 (27 July 2017) - initial revision.
The original FastMM4 description follows:
Fast Memory Manager 4.993
Description:
A fast replacement memory manager for Embarcadero Delphi Win32 applications
that scales well under multi-threaded usage, is not prone to memory
fragmentation, and supports shared memory without the use of external .DLL
files.
Homepage:
Version 4: https://github.com/pleriche/FastMM4
Version 5: https://github.com/pleriche/FastMM5
Advantages:
- Fast
- Low overhead. FastMM is designed for an average of 5% and maximum of 10%
overhead per block.
- Supports up to 3GB of user mode address space under Windows 32-bit and 4GB
under Windows 64-bit. Add the "$SetPEFlags $20" option (in curly braces)
to your .dpr to enable this.
- Highly aligned memory blocks. Can be configured for either 8-byte, 16-byte
or 32-byte alignment.
- Good scaling under multi-threaded applications
- Intelligent reallocations. Avoids slow memory move operations through
not performing unnecessary downsizes and by having a minimum percentage
block size growth factor when an in-place block upsize is not possible.
- Resistant to address space fragmentation
- No external DLL required when sharing memory between the application and
external libraries (provided both use this memory manager)
- Optionally reports memory leaks on program shutdown. (This check can be set
to be performed only if Delphi is currently running on the machine, so end
users won't be bothered by the error message.)
- Supports Delphi 4 (or later), C++ Builder 4 (or later), Kylix 3.
Usage:
Delphi:
Place this unit as the very first unit under the "uses" section in your
project's .dpr file. When sharing memory between an application and a DLL
(e.g. when passing a long string or dynamic array to a DLL function), both the
main application and the DLL must be compiled using this memory manager (with
the required conditional defines set). There are some conditional defines
(inside FastMM4Options.inc) that may be used to tweak the memory manager. To
enable support for a user mode address space greater than 2GB you will have to
use the EditBin* tool to set the LARGE_ADDRESS_AWARE flag in the EXE header.
This informs Windows x64 or Windows 32-bit (with the /3GB option set) that the
application supports an address space larger than 2GB (up to 4GB). In Delphi 6
and later you can also specify this flag through the compiler directive
{$SetPEFlags $20}
*The EditBin tool ships with the MS Visual C compiler.
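As a rough sketch of the Delphi instructions above, a minimal console project
file might look like the following. The project name and the allocation in the
body are hypothetical and only serve as a demonstration; FastMM4.pas is assumed
to be on the unit search path, and the $SetPEFlags line is only needed if the
larger address space is wanted.

```pascal
program FastMMFirstSketch; // hypothetical project name

{$APPTYPE CONSOLE}
{$SetPEFlags $20} // LARGE_ADDRESS_AWARE flag, Delphi 6 and later

uses
  FastMM4,  // listed first, as the usage notes require
  SysUtils;

var
  P: Pointer;
begin
  // Any allocation from here on goes through the installed memory manager.
  GetMem(P, 1024);
  FreeMem(P);
  WriteLn('done');
end.
```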
C++ Builder 6:
Refer to the instructions inside FastMM4BCB.cpp.
License:
This work is copyright Professional Software Development / Pierre le Riche. It
is released under a dual license, and you may choose to use it under either the
Mozilla Public License 1.1 (MPL 1.1, available from
http://www.mozilla.org/MPL/MPL-1.1.html) or the GNU Lesser General Public
License 2.1 (LGPL 2.1, available from
http://www.opensource.org/licenses/lgpl-license.php). If you find FastMM useful
or you would like to support further development, a donation would be much
appreciated.
My PayPal account is:
Contact Details:
My contact details are shown below if you would like to get in touch with me.
If you use this memory manager I would like to hear from you: please e-mail me
your comments - good and bad.
E-mail:
Support:
If you have trouble using FastMM, you are welcome to drop me an e-mail at the
address above.
Disclaimer:
FastMM has been tested extensively with both single and multithreaded
applications on various hardware platforms, but unfortunately, I am not in a
position to make any guarantees. Use it at your own risk.
Acknowledgements (for version 4):
- Eric Grange for his RecyclerMM on which the earlier versions of FastMM were
based. RecyclerMM was what inspired me to try and write my own memory
manager back in early 2004.
- Primoz Gabrijelcic for several bugfixes and enhancements.
- Dennis Christensen for his tireless efforts with the Fastcode project:
helping to develop, optimize and debug the growing Fastcode library.
- JiYuan Xie for implementing the leak reporting code for C++ Builder.
- Sebastian Zierer for implementing the OS X support.
- Pierre Y. for his suggestions regarding the extension of the memory leak
checking options.
- Hanspeter Widmer for his suggestion to have an option to display install and
uninstall debug messages and moving options to a separate file, as well as
the new usage tracker.
- Anders Isaksson and Greg for finding and identifying the "DelphiIsRunning"
bug under Delphi 5.
- Francois Malan for various suggestions and bug reports.
- Craig Peterson for helping me identify the cache associativity issues that
could arise due to medium blocks always being an exact multiple of 256 bytes.
Also for various other bug reports and enhancement suggestions.
- Jarek Karciarz, Vladimir Ulchenko (Vavan) and Bob Gonder for their help in
implementing the BCB support.
- Ben Taylor for his suggestion to display the object class of all memory
leaks.
- Jean Marc Eber and Vincent Mahon (the Memcheck guys) for the call stack
trace code and also the method used to catch virtual method calls on freed
objects.
- Nahan Hyn for the suggestion to be able to enable or disable memory leak
reporting through a global variable (the "ManualLeakReportingControl"
option.)
- Leonel Togniolli for various suggestions with regard to enhancing the bug
tracking features of FastMM and other helpful advice.
- Joe Bain and Leonel Togniolli for the workaround to QC#10922 affecting
compilation under Delphi 2005.
- Robert Marquardt for the suggestion to make localisation of FastMM easier by
having all string constants together.
- Simon Kissel and Fikret Hasovic for their help in implementing Kylix support.
- Matthias Thoma, Petr Vones, Robert Rossmair and the rest of the JCL team for
their debug info library used in the debug info support DLL and also the
code used to check for a valid call site in the "raw" stack trace code.
- Andreas Hausladen for the suggestion to use an external DLL to enable the
reporting of debug information.
- Alexander Tabakov for various good suggestions regarding the debugging
facilities of FastMM.
- M. Skloff for some useful suggestions and bringing to my attention some
compiler warnings.
- Martin Aignesberger for the code to use madExcept instead of the JCL library
inside the debug info support DLL.
- Diederik and Dennis Passmore for the suggestion to be able to register
expected leaks.
- Dario Tiraboschi and Mark Gebauer for pointing out the problems that occur
when range checking and complete boolean evaluation is turned on.
- Arthur Hoornweg for notifying me of the image base being incorrect for
borlndmm.dll.
- Theo Carr-Brion and Hanspeter Widmer for finding the false alarm error
message "Block Header Has Been Corrupted" bug in FullDebugMode.
- Danny Heijl for reporting the compiler error in "release" mode.
- Omar Zelaya for reporting the BCB support regression bug.
- Dan Miser for various good suggestions, e.g. not logging expected leaks to
file, enhancements to the stack trace and messagebox functionality, etc.
- Arjen de Ruijter for fixing the bug in GetMemoryLeakType that caused it
to not properly detect expected leaks registered by class when in
"FullDebugMode".
- Aleksander Oven for reporting the installation problem when trying to use
FastMM in an application together with libraries that all use runtime
packages.
- Kristofer Skaug for reporting the bug that sometimes causes the leak report
to be shown, even when all the leaks have been registered as expected leaks.
Also for some useful enhancement suggestions.
- Guenter Schoch for the "RequireDebuggerPresenceForLeakReporting" option.
- Jan Schlueter for the "ForceMMX" option.
- Hallvard Vassbotn for various good enhancement suggestions.
- Mark Edington for some good suggestions and bug reports.
- Paul Ishenin for reporting the compilation error when the NoMessageBoxes
option is set and also the missing call stack entries issue when "raw" stack
traces are enabled, as well as for the Russian translation.
- Cristian Nicola for reporting the compilation bug when the
CatchUseOfFreedInterfaces option was enabled (4.40).
- Mathias Rauen (madshi) for improving the support for madExcept in the debug
info support DLL.
- Roddy Pratt for the BCB5 support code.
- Rene Mihula for the Czech translation and the suggestion to have dynamic
loading of the FullDebugMode DLL as an option.
- Artur Redzko for the Polish translation.
- Bart van der Werf for helping me solve the DLL unload order problem when
using the debug mode borlndmm.dll library, as well as various other
suggestions.
- JRG ("The Delphi Guy") for the Spanish translation.
- Justus Janssen for Delphi 4 support.
- Vadim Lopushansky and Charles Vinal for reporting the Delphi 5 compiler
error in version 4.50.
- Johni Jeferson Capeletto for the Brazilian Portuguese translation.
- Kurt Fitzner for reporting the BCB6 compiler error in 4.52.
- Michal Niklas for reporting the Kylix compiler error in 4.54.
- Thomas Speck and Uwe Queisser for German translations.
- Zaenal Mutaqin for the Indonesian translation.
- Carlos Macao for the Portuguese translation.
- Michael Winter for catching the performance issue when reallocating certain
block sizes.
- dzmitry[li] for the Belarussian translation.
- Marcelo Montenegro for the updated Spanish translation.
- Jud Cole for finding and reporting the bug which may trigger a read access
violation when upsizing certain small block sizes together with the
"UseCustomVariableSizeMoveRoutines" option.
- Zdenek Vasku for reporting and fixing the memory manager sharing bug
affecting Windows 95/98/Me.
- RB Winston for suggesting the improvement to GExperts "backup" support.
- Thomas Schulz for reporting the bug affecting large address space support
under FullDebugMode, as well as the recursive call bug when attempting to
report memory leaks when EnableMemoryLeakReporting is disabled.
- Luigi Sandon for the Italian translation.
- Werner Bochtler for various suggestions and bug reports.
- Markus Beth for suggesting the "NeverSleepOnThreadContention" option.
- JiYuan Xie for the Simplified Chinese translation.
- Andrey Shtukaturov for the updated Russian translation, as well as the
Ukrainian translation.
- Dimitry Timokhov for finding two elusive bugs in the memory leak class
detection code.
- Paulo Moreno for fixing the AllocMem bug in FullDebugMode that prevented
large blocks from being cleared.
- Vladimir Bochkarev for the suggestion to remove some unnecessary code if the
MM sharing mechanism is disabled.
- Loris Luise for the version constant suggestion.
- J.W. de Bokx for the MessageBox bugfix.
- Igor Lindunen for reporting the bug that caused the Align16Bytes option to
not work in FullDebugMode.
- Ionut Muntean for the Romanian translation.
- Florent Ouchet for the French translation.
- Marcus Moennig for the ScanMemoryPoolForCorruptions suggestion and the
suggestion to have the option to scan the memory pool before every
operation when in FullDebugMode.
- Francois Piette for bringing to my attention that
ScanMemoryPoolForCorruptions was not thread safe.
- Michael Rabatscher for reporting some compiler warnings.
- QianYuan Wang for the Simplified Chinese translation of FastMM4Options.inc.
- Maurizio Lotauro and Christian-W. Budde for reporting some Delphi 5
compiler errors.
- Patrick van Logchem for the DisableLoggingOfMemoryDumps option.
- Norbert Spiegel for the BCB4 support code.
- Uwe Schuster for the improved string leak detection code.
- Murray McGowan for improvements to the usage tracker.
- Michael Hieke for the SuppressFreeMemErrorsInsideException option as well
as a bugfix to GetMemoryMap.
- Richard Bradbrook for fixing the Windows 95 FullDebugMode support that was
broken in version 4.94.
- Zach Saw for the suggestion to (optionally) use SwitchToThread when
waiting for a lock on a shared resource to be released.
- Everyone who has made donations. Thanks!
- Any other Fastcoders or supporters that I have forgotten, and also everyone
that helped with the older versions.
Change log:
Version 1.00 (28 June 2004):
- First version (called PSDMemoryManager). Based on RecyclerMM (free block
stack approach) by Eric Grange.
Version 2.00 (3 November 2004):
- Complete redesign and rewrite from scratch. Name changed to FastMM to
reflect this fact. Uses a linked-list approach. Is faster, has less memory
overhead, and will now catch most bad pointers on FreeMem calls.
Version 3.00 (1 March 2005):
- Another rewrite. Reduced the memory overhead by: (a) not having a separate
memory area for the linked list of free blocks (it uses space inside the free
blocks themselves), (b) allocating batch managers as part of chunks, and (c)
reducing the size of the block size lookup table. This should make FastMM
more CPU cache friendly.
Version 4.00 (7 June 2005):
- Yet another rewrite. FastMM4 is in fact three memory managers in one: Small
blocks (up to a few KB) are managed through the binning model in the same
way as previous versions, medium blocks (from a few KB up to approximately
256K) are allocated in a linked-list fashion, and large blocks are grabbed
directly from the system through VirtualAlloc. This 3-layered design allows
very fast operation with the most frequently used block sizes (small
blocks), while also minimizing fragmentation and imparting significant
overhead savings with blocks larger than a few KB.
Version 4.01 (8 June 2005):
- Added the options "RequireDebugInfoForLeakReporting" and
"RequireIDEPresenceForLeakReporting" as suggested by Pierre Y.
- Fixed the "DelphiIsRunning" function not working under Delphi 5, and
consequently, no leak checking. (Reported by Anders Isaksson and Greg.)
Version 4.02 (8 June 2005):
- Fixed the compilation error when both the "AssumeMultiThreaded" and
"CheckHeapForCorruption" options were set. (Reported by Francois Malan.)
Version 4.03 (9 June 2005):
- Added descriptive error messages when FastMM4 cannot be installed because
another MM has already been installed or memory has already been allocated.
Version 4.04 (13 June 2005):
- Added a small fixed offset to the size of medium blocks (previously always
exact multiples of 256 bytes). This makes performance problems due to CPU
cache associativity limitations much less likely. (Reported by Craig
Peterson.)
Version 4.05 (17 June 2005):
- Added the Align16Bytes option. Disable this option to drop the 16 byte
alignment restriction and reduce alignment to 8 bytes for the smallest
block sizes. Disabling Align16Bytes should lower memory consumption at the
cost of complicating the use of aligned SSE move instructions. (Suggested
by Craig Peterson.)
- Added a support unit for C++ Builder 6 - Add FastMM4BCB.cpp and
FastMM4.pas to your BCB project to use FastMM instead of the RTL MM. Memory
leak checking is not supported because (unfortunately) once an MM is
installed under BCB you cannot uninstall it... at least not without
modifying the RTL code in exit.c or patching the RTL code at runtime. (Thanks
to Jarek Karciarz, Vladimir Ulchenko and Bob Gonder.)
Version 4.06 (22 June 2005):
- Displays the class of all leaked objects on the memory leak report and also
tries to identify leaked long strings. Previously it only displayed the
sizes of all leaked blocks. (Suggested by Ben Taylor.)
- Added support for displaying the sizes of medium and large block memory
leaks. Previously it only displayed details for small block leaks.
Version 4.07 (22 June 2005):
- Fixed the detection of the class of leaked objects not working under
Windows 98/Me.
Version 4.08 (27 June 2005):
- Added a BorlndMM.dpr project to allow you to build a borlndmm.dll that uses
FastMM4 instead of the default memory manager. You may replace the old
DLL in the Delphi \Bin directory to make the IDE use this memory manager
instead.
Version 4.09 (30 June 2005):
- Included a patch fix for the bug affecting replacement borlndmm.dll files
with Delphi 2005 (QC#14007). Compile the patch, close Delphi, and run it
once to patch your vclide90.bpl. You will now be able to use the
replacement borlndmm.dll to speed up the Delphi 2005 IDE as well.
Version 4.10 (7 July 2005):
- Due to QC#14070 ("Delphi IDE attempts to free memory after the shutdown
code of borlndmm.dll has been called"), FastMM cannot be uninstalled
safely when used inside a replacement borlndmm.dll for the IDE. Added a
conditional define "NeverUninstall" for this purpose.
- Added the "FullDebugMode" option to pad all blocks with a header and footer
to help you catch memory overwrite bugs in your applications. All blocks
returned to FreeMem are also zeroed out to help catch bugs involving the
use of previously freed blocks. Also catches attempts at calling virtual
methods of freed objects provided the block in question has not been reused
since the object was freed. Displays stack traces on error to aid debugging.
- Added the "LogErrorsToFile" option to log all errors to a text file in the
same folder as the application.
- Added the "ManualLeakReportingControl" option (suggested by Nahan Hyn) to
enable control over whether the memory leak report should be done or not
via a global variable.
Version 4.11 (7 July 2005):
- Fixed a compilation error under Delphi 2005 due to QC#10922. (Thanks to Joe
Bain and Leonel Togniolli.)
- Fixed leaked object classes not displaying in the leak report in
"FullDebugMode".
Version 4.12 (8 July 2005):
- Moved all the string constants to one place to make it easier to do
translations into other languages. (Thanks to Robert Marquardt.)
- Added support for Kylix. Some functionality is currently missing: No
support for detecting the object class on leaks and also no MM sharing.
(Thanks to Simon Kissel and Fikret Hasovic).
Version 4.13 (11 July 2005):
- Added the FastMM_DebugInfo.dll support library to display debug info for
stack traces.
- Stack traces for the memory leak report are now logged to the log file in
"FullDebugMode".
Version 4.14 (14 July 2005):
- Fixed string leaks not being detected as such in "FullDebugMode". (Thanks
to Leonel Togniolli.)
- Fixed the compilation error in "FullDebugMode" when "LogErrorsToFile" is
not set. (Thanks to Leonel Togniolli.)
- Added a "Release" option to allow the grouping of various options and to
make it easier to make debug and release builds. (Thanks to Alexander
Tabakov.)
- Added a "HideMemoryLeakHintMessage" option to not display the hint below
the memory leak message. (Thanks to Alexander Tabakov.)
- Changed the fill character for "FullDebugMode" from zero to $80 to be able
to differentiate between invalid memory accesses using nil pointers and
invalid memory accesses using fields of freed objects. FastMM tries to
reserve the 64K block starting at $80800000 at startup to ensure that an
A/V will occur when this block is accessed. (Thanks to Alexander Tabakov.)
- Fixed some compiler warnings. (Thanks to M. Skloff.)
- Fixed some display bugs in the memory leak report. (Thanks to Leonel
Togniolli.)
- Added a "LogMemoryLeakDetailToFile" option. Some applications leak a lot of
memory and can make the log file grow very large very quickly.
- Added the option to use madExcept instead of the JCL Debug library in the
debug info support DLL. (Thanks to Martin Aignesberger.)
- Added procedures "GetMemoryManagerState" and "GetMemoryMap" to retrieve
statistics about the current state of the memory manager and memory pool.
(A usage tracker form together with a demo is also available.)
Version 4.15 (14 July 2005):
- Fixed a false 4GB(!) memory leak reported in some instances.
Version 4.16 (15 July 2005):
- Added the "CatchUseOfFreedInterfaces" option to catch the use of interfaces
of freed objects. This option is not compatible with checking that a freed
block has not been modified, so enable this option only when hunting an
invalid interface reference. (Only relevant if "FullDebugMode" is set.)
- During shutdown FastMM now checks that all free blocks have not been
modified since being freed. (Only when "FullDebugMode" is set and
"CatchUseOfFreedInterfaces" is disabled.)
Version 4.17 (15 July 2005):
- Added the AddExpectedMemoryLeaks and RemoveExpectedMemoryLeaks procedures to
register/unregister expected leaks, thus preventing the leak report from
displaying if only expected leaks occurred. (Thanks to Diederik and Dennis
Passmore for the suggestion.) (Note: these functions were renamed in later
versions.)
- Fixed the "LogMemoryLeakDetailToFile" option not logging memory leak detail to file
as it is supposed to. (Thanks to Leonel Togniolli.)
Version 4.18 (18 July 2005):
- Fixed some issues when range checking or complete boolean evaluation is
switched on. (Thanks to Dario Tiraboschi and Mark Gebauer.)
- Added the "OutputInstallUninstallDebugString" option to display a message when
FastMM is installed or uninstalled. (Thanks to Hanspeter Widmer.)
- Moved the options to a separate include file. (Thanks to Hanspeter Widmer.)
- Moved message strings to a separate file for easy translation.
Version 4.19 (19 July 2005):
- Fixed Kylix support that was broken in 4.14.
Version 4.20 (20 July 2005):
- Fixed a false memory overwrite report at shutdown in "FullDebugMode". If you
consistently got a "Block Header Has Been Corrupted" error message during
shutdown at address $xxxx0070 then it was probably a false alarm. (Thanks to
Theo Carr-Brion and Hanspeter Widmer.)
Version 4.21 (27 July 2005):
- Minor change to the block header flags to make it possible to immediately
tell whether a medium block is being used as a small block pool or not.
(Simplifies the leak checking and status reporting code.)
- Expanded the functionality around the management of expected memory leaks.
- Added the "ClearLogFileOnStartup" option. Deletes the log file during
initialization. (Thanks to M. Skloff.)
- Changed "OutputInstallUninstallDebugString" to use OutputDebugString instead
of MessageBox. (Thanks to Hanspeter Widmer.)
Version 4.22 (1 August 2005):
- Added a FastAllocMem function that avoids an unnecessary FillChar call with
large blocks.
- Changed large block resizing behavior to be a bit more conservative. Large
blocks will be downsized if the new size is less than half of the old size
(the threshold was a quarter previously).
Version 4.23 (6 August 2005):
- Fixed BCB6 support (Thanks to Omar Zelaya).
- Renamed "OutputInstallUninstallDebugString" to "UseOutputDebugString", and
added debug string output on memory leak or error detection.
Version 4.24 (11 August 2005):
- Added the "NoMessageBoxes" option to suppress the display of message boxes,
which is useful for services that should not be interrupted. (Thanks to Dan
Miser).
- Changed the stack trace code to return the line number of the caller and not
the line number of the return address. (Thanks to Dan Miser).
Version 4.25 (15 August 2005):
- Fixed GetMemoryLeakType not detecting expected leaks registered by class
when in "FullDebugMode". (Thanks to Arjen de Ruijter).
Version 4.26 (18 August 2005):
- Added a "UseRuntimePackages" option that allows FastMM to be used in a main
application together with DLLs that all use runtime packages. (Thanks to
Aleksander Oven.)
Version 4.27 (24 August 2005):
- Fixed a bug that sometimes caused the leak report to be shown even though all
leaks were registered as expected leaks. (Thanks to Kristofer Skaug.)
Version 4.29 (30 September 2005):
- Added the "RequireDebuggerPresenceForLeakReporting" option to only display
the leak report if the application is run inside the IDE. (Thanks to Guenter
Schoch.)
- Added the "ForceMMX" option, which when disabled will check the CPU for
MMX compatibility before using MMX. (Thanks to Jan Schlueter.)
- Added the module name to the title of error dialogs to more easily identify
which application caused the error. (Thanks to Kristofer Skaug.)
- Added an ASCII dump to the "FullDebugMode" memory dumps. (Thanks to Hallvard
Vassbotn.)
- Added the option "HideExpectedLeaksRegisteredByPointer" to suppress the
display and logging of expected memory leaks that were registered by pointer.
(Thanks to Dan Miser.) Leaks registered by size or class are often ambiguous,
so these expected leaks are always logged to file (in FullDebugMode) and are
never hidden from the leak display (only displayed if there is at least one
unexpected leak).
- Added a procedure "GetRegisteredMemoryLeaks" to return a list of all
registered memory leaks. (Thanks to Dan Miser.)
- Added the "RawStackTraces" option to perform "raw" stack traces, negating
the need for stack frames. This will usually result in more complete stack
traces in FullDebugMode error reports, but it is significantly slower.
(Thanks to Hallvard Vassbotn, Dan Miser and the JCL team.)
Version 4.31 (2 October 2005):
- Fixed the crash bug when both "RawStackTraces" and "FullDebugMode" were
enabled. (Thanks to Dan Miser and Mark Edington.)
Version 4.33 (6 October 2005):
- Added a header corruption check to all memory blocks that are identified as
leaks in FullDebugMode. This allows better differentiation between memory
pool corruption bugs and actual memory leaks.
- Fixed the stack overflow bug when using "RawStackTraces".
Version 4.35 (6 October 2005):
- Fixed a compilation error when the "NoMessageBoxes" option is set. (Thanks
to Paul Ishenin.)
- Before performing a "raw" stack trace, FastMM now checks whether exception
handling is in place. If exception handling is not in place FastMM falls
back to stack frame tracing. (Exception handling is required to handle the
possible A/Vs when reading invalid call addresses. Exception handling is
usually available, except when SysUtils hasn't been initialized yet or
after SysUtils has been finalized.)
Version 4.37 (8 October 2005):
- Fixed the missing call stack trace entry issue when dynamically loading DLLs.
(Thanks to Paul Ishenin.)
Version 4.39 (12 October 2005):
- Restored the performance with "RawStackTraces" enabled back to the level it
was in 4.35.
- Fixed the stack overflow error when using "RawStackTraces" that I thought I
had fixed in 4.31, but unfortunately didn't. (Thanks to Craig Peterson.)
Version 4.40 (13 October 2005):
- Improved "RawStackTraces" to have fewer incorrect extra entries. (Thanks to
Craig Peterson.)
- Added the Russian (by Paul Ishenin) and Afrikaans translations of
FastMM4Messages.pas.
Version 4.42 (13 October 2005):
- Fixed the compilation error when "CatchUseOfFreedInterfaces" is enabled.
(Thanks to Cristian Nicola.)
Version 4.44 (25 October 2005):
- Implemented a FastGetHeapStatus function in analogy with GetHeapStatus.
(Suggested by Cristian Nicola.)
- Shifted more of the stack trace code over to the support dll to allow third
party vendors to make available their own stack tracing and stack trace
logging facilities.
- Mathias Rauen (madshi) improved the support for madExcept in the debug info
support DLL. Thanks!
- Added support for BCB5. (Thanks to Roddy Pratt.)
- Added the Czech translation by Rene Mihula.
- Added the "DetectMMOperationsAfterUninstall" option. This will catch
attempts to use the MM after FastMM has been uninstalled, and is useful for
debugging.
Version 4.46 (26 October 2005):
- Renamed FastMM_DebugInfo.dll to FastMM_FullDebugMode.dll and made the
dependency on this library a static one. This solves a DLL unload order
problem when using FullDebugMode together with the replacement
borlndmm.dll. (Thanks to Bart van der Werf.)
- Added the Polish translation by Artur Redzko.
Version 4.48 (10 November 2005):
- Fixed class detection for objects leaked in dynamically loaded DLLs that
were relocated.
- Fabio Dell'Aria implemented support for EurekaLog in the FullDebugMode
support DLL. Thanks!
- Added the Spanish translation by JRG ("The Delphi Guy").
Version 4.49 (10 November 2005):
- Implemented support for installing replacement AllocMem and leak
registration mechanisms for Delphi/BCB versions that support it.
- Added support for Delphi 4. (Thanks to Justus Janssen.)
Version 4.50 (5 December 2005):
- Renamed the ReportMemoryLeaks global variable to ReportMemoryLeaksOnShutdown
to be more consistent with the Delphi 2006 memory manager.
- Improved the handling of large blocks. Large blocks can now consist of
several consecutive segments allocated through VirtualAlloc. This
significantly improves speed when frequently resizing large blocks, since
these blocks can now often be upsized in-place.
Version 4.52 (7 December 2005):
- Fixed the compilation error with Delphi 5. (Thanks to Vadim Lopushansky and
Charles Vinal for reporting the error.)
Version 4.54 (15 December 2005):
- Added the Brazilian Portuguese translation by Johni Jeferson Capeletto.
- Fixed the compilation error with BCB6. (Thanks to Kurt Fitzner.)
Version 4.56 (20 December 2005):
- Fixed the Kylix compilation problem. (Thanks to Michal Niklas.)
Version 4.58 (1 February 2006):
- Added the German translations by Thomas Speck and Uwe Queisser.
- Added the Indonesian translation by Zaenal Mutaqin.
- Added the Portuguese translation by Carlos Macao.
Version 4.60 (21 February 2006):
- Fixed a performance issue due to an unnecessary block move operation when
allocating a block in the range 1261-1372 bytes and then reallocating it in
the range 1373-1429 bytes twice. (Thanks to Michael Winter.)
- Added the Belarusian translation by dzmitry[li].
- Added the updated Spanish translation by Marcelo Montenegro.
- Added a new option "EnableSharingWithDefaultMM". This option allows FastMM
to be shared with the default MM of Delphi 2006. It is on by default, but
MM sharing has to be enabled otherwise it has no effect (refer to the
documentation for the "ShareMM" and "AttemptToUseSharedMM" options).
Version 4.62 (22 February 2006):
- Fixed a possible read access violation in the MoveX16LP routine when the
UseCustomVariableSizeMoveRoutines option is enabled. (Thanks to Jud Cole for
some great detective work in finding this bug.)
- Improved the downsizing behaviour of medium blocks to better correlate with
the reallocation behaviour of small blocks. This change reduces the number
of transitions between small and medium block types when reallocating blocks
in the 0.7K to 2.6K range. It cuts down on the number of memory move
operations and improves performance.
Version 4.64 (31 March 2006):
- Added the following functions for use with FullDebugMode (and added the
exports to the replacement BorlndMM.dll): SetMMLogFileName,
GetCurrentAllocationGroup, PushAllocationGroup, PopAllocationGroup and
LogAllocatedBlocksToFile. The purpose of these functions is to allow you to
identify and log related memory leaks while your application is still
running.
- Fixed a bug in the memory manager sharing mechanism affecting Windows
95/98/ME. (Thanks to Zdenek Vasku.)
Version 4.66 (9 May 2006):
- Added a hint comment in this file so that FastMM4Messages.pas will also be
backed up by GExperts. (Thanks to RB Winston.)
- Fixed a bug affecting large address space (> 2GB) support under
FullDebugMode. (Thanks to Thomas Schulz.)
Version 4.68 (3 July 2006):
- Added the Italian translation by Luigi Sandon.
- If FastMM is used inside a DLL it will now use the name of the DLL as the
base for the log file name. (Previously it always used the name of the main
application executable file.)
- Fixed a rare A/V when both the FullDebugMode and RawStackTraces options were
enabled. (Thanks to Primoz Gabrijelcic.)
- Added the "NeverSleepOnThreadContention" option. This option may improve
performance if the ratio of the number of active threads to the number
of CPU cores is low (typically < 2). This option is only useful for 4+ CPU
systems; it almost always hurts performance on single and dual CPU systems.
(Thanks to Werner Bochtler and Markus Beth.)
Version 4.70 (4 August 2006):
- Added the Simplified Chinese translation by JiYuan Xie.
- Added the updated Russian as well as the Ukrainian translation by Andrey
Shtukaturov.
- Fixed two bugs in the leak class detection code that would sometimes fail
to detect the class of leaked objects and strings, and report them as
'unknown'. (Thanks to Dimitry Timokhov.)
Version 4.72 (24 September 2006):
- Fixed a bug that caused AllocMem to not clear blocks > 256K in
FullDebugMode. (Thanks to Paulo Moreno.)
Version 4.74 (9 November 2006):
- Fixed a bug in the segmented large block functionality that could lead to
an application freeze when upsizing blocks greater than 256K in a
multithreaded application (one of those "what the heck was I thinking?"
type bugs).
Version 4.76 (12 January 2007):
- Changed the RawStackTraces code in the FullDebugMode DLL
to prevent it from modifying the Windows "GetLastError" error code.
(Thanks to Primoz Gabrijelcic.)
- Fixed a threading issue when the "CheckHeapForCorruption" option was
enabled, but the "FullDebugMode" option was disabled. (Thanks to Primoz
Gabrijelcic.)
- Removed some unnecessary startup code when the MM sharing mechanism is
disabled. (Thanks to Vladimir Bochkarev.)
- In FullDebugMode leaked blocks would sometimes be reported as belonging to
the class "TFreedObject" if they were allocated but never used. Such blocks
will now be reported as "unknown". (Thanks to Francois Malan.)
- In recent versions the replacement borlndmm.dll created a log file (when
enabled) that used the "borlndmm" prefix instead of the application name.
It is now fixed to use the application name, however if FastMM is used
inside other DLLs the name of those DLLs will be used. (Thanks to Bart van
der Werf.)
- Added a "FastMMVersion" constant. (Suggested by Loris Luise.)
- Fixed an issue with error message boxes not displaying under certain
configurations. (Thanks to J.W. de Bokx.)
- FastMM will now display only one error message at a time. If many errors
occur in quick succession, only the first error will be shown (but all will
be logged). This avoids a stack overflow with badly misbehaved programs.
(Thanks to Bart van der Werf.)
- Added a LoadDebugDLLDynamically option to be used in conjunction with
FullDebugMode. In this mode FastMM_FullDebugMode.dll is loaded dynamically.
If the DLL cannot be found, stack traces will not be available. (Thanks to
Rene Mihula.)
Version 4.78 (1 March 2007):
- The MB_DEFAULT_DESKTOP_ONLY constant that is used when displaying message
boxes since 4.76 is not defined under Kylix, and the source would thus not
compile. That constant is now defined. (Thanks to Werner Bochtler.)
- Moved the medium block locking code that was duplicated in several places
to a subroutine to reduce code size. (Thanks to Hallvard Vassbotn.)
- Fixed a bug in the leak registration code that sometimes caused registered
leaks to be reported erroneously. (Thanks to Primoz Gabrijelcic.)
- Added the NoDebugInfo option (on by default) that suppresses the generation
of debug info for the FastMM4.pas unit. This will prevent the integrated
debugger from stepping into the memory manager. (Thanks to Primoz
Gabrijelcic.)
- Increased the default stack trace depth in FullDebugMode from 9 to 10 to
ensure that the Align16Bytes setting works in FullDebugMode. (Thanks to
Igor Lindunen.)
- Updated the Czech translation. (Thanks to Rene Mihula.)
Version 4.84 (7 July 2008):
- Added the Romanian translation. (Thanks to Ionut Muntean.)
- Optimized the GetMemoryMap procedure to improve speed.
- Added the GetMemoryManagerUsageSummary function that returns a summary of
the GetMemoryManagerState call. (Thanks to Hallvard Vassbotn.)
- Added the French translation. (Thanks to Florent Ouchet.)
- Added the "AlwaysAllocateTopDown" FullDebugMode option to help with
catching bad pointer arithmetic code in an address space > 2GB. This option
is enabled by default.
- Added the "InstallOnlyIfRunningInIDE" option. Enable this option to
only install FastMM as the memory manager when the application is run
inside the Delphi IDE. This is useful when you want to deploy the same EXE
that you use for testing, but only want the debugging features active on
development machines. When this option is enabled and the application is
not being run inside the IDE, then the default Delphi memory manager will
be used (which, since Delphi 2006, is FastMM without FullDebugMode). This
option is off by default.
- Added the "FullDebugModeInIDE" option. This is a convenient shorthand for
enabling FullDebugMode, InstallOnlyIfRunningInIDE and
LoadDebugDLLDynamically. This causes FastMM to be used in FullDebugMode
when the application is being debugged on development machines, and the
default memory manager when the same executable is deployed. This allows
the debugging and deployment of an application without having to compile
separate executables. This option is off by default.
- Added a ScanMemoryPoolForCorruptions procedure that checks the entire
memory pool for corruptions and raises an exception if one is found. It can
be called at any time, but is only available in FullDebugMode. (Thanks to
Marcus Moennig.)
- Added a global variable "FullDebugModeScanMemoryPoolBeforeEveryOperation".
When this variable is set to true and FullDebugMode is enabled, then the
entire memory pool is checked for consistency before every GetMem, FreeMem
and ReallocMem operation. An "Out of Memory" error is raised if a
corruption is found (and this variable is set to false to prevent recursive
errors). This obviously incurs a massive performance hit, so enable it only
when hunting for elusive memory corruption bugs. (Thanks to Marcus Moennig.)
- Fixed a bug in AllocMem that caused the FPU stack to be shifted by one
position.
- Changed the default for option "EnableMMX" to false, since using MMX may
cause unexpected behaviour in code that passes parameters on the FPU stack
(like some "compiler magic" routines, e.g. VarFromReal).
- Removed the "EnableSharingWithDefaultMM" option. This is now the default
behaviour and cannot be disabled. (FastMM will always try to share memory
managers between itself and the default memory manager when memory manager
sharing is enabled.)
- Introduced a new memory manager sharing mechanism based on memory mapped
files. This solves compatibility issues with console and service
applications. This sharing mechanism currently runs in parallel with the
old mechanism, but the old mechanism can be disabled by undefining
"EnableBackwardCompatibleMMSharing" in FastMM4Options.inc.
- Fixed the recursive call error when the EnableMemoryLeakReporting option
is disabled and an attempt is made to register a memory leak under Delphi
2006 or later. (Thanks to Thomas Schulz.)
- Added a global variable "SuppressMessageBoxes" to enable or disable
message boxes at runtime. (Thanks to Craig Peterson.)
- Added the leak reporting code for C++ Builder, as well as various other
C++ Builder bits written by JiYuan Xie. (Thank you!)