11 Dec 08:54
Changes to GENCGC memory zeroing
Juho Snellman <jsnell <at> iki.fi>
2005-12-11 07:54:53 GMT
Attached patch is intended to improve GENCGC performance, and seems to
mostly succeed in doing that. Unless there are objections, I'd like to
commit most of it for 0.9.8 (i.e. probably on next weekend). Testing
on non-Linux or low-memory x86 systems would be appreciated. Other
comments are of course also welcome.
* Instead of zeroing memory by remapping memory with munmap/mmap at
GC time, pages are just marked as needing zeroing and zeroed with
memset when they're added to a new allocation region. This reduces
GC latency both for the common and worst cases (~30% improvement
for both on an Athlon X2 with kernel 2.6.14, ~5% average/~15%
worst-case on a PIII/2.6.10).
It also improves the performance of the whole Lisp system
noticeably (up to 45% on some CL-BENCH tests on x86-64). Attached
are CL-BENCH results for a Pentium 120, Pentium III, Duron,
Athlon, P IV, and Athlon X2. (Thanks to Hannu Koivisto and Peter
De Wachter for some of these results). As a summary these results
show mostly improvements on all platforms, with few significant
regressions. (See the end of this message for some instructions on
how to decipher the results).
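
A minimal sketch of the deferred-zeroing idea, not the code from the patch: the toy heap, the `sketch_` names and the page count are all made up for illustration, and plain memset stands in for whatever zeroing routine is used. Freeing a page only flips flags; the memory is cleared when the page is handed to a new allocation region.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_BYTES 4096   /* assumed page size, as in the patch */
#define SKETCH_NPAGES     8      /* toy heap size, hypothetical */

struct sketch_page {
    unsigned allocated    : 1;   /* page currently belongs to a region */
    unsigned need_to_zero : 1;   /* page may contain stale data */
};

static unsigned char sketch_heap[SKETCH_NPAGES * SKETCH_PAGE_BYTES];
static struct sketch_page sketch_table[SKETCH_NPAGES];

/* GC side: freeing a page only updates the page table, no memory is touched. */
static void sketch_free_page(size_t i)
{
    assert(i < SKETCH_NPAGES);
    sketch_table[i].allocated = 0;
    sketch_table[i].need_to_zero = 1;
}

/* Allocator side: zero pages FIRST..LAST (inclusive) only if at least one
 * of them is marked dirty, then hand them out. */
static void sketch_open_region(size_t first, size_t last)
{
    int dirty = 0;
    size_t i;

    assert(first <= last && last < SKETCH_NPAGES);
    for (i = first; i <= last; i++)
        dirty |= sketch_table[i].need_to_zero;
    if (dirty)
        memset(sketch_heap + first * SKETCH_PAGE_BYTES, 0,
               (last - first + 1) * SKETCH_PAGE_BYTES);
    for (i = first; i <= last; i++) {
        sketch_table[i].allocated = 1;
        sketch_table[i].need_to_zero = 1;  /* the mutator will dirty it again */
    }
}
```

The point of the split is that the cost of clearing moves out of the GC pause and into allocation, where the freshly zeroed memory is about to be written anyway.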
* To keep the memory footprint down, clear the pages by remapping after
major GCs (arbitrarily defined as a collection of generation 2 or older).
The memory freed by a minor GC is just going to get used again immediately,
so releasing it back to the OS would make little sense.
The RSS of a vanilla SBCL and a modified one are very
similar for things like acting as an SBCL host compiler.
Anecdotally the new version causes no more thrashing on a
low-memory system than a vanilla one, though I haven't really
measured this.
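
A rough sketch of the "release only after a major GC" policy, under stated assumptions: it replaces the range with a single mmap(MAP_FIXED) call rather than the os_invalidate/os_validate pair used in the patch, it assumes a POSIX system with MAP_ANONYMOUS, and every `sketch_` name is hypothetical.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page size of the OS, which may differ from the GC page size. */
static size_t sketch_os_page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/* Reserve a toy heap of LEN bytes; returns NULL on failure. */
static void *sketch_reserve(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Replace [addr, addr+len) with a fresh anonymous mapping. The old
 * contents are dropped and the range reads as zeroes afterwards.
 * ADDR must be aligned to the OS page size. Returns 0 on success. */
static int sketch_remap_zeroed(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == addr ? 0 : -1;
}

/* Policy hook: returns 0 if nothing was done (minor GC), 1 if the
 * range was released, -1 on error. */
static int sketch_maybe_release(int collected_gen, void *free_start,
                                size_t free_len)
{
    const int small_generation_limit = 1;  /* same threshold as the patch */

    if (collected_gen <= small_generation_limit)
        return 0;                          /* minor GC: keep the pages */
    return sketch_remap_zeroed(free_start, free_len) == 0 ? 1 : -1;
}
```

After a minor collection the stale contents stay resident and are reused; after a major one the kernel gets the pages back and they come back zero-filled on the next touch.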
* Supply hand-coded assembly routines for zeroing memory instead of
relying on the libc memset() which seems to be suboptimal on a lot of
systems.
* On x86-64 use SSE2 (MOVNTDQ)
* On x86 use either SSE2 (MOVNTDQ), MMX (MOVNTQ) or REP STOSL depending
on CPUID flags.
The extra complexity introduced here is quite manageable, since we're
only using these routines for zeroing page-aligned blocks of memory.
Separate results for this change are included in the CL-BENCH
reports. As a summary, this is very beneficial for the SSE2
systems and the PIII, quite good for the P120, and terrible for
the Duron and the Athlon. Since the x86 results were mixed, this
part is probably not something to commit in the 0.9.8 timeframe.
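
The dispatch scheme can be sketched in C roughly as follows. This is an illustration only, assuming GCC or Clang: it uses compiler intrinsics instead of the hand-written assembly, memset stands in for the REP STOSL variant, the MMX path is left out, and the `sketch_` names are hypothetical. The function pointer starts out pointing at a detector, which picks an implementation from the CPUID feature bits and then replaces itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#include <emmintrin.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

typedef void (*sketch_bzero_fn)(void *, size_t);

static void sketch_bzero_detect(void *p, size_t n);

/* Initially points at the detector; overwritten on the first call. */
static sketch_bzero_fn sketch_bzero_pointer = sketch_bzero_detect;

/* Baseline: plain memset stands in for the REP STOSL routine. */
static void sketch_bzero_base(void *p, size_t n)
{
    memset(p, 0, n);
}

#if SKETCH_X86
/* Non-temporal 16-byte stores, four per 64-byte block. P must be
 * 16-byte aligned and N a multiple of 64. */
__attribute__((target("sse2")))
static void sketch_bzero_sse2(void *p, size_t n)
{
    __m128i zero = _mm_setzero_si128();
    __m128i *dst = (__m128i *)p;
    size_t blocks = n / 64;

    for (; blocks > 0; blocks--, dst += 4) {
        _mm_stream_si128(dst, zero);
        _mm_stream_si128(dst + 1, zero);
        _mm_stream_si128(dst + 2, zero);
        _mm_stream_si128(dst + 3, zero);
    }
    _mm_sfence();  /* order the weakly ordered stores */
}
#endif

/* Pick an implementation once, remember it, then do the actual work. */
static void sketch_bzero_detect(void *p, size_t n)
{
    sketch_bzero_fn chosen = sketch_bzero_base;
#if SKETCH_X86
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, EDX bit 26 (0x04000000) signals SSE2. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (edx & 0x04000000u))
        chosen = sketch_bzero_sse2;
#endif
    sketch_bzero_pointer = chosen;
    chosen(p, n);
}

/* Public entry point: only valid for 64-byte aligned, 64-byte multiple
 * blocks, which page-aligned ranges always are. */
static void sketch_fast_bzero(void *p, size_t n)
{
    assert((uintptr_t)p % 64 == 0 && n % 64 == 0);
    sketch_bzero_pointer(p, n);
}
```

Because callers only ever pass whole pages, none of the variants needs the head and tail handling that a general-purpose memset has to carry.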
* Shrink generation_index_t and reorganize struct page a bit to shrink
the page-table (25% reduction on x86, 33% on x86-64). Reduces memory
use and improves performance. This change is not included in any
of the CL-BENCH reports.
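
To see where the savings come from, the two layouts can be compared with a small stand-alone program. This is a sketch with hypothetical `sketch_` struct names that mirror the field order before and after the change; the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Field order before the change: the generation index is an int and
 * the long sits at the end, which forces padding on 64-bit targets. */
struct sketch_page_old {
    unsigned int write_protected : 1,
                 write_protected_cleared : 1,
                 allocated : 3,
                 dont_move : 1,
                 large_object : 1;
    int gen;
    int bytes_used;
    long first_object_offset;
};

/* Field order after the change: the long comes first, and the two
 * small fields share one word. */
struct sketch_page_new {
    long first_object_offset;
    unsigned int write_protected : 1,
                 write_protected_cleared : 1,
                 allocated : 3,
                 dont_move : 1,
                 large_object : 1,
                 need_to_zero : 1;
    unsigned short bytes_used;
    short gen;
};

/* Print both sizes so the reduction can be checked on a given target. */
static void sketch_report_sizes(void)
{
    printf("old: %zu bytes, new: %zu bytes\n",
           sizeof(struct sketch_page_old),
           sizeof(struct sketch_page_new));
}
```

With 4-byte ints and 2-byte shorts this should give 16 and 12 bytes when long is 4 bytes, and 24 and 16 bytes when long is 8 bytes, which is where the 25% and 33% figures come from.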
* Make MAP-ALLOCATED-OBJECTS page-table aware, so that the non-zero free
pages don't confuse ROOM. As a bonus the results from ROOM are also
more accurate now, instead of reporting each free page as consisting
of a large number of conses.
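
The page-skipping logic, sketched in C rather than Lisp. The names are hypothetical, the page table is reduced to two fields, and the comparison against bytes_used is strict here, which is an assumption of this sketch rather than a copy of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE 4096u  /* assumed GENCGC page size */

struct sketch_pte {
    unsigned allocated;          /* non-zero if the page is in use */
    unsigned short bytes_used;   /* bytes in use from the page start */
};

/* Given a scan offset OFF into a heap of NPAGES pages described by
 * TABLE, return the next offset at which an object can live. Free
 * pages and the unused tail of a partially filled page are skipped.
 * Returns the heap size if nothing is left to scan. */
static size_t sketch_next_live(const struct sketch_pte *table,
                               size_t npages, size_t off)
{
    size_t end = npages * SKETCH_PAGE;

    while (off < end) {
        size_t page = off / SKETCH_PAGE;
        size_t in_page = off % SKETCH_PAGE;

        if (table[page].allocated && in_page < table[page].bytes_used)
            return off;                  /* still inside used data */
        off = (page + 1) * SKETCH_PAGE;  /* jump to the next page */
    }
    return end;
}
```

A heap walker would call this before decoding each object header, so that stale or zero-filled free memory is never interpreted as a run of conses.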
* On BSDs GENCGC always used memset instead of mmap tricks, apparently
due to some bugs in swap space handling on some ancient FreeBSD version.
Get rid of this irregularity, and do the same thing on all platforms.
I'm not sure of the effect this will have on performance on BSDs.
* Add a GENCGC mode (#define READ_PROTECT_FREE_PAGES) for catching attempts
to read unallocated pages.
* Genesify the GENCGC page size
CL-BENCH reader's guide:
Each file contains five columns:
* Benchmark name
* Absolute run-time for vanilla SBCL (on some boxes the tests were
run with different iteration counts, so the absolute values of different
reports are not comparable)
* Relative run-time (to the vanilla results) for vanilla SBCL
* Relative run-time (to the vanilla results) for memset-using SBCL
* Relative run-time (to the vanilla results) for SBCL using
hand-optimized zeroing
Measurements are reported as xxx|yyy, where xxx is the result and yyy
is the standard error.
I'll make some pretty pictures of the results available later.
--
Juho Snellman
Index: package-data-list.lisp-expr
===================================================================
RCS file: /cvsroot/sbcl/sbcl/package-data-list.lisp-expr,v
retrieving revision 1.339
diff -u -r1.339 package-data-list.lisp-expr
--- package-data-list.lisp-expr 21 Nov 2005 14:00:29 -0000 1.339
+++ package-data-list.lisp-expr 11 Dec 2005 06:52:37 -0000
@@ -2158,6 +2158,7 @@
"SIMPLE-FUN-TYPE-SLOT"
"FUNCALLABLE-INSTANCE-LAYOUT-SLOT"
"FUNCALLABLE-INSTANCE-LEXENV-SLOT"
+ "GENCGC-PAGE-SIZE"
"GENESIS" "HALT-TRAP" "IGNORE-ME-SC-NUMBER"
"IMMEDIATE-CHARACTER-SC-NUMBER" "IMMEDIATE-SAP-SC-NUMBER"
"IMMEDIATE-SC-NUMBER" "*INITIAL-DYNAMIC-SPACE-FREE-POINTER*"
Index: src/code/room.lisp
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/code/room.lisp,v
retrieving revision 1.34
diff -u -r1.34 room.lisp
--- src/code/room.lisp 14 Jul 2005 16:30:38 -0000 1.34
+++ src/code/room.lisp 11 Dec 2005 06:52:37 -0000
@@ -204,6 +204,24 @@
shift)
(ash len shift)))))))
+;;; Access to the GENCGC page table for better precision in
+;;; MAP-ALLOCATED-OBJECTS
+#!+gencgc
+(progn
+ (define-alien-type nil
+ (struct page
+ (start long)
+ (flags unsigned-int)
+ (bytes-used short)
+ (gen unsigned-short)))
+ (declaim (inline find-page-index))
+ (define-alien-routine "find_page_index" long (index long))
+ (define-alien-variable "page_table"
+ (array (struct page)
+ #.(truncate (- dynamic-space-end
+ dynamic-space-start)
+ sb!vm:gencgc-page-size))))
+
;;; Iterate over all the objects allocated in SPACE, calling FUN with
;;; the object, the object's type code, and the object's total size in
;;; bytes, including any header and padding.
@@ -211,84 +229,122 @@
(defun map-allocated-objects (fun space)
(declare (type function fun) (type spaces space))
(without-gcing
- (multiple-value-bind (start end) (space-bounds space)
- (declare (type system-area-pointer start end))
- (declare (optimize (speed 3) (safety 0)))
- (let ((current start)
- #+nil
- (prev nil))
- (loop
- (let* ((header (sap-ref-word current 0))
- (header-widetag (logand header #xFF))
- (info (svref *room-info* header-widetag)))
- (cond
- ((or (not info)
- (eq (room-info-kind info) :lowtag))
- (let ((size (* cons-size n-word-bytes)))
- (funcall fun
- (make-lisp-obj (logior (sap-int current)
- list-pointer-lowtag))
- list-pointer-lowtag
- size)
- (setq current (sap+ current size))))
- ((eql header-widetag closure-header-widetag)
- (let* ((obj (make-lisp-obj (logior (sap-int current)
- fun-pointer-lowtag)))
- (size (round-to-dualword
- (* (the fixnum (1+ (get-closure-length obj)))
- n-word-bytes))))
- (funcall fun obj header-widetag size)
- (setq current (sap+ current size))))
- ((eq (room-info-kind info) :instance)
- (let* ((obj (make-lisp-obj
- (logior (sap-int current) instance-pointer-lowtag)))
- (size (round-to-dualword
- (* (+ (%instance-length obj) 1) n-word-bytes))))
- (declare (fixnum size))
- (funcall fun obj header-widetag size)
- (aver (zerop (logand size lowtag-mask)))
- #+nil
- (when (> size 200000) (break "implausible size, prev ~S" prev))
- #+nil
- (setq prev current)
- (setq current (sap+ current size))))
- (t
- (let* ((obj (make-lisp-obj
- (logior (sap-int current) other-pointer-lowtag)))
- (size (ecase (room-info-kind info)
- (:fixed
- (aver (or (eql (room-info-length info)
+ (multiple-value-bind (start end) (space-bounds space)
+ (declare (type system-area-pointer start end))
+ (declare (optimize (speed 3) (safety 0)))
+ (let ((current start)
+ (skip-tests-until-addr 0))
+ (labels ((maybe-finish-mapping ()
+ (unless (sap< current end)
+ ;; Skipping unallocated pages might lead to overrunning
+ ;; the imaginary GENCGC dynamic-space-free-pointer.
+ ;; This should be completely harmless.
+ #!-gencgc
+ (aver (sap= current end))
+ (return-from map-allocated-objects)))
+ ;; GENCGC doesn't allocate linearly, which means that the
+ ;; dynamic space can contain large blocks of zeros that get
+ ;; accounted as conses in ROOM (and slow down other
+ ;; applications of MAP-ALLOCATED-OBJECTS). To fix this
+ ;; check the GC page structure for the current address.
+ ;; If the page is free or the address is beyond the page-
+ ;; internal allocation offset (bytes-used) skip to the
+ ;; next page immediately.
+ (maybe-skip-page ()
+ #!+gencgc
+ (when (eq space :dynamic)
+ (loop with page-mask = #.(1- sb!vm:gencgc-page-size)
+ for addr of-type sb!vm:word = (sap-int current)
+ for offset = (logand page-mask addr)
+ while (>= addr skip-tests-until-addr)
+ do
+ ;; For some reason binding PAGE with LET
+ ;; conses like mad (but gives no compiler notes...)
+ ;; Work around the problem with SYMBOL-MACROLET
+ ;; instead of trying to figure out the real
+ ;; issue. -- JES, 2005-05-17
+ (symbol-macrolet
+ ((page (deref page-table
+ (find-page-index addr))))
+ ;; I don't think we have any nicer way to
+ ;; access C struct bitfields. This is more
+ ;; fragile than I'd like.
+ (let ((alloc-flag (ldb (byte 3 2)
+ (slot page 'flags)))
+ (bytes-used (slot page 'bytes-used)))
+ (when (and (not (zerop alloc-flag))
+ (<= offset bytes-used))
+ (setf skip-tests-until-addr
+ (+ (logandc2 addr page-mask)
+ (the fixnum bytes-used)))
+ (return-from maybe-skip-page))
+ (setf current (sap+ current
+ (- sb!vm:gencgc-page-size
+ offset)))
+ (aver (zerop (logand page-mask
+ (sap-int current))))
+ (maybe-finish-mapping)))))))
+ (declare (inline maybe-finish-mapping maybe-skip-page))
+ (loop
+ (maybe-finish-mapping)
+ (maybe-skip-page)
+ (let* ((header (sap-ref-word current 0))
+ (header-widetag (logand header #xFF))
+ (info (svref *room-info* header-widetag)))
+ (cond
+ ((or (not info)
+ (eq (room-info-kind info) :lowtag))
+ (let ((size (* cons-size n-word-bytes)))
+ (funcall fun
+ (make-lisp-obj (logior (sap-int current)
+ list-pointer-lowtag))
+ list-pointer-lowtag
+ size)
+ (setq current (sap+ current size))))
+ ((eql header-widetag closure-header-widetag)
+ (let* ((obj (make-lisp-obj (logior (sap-int current)
+ fun-pointer-lowtag)))
+ (size (round-to-dualword
+ (* (the fixnum (1+ (get-closure-length obj)))
+ n-word-bytes))))
+ (funcall fun obj header-widetag size)
+ (setq current (sap+ current size))))
+ ((eq (room-info-kind info) :instance)
+ (let* ((obj (make-lisp-obj
+ (logior (sap-int current) instance-pointer-lowtag)))
+ (size (round-to-dualword
+ (* (+ (%instance-length obj) 1) n-word-bytes))))
+ (declare (fixnum size))
+ (funcall fun obj header-widetag size)
+ (aver (zerop (logand size lowtag-mask)))
+ (setq current (sap+ current size))))
+ (t
+ (let* ((obj (make-lisp-obj
+ (logior (sap-int current) other-pointer-lowtag)))
+ (size (ecase (room-info-kind info)
+ (:fixed
+ (aver (or (eql (room-info-length info)
(1+ (get-header-data obj)))
- (floatp obj)
- (simple-array-nil-p obj)))
- (round-to-dualword
- (* (room-info-length info) n-word-bytes)))
- ((:vector :string)
- (vector-total-size obj info))
- (:header
- (round-to-dualword
- (* (1+ (get-header-data obj)) n-word-bytes)))
- (:code
- (+ (the fixnum
- (* (get-header-data obj) n-word-bytes))
- (round-to-dualword
- (* (the fixnum (%code-code-size obj))
- n-word-bytes)))))))
- (declare (fixnum size))
- (funcall fun obj header-widetag size)
- (aver (zerop (logand size lowtag-mask)))
- #+nil
- (when (> size 200000)
- (break "Implausible size, prev ~S" prev))
- #+nil
- (setq prev current)
- (setq current (sap+ current size))))))
- (unless (sap< current end)
- (aver (sap= current end))
- (return)))
+ (floatp obj)
+ (simple-array-nil-p obj)))
+ (round-to-dualword
+ (* (room-info-length info) n-word-bytes)))
+ ((:vector :string)
+ (vector-total-size obj info))
+ (:header
+ (round-to-dualword
+ (* (1+ (get-header-data obj)) n-word-bytes)))
+ (:code
+ (+ (the fixnum
+ (* (get-header-data obj) n-word-bytes))
+ (round-to-dualword
+ (* (the fixnum (%code-code-size obj))
+ n-word-bytes)))))))
+ (declare (fixnum size))
+ (funcall fun obj header-widetag size)
+ (aver (zerop (logand size lowtag-mask)))
+ (setq current (sap+ current size))))))))))))
- #+nil
- prev))))
;;;; MEMORY-USAGE
Index: src/compiler/x86/parms.lisp
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/compiler/x86/parms.lisp,v
retrieving revision 1.52
diff -u -r1.52 parms.lisp
--- src/compiler/x86/parms.lisp 12 Oct 2005 23:53:47 -0000 1.52
+++ src/compiler/x86/parms.lisp 11 Dec 2005 06:52:38 -0000
@@ -35,6 +35,10 @@
;;; addressable object
(def!constant n-byte-bits 8)
+;;; The size in bytes of the GENCGC pages. Should be a multiple of the
+;;; architecture code size.
+(def!constant gencgc-page-size 4096)
+
(def!constant float-sign-shift 31)
;;; comment from CMU CL:
Index: src/compiler/x86-64/parms.lisp
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/compiler/x86-64/parms.lisp,v
retrieving revision 1.15
diff -u -r1.15 parms.lisp
--- src/compiler/x86-64/parms.lisp 11 Dec 2005 04:23:05 -0000 1.15
+++ src/compiler/x86-64/parms.lisp 11 Dec 2005 06:52:38 -0000
@@ -35,6 +35,10 @@
;;; addressable object
(def!constant n-byte-bits 8)
+;;; The size in bytes of the GENCGC pages. Should be a multiple of the
+;;; architecture code size.
+(def!constant gencgc-page-size 4096)
+
(def!constant float-sign-shift 31)
;;; comment from CMU CL:
Index: src/runtime/gc.h
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/runtime/gc.h,v
retrieving revision 1.12
diff -u -r1.12 gc.h
--- src/runtime/gc.h 12 Oct 2005 21:42:48 -0000 1.12
+++ src/runtime/gc.h 11 Dec 2005 06:52:38 -0000
@@ -16,7 +16,7 @@
#ifndef _GC_H_
#define _GC_H_
typedef signed long page_index_t;
-typedef signed int generation_index_t;
+typedef signed short generation_index_t;
extern void gc_init(void);
extern void gc_initialize_pointers(void);
Index: src/runtime/gencgc-internal.h
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/runtime/gencgc-internal.h,v
retrieving revision 1.13
diff -u -r1.13 gencgc-internal.h
--- src/runtime/gencgc-internal.h 12 Oct 2005 23:53:47 -0000 1.13
+++ src/runtime/gencgc-internal.h 11 Dec 2005 06:52:38 -0000
@@ -19,14 +19,12 @@
#ifndef _GENCGC_INTERNAL_H_
#define _GENCGC_INTERNAL_H_
+#include <limits.h>
#include "gc.h"
#include "gencgc-alloc-region.h"
#include "genesis/code.h"
-/* Size of a page, in bytes. FIXME: needs to be conditionalized per
- * architecture, preferably by someone with a clue as to what page
- * sizes are on archs other than x86 and PPC - Patrik */
-#define PAGE_BYTES 4096
+#define PAGE_BYTES GENCGC_PAGE_SIZE
void gc_free_heap(void);
inline page_index_t find_page_index(void *);
@@ -34,6 +32,11 @@
int gencgc_handle_wp_violation(void *);
struct page {
+ /* The name of this field is not well-chosen for its actual use.
+ * This is the offset from the start of the page to the start
+ * of the alloc_region which contains/contained it. It's negative or 0
+ */
+ long first_object_offset;
unsigned int
/* This is set when the page is write-protected. This should
@@ -57,27 +60,33 @@
/* If the page is part of a large object then this flag is
* set. No other objects should be allocated to these pages.
* This is only valid when the page is allocated. */
- large_object :1;
+ large_object :1,
+ /* Cleared if the page is known to contain only zeroes. */
+ need_to_zero :1;
+
+ /* the number of bytes of this page that are used. This may be less
+ * than the actual bytes used for pages within the current
+ * allocation regions. It should be 0 for all unallocated pages (not
+ * hard to achieve).
+ *
+ * Currently declared as an unsigned short to make the struct size
+ * smaller. This means that GENCGC-PAGE-SIZE is constrained to fit
+ * inside a short.
+ */
+ unsigned short bytes_used;
+
+#if USHRT_MAX < PAGE_BYTES
+#error "PAGE_BYTES too large"
+#endif
/* the generation that this page belongs to. This should be valid
* for all pages that may have objects allocated, even current
* allocation region pages - this allows the space of an object to
* be easily determined. */
generation_index_t gen;
-
- /* the number of bytes of this page that are used. This may be less
- * than the actual bytes used for pages within the current
- * allocation regions. It should be 0 for all unallocated pages (not
- * hard to achieve). */
- int bytes_used;
-
- /* The name of this field is not well-chosen for its actual use.
- * This is the offset from the start of the page to the start
- * of the alloc_region which contains/contained it. It's negative or 0
- */
- long first_object_offset;
};
+
/* values for the page.allocated field */
Index: src/runtime/gencgc.c
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/runtime/gencgc.c,v
retrieving revision 1.90
diff -u -r1.90 gencgc.c
--- src/runtime/gencgc.c 4 Dec 2005 22:25:07 -0000 1.90
+++ src/runtime/gencgc.c 11 Dec 2005 06:52:39 -0000
@@ -70,22 +70,6 @@
* that don't have pointers to younger generations? */
boolean enable_page_protection = 1;
-/* Should we unmap a page and re-mmap it to have it zero filled? */
-#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__NetBSD__) || defined(__sun)
-/* comment from cmucl-2.4.8: This can waste a lot of swap on FreeBSD
- * so don't unmap there.
- *
- * The CMU CL comment didn't specify a version, but was probably an
- * old version of FreeBSD (pre-4.0), so this might no longer be true.
- * OTOH, if it is true, this behavior might exist on OpenBSD too, so
- * for now we don't unmap there either. -- WHN 2001-04-07 */
-/* Apparently this flag is required to be 0 for SunOS/x86, as there
- * are reports of heap corruption otherwise. */
-boolean gencgc_unmap_zero = 0;
-#else
-boolean gencgc_unmap_zero = 1;
-#endif
-
/* the minimum size (in bytes) for a large object*/
unsigned long large_object_size = 4 * PAGE_BYTES;
@@ -139,6 +123,13 @@
* contained a pagetable entry).
*/
boolean gencgc_partial_pickup = 0;
+
+/* If defined, free pages are read-protected to ensure that nothing
+ * accesses them.
+ */
+
+/* #define READ_PROTECT_FREE_PAGES */
+
/*
* GC structures and variables
@@ -433,6 +424,63 @@
* allocation routines
*/
+void fast_bzero(void*, size_t); /* in <arch>-assem.S */
+
+/* Zero the pages from START to END (inclusive), but use mmap/munmap instead
+ * of zeroing them ourselves, i.e. in practice give the memory back to the
+ * OS. Generally done after a large GC.
+ */
+void zero_pages_with_mmap(page_index_t start, page_index_t end) {
+ int i;
+ void *addr = (void *) page_address(start), *new_addr;
+ size_t length = PAGE_BYTES*(1+end-start);
+
+ if (start > end)
+ return;
+
+ os_invalidate(addr, length);
+ new_addr = os_validate(addr, length);
+ if (new_addr == NULL || new_addr != addr) {
+ lose("remap_free_pages: page moved, 0x%08x ==> 0x%08x", addr, new_addr);
+ }
+
+ for (i = start; i <= end; i++) {
+ page_table[i].need_to_zero = 0;
+ }
+}
+
+/* Zero the pages from START to END (inclusive). Generally done just after
+ * a new region has been allocated.
+ */
+static void
+zero_pages(page_index_t start, page_index_t end) {
+ if (start > end)
+ return;
+
+ fast_bzero(page_address(start), PAGE_BYTES*(1+end-start));
+}
+
+/* Zero the pages from START to END (inclusive), except for those
+ * pages that are known to be already zeroed. Mark all pages in the
+ * range as non-zeroed.
+ */
+static void
+zero_dirty_pages(page_index_t start, page_index_t end) {
+ page_index_t i;
+
+ for (i = start; i <= end; i++) {
+ if (page_table[i].need_to_zero == 1) {
+ zero_pages(start, end);
+ break;
+ }
+ }
+
+ for (i = start; i <= end; i++) {
+ page_table[i].need_to_zero = 1;
+ }
+}
+
+
/*
* To support quick and inline allocation, regions of memory can be
* allocated and then allocated from with just a free pointer and a
@@ -606,6 +654,22 @@
}
}
}
+
+#ifdef READ_PROTECT_FREE_PAGES
+ os_protect(page_address(first_page),
+ PAGE_BYTES*(1+last_page-first_page),
+ OS_VM_PROT_ALL);
+#endif
+
+ /* If the first page was only partial, don't check whether it's
+ * zeroed (it won't be) and don't zero it (since the parts that
+ * we're interested in are guaranteed to be zeroed).
+ */
+ if (page_table[first_page].bytes_used) {
+ first_page++;
+ }
+
+ zero_dirty_pages(first_page, last_page);
}
/* If the record_new_objects flag is 2 then all new regions created
@@ -952,7 +1016,15 @@
}
thread_mutex_unlock(&free_pages_lock);
- return((void *)(page_address(first_page)+orig_first_page_bytes_used));
+#ifdef READ_PROTECT_FREE_PAGES
+ os_protect(page_address(first_page),
+ PAGE_BYTES*(1+last_page-first_page),
+ OS_VM_PROT_ALL);
+#endif
+
+ zero_dirty_pages(first_page, last_page);
+
+ return page_address(first_page);
}
static page_index_t gencgc_alloc_start_page = -1;
@@ -3080,31 +3152,12 @@
&& (page_table[last_page].bytes_used != 0)
&& (page_table[last_page].gen == from_space));
- /* Zero pages from first_page to (last_page-1).
- *
- * FIXME: Why not use os_zero(..) function instead of
- * hand-coding this again? (Check other gencgc_unmap_zero
- * stuff too. */
- if (gencgc_unmap_zero) {
- void *page_start, *addr;
-
- page_start = (void *)page_address(first_page);
-
- os_invalidate(page_start, PAGE_BYTES*(last_page-first_page));
- addr = os_validate(page_start, PAGE_BYTES*(last_page-first_page));
- if (addr == NULL || addr != page_start) {
- lose("free_oldspace: page moved, 0x%08x ==> 0x%08x\n",
- page_start, addr);
- }
- } else {
- long *page_start;
-
- page_start = (long *)page_address(first_page);
- memset(page_start, 0,PAGE_BYTES*(last_page-first_page));
- }
-
+#ifdef READ_PROTECT_FREE_PAGES
+ os_protect(page_address(first_page),
+ PAGE_BYTES*(last_page-first_page),
+ OS_VM_PROT_NONE);
+#endif
first_page = last_page;
-
} while (first_page < last_free_page);
bytes_allocated -= bytes_freed;
@@ -3816,6 +3869,32 @@
return 0; /* dummy value: return something ... */
}
+static void
+remap_free_pages (page_index_t from, page_index_t to)
+{
+ page_index_t first_page, last_page;
+
+ for (first_page = from; first_page <= to; first_page++) {
+ if (page_table[first_page].allocated != FREE_PAGE_FLAG ||
+ page_table[first_page].need_to_zero == 0) {
+ continue;
+ }
+
+ last_page = first_page + 1;
+ while (page_table[last_page].allocated == FREE_PAGE_FLAG &&
+ last_page < to &&
+ page_table[last_page].need_to_zero == 1) {
+ last_page++;
+ }
+
+ zero_pages_with_mmap(first_page, last_page-1);
+
+ first_page = last_page;
+ }
+}
+
+generation_index_t small_generation_limit = 1;
+
/* GC all generations newer than last_gen, raising the objects in each
* to the next older generation - we finish when all generations below
* last_gen are empty. Then if last_gen is due for a GC, or if
@@ -3824,13 +3903,15 @@
*
* We stop collecting at gencgc_oldest_gen_to_gc, even if this is less than
* last_gen (oh, and note that by default it is NUM_GENERATIONS-1) */
-
void
collect_garbage(generation_index_t last_gen)
{
generation_index_t gen = 0, i;
int raise;
int gen_to_wp;
+ /* The largest value of last_free_page seen since the time
+ * remap_free_pages was called. */
+ static page_index_t high_water_mark = 0;
FSHOW((stderr, "/entering collect_garbage(%d)\n", last_gen));
@@ -3932,11 +4013,25 @@
gc_assert((boxed_region.free_pointer - boxed_region.start_addr) == 0);
gc_alloc_generation = 0;
+ /* Save the high-water mark before updating last_free_page */
+ if (last_free_page > high_water_mark)
+ high_water_mark = last_free_page;
update_dynamic_space_free_pointer();
auto_gc_trigger = bytes_allocated + bytes_consed_between_gcs;
if(gencgc_verbose)
fprintf(stderr,"Next gc when %ld bytes have been consed\n",
auto_gc_trigger);
+
+ /* If we did a big GC (arbitrarily defined as gen > 1), release memory
+ * back to the OS.
+ */
+ if (gen > small_generation_limit) {
+ if (last_free_page > high_water_mark)
+ high_water_mark = last_free_page;
+ remap_free_pages(0, high_water_mark);
+ high_water_mark = 0;
+ }
+
SHOW("returning from collect_garbage");
}
@@ -4101,6 +4196,7 @@
page_table[page].write_protected = 0;
page_table[page].write_protected_cleared = 0;
page_table[page].dont_move = 0;
+ page_table[page].need_to_zero = 1;
if (!gencgc_partial_pickup) {
first=gc_search_space(prev,(ptr+2)-prev,ptr);
@@ -4290,6 +4386,23 @@
region->end_addr = page_address(0);
}
+static void
+zero_all_free_pages()
+{
+ page_index_t i;
+
+ for (i = 0; i < last_free_page; i++) {
+ if (page_table[i].allocated == FREE_PAGE_FLAG) {
+#ifdef READ_PROTECT_FREE_PAGES
+ os_protect(page_address(i),
+ PAGE_BYTES,
+ OS_VM_PROT_ALL);
+#endif
+ zero_pages(i, i);
+ }
+ }
+}
+
/* Things to do before doing a final GC before saving a core (without
* purify).
*
@@ -4344,6 +4457,8 @@
gencgc_alloc_start_page = -1;
collect_garbage(HIGHEST_NORMAL_GENERATION+1);
+ /* The dumper doesn't know that pages need to be zeroed before use. */
+ zero_all_free_pages();
save_to_filehandle(file, filename, SymbolValue(RESTART_LISP_FUNCTION,0));
/* Oops. Save still managed to fail. Since we've mangled the stack
* beyond hope, there's not much we can do.
Index: src/runtime/x86-64-assem.S
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/runtime/x86-64-assem.S,v
retrieving revision 1.8
diff -u -r1.8 x86-64-assem.S
--- src/runtime/x86-64-assem.S 1 Jul 2005 11:00:32 -0000 1.8
+++ src/runtime/x86-64-assem.S 11 Dec 2005 06:52:40 -0000
@@ -344,4 +344,42 @@
ret
.size GNAME(post_signal_tramp),.-GNAME(post_signal_tramp)
- .end
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero)
+ .type GNAME(fast_bzero),@function
+
+GNAME(fast_bzero):
+ /* A fast routine for zero-filling blocks of memory that are
+ * guaranteed to start and end at a 4096-byte aligned address.
+ */
+ shr $6, %rsi /* Amount of 64-byte blocks to copy */
+ jz Lend /* If none, stop */
+ movups %xmm7, -16(%rsp) /* Save XMM register */
+ xorps %xmm7, %xmm7 /* Zero the XMM register */
+ jmp Lloop
+ .align 16
+Lloop:
+
+ /* Copy the 16 zeroes from xmm7 to memory, 4 times. MOVNTDQ is the
+ * non-caching double-quadword moving variant, i.e. the memory areas
+ * we're touching are not fetched into the L1 cache, since we're just
+ * going to overwrite the memory soon anyway.
+ */
+ movntdq %xmm7, 0(%rdi)
+ movntdq %xmm7, 16(%rdi)
+ movntdq %xmm7, 32(%rdi)
+ movntdq %xmm7, 48(%rdi)
+
+ add $64, %rdi /* Advance pointer */
+ dec %rsi /* Decrement 64-byte block count */
+ jnz Lloop
+ mfence /* Ensure that the writes are globally visible, since
+ * MOVNTDQ is weakly ordered */
+ movups -16(%rsp), %xmm7 /* Restore the XMM register */
+Lend:
+ ret
+ .size GNAME(fast_bzero), .-GNAME(fast_bzero)
+
+
+ .end
Index: src/runtime/x86-assem.S
===================================================================
RCS file: /cvsroot/sbcl/sbcl/src/runtime/x86-assem.S,v
retrieving revision 1.24
diff -u -r1.24 x86-assem.S
--- src/runtime/x86-assem.S 12 Nov 2005 19:50:48 -0000 1.24
+++ src/runtime/x86-assem.S 11 Dec 2005 06:52:40 -0000
@@ -800,5 +800,186 @@
ret
.size GNAME(post_signal_tramp),.-GNAME(post_signal_tramp)
-
- .end
+ /* fast_bzero implementations and code to detect which implementation
+ * to use.
+ */
+
+ .global GNAME(fast_bzero_pointer)
+ .data
+ .align 4
+ .type GNAME(fast_bzero_pointer), @object
+ .size GNAME(fast_bzero_pointer), 4
+GNAME(fast_bzero_pointer):
+ /* Variable containing a pointer to the bzero function to use.
+ * Initially points to a function that detects which implementation
+ * should be used, and then updates the variable. */
+ .long fast_bzero_detect
+
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero)
+ .type GNAME(fast_bzero),@function
+GNAME(fast_bzero):
+ /* Indirect function call */
+ jmp *fast_bzero_pointer
+ .size GNAME(fast_bzero), .-GNAME(fast_bzero)
+
+
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero_detect)
+ .type GNAME(fast_bzero_detect),@function
+GNAME(fast_bzero_detect):
+ /* Decide whether to use SSE, MMX or REP version */
+ push %eax /* CPUID uses EAX-EDX */
+ push %ebx
+ push %ecx
+ push %edx
+ mov $1, %eax
+ cpuid
+ test $0x04000000, %edx /* SSE2 needed for MOVNTDQ */
+ jnz Lsse2
+ test $0x00800000, %edx /* MMX needed for MOVNTQ */
+ jnz Lmmx
+Lbase:
+ movl $fast_bzero_base, fast_bzero_pointer
+ jmp Lrestore
+Lsse2:
+ movl $fast_bzero_sse, fast_bzero_pointer
+ jmp Lrestore
+Lmmx:
+ movl $fast_bzero_mmx, fast_bzero_pointer
+
+Lrestore:
+ pop %edx
+ pop %ecx
+ pop %ebx
+ pop %eax
+ jmp *fast_bzero_pointer
+
+ .size GNAME(fast_bzero_detect), .-GNAME(fast_bzero_detect)
+
+
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero_sse)
+ .type GNAME(fast_bzero_sse),@function
+
+GNAME(fast_bzero_sse):
+ /* A fast routine for zero-filling blocks of memory that are
+ * guaranteed to start and end at a 4096-byte aligned address.
+ */
+ push %esi /* Save temporary registers */
+ push %edi
+ mov 16(%esp), %esi /* Parameter: amount of bytes to fill */
+ mov 12(%esp), %edi /* Parameter: start address */
+ shr $6, %esi /* Amount of 64-byte blocks to copy */
+ jz Lend_sse /* If none, stop */
+ movups %xmm7, -16(%esp) /* Save XMM register */
+ xorps %xmm7, %xmm7 /* Zero the XMM register */
+ jmp Lloop_sse
+ .align 16
+Lloop_sse:
+
+ /* Copy the 16 zeroes from xmm7 to memory, 4 times. MOVNTDQ is the
+ * non-caching double-quadword moving variant, i.e. the memory areas
+ * we're touching are not fetched into the L1 cache, since we're just
+ * going to overwrite the memory soon anyway.
+ */
+ movntdq %xmm7, 0(%edi)
+ movntdq %xmm7, 16(%edi)
+ movntdq %xmm7, 32(%edi)
+ movntdq %xmm7, 48(%edi)
+
+ add $64, %edi /* Advance pointer */
+ dec %esi /* Decrement 64-byte block count */
+ jnz Lloop_sse
+ movups -16(%esp), %xmm7 /* Restore the XMM register */
+ sfence /* Ensure that weakly ordered writes are flushed. */
+Lend_sse:
+ pop %edi /* Restore temp registers */
+ pop %esi
+ ret
+ .size GNAME(fast_bzero_sse), .-GNAME(fast_bzero_sse)
+
+
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero_mmx)
+ .type GNAME(fast_bzero_mmx),@function
+
+GNAME(fast_bzero_mmx):
+ /* A fast routine for zero-filling blocks of memory that are
+ * guaranteed to start and end at a 4096-byte aligned address.
+ */
+ push %esi /* Save temporary registers */
+ push %edi
+ mov 16(%esp), %esi /* Parameter: amount of bytes to fill */
+ mov 12(%esp), %edi /* Parameter: start address */
+ shr $6, %esi /* Amount of 64-byte blocks to copy */
+ jz Lend_mmx /* If none, stop */
+ fnsave -108(%esp) /* Save x87 state (MMX and x87 registers
+ * are aliased, so don't need to worry
+ * about saving %mm0) */
+ pxor %mm0, %mm0 /* Zero the MMX register */
+ jmp Lloop_mmx
+ .align 16
+Lloop_mmx:
+
+ /* Copy the 8 zeroes from mm0 to memory, 8 times. MOVNTQ is the
+ * non-caching quadword moving variant, i.e. the memory areas
+ * we're touching are not fetched into the L1 cache, since we're just
+ * going to overwrite the memory soon anyway.
+ */
+ movntq %mm0, 0(%edi)
+ movntq %mm0, 8(%edi)
+ movntq %mm0, 16(%edi)
+ movntq %mm0, 24(%edi)
+ movntq %mm0, 32(%edi)
+ movntq %mm0, 40(%edi)
+ movntq %mm0, 48(%edi)
+ movntq %mm0, 56(%edi)
+
+ add $64, %edi /* Advance pointer */
+ dec %esi /* Decrement 64-byte block count */
+ jnz Lloop_mmx
+ emms
+ frstor -108(%esp) /* Restore x87 state */
+ lock add $0, 0(%esp) /* Ensure that weakly ordered writes are
+ * flushed. */
+Lend_mmx:
+ pop %edi /* Restore temp registers */
+ pop %esi
+ ret
+ .size GNAME(fast_bzero_mmx), .-GNAME(fast_bzero_mmx)
+
+
+ .text
+ .align align_8byte,0x90
+ .global GNAME(fast_bzero_base)
+ .type GNAME(fast_bzero_base),@function
+
+GNAME(fast_bzero_base):
+ /* A fast routine for zero-filling blocks of memory that are
+ * guaranteed to start and end at a 4096-byte aligned address.
+ */
+ push %eax /* Save temporary registers */
+ push %ecx
+ push %edi
+ mov 20(%esp), %ecx /* Parameter: amount of bytes to fill */
+ mov 16(%esp), %edi /* Parameter: start address */
+ xor %eax, %eax /* Zero EAX */
+ shr $2, %ecx /* Amount of 4-byte blocks to copy */
+ jz Lend_base
+ cld /* Set direction of STOSL to increment */
+ rep stosl /* Store EAX to *EDI, ECX times, incrementing
+ * EDI by 4 after each store */
+Lend_base:
+ pop %edi /* Restore temp registers */
+ pop %ecx
+ pop %eax
+ ret
+ .size GNAME(fast_bzero_base), .-GNAME(fast_bzero_base)
+
+
+ .end
\ No newline at end of file
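For readers who don't want to trace the assembly, here is a rough portable C sketch of what the two x86 fallback routines compute. It is not part of the patch: the function names are made up for illustration, and it uses ordinary stores, so it shows only the block-count arithmetic and not the cache-bypassing behaviour of MOVNTQ. Both functions take the same (start address, byte count) arguments as the assembly and, like it, silently ignore any tail shorter than one block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference for fast_bzero_mmx: zero floor(length / 64)
 * blocks of 64 bytes, eight 8-byte stores per block. */
static void bzero_blocks64_ref(void *addr, size_t length)
{
    assert(((uintptr_t)addr & 7) == 0);   /* need 8-byte alignment here */
    uint64_t *p = addr;
    size_t blocks = length >> 6;          /* shr $6, %esi */

    while (blocks-- > 0) {
        for (int i = 0; i < 8; i++)
            p[i] = 0;                     /* movntq %mm0, 8*i(%edi) */
        p += 8;                           /* add $64, %edi */
    }
}

/* Hypothetical reference for fast_bzero_base: zero floor(length / 4)
 * 32-bit words, which is what REP STOSL does with EAX = 0. */
static void bzero_words32_ref(void *addr, size_t length)
{
    assert(((uintptr_t)addr & 3) == 0);   /* need 4-byte alignment here */
    uint32_t *p = addr;
    size_t words = length >> 2;           /* shr $2, %ecx */

    while (words-- > 0)
        *p++ = 0;                         /* stosl */
}

/* Returns 1 if the first expect_zero bytes are 0x00 and the rest of
 * the buffer still holds the 0xFF fill pattern. */
static int zero_prefix_ok(const unsigned char *buf, size_t length,
                          size_t expect_zero)
{
    for (size_t i = 0; i < length; i++)
        if (buf[i] != (i < expect_zero ? 0x00 : 0xFF))
            return 0;
    return 1;
}

/* Fill two buffers with 0xFF, run both references, check the result. */
int zeroing_refs_check(size_t length)
{
    size_t n = length ? length : 1;       /* avoid malloc(0) */
    unsigned char *a = malloc(n);
    unsigned char *b = malloc(n);
    int ok = 0;

    if (a && b) {
        memset(a, 0xFF, n);
        memset(b, 0xFF, n);
        bzero_blocks64_ref(a, length);
        bzero_words32_ref(b, length);
        ok = zero_prefix_ok(a, length, length & ~(size_t)63)
          && zero_prefix_ok(b, length, length & ~(size_t)3);
    }
    free(a);
    free(b);
    return ok;
}
```

Since the GC only hands these routines whole 4096-byte pages, the ignored tail never matters in practice; zeroing_refs_check(100) just makes the rounding visible (64 bytes cleared by the first routine, all 100 by the second).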
This is SBCL 0.9.6.26, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3           0.9.7.3.gc.1      0.9.7.3.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [   5.29|0.03]   1.00|0.0066       0.94|0.0035       0.93|0.0026
LOAD-FASL                [   4.21|0.01]   1.00|0.0027       0.97|0.0013       0.98|0.0040
SUM-PERMUTATIONS         [   4.16|0.01]   1.00|0.0015       0.90|0.0008       0.87|0.0021
WALK-LIST/SEQ            [   4.54|0.01]   1.00|0.0026       1.00|0.0023       1.00|0.0008
BOYER                    [  32.30|0.28]   1.00|0.0085       0.91|0.0039       0.88|0.0036
BROWSE                   [   3.40|0.01]   1.00|0.0031       0.84|0.0017       0.78|0.0021
DDERIV                   [   3.15|0.01]   1.00|0.0036       0.58|0.0026       0.51|0.0015
DERIV                    [   3.04|0.01]   1.00|0.0018       0.63|0.0012       0.53|0.0026
DESTRUCTIVE              [   2.89|0.01]   1.00|0.0021       0.84|0.0012       0.78|0.0014
DIV2-TEST-1              [   2.83|0.01]   1.00|0.0019       0.56|0.0005       0.52|0.0005
DIV2-TEST-2              [   2.91|0.01]   1.00|0.0032       0.76|0.0016       0.67|0.0012
FFT                      [   3.83|0.004]  1.00|0.0010       1.00|0.0013       1.00|0.0003
FRPOLY/FIXNUM            [   3.63|0.01]   1.00|0.0017       0.88|0.0022       0.84|0.0022
FRPOLY/BIGNUM            [   3.98|0.01]   1.00|0.0035       0.81|0.0015       0.77|0.0288
FRPOLY/FLOAT             [   3.79|0.01]   1.00|0.0016       0.89|0.0015       0.84|0.0067
PUZZLE                   [   4.35|0.01]   1.00|0.0024       1.00|0.0011       1.00|0.0009
TAK                      [   4.44|0.01]   1.00|0.0017       1.00|0.0005       1.00|0.0008
CTAK                     [   4.18|0.01]   1.00|0.0014       1.00|0.0016       1.00|0.0000
TRTAK                    [   4.45|0.002]  1.00|0.0005       1.00|0.0005       1.00|0.0011
TAKL                     [   4.30|0.01]   1.00|0.0027       1.00|0.0008       1.00|0.0005
STAK                     [   4.41|0.02]   1.00|0.0039       1.01|0.0000       1.00|0.0060
FPRINT/UGLY              [   1.87|0.01]   1.00|0.0054       1.02|0.0031       1.04|0.0070
FPRINT/PRETTY            [   2.32|0.01]   1.00|0.0032       0.95|0.0010       0.94|0.0015
TRAVERSE                 [   2.52|0.04]   1.00|0.0167       1.03|0.0092       1.02|0.0029
TRIANGLE                 [   4.65|0.01]   1.00|0.0023       1.00|0.0008       1.00|0.0011
RICHARDS                 [   4.21|0.05]   1.00|0.0107       1.04|0.0153       0.99|0.0120
FACTORIAL                [   3.87|0.01]   1.00|0.0024       0.69|0.0018       0.60|0.0037
FIB                      [   4.45|0.03]   1.00|0.0068       1.00|0.0027       1.11|0.0571
FIB-RATIO                [   4.10|0.01]   1.00|0.0020       0.85|0.0009       0.80|0.0013
ACKERMANN                [   5.26|0.003]  1.00|0.0005       1.00|0.0041       1.00|0.0020
MANDELBROT/COMPLEX       [   3.25|0.001]  1.00|0.0004       0.76|0.0021       0.70|0.0036
MANDELBROT/DFLOAT        [   3.12|0.005]  1.00|0.0015       0.76|0.0004       0.70|0.0020
BIGNUM/ELEM-100-1000     [   4.31|0.01]   1.00|0.0012       0.97|0.0008       0.94|0.0029
BIGNUM/ELEM-1000-100     [   4.23|0.01]   1.00|0.0017       0.98|0.0038       0.98|0.0014
BIGNUM/ELEM-10000-1      [   4.20|0.002]  1.00|0.0004       1.00|0.0008       1.00|0.0022
BIGNUM/PARI-100-10       [   4.28|0.004]  1.00|0.0008       0.92|0.0011       0.86|0.0006
BIGNUM/PARI-200-5        [   4.46|0.01]   1.00|0.0017       0.90|0.0006       0.86|0.0031
PI-DECIMAL/SMALL         [   4.75|0.01]   1.00|0.0017       0.84|0.0013       0.80|0.0045
PI-ATAN                  [   3.94|0.01]   1.00|0.0038       0.87|0.0043       0.82|0.0021
PI-RATIOS                [   4.71|0.01]   1.00|0.0020       0.96|0.0020       0.92|0.0017
SLURP-LINES              [   6.61|0.01]   1.00|0.0022       0.83|0.0019       0.79|0.0019
HASH-STRINGS             [   4.17|0.01]   1.00|0.0020       0.92|0.0021       0.88|0.0003
HASH-INTEGERS            [   4.07|0.01]   1.00|0.0020       0.94|0.0017       0.94|0.0024
BOEHM-GC                 [   4.03|0.01]   1.00|0.0013       0.78|0.0017       0.71|0.0011
DEFLATE-FILE             [   4.34|0.01]   1.00|0.0014       1.05|0.0019       1.00|0.0003
1D-ARRAYS                [   4.20|0.005]  1.00|0.0011       1.00|0.0006       1.00|0.0006
2D-ARRAYS                [   4.78|0.004]  1.00|0.0008       1.02|0.0057       1.01|0.0008
3D-ARRAYS                [   4.61|0.01]   1.00|0.0013       1.00|0.0027       1.00|0.0012
BITVECTORS               [   3.35|0.02]   1.00|0.0054       0.61|0.0032       0.61|0.0027
BENCH-STRINGS            [   4.09|0.01]   1.00|0.0027       1.00|0.0006       1.00|0.0009
SEARCH-SEQUENCE          [   4.49|0.004]  1.00|0.0009       1.00|0.0005       1.00|0.0039
CLOS/defclass            [   2.75|0.003]  1.00|0.0010       0.95|0.0021       0.95|0.0013
CLOS/defmethod           [   8.13|0.02]   1.00|0.0021       0.94|0.0025       0.95|0.0031
CLOS/instantiate         [   9.83|0.01]   1.00|0.0007       0.92|0.0023       0.85|0.0023
CLOS/simple-instantiate  [   0.36|0.00]   1.00|0.0000       0.84|0.0037       0.77|0.0099
CLOS/methodcalls         [   1.15|0.004]  1.00|0.0031       0.82|0.0020       0.77|0.0023
CLOS/method+after        [   8.06|0.02]   1.00|0.0019       0.95|0.0036       0.99|0.0021
CLOS/complex-methods     [   0.71|0.002]  1.00|0.0024       0.90|0.0155       0.90|0.0132
EQL-SPECIALIZED-FIB      [   4.57|0.02]   1.00|0.0042       0.99|0.0044       0.97|0.0077

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3
Impl 0.9.7.3: SBCL 0.9.7.3
Impl 0.9.7.3.gc.1: SBCL 0.9.7.3.gc.1
Impl 0.9.7.3.gc.4: SBCL 0.9.7.3.gc.4

=== Test machine ===
Machine-instance: ldb
Machine-type: X86-64
Machine-version: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
Linux ldb 2.6.14-1.1644_FC4smp #1 SMP Sun Nov 27 03:37:58 EST 2005 x86_64 x86_64 x86_64 GNU/Linux

This is SBCL 0.9.7.3-x86.gc.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3-x86       0.9.7.3-x86.gc.1  0.9.7.3-x86.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [   2.39|0.01]   1.00|0.0023       0.96|0.0004       0.97|0.0061
LOAD-FASL                [   1.47|0.002]  1.00|0.0017       0.79|0.0000       0.78|0.0031
SUM-PERMUTATIONS         [   1.47|0.02]   1.00|0.0105       0.92|0.0051       0.89|0.0020
WALK-LIST/SEQ            [   0.74|0.001]  1.00|0.0007       1.02|0.0162       1.00|0.0081
BOYER                    [   0.55|0.01]   1.00|0.0100       0.97|0.0018       0.95|0.0009
BROWSE                   [   0.99|0.002]  1.00|0.0025       0.90|0.0020       0.91|0.0061
DDERIV                   [   0.63|0.00]   1.00|0.0000       0.76|0.0040       0.67|0.0008
DERIV                    [   0.63|0.001]  1.00|0.0016       0.77|0.0032       0.68|0.0016
DESTRUCTIVE              [   0.45|0.001]  1.00|0.0011       0.92|0.0056       0.91|0.0256
DIV2-TEST-1              [   0.63|0.00]   1.00|0.0000       0.70|0.0008       0.60|0.0110
DIV2-TEST-2              [   0.92|0.001]  1.00|0.0005       0.79|0.0011       0.71|0.0011
FFT                      [   0.80|0.002]  1.00|0.0025       1.00|0.0031       1.14|0.0277
FRPOLY/FIXNUM            [   0.64|0.00]   1.00|0.0000       0.96|0.0016       1.03|0.0647
FRPOLY/BIGNUM            [   1.00|0.001]  1.00|0.0005       0.91|0.0005       0.89|0.0010
FRPOLY/FLOAT             [   0.77|0.001]  1.00|0.0019       0.99|0.0000       0.90|0.0019
PUZZLE                   [   0.55|0.00]   1.00|0.0000       1.03|0.0018       0.99|0.0009
TAK                      [   0.91|0.001]  1.00|0.0016       1.01|0.0022       1.00|0.0000
CTAK                     [   1.16|0.004]  1.00|0.0030       1.00|0.0004       1.00|0.0017
TRTAK                    [   0.91|0.001]  1.00|0.0016       1.00|0.0038       1.03|0.0208
TAKL                     [   0.63|0.00]   1.00|0.0000       1.01|0.0000       1.00|0.0000
STAK                     [   0.80|0.002]  1.00|0.0031       1.00|0.0000       1.01|0.0019
FPRINT/UGLY              [   0.40|0.004]  1.00|0.0101       0.99|0.0013       1.04|0.0177
FPRINT/PRETTY            [   0.56|0.001]  1.00|0.0018       1.01|0.0000       1.01|0.0009
TRAVERSE                 [   0.98|0.02]   1.00|0.0224       0.99|0.0025       0.99|0.0336
TRIANGLE                 [   0.79|0.004]  1.00|0.0057       1.00|0.0019       1.01|0.0113
RICHARDS                 [   0.71|0.001]  1.00|0.0021       1.00|0.0035       1.00|0.0063
FACTORIAL                [   1.16|0.01]   1.00|0.0052       0.83|0.0306       0.71|0.0013
FIB                      [   0.77|0.002]  1.00|0.0032       1.00|0.0052       1.00|0.0019
FIB-RATIO                [   0.80|0.001]  1.00|0.0006       0.91|0.0006       0.89|0.0044
ACKERMANN                [   2.89|0.03]   1.00|0.0104       1.00|0.0007       0.98|0.0024
MANDELBROT/COMPLEX       [   0.70|0.001]  1.00|0.0022       0.82|0.0029       0.76|0.0065
MANDELBROT/DFLOAT        [   0.73|0.02]   1.00|0.0338       0.78|0.0007       0.72|0.0014
MRG32K3A                 [   0.02|0.001]  1.00|0.0270       1.00|0.0270       1.03|0.0000
BIGNUM/ELEM-100-1000     [   0.63|0.001]  1.00|0.0008       1.01|0.0048       1.02|0.0048
BIGNUM/ELEM-1000-100     [   1.45|0.01]   1.00|0.0066       1.00|0.0010       1.00|0.0000
BIGNUM/ELEM-10000-1      [   2.60|0.01]   1.00|0.0023       1.00|0.0031       1.00|0.0025
BIGNUM/PARI-100-10       [   0.57|0.02]   1.00|0.0290       0.93|0.0009       0.92|0.0035
BIGNUM/PARI-200-5        [   0.65|0.001]  1.00|0.0015       0.98|0.0015       0.97|0.0046
PI-DECIMAL/SMALL         [   0.92|0.002]  1.00|0.0027       0.93|0.0005       0.92|0.0011
PI-ATAN                  [   1.18|0.001]  1.00|0.0013       0.91|0.0017       0.88|0.0017
PI-RATIOS                [   1.24|0.001]  1.00|0.0008       0.97|0.0008       0.97|0.0048
HASH-STRINGS             [   0.87|0.002]  1.00|0.0023       0.96|0.0052       0.96|0.0011
HASH-INTEGERS            [   0.71|0.002]  1.00|0.0035       0.97|0.0000       0.96|0.0021
BOEHM-GC                 [   1.74|0.001]  1.00|0.0003       0.87|0.0012       0.82|0.0003
DEFLATE-FILE             [   0.93|0.01]   1.00|0.0123       0.99|0.0027       0.99|0.0048
1D-ARRAYS                [   0.96|0.002]  1.00|0.0021       1.00|0.0021       1.00|0.0010
2D-ARRAYS                [   1.21|0.004]  1.00|0.0037       1.01|0.0008       1.00|0.0033
3D-ARRAYS                [   2.41|0.01]   1.00|0.0027       1.00|0.0021       1.03|0.0259
BITVECTORS               [   0.94|0.002]  1.00|0.0021       0.90|0.0196       0.77|0.0365
BENCH-STRINGS            [   1.03|0.005]  1.00|0.0048       1.02|0.0029       1.01|0.0136
SEARCH-SEQUENCE          [   0.54|0.001]  1.00|0.0009       1.01|0.0028       1.00|0.0056
CLOS/defclass            [   3.11|0.03]   1.00|0.0103       0.97|0.0021       0.98|0.0069
CLOS/defmethod           [   8.79|0.05]   1.00|0.0061       0.98|0.0098       0.98|0.0002
CLOS/instantiate         [   5.57|0.05]   1.00|0.0088       0.89|0.0066       0.92|0.0019
CLOS/simple-instantiate  [   0.58|0.003]  1.00|0.0052       0.82|0.0035       0.75|0.0017
CLOS/methodcalls         [   0.57|0.001]  1.00|0.0026       0.87|0.0009       0.87|0.0097
CLOS/method+after        [   3.93|0.003]  1.00|0.0008       0.97|0.0024       0.98|0.0024
CLOS/complex-methods     [   0.38|0.003]  1.00|0.0079       0.98|0.0157       0.94|0.0013
EQL-SPECIALIZED-FIB      [   0.93|0.002]  1.00|0.0021       1.00|0.0048       1.01|0.0027

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86.gc.1: SBCL 0.9.7.3-x86.gc.1
Impl 0.9.7.3-x86.gc.4: SBCL 0.9.7.3-x86.gc.4

=== Test machine ===
Machine-instance: xyz
Machine-type: X86
Machine-version: Intel(R) Pentium(R) 4 CPU 2.80GHz
Linux xyz 2.6.8 #1 SMP Thu Oct 6 17:02:15 EEST 2005 i686 GNU/Linux

This is SBCL 0.9.7.3-x86.gc.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3-x86       0.9.7.3-x86.gc.1  0.9.7.3-x86.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [   6.98|0.02]   1.00|0.0029       0.99|0.0011       0.98|0.0058
LOAD-FASL                [   4.08|0.02]   1.00|0.0043       1.00|0.0058       1.03|0.0020
SUM-PERMUTATIONS         [   5.15|0.01]   1.00|0.0028       1.00|0.0008       1.01|0.0006
WALK-LIST/SEQ            [   8.22|0.001]  1.00|0.0001       1.00|0.0004       1.00|0.0000
BOYER                    [   1.87|0.04]   1.00|0.0217       0.99|0.0032       0.95|0.0029
BROWSE                   [   3.36|0.002]  1.00|0.0007       0.97|0.0003       0.93|0.0003
DDERIV                   [   2.30|0.01]   1.00|0.0026       0.94|0.0007       0.79|0.0002
DERIV                    [   2.16|0.001]  1.00|0.0007       0.95|0.0000       0.79|0.0005
DESTRUCTIVE              [   1.56|0.001]  1.00|0.0006       0.98|0.0003       0.91|0.0006
DIV2-TEST-1              [   2.63|0.001]  1.00|0.0006       0.87|0.0004       0.78|0.0000
DIV2-TEST-2              [   2.93|0.001]  1.00|0.0005       0.93|0.0005       0.75|0.0007
FFT                      [   2.41|0.004]  1.00|0.0017       1.00|0.0004       1.00|0.0004
FRPOLY/FIXNUM            [   2.12|0.002]  1.00|0.0009       0.98|0.0007       0.96|0.0002
FRPOLY/BIGNUM            [   3.34|0.001]  1.00|0.0001       0.97|0.0001       0.98|0.0004
FRPOLY/FLOAT             [   2.67|0.001]  1.00|0.0006       0.96|0.0000       1.03|0.0004
PUZZLE                   [   2.82|0.001]  1.00|0.0002       0.93|0.0005       0.93|0.0004
TAK                      [   2.33|0.001]  1.00|0.0002       1.00|0.0004       1.08|0.0000
CTAK                     [   2.86|0.001]  1.00|0.0002       1.03|0.0000       1.03|0.0000
TRTAK                    [   2.34|0.001]  1.00|0.0002       1.08|0.0004       1.08|0.0006
TAKL                     [   2.46|0.001]  1.00|0.0004       1.09|0.0004       1.07|0.0000
STAK                     [   2.23|0.001]  1.00|0.0004       0.96|0.0004       0.98|0.0004
FPRINT/UGLY              [   1.20|0.001]  1.00|0.0004       1.16|0.0000       1.03|0.0004
FPRINT/PRETTY            [   1.70|0.00]   1.00|0.0000       1.03|0.0006       1.03|0.0000
TRAVERSE                 [   4.91|0.002]  1.00|0.0004       1.01|0.0103       0.97|0.0030
TRIANGLE                 [   2.45|0.001]  1.00|0.0004       1.08|0.0004       1.08|0.0002
RICHARDS                 [   2.16|0.001]  1.00|0.0002       1.02|0.0007       1.09|0.0007
FACTORIAL                [   3.77|0.00]   1.00|0.0000       0.94|0.0011       0.80|0.0017
FIB                      [   2.41|0.001]  1.00|0.0002       1.00|0.0004       1.00|0.0002
FIB-RATIO                [   3.16|0.001]  1.00|0.0002       0.93|0.0002       0.96|0.0008
ACKERMANN                [  17.54|0.16]   1.00|0.0090       1.01|0.0099       1.01|0.0099
MANDELBROT/COMPLEX       [   2.72|0.001]  1.00|0.0006       0.94|0.0002       0.91|0.0004
MANDELBROT/DFLOAT        [   2.70|0.001]  1.00|0.0002       0.94|0.0002       0.98|0.0004
MRG32K3A                 [   0.07|0.00]   1.00|0.0000       1.06|0.0000       1.00|0.0000
BIGNUM/ELEM-100-1000     [   1.72|0.001]  1.00|0.0006       1.00|0.0003       1.00|0.0000
BIGNUM/ELEM-1000-100     [   3.52|0.001]  1.00|0.0001       1.00|0.0001       1.00|0.0007
BIGNUM/ELEM-10000-1      [   5.48|0.001]  1.00|0.0001       1.00|0.0001       1.00|0.0000
BIGNUM/PARI-100-10       [   1.63|0.001]  1.00|0.0006       0.99|0.0003       0.98|0.0009
BIGNUM/PARI-200-5        [   1.70|0.001]  1.00|0.0003       1.00|0.0003       0.98|0.0006
PI-DECIMAL/SMALL         [   2.48|0.00]   1.00|0.0000       0.98|0.0008       0.94|0.0002
PI-ATAN                  [   3.53|0.001]  1.00|0.0001       0.97|0.0006       0.86|0.0003
PI-RATIOS                [   3.40|0.001]  1.00|0.0003       0.99|0.0007       0.98|0.0004
HASH-STRINGS             [   2.71|0.001]  1.00|0.0004       1.01|0.0002       0.99|0.0002
HASH-INTEGERS            [   2.65|0.00]   1.00|0.0000       1.02|0.0006       0.99|0.0000
BOEHM-GC                 [   6.08|0.01]   1.00|0.0011       0.96|0.0001       0.83|0.0002
DEFLATE-FILE             [   3.12|0.001]  1.00|0.0002       1.00|0.0002       1.00|0.0003
1D-ARRAYS                [   3.50|0.001]  1.00|0.0003       1.00|0.0021       1.01|0.0006
2D-ARRAYS                [   4.30|0.01]   1.00|0.0013       0.99|0.0103       0.98|0.0000
3D-ARRAYS                [   5.89|0.01]   1.00|0.0012       1.02|0.0107       1.02|0.0002
BITVECTORS               [   5.62|0.08]   1.00|0.0133       1.42|0.0017       1.43|0.0066
BENCH-STRINGS            [   9.02|0.00]   1.00|0.0000       1.00|0.0019       1.00|0.0006
SEARCH-SEQUENCE          [   2.65|0.00]   1.00|0.0000       1.02|0.0183       1.04|0.0000
CLOS/defclass            [   9.02|0.02]   1.00|0.0019       0.99|0.0033       0.98|0.0020
CLOS/defmethod           [  24.21|0.03]   1.00|0.0013       0.98|0.0011       0.99|0.0001
CLOS/instantiate         [  17.35|0.01]   1.00|0.0006       0.97|0.0058       0.89|0.0055
CLOS/simple-instantiate  [   2.03|0.00]   1.00|0.0000       0.95|0.0012       0.83|0.0005
CLOS/methodcalls         [   2.14|0.001]  1.00|0.0002       0.96|0.0047       0.93|0.0007
CLOS/method+after        [  11.59|0.01]   1.00|0.0008       0.98|0.0006       0.97|0.0002
CLOS/complex-methods     [   1.10|0.001]  1.00|0.0005       0.94|0.0077       0.90|0.0005
EQL-SPECIALIZED-FIB      [   3.32|0.001]  1.00|0.0003       0.96|0.0065       1.00|0.0002

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86.gc.1: SBCL 0.9.7.3-x86.gc.1
Impl 0.9.7.3-x86.gc.4: SBCL 0.9.7.3-x86.gc.4

=== Test machine ===
Machine-instance: pop
Machine-type: X86
Machine-version: Pentium III (Coppermine)
Linux pop 2.6.10-5-686 #1 Fri Jun 24 17:33:34 UTC 2005 i686 GNU/Linux

This is SBCL 0.9.7.3-x86.gc.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3-x86       0.9.7.3-x86.gc.1  0.9.7.3-x86.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [  80.19|0.07]   1.00|0.0009       0.98|0.0049       0.97|0.0080
LOAD-FASL                [  42.54|0.16]   1.00|0.0038       1.06|0.0230       1.00|0.0125
SUM-PERMUTATIONS         [  51.39|0.19]   1.00|0.0037       1.10|0.0427       0.97|0.0149
WALK-LIST/SEQ            [  26.45|0.02]   1.00|0.0008       1.01|0.0046       1.00|0.0011
BOYER                    [  17.76|0.26]   1.00|0.0148       1.07|0.0119       0.96|0.0083
BROWSE                   [  44.53|0.28]   1.00|0.0062       0.71|0.0207       0.78|0.0156
DDERIV                   [  22.91|0.71]   1.00|0.0310       0.83|0.0025       0.81|0.0037
DERIV                    [  19.17|0.03]   1.00|0.0015       0.95|0.0508       0.91|0.0092
DESTRUCTIVE              [  12.64|0.05]   1.00|0.0036       0.95|0.0022       0.95|0.0041
DIV2-TEST-1              [  20.34|0.04]   1.00|0.0020       0.89|0.0015       0.88|0.0037
DIV2-TEST-2              [  21.48|0.09]   1.00|0.0042       0.92|0.0127       0.88|0.0012
FFT                      [  22.74|0.30]   1.00|0.0130       1.00|0.0076       1.00|0.0069
FRPOLY/FIXNUM            [  16.71|0.24]   1.00|0.0141       0.97|0.0047       1.00|0.0062
FRPOLY/BIGNUM            [  35.64|0.96]   1.00|0.0269       0.95|0.0074       0.94|0.0137
FRPOLY/FLOAT             [  27.06|2.44]   1.00|0.0903       0.87|0.0004       1.09|0.0309
PUZZLE                   [  20.82|0.31]   1.00|0.0147       1.01|0.0061       1.02|0.0221
TAK                      [  11.72|0.004]  1.00|0.0003       1.00|0.0008       1.00|0.0008
CTAK                     [  15.55|0.01]   1.00|0.0008       1.00|0.0010       1.00|0.0054
TRTAK                    [  11.75|0.001]  1.00|0.00004      1.00|0.0014       1.00|0.0006
TAKL                     [  11.84|0.02]   1.00|0.0015       0.99|0.0013       1.00|0.0013
STAK                     [  13.46|0.05]   1.00|0.0040       0.99|0.00004      1.01|0.0024
FPRINT/UGLY              [  27.90|0.07]   1.00|0.0023       1.06|0.0654       1.03|0.0313
FPRINT/PRETTY            [  28.92|0.32]   1.00|0.0112       1.00|0.0289       0.98|0.0145
TRAVERSE                 [  21.02|0.08]   1.00|0.0038       1.00|0.0049       1.00|0.0005
TRIANGLE                 [  12.51|0.07]   1.00|0.0052       1.00|0.0029       1.00|0.0001
RICHARDS                 [  15.87|0.001]  1.00|0.00003      1.00|0.0029       1.02|0.0067
FACTORIAL                [  36.16|0.98]   1.00|0.0272       0.92|0.0017       0.87|0.0020
FIB                      [  11.95|0.04]   1.00|0.0034       1.00|0.0005       1.00|0.0000
FIB-RATIO                [  50.59|3.09]   1.00|0.0610       0.86|0.0135       0.88|0.0254
ACKERMANN                [ 110.24|0.58]   1.00|0.0052       0.98|0.0045       0.99|0.0018
MANDELBROT/COMPLEX       [  29.92|0.51]   1.00|0.0171       0.91|0.00003      0.90|0.0120
MANDELBROT/DFLOAT        [  26.53|0.03]   1.00|0.0011       0.87|0.0307       0.87|0.0137
MRG32K3A                 [   1.13|0.02]   1.00|0.0194       1.05|0.0442       0.97|0.0053
BIGNUM/ELEM-100-1000     [  16.58|0.10]   1.00|0.0061       0.99|0.0210       0.99|0.0093
BIGNUM/ELEM-1000-100     [  32.88|0.11]   1.00|0.0035       1.00|0.0033       1.00|0.0007
BIGNUM/ELEM-10000-1      [  59.27|0.28]   1.00|0.0047       1.01|0.0002       1.01|0.0035
BIGNUM/PARI-100-10       [  16.46|0.36]   1.00|0.0218       0.96|0.0207       0.97|0.0051
BIGNUM/PARI-200-5        [  16.69|0.16]   1.00|0.0097       1.00|0.0095       0.98|0.0063
PI-DECIMAL/SMALL         [  25.73|0.33]   1.00|0.0128       0.95|0.0157       0.96|0.0149
PI-ATAN                  [  29.90|0.14]   1.00|0.0047       0.96|0.0170       0.92|0.0068
PI-RATIOS                [  33.22|0.41]   1.00|0.0124       0.95|0.0198       0.98|0.0152
HASH-STRINGS             [  23.92|0.10]   1.00|0.0043       0.98|0.0005       0.95|0.0057
HASH-INTEGERS            [  20.49|0.68]   1.00|0.0331       1.01|0.0048       1.03|0.0506
BOEHM-GC                 [  45.74|0.33]   1.00|0.0072       0.92|0.0019       0.94|0.0161
DEFLATE-FILE             [  26.15|0.06]   1.00|0.0021       0.97|0.0024       0.96|0.0241
1D-ARRAYS                [  25.06|0.15]   1.00|0.0059       0.99|0.0014       1.00|0.0041
2D-ARRAYS                [  53.27|0.08]   1.00|0.0014       1.02|0.0030       1.01|0.0053
3D-ARRAYS                [  84.24|0.22]   1.00|0.0026       1.01|0.0042       1.00|0.0023
BITVECTORS               [  39.24|0.48]   1.00|0.0122       1.05|0.0051       1.06|0.0048
BENCH-STRINGS            [  42.06|0.48]   1.00|0.0113       0.99|0.0004       1.01|0.0074
SEARCH-SEQUENCE          [  23.52|0.04]   1.00|0.0019       1.00|0.0005       1.00|0.0003
CLOS/defclass            [ 113.57|0.26]   1.00|0.0023       0.99|0.0114       0.96|0.0074
CLOS/defmethod           [ 274.55|0.59]   1.00|0.0022       1.00|0.0025       1.01|0.0022
CLOS/instantiate         [ 251.53|9.98]   1.00|0.0397       1.03|0.0021       0.96|0.0193
CLOS/simple-instantiate  [  17.24|0.06]   1.00|0.0037       1.12|0.0347       0.97|0.0457
CLOS/methodcalls         [  27.75|0.54]   1.00|0.0194       0.94|0.0365       0.94|0.0102
CLOS/method+after        [ 138.57|0.83]   1.00|0.0060       0.98|0.0079       0.99|0.0059
CLOS/complex-methods     [  24.19|0.62]   1.00|0.0257       0.98|0.0608       0.99|0.0623
EQL-SPECIALIZED-FIB      [  21.53|0.89]   1.00|0.0412       0.92|0.0180       0.93|0.0042

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86.gc.1: SBCL 0.9.7.3-x86.gc.1
Impl 0.9.7.3-x86.gc.4: SBCL 0.9.7.3-x86.gc.4

=== Test machine ===
Machine-instance: dogbert
Machine-type: X86
Machine-version: Pentium 75 - 200
Linux dogbert 2.6.11-1-386 #1 Sun Jun 12 11:09:43 MDT 2005 i586 GNU/Linux

This is SBCL 0.9.7.3-x86.gc.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3-x86       0.9.7.3-x86.gc.1  0.9.7.3-x86.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [   7.67|0.13]   1.00|0.0165       0.95|0.0000       0.97|0.0179
LOAD-FASL                [   3.99|0.14]   1.00|0.0351       0.97|0.0053       1.01|0.0385
SUM-PERMUTATIONS         [   3.83|0.01]   1.00|0.0014       0.94|0.0009       0.97|0.0009
WALK-LIST/SEQ            [   6.24|0.002]  1.00|0.0004       1.00|0.0004       1.00|0.0002
BOYER                    [   1.44|0.00]   1.00|0.0000       0.96|0.0007       1.00|0.0017
BROWSE                   [   2.26|0.001]  1.00|0.0002       1.13|0.0011       0.99|0.0002
DDERIV                   [   1.94|0.005]  1.00|0.0026       0.76|0.0000       0.92|0.0005
DERIV                    [   1.85|0.001]  1.00|0.0008       0.78|0.0008       0.93|0.0000
DESTRUCTIVE              [   1.18|0.003]  1.00|0.0025       0.93|0.0000       0.94|0.0013
DIV2-TEST-1              [   2.32|0.01]   1.00|0.0026       0.74|0.0006       0.92|0.0002
DIV2-TEST-2              [   2.33|0.02]   1.00|0.0071       0.87|0.0006       0.94|0.0002
FFT                      [   1.63|0.004]  1.00|0.0022       1.00|0.0003       1.00|0.0012
FRPOLY/FIXNUM            [   1.55|0.004]  1.00|0.0029       0.96|0.0019       0.98|0.0039
FRPOLY/BIGNUM            [   2.46|0.07]   1.00|0.0299       0.90|0.0031       0.95|0.0006
FRPOLY/FLOAT             [   1.96|0.005]  1.00|0.0026       0.96|0.0020       1.03|0.0020
PUZZLE                   [   1.81|0.002]  1.00|0.0014       1.05|0.0003       1.01|0.0003
TAK                      [   1.72|0.002]  1.00|0.0012       1.01|0.0003       1.02|0.0023
CTAK                     [   1.65|0.001]  1.00|0.0006       1.00|0.0006       0.99|0.0003
TRTAK                    [   1.73|0.001]  1.00|0.0003       1.01|0.0006       1.00|0.0017
TAKL                     [   1.70|0.001]  1.00|0.0003       0.92|0.0009       0.91|0.0018
STAK                     [   1.73|0.005]  1.00|0.0029       0.98|0.0003       0.98|0.0017
FPRINT/UGLY              [   0.82|0.002]  1.00|0.0025       1.15|0.0006       1.13|0.0006
FPRINT/PRETTY            [   1.07|0.001]  1.00|0.0005       1.02|0.0037       1.13|0.0000
TRAVERSE                 [   7.16|0.01]   1.00|0.0018       1.00|0.0001       1.00|0.0001
TRIANGLE                 [   1.83|0.001]  1.00|0.0005       1.00|0.0000       1.00|0.0005
RICHARDS                 [   1.43|0.02]   1.00|0.0112       1.11|0.0974       1.00|0.0018
FACTORIAL                [   3.11|0.01]   1.00|0.0026       0.81|0.0066       0.91|0.0008
FIB                      [   1.68|0.01]   1.00|0.0045       1.00|0.0024       1.00|0.0009
FIB-RATIO                [   2.19|0.001]  1.00|0.0005       0.84|0.0027       0.88|0.0023
ACKERMANN                [  16.72|0.07]   1.00|0.0042       1.00|0.0008       0.99|0.0085
MANDELBROT/COMPLEX       [   2.03|0.005]  1.00|0.0025       0.85|0.0079       0.91|0.0010
MANDELBROT/DFLOAT        [   2.14|0.005]  1.00|0.0023       0.85|0.0033       0.99|0.0028
MRG32K3A                 [   0.05|0.00]   1.00|0.0000       1.03|0.0294       0.98|0.0000
BIGNUM/ELEM-100-1000     [   1.13|0.001]  1.00|0.0009       1.04|0.0009       1.04|0.0018
BIGNUM/ELEM-1000-100     [   2.20|0.001]  1.00|0.0007       1.00|0.0002       1.00|0.0005
BIGNUM/ELEM-10000-1      [   3.12|0.001]  1.00|0.0002       1.00|0.0003       1.00|0.0002
BIGNUM/PARI-100-10       [   1.04|0.001]  1.00|0.0014       1.01|0.0029       1.01|0.0000
BIGNUM/PARI-200-5        [   1.09|0.001]  1.00|0.0005       1.00|0.0009       1.00|0.0005
PI-DECIMAL/SMALL         [   1.57|0.00]   1.00|0.0000       0.96|0.0006       0.97|0.0003
PI-ATAN                  [   2.45|0.001]  1.00|0.0004       0.98|0.0000       0.95|0.0006
PI-RATIOS                [   2.23|0.001]  1.00|0.0002       0.99|0.0002       0.99|0.0002
HASH-STRINGS             [   2.22|0.01]   1.00|0.0034       1.00|0.0002       1.00|0.0004
HASH-INTEGERS            [   2.21|0.001]  1.00|0.0002       1.01|0.0000       0.99|0.0000
BOEHM-GC                 [   4.49|0.06]   1.00|0.0130       0.89|0.0129       0.96|0.0001
DEFLATE-FILE             [   2.04|0.004]  1.00|0.0020       1.04|0.0005       0.95|0.0005
1D-ARRAYS                [   2.15|0.001]  1.00|0.0005       1.00|0.0002       0.99|0.0002
2D-ARRAYS                [   4.14|0.06]   1.00|0.0151       1.00|0.0014       0.98|0.0006
3D-ARRAYS                [   5.84|0.20]   1.00|0.0345       0.98|0.0003       0.97|0.0001
BITVECTORS               [   7.07|0.04]   1.00|0.0052       1.05|0.0022       1.04|0.0114
BENCH-STRINGS            [   6.75|0.26]   1.00|0.0391       0.96|0.0010       0.96|0.0010
SEARCH-SEQUENCE          [   2.26|0.001]  1.00|0.0007       1.00|0.0011       1.00|0.0000
CLOS/defclass            [   9.71|0.13]   1.00|0.0139       0.97|0.0152       0.97|0.0113
CLOS/defmethod           [  24.43|0.02]   1.00|0.0009       0.97|0.0013       0.99|0.0006
CLOS/instantiate         [  18.08|0.04]   1.00|0.0023       1.00|0.0060       0.97|0.0004
CLOS/simple-instantiate  [   1.64|0.004]  1.00|0.0024       0.82|0.0024       0.93|0.0003
CLOS/methodcalls         [   1.39|0.004]  1.00|0.0025       0.90|0.0061       0.95|0.0000
CLOS/method+after        [   9.56|0.03]   1.00|0.0028       0.95|0.0021       0.96|0.0006
CLOS/complex-methods     [   0.80|0.05]   1.00|0.0609       0.99|0.0333       1.04|0.0000
EQL-SPECIALIZED-FIB      [   2.16|0.004]  1.00|0.0021       0.99|0.0090       0.99|0.0014

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86.gc.1: SBCL 0.9.7.3-x86.gc.1
Impl 0.9.7.3-x86.gc.4: SBCL 0.9.7.3-x86.gc.4

=== Test machine ===
Machine-instance: caladan
Machine-type: X86
Machine-version: AMD Duron(tm) Processor
Linux caladan 2.6.14-1-k7 #1 Tue Nov 1 16:19:43 JST 2005 i686 GNU/Linux

This is SBCL 0.9.7.3-x86.gc.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.

Benchmark                Reference        0.9.7.3-x86       0.9.7.3-x86.gc.1  0.9.7.3-x86.gc.4
-----------------------------------------------------------------------------------------
COMPILER                 [   6.06|0.41]   1.00|0.0677       0.93|0.0108       0.98|0.0670
LOAD-FASL                [   3.10|0.04]   1.00|0.0122       1.01|0.0031       1.02|0.0417
SUM-PERMUTATIONS         [   3.14|0.07]   1.00|0.0221       1.03|0.0035       0.97|0.0251
WALK-LIST/SEQ            [   6.62|0.20]   1.00|0.0302       0.97|0.0029       1.00|0.0362
BOYER                    [   0.96|0.02]   1.00|0.0224       0.99|0.0037       1.00|0.0318
BROWSE                   [   1.80|0.03]   1.00|0.0181       1.03|0.0028       0.99|0.0245
DDERIV                   [   1.68|0.05]   1.00|0.0316       0.83|0.0030       0.91|0.0307
DERIV                    [   1.60|0.04]   1.00|0.0272       0.85|0.0034       0.92|0.0306
DESTRUCTIVE              [   1.05|0.02]   1.00|0.0143       0.94|0.0124       0.94|0.0110
DIV2-TEST-1              [   2.07|0.06]   1.00|0.0310       0.80|0.0034       0.90|0.0283
DIV2-TEST-2              [   2.04|0.04]   1.00|0.0216       0.93|0.0032       0.93|0.0299
FFT                      [   1.47|0.02]   1.00|0.0126       0.99|0.0027       1.01|0.0140
FRPOLY/FIXNUM            [   1.30|0.01]   1.00|0.0084       1.03|0.0161       0.97|0.0150
FRPOLY/BIGNUM            [   1.96|0.10]   1.00|0.0497       1.00|0.0018       0.93|0.0211
FRPOLY/FLOAT             [   1.60|0.04]   1.00|0.0231       1.02|0.0143       1.02|0.0112
PUZZLE                   [   1.64|0.03]   1.00|0.0153       0.99|0.0067       1.00|0.0153
TAK                      [   1.48|0.09]   1.00|0.0590       1.04|0.0027       1.06|0.0105
CTAK                     [   1.49|0.02]   1.00|0.0144       0.98|0.0030       0.98|0.0138
TRTAK                    [   1.53|0.04]   1.00|0.0275       1.02|0.0062       1.02|0.0065
TAKL                     [   1.53|0.02]   1.00|0.0101       0.95|0.0393       0.92|0.0095
STAK                     [   1.55|0.02]   1.00|0.0122       1.01|0.0006       0.99|0.0148
FPRINT/UGLY              [   0.74|0.02]   1.00|0.0284       1.10|0.0061       1.12|0.0304
FPRINT/PRETTY            [   0.95|0.02]   1.00|0.0257       0.98|0.0058       1.12|0.0288
TRAVERSE                 [   2.91|0.17]   1.00|0.0577       0.96|0.0105       1.01|0.0543
TRIANGLE                 [   1.65|0.01]   1.00|0.0088       1.02|0.0408       1.03|0.0466
RICHARDS                 [   1.25|0.004]  1.00|0.0028       1.01|0.0036       0.98|0.0016
FACTORIAL                [   2.73|0.09]   1.00|0.0346       0.86|0.0029       0.92|0.0285
FIB                      [   1.53|0.01]   1.00|0.0066       0.99|0.0010       1.00|0.0115
FIB-RATIO                [   1.80|0.03]   1.00|0.0145       0.93|0.0150       0.85|0.0095
ACKERMANN                [  11.11|0.66]   1.00|0.0593       0.94|0.0074       1.02|0.0480
MANDELBROT/COMPLEX       [   1.67|0.05]   1.00|0.0329       0.93|0.0039       0.90|0.0275
MANDELBROT/DFLOAT        [   1.73|0.06]   1.00|0.0344       0.95|0.0052       0.93|0.0555
MRG32K3A                 [   0.04|0.002]  1.00|0.0488       1.04|0.0610       0.99|0.0122
BIGNUM/ELEM-100-1000     [   1.00|0.02]   1.00|0.0151       1.04|0.0025       1.03|0.0126
BIGNUM/ELEM-1000-100     [   1.97|0.02]   1.00|0.0112       1.01|0.0018       1.01|0.0145
BIGNUM/ELEM-10000-1      [   2.79|0.03]   1.00|0.0099       1.00|0.0011       1.00|0.0122
BIGNUM/PARI-100-10       [   0.91|0.01]   1.00|0.0149       1.04|0.0039       1.00|0.0160
BIGNUM/PARI-200-5        [   0.95|0.02]   1.00|0.0179       1.04|0.0037       0.99|0.0121
PI-DECIMAL/SMALL         [   1.37|0.02]   1.00|0.0135       1.00|0.0004       0.97|0.0171
PI-ATAN                  [   2.22|0.04]   1.00|0.0198       1.02|0.0043       0.94|0.0164
PI-RATIOS                [   1.96|0.03]   1.00|0.0143       1.00|0.0049       0.99|0.0151
HASH-STRINGS             [   1.96|0.03]   1.00|0.0135       0.97|0.0033       0.98|0.0342
HASH-INTEGERS            [   2.03|0.05]   1.00|0.0269       0.98|0.0049       0.98|0.0306
BOEHM-GC                 [   3.53|0.08]   1.00|0.0229       1.00|0.0033       0.94|0.0304
DEFLATE-FILE             [   1.76|0.02]   1.00|0.0128       1.01|0.0028       0.99|0.0284
1D-ARRAYS                [   1.88|0.03]   1.00|0.0144       0.99|0.0064       0.98|0.0130
2D-ARRAYS                [   3.05|0.07]   1.00|0.0219       1.01|0.0082       1.01|0.0219
3D-ARRAYS                [   4.50|0.09]   1.00|0.0189       1.01|0.0056       0.99|0.0208
BITVECTORS               [   4.60|0.07]   1.00|0.0151       1.30|0.0003       1.29|0.0909
BENCH-STRINGS            [   5.77|0.01]   1.00|0.0017       1.00|0.0048       1.03|0.0328
SEARCH-SEQUENCE          [   2.00|0.01]   1.00|0.0065       0.99|0.0002       1.01|0.0197
CLOS/defclass            [   6.64|0.004]  1.00|0.0006       0.99|0.0037       1.04|0.0573
CLOS/defmethod           [  19.27|0.18]   1.00|0.0094       1.01|0.0148       1.03|0.0405
CLOS/instantiate         [  10.10|0.09]   1.00|0.0090       1.11|0.0578       1.02|0.0100
CLOS/simple-instantiate  [   1.34|0.004]  1.00|0.0033       0.95|0.0234       0.92|0.0007
CLOS/methodcalls         [   1.12|0.004]  1.00|0.0040       1.01|0.0215       0.93|0.0058
CLOS/method+after        [   7.60|0.09]   1.00|0.0118       1.02|0.0372       0.97|0.0025
CLOS/complex-methods     [   0.67|0.01]   1.00|0.0097       1.07|0.0307       1.12|0.0464
EQL-SPECIALIZED-FIB      [   1.92|0.02]   1.00|0.0115       1.02|0.0104       0.98|0.0021

Reference time in first column is in seconds; other columns are relative
Reference implementation: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86: SBCL 0.9.7.3-x86
Impl 0.9.7.3-x86.gc.1: SBCL 0.9.7.3-x86.gc.1
Impl 0.9.7.3-x86.gc.4: SBCL 0.9.7.3-x86.gc.4

=== Test machine ===
Machine-instance: arrakis
Machine-type: X86
Machine-version: AMD Athlon(tm) Processor
Linux arrakis 2.6.14-2-k7 #1 Sat Nov 26 14:04:05 UTC 2005 i686 GNU/Linux
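
Each report above says "Reference time in first column is in seconds; other columns are relative". As a worked illustration of that arithmetic, here is a small C sketch. It assumes the relative figure is run time divided by the reference run time, so that values below 1.00 mean faster; the function names are made up for this example, and the number after the `|` in each cell is not interpreted here.

```c
#include <stdio.h>

/* Absolute run time in seconds, assuming `relative` is run time
 * divided by the reference run time (an assumption, see above). */
static double absolute_seconds(double reference_s, double relative)
{
    return reference_s * relative;
}

/* Percentage by which run time shrank relative to the reference;
 * a negative result means the benchmark got slower. */
static double percent_faster(double relative)
{
    return (1.0 - relative) * 100.0;
}

/* Hypothetical helper: print one benchmark row in absolute terms. */
static void print_row(const char *name, double reference_s, double relative)
{
    printf("%-12s %6.2f s -> %6.2f s (%+.0f%%)\n",
           name, reference_s,
           absolute_seconds(reference_s, relative),
           -percent_faster(relative));
}
```

For example, the DDERIV row on the Athlon 64 X2 has a reference time of 3.15 s and 0.51 under 0.9.7.3.gc.4, so absolute_seconds(3.15, 0.51) comes to about 1.61 s and percent_faster(0.51) to about 49. The BITVECTORS row on the Pentium III (1.42 under gc.1) gives a negative percent_faster, i.e. a regression.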