caml-list - the Caml user's mailing list
* [Caml-list] Timing Ocaml
@ 2002-06-10  5:35 Blair Zajac
  2002-06-10  6:24 ` Chris Hecker
  2002-06-10 15:01 ` Xavier Leroy
  0 siblings, 2 replies; 13+ messages in thread
From: Blair Zajac @ 2002-06-10  5:35 UTC (permalink / raw)
  To: Caml Mailing List

Reading that the bytecode interpreter for Ocaml runs 2/3 as fast
when compiled with VC 6 compared to gcc, has anybody done any
timing comparisons with VisualStudio.Net, Intel C++ 5.x or
Intel C++ 6.0?

If I were to do these timing tests with these compilers and
with different gcc versions (2.95.3, 3.0.4 and 3.1) which
script/program should I use to get a fair estimate of the
compiler?

Also, in INSTALL, it says

* The GNU C compiler gcc is recommended, as the bytecode
  interpreter takes advantage of gcc-specific features to enhance
  performance.

What is the nature of these optimizations?

Blair

-- 
Blair Zajac <blair@orcaware.com>
Web and OS performance plots - http://www.orcaware.com/orca/
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] Timing Ocaml
  2002-06-10  5:35 [Caml-list] Timing Ocaml Blair Zajac
@ 2002-06-10  6:24 ` Chris Hecker
  2002-06-10 12:02   ` Dmitry Bely
  2002-06-10 15:01 ` Xavier Leroy
  1 sibling, 1 reply; 13+ messages in thread
From: Chris Hecker @ 2002-06-10  6:24 UTC (permalink / raw)
  To: Blair Zajac, Caml Mailing List


>* The GNU C compiler gcc is recommended, as the bytecode
>   interpreter takes advantage of gcc-specific features to enhance
>   performance.
>What is the nature of these optimizations?

GCC lets you take the address of a label.  You can see in byterun/interp.c 
that it uses a jump table instead of a switch when you're using GCC.

At least, that's what it looks like.
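
As a rough illustration of the difference described above, here is a minimal sketch of a dispatch loop that uses a jump table of label addresses under gcc and a plain switch elsewhere. The three opcodes, the run function, and the NO_COMPUTED_GOTO and USE_JUMP_TABLE macros are invented for this example; it is not the code from byterun/interp.c.

```c
/* Hypothetical three-instruction bytecode, for illustration only. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Use labels-as-values when the compiler offers them (gcc and
   compatible compilers); otherwise fall back to an ordinary switch. */
#if defined(__GNUC__) && !defined(NO_COMPUTED_GOTO)
#  define USE_JUMP_TABLE 1
#endif

long run(const unsigned char *pc)
{
  long accu = 0;
#ifdef USE_JUMP_TABLE
  /* One entry per opcode, in enum order.  &&label is a gcc extension.
     An out-of-range opcode would index past the table, so the code
     must be trusted or validated beforehand. */
  static void *const jump_table[] = { &&do_inc, &&do_double, &&do_halt };
#  define DISPATCH() goto *jump_table[*pc++]
  DISPATCH();
do_inc:    accu += 1; DISPATCH();
do_double: accu *= 2; DISPATCH();
do_halt:   return accu;
#  undef DISPATCH
#else
  for (;;) {
    switch (*pc++) {
    case OP_INC:    accu += 1; break;
    case OP_DOUBLE: accu *= 2; break;
    case OP_HALT:   return accu;
    default:        return -1;   /* unknown opcode */
    }
  }
#endif
}
```

Calling run on the program { OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT } returns 5 on either path; only the way each handler reaches the next one differs.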

Chris


* Re: [Caml-list] Timing Ocaml
  2002-06-10  6:24 ` Chris Hecker
@ 2002-06-10 12:02   ` Dmitry Bely
  2002-06-10 12:50     ` Remi VANICAT
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Bely @ 2002-06-10 12:02 UTC (permalink / raw)
  To: caml-list

Chris Hecker <checker@d6.com> writes:

>>* The GNU C compiler gcc is recommended, as the bytecode
>>   interpreter takes advantage of gcc-specific features to enhance
>>   performance.
>>What is the nature of these optimizations?
>
> GCC lets you take the address of a label.  You can see in
> byterun/interp.c that it uses a jump table instead of a switch when
> you're using GCC.
>
> At least, that's what it looks like.

I would rather say that gcc lets you force register allocation for
specific variables, while MSVC always ignores the "register" specifier.

#if defined(__GNUC__) && !defined(DEBUG)
[...]
#ifdef __i386__
#define PC_REG asm("%esi")
#define SP_REG asm("%edi")
#define ACCU_REG
#endif
[...]
#endif

/* The interpreter itself */

value interprete(code_t prog, asize_t prog_size)
{
#ifdef PC_REG
  register code_t pc PC_REG;
  register value * sp SP_REG;
  register value accu ACCU_REG;
#else
  register code_t pc;
  register value * sp;
  register value accu;
#endif

At the same time, MSVC has a very good optimizer, and it is very strange that
two explicit register variables lead to a 30% performance gain...

Hope to hear from you soon,
Dmitry



* Re: [Caml-list] Timing Ocaml
  2002-06-10 12:02   ` Dmitry Bely
@ 2002-06-10 12:50     ` Remi VANICAT
  2002-06-10 14:19       ` Lionel Fourquaux
  0 siblings, 1 reply; 13+ messages in thread
From: Remi VANICAT @ 2002-06-10 12:50 UTC (permalink / raw)
  To: caml-list

Dmitry Bely <dbely@mail.ru> writes:

> Chris Hecker <checker@d6.com> writes:
> 
> >>* The GNU C compiler gcc is recommended, as the bytecode
> >>   interpreter takes advantage of gcc-specific features to enhance
> >>   performance.
> >>What is the nature of these optimizations?
> >
> > GCC lets you take the address of a label.  You can see in
> > byterun/interp.c that it uses a jump table instead of a switch when
> > you're using GCC.
> >
> > At least, that's what it looks like.
> 
> I would rather say that gcc lets you force register allocation for
> specific variables, while MSVC always ignores the "register" specifier.
> 
> #if defined(__GNUC__) && !defined(DEBUG)
> [...]
> #ifdef __i386__
> #define PC_REG asm("%esi")
> #define SP_REG asm("%edi")
> #define ACCU_REG
> #endif
> [...]
> #endif

Well, it seems that threaded code also depends on being compiled with
gcc:

#if defined(__GNUC__) && __GNUC__ >= 2 && !defined(DEBUG) && !defined (SHRINKED_GNUC)
#define THREADED_CODE
#endif

So both register assignment and threaded code can account for a large
speedup.
-- 
Rémi Vanicat
vanicat@labri.u-bordeaux.fr
http://dept-info.labri.u-bordeaux.fr/~vanicat

* RE: [Caml-list] Timing Ocaml
  2002-06-10 12:50     ` Remi VANICAT
@ 2002-06-10 14:19       ` Lionel Fourquaux
  0 siblings, 0 replies; 13+ messages in thread
From: Lionel Fourquaux @ 2002-06-10 14:19 UTC (permalink / raw)
  To: caml-list


> From: owner-caml-list@pauillac.inria.fr [mailto:owner-caml-
> list@pauillac.inria.fr] On Behalf Of Remi VANICAT
> Sent: Monday, June 10, 2002 2:50 PM
> To: caml-list@inria.fr
> Subject: Re: [Caml-list] Timing Ocaml
> 
> Dmitry Bely <dbely@mail.ru> writes:
> 
> > Chris Hecker <checker@d6.com> writes:
> >
> > >>* The GNU C compiler gcc is recommended, as the bytecode
> > >>   interpreter takes advantage of gcc-specific features to enhance
> > >>   performance.
> > >>What is the nature of these optimizations?
> > >
> > > GCC lets you take the address of a label.  You can see in
> > > byterun/interp.c that it uses a jump table instead of a switch when
> > > you're using GCC.
> > >
> > > At least, that's what it looks like.
> >
> > I would rather say that gcc lets you force register allocation for
> > specific variables, while MSVC always ignores the "register" specifier.

	No, that's not the problem. MSVC is usually very good at
register allocation.

> >
> > #if defined(__GNUC__) && !defined(DEBUG)
> > [...]
> > #ifdef __i386__
> > #define PC_REG asm("%esi")
> > #define SP_REG asm("%edi")
> > #define ACCU_REG
> > #endif
> > [...]
> > #endif
> 
> Well, it seems that threaded code also depends on being compiled with
> gcc:
> 
> #if defined(__GNUC__) && __GNUC__ >= 2 && !defined(DEBUG) && !defined (SHRINKED_GNUC)
> #define THREADED_CODE
> #endif
> 
> So both register assignment and threaded code can account for a large
> speedup.

	If you look at the generated code, you can see that MSVC uses
registers very efficiently, and that the difference comes only from the
threaded code. Mainly, the switch-based loop is forced to do two nearly
successive jumps, and I think that this causes pipeline problems on
modern processors.

	If you check that the bytecode ops are valid before execution
and use __assume(0) as the default case, you can gain about 10%
in execution speed, but the two successive jumps are still there.
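
A minimal sketch of that idea, assuming a made-up three-opcode instruction set: the code is validated once up front, and the default case is then marked unreachable so the compiler may drop the range check in front of its jump table. The UNREACHABLE macro, code_is_valid, and run_checked are names invented for this example; __assume is the MSVC intrinsic mentioned above, and __builtin_unreachable is the rough gcc counterpart available in later gcc releases. Whether any speedup materializes depends on the compiler and CPU.

```c
#include <stddef.h>

#if defined(_MSC_VER)
#  define UNREACHABLE() __assume(0)
#elif defined(__GNUC__)
#  define UNREACHABLE() __builtin_unreachable()
#else
#  define UNREACHABLE() ((void)0)
#endif

/* Hypothetical instruction set, for illustration only. */
enum { OP_INC, OP_DOUBLE, OP_HALT, NUM_OPS };

/* Validate once, before execution: every opcode must be in range and
   the program must end with OP_HALT so the loop below terminates. */
int code_is_valid(const unsigned char *code, size_t len)
{
  size_t i;
  for (i = 0; i < len; i++)
    if (code[i] >= NUM_OPS)
      return 0;
  return len > 0 && code[len - 1] == OP_HALT;
}

/* Hot loop: only call this on code accepted by code_is_valid().
   Reaching the default case would be undefined behaviour. */
long run_checked(const unsigned char *pc)
{
  long accu = 0;
  for (;;) {
    switch (*pc++) {
    case OP_INC:    accu += 1; break;
    case OP_DOUBLE: accu *= 2; break;
    case OP_HALT:   return accu;
    default:        UNREACHABLE();
    }
  }
}
```

The validation pass is what makes the unreachable hint safe; without it, a stray opcode would lead to undefined behaviour rather than a clean error.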

	I don't know what MSVC 7 does, but I'd be interested.

-- 
  Lionel Fourquaux




* Re: [Caml-list] Timing Ocaml
  2002-06-10  5:35 [Caml-list] Timing Ocaml Blair Zajac
  2002-06-10  6:24 ` Chris Hecker
@ 2002-06-10 15:01 ` Xavier Leroy
  2002-06-10 16:29   ` Dmitry Bely
  2002-06-10 18:19   ` Blair Zajac
  1 sibling, 2 replies; 13+ messages in thread
From: Xavier Leroy @ 2002-06-10 15:01 UTC (permalink / raw)
  To: Blair Zajac; +Cc: Caml Mailing List

> Reading that the bytecode interpreter for Ocaml runs 2/3 as fast
> when compiled with VC 6 compared to gcc, has anybody done any
> timing comparisons with VisualStudio.Net, Intel C++ 5.x or
> Intel C++ 6.0?

As others mentioned, the reason why gcc does a better job on the Caml
bytecode interpreter is not that gcc generates better code all by
itself (it doesn't), but that it supports "computed gotos" as a C
language extension.  The bytecode interpreter takes advantage of this
feature by replacing opcodes with the addresses of the code fragments that
execute them, saving a significant amount of time in the bytecode
interpretation loop.
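
To make the "replacing opcodes with addresses" step concrete, here is a small sketch of threaded code, assuming a gcc-compatible compiler. The instruction set, the cell union, and the interp and thread_code functions are hypothetical and much simpler than the real interpreter; the sketch only shows the general shape of the technique.

```c
#include <stddef.h>

/* Hypothetical instruction set, for illustration only. */
enum { OP_INC, OP_DOUBLE, OP_HALT, NUM_OPS };

/* A code cell starts out holding an opcode and is overwritten with
   the address of its handler by thread_code(). */
typedef union { long opcode; void *handler; } cell;

static void *handlers[NUM_OPS];

/* Called with code == NULL, only records the handler addresses
   (labels are visible only inside this function).  Otherwise it
   executes code that has already been threaded. */
long interp(const cell *code)
{
  long accu = 0;
  const cell *pc = code;

  if (code == NULL) {
    handlers[OP_INC]    = &&do_inc;
    handlers[OP_DOUBLE] = &&do_double;
    handlers[OP_HALT]   = &&do_halt;
    return 0;
  }
  /* Each cell already is a jump target: no table lookup, no range check. */
#define NEXT() goto *(pc++)->handler
  NEXT();
do_inc:    accu += 1; NEXT();
do_double: accu *= 2; NEXT();
do_halt:   return accu;
#undef NEXT
}

/* One-time pass over the program: replace each opcode by the address
   of the code fragment that executes it. */
void thread_code(cell *code, size_t len)
{
  size_t i;
  interp(NULL);                       /* fill in the handlers table */
  for (i = 0; i < len; i++)
    code[i].handler = handlers[code[i].opcode];
}
```

A program such as { {OP_INC}, {OP_INC}, {OP_DOUBLE}, {OP_HALT} } is passed through thread_code once and can then be run with interp any number of times; in this sketch it evaluates to 4.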

Microsoft's C compilers don't support this extension, and I doubt
Intel's compilers do, at least under Windows.  (Although I seem to
remember that Intel's compiler for Linux implements gcc extensions.)

Someone else mentioned the explicit register declarations in the
bytecode interpreter.  This is another gcc-specific extension, but
actually the bytecode interpreter uses them to work around the poor
register allocation performed by gcc (it fails to guess correctly
which local variables of the bytecode interpreter are most critical
and should end up in registers).  So, it's really a gcc feature used
to work around a gcc deficiency :-)  Other C compilers might actually
get the registers right by themselves.

- Xavier Leroy

* Re: [Caml-list] Timing Ocaml
  2002-06-10 15:01 ` Xavier Leroy
@ 2002-06-10 16:29   ` Dmitry Bely
  2002-06-10 16:49     ` Lionel Fourquaux
  2002-06-10 18:19   ` Blair Zajac
  1 sibling, 1 reply; 13+ messages in thread
From: Dmitry Bely @ 2002-06-10 16:29 UTC (permalink / raw)
  To: caml-list

Xavier Leroy <xavier.leroy@inria.fr> writes:

>> Reading that the bytecode interpreter for Ocaml runs 2/3 as fast
>> when compiled with VC 6 compared to gcc, has anybody done any
>> timing comparisons with VisualStudio.Net, Intel C++ 5.x or
>> Intel C++ 6.0?
>
> As others mentioned, the reason why gcc does a better job on the Caml
> bytecode interpreter is not that gcc generates better code all by
> itself (it doesn't), but that it supports "computed gotos" as a C
> language extension.  The bytecode interpreter takes advantage of this
> feature by replacing opcodes with the addresses of the code fragments that
> execute them, saving a significant amount of time in the bytecode
> interpretation loop.
>
> Microsoft's C compilers don't support this extension, and I doubt
> Intel's compilers do, at least under Windows.  (Although I seem to
> remember that Intel's compiler for Linux implements gcc extensions.)

Thanks a lot for the explanation. But why not use inline asm for
MSVC then, something like this:

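#if defined(__GNUC__) && __GNUC__ >= 2
/* gcc computed goto: the target address must be dereferenced with '*' */
#define indirect_goto(addr) goto *(addr)
#elif defined(_MSC_VER)
/* MSVC, 32-bit x86 only: jump through a local holding the target address */
#define indirect_goto(addr) \
  { void* a = (addr); __asm jmp dword ptr a; }
#endif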

Hope to hear from you soon,
Dmitry



* RE: [Caml-list] Timing Ocaml
  2002-06-10 16:29   ` Dmitry Bely
@ 2002-06-10 16:49     ` Lionel Fourquaux
  2002-06-11  8:28       ` Dmitry Bely
  0 siblings, 1 reply; 13+ messages in thread
From: Lionel Fourquaux @ 2002-06-10 16:49 UTC (permalink / raw)
  To: 'Dmitry Bely', caml-list

> From: owner-caml-list@pauillac.inria.fr [mailto:owner-caml-
> list@pauillac.inria.fr] On Behalf Of Dmitry Bely
> Sent: Monday, June 10, 2002 6:30 PM
> To: caml-list@inria.fr
> Subject: Re: [Caml-list] Timing Ocaml
> 
> Thanks a lot for the explanation. But why not use inline asm for
> MSVC then, something like this:

	Because any fragment of inline asm disables a lot of
optimisations in MSVC, and you end up with a much slower interpreter.




* Re: [Caml-list] Timing Ocaml
  2002-06-10 15:01 ` Xavier Leroy
  2002-06-10 16:29   ` Dmitry Bely
@ 2002-06-10 18:19   ` Blair Zajac
  2002-06-11  9:23     ` Florian Hars
  1 sibling, 1 reply; 13+ messages in thread
From: Blair Zajac @ 2002-06-10 18:19 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Caml Mailing List

Xavier Leroy wrote:
> 
> > Reading that the bytecode interpreter for Ocaml runs 2/3 as fast
> > when compiled with VC 6 compared to gcc, has anybody done any
> > timing comparisons with VisualStudio.Net, Intel C++ 5.x or
> > Intel C++ 6.0?
> 
> As others mentioned, the reason why gcc does a better job on the Caml
> bytecode interpreter is not that gcc generates better code all by
> itself (it doesn't), but that it supports "computed gotos" as a C
> language extension.  The bytecode interpreter takes advantage of this
> feature by replacing opcodes with the addresses of the code fragments that
> execute them, saving a significant amount of time in the bytecode
> interpretation loop.
> 
> Microsoft's C compilers don't support this extension, and I doubt
> Intel's compilers do, at least under Windows.  (Although I seem to
> remember that Intel's compiler for Linux implements gcc extensions.)
> 
> Someone else mentioned the explicit register declarations in the
> bytecode interpreter.  This is another gcc-specific extension, but
> actually the bytecode interpreter uses them to work around the poor
> register allocation performed by gcc (it fails to guess correctly
> which local variables of the bytecode interpreter are most critical
> and should end up in registers).  So, it's really a gcc feature used
> to work around a gcc deficiency :-)  Other C compilers might actually
> get the registers right by themselves.

Thanks for the info.

And do you recommend a particular program or set of programs to run
to get a general relative performance number for each compiler, or
does it really matter?

Blair

-- 
Blair Zajac <blair@orcaware.com>
Web and OS performance plots - http://www.orcaware.com/orca/

* Re: [Caml-list] Timing Ocaml
  2002-06-10 16:49     ` Lionel Fourquaux
@ 2002-06-11  8:28       ` Dmitry Bely
  2002-06-11  9:08         ` Xavier Leroy
  2002-06-11 12:52         ` Mattias Waldau
  0 siblings, 2 replies; 13+ messages in thread
From: Dmitry Bely @ 2002-06-11  8:28 UTC (permalink / raw)
  To: caml-list

"Lionel Fourquaux" <lionel.fourquaux@wanadoo.fr> writes:

>> Thanks a lot for the explanation. But why not use inline asm for
>> MSVC then, something like this:
>
> 	Because any fragment of inline asm disables a lot of
> optimisations in MSVC, and you end up with a much slower interpreter.

I see... But there is another solution: use a C switch() statement in the
interp main loop, which the MSVC optimizer translates into a jump table (I
don't know if gcc is able to do this). A small example:

int f( int i )
{
  int j = 0;
  switch( i ){ 
    case 1: j = 2; break;
    case 2: j = 4; break;
    case 3: j = 8; break;
    case 4: j = 16; break;
  }
  return j;
}

cl -c -Ox -Fatest.lst test.c

	TITLE	test.c
	.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT	SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT	ENDS
_DATA	SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA	ENDS
CONST	SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST	ENDS
_BSS	SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS	ENDS
_TLS	SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS	ENDS
FLAT	GROUP _DATA, CONST, _BSS
	ASSUME	CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC	_f
_TEXT	SEGMENT
_i$ = 8
_f	PROC NEAR
; File test.c
; Line 4
	mov	ecx, DWORD PTR _i$[esp-4]
	xor	eax, eax
	dec	ecx
	cmp	ecx, 3
	ja	SHORT $L526
	jmp	DWORD PTR $L536[ecx*4]
$L529:
; Line 5
	mov	eax, 2
; Line 11
	ret	0
$L530:
; Line 6
	mov	eax, 4
; Line 11
	ret	0
$L531:
; Line 7
	mov	eax, 8
; Line 11
	ret	0
$L532:
; Line 8
	mov	eax, 16					; 00000010H
$L526:
; Line 11
	ret	0
	npad	1
$L536:
	DD	$L529
	DD	$L530
	DD	$L531
	DD	$L532
_f	ENDP
_TEXT	ENDS
END

Hope to hear from you soon,
Dmitry



* Re: [Caml-list] Timing Ocaml
  2002-06-11  8:28       ` Dmitry Bely
@ 2002-06-11  9:08         ` Xavier Leroy
  2002-06-11 12:52         ` Mattias Waldau
  1 sibling, 0 replies; 13+ messages in thread
From: Xavier Leroy @ 2002-06-11  9:08 UTC (permalink / raw)
  To: Dmitry Bely; +Cc: caml-list

> I see... But there is another solution: use a C switch() statement in the
> interp main loop, which the MSVC optimizer translates into a jump table (I
> don't know if gcc is able to do this).

Dmitry, don't be naive: of course the bytecode interpreter loop uses
switch() if computed gotos are not available, and of course any C
compiler translates this switch() to a jump table.  But the jump table
is still significantly slower than the computed goto trick, since it
involves one extra compare-and-branch and one extra memory load.

This discussion ("efficient bytecode interpreters") is getting
off-topic for caml-list, so please let's stop here.  If you're still
curious, the best way to understand the issues at hand is to stare at
the assembly code generated for interp.c :-)

- Xavier Leroy

* Re: [Caml-list] Timing Ocaml
  2002-06-10 18:19   ` Blair Zajac
@ 2002-06-11  9:23     ` Florian Hars
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Hars @ 2002-06-11  9:23 UTC (permalink / raw)
  To: Blair Zajac; +Cc: Caml Mailing List

Blair Zajac wrote:
> And do you recommend a particular program or set of programs to run
> to get a general relative performance number for each compiler, or
> does it really matter?

As micro benchmarks, you might try the code from the shootout (linked to 
from the main ocaml page at INRIA), and then devise some application 
level benchmarks using (to cite some of the poster child applications) 
Coq, FFTW, Hevea and Unison (if you can devise a setup that will not be 
IO-bound in the latter cases).

Yours, Florian


* RE: [Caml-list] Timing Ocaml
  2002-06-11  8:28       ` Dmitry Bely
  2002-06-11  9:08         ` Xavier Leroy
@ 2002-06-11 12:52         ` Mattias Waldau
  1 sibling, 0 replies; 13+ messages in thread
From: Mattias Waldau @ 2002-06-11 12:52 UTC (permalink / raw)
  To: caml-list

SICStus Prolog had the same problem as O'Caml with VC++. They solved it
by first running VC++ to generate assembly code.

Then a small Perl script rearranges the code, and finally they assemble
the result using MASM.

This improved performance by 20-30%, and in some cases by 100%
(for very simple bytecode instructions where the switch overhead
is relatively larger, for example the famous naïve reverse).

/mattias


Thread overview: 13+ messages
2002-06-10  5:35 [Caml-list] Timing Ocaml Blair Zajac
2002-06-10  6:24 ` Chris Hecker
2002-06-10 12:02   ` Dmitry Bely
2002-06-10 12:50     ` Remi VANICAT
2002-06-10 14:19       ` Lionel Fourquaux
2002-06-10 15:01 ` Xavier Leroy
2002-06-10 16:29   ` Dmitry Bely
2002-06-10 16:49     ` Lionel Fourquaux
2002-06-11  8:28       ` Dmitry Bely
2002-06-11  9:08         ` Xavier Leroy
2002-06-11 12:52         ` Mattias Waldau
2002-06-10 18:19   ` Blair Zajac
2002-06-11  9:23     ` Florian Hars
