Date: Thu, 3 Sep 2009 07:38:05 +0200
From: Andrej Bauer
To: caml-list@inria.fr
Subject: Re: [Caml-list] Ocaml clone detector

As far as student plagiarism goes, we found that for Java programs you can pretty much detect fraud by erasing everything from the programs except the parentheses ( ) and braces { }, and then comparing the remaining strings by edit distance.
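
To make the idea concrete, here is a minimal sketch in Python. It is an illustration only, not the tool used for the course: the function names `skeleton`, `edit_distance`, and `similarity` are hypothetical, Levenshtein distance is assumed as the edit distance, and normalizing by the longer skeleton is an assumption added so that scores are comparable across programs of different sizes.

```python
def skeleton(source: str) -> str:
    """Keep only the characters ( ) { } from a program's source text."""
    return "".join(ch for ch in source if ch in "(){}")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the usual two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


def similarity(src_a: str, src_b: str) -> float:
    """Score in [0, 1]; 1.0 means the two bracket skeletons are identical.

    Dividing by the longer skeleton's length is an assumption of this sketch.
    """
    sa, sb = skeleton(src_a), skeleton(src_b)
    longest = max(len(sa), len(sb))
    if longest == 0:
        return 1.0  # neither program contains any brackets
    return 1.0 - edit_distance(sa, sb) / longest
```

Renaming identifiers leaves the skeleton untouched, so `similarity("class A { void f(int x) { g(x); } }", "class B { void h(int y) { k(y); } }")` evaluates to `1.0`. The sketch does not skip brackets inside string literals or comments; whether that matters in practice is not something the observation above settles.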
My explanation is that students who copy code don't want to spend much time on it. In order to change the parentheses they would have to understand the structure of the program, which they don't.

Andrej