From: "Gary D. Boetticher"
To: caml-list@inria.fr
Date: Wed, 13 Dec 2006 19:12:09 -0600 (CST)
Subject: RE: CFP 2007 Third International Predictor Models in Software Engineering (PROMISE) Workshop
Hi,
 
I would like to invite you to consider submitting a paper to the 2007 PROMISE workshop, which will be held in conjunction with ICSE.

The CFP is given below.
 
Thanks!
 
Gary
 
 

Call For Papers (CFP) (ICSE-related workshop): Third International Workshop -
Predictor Models in Software Engineering (PROMISE)
 

Third International Workshop on Predictor Models in Software Engineering
                               (PROMISE 2007)
                        
http://promisedata.org/2007/CFP.html
                           Sunday May 20, 2007
                      Minneapolis, Minnesota   USA
 
      In conjunction with the 29th Int. Conf. on Software Engineering
                    
http://web4.cs.ucl.ac.uk/icse07/
 
Objectives
----------
 
As in any engineering field, realistic prior assessment
of the potential cost, problems, timing, performance,
safety, security, and numerous other properties of software
projects is essential for effective and efficient planning,
design, and implementation of those projects.
 
A mature engineering discipline needs to have a standard
set of predictive methods that practitioners can use, as
well as standards for interpreting the results of those
methods. To become widely accepted and used in the field,
models need to be validated on data from a wide range
of applications, in different development environments,
and with different reliability and performance goals.
 
The PROMISE workshop aims to broaden knowledge of
predictive models that have been successfully developed,
to provide a forum for the discussion of new models, and
to provide a catalog of system data that researchers can
use to evaluate proposed models, so that practitioners
can compare the models' predictions against their own
projects.
 
As a follow-up to last year's workshop, this workshop
focuses on "issues and challenges surrounding building
predictive software models." Predictor models already
exist for software development effort and fault injection,
as do co-update or change predictors, software quality
estimators, and software escalation predictors
("escalation" predictors try to predict which bug reports
will require the attention of senior experts). However,
in most cases they have been presented in venues that
cover a diverse set of interests.
 

Goals of the Workshop
---------------------
The goals of this one-day workshop are:
 
* To expand the current public repository of data sets
  related to software engineering in order to conduct repeatable,
  refutable, or improvable experiments.  Such an empirical
  process is essential to the maturity of the field of
  predictive software models and software engineering
  in general. After only two years, the current PROMISE
  repository already contains 24 data sets.
 
* To deliver useful, usable, and verified models or
  methods to the software engineering community:

  o "Models" predict software properties of interest to
    21st-century software practitioners. Numerous such
    models are already under development, including models
    that predict software quality, development effort, and
    requirements/design/code traceability.

  o "Methods" are learning systems for building particular
    models for particular situations.
 
* To compile a list of open research questions that are
  deemed essential by the researchers in the field.
 
* To show, by example, to the next generation of software
  engineering researchers that empiricism is useful,
  practical, exciting, and insightful.
 
* To bring together researchers and practitioners with
  the aim of sharing experience and expertise.
 
* To steer discussion and debate on various aspects and
  issues related to building predictive software models.
 

Public Data Policy
------------------
 
PROMISE 2007 gives the highest priority to case studies,
experience reports, and presented results that are based
on publicly available datasets. To increase the chance
of acceptance, authors are urged to submit papers that
use such datasets. Data can come from anywhere, including
the workshop Web site. Such papers should include the
URL of the dataset(s) used.
 
A copy of the public datasets used in the accepted papers
will be posted on "The PROMISE Software Engineering
Repository." Therefore, if applicable, the authors should
obtain the necessary permission to donate the data prior
to submitting their paper. All donors will be acknowledged
on the PROMISE repository Web site.
 
The use of publicly available datasets will facilitate
the generation of repeatable, verifiable, refutable, and
improvable results, as well as providing an opportunity
for researchers to test and develop their hypotheses,
algorithms, and ideas on a diverse set of software
systems. Examples of such datasets can be found in the
PROMISE repository.
 
 
We ask all researchers in the field to assist us with
expanding the PROMISE repository by donating their data
sets. For inquiries regarding data donation, please send
an email to mail@promisedata.org.
 

Topics of Interest
------------------
In line with the above-mentioned goals, the main topics
of interest include:
 
* Applications of predictive models to software
  engineering data.
 
* What predictive models can be learned from
  software engineering data?
 
* Strengths and limitations of predictive models.
 
* Empirical model evaluation techniques (see the sketch
  after this list).
  o What are the best baseline models for different
    classes of predictive software models?
  o Are existing measures and techniques for evaluating
    and comparing model goodness, such as precision,
    recall, error rate, or ROC analysis, adequate for
    evaluating software models? Or are more specific
    measures, geared toward the software engineering
    domain, needed?
  o Are certain measures better suited for certain
    classes of models?
  o What are the appropriate techniques for testing the
    generated models, e.g., hold-out, cross-validation,
    or chronological splitting?
 
* Field evaluation challenges and techniques.
  o What are the best practices for evaluating the
    generated software models in the real world?
  o What are the obstacles in the way of field testing a
    model in the real world?
  o How can obstacles to the acceptance of predictive
    models in the real world be overcome?
 
* Maintaining generated models over time.
  o Which predictive models are more prone to
    model shift (concept drift)?
  o When does a model need to be replaced?
  o What are the best approaches to keeping a model
    in sync with software changes?
 
* Building models using machine learning, statistical
  methods, and other techniques.
  o How do these techniques lend themselves to building
    predictive software models?
  o Are some techniques better suited for certain
    classes of models?
  o How do these algorithms scale when handling
    very large amounts of data?
  o What challenges are posed by the nature of the data
    stored in software repositories that make certain
    techniques less effective than others?
 
* Cost-benefit analysis of predictive models.
  o Is cost-benefit analysis a necessary step in evaluating
    all predictive models?
  o What are the requirements for performing a
    cost-benefit analysis?
  o What particular costs and benefits should be considered
    for these models?
 
* Case studies on building predictive software models.
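
For concreteness, the measures named in the evaluation
topic above can be computed directly from a 2x2 confusion
matrix. The following sketch (in Python; the counts are
invented for illustration and are not drawn from any
PROMISE dataset or tool) computes precision, recall
(probability of detection, pd), probability of false
alarm (pf), and error rate:

   # Illustrative sketch only: evaluation measures for a binary
   # defect predictor, computed from a 2x2 confusion matrix.
   # The counts below are invented for this example.
   tp, fp, fn, tn = 20.0, 10.0, 5.0, 65.0

   precision = tp / (tp + fp)   # predicted-defective modules that really are
   pd = tp / (tp + fn)          # recall, a.k.a. probability of detection
   pf = fp / (fp + tn)          # probability of false alarm
   error = (fp + fn) / (tp + fp + fn + tn)

   print("precision=%.2f pd=%.2f pf=%.2f error=%.2f"
         % (precision, pd, pf, error))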
 

Benchmark Dataset Papers
------------------------
 
To encourage data sharing and/or to publicize new and
challenging research directions, a special category of
papers will be considered for inclusion in the workshop.
Papers submitted under this category should include at
least the following information:
 
* The public URL to a new dataset
* Background notes on the domain
* What problem does the data represent?
* What would be gained if the problem were solved?
 
* A proposed measure of goodness for judging the
  results; for instance, a good defect detector has a
  high probability of detection and a low probability
  of false alarm.
 
* A review of current work in the field (e.g., what is
  wrong with current solutions, or why has no one solved
  this problem before?).
 
* Description of data format.
 
   The recommended format is the Attribute-Relation File
   Format (ARFF); a minimal sketch is given after this list.
 
 
   For an example of such a dataset, see
 
     "Cocomo NASA/Software cost estimation"
 
   on the "PROMISE Software Engineering Repository"
 
 
   However, if ARFF is not an appropriate format for
   your data, please provide a detailed description of
   your data format in the paper.  A guideline for
   documenting datasets can be found on the UCI Machine
   Learning Repository site.
 
 
   This information is placed before the actual data
   when using the ARFF format.  However, if you are using
   an alternative format that does not support comments in
   the dataset, provide this information in a separate file
   with the extension .desc, and submit the URL of that file.
 
* Preferably, some baseline results.
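
For reference, a minimal ARFF file might look like the
sketch below. The relation, attribute names, and data rows
are invented for illustration; the comment line (ARFF
comments start with '%') is where the background notes
described above would normally go:

   % Background notes on the domain go here, as ARFF comments.
   @relation effort-example
   @attribute loc numeric
   @attribute methodology {waterfall, incremental}
   @attribute actual_effort numeric
   @data
   25.9,waterfall,117.6
   10.4,incremental,36.2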
 

Submission Process
------------------
 
Submissions should be five to ten pages long. Papers must
be original and previously unpublished. SUBMISSIONS THAT
INCLUDE EMPIRICAL RESULTS BASED ON PUBLICLY ACCESSIBLE
DATASETS WILL BE GIVEN THE HIGHEST PRIORITY.
 
Accepted papers and other materials for the Proceedings
must be revised to conform to the IEEE style guidelines
defined at:
 
 
Templates for submissions are found at:
 
 
Accepted file formats are PostScript and PDF. The details
of the paper and data submission process are available at:
 
 
To submit papers:
 
  * Email them to: 2007@promisedata.org
  * Make the subject of that email
    "[SUBMISSION]: your paper title"
 
Each paper will be reviewed by the program committee in
terms of its technical content, its relevance to the
scope of the workshop, and its ability to stimulate
discussion. At least one author of each accepted paper
is required to register for and attend the workshop.
 
Prior to the workshop, the accepted papers will be posted
on the workshop Web page at:
 
 
This is to facilitate a more fruitful discussion during
the workshop.
 
Journal of Empirical Software Engineering: Special Issue
--------------------------------------------------------
 
Papers accepted to PROMISE 2007 (and 2006) will be
eligible for submission to a special issue of the Journal
of Empirical Software Engineering on repeatable experiments
in software engineering.
 

The issue will be edited by Tim Menzies.
 

Important Dates
---------------
Submission of workshop papers    January  20, 2007
Notification of workshop papers  February 10, 2007
Publication-ready copy           March     5, 2007
 
General Chair
-------------
Gary Boetticher     Univ. of Houston - Clear Lake
 
Steering Committee
------------------
Gary Boetticher     Univ. of Houston - Clear Lake
Tim Menzies         West Virginia University, US
Tom Ostrand         AT&T
 
Program Committee
-----------------
Vic Basili         University of Maryland, US
Dan Berry          University of Waterloo, Canada
Barry Boehm        University of Southern California, US
Gary Boetticher    Univ. of Houston - Clear Lake, US
Lionel Briand      Carleton University, Canada
Bojan Cukic        West Virginia University, US
Alex Dekhtyar      University of Kentucky, US
Martin Feather     NASA JPL, US
Norman Fenton      Queen Mary (U. of London), UK
Jane Hayes         University of Kentucky, US
Jairus Hihn        NASA JPL's Deep Space Network, US
Gunes Koru         Univ. of Maryland, Baltimore County, US
Tim Menzies        West Virginia University, US
Martin Neil        Queen Mary (U. of London), UK
Allen Nikora       NASA JPL, US
Tom Ostrand        AT&T, US
Daniel Port        University of Hawaii, US
Julian Richardson  NASA ARC, US
Guenther Ruhe      University of Calgary, Canada
Martin Shepperd    Brunel University, UK
Forrest Shull      Fraunhofer Center Maryland, US
Willem Visser      NASA ARC, US
Elaine Weyuker     AT&T, US
Laurie Williams    North Carolina State Univ., US
Marv Zelkowitz     University of Maryland, US
Du Zhang           Cal. State Univ., Sacramento, US