From mboxrd@z Thu Jan 1 00:00:00 1970
From: bogus@does.not.exist.com ()
Date: Mon, 4 May 2009 19:42:15 +0000
Subject: No subject
Message-ID:
Topicbox-Message-UUID: 007235d4-ead5-11e9-9d60-3106f5b1d025

"To return to Knuth's paper: everything there---even input conversion and sorting---is programmed monolithically and from scratch. In particular the isolation of words, the handling of punctuation, and the treatment of case distinctions are built in. Even if data-filtering programs for these exact purposes were not at hand, these operations would well be implemented separately: for separation of concerns, for easier development, for piecewise debugging, and for potential reuse. The small gain in efficiency from integrating them is not likely to warrant the resulting loss of flexibility. And the worst possible eventuality---being forced to combine programs---is not severe. The simple pipeline given above will suffice to get answers right now, not next week or next month. It could well be enough to finish the job. But even for a production project, say for the Library of Congress, it would make a handsome down payment, useful for testing the value of the answers and for smoking out follow-on questions."

Jason Catena