2006-11-09

BPMN jottings

In re:
http://www.bpmn.org/

BPMN is something I'll be spending more time on in the day job, and I've just started looking at it. Some notes:


  • Looks a lot like UML activity diagrams. That's a good thing. More expansive icon set for their activities and notifications, which is conceptually extremely helpful, although potentially just syntactic sugar as far as the underlying formal model is concerned. Or maybe not: I'll find out from the spec.
  • These guys at Potsdam Uni are algebraically manipulating BPMN diagrams to reason about them, using an algebra they call pi-calculus. The wiki is spotty, but at the least they're doing soundness checks. This is very cool, and a result of the same formality of such graphs that enables them to be turned into working code. Process diagrams ain't just napkin fodder.
  • The Potsdamites use OmniGraffle to generate their BPMN diagrams, and AppleScript it to generate XML; the XML is what they feed to their pi-calculus engine (in Ruby). Tres nifty again. But it highlights a problem with the napkin-to-XML translation: no standard XML tagset. Actually, that is a malicious and repulsive lie; of course there is an XML, BPEL4WS --- but that involves translation, not just restyling; and the BPMI says in their spec of BPMN, sect. 2.3, that they intend to create a diagram exchange format between tools, which may be an XML or an XMI -- they just haven't yet. In the meantime, we're left with generic diagram exchange formats. For my two target apps, OmniGraffle and WebSphere, that means VISIO XML. I'd hate to think VISIO XML becomes the de facto standard; at any rate, my first attempt to go from OmniGraffle to WebSphere via VISIO XML failed. I'll come back to this, and I may well be buying a vowel or two from the Potsdamites' AppleScript.
  • WebSphere costs a lot more than OmniGraffle, so you'd hope it does more. And it certainly does. Of course, the diagrams OmniGraffle produces are of a transcendent beauty immanent in their Macness. On the other hand, live syntax checking of your BPMN diagram in WebSphere as you draw it? You gotta love that. There'll be a lot of stuff to explore in that package over the next few weeks. But remember: if you don't have at least 1GB of RAM on your PC, don't even bother: it's 800 MB of Java Virtual Machine goodness. (It seems to be running just fine on Parallels on my MacBook, but I've juiced it up to 2GB.)
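
To make "process diagrams ain't just napkin fodder" concrete, here is a minimal sketch of the kind of check a formal graph allows. It is emphatically not the Potsdam pi-calculus method, and it is much weaker than real soundness: it only flags nodes that cannot be reached from the start event or cannot reach the end event. The function names and the toy order process are my own assumptions.

```python
from collections import deque


def reachable(start, edges):
    """Breadth-first search: every node reachable from `start` via `edges`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def stranded_nodes(nodes, flows, start, end):
    """Nodes that do not lie on any path from `start` to `end`.

    A structural sanity check only (hypothetical helper), not a soundness proof.
    """
    forward, backward = {}, {}
    for src, dst in flows:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    from_start = reachable(start, forward)
    to_end = reachable(end, backward)
    return sorted(n for n in nodes if n not in from_start or n not in to_end)


# Toy diagram: "archive" has no outgoing sequence flow, so it can never complete.
NODES = ["start", "receive order", "check stock", "ship", "archive", "end"]
FLOWS = [
    ("start", "receive order"),
    ("receive order", "check stock"),
    ("check stock", "ship"),
    ("check stock", "archive"),
    ("ship", "end"),
]

if __name__ == "__main__":
    print(stranded_nodes(NODES, FLOWS, "start", "end"))
```

A live syntax checker could run something like this on every edit; the interesting (and harder) checks involve gateways and concurrency, which this sketch ignores.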

The perishability of Word

In re:
http://ptsefton.com/blog/2006/11/08/self_preservation_1

Peter Sefton's trying to recover his 1994 Word thesis into a sustainable document format, and migrating from 10 year old Word formats and media is no fun at all. He's right: act now, while Mac Classic is still somewhat accessible. Been there, doing that again soon with my PhD (Word 5, 1998). I did styles like Pete did, so I was somewhat virtuous, but I did go somewhat ape, so I'll be making life difficult for myself anyway.

I have two major problems Peter didn't. One, I used Endnote 4. Proprietary bibliographical software which didn't migrate well: the author names in the Endnote library itself autovanished long ago, and there was a serious compatibility issue resulting in Endnote not talking to the migrated version of the document. I've decided to cut my losses, go with Bookends as biblio software (more proprietary software, but I'm not switching to TeX in a hurry), not bother about migrating, and convert the version of the thesis with the Endnote references spelt out. Problem here is, Endnote 4 used control characters to delimit references, which when you migrate the Word file turn up as ugly splotchy fields. Fields you cannot globally find and delete -- you cannot search inside the field for text, so you'd end up deleting all fields. And I don't want to do that, because I occasionally used fields in mathematical typesetting, to get diacritics positioned correctly. *snarl*

Second problem is the thesis predates Unicode -- or rather, Microsoft allowing Unicode into the Mac version. So lots of non-future-proof 8-bit fonts: Ismini for the Greek, SILDoulosIPA 93 for the IPA, TimesDiacrit for Latin-2 characters, and (because I went ape) the occasional instance of Arabic, Hebrew, Cyrillic, and Linear B. Lots of tedious global replaces. And some hurdles:

* Word 2004 will import the Word 5 files, but is UNUSABLE on a MacBook.
* Word 2004 will do Unicode alright, but it will not even display SILDoulosIPA 93: turns it to blank squares.
* NeoOffice is usable on a MacBook, but OpenOffice has forgotten so far to implement "replace in all open documents". We're talking 10 documents here. This means macros.
* NeoOffice LOSES the font information for 8-bit fonts. And yes, I used styles, but I didn't use character styles (the main reason being that char styles weren't supported in Word 5). Which means I'll be opening these files in Word 2000 (so I can still see the 8-bit fonts), globally replace each font with a different colour, and work off global replaces based on the colours in NeoOffice. (I just did that with someone else, and the colours didn't always come through; maybe I'll try char styles after all instead.)
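
The "tedious global replaces" boil down to one lookup table per 8-bit font, applied only to the runs of text set in that font. A minimal sketch, assuming the document has already been reduced to (font, text) runs; the font names and tables below are made up for illustration and are NOT the real Ismini or SILDoulosIPA 93 layouts.

```python
# Hypothetical per-font tables mapping 8-bit characters to Unicode.
# Real legacy fonts each need their own, much longer, table.
FONT_TABLES = {
    "ExampleGreek8bit": {"a": "α", "b": "β", "g": "γ"},
    "ExampleIPA8bit": {"N": "ŋ", "S": "ʃ"},
}


def convert_run(font, text):
    """Convert one run of text; runs in unknown fonts pass through unchanged."""
    table = FONT_TABLES.get(font)
    if table is None:
        return text
    return "".join(table.get(ch, ch) for ch in text)


def convert_document(runs):
    """Convert a document given as a list of (font name, text) runs."""
    return "".join(convert_run(font, text) for font, text in runs)


if __name__ == "__main__":
    runs = [
        ("Times", "see "),
        ("ExampleGreek8bit", "abg"),
        ("Times", " and "),
        ("ExampleIPA8bit", "N"),
    ]
    print(convert_document(runs))
```

This is exactly why losing the font information is fatal: without the font name on each run, there is no way to tell which table applies.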

You can see why I've been putting this off for so long. But again: a couple of years from now is probably too late. A couple of years ago, as a research assistant, I was asked to recover a file of Don Laycock's from Word for DOS 2 -- it was a published dictionary of a Papuan language, but we couldn't grep a dead tree. Nothing on campus would read Word 84 -- Microsoft had taken their converter offline months before, and was showing no inclination to put it back up. The only way I was able to get anything out of it was ... opening it in Word 5, minted in 1991. And in a couple of years with Classic going extinct, even that will be impossible. Needless to say, the IPA font Don had used was unrecoverable and long gone; I ended up having to infer the engmas by elimination.

Yeah, proprietary, binary word processing formats really do bite. Thank God I went easy on the diagrams: the preservability of old MacDraw PICTs is even worse...

2006-11-08

The Complutensian Polyglot, ahead of the times

In re:
http://www.supakoo.com/rick/ricoblog/Permalink.aspx?guid=873cc194-46b8-4dca-a0a8-d6ab8b688a3b

As I had added into the Wikipedia entry, the Complutensian Polyglot edition of the Bible in the 1520s marked the highpoint of the initial trend in Greek typography to come up with an unconnected Greek typeface. By the time of the Complutensian, 40 years in from the first attempts of the 1470s, the results were beautiful. Around that time, Aldus Manutius decided to go with the contemporary cursive as the model for both his Roman and Greek typeface; and everyone followed suit for the next couple of centuries of Greek. Now (as I've seen in a typographer's blog someplace), this made commercial sense --- Aldus used the bookhand his scholarly audience was familiar with from their manuscripts; and the results for Roman script were the beauty of italics. The result for Greek was squiggle, and by the 19th century it was considered ugly. (That's because it *is* ugly.)

So as typographers tried to distance themselves from Aldus' typeface, there was a trend to try to go back to a lost ideal of Greek typography, nicely commented on in John Bowman's paper (in Greek) on British typography of Greek. The Complutensian begat Robert Proctor's Otter Greek font (see p. 158 of Bowman); Otter Greek begat Scholderer's Neohellenic (cf. GFS Neohellenic); Neohellenic begat Athenian font; and the Complutensian again begat the Greek Font Society's GFS Complutensian Greek, which I'm informed is planned for release by the Greek Font Society next year. (See also the enlightening thread on the Typophile blog.)

The point is, the Complutensian has long been fetishised as a lost ideal of Greek typography, and I wanted to get me some. I've just received a 500 MB PDF of the Polyglot, and it contains a surprise I hadn't noticed. But first, a brief comment on what it looks like.

The Complutensian is pretty well described in a blog entry by Rick Brannan (ricoblog), and I suggest you pop out to it before continuing.

OK, you're back. :-) It's apparent from the gifs Brannan provides, but it truly hits you when you see the pages; the Old and the New Testament look totally different. The Old Testament looks impressive, and is quite a technical feat; but it does not look pretty. It's very busy, for one:




Septuagint Greek (with interlinear Latin) | Vulgate Latin | Hebrew
Targum Onkelos Aramaic paraphrase | Latin translation of Targum Onkelos


Some malicious bishop commented that the Vulgate text looks like Jesus with the two thieves crucified either side of him, and I can see why now. The Hebrew is Hebrew; it does look out of place next to the Latin, which is inevitable, although I'm not familiar enough with Hebrew script to tell if it's a good looking Dysmas. The Septuagint gets to be Gestas, the Bad Thief. The Septuagint column is not evil per se, and it's very utilitarian, but it's also quite messy: the Greek's in squiggle, the interlinear Latin's in a Bastarda that crowds out the spindly Greek it's meant to be a crutch for; ick. In the middle, the Latin's in a gorgeous, self-assured Antiqua. The Vulgate wins.

Zooming forwards to the New Testament is a shock to the eyes; it's sort of a Darien moment. Just two simple columns: no prima donna in the centre. The Latin's back in Bastarda, but it's a Bastarda that's been given room to breathe, instead of tripping over interlinear squiggle; and at full size, it's quite elegant. The shock of course is the Greek. It is simply gorgeous.

But the real shock is when you zoom in. (You can see it in the first gif on Ricoblog, but you have to click to enlarge and concentrate). The Complutensian typeface, the pinnacle of early Greek typography, the Eden from which Aldus' serpentine Greek expelled us and which has haunted several 20th century typographers, the bestest Greek font ever...

... is monotonic.

Seriously. No circumflexes or graves; no accents on monosyllables; no iota subscripts; no smooth breathings. There are rough breathings, but they're actually displaced to the left of the vowel, as they are normally on capitals; the Complutensian's pretty much treating them as letters not diacritics. (You can see it in the Ricoblog gif, ῾υπέρ, second line from the end.)

That's a shock alright. And it's a deliberate aesthetic choice: Jimenez' Spaniards certainly knew about accents, and their squiggle font in the Septuagint is drenched in them. The forerunners of their typeface -- da Spira and Jenson in 1470 -- used accents (see Zapf's paper on the history of Greek typefaces, p. 6). It's like the Complutensians said, we're designing the most beautiful Greek letters ever --- and we say we have no room on top of those letters for distracting squiggles. It's a deliciously bold decision.

2006-11-07

Nikolaou the Accent-Defender (Τοναμύντωρ)

In re:
http://www.sarantakos.com/language/l-akrotites.html

Yesterday I was rereading my friend Nikos Sarantakos' pages on language, including his condemnations of the phenomenon of defending the polytonic, as the latest episode in the battle between language-defenders and... other language-defenders. (Lest we forget Peter Mackridge's apt observation that literary Demotic was in the end no less artificial a language than Katharevousa.) And although I largely agree with 40κος, I see that on the matter of the polytonic I am moving more and more towards conservatism, as I am on some other social issues (e.g. adultery --- the series "Και οι παντρεμένοι έχουν ψυχή" ("Married People Have Souls Too") makes me livid every time I see it; and don't get me started on the recital of ethics and social responsibility that is "Θα βρεις το δάσκαλό σου" ("You'll Meet Your Match")). To be clear: I agree that Demotic as we know it cannot be written in polytonic -- or rather, it gets written with considerable arbitrariness, what with having to decide what you will call long and what short in a language that hasn't a clue about longs and shorts. But when I see Ancient Greek in monotonic (something 40κος does experimentally, but which is now customary), it jars. And incidentally I feel the same about the mixed language of the early vernacular literary works. If it is arbitrary to turn <ούζο> into <οὖζο>, it is just as arbitrary to turn <ᾦ> into <ώ>.

And this, I think, stems from the peculiarity of my generation. I belong to the first generation that renounced the polytonic at school --- in '81 I was in Grade 5 of primary school, and I renounced the "άγιος και αγνός" ("holy and pure") with glee. I shared the previous generation's dread of reactionary, deadened Katharevousa --- I couldn't even read Nikolaos Politis' Katharevousa without discomfort. But partly my contact with the TLG (where years later I relearned the polytonic, and had to enforce it as well when checking texts), partly my exposure to the elegant Katharevousa of Hatzidakis (elegant in its syntax, because his screeds haven't a trace of coherence), and mostly my absence from Greece, have checked my old discomfort. I now find the accents charming, though not to the point of using them regularly in my Modern Greek: that's why the masthead is polytonic and Bost-like ("ὁπουτζοῦ"), while the individual posts are in monotonic.

Our ability to see our linguistic past as charming has been damaged both by the imposition of the archaising idiom, and by the political turn the language problem took: the use of an archaising idiom by a modern Greek is now politically loaded. (Still, I found Tettix's contribution at Ιστολόγιον most gracious: he rightly thunders against the usual absurdities about the poverty of Modern Greek, but does so... in Ancient Greek. And he's welcome to his monotonic.)

I think of the brilliant blog of pseudo-Chaucer, Geoffrey Chaucer Hath A Blog, and I wonder whether it is even possible nowadays to write something analogous in Greek. Pseudo-Chaucer, his pseudo-sister-in-law and pseudo-de Mandeville lampoon Paris Hilton and the war with Iraq, or the religious mania of their time, and they are terribly funny. Funny, because language is a signifier in this instance too (how do we say "the medium is the message", after McLuhan? The code is the message as well, always in Jakobson's sense -- sender, receiver, channel, code). But what Chaucerian English signifies as a code to today's reader does not extend much further than "I was written in 1400"; so the humour of the unexpected and of anachronism arises effortlessly. If someone blogs about Tatiana Stefanidou and Panikos Psomiadis, I won't say in the language and persona of Plato or even of St Paul (heaven help us), but at least of Ptochoprodromos, will our minds go simply to the anachronism? I think not. Either we will think "reactionary" -- the politically loaded signified of diglossia; or we will think Bost -- because Bost is the precedent for the humorous use of an archaising idiom, but what is being mocked there is not the content of the message, but once again the code: the pompous Greek (ελληνικούρα), the linguistic conceit. (Is Asterix in Ancient Greek a counterexample? I'm not asking a rhetorical question.)

If this does not hold, then the chapter called "the language question" really has closed. And to tie this back to my accent-defending --- I can now be fond of a polytonic I have been rid of, because somehow, for me at least, that quarrel has by now been bled dry...

Terminology for digital libraries

In re:
http://conference.lis.upatras.gr/topics.php

I stumbled on the above page by chance while looking into matters of my new profession. I was gobsmacked by the entry:


* Οντολογίες (Ontologies)


... !! And the untranslated Tutorials looked ugly to me too. As for library "Μάνατζμεντ" instead of διαχείριση ("administration") --- enough already! Of course, after a glance at the mongrel English-Greek of Maxim in Greek (well, "in Greek" is putting it generously, but anyway), I'm not entitled to make many demands. (Is it just me, or was the Greek of the late ΚΛΙΚ more spontaneous?)

Mind you, the terminology had its good points too: these exceptions aside, it looks like a conscious effort to produce Greek terminology for the field. I particularly liked the rendering of Institutional Repository: "Ιδρυματικό Αποθετήριο". (A pity, of course, about the resemblance to "αποχωρητήριο" ("lavatory")... :-) )

2006-11-06

Thoughts on permanent identifiers

In re:
http://ptsefton.com/blog/2006/11/01/repository-maintenance


Some random thoughts on permanent identifiers (my day job), triggered from Peter Sefton's post above.



  • The HTTP proxy address to resolve a Handles (or whatever else) permanent identifier for a resource is binding the permanent identifier to a particular protocol (HTTP) and particular host (hdl.handle.net, arrow.monash.edu.au, whatever). This has the advantage of actually working in the current web infrastructure, which a URI based on Handles (or whatever) does not. This in turn links up with Norman Walsh's contention that "if I want DNS I know where to find it" --- i.e. why come up with and fund a shadow to DNS in Handles (or whatever), when DNS is already working. That's a question I'm not getting into yet, but it is true enough that HTTP addresses are real, and hdl: URIs (or whatever) are currently not, outside a very small number of browsers.

  • However, there is nothing permanent about an HTTP link to begin with -- that's the whole point of having a persistent identifier that isn't a URL. After all, there may not always be an HTTP; and HTTP URLs as is have a half-life of what, six months? As Sefton points out, there may also not always be an arrow.monash.edu.au, so rewriting URLs containing arrow.monash to something else is a big risk. A plus of having a national infrastructure for identifiers would be that, while there may not always be an arrow.monash (or even, heavens forfend, a Monash), there will always be an Australian Government(*), and one can expect the Australian Government to always be able to resolve those identifiers.

    * NOTE: by "always", I mean of course "next few decades". I'll save the "I'm laminating my papers and burying them in Spitzbergen" tirade for another time.

  • So a couple of things I think should happen (right now, a week into the job, and with no idea of what I'm talking about) are



    1. While having the Handles-resolving URL at your HTTP proxy (http://hdl.handle.net/<HANDLE> or http://arrow.monash.edu.au/<HANDLE> ) is a good and valid and practical thing, it's not a persistent identifier itself; just a link to one. Argal, the digital object should include a Handle URI, distinct from the HTTP link, for future-proofing's sake. Similarly, people should be encouraged to cite the Handle URI, as well as or instead of the URL. After all, HTTP proxies can change (and will, and will be autogenerated from your repository). But the data itself should bear and contain its permanent identifier, which should travel with the digital object to wherever it ends up. To recover the <HANDLE> from the proxy URL requires that I know where the proxy ends and where the handle begins. Since a Handle can contain more than one slash, it ain't unambiguous: given http://example.com/hdl/77/99 , I cannot know whether the handle is hdl/77/99 (naming authority: hdl) or 77/99 (naming authority: 77). And knowing which Handle proxy servers were around at the time the URL was minted shouldn't be necessary for me to recover the identifier.

    2. We may have a national infrastructure for Handles (or whatever), but that need not mean national-level management of the Handles. It would be pointless to make a request to Canberra every time a repository in Australia needs to register a new object --- even if the request is instantaneous and light enough not to require human intervention. One of the unsung assets of the Handles system is that individual fields of the Handle record can be managed by different administrators. To me, that means a federated identifier infrastructure; Canberra can override and step in in case of emergency or disaster, but the day-to-day management of identifiers can stay with the repository managers who actually know what's going on in their repository.

    3. Accordingly, the national identifier should make migration of permanent identifiers possible: if a naming authority is dissolved, the national-level identifier management should either pass on the naming authority to some other institution, or take over the naming authority itself. If there's no such guarantee, the identifiers are not permanent. (That is assuming there will always be an Australian government, for which see above.)



  • I agree with Peter that the browser should (for RFC 2119 values of "should") display a Handles-like rather than VITAL-like URL, since the VITAL URL is not even a shadow of a permanent identifier. A common URL format is also a "should". But without minimising the importance of getting the HTTP links migratable, I still think it's the Handles URI inclusion that is the "must".
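
The ambiguity in point 1 above (where does the proxy end and the handle begin?) can be sketched in a few lines. This is only an illustration under my own assumptions: the function name is made up, and it treats a handle as "naming authority, slash, local name" with the local name allowed to contain further slashes.

```python
from urllib.parse import urlparse


def candidate_handles(proxy_url):
    """List every (proxy prefix, naming authority, local name) reading of a URL.

    Hypothetical helper: without knowing the proxy ahead of time, each path
    segment but the last could be where the handle starts.
    """
    parsed = urlparse(proxy_url)
    segments = [s for s in parsed.path.split("/") if s]
    base = f"{parsed.scheme}://{parsed.netloc}"
    candidates = []
    # A handle needs both an authority and a local name, so the authority
    # can be any segment except the last.
    for i in range(len(segments) - 1):
        prefix = "/".join([base] + segments[:i])
        authority = segments[i]
        local_name = "/".join(segments[i + 1:])
        candidates.append((prefix, authority, local_name))
    return candidates


if __name__ == "__main__":
    for prefix, authority, local_name in candidate_handles("http://example.com/hdl/77/99"):
        print(f"proxy={prefix}  handle={authority}/{local_name}")
```

For http://example.com/hdl/77/99 this yields two equally plausible readings, hdl/77/99 and 77/99, which is the whole argument for having the object carry its Handle URI separately from the proxy URL.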






The "Wherefore Identifiers" post that preceded the above on Pete's blog is more of a challenge; the Norman Walsh riposte and Pete's query on full-text local names made me forget who I was and what I was doing here. I'll come back to it when I have more time and less confusion...

2006-10-31

... to the Alexandria you are losing.

In re: http://cavafis.compupress.gr/kave_20.htm

So on my last day at Melbourne Uni (16 years all up, but who's counting), I adjourn for one last beer or two with a couple of colleagues at the Lincoln, have some very fancy bangers and mash, and hie me thence at 9:30 so I can make it to Readings before closing time, to redeem the gift voucher I managed to talk my "chers collègues, cari colleghi, liebe Kollegen, kjara kollege, xaverim yekarim, дорогий колеги, αγαπητοί συνάδελφοι" into donating. One collected set of Shostakovich symphonies and Bach's collected Passions later, I wander back out onto Lygon St, to make the trek home. On the opposite side of Lygon St, by the entrance of the food court, there's an old man playing Bach on the violin. Not just any Bach, either; he's playing the partita, although he's got a while yet to get to the Chaconne. I've heard him before in the CBD; he's not the best of violinists, but with this, it really is the thought that counts.

Bach partitas at 10 pm as the restaurants close shop. That's what I'm leaving behind...

Review: Pulier & Taylor. Understanding Enterprise SOA.

In re: http://www.manning.com/pulier/

I understand it, enough already! The content of the first half of the book could have been done in twenty pages, and I'm not convinced there's much in there that isn't already in Wikipedia. The authors admit to having committed hype in their past, and their hyping of SOA is still more obvious in the book than they would have liked. Their inclusion of a "Savvy Manager" paragraph to point out problems with SOA is a conscious attempt to redress this bias, but it doesn't go very far.

Ironically, the most concretely useful stuff in the book comes in what should be the fluffiest -- not Part 1, which is supposed to be the technology of SOA but is repetitive hype, but Part 2, which is the business use case study. Not that it couldn't have been done in 20 pages as well, but the contextualisation in a case study and the specific guidelines are welcome. Ch. 13 for example goes through the lumping together of services into a service map, by domain (services associated with a major functional area of the business), and by actor (which processes will the same actor be invoking --- group all of these together in a portal, rather than implementing the separate applications as standalones). Identifying recurring services turns out as expected to be a compelling argument for cutting costs.

To my surprise, there is even a technical point nestled within the business case study: Ch 15.2 --- programming with SOA means you have to concentrate ahead of time on reusability (which is the point of the modularity). (This is natch antithetical to Agile programming, where you don't introduce features you don't immediately need --- like reusability.) This even leads to a revision of the software requirements process: after drafting the software reqs, brainstorm alternative use cases with stakeholders, and spec a web service that will cover these alternatives as well; *then* proceed to develop. ("Developing web services takes longer than developing traditional software. There is simply no way around that.") So to make the service reusable, you incorporate potential uses from stakeholders along with the concrete requirements you already know of from your client.

This is a book for management rather than techos, and it shows. (I miscalculated in the purchase, because the books I was familiar with from Manning's Ottoman Costume series were programming manuals.) Half the book is taken up with the case study narrative, and the beginning half gets very repetitive in going through scenarios hey-presto resolved through SOAs, with virtually no specifics of how it all happens. The book is antithetical to _Mastering your Organisation's Processes_: that book also does case studies, is addressed to management, and evangelises for a solution to business process problems --- but it is hard-hatted, with lots of specifics and concrete guidelines, and real scepticism where appropriate. This book does a little of that in the second half, but still not quite enough to measure up to the other.

In all, service oriented architecture is not that revolutionary an approach --- just modularity writ large, with open-standards SOAP glue rather than proprietary Enterprise Application Integration. I can still envision the specific standards being advocated --- the HTTP protocol, the SOAP, the monitoring of services through SOAP interceptors --- being a transitory matter rather than the thousand-year enterprise solution the book stumbles over itself for. The benefit of service oriented *approaches* to systems design lies only in their emphasis on services as modules, and the protocols for them to interact; what these modules look like, and even whether they remain truly modular, or glued over the web, is incidental. Or at least, contingent. In that regard, e-Frameworks has chosen its emphasis prudently.

(Yet again, a point lost in the first half of the book, and made in the second, Ch. 16: not only must a real-world SOA deployment prioritise which functions to turn into services first, but there are some functions that it makes no sense to turn into services --- they'll never get reused, they're working just fine, they're too tangled up in the current business logic. In fact, implicit in the criteria proposed for which functions to prioritise for SOA --- migration likelihood, isolation from other functions, flexibility and reusability --- there is an underlying criterion, though its correlation is not 100%: complexity. Or as the authors put it, simple business logic functionality. The more tangled a function is, especially in terms of program logic, the less easy it will be to pull out into a *reusable* module, the less flexible it will be to disparate use contexts, and the less likely anyone is going to want to reimplement the whole thing as a migration in the first place. Which has a bearing on establishing the granularity of services: they won't be 2 + 2 level, but they will still be more fine-grained than the minimum.)

The book does have one cautionary note it sounds often enough: security concerns make everything more difficult, and SOA is not at a stage yet where that will take care of itself.

The performance hit of packing and unpacking XML is also non-negligible, and something I've seen first hand with my attempt to insist on the X(ml) of AJAX in my interface to the Thesaurus Linguae Graecae lemmatiser. (XML decoding brings Safari to its knees for a big or complex enough chunk of XML; writ large the same can happen in an enterprise, and Robby Robson's scenario of a new identifier minted per microsecond is not going to fly with real-time SOAP.) A web service can be all things to all comers with a transparent WSDL; but there needs to be a business case to be made that it should be, since there are penalties for opening things up like that: not just performance, but accountability as a maintainer --- managing an Open Source toolkit is no sinecure.

And since I'm not immune to style --- a thousand times the Very British, martini-dry humour of _Mastering your Organisation's Processes_, over the Dorothy Dixers in the starting paragraphs of each chapter here.

Review: Robertson & Robertson: Mastering the Requirements Process

In re: http://www.amazon.com/Mastering-Requirements-Process-Suzanne-Robertson/dp/0201360462

A nicely methodical textbook, with overview, step-by-step breakdowns, and some needed contextualisation. The authors are sympathetic to agile development, and tailor their advice to analysts going down that path; but they recognise the tension between agile development and explicit requirements, and insist that requirements come from the whiteboard, not the keyboard, even if you needn't produce a formal document of requirements at the end. (So agile projects do pen scenarios, because they need to understand the reqs; they just don't turn those scenarios into formal functional requirements, because understanding needn't mean writing up.) More generally, repeated emphasis on abstracting the requirements spec from particulars of implementation, and modelling enough of the business to discern how the product will fit into the ecology (although that's closer to O'Connell et al.'s worldview).

The requirements specification has a very useful way of identifying application scope from your business analysis:

* A business event is a single external stimulus (from an adjacent system, be it actor or automated), which triggers a response: a finite sequence of activities within the work (the business). This response is a business use case; as much as possible, it is autonomous from the other business use cases in the system.

* A system carries out several use cases, just as a Service Usage Model has several columns.

* The analyst gets to choose the scope of the business use case: narrowly within a computer system, or preferably closer out to the actor. The more stuff going on is encapsulated by the business use case, and the blacker the box as far as the actor is concerned, the more flexibility in design is afforded.

* The *product* use case scenario is the stuff going on in the business use case that you choose to automate.

So a business use case is a single activity diagram, with a single trigger as input. Its implementation will involve products with interfaces: these products would be service expressions in e-Framework terms, or sequences thereof, as Rehak points out. They also involve wetware and piping binding the service expressions into a column of the Service Usage Model; that can and will be outside e-Framework scope. So a service expression specifies a module of working code; and working code (either a single module, or a concatenation thereof) is an implementation of a product, which carries out a subset of the activities specified as a use case within the business process --- but does not necessarily exhaust it: the system still involves actors, after all.

The size of the product --- how much of the business use case, the process, it incorporates --- is a design decision, and would not be deterministic. The discovery of the entire business system, including the adjacent systems to interact with it, enables a well-informed decision on product scope. But the scope of the service expression is a judgement call:

"To make decisions about the product boundary for this business use case, we need to define the constraints. [These are physical and policy constraints which are going to be specific to the business.] We also need input from the stakeholders who understand the technical and business implications and the possibilities for the product boundary along with the business goals for the project." (p. 151)

Note (and I was surprised to realise this, but shouldn't have been) that the activities in the use case to be incorporated into the product need not be sequential. The example given is of an airport passenger check in. Use case (done as scenario):

1. Locate reservation in the system.
2. Identify passenger.
3. Check passport.
4. Attach frequent flyer points.

The product goes from 1 to 4; it does not do 2 and 3, which are the business of a different system, talking to different databases with different authorisation. So the choreography of services into a product is a matter of timing (1 > 2 > 3 > 4) and interface (2 -> output -> 3); but the identification of distinct services is only a matter of interfaces: when in the use case the product is going to get hold of the data doesn't matter to what services go into the product. I know how to do 1 and 4; 2 and 3 are an external, adjacent system I will put myself on hold for.
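
A minimal sketch of that check-in example, in Python. The `Step` class and the "product"/"adjacent" owner labels are made up for illustration and are not from the book; the only point is that the product boundary picks out steps by who owns them, not by where they sit in the sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    owner: str  # "product" or "adjacent" (hypothetical labels)


# The four-step scenario from the text, tagged with an assumed owner.
CHECK_IN = [
    Step(1, "Locate reservation in the system", "product"),
    Step(2, "Identify passenger", "adjacent"),
    Step(3, "Check passport", "adjacent"),
    Step(4, "Attach frequent flyer points", "product"),
]


def product_steps(use_case):
    """Return the step numbers that fall inside the product boundary."""
    return [step.number for step in use_case if step.owner == "product"]


def is_contiguous(numbers):
    """True if the step numbers form an unbroken run."""
    return all(b - a == 1 for a, b in zip(numbers, numbers[1:]))


steps = product_steps(CHECK_IN)
print(steps, is_contiguous(steps))
```

Here the product owns steps 1 and 4, and the run is not contiguous: the ordering belongs to the use case, while the product is just the subset of steps it knows how to do.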

So no neat mapping from business process map to services map. Well, that's no surprise, but makes life more interesting.

2006-10-26

The tale of φαῖο

In my work on the lemmatisation of Greek for the Thesaurus Linguae Graecae project, verbs are much more of a hassle than nouns, because Greek verbs just have more latitude to do idiosyncratic stuff than nouns. The running joke with Greek verbs, in fact, is that there is no such thing as a regular verb; even λύω, so favoured in textbooks as an exemplar (because its root ends in upsilon, one of the few consonants or vowels not to cause grief when it is juxtaposed to the tense suffixes), has variable length on that upsilon depending on the tense. However, the truly, insanely, every-person-is-its-own-story verbs --- i.e. the athematic irregulars: εἰμί, εἶμι, ἠμί, φημί --- had already been covered by the lemmatiser I've been elaborating; and since they require a lot more deep Greek than I'm comfortable with, I've usually managed to steer clear of them. It doesn't help that these verbs are so irregular that the lemmatiser deals with them in a completely different way from other verbs: they're basically treated not as stems + a class of suffixes, but each as their own set of suffixes, from scratch.

And so it was that a perfect storm of irregularities had me scratching my head for a full hour with a single verb in Moschus. Not a long verb, actually a rather short verb; which in Greek hurts you rather than helps you, because you can't, like, buy a vowel to work out what's going on. In fact, my problem was there were too many vowels. The verb was φαῖο, and it shows up in the following passage

αὐτὰρ ὃ μειλίχιον μυκήσατο· φαῖό κεν αὐλοῦ
Μυγδονίου γλυκὺν ἦχον ἀνηπύοντος ἀκούειν
(Moschus, Europa 97-98)


I have no idea what phaîo means. Neither does my lemmatiser. The thing doesn't look like a verb; but it looks like an Ancient Greek noun even less. I look for dictionary entries starting with φα- that might help me make sense of this; no such luck. I leaf through my Smyth and (paper) Kühner-Blass grammars --- respectively the basic English and ludicrously detailed German grammar of Greek; no go. Now, in the cis-webic age, Google knows all, so I pop into the Project Gutenberg translation of Moschus to see if it would help; Moschus is writing bucolic Greek --- meaning Doric or Aeolic mooshed with Epic, and damn me if I have any idea what the mooing is about. The Gutenberg edition of the bucolic poets is old and out of copyright enough to be unhelpful at times -- Theocritus 30 is completely missing (and in the other Gutenberg Theocritus, the translator pretends it's addressed to a girl); but the Moschus passage has made it, moo and all:

Then he lowed so gently, ye would think ye heard the Mygdonian flute uttering a dulcet sound.


Nice to know bestiality wasn't as big a deal to the translator. OK, so φαῖο means "you think". Nope, still not getting it. I spend half an hour, uninformed Modern Greek speaker that I am, trying to yoke it to φαίνομαι "seem"; but there's no alchemy that's going to make that nu completely disappear absent a sigma to knock it over. (Keep that sigma in mind, it ends up with a candlestick in the conservatory.)

In desperation, I ask TLG central for a clue; and TLG central sends me the variant readings of the verb in their edition. The variants in the manuscripts are φαῖε, φαίε, and φαίης. Now, I don't know what φαίης means either, but the lemmatiser does: it's the optative active 2nd sg of φημί, "say"; so "you would say". "You would say" means pretty much "you would think"; the same happens in Modern Greek (λες και). Bell goes off in my head, I look at the verb table for φημί in my grammar, and I attain enlightenment.

Here's the perfect storm:



  • First up, the optative is reasonably rare in Greek to begin with, and died early; so with my slapdash knowledge of Ancient Greek, it would be easy for me to have not clicked to it.

  • Second, we have in φαῖο a middle/passive optative, not an active like φαίης. The middle/passive ending for the 2nd sg should have been -iso; but proto-Greek did away with its sigmas between vowels, so all that was left was -io. Which doesn't look like a verb ending I'd be familiar with, and for good reason: Middle Greek ended up putting such 2nd sg passive sigmas back in. (Ancient Greek has lou-omai "I am washed", *lou-esai > *lou-eai > lou-eːi "you are washed"; Modern Greek has restored the forms to lun-ome, lun-ese.)

  • Normally Greek verbs are thematic: they have a vowel, an -e- or -o- depending on the person, between the verb stem and the personal inflection. This means that the optative passive ending is normally -oîo; and -oîo I had seen before enough to recognise. But φημί is one of those irregular athematic verbs --- meaning it's archaic enough not to have a thematic vowel. The -io goes straight onto the verb stem pʰa-. pʰa + io = φαῖο. Because I had not grokked the optative by being force-fed it at school, I wasn't familiar enough with the ending to make the connection.

  • The killer was the change in Moschus. I had actually gone past the description of φημί in Kühner-Blass, who had a generous three or four pages about it, listing all its attested tenses. It turns out that standard, Attic Greek only used the verb for "say" in the active; having it in the middle voice is an other-dialect thing, and since we have a lot more Attic than other dialects, we don't have all that many middle voice φημί in our literary texts to begin with. Nonetheless, Kühner and Blass (dunno which one, though Kühner's earlier solo edition is in Google Books) saw fit to include all known middle instances of φημί. They list the indicatives, they list the subjunctives, they list the infinitives, they list the imperatives.

    No optatives listed.

  • Which can only mean one thing. In the 19th century, the editors of Moschus had gone with the common, normal Attic form φαίης. Kühner-Blass is 19th century, and so is Liddell-Scott. When A.S.F. Gow did his new edition of Moschus in 1952, he looked at φαίης and φαῖο, and did what a philologist should: he went with lectio difficilior. (OK, I've got a normal boring Attic optative, and a weird unknown Doric optative. Maybe the scribe who wrote that manuscript thought he'd out-Doric Moschus, so he made φαῖο up. That might have happened, and Kühner-Blass is adamant that that kind of thing has happened with Herodotus. But it is rather likelier that the scribe took one look at Moschus' φαῖο, had the same reaction I did, and plugged in the form of the verb he was much more familiar with.) In doing so, Gow brought back into the corpus a likely authentic Doric optative. But no one is bothering to revise the 19th century grammars, so the grammars won't tell me that.

  • Moreover, the way irregular athematic verbs are handled in the lemmatiser, actives and middles are handled separately, as distinct classes of endings --- whereas normal verbs lump all the possible classes of endings together that would attach to a given tense stem. So it didn't guess at the middle version of the verb, whereas normally it would.
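
The morphology and the lemmatiser gap in the list above can be sketched as a toy in Python. Everything here is an assumption made for illustration: the ASCII transliteration (`pha`, `lu`, `phaies`), the function names, and the `PHEMI_ENDINGS` table. It is not how the actual lemmatiser is built, and the sigma rule is applied blindly, which real Greek does not do.

```python
import re

VOWELS = "aeiou"


def drop_intervocalic_sigma(form):
    """Delete every s that stands between two vowels (toy sound change)."""
    return re.sub(rf"(?<=[{VOWELS}])s(?=[{VOWELS}])", "", form)


def optative_middle_2sg(stem, thematic):
    """Stem (+ thematic vowel o) + optative i + ending so, minus the sigma."""
    theme = "o" if thematic else ""
    return drop_intervocalic_sigma(stem + theme + "i" + "so")


# Hypothetical ending table: the athematic verb lists its forms per voice,
# and only the active class has been registered so far.
PHEMI_ENDINGS = {
    "active": {"phaies": "optative active 2sg"},
}


def analyse(form, table):
    """Look the form up in every registered voice class; None if unknown."""
    for forms in table.values():
        if form in forms:
            return forms[form]
    return None


print(drop_intervocalic_sigma("louesai"))         # the *lou-esai step
print(optative_middle_2sg("lu", thematic=True))   # thematic: ends in -oio
print(optative_middle_2sg("pha", thematic=False)) # athematic: ends in -io
print(analyse("phaio", PHEMI_ENDINGS))            # middle class is missing

# Registering a middle class is what lets the form be recognised.
extended = dict(PHEMI_ENDINGS)
extended["middle"] = {
    optative_middle_2sg("pha", thematic=False): "optative middle 2sg",
}
print(analyse("phaio", extended))
```

With only the active class registered, the lookup for `phaio` comes back empty; a thematic verb in this toy would have got its middle form for free from the shared ending pattern.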



So that's the story of φαῖο. The lemmatiser now knows what the verb means; alas, it's the only instance of a middle optative of φημί in the corpus, but every verb counts (especially when it's in my performance index testbed). Not that I was happy to have been running around for an hour trying to work out a verb I should have grokked immediately. But that's how pioneers work, I guess.

Review: O'Connell, Pyke & Whitehead. Mastering your Organization's Processes.

In re: http://www.cambridge.org/0521839750

This book was looking at Business Process Management (with Capital Letters, since it's a distinct methodology), from a managerial rather than an IT perspective. Though it very occasionally got bogged down in detail of tactical approaches, overall it was a delight to read: judiciously cynical of everyone (especially IT, but also management fads and office politics), with dollops of Very British Wit, a dash of donnish humour, and quite practical about the constraints under which you could end up deploying Business Process Mgt.

It turns out I was mistaken about what this was about: the Management of Business Processes presupposes their analysis, but there wasn't much about the analysis in there --- appropriately so: 1, analysis is what IT does, not management, and 2, with BPM software, you end up doing a lot of the orchestrating of workflow and process components from your desktop yourself. One of the side points of the book for me was making me realise the power of integration --- having absolutely everything supporting your business processes talking to each other, and being able to reconfigure your processes in a hurry. This was really more big picture stuff than immediately useful, but it gives handy context --- including some quite useful lists of what to look for in management solutions, and it can inform the questions of how you interrogate an organisation's culture.

Did I mention how delightfully cynical it was? Actually, it reminded me of why I hated the 2nd edition of the Camel Book, and liked the 3rd. The 2nd was Larry Wall being wacky and petulant ("Perl is the way it is coz I said so, and aren't I cute"), and I couldn't stand reading it. The 3rd edition brought in a coauthor who actually ended up apologising for Perl's idiosyncrasies through the book --- "this looks bizarre, but there is a reason why you might choose to do things that way". That was the edition I was able to read, and it was for a similar reason that I liked this one: I wasn't preached at or evangelised to, but addressed with some respect as a reader mature enough to make my own decisions. And the authors did the right thing by repeating themselves every few chapters and recapping: they are writing for people with short attention spans (they apologise at the start of the book for inadvertently insulting people's intelligence), and that honestly makes for a much more readable book in all. OK, ok, that's my humanities bias again.