12.17.07
Of B2B, DSLs, and the humble translator
I’ve been thinking lately about how data is transformed from one form to another, and about the people who do that and the tools they use. DSLs are usually discussed as a somewhat arcane (but important) computer science notion, but I think the idea really applies here. But first:
DSL == Domain Specific Language
Or at least it does in this context, if not in everyday usage (my apologies to those of you reading this over a digital subscriber line). I was in GXS’ Advanced Application Integrator (that is our translator) Training two weeks ago, and ever since I’ve been pondering the age-old question (at least in the EDI world): why do people use high-end translators instead of just coding? My own background is in software development, so I am sympathetic to this question, but it was really in my on-again, off-again hobby of woodworking that I found my answer. If you are going to do something often enough, a specialized tool makes sense. In woodworking, there are many fine tools for doing joinery (strongly joining two pieces of wood together, usually with glue), such as mortise machines (kind of a specialized drill-press), pocket hole jigs, dovetail jigs, etc. I would argue that in the world of data manipulation, the specialized tools are domain specific languages.
[]
VAR->OTCurMsgHierID = VAR->OTHierarchyID
VAR->OTPriorEvar = SET_EVAR("HIERARCHY_KEY", VAR->OTHierarchyID)
;VAR->SupplierName = $MsgCompanyName ; optional if used
;VAR->SupplierCode = $MsgCompanyCode ; optional if used
VAR->OTCurMsgStd = $MsgStd
VAR->OTCurMsgCommID = $MsgCommID
VAR->OTPriorEvar = SET_EVAR("TARGET_MODEL", $MsgOutMdlName)
VAR->OTPriorEvar = SET_EVAR("TARGET_ACCESS", $MsgOutAccName)
VAR->OTAttachName = "OTTrg.att"
[ VAR->OTCurMsgStd == "ANA" ]
VAR->OTCurMsgType = $ANAMsgFileFormat
[ VAR->OTCurMsgStd == "EDIFACT" ]
VAR->OTCurMsgType = $EFTMsgType
[ VAR->OTCurMsgStd == "X12" || VAR->OTCurMsgStd == "TDCC" ]
VAR->OTCurMsgType = $X12MsgType
[ VAR->OTCurMsgStd == "A2A" ]
VAR->OTAttachName = STRCAT($MsgOutAtt, ".att")
Whoa!
What the heck was that? That is a small sample of the powerful DSL that lies at the heart of the translator. (I promise, no more code in this entry… I just figured it could make the abstract idea of a language built for a specific purpose a bit more concrete…)
The idea is that if you have a well-understood problem that requires “power tools”, you can maximize the productivity of people by giving them a language that is built to be immensely powerful for that problem, but not really do much else. Back to my woodworking example, I think of the powerful dynamic scripting languages (like Python, Perl, or Ruby, which is very hot right now) as a saber saw. It cuts curves, but not too tight; it makes straight cuts, but not too straight; and so on. But DSLs like those of traditional data translation tools are more like a scroll saw: made to cut elaborate curves and create beautiful woodwork, in any shape and form imaginable, but not really intended for any other cutting task (for the woodworkers out there, commercial grade translators may actually equate better to a band saw, because of speed and “scalability”, but let’s not get carried away with this…).
Dropping the metaphor for a moment (because no woodworking tools do this…), translators also completely eliminate certain work, like parsing “envelopes” for standard transactions (envelopes are the outer layer of data that contains “meta” information like sender and receiver, document ID, IDs for batches/groups, etc.). Because the translator knows that you probably use popular forms of EDI (like X12, EDIFACT, RosettaNet, etc.), it can identify them and handle the common parts. What is more, because standards are frequently used differently by the different partners that you work with, translators maintain databases about partners, standards, and “customizations” (that is the polite way of saying that your partner, though of course you don’t do this, has “mis-used” a standard). Another major area of assistance is with something called “looping”, or “loop control”. B2B documents tend to be sent in groups of transactions, transactions have line items, items can have sub-lines, etc. When you write code to process a file format, you handle all of this for yourself, but when you use a translator, you can often accomplish this just by describing the structure (okay, I have a PO, and it needs a line item, which better have a SKU and a quantity… if it doesn’t, something is wrong, so stop working on that doc and move on to the next one…).
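To make “describing the structure” a little more concrete, here is a minimal sketch in Python. It is not translator code, and everything in it is made up for illustration: the PO_STRUCTURE layout, the field names (po_number, line_items, sku, quantity), and the functions find_problems and process_batch are all assumptions rather than anything from a real product. The point is only that the loops and the required fields live in a description, and one generic routine walks every document against it.

```python
# Hypothetical description of a purchase order: which fields each level
# requires, and which repeating child loops it contains.
PO_STRUCTURE = {
    "required": ["po_number"],
    "loops": {
        "line_items": {
            "min": 1,  # a PO needs at least one line item
            "required": ["sku", "quantity"],
            "loops": {},  # sub-lines would be described here
        },
    },
}


def find_problems(node, structure, path="po"):
    """Walk one document against its description and list what is wrong."""
    problems = []
    for name in structure["required"]:
        if node.get(name) in (None, ""):
            problems.append(f"{path}: missing {name}")
    for loop_name, child in structure["loops"].items():
        items = node.get(loop_name, [])
        if len(items) < child.get("min", 0):
            problems.append(f"{path}: needs at least {child['min']} {loop_name}")
        for i, item in enumerate(items, start=1):
            problems.extend(find_problems(item, child, f"{path}.{loop_name}[{i}]"))
    return problems


def process_batch(documents, structure=PO_STRUCTURE):
    """Sort a batch into accepted and rejected documents.

    A bad document is set aside and the batch keeps going.
    """
    accepted, rejected = [], []
    for doc in documents:
        problems = find_problems(doc, structure)
        if problems:
            rejected.append((doc.get("po_number"), problems))
        else:
            accepted.append(doc)
    return accepted, rejected


batch = [
    {"po_number": "1001", "line_items": [{"sku": "A-1", "quantity": 5}]},
    {"po_number": "1002", "line_items": [{"sku": "B-2"}]},  # no quantity
    {"po_number": "1003", "line_items": []},                # no line items
]
accepted, rejected = process_batch(batch)
for po_number, problems in rejected:
    print(po_number, problems)
```

Only the first PO is accepted; the other two are reported with the reason and skipped. Adding sub-lines, or a different document type, would mean changing the description rather than the walking code, which is roughly the bargain a translator offers.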
This doesn’t seem like a big deal, but it actually is. To give you an idea, we operate over a hundred B2B operations on behalf of customers, which means operating hundreds of FTP servers, dozens of SAP bridges, etc. But while doing this we are managing over 10,000 maps (what we call the DSL “programs” we use for transforming data), which is why we need power tools. What’s more, people who use managed services tend to have pretty complicated data formats, and pump data like you wouldn’t believe. One of the hardest things to explain to folks is why large scale any-to-any translation is different from the “hello world” variety of XML translation most developers cut their teeth on. Also, of course, translation is generally two-way, so you need to be able to translate between any two formats, even those that are completely different.
But won’t growing standardization help us with this (and yes, I realize the grizzled e-commerce veterans out there are choking with laughter)? Yes actually, it does and will continue to do so — by constraining the problem. The “D” in “DSL” is domain, as in “problem domain” — meaning that our challenge is contained within some kind of boundary. Without standards, translation technology would enjoy no advantage over a general programming language. Standards contain the problem, but multiple standards are part of the problem, so there you are. In truth, this whole issue will go away if folks out there in the world will all agree on how to do everything (processes), and how they are going to exchange information (data formats) — so needless to say we will have this stuff with us forever.
So if specialized tools rule, and standards make them possible, at least this is a boring, mature market, right? Well, we have been playing in this space for over a decade and have added three new patents around this technology in the last 18 months alone! And lest you think this is a dressed-up ad for our product, know that this is a vibrant and competitive space with many fine products.
So here is a holiday toast to the humble translator and the technically skilled developers who use it …

justindz said,
December 18, 2007 at 7:16 am
Martin Fowler, who gave us such magnum opuses (opii?) as Refactoring has a bunch of great web content on DSLs. Being a Ruby fan, I know that its extended library is chock-full of good DSL implementations. I’d love to see more EDI-related libraries show up in these communities for doing POC work and opening up the envelope, literally and metaphorically, to people who want to take it a step deeper in some cases. It might also open up the market for user-generated extensions to translators, which I think would be healthy and interesting.
On a lighter note, I was hoping to see you get the metaphor all the way to the end of the article. In retrospect, it makes sense to drop it where you did.