Focus on Readers: DITA


Sunday, March 23, 2014

DITA in times of contraction

When Pubs managers decide to move their doc content to DITA, all they see is the savings. It's ROI, ROI, ROI. "You have to spend to save," they argue, and they often start spending hundreds of thousands of dollars on software purchases, training, and non-writing personnel. All that's fine when a company is growing and has loads of cash, but what risks are Pubs managers exposing themselves to if the company hits bad times?

As I have argued before, in many cases DITA doesn't so much save money as redistribute it. Where before you spent the lion's share of your doc budget on salaries for writers, now you're spending the most money on tools developers, information architects, editors, and software.

I'll give you an example: I once worked in a DITA shop where a team of 11 writers was overseen by a manager, three team leads, three editors, and two information architects; and it was supported by nearly a dozen tools developers. There were almost twice as many non-writers working on the documentation as writers (and yet writers had to fill out complicated forms for the editors, as well as project-manage the localization of their docs). The CMS was enormously expensive, and then the CMS vendor end-of-lifed our database so we had to spend a pant-load on a new one, including two years of research, planning, tools redevelopment, migrating, and tweaking the migration.

In a DITA shop, teams become complexly interdependent. Much effort is expended on assimilating writers so that they give up their holistic approach to writing, and accept their role in a DITA production line that starts with information developers; relegates writers to the role of filling in content in designated, pre-typed topics; and ends with editors. As it was explained to me, the writer must learn to pass the baton. DITA proponents argue that writers who can't assimilate should be fired.

The CMS and publishing tools are enormously complex so that nothing can be published without the help of a large team of tools developers. In addition, the complex new processes and corresponding bureaucracy require training (and hence trainers) before new writers can become productive.

Now imagine that the company has a profit dip and needs to cut costs. Who and what is expendable?

Before you had a team of writers, and if the company got in trouble you could lay off some with minimal impact. But now, if you have to contract your Pubs department you're in a pickle. The information typing process relies on so many non-writers that it seems inevitable that when companies are in decline, a DITA shop is going to have to give up more writers than a non-DITA shop.

That fragile CMS doesn't run itself, and keeping it going requires expert skills: you're going to have to keep most of your tools developers unless you want to give up publishing documentation altogether. It's probably not possible to give up the expensive maintenance plan for the CMS, either.

Your complex processes are going to continue to require the trainers, team leads, information architects, and editors.

In short, you're left with an expensive behemoth that can't be easily dismantled... unless you decide to ditch DITA altogether and migrate to a simpler solution.

The risk of DITA is acceptable when there is real justification for adopting DITA: when there is real need for reuse, when translation savings can't be garnered by a simpler alternative like DocBook XML or MadCap Flare, when you absolutely need to enforce strict information typing on writers. The problem is that nearly all outfits that are adopting DITA do not have that real justification. They're wasting money on DITA, and that could get them into trouble when the cash stops flowing.

Sunday, September 8, 2013

The lesson of databases: use only when necessary

The current CIDM newsletter has an article about content management systems: Why do organizations hate their content management system?

The article is scathing about companies that make bad purchasing decisions, and scathing about CMS vendors that make difficult, bloated products.

But the article is missing something important. The fact is that relational databases are notoriously difficult to use. I spent over ten years documenting databases, so I've seen some of the messy innards first hand. You don't just buy an RDBMS and then figure out how to use it. You need Database Administrators with a lot of skills. You need to make an ongoing investment of money and time just to keep the thing working.

When I worked in the IT department of a large financial firm, there was a prohibition on databases. We used Excel in very sophisticated ways. We transferred millions of text files a night. But we avoided RDBMSs at all costs. Apparently the company had been burned badly by a database implementation and was unwilling to try again.

The CMSs used by doc teams present extra challenges. At one DITA shop where I worked, our CMS vendor decided to deprecate documentation use of their CMS and end support for our application of it. That meant that we had to spend an enormous amount of time and expense to choose a new CMS and get it set up. The cost must have been in the hundreds of thousands of dollars, none of which was figured into the initial ROI for moving to DITA (which we had done just a couple of years before).

I would admonish documentation departments to avoid using a CMS unless they really need it. As with DITA, it makes no sense to take on the enormous expense, steep learning curve, extra manpower requirements, and ongoing hassles - unless you really need it. "Really needing it" means that simpler options won't work for you. The complexity of reuse in most doc departments doesn't come close to justifying the enormous expense.

Wednesday, October 10, 2012

DITA and the future of tech writing

This post is part of a series of posts that question some of the claims made about the benefits of DITA adoption.

DITA is designed to work with a CMS to create a fully structured tech writing environment. In a full DITA implementation, the process of creating technical documentation is fundamentally different from what is done in a traditional writing department. There are so many variations of tech writing processes that it's impossible to describe either the non-DITA or DITA structure accurately, but (with some trepidation), I'll take a stab at it...

In a traditional setup, at least for documentation that requires a fair amount of specialized knowledge, the majority of members of the doc department are writers. Typically, each writer researches, designs and writes one or more deliverables (and is effectively the project manager for the deliverable). There may be an editor, or the group may rely on peer reviews. There is a manager/team lead, but often the management style is quite flat, with writers making a lot of the decisions on things like style guidelines, user research, and priorities. In a high-functioning team writers are active in the development teams they work with, adding to terminology decisions and usability, as well as editing resource strings. The doc department may have a small tools team, or a writer may do double duty on tools maintenance.

By contrast, a DITA implementation is supposed to be more like building a house: writers create the bricks, but other specialists design the house and build it. Writers create small, structured, reusable modules of content. Architects create templates for the modules, and possibly also oversee information mapping of the modules. Map editors create maps that use the modules to produce deliverables. Editors enforce consistency. A team of tools developers maintains the complicated software required by the process. Team leads or architects act as project managers.

Writers must accept that they will spend a higher percentage of their time on tools and bureaucracy than they did in the traditional doc setup. They must also accept that they have much less control over the final output. This fundamental change often results in writers being unhappy about working in DITA, and the DITA literature goes on about how writers must be assimilated, and how failures to achieve a return on investment are usually caused by writers having bad attitudes. But stop a moment: When your employees balk at a change, shouldn't you respect their instincts? Unless part of your business model for a DITA transition is that you want to reduce quality for readers, you should at least listen to the people who are responsible for creating that quality.

Instead, DITA proponents say that writers must shape up or change careers. I have heard it put as baldly as that: DITA is sweeping the tech writing field and writers can no longer see themselves as project managers for readers. They are now simply a cog in a wheel. If they don't like it they won't get hired: they'll have to find a new line of work. The real tragedy of this attitude is that the writers who balk at losing their responsibility are the high quality, senior ones who are passionate about their readers and have a professional attitude about how they work. Crappy writers will be perfectly happy assimilating to less responsibility. (They might be less happy when they realize that the transition to structured writing means that it will be much easier to ship jobs off-shore.)

DITA is a beautiful solution... if you're trying to document the parts for an airplane. It would also be suitable if you're documenting 50 similar products, each with end user docs that overlap. My problem with DITA is that it has been sold as a general purpose doc solution. DITA advocates went too far in extolling the virtues of DITA, such as saying that any company that translates content should adopt it.

When people complain about DITA, DITA proponents like to say that it's just a tool: if you don't like the meal, don't blame the knife. But DITA is much more than a tool. It's a tool developed to be used in a particular way, and there's no sense adopting it unless you also adopt the system of structured writing it was created for. The literature about DITA has also created a culture - such as the emphasis on assimilating writers - that permeates many organizations that adopt DITA. And the way DITA is meant to be used, creating reusable modules of content, creates a tendency for doc deliverables to have a certain look and feel. (More on that in another post.)

In fact, DITA is having a profound effect on all aspects of technical writing: on the way we work, the productivity of doc departments, our job responsibilities, and the quality of our output. I know that some call what I'm doing "DITA bashing", but we are past due for a deep reflection on the pros, cons, and appropriate use cases for DITA.

Tuesday, October 2, 2012

DITA ROI: Are translation savings all they seem?

This post is part of a series of posts that question some of the claims made about the benefits of DITA adoption. This post focuses on savings in translation costs.

Articles about DITA ROI make some rather sweeping claims about the money you can save by adopting DITA. One prominent DITA proponent writes, "If you have localization in your workflow, you can probably justify the cost of DITA implementation." I would argue that that claim is false: that most companies that localize their content would never recoup the costs of a full DITA/CMS implementation, and that DITA makes sense mostly in fairly extreme cases such as hardware documentation where there are hundreds of similar versions to be documented.

There are two main claims for translation savings with DITA: topic reuse and post-translation DTP costs.

Topic reuse
First, DITA is supposed to save you money because you can reuse topics. "Write once, use frequently" means that a topic is only translated once. Big savings, right?

Maybe yes, maybe no. Translators use Translation Memory. TM is very sophisticated: each sentence is read into memory, and each sentence is flagged if it is an identical or fuzzy match to a sentence before it. If you repeat a sentence, TM will ensure that it is only translated once.
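To make the Translation Memory idea concrete, here is a minimal sketch of how each sentence might be classified against the sentences that came before it. It is an illustration only, not how any particular TM product works: the function names, the 0.75 fuzzy threshold, and the use of Python's difflib for similarity scoring are all my assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff for a "fuzzy" match; real TM tools define their own bands.
FUZZY_THRESHOLD = 0.75


def classify_segment(segment, memory):
    """Return 'exact', 'fuzzy', or 'new' for a sentence against stored sentences."""
    if segment in memory:
        return "exact"
    # Score the sentence against everything seen so far and keep the best score.
    best = max(
        (SequenceMatcher(None, segment, seen).ratio() for seen in memory),
        default=0.0,
    )
    return "fuzzy" if best >= FUZZY_THRESHOLD else "new"


def analyze(sentences):
    """Walk the sentences in order, classifying each against those before it."""
    memory = []
    report = []
    for sentence in sentences:
        report.append((sentence, classify_segment(sentence, memory)))
        memory.append(sentence)
    return report


if __name__ == "__main__":
    sample = [
        "Click Save to store your changes.",
        "Restart the server.",
        "Click Save to store your changes.",
        "Click Save to store the changes.",
    ]
    for sentence, label in analyze(sample):
        print(f"{label:5}  {sentence}")
```

The point of the sketch is that a repeated sentence is caught at the sentence level, whether or not the topic that contains it is formally reused.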

There is still a cost for processing a 100% match, but it is minimal. Typically, the cost for identical repetitions is 15% to 30% of the cost of new translation.

What this all means is that if currently 10% of your topics are duplicates of other topics, your translation costs are higher by 1.5-3% than if you reused topics.
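The arithmetic behind that estimate can be sketched in a few lines. This is a rough model under my own simplifying assumptions: every topic costs the same to translate, a reused topic costs nothing extra, and the extra cost is expressed as a share of what it would cost to translate every topic as new. The function name is made up for the illustration.

```python
def extra_cost_from_duplicates(duplicate_share, match_rate):
    """Extra translation cost, as a fraction of translating every topic as new.

    duplicate_share: fraction of topics that duplicate other topics (0.10 = 10%)
    match_rate: what a 100% match costs relative to new translation (0.15 to 0.30)
    """
    return duplicate_share * match_rate


if __name__ == "__main__":
    # The low and high ends of the range quoted for identical repetitions.
    for rate in (0.15, 0.30):
        extra = extra_cost_from_duplicates(0.10, rate)
        print(f"{rate:.0%} match rate: {extra:.1%} extra")
```

Plug in your own duplicate share and your vendor's actual rate for 100% matches; the rates used here are just the typical range mentioned above.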

Note: You can get some additional savings from DITA with a CMS by transforming your ditamaps into an interchange format called XLIFF before sending them to the translator. This is a pretty complicated procedure; have a look at this link to see if your organization can handle it. (And I remain somewhat confused about XLIFF: my friend who runs a large translation company says, "Since our CAT tool can handle XML directly, it’s not necessary to go through the migration process into .xliff format.")

Keep in mind that the savings from topic reuse only apply to topics that you are currently maintaining in duplicate places. If you decide to start reusing other topics in more places, that could arguably improve your quality, but it does not improve your ROI. (Plus, I argued in another post that the reuse following DITA adoption is often actually harmful to reader usability: link)

It is true that translation costs rise for reused text when it gets out of sync - when different locations are updated differently. It is always a good idea before sending things for translation to spend some time preparing the files; syncing duplicate content should be part of that check, when it occurs. But even when translators get different versions of dupes, they charge less for fuzzy matches, so the price is not the same as translating the section twice.

My point here is about ROI, not how to write. I am not arguing that cutting and pasting content is good practice. But for many writing teams there is not so much duplication that there's any problem keeping up with it, and if there is, then there are many other systems that provide excellent mechanisms for reusing topics, including DocBook XML, other forms of XML, and MadCap Flare. An extremely expensive full-blown DITA implementation with a CMS is not the only way to reuse topics - and for many organizations, it is not the best. (More on that in a later post.)

Post-translation DTP costs
DITA is supposed to save you money because in other systems, work has to be done after translation. One prominent DITA proponent claims, "Typically, 30–50 percent of total localization cost in a traditional workflow is for desktop publishing. That is, after the files are translated from English into the target language, there is work to be done to accommodate text expansion and pagination changes."

This is a valid point, except that it doesn't state its assumption that you are using bad practices. When you start to localize your DTP content you should remove manual formatting and rely on styles instead. In addition, you can't use formatting that will cause problems in languages that have longer words or are more verbose. This means: stop adding manual page breaks, stop using format overrides (FrameMaker 10 provides an easy way to find and remove overrides), stop putting section headers in the margin, stop setting manual cell heights in tables, stop using forced line breaks (Shift-Enter).

These practices will hugely reduce the post-translation DTP costs (certainly to way less than the stated 30-50%, although there is still a per-page DTP fee). When we talk about the advantages of DITA, we assume people are using good practices; we shouldn't assume that the alternatives are created with bad practices.

Conclusion
Articles about DITA ROI often give you rules of thumb to use in your calculations. Their claims are almost always based on an unstated assumption that your current authoring environment is the most inefficient one possible, and even then their claims can be over the top. It is prudent to ignore this advice and instead go to your translation vendor to find out what your cost savings might be. I have become friendly with the managing director of a translation vendor I once worked with, and he assures me that translation cost is virtually the same when the source is DITA, DocBook, other forms of XML, Flare's XHTML, HTML, etc.

I have spoken with doc teams who are planning to move from DocBook XML to DITA simply because they are confused by these DITA ROI articles and think that the massive translation savings will apply to them. This is not a trivial issue. DITA proponents should be much more precise in the claims they make about DITA cost savings, and doc departments should be much better educated before jumping on the DITA bandwagon.

Note: I'm uneasy about quoting individuals. It isn't fair to single out any particular DITA proponents on how they justify DITA ROI, as many DITA proponents are saying similar things. In addition, I don't mean to impugn the motivations of anyone.

Update: I have a growing unease about quoting people and then knocking down what they say. I have now removed links to DITA proponents I quote. In later posts, I may even stop quoting.

Friday, September 28, 2012

The True Costs of DITA Adoption

For anyone who disagrees with this post: please leave a comment or send me an email. If I am incorrect about something, I will modify the post so that I am not spreading misinformation. And I would love to have a dialog on the topic.

There are many articles that advise doc managers about how to calculate return on investment for potential DITA adoption. Most of these articles seem to be written by consultants who make money by helping companies set up DITA systems: they have a vested interest in making DITA look beneficial. Also, they tend to help out during the initial migration and might not be around when some of the costs kick in: they simply might not be aware. Finally, they might deal mostly with large companies where large expenditures do not seem excessive. For whatever reasons, the literature seems to be underestimating the true cost of moving to DITA.

There can be a lot of costs related to DITA adoption. Here are some that might affect you. (Different implementations will vary somewhat.)

Most of us know about the cost of a CMS, which can set you back over $250,000 (or might be a lot less). You can do without the CMS, but DITA is designed for use with a CMS and you need one to get the full benefits. But the CMS is just the start.

In the early stages of your migration to DITA, you will likely need to hire DITA specialists (the consultants I mention above) to help you plan and set up your system.

You'll likely need more in-house tools developers, and you may need developers with different skills than you currently have. This is not just to set up the new publishing system and so on, but also to troubleshoot publishing problems, adapt to new browser versions, and address all the bugs and glitches. In my experience there are all sorts of problems that crop up with the relational database (CMS), and also with the ditamaps, formatting of outputs, and many other things. Part of the problem is that the DITA Open Toolkit is notoriously difficult (and could require extra expense for things like a rendering engine). Some of the tools designed to work with DITA are arguably not quite up to speed yet. Your tools developers will also spend a lot more time helping writers.

(If you don't move to a CMS, but use something like FrameMaker and WebWorks ePublisher, you may find that you have a lot more headaches in producing docs without much in the way of DITA benefits.)

You need extremely skilled information architects to create a reuse strategy, engage in typing exercises, and design and edit your ditamaps. This isn't a skill set that people currently in your organization can easily acquire. Even most information architects have trouble adequately mapping topics. For a discussion of the sorts of challenges they'll face, see this series of posts; the moral is: if you don't have skilled architects working on your system, you may end up with Frankenbooks that are not particularly useful for your readers. I raise some additional topic reuse problems in an earlier post: link.

You'll need to spend significant time developing new processes, policies, and internal instruction manuals.

Your team will have to undergo intensive training. In coming years, new writers you hire will also need training. I have found that writers can move to DocBook XML with very little training, but DITA requires a great deal of training, not just for the CMS, but also to learn how to use ditamaps, reltables, and so on.

The migration of content will likely be quite time consuming, with manual work required to correct tagging that doesn't convert automatically, to map the content, and to do a complete in-depth edit.

Your writers will need to spend more time on non-writing activities. This can greatly reduce their productivity. Working with a relational database, especially an opaque one like a CMS, is much more time consuming than checking files out of a version control system. Creating reltables is a lot more work than adding links. Coordinating topics is a lot more work than designing and writing a standalone deliverable. Plus, there is a lot more bureaucracy associated with DITA workflows.

With most DITA implementations, topics exist in a workflow that only starts with the writer. You'll probably need more editors and more software.

You'll also probably need more supervisors. The DITA literature emphasizes the importance of assimilating writers to the new regime and then monitoring their attitudes. Pre-DITA, writers were project managers for their own content; with DITA they have to learn to hand that responsibility off to others.

There are some organizations, such as ones that have to cope with hundreds of versions of a hardware product, that have a clear ROI for DITA. But many (most?) organizations could find that DITA doesn't so much save money as redistribute money. Where before you spent the lion's share of your doc budget on salaries for writers, now writer salaries will be a much smaller proportion of your budget. In many cases, companies could find themselves facing higher costs than pre-adoption: they will never see return on their investment. And given the complexities of using DITA, ongoing hassles and escalating costs, some companies are going to find themselves having to ditch DITA and go through an expensive migration to another system.

Saturday, September 15, 2012

We need a frank, open discussion about the problems with DITA adoption

My second post on this blog was Case study: DITA topic architecture, in which I described some problems I inherited (twice) with DITA topic architecture.

Thanks to Mark Baker, author of Every Page is Page One, the post was widely read. (He referenced it on his blog and also tweeted it.) The post got hundreds of page hits and generated several comments and a few emails. It also spawned a somewhat defensive thread on an OASIS forum.

I have a lot to say about DITA. I have been holding back because I was concerned that my new blog would be written off as a DITA-bashing forum. I have a lot of other, less controversial (or differently controversial) things to say, and I didn't want to turn off a whole section of the tech writing community even before anyone knew who I was. But it seems that despite my best intentions I have been branded an antiditastablishmentarian. :-) So here I go...

I think it's time that we have a frank and open discussion of the pros and cons of DITA. For years now all discussion of DITA has been dominated by its proponents; we have heard plenty of arguments for why to adopt it. We need an open discussion not to bash DITA, but to uncover issues so that we can address them.

Here are just a few of the issues I want to address:

Has DITA changed tech writing output? Is there a discernible style to docs created in DITA? If so, is this what we want - and how can we change it?
How has DITA changed the work environment for writers? Do writers have less control over their content in DITA shops? What is the effect of that on quality?
How has XML/CMS adoption affected the creative process for writers?
What is the culture of DITA, and how widespread is it? Has the emphasis on monitoring writers' attitudes towards DITA changed the culture of tech writing?
How much is DITA really costing companies, when you include the need for enhanced tools teams and information architects, CMSs, and more time spent by writers on non-writing activities?
DITA proponents make claims about the cost of non-DITA solutions, such as that writers spend 30-50% of their time formatting. How true are these claims?
Has the rise of DITA increased the influence of consultants on tech writing? How has the agenda of consultants (to attract business) changed our profession?

These issues are of immediate, practical interest to me. I lead a doc team that uses DITA. I have authored in XML for 12 years. I have been a judge in the international STC competition (which judges the highest-scoring winners of the local competitions) for over a decade, giving me a chance to see the trends in our profession.

To the DITA proponents, I want to say that there is more that unites us than divides us, and to let you know that my goal is always to eventually reach common ground. My other, much longer-running blog is largely about politics so I have experience with this approach. I hope some of you will stick around to duke it out so that we can reach some consensus.

My bottom line is: I think there are some things to be concerned about with the widespread adoption of DITA, and we can't fix them if we don't acknowledge them. Let's dive in and see where the discussion takes us.

Tuesday, July 31, 2012

Case study: DITA topic architecture

This post is part of a series of posts that question some of the claims made about the benefits of DITA adoption.

(Note: DITA uses the word "topic" to refer to a reusable module of content. Out in the rest of the world the word topic tends to refer to an HTML page or a section of a PDF. This may be seen as an infuriating oversight, but I suspect it's actually deliberate. For a while I tried replacing DITA "topic" with "reusable module of content" but I have given up and now just use the one word to mean multiple things.)

I started with DITA several years ago when I got a job in a large doc team that had been using DITA for a while. I inherited a few deliverables and was appalled at the way the content was broken up into concept, task and code sample topics. We optimized for HTML output and there were too many brief HTML pages that users had to click through. Even for a simple idea that could have been covered in one paragraph, these docs would have three topics. For example, a description of how to stop the server would have a concept, task and sample topic, each appearing on separate HTML pages.

My audience was developers. I did quite a lot of usability testing with them and found they were furious about the documentation. They hated having to click through multiple tiny pages. They hated the minimalism and choppiness. They described the docs as unfriendly, officious, insulting and unhelpful.

Conflict in the Pink Ghetto

When I was a child I had a recurring nightmare about pink and black. It always started with a visual of pink fluffy clouds, not dissimilar from cotton candy, and it gave me a feeling of intense well-being. Then a viscous black ooze would start to infiltrate, a little at first and then growing until the pink was obliterated, leaving me unsettled and frightened.

Nowadays I am far, far from childhood and I work in the field of technical writing. I was a tech writer early in my career and chose to return to it in the 90s, after deciding that I am best suited to cerebral, creative and mostly solitary work.

The problem was that, at least back then, technical writing was a classic pink ghetto occupation - female dominated, not highly regarded by other workers, and not easy to move up out of - and that this caused it to be a rather hostile environment.

Case in point. When I returned to tech writing I worked in a department that had about 50 writers. We produced excellent documentation on complex, technically challenging topics. The writers, who were mostly female, were all extremely bright. Almost everyone had a university degree in math or science followed by a two-year tech writing diploma.

It was a vicious place. Writers regularly yelled at each other, threatened physical harm, and lodged harassment complaints against each other. The turnover in writers was staggering.

My next job had a mix of men and women. It was to my mind an excellent place to work - fascinating work, great pay, private offices, good treatment - but most of the other writers hated their jobs. One writer refused to implement tech reviews because he said it damaged his self-esteem. Two writers worked less than half the time they were supposed to. Another writer, the manager and I carried the load for a department of seven. It has been my experience that slackers can become paranoid and nasty, especially when they're caught in a web of lies to hide their lack of productivity, and that described this group.

The bad behavior of tech writers is a well known phenomenon, or at least it was up to five or ten years ago. I have heard the argument that writers behave badly because they're lesser-paid and lesser-respected members of R&D departments that are otherwise staffed by brainy, egotistical developers. In this argument, it's a fight for the top of the bottom rung.

However, there are other bottom-rung departments in R&D divisions (QA, testing, build) and I never saw them exist in a state of constant nastiness. Likewise, there are other female-dominated professions (marketing, PR, EAs) that don't seem to have these problems. Tech writing is perhaps different in that it is by nature submissive; we writers are always asking other people to give us information, and then always getting reviews that tell us what to change. That position rankles some people, and maybe it causes aggression.

Over time the tech writing profession managed to reinvent itself. It found a way to earn the respect it always deserved and needed. Job titles changed. Some people went the techy route ("Information Product Developer") and some the usability route ("User Advocate"). Previously, the cherished skills included layout design, grammar and clear writing. Now it has become more important to be skilled at complicated tools and to know markup and scripting languages.

As with many good things, the golden age of tech writing as a respected profession seems to have lasted but a nanosecond. It is being replaced in many organizations by the mindset that came along with DITA.

DITA is a specification for creating documentation in XML, but has brought with it a whole new approach to tech writing. It's difficult not to resort to mixed metaphors here. In the DITA world, writers are treated like children and are cogs in a wheel. Previously a team of writers worked together to write with one voice. With DITA, writers follow strict guidelines and templates to produce small units of content. There is a tendency in DITA processes (enforced by CMSs and other tools) to remove responsibility for the final product from the writer. Workflows are imposed in which the writer creates the first draft, but then hands it off to editors and team leads who make changes and give approvals. Everything is dictated by the workflow - everything but quality, user focus and a process of continual improvement, which somehow got forgotten.

Tech writing changed with DITA because DITA is a top-down orthodoxy. The problem may be that DITA was influenced too much by consultants, academics and tools vendors, people with personal agendas and too great a distance from real world tech writing. The problem may be that it was developed for hardware documentation (doc sets that require a great deal of content reuse) and then was applied out of context. I think the mindset is gaining a foothold because it's a continuation of the drive to make tech writing seem more serious and difficult. (We develop reusable modules, just like object-oriented programming!) Paradoxically, the mindset is hindering writers from doing their best work.

DITA has also resulted in mind-blowingly expensive budgets for tech writing departments. DITA is optimized for working with a CMS, a relational database that can set you back over $100K. XML authoring has no off-the-shelf solution, and requires docs tools teams to troubleshoot, maintain the system and upgrade. Pre-DITA you could assign each writer some subject areas and leave them to it. With the new modular style, many organizations feel the need for more architects, editors and team leads to oversee the content production process. In some organizations, doc departments have a ratio of less than 2:1 of writers to support/supervisory staff. And many organizations don't achieve the degree of content reuse that justifies the extra cost.

None of the trends in tech writing occurred at all places simultaneously or for the same reasons, and some organizations will skip them altogether. But there seems to be a pulse of positive and negative influences, with a steady movement of tech writers trying to look more like developers. And in the predominant dialog of the last decade, there is far too little focus on the reader.