How does oreillynet publish and protect its eBooks
Asked by Davidd42
Posted Apr 17 2011 02:40 PM
Hi, I'm working with a language education publisher (Greek-English - greek123.com) and we want to move to publishing in eBook formats.
Is there a resource where we can see oreilly's workflow? Is this something we can emulate?
Specifically, how do authors submit work to you?
What is your conversion method?
How do you print sales info on your eBook pages?
Do you use Amazon et al. to publish as well as through your own site? (I can look that up of course, but I'm interested in the "why" of this too.)
Thanks a ton - I too love your books.
Answered by macnlos
Posted Apr 18 2011 07:36 AM
The guys over at Pragmatic Books have a very interesting process for creating eBooks. To summarize: they have created their own markup language that all of their authors use to write their books. This markup is captured and versioned in Subversion, and a continuous build process takes that markup and generates the various e-formats.
You can visit them at PragProg.com and it may be worth an email to them. Both O'Reilly and PragProg are good examples to model your internal processes on.
Answered by Adam Witwer
Posted Apr 22 2011 01:36 PM
All of our content is in DocBook XML 4.5. Our authors submit books in a variety of formats, including MS Word, DocBook, AsciiDoc, and OpenOffice. We clean up or convert the files to create the subset of DocBook XML that our toolchain expects, using a combination of scripts that we maintain in house and conversion vendors, as necessary.
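As a rough illustration of what "the subset of DocBook XML that our toolchain expects" could mean in practice, here is a minimal Python sketch that flags elements outside an allowed list. The `ALLOWED` set and the function name `unsupported_elements` are made up for this example; the actual subset and the in-house scripts are not described in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist for illustration only; the real subset is not given here.
ALLOWED = {"book", "title", "chapter", "sect1", "para", "programlisting", "emphasis"}


def unsupported_elements(docbook_xml: str) -> list:
    """Return the sorted names of elements that fall outside ALLOWED."""
    root = ET.fromstring(docbook_xml)
    # DocBook 4.5 is not namespaced, so el.tag is the bare element name.
    found = {el.tag for el in root.iter()}
    return sorted(found - ALLOWED)


sample = """<book><title>Greek 101</title>
<chapter><title>Alphabet</title>
<para>Alpha, <emphasis>beta</emphasis>.</para>
<sidebar><para>A note.</para></sidebar>
</chapter></book>"""

print(unsupported_elements(sample))  # ['sidebar']
```

A check like this would only report problems; cleaning up or converting the offending markup is still a separate step.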
The core of the O'Reilly publishing toolchain is the open source DocBook XSL stylesheets:
The best overview of what they are and how they work is Bob Stayton's book:
Several years ago, we worked with Adobe to add EPUB as one of the output formats for those stylesheets. They are incredibly robust and comprehensive (so can handle pretty much any valid DocBook you throw at them) and are highly modular and customizable.
In addition to EPUB, we output XSL-FO, which is fed into a commercial processor, Antenna House, to get to PDF (for print and electronic sale). There is an open source FO processor, the Apache FOP project, though it isn't suitable for our commercial purposes.
On the authoring side, we've had increased success with the AsciiDoc wiki markup format, which can be automatically transformed into DocBook for feeding into the XSL stylesheets:
One of the fundamental principles of our toolchain is that, when possible, we've tried to adopt tools and techniques that developers already know (like version control and automated builds). That's partly because they're familiar to our authors, and partly because developers spend more time collaboratively working with text than just about anybody, so it makes sense to look at what they use to better manage that process.
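To make the AsciiDoc to DocBook to XSL-FO to PDF chain a bit more concrete, here is a minimal Python sketch of the commands an automated build might run. It is not O'Reilly's build: the stylesheet path `FO_XSL`, the `build` directory, and the function name are assumptions, and the open source Apache FOP stands in for the commercial FO processor mentioned above.

```python
from pathlib import Path

# Hypothetical location of the DocBook XSL FO stylesheet; adjust to your install.
FO_XSL = "docbook-xsl/fo/docbook.xsl"


def build_commands(source: str, out_dir: str = "build") -> list:
    """Return the command lines for AsciiDoc -> DocBook -> XSL-FO -> PDF."""
    src = Path(source)
    out = Path(out_dir)
    xml = (out / (src.stem + ".xml")).as_posix()
    fo = (out / (src.stem + ".fo")).as_posix()
    pdf = (out / (src.stem + ".pdf")).as_posix()
    return [
        # 1. AsciiDoc source to DocBook XML
        ["asciidoc", "-b", "docbook", "-o", xml, src.as_posix()],
        # 2. DocBook XML to XSL-FO via the DocBook XSL stylesheets
        ["xsltproc", "-o", fo, FO_XSL, xml],
        # 3. XSL-FO to PDF (Apache FOP here, as a stand-in)
        ["fop", "-fo", fo, "-pdf", pdf],
    ]


for cmd in build_commands("book.asciidoc"):
    print(" ".join(cmd))
```

The function only builds the command lists, so a build script or CI job could pass each one to `subprocess.run(cmd, check=True)` after every commit.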
Here's an author's recent post on working in our toolchain:
Hope that helps answer some of your questions!
Answered by Davidd42
Posted May 02 2011 01:33 PM
Thank you, gentlemen. I apologize for my tardiness in getting back to you. Your posts were helpful, informative, and encouraging.
Answered by tmo9d
Posted May 06 2011 08:09 AM
Adam's reply is the official answer. Here's an alternative process you can use if you don't have access to the Antenna House FO processor. Use DocBook XSL + Wilfred Springer's docbkx Maven plugin. This is what a number of large companies use for both documentation and training material.
Note: Apache FOP is great, but to get it to work with embedded PDF figures you'll need to depend on PDFBox and a few other bits and pieces. All of this is available if you copy the example of the Nexus Book here: https://github.com/sonatype/nexus-book - To run that you'll need Maven on Linux w/ ImageMagick already installed.
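Before trying a build like that, a quick check that the prerequisites are on the PATH can save some head-scratching. This is a small Python sketch; the executable names `mvn` and `convert` are assumptions (some ImageMagick releases ship `magick` instead), and `missing_tools` is just a helper name for this example.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


# Assumed executable names: "mvn" for Maven, "convert" for ImageMagick.
missing = missing_tools(["mvn", "convert"])
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All prerequisites found.")
```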
Also, working with DocBook and maintaining this sort of infrastructure is "no Spring picnic". To recreate the sort of process that O'Reilly uses, you'll be investing a serious amount of effort in researching and fixing little nits here and there, and you'll never get to the pre-print output quality you'll get from O'Reilly. To do this successfully, you have to be ready to sacrifice days to fixing XSL-FO and XSLT issues, and in the end that work may not even solve the problems you set out to fix because of limitations in Apache FOP.