Schulbuch barrierefrei (accessible school books)
Co-operation between Publishers and Service Providers in Austria
This case study outlines the co-operation between Austrian school book publishers and service providers for people with special needs which aims at making books available in electronic format.
In the project the following has been established:
- a minimum set of structural elements which documents from publishers have to contain to make them usable for the production of books in alternative formats (e.g. Braille, large print, ebooks)
- know-how and handouts for publishers on how to implement structured design with these elements using standard desktop publishing (DTP) systems (InDesign [Indes05], QuarkXPress [Quark05])
- examples of designing new books and redesigning existing books, to learn how to do accessible document design in practice
- training materials, workshops and seminars to transfer the developed know-how to other publishers and design agencies
- a general agreement which gives the right of transferring books in electronic format to students with disabilities
- a Document Rights Management (DRM) system to prevent the data from being misused in practice
- a workflow for the co-operation between schools/teachers, service providers, publishers and the ministry.
The situation in Austria:
In Austria, the Federal Ministry for Social Affairs and Generations provides educational materials such as schoolbooks for primary and secondary education. Blind and visually handicapped students, and hopefully soon other print-disabled students, can order books in accessible formats.
Until this project, publishers did not agree to hand over or distribute digital copies of books. The production of alternative formats therefore started from the printed book, using scanning and OCR or, where many graphics and/or formal structures such as math are used, typing. In this process structure was added to the book: headings were defined, and lists and other structural elements were assigned to the text. This was a very time-consuming process.
A more efficient alternative would be to use the electronic source documents from the publishers to create an accessible version for students with special needs. A study conducted by the institute Integriert Studieren in 2003 on PDF source documents from the publishers showed that:
- these PDFs are only of very limited or no use for students and
- these PDFs are of very limited use for the production of accessible versions
...due to the lack of structural definition. The quality of a PDF depends on the quality of the structural design of the source document: only if the original document is well structured is that structure carried over to PDF, or to any other format, in the conversion process. The quality of the documents was so poor that most of the time text was not exported in the right order. The study showed that structured document design is not practiced at publishers or design agencies because it is simply not needed at the moment. Instead of using the features provided for structuring content, authors and typesetters use only visual styles.
These results motivated a project addressing the issues listed in the summary. Publishers are interested in taking part because a) the new anti-discrimination legislation [Behin05] will require school books to be accessible and b) they experience general problems in the publishing process when they want to use sources for different publishing purposes (e.g. print, online, CD, audio/multimedia). This convergence of interests led to a strong partnership for the project named:
"Multi Channel Publishing"
Five publishers take part in the project. Each of them is responsible for designing or redesigning one of its books based on a predefined set of structural elements. This basic structural design, defined in the project, ensures that the electronic version of the book can be used for the production of alternative formats. An analysis of the publishers' production process showed that service providers can only start from the final print-ready version, because the content, which is approved by public authorities, changes up to this point. Today this final version is in most cases a PDF generated from a DTP tool (e.g. Adobe InDesign [Indes05] or QuarkXPress [Quark05]). Consequently, if electronic sources are to be usable for service providers, structured design has to be implemented in the DTP work.
Definition of structural elements for electronic versions of books
To collect the data of the source document and convert it into an XML file, we used the element set of the TEI standard [TEI05], in particular the TEI Lite DTD. The TEI's Guidelines for Electronic Text Encoding and Interchange were first published in April 1994. This set of meta data is widely known among publishers and ensures compatibility with, or convertibility to, other definitions in use such as DAISY [Daisy06]. Using TEI also keeps the process close to the XML database schemes which publishers might use in the future to process their documents. The TEI Lite DTD still consists of over 120 elements for the tagging of books, most of them important for librarians. To simplify the work for all participating parties, a subset of those elements was selected. This subset consists of structural elements which are of general importance for structured document design and automatic content processing. It does not require special knowledge of accessible versions but can be seen as the basis for structured document design in general. Using this subset ensures that the sources (or PDFs) can be used as a starting point for the production of accessible versions. In general, this subset of the TEI Lite DTD comprises structural meta data elements for:
- Divisions / Subdivisions
- Page breaks
It also comprises administrative meta data elements (e.g. Edition, Year of Publishing, Author(s), Publisher). Experience in the project showed that this DTD subset is sufficient to structure the content of the schoolbooks. After a short training, publishers were able to do the work by themselves. The subset also proved to be in accordance with new publishing systems based on XML databases.
After the definition of the XML DTD, know-how was developed on how the authoring tools could efficiently support marking up documents in the right way during the layout process. In addition, routines were developed for exporting the defined structure and layout data into XML. The two most widely used authoring tools were examined in detail:
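As an illustration of how such a subset could be enforced, the following minimal sketch lists every element in a document that falls outside an allowed set. The set shown here is an assumption made for the example: the project's actual element list is not reproduced in this paper, and only divisions and page breaks are named above. The function name tags_outside_subset and the sample document are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Assumed subset for illustration only. The text above names divisions and
# page breaks; the remaining names are common TEI Lite elements, not the
# project's confirmed list.
ALLOWED_TAGS = {
    "TEI.2", "teiHeader", "fileDesc", "titleStmt", "title", "author",
    "publicationStmt", "publisher", "date",
    "text", "body", "div", "head", "p", "pb",
}

def tags_outside_subset(xml_text, allowed=ALLOWED_TAGS):
    """Return the sorted names of all elements that are not in the subset."""
    root = ET.fromstring(xml_text)
    return sorted({el.tag for el in root.iter() if el.tag not in allowed})

sample = """<TEI.2>
  <teiHeader><fileDesc><titleStmt><title>Sample book</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div><head>Chapter 1</head><pb n="5"/><p>Some <hi>emphasised</hi> text.</p></div>
  </body></text>
</TEI.2>"""

# The <hi> element is not part of the assumed subset, so it is reported.
print(tags_outside_subset(sample))  # ['hi']
```

A check of this kind only looks at element names. Validating nesting and attributes would still require the DTD itself.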
- Adobe InDesign
InDesign from Adobe Inc. is a desktop publishing (DTP) application which can work with XML files. It is possible to import XML into InDesign and then prepare the document for output, e.g. as a printed book. This feature is an important step toward multi-channel and cross-media publishing. Tests with Adobe InDesign CS2 showed that it is possible to tag the text of the layout document. Further investigations are under way to map layout to structure efficiently. InDesign supports the mapping of text formats to XML tags, but the structure had to be added afterwards. The mapping feature can be used if the text is properly formatted. Otherwise the user has to select the specific text area (e.g. one chapter) and then assign the XML tag to it.
- QuarkXPress
QuarkXPress is another desktop publishing (DTP) application, produced by Quark Inc. With QuarkXPress, users can import and export XML documents. With Quark Digital Media Server, content can be stored in a central database. It can then be used in multiple forms according to the principles of multi-channel publishing.
Quark XTensions software, i.e. plug-ins, can automate functions and eliminate repetitive steps with palettes, commands, tools, and menus. Tests with QuarkXPress 6.5 Passport (international edition) showed that QuarkXPress was not able to import the TEI DTD. To tag the text of the book, a new, flat DTD had to be written. With the new DTD, the mapping from layout formats to XML tags was possible. The content is then exported into an XML file. This is the basic version for the accessibility work.
Post-processing is necessary because, as mentioned before, the exported files in some cases have no structure, and some parts of some books could not be exported (e.g. graphics made in the authoring systems). The post-processing tasks were:
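The idea of mapping layout formats to XML tags, which both tools support in some form, can be sketched independently of either product. The example below is not the export routine of InDesign or QuarkXPress. The style names, the tag names, the root element "book" and the function export_flat_xml are all assumptions chosen for illustration. Paragraphs whose style has no mapping are returned separately, which corresponds to the text that has to be tagged by hand.

```python
from xml.sax.saxutils import escape

# Hypothetical style-to-tag mapping; real documents define their own styles.
STYLE_TO_TAG = {"Heading 1": "head", "Body Text": "p", "List Item": "item"}

def export_flat_xml(paragraphs, style_to_tag=STYLE_TO_TAG, root="book"):
    """Turn (style, text) pairs into a flat XML string.

    Returns the XML and a list of paragraphs whose style is unmapped.
    """
    lines, unmapped = [f"<{root}>"], []
    for style, text in paragraphs:
        tag = style_to_tag.get(style)
        if tag is None:
            unmapped.append((style, text))  # left for manual tagging
            continue
        lines.append(f"  <{tag}>{escape(text)}</{tag}>")
    lines.append(f"</{root}>")
    return "\n".join(lines), unmapped

paragraphs = [
    ("Heading 1", "Fractions"),
    ("Body Text", "1/2 < 3/4"),
    ("Caption Bold", "Figure 1"),  # no mapping defined for this style
]
xml_text, unmapped = export_flat_xml(paragraphs)
print(xml_text)
print(unmapped)
```

The output is deliberately flat: every paragraph becomes a direct child of the root element, which is why structure has to be added in a later step.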
- Adding structure to the XML
- Revising elements that were not exported properly
- Describing images
The result after the completion of the work is a valid XML version of the book.
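The first post-processing task, adding structure to a flat export, can be sketched as follows. The example assumes that the flat file contains only head and p elements directly under a root element and that one level of divisions is sufficient. Both are simplifications for illustration, and the function name add_divisions is hypothetical. It does not reproduce the procedure actually used in the project.

```python
import xml.etree.ElementTree as ET

def add_divisions(flat_root):
    """Wrap each <head> and the elements that follow it in a <div>.

    Elements that appear before the first <head> stay directly in <body>.
    """
    body = ET.Element("body")
    current = body
    for child in flat_root:
        if child.tag == "head":
            # A heading opens a new division.
            current = ET.SubElement(body, "div")
        current.append(child)
    return body

flat = ET.fromstring(
    "<book><head>Chapter 1</head><p>First.</p><p>Second.</p>"
    "<head>Chapter 2</head><p>Third.</p></book>"
)
structured = add_divisions(flat)
print(ET.tostring(structured, encoding="unicode"))
```

Nested subdivisions would need additional information, for example distinct heading styles per level, which a flat export of this kind does not carry.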
The next step is to convert the XML into the target format via style sheets. The style sheets for the conversion are freely available on the internet [Rahtz05]. They make it possible to convert the XML file into an HTML file with one or multiple pages, and also into a PDF file.
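To give an impression of what such a conversion does, the sketch below renders a small structured fragment as HTML. It is a stand-in written directly in Python and not the style sheets from [Rahtz05]. It handles only div, head, p and pb, ignores inline markup inside paragraphs, and the function name tei_to_html and the HTML chosen for page breaks are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

def tei_to_html(element, depth=0):
    """Render div/head/p/pb children as HTML; nesting depth picks h1..h6."""
    parts = []
    for child in element:
        text = escape(child.text or "")
        if child.tag == "div":
            parts.append(tei_to_html(child, depth + 1))
        elif child.tag == "head":
            level = min(max(depth, 1), 6)
            parts.append(f"<h{level}>{text}</h{level}>")
        elif child.tag == "p":
            parts.append(f"<p>{text}</p>")
        elif child.tag == "pb":
            # Keep the print page number so references to pages still work.
            page = escape(child.get("n", "?"))
            parts.append(f'<span class="page">Page {page}</span>')
    return "\n".join(parts)

body = ET.fromstring(
    '<body><div><head>Chapter 1</head><pb n="5"/><p>a &lt; b</p>'
    "<div><head>Section</head></div></div></body>"
)
html_text = tei_to_html(body)
print(html_text)
```

Keeping page break information in the output matters for classroom use, since teachers usually refer to the page numbers of the printed book.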
Training materials, seminars and workshops
Training materials have been developed which are now used in workshops and seminars to transfer the knowledge to as many publishers and design agencies as possible.
To make sure that the books are not used outside the designated user group, a DRM system was customized. The system consists of secure reader software and a USB dongle, which acts as the key. Every student gets a key and the software. The key carries a code which allows the student to read the book as long as the key is plugged into the computer. This system has the advantage that the user is not bound to one specific computer or piece of hardware. The student can read the book at school, for example, but also in a learning group or at home. How the students get their books, and the detailed workflow between publishers and service providers, is described in the next paragraph.
To start the process, a teacher of a student with special needs orders a book in an accessible format. If the schoolbook service provider already has the book in stock, it is provided directly to the student. Otherwise, the service provider asks the publisher for the electronic version of the book. The publisher sends its TEI-XML file to the service provider, which produces the accessible version of the book. Printed (Braille/enlarged) copies are sent by standard mail. If an electronic document is ordered, the service provider encodes the files with the DRM system using the data from the student's USB dongle. The book is placed on a server from which the student can download it. Once the reader software is installed and the dongle is plugged in, the student can open the book and read it.
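The ordering workflow described above can be summarised as a short sketch. All names (fulfil_order, the format labels, the dongle identifier) are hypothetical, and the stock is modelled per title rather than per format, which is a simplifying assumption. The function only records which steps would be taken and performs no real DRM encoding.

```python
def fulfil_order(title, fmt, stock, publisher_sources, dongle_id=None):
    """Return the list of steps taken for one order.

    fmt is "braille", "large_print" or "electronic" (illustrative labels).
    stock is the set of titles the service provider has already produced.
    """
    steps = []
    if title not in stock:
        if title not in publisher_sources:
            raise LookupError(f"no publisher source available for {title!r}")
        steps.append("request TEI-XML from publisher")
        steps.append("produce accessible version")
        stock.add(title)
    if fmt == "electronic":
        if dongle_id is None:
            raise ValueError("electronic orders need the student's dongle data")
        steps.append(f"encode with DRM for dongle {dongle_id}")
        steps.append("place on download server")
    else:
        steps.append("send printed copy by standard mail")
    return steps

stock = set()
first = fulfil_order("Math 1", "electronic", stock, {"Math 1"}, dongle_id="D-42")
second = fulfil_order("Math 1", "braille", stock, {"Math 1"})
print(first)
print(second)
```

The second order skips the request to the publisher because the title is already in stock, which mirrors the shortcut described in the workflow.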
Agreement between Publisher and Service Provider
To ensure that the process works efficiently, an agreement between publishers and service providers has been worked out. The core articles of the agreement are:
- The publishers provide their electronic source documents
- It must be ensured that the books are only given to persons with special needs
- A DRM system must be used for this purpose
- It must be a "closed" system with registered users
The agreement will be signed by every publisher and service provider. If a service provider needs a book from a publisher, it can request the book under the conditions of the framework agreement.
The most important result of the project is that the handing over of digital copies of print-published documents is guaranteed for the future.
The project showed that it is technically feasible to create XML versions of books from the print-ready version of a document. Experience also showed that the quality of the XML obtained with only the functions provided by the authoring tools is not good enough. A lot of work has to be done afterwards to clean and revise the XML document. The persons performing this work need some basic XML skills. It will also be a challenge to convince publishers to create documents that can be exported into XML without a lot of additional effort. In some areas there are at the moment only limited possibilities to use sources from publishers, especially where books consist mainly of pictures, graphics and other visual content. Another challenge is the integration of non-text content such as mathematical or chemical expressions.
The project made it obvious that all publishers pass their layout data to the printing house as PDF. An important future task will be to enable authoring systems to create PDF files that are either accessible or allow conversion back into a useful format.
In any case, these are only first, but important, steps towards multi-channel publishing. More work is needed for a more efficient production of different versions from one source document.