TEXTML Server
Native XML Storage and Information Retrieval
Whitepaper January 2005
Contents
Why Use TEXTML Server?
  New Database Needs for Document-Centric Applications
  One XML Back-End Server for Various Solutions
  An Embeddable Component
TEXTML Server Structure Overview
Close-up on TEXTML Server
  Document Base
  TEXTML Server's Indexing Engine - a Novel Approach to Indexing
  Search Engine
  TEXTML RSA - Replication Service Agent
  TEXTML FTS - Fault Tolerance Server
Summary
Other Related Documentation
  TEXTML Server User Documentation
  Additional Product Information
Why Use TEXTML Server?
This whitepaper discusses the architecture and approach used to develop TEXTML Server, the industry-leading native XML repository and search engine.
New Database Needs for Document-Centric Applications
XML is everywhere
With the proliferation of XML-based applications, XML has raised expectations of how information can be leveraged to increase productivity and efficiency. Whether in Publishing, Aerospace, Financial Services, Life Sciences or Health Care, XML has become a standard technology across virtually all industries and is used in a wide variety of document-centric applications to optimize document management:
- Publishing
  - Editorial content management
  - Online archiving
  - Digital asset management
  - Ad management
  - Content syndication
- Aerospace
  - Production of technical documentation
  - Interactive Electronic Technical Manuals (IETM)
  - Knowledge management
- Financial Services
  - Data exchange
  - Web content management
  - Standardized business reporting processes
  - Business process management
- Life Sciences
  - E-Learning
  - Knowledge management
- Health Care
  - Online archiving
  - Knowledge management
XML challenges
As organizations embrace XML and the concept of "write content once, re-use at will", IT professionals charged with delivering the next generation of XML applications have come to realize that traditional database models are ill-suited for storing, indexing and retrieving rich XML content:
- Content is no longer purely transactional, purely multimedia or solely textual, but rather a hybrid of the three. Existing relational databases, which were conceived to support data-centric applications, are not suited to managing the kind of XML content that document management applications must handle. XML content is unpredictable and semi-structured, and it is subject to change at any point. Databases must adapt to support such content. XML, by its very nature, presents many challenges to the traditional RDBMS model. It allows deeply hierarchical, multi-tiered structures with nested values that vary in length and type. A database must be able to manage, in a streamlined way, a structure that contains empty or missing elements and whose ordering is significant.
- XML content does not map well to existing object-oriented and relational database models. Storing an XML document in a traditional database requires that the XML structure be mapped to a predefined database schema, which means decomposing the document into a series of inter-related tables. This process is often resource intensive and results in the loss of some data, such as processing instructions and comments, as well as the notion of element and attribute ordering - making the XML document hierarchy irrelevant. Why bother creating XML content if you are just going to destroy it upon its storage? In addition, if your XML Schema changes even slightly, your database structure will be disrupted, often requiring massive updates to hundreds of...
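The decomposition problem described above can be illustrated with a minimal sketch. It is not taken from TEXTML Server or from any particular RDBMS mapping tool: the sample document, the single-table schema and the function names parse_full and shred are all hypothetical, chosen only to show how a naive element-per-row mapping leaves comments and processing instructions with nowhere to go. It assumes Python 3.8 or later and uses only the standard library.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document: one processing instruction, one comment,
# and sibling elements whose order matters to the reader.
DOC = (
    "<article>"
    '<?layout columns="2"?>'
    "<!-- verify the date -->"
    "<title>Storm warning</title>"
    "<para>First.</para>"
    "<note/>"
    "<para>Second.</para>"
    "</article>"
)


def parse_full(xml_text):
    """Parse while keeping comments and processing instructions (Python 3.8+)."""
    builder = ET.TreeBuilder(insert_comments=True, insert_pis=True)
    return ET.fromstring(xml_text, parser=ET.XMLParser(target=builder))


def shred(root, conn):
    """Store one row per element and return the nodes the schema cannot hold."""
    conn.execute(
        "CREATE TABLE element ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, text TEXT)"
    )
    dropped = []

    def walk(node, parent_id):
        # Comments and processing instructions have no column to live in.
        if node.tag in (ET.Comment, ET.ProcessingInstruction):
            dropped.append(node.text)
            return
        cur = conn.execute(
            "INSERT INTO element (parent_id, tag, text) VALUES (?, ?, ?)",
            (parent_id, node.tag, node.text),
        )
        row_id = cur.lastrowid
        for child in node:
            walk(child, row_id)

    walk(root, None)
    return dropped


def child_tags_by_name(conn, parent_id):
    """Return child tags sorted by name, one of many equally valid row orders."""
    rows = conn.execute(
        "SELECT tag FROM element WHERE parent_id = ? ORDER BY tag", (parent_id,)
    )
    return [tag for (tag,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    dropped = shred(parse_full(DOC), conn)
    stored = conn.execute("SELECT COUNT(*) FROM element").fetchone()[0]
    print("element rows stored:", stored)
    print("nodes with no home in the schema:", dropped)
    print("children of the root, ordered by tag:", child_tags_by_name(conn, 1))
```

In this sketch the table keeps no position column, so document order is not part of the stored data: a relational table is an unordered set, and any order a query returns is only as meaningful as its ORDER BY clause. A real mapping could add a position column and side tables for comments and processing instructions, at the cost of a more complex schema that must change whenever the document structure does.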