Structured document

A structured document is an electronic document in which some method of embedded coding, such as mark-up, gives the whole document, and its parts, various structural meanings according to a schema. A structured document whose mark-up obeys the syntax rules of its mark-up language and does not break the schema it is designed to conform to is 'well-formed'. XML draws a finer distinction: a document that obeys the syntax rules is well-formed, and one that also conforms to its schema is valid.

The Standard Generalized Markup Language (SGML) pioneered the concept of structured documents, and XML is the universal format for structured documents and data on the Web. As of 2009, the most widely used markup language, in all its evolving forms, is HTML, which structures documents according to various Document Type Definition (DTD) schemas defined and described by the W3C, which continually reviews, refines and evolves the specifications.

In writing structured documents, the focus is on encoding the logical structure of a document; the structural markup has no explicit concern for how the document is presented to humans on printed pages, screens or other media.

Structured documents, especially well-formed ones, can easily be processed by computer systems to extract and present metadata about the document. In most Wikipedia articles, for example, a table of contents is generated automatically from the heading tags in the body of the document. Popular word processors can offer a similar function.

In HTML, part of the logical structure of a document may be the document body, <body>, containing a first-level heading, <h1>, and a paragraph, <p>.
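As a minimal sketch of how heading tags can drive an automatically generated table of contents, the following Python example uses the standard library's `html.parser` module. The names `HeadingCollector`, `table_of_contents` and `SAMPLE` are hypothetical, chosen only for illustration; this is not how Wikipedia or any particular word processor implements the feature.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1..h6 element (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.headings = []    # collected (level, text) pairs
        self._level = None    # level of the heading currently open, if any
        self._buffer = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H1" arrives as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = "".join(self._buffer).strip()
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return one indented line per heading, two spaces per nesting level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Assumed sample document: a body holding headings and paragraphs.
SAMPLE = """<html><body>
<h1>Structured document</h1>
<p>Introductory paragraph.</p>
<h2>Markup languages</h2>
<p>More text.</p>
<h2>Processing</h2>
</body></html>"""

for line in table_of_contents(SAMPLE):
    print(line)
```

The sketch reads only the structural markup (the heading elements) and ignores everything about presentation, which is why the same document could be rendered on a page or a screen without changing the extracted outline.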

[ "Information retrieval", "Database", "Data mining", "World Wide Web", "Programming language", "structured document retrieval" ]