XML Introduction
Laurea Magistrale in Informatica (Master's degree in Computer Science), Chapter 01, module of the course Technologies for Innovation
XML - Introduction
Agenda
What XML is; ten points for XML; history and evolution; technologies that add functionality; the XML family; XML application areas; Electronic Data Interchange
XML: what it is
The Extensible Markup Language (XML) is a general-purpose specification for creating custom markup languages. A markup language is an artificial language that uses a set of annotations on text to describe how the text is structured or how it is to be displayed. A well-known example of a markup language in computing is the HyperText Markup Language (HTML). XML is classified as an extensible language because it allows its users to define their own elements.
XML: what it is (continued)
XML is a metalanguage that makes it possible to define markup languages syntactically. A metalanguage defines a set of (meta)syntactic rules through which a markup language, called an XML application, can be formally described. Every XML application inherits a common set of syntactic characteristics from XML, and each in turn defines its own particular formal syntax.
XML makes the structure (or structures) of a document explicit in a formal way, by means of markup included within the text (character data). The markup represents the logical structure of the document. Markup is distinguished from the rest of the text because it is enclosed between delimiters: informally, angle brackets around tags, and forms such as &yyyy;
XML in 10 Points (http://www.w3.org/XML/1999/XML-in-10-points.html)
1. XML is for structuring data
XML documents reflect the structure of the data that they contain. For example, if the document were a book, it might contain elements, which would in turn contain other elements, and so on. XML is a set of rules (you may also think of them as guidelines or conventions) for designing text formats that let you structure your data. XML makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous. XML avoids common pitfalls in language design: it is extensible, platform-independent, supports internationalization and localization, and is fully Unicode-compliant.
XML in 10 Points
2. XML looks a bit like HTML
Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value"). While HTML specifies what each tag and attribute means, and often how the text between them will look in a browser, XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. In other words, if you see "<p>" in an XML file, do not assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, a p... (and who says it has to be a word with a "p"?).
XML in 10 Points
3. XML is text, but isn't meant to be read
Although XML is verbose, and it is plain text (Unicode, not only ASCII), it is designed primarily to be used by automated systems, not necessarily read by humans. Like HTML, XML files are text files that people shouldn't have to read, but may when the need arises. Compared to HTML, the rules for XML files allow fewer variations. A forgotten tag, or an attribute without quotes, makes an XML file unusable, while in HTML such practice is often explicitly allowed.
XML in 10 Points
4. XML is verbose by design
Since XML is a text format and it uses tags to delimit the data, XML files are nearly always larger than comparable binary formats. That was a conscious decision by the designers of XML. The advantages of a text format are evident, and the disadvantages can usually be compensated at a different level. Disk space is less expensive than it used to be, and compression programs like zip and gzip can compress files very well and very fast. In addition, communication protocols such as modem protocols and HTTP/1.1, the core protocol of the Web, can compress data on the fly, saving bandwidth as effectively as a binary format.
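The point about compression can be checked with a small sketch. The catalog below is made-up data (the element names catalog, item, id and price are assumptions for illustration only), compressed with Python's standard gzip module. The ratio obtained depends entirely on the data, so none is quoted here.

```python
import gzip

# Hypothetical, highly repetitive XML payload (all element names are invented).
records = "".join(
    f'<item><id>{i}</id><price currency="EUR">{i * 1.5:.2f}</price></item>'
    for i in range(1000)
)
xml_bytes = f"<catalog>{records}</catalog>".encode("utf-8")

# Compress the whole document in memory, as a web server or archiver might.
compressed = gzip.compress(xml_bytes)

ratio = len(compressed) / len(xml_bytes)
print(f"raw: {len(xml_bytes)} bytes, gzip: {len(compressed)} bytes, ratio: {ratio:.2f}")
```

Repeated tag names are exactly the kind of redundancy that general-purpose compressors remove, which is why the verbosity of the text format costs less than it first appears.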
XML in 10 Points
5. XML is a family of technologies
The core of XML is the XML 1.0 Recommendation. Beyond XML 1.0, "the XML family" is a growing set of modules that offer useful services to accomplish important and frequently demanded tasks.
XLink describes a standard way to add hyperlinks to an XML file.
XPointer is a syntax for pointing to parts of an XML document. An XPointer is a bit like a URL, but instead of pointing to documents on the Web, it points to pieces of data inside an XML file.
CSS, the style sheet language, is applicable to XML as it is to HTML.
XSL is the advanced language for expressing style sheets. It is based on XSLT, a transformation language used for rearranging, adding and deleting tags and attributes.
The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language.
XML Schemas 1 and 2 help developers to precisely define the structures of their own XML-based formats.
XML in 10 Points
6. XML is new, but not that new
Development of XML started in 1996 and it has been a W3C Recommendation since February 1998, which may make you suspect that this is rather immature technology. In fact, the technology isn't very new. Before XML there was SGML, developed in the early '80s, an ISO standard since 1986, and widely used for large documentation projects. The designers of XML simply took the best parts of SGML, guided by the experience with HTML, and produced something that is no less powerful than SGML, and vastly more regular and simple to use.
XML in 10 Points
7. XML leads HTML to XHTML
There is an important XML application that is a document format: W3C's XHTML, the successor to HTML. XHTML has many of the same elements as HTML. The syntax has been changed slightly to conform to the rules of XML. A format that is "XML-based" inherits the syntax from XML and restricts it in certain ways (e.g., XHTML allows "<p>", but not "<r>"); it also adds meaning to that syntax (XHTML says that "<p>" stands for "paragraph", and not for "price", "person", or anything else).
XML in 10 Points
8. XML is modular
Using XML, you can define vocabularies that are designed to be reused. By creating DTDs or XML Schemas, you can create sets of documents that are all based on common vocabularies. Similarly, using XML Namespaces, you can publish and share those vocabularies without conflicts. Since two formats developed independently may have elements or attributes with the same name, care must be taken when combining those formats (does "<p>" mean "paragraph" from this format or "person" from that one?).
XML in 10 Points
9. XML is the basis for RDF and the Semantic Web
RDF, or the Resource Description Framework, and the Semantic Web are both initiatives of the W3C to help refine the way information is organized on the Web. XML is the basis of these technologies, and will help organize the information on the Web, making it easier for users to find and access the information they need.
XML in 10 Points
10. XML is license-free, platform-independent and well-supported
XML is not owned by any corporation, nor is it controlled by a corporation. It is a publication of the W3C, and as such, it can be used freely by anyone. And although some may have issues with the W3C process, or what ends up in the final Recommendations, the bottom line is that it makes XML a fairly open standard. (An open standard is a standard that is publicly available and has various rights to use associated with it.)
References in Italian
"XML in 10 punti" (XML in 10 points), by Andrea Benassi, 26 November 2003. This ten-point summary tries to gather some basic concepts that let the newcomer see a little light through the fog.
http://www.indire.it/content/index.php?action=read&id=313
XML and the W3C
XML is recommended by the World Wide Web Consortium (W3C). The Recommendation specifies both the lexical grammar and the requirements for parsing. The lexical grammar is the set of rules governing how a character sequence is divided into subsequences of characters, each of which represents an individual token. Parsing, or, more formally, syntactic analysis, is the process of analyzing a sequence of tokens to determine their grammatical structure with respect to a given (more or less) formal grammar.
History
XML started as a simplified subset of the Standard Generalized Markup Language (SGML). The versatility of SGML for dynamic information display was understood by early digital media publishers in the late 1980s, prior to the rise of the Internet. By the mid-1990s some practitioners of SGML had gained experience with the World Wide Web, and believed that SGML offered solutions to some of the problems the Web was likely to face as it grew. Dan Connolly added SGML to the list of W3C's activities when he joined the staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and recruited collaborators.
Evolution
XML was compiled by a working group of eleven members, supported by an (approximately) 150-member Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus or, when that failed, majority vote of the Working Group. The XML Working Group never met face-to-face; the design was accomplished using a combination of email and weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and November 1996, when the first Working Draft of an XML specification was published. Further design work continued through 1997, and XML 1.0 became a W3C Recommendation on February 10, 1998.
Working Group's goals
Internet usability and general-purpose usability
SGML compatibility
Facilitation of easy development of processing software, and minimization of optional features
Legibility, formality, conciseness, and ease of authoring
Like its antecedent SGML, XML allows for some redundant syntactic constructs and includes repetition of element identifiers. In these respects, terseness was not considered essential in its structure.
The name XML: other names that were considered (a curiosity)
"MAGMA" (Minimal Architecture for Generalized Markup Applications)
"SLIM" (Structured Language for Internet Markup)
"MGML" (Minimal Generalized Markup Language)
Why not SGML?
SGML has many merits, but it is notably complex to use and to understand, and it was not designed for the network. XML contains all the features of SGML that are needed to create general applications, without descending to the level of detail and pedantry that SGML requires.
In addition, the success of HTML showed that the developer community is ready to embrace a markup-based model, and that simplicity is a fundamental strength.
The differences between SGML and XML are highlighted in a note published by the W3C, which can be found at: http://www.w3.org/TR/NOTE-sgmlxml-971215
XML versions
The first version, XML 1.0, was initially defined in 1998. It has undergone minor revisions since then, without being given a new version number, and was in its fourth edition, published on August 16, 2006, when these slides were written (a fifth edition followed on November 26, 2008). It is widely implemented and still recommended for general use.
The second, XML 1.1, was initially published on February 4, 2004, the same day as XML 1.0 Third Edition, and is in its second edition, published on August 16, 2006. XML 1.1 is not very widely implemented and is recommended for use only by those who need its unique features.
XML 1.0 and XML 1.1 differ in the requirements for characters used in element and attribute names: XML 1.0 (through its fourth edition) only allows characters which are defined in Unicode 2.0, which includes most world scripts but excludes those added in later Unicode versions.
The HTML case
XML is not a replacement for HTML. HTML was born as an SGML DTD for publishing simple text documents with a few images and hypertext links. Over time, many proprietary extensions were implemented, which created barriers to interoperability between tools. Browsers (parsers) relax the syntactic rules and interpret even incorrect HTML documents. HTML is for presenting information; XML is for describing information.
Many Technologies Contribute to the Power of XML
If you wanted to use XML as a file format for storing information, and then publish that information in print, on CD-ROM, and on the World Wide Web, you would need to make use of some other technologies that are not specifically XML, but might be based on XML, or be supplementary to XML. You might have an XML document that you want to display on the Web; however, XML documents do not contain any information about display formatting. To transform the XML data into HTML or XHTML for displaying it on the Web, you might need to use a style sheet language, such as the Extensible Stylesheet Language (XSL).
Document Type Definition
You might also need to specify exactly how XML files are to be structured, using a set of rules: a Document Type Definition (DTD). DTDs are an integral part of creating valid XML, but they have no specification of their own. They are a holdover from SGML, maintained for compatibility reasons, and the syntax used for the declarations in DTDs is defined as a part of the XML 1.0 Recommendation. DTDs are useful: without them or another type of schema, it is impossible to verify that an XML file is structured properly within the rules the author had in mind. But DTDs are not required in order to use XML.
Note: XML comes in two varieties, well-formed and valid
Well-formed XML means that the XML is written in the proper format, and that it complies with all the syntax rules for XML as set forth in the XML 1.0 Recommendation. Valid XML means that the XML document is well-formed and has also been validated against a rule set, or schema.
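A minimal sketch of a well-formedness check, using the parser in Python's standard library. The helper name is_well_formed and the sample documents are assumptions made for illustration. Note that this parser checks well-formedness only; it does not validate against a DTD or schema, which would need a separate validating tool.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<note><to>Reader</to></note>"
forgotten_end_tag = "<note><to>Reader</note>"   # <to> is never closed
unquoted_attribute = "<note lang=en/>"          # attribute value lacks quotes

for sample in (good, forgotten_end_tag, unquoted_attribute):
    print(is_well_formed(sample), sample)
```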
The XML 1.0 Recommendation defines the basic structures of XML
Elements; attributes; entities; notations; CDATA sections; parsed character data (PCDATA); comments.
This includes defining the conventions for names, case sensitivity, start tags, end tags, and so on. Everything you need to work with well-formed XML is contained within this one Recommendation.
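As a rough illustration, the sketch below parses a small document that uses several of these structures: a comment, an element with an attribute, a predefined entity reference, and a CDATA section. The document content (book, title, code, the id attribute) is invented for the example, and the behaviour shown is that of Python's ElementTree parser.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<!-- a comment: dropped by the default ElementTree parser -->
<book id="b1">
  <title>Tom &amp; Jerry</title>
  <code><![CDATA[if (a < b) { run(); }]]></code>
</book>"""

root = ET.fromstring(doc)

# The entity reference &amp; is expanded to a literal ampersand.
title = root.find("title").text
# The CDATA section lets '<' appear unescaped; the parser returns plain text.
code = root.find("code").text

print(root.tag, root.get("id"))
print(title)
print(code)
```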
XML-Related Recommendations
There are also a number of W3C Recommendations that are very closely related to the core XML technology. In this category, the Recommendations define some technologies that are designed specifically to add functionality to XML 1.0. These technologies include XML Namespaces and XML Schemas.
Namespaces
XML allows developers to create their own markup languages, for use in a variety of applications. However, there is nothing to stop two developers from developing markup languages that have similar tags, but with different structure or meaning. If both of these developers were using their markup languages internally only, this might not be a problem. But what if these developers start sharing their vocabularies with their clients, vendors, and the general public? The result could be confusion about what tag means what, and in what context.
Namespace example (I)
Developer One designs an element with child elements, containing: John Doe. Developer Two, however, prefers to use an element of the same name with no children, containing: John Doe. What happens if a vendor is working with both organizations?
Namespace example (II)
The solution is to create elements as part of a specific namespace. When they are used, the parser is aware that they belong to that namespace, and if a similar element is used but belongs to a different namespace, there is no conflict. Namespaces make use of a special attribute called xmlns that allows you to define a prefix and the namespace URI. In the slide's example, one namespace carries the name John Dough and the other the name Jane Doe.
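The original markup of this example is not preserved in the transcript, so the following is only a sketch of how it might look. The prefixes one and two, the example.com namespace URIs, and the element names name, first and last are all assumptions. It shows two same-named elements kept apart by their namespaces, read with Python's ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both developers use a "name" element,
# but each declares it in its own namespace via xmlns:prefix="URI".
doc = """<people xmlns:one="http://example.com/ns/one"
        xmlns:two="http://example.com/ns/two">
  <one:name><one:first>John</one:first><one:last>Dough</one:last></one:name>
  <two:name>Jane Doe</two:name>
</people>"""

root = ET.fromstring(doc)
ns = {"one": "http://example.com/ns/one", "two": "http://example.com/ns/two"}

structured = root.find("one:name", ns)  # Developer One's element, with children
flat = root.find("two:name", ns)        # Developer Two's element, text only

# ElementTree exposes the fully qualified name as {namespace-URI}local-name.
print(structured.tag)
print(flat.tag)
```

The prefix itself is only shorthand; what distinguishes the two elements is the namespace URI bound to it.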
XML Schemas
In order to be considered valid, an XML document must conform to either a DTD or an XML Schema. XML Schema is a formal schema language for defining the structure of XML documents. The XML Schema specification deals with some of the shortcomings of DTDs, such as the lack of robust data typing, and also abandons the cryptic syntax of DTDs for an easier-to-use XML-based syntax.
XML Family
There are also a number of W3C Recommendations that deal with various aspects of XML that are not necessarily related to the structure of an XML document but provide mechanisms for implementing XML in practical solutions. These Recommendations are related to the display or navigation of XML documents.
XML is in fact a family of languages. Some aim to be standards; others are only proposals from private parties or interested industries. Some are general-purpose; others are specific applications for narrow domains.
Extensible Stylesheet Language (XSL)
XSL is a stylesheet language designed to aid in the presentation of XML. As a stylesheet language, it is similar to Cascading Style Sheets (CSS), although there are some significant differences. XSL uses an XML syntax to specify how elements within an XML document should be displayed.
Extensible Stylesheet Language (XSL) example
Take an XML document with the title "Introducing XML", the author "John Doe", and the text "Learning about XML is not complicated...". If we wanted to display the title of the document in italic, we could write an XSL stylesheet with a rule for the title element. When the stylesheet and the XML document are processed by an XSL-capable processor, the result will be a document displayed with the title in italic.
Extensible Stylesheet Language Transformations (XSLT)
XSLT is a technology that allows developers to author a stylesheet which, when processed, results in the elements and attributes of an XML document being transformed into another format. For example, by using XSLT it is possible to transform an XML element containing the text John Doe into an HTML tag set wrapping the same text.
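The element names of this example are lost in the transcript, so the sketch below assumes a source element called name and a target HTML element b. It is not XSLT: Python's standard library has no XSLT processor, so the same kind of element-to-element mapping is written by hand with ElementTree, purely to show what such a transformation does. The function name name_to_bold is hypothetical.

```python
import xml.etree.ElementTree as ET

def name_to_bold(source: str) -> str:
    """Map an assumed <name> element to an HTML <b> element with the same text."""
    elem = ET.fromstring(source)
    if elem.tag != "name":
        raise ValueError("expected a <name> element")
    out = ET.Element("b")
    out.text = elem.text
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(out, encoding="unicode")

result = name_to_bold("<name>John Doe</name>")
print(result)
```

An XSLT stylesheet would express the same mapping declaratively, as a template matching the source element, rather than as procedural code.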
XPath/XPointer
XPath is a Recommendation that was developed specifically for locating components within an XML document.
XPointer is a Recommendation that allows developers to easily refer to and locate XML document fragments. This is very useful for several types of applications, including the ability to have multiple authors working on a single large XML document, or making extremely large XML documents more manageable for editing purposes. XPointer enables you to specify points and ranges within your XML documents, which can then be treated as "mini" documents in their own right.
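To give a feel for locating components with path expressions, here is a small sketch using ElementTree, which supports only a limited subset of XPath. The library document and its element names are invented for the example.

```python
import xml.etree.ElementTree as ET

doc = """<library>
  <book year="1998"><title>XML Basics</title></book>
  <book year="2004"><title>XML 1.1 Notes</title></book>
  <magazine><title>Markup Monthly</title></magazine>
</library>"""
root = ET.fromstring(doc)

# Child steps: titles of books that are direct children of the root.
book_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only the book whose year attribute is '2004'.
recent = [t.text for t in root.findall(".//book[@year='2004']/title")]

# Descendant step: every title anywhere in the tree.
all_titles = [t.text for t in root.findall(".//title")]

print(book_titles, recent, all_titles)
```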
XLink/XInclude/XBase
One of the most powerful aspects of information on the World Wide Web is the ability to link together documents of interest. Therefore, a linking mechanism for XML documents naturally increases the power of XML. The XLink and XBase Recommendations are both used to specify information about linking XML documents together. Linking in XML is more complicated than in HTML, because there are more types of links available to developers.
There are also applications where simply linking between documents might not be ideal and you might want to build a large XML document from a set of smaller documents. For that purpose, there is the XInclude Recommendation, which provides the means to include sets of XML documents into a single document structure.
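A minimal sketch of building one document from smaller ones with XInclude, using the ElementInclude module of Python's standard library. To keep the example self-contained, the included "files" live in an in-memory dictionary and are served by a custom loader; the names parts, load_part, chapter1.xml and chapter2.xml are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-in for files on disk: href -> XML text (hypothetical content).
parts = {
    "chapter1.xml": "<chapter>First chapter</chapter>",
    "chapter2.xml": "<chapter>Second chapter</chapter>",
}

def load_part(href, parse, encoding=None):
    """Custom loader: return the parsed element for an xi:include href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(parts[href])

master = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter1.xml"/>'
    '<xi:include href="chapter2.xml"/>'
    '</book>'
)

# Replace each xi:include element, in place, with the element it refers to.
ElementInclude.include(master, loader=load_part)

chapters = [c.text for c in master.findall("chapter")]
print(chapters)
```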
Processing XML files
Three traditional techniques for processing XML files are:
Using a programming language and the SAX API.
Using a programming language and the DOM API.
Using a transformation engine and a filter (XSL).
An application programming interface (API) is a set of functions, procedures, methods or classes that an operating system, library or service provides to support requests made by computer programs.
Document Object Model, or DOM
XML documents, and structured documents like them, are trees, and the DOM is essentially an API for manipulating the document tree. Rather than an API based on user events (such as clicking a mouse), the DOM is based on the structure of the document itself. The DOM is likely to be best suited for applications where the document must be accessed repeatedly or out of sequence. If the application is strictly sequential and one-pass, the SAX model is likely to be faster and use less memory.
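A short sketch of DOM-style access, using the minidom implementation in Python's standard library. The order document is invented for the example. Because the whole tree is in memory, nodes can be visited in any order and the tree can be modified.

```python
from xml.dom import minidom

dom = minidom.parseString("<order><item>pen</item><item>ink</item></order>")

# Out-of-sequence access: read the last item first, then the first one.
items = dom.getElementsByTagName("item")
last_then_first = [items[-1].firstChild.data, items[0].firstChild.data]

# The tree can also be changed in place: append a third item.
new_item = dom.createElement("item")
new_item.appendChild(dom.createTextNode("paper"))
dom.documentElement.appendChild(new_item)

count = len(dom.getElementsByTagName("item"))
print(last_then_first, count)
```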
Simple API for XML, or SAX
SAX is an event-driven API, which means that rather than working with the document structure as a whole, SAX allows you to deal with specific parts of a document as the document is parsed. The quantity of memory that a SAX parser must use in order to function is typically much smaller than that of a DOM parser. DOM parsers must have the entire tree in memory before any processing can begin. The memory footprint of a SAX parser, by contrast, is based only on the maximum depth of the XML file.
Because of the event-driven nature of SAX, processing documents can often be faster than with DOM-style parsers. Memory allocation takes time, so the larger memory footprint of the DOM is also a performance issue. Due to the nature of DOM, streamed reading from disk is impossible. Processing XML documents that could never fit into memory is only possible through the use of a stream XML parser, such as a SAX parser.
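The event-driven model can be sketched with Python's standard xml.sax module. The handler below (ItemCounter, a name chosen for this example) never builds a tree: it only reacts to start and end events, counting hypothetical item elements and tracking the maximum nesting depth, which is the quantity the memory footprint is said to depend on.

```python
import xml.sax

class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements and records the deepest nesting level seen."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.items = 0

    def startElement(self, name, attrs):
        # Called by the parser each time a start tag is read.
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        if name == "item":
            self.items += 1

    def endElement(self, name):
        # Called by the parser each time an end tag is read.
        self.depth -= 1

handler = ItemCounter()
xml.sax.parseString(b"<order><item>pen</item><item>ink</item></order>", handler)
print(handler.items, handler.max_depth)
```

The handler keeps only three integers of state regardless of how long the document is, which is the practical appeal of the streaming approach.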
XML and Data: Document Repositories
There are a number of tools called document repositories, which are designed specifically for maintaining large documents or sets of documents. Because these tools are based on SGML, most have rapidly adapted to XML and are available for use now. Document repositories can be viewed as specialized databases, designed to work with large documents. They often have special features, such as the capability to enable users to edit only a part of a document, and then integrate that part into the ...
XML and Data: XQuery
The proper design of your database structure (the schema) is essential, but the best data in the world is useless without proper queries. Because XML documents are now being stored in relational databases, object databases, document repositories, and as simple flat files, the W3C wanted to create a common query language which would enable users to create queries that would work across all these different kinds of data applications. One way to look at XQuery is as an XML-specific SQL. The advantage of XQuery for XML is that XQuery is being designed specifically for XML, with the structure of XML documents in mind.
The Related Technologies
There is another category of XML technologies called XML vocabularies. These are individual markup languages that have been written using XML 1.0. XML vocabularies can be treated just like any other XML document, because they are well-formed (and in many cases, valid) XML. When you are developing XML documents, what you are really doing is developing your own XML vocabularies. However, there may already be an existing XML vocabulary that will meet your needs. There are literally hundreds of XML vocabularies in existence. Some of these vocabularies are being developed privately for use within a specific organization, and some are being developed publicly for anyone to use. The vocabularies we have chosen to cover here are being developed in conjunction with the W3C, and either are, or will likely become, W3C Recommendations.
Different Vocabularies: XHTML
XHTML stands for Extensible HyperText Markup Language. XHTML is simply HTML, rewritten to comply with the rules for being well-formed XML. The reasoning behind this move is that XHTML will allow XML applications to read and treat HTML as if it were just another XML document. One critical difference is that, unlike HTML, XHTML is case sensitive, and all the tags have to appear in lower case. That is because XML is case sensitive, so tags that differ only in case, such as <P> and <p>, are not the same tag. Additionally, XHTML requires that all tags be properly closed and nested; HTML does not.
Different Vocabularies
Wireless Markup Language (WML): to make communication between wireless devices easier, and to serve documents to wireless devices, there is an XML-based vocabulary in use (and in ongoing development) designed specifically for wireless.
Scalable Vector Graphics (SVG) is an XML-based specification for creating graphics, which could be used on the Web or in print. SVG enables these graphics to be created in a text file, based on the geometry of the graphic.
Synchronized Multimedia Integration Language (SMIL) is an XML-based language that allows developers to create multimedia presentations. It offers features similar to those of PowerPoint or Flash, such as animated graphics, sounds, and the ability to interact with the presentation on some level (such as following links).
Resource Description Framework (RDF) is primarily an XML-based format for expressing metadata about information on the Web. Metadata is data about data; for example, a table of contents in a book might be considered metadata because it describes the contents of each chapter in the book.
Reasons for using XML
To transmit data between different systems (and often between different platforms)
To send information in a format independent of its representation (separation of content and presentation)
To exchange information together with its semantic structure
To transmit data that is easily intelligible to both humans and computers
To let companies speed up integration with their business partners
To improve the dissemination of information within the company and on the Web
To handle the documents that were previously the domain of EDI
XML technology: advantages
User-oriented data presentation: the combination of XML and XSL makes it possible to separate business logic from presentation logic, and frees the application from constraints tied to the presentation device.
Data exchange between applications: integration between applications is possible with an effort that is a fraction of the traditional effort in the EDI area.
Publication of data directly in XML: the machine-readable format (Unicode) can be combined with other data and processed further (impossible with HTML).
Main application areas
In their book "The XML Handbook", Goldfarb and Prescod divide all XML applications into two broad categories: POP (presentation-oriented publishing) and MOM (message-oriented middleware).
POP handles documents whose end user is a human reader. Publishing texts, manuals and presentations are goals of POP. The aims of POP are similar to those of HTML. With XML, however, it is possible to give texts richer structural connotations (see: DocBook). Stylesheets make it possible to transform documents that represent the logical structure into documents that describe the physical layout. By changing the stylesheet, one can change the way documents are displayed or printed.
MOM is based on the exchange of XML documents between programs in order to carry out a coordinated function in a distributed environment. An example of MOM is the automatic handling of orders between suppliers and customers. MOM can involve different kinds of resources (for example, databases and message-queuing systems), for which XML-based interfaces are becoming widespread.
XML - Introduction 50 Presentation Oriented Publishing: POP was the killer application of SGML. It brought huge savings to companies publishing large document collections in the 1980s (before the Web itself existed). Instead of creating formatted documents, human authors create unformatted abstractions. The file represents what is in the document, not how it should look. The POP end user cares about the rendition, not the data. To obtain the desired result, specify style sheets: one for print, one for CD-ROM, one for the Web, etc.
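The idea of one unformatted source with several renditions can be sketched as follows. This is not XSL: the two Python functions merely stand in for style sheets, and the `manual`, `title` and `step` element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# The document says what the content is, not how it should look.
DOCUMENT = """<manual>
  <title>Pump maintenance</title>
  <step>Switch off the power.</step>
  <step>Open the valve &amp; drain the tank.</step>
</manual>"""


def render_text(root):
    """Stand-in for a print style sheet: upper-case title, numbered steps."""
    lines = [root.findtext("title").upper()]
    lines += [f"{n}. {step.text}" for n, step in enumerate(root.iter("step"), start=1)]
    return "\n".join(lines)


def render_html(root):
    """Stand-in for a Web style sheet: heading plus ordered list."""
    items = "".join(f"<li>{escape(step.text)}</li>" for step in root.iter("step"))
    return f"<h1>{escape(root.findtext('title'))}</h1><ol>{items}</ol>"


root = ET.fromstring(DOCUMENT)
print(render_text(root))
print(render_html(root))
```

Adding a third output format means adding a third rendering function; the source document stays untouched.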
XML - Introduction 51 Message Oriented Middleware: MOM is the killer application of XML on the Web. MOM radically changes the concept of middleware.
XML - Introduction 52 XML Application Areas: Content management: presentation-oriented publishing; one common data format; multiple rendering styles (XSL). Data interchange/EDI: interfacing of heterogeneous products; inter-process communication (IPC). Application integration: application-to-application communication; Internet message formats (protocols); client/middle tier/server. Data aggregation/portal: enterprise information portals.
XML - Introduction 53 Electronic Data Interchange: The transfer of structured data, by agreed message standards, from one computer system to another without human intervention. Even in this era of technologies such as XML web services, the Internet and the World Wide Web, EDI is still the data format used by the vast majority of electronic commerce transactions in the world. It comprises: a set of syntax rules for structuring data; a protocol for interactive exchange; standard messages. Organizations that send or receive documents are called "trading partners" in EDI terminology.
XML - Introduction 54 Essential elements of EDI the use of an electronic transmission medium (originally a value-added network, but increasingly the open, public Internet) rather than the despatch of physical storage media such as magnetic tapes and disks; the use of structured, formatted messages based on agreed standards (such that messages can be translated, interpreted and checked for compliance with an explicit set of rules); relatively fast delivery of electronic documents from sender to receiver (generally implying receipt within hours, or even minutes); and direct communication between applications (rather than merely between computers).
XML - Introduction 55 The old EDI: Different formats for each application. Application code has no single, uniform view. New participants have a devastating impact. Only previously defined elements can be shared. New needs cannot easily be met.
XML - Introduction 56 XML can be the solution: Different formats for each application. XML provides a single logical view. The flexible architecture supports new components.
XML - Introduction 57 Distributed Computing (I): Slow reaction to change. High maintenance costs. Limited flexibility. Data changes propagate to every tier.
XML - Introduction 58 Distributed Computing (II): More standard. Simpler. More easily extensible. Lower maintenance costs. Greater responsiveness. Standard APIs and template languages.
XML - Introduction 59 Example: electronic invoicing. Processable electronic invoicing, that is, invoicing aimed at automating accounting entries, is based on systems for transmitting commercial and administrative data which, using telematic networks or national and international telecommunications networks, allow structured messages conforming to an agreed standard to be exchanged automatically between two software applications. Examples are the traditional EDI (Electronic Data Interchange) transmission systems, which exchange data according to international standard record layouts over private networks; the more innovative and less costly WebEDI solutions, which use web-based transmission technologies; and the most recent XML-based solutions, where data is exchanged using the XML (eXtensible Markup Language) metalanguage, following either the same standards as EDI or new international standards.
XML - Introduction 60 XML/EDI approach based on message exchange (Piero De Sabbata, ENEA)
XML - Introduction 61 Message transmission and security (Piero De Sabbata, ENEA)
XML - Introduction 62 The message-based scenario (Piero De Sabbata, ENEA)
XML - Introduction 64 Consortium Recommendations: Canonical XML · CDF · CSS · DOM · HTML · MathML · OWL · PLS · RDF · RDF Schema · SISR · SMIL · SOAP · SRGS · SSML · SVG · SPARQL · Timed Text · VoiceXML · WSDL · XForms · XHTML · XML · XML Base · XML Events · XML Information Set · XML Schema (W3C) · XML Signature · XPath · XPointer · XQuery · XSL Transformations · XSL-FO · XSL · XLink. Notes: XHTML+SMIL · XAdES. Working Drafts: CCXML · CURIE · InkML · XFrames · XFDL · WICD · XHTML+MathML+SVG · XBL · XProc · HTML 5
XML - Introduction 65 Unicode: An encoding system that assigns a unique number to every character used for writing text, independently of the language, the computing platform and the program in use. The code assigned to a character is written as U+ followed by the four to six hexadecimal digits of the number that identifies it. At present the Unicode standard does not yet represent all the characters in use in the world. It is still evolving and aims to cover every representable character, guaranteeing compatibility and no overlap with the character codes already defined, while leaving well-defined ranges of "unused" codes reserved for private use within particular applications.
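The U+ notation can be reproduced with Python's built-in `ord`, which returns the number Unicode assigns to a character. A minimal sketch follows; the helper name `code_point_label` is made up for this example.

```python
import unicodedata


def code_point_label(char):
    """Return the U+ notation for one character, padded to at least four hex digits."""
    return f"U+{ord(char):04X}"


for ch in ["A", "א", "€", "😀"]:
    print(ch, code_point_label(ch))

# Code points reserved for private use carry the general category "Co".
print(unicodedata.category("\uE000"))
```

Characters outside the first 65,536 code points, such as the emoji in the last example, need five or six hexadecimal digits.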
XML - Introduction 67 Character encoding: Unicode can be implemented by different character encodings. A character encoding is a code that maps a set of characters to a set of other objects, such as numbers (especially in computing), in order to make it easier to store text in a computer or transmit it over a telecommunications network. Common examples are Morse code and ASCII. The most commonly used encoding is UTF-8.
XML - Introduction 68 UTF-8: UTF-8 (Unicode Transformation Format, 8 bit) is an encoding of Unicode characters as variable-length sequences of bytes. It uses 1 to 4 bytes to represent a Unicode character. For example, a single byte is enough for each of the 128 ASCII characters, which correspond to Unicode positions U+0000 through U+007F. Examples: http://it.wikipedia.org/wiki/UTF-8#Descrizione http://en.wikipedia.org/wiki/UTF-8#Examples
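To illustrate that one Unicode character can be stored under different encodings, the sketch below encodes the same character with three encodings available in Python's standard codec set. The helper name `encodings_of` is made up for this example.

```python
def encodings_of(char):
    """Map encoding name -> hexadecimal bytes of the same character."""
    names = ("utf-8", "utf-16-be", "utf-32-be")
    return {name: char.encode(name).hex() for name in names}


for name, hex_bytes in encodings_of("€").items():
    print(f"{name:10} {hex_bytes}")
```

The code point is the same in all three cases (U+20AC); only the byte sequence used to store or transmit it changes.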
XML - Introduction 69 Examples
Unicode range          UTF-8 binary
0x000000 - 0x00007F    0xxxxxxx
0x000080 - 0x0007FF    110xxxxx 10xxxxxx
0x000800 - 0x00FFFF    1110xxxx 10xxxxxx 10xxxxxx
0x010000 - 0x10FFFF    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
The character alef (א), which is Unicode U+05D0, is encoded in UTF-8 as follows: it falls in the range 0x0080 through 0x07FF, so according to the table it is represented with two bytes, 110xxxxx 10xxxxxx. Hexadecimal 0x05D0 is binary 101-1101-0000. The eleven bits are copied in order into the positions marked "x": 110-10111 10-010000. The final result is the byte pair 11010111 10010000, or 0xD7 0x90 in hexadecimal.
The dollar sign ($), which is Unicode U+0024 or binary 10 0100, falls into the first line of the table, the range U+0000 through U+007F, so it is encoded using one byte, 0xxxxxxx. Putting the binary right-justified into the "x" bits results in 00100100, which is 0x24 in hexadecimal. Thus the ASCII dollar sign is encoded unchanged.
The Euro symbol (€), which is Unicode U+20AC or binary 10 0000 1010 1100, falls into the third line of the table, the range U+0800 through U+FFFF, so it is encoded using three bytes, 1110xxxx 10xxxxxx 10xxxxxx. Putting the binary right-justified into the "x" bits results in 11100010 10000010 10101100. These bytes in hexadecimal are 0xE2 0x82 0xAC, the encoding of the Euro symbol (€) in UTF-8.
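The bit-copying procedure in the table can be written out by hand and checked against Python's built-in encoder. The function name `utf8_encode` is made up for this sketch; real programs would simply call `str.encode("utf-8")`.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point following the four ranges of the table."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point <= 0x7F:        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:       # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0b111111)])
    if code_point <= 0xFFFF:      # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0b111111),
                      0b10000000 | (code_point & 0b111111)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (code_point >> 18),
                  0b10000000 | ((code_point >> 12) & 0b111111),
                  0b10000000 | ((code_point >> 6) & 0b111111),
                  0b10000000 | (code_point & 0b111111)])


for char in ["$", "א", "€"]:
    encoded = utf8_encode(ord(char))
    # The hand-written encoder should agree with the built-in one.
    print(char, encoded.hex(), encoded == char.encode("utf-8"))
```

Each branch shifts the code point right to pick out a group of six bits (fewer for the leading byte) and ORs it with the fixed prefix from the corresponding table row.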
XML - Introduction 70 World Wide Web Consortium: The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web (abbreviated WWW or W3). It is arranged as a consortium where member organizations maintain full-time staff for the purpose of working together in the development of standards for the World Wide Web. As of October 2008, the W3C had 418 members (http://www.w3.org/Consortium/Member/List). W3C also engages in education and outreach, develops software and serves as an open forum for discussion about the Web. It was founded and is headed by Sir Tim Berners-Lee.
XML - Introduction 72 What is a Recommendation? Unlike an officially sanctioned standards body, such as the International Organization for Standardization (ISO), the W3C is not an official standards organization. The W3C simply publishes "Recommendations," which are not binding in any way. Simply put, they are a set of guidelines, published and copyrighted by the W3C. The power of these "Recommendations" comes from the fact that people treat them as standards by consensus, and the fact that you can't claim compliance with a Recommendation and not be in compliance without violating the copyrights.
XML - Introduction 73 Charles F. Goldfarb was tasked with building a system for storing, searching, managing and publishing legal documents. Goldfarb discovered that many systems at IBM could not communicate with one another: the file formats used by the different applications were proprietary, and different from each other. Three important points: the different programs needed to support a common representation of documents; the common language had to be specific to legal documents; the language had to be specified formally, so that elements could be delimited properly. The answer was GML (Generalized Markup Language), the precursor of SGML (Standard Generalized Markup Language), the language from which XML derives.
XML - Introduction 74 Standard Generalized Markup Language (ISO 8879:1986 SGML) is an ISO Standard metalanguage in which one can define markup languages for documents. SGML is a descendant of IBM's Generalized Markup Language (GML), developed in the 1960s by Charles Goldfarb, Edward Mosher and Raymond Lorie (whose surname initials were used by Goldfarb to make up the term GML). SGML provides an abstract syntax that can be realized in many different concrete syntaxes. SGML was originally designed to enable the sharing of machine-readable documents in large projects in government, law and industry, which have to remain readable for several decades. It has also been used extensively in the printing and publishing industries, but its complexity has prevented its widespread application for small-scale general-purpose use. Primarily intended for text and database publishing, one of its first major applications was the second edition of the Oxford English Dictionary (OED), which was and is wholly marked up using an SGML-like markup.
XML - Introduction 75 W3C XML 10 Years On 10 February 1998, W3C published Extensible Markup Language (XML) 1.0 as a W3C Recommendation. W3C would like to thank the dedicated communities -- including people who have participated in W3C's XML groups and mailing lists, the SGML community, and xml-dev -- whose efforts have created a successful family of technologies based on the solid XML 1.0 foundation. "There is essentially no computer in the world, desk-top, hand-held, or back-room, that doesn't process XML sometimes," said Tim Bray of Sun Microsystems. "This is a good thing, because it shows that information can be packaged and transmitted and used in a way that's independent of the kinds of computer and software that are involved. XML won't be the last neutral information-wrapping system; but as the first, it's done very well."
XML - Introduction 76 The concept of metalanguage (I): In logic and linguistics, a metalanguage is a language used to make statements about another language, which is called the object language (that is, a formalism for rigorously describing another language). Markup languages are different from metalanguages as they only describe how a document should be presented and not the syntax of a computer programming language; however, it is possible to use schemas such as XML Schema to describe content rules. XML is the metalanguage used to describe XHTML just as SGML is used to describe HTML. XHTML is much stricter than HTML; for example, XHTML is case sensitive, unlike HTML.
XML - Introduction 77 The concept of metalanguage (II) [Diagram labels: metalanguage (metasyntax): XML; languages (syntax): MathML, XHTML, DocBook; documents]
XML - Introduction 78 The concept of metalanguage (III): Since XML is a metalanguage for specifying other languages, it provides a common layer for dialogue across different environments. XML says nothing about which tags to use; it only fixes common rules so that the file can be parsed correctly. XML can be used for the most disparate purposes, depending on the operations the specific application performs in response to the markup it finds. [Diagram labels: XML rules; specific tags; application; XML parser; data (XML file)]
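A small sketch can show that a generic XML parser enforces only the common rules and is indifferent to the tag names an application chooses. The vocabularies below (`invoice`, `recipe`) and the helper name `root_tag` are invented for illustration.

```python
import xml.etree.ElementTree as ET


def root_tag(text):
    """Return the root element name if the text is well-formed XML, else None."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        return None


invoice = "<invoice><total currency='EUR'>120.50</total></invoice>"
recipe = "<recipe><ingredient>flour</ingredient></recipe>"
broken = "<recipe><ingredient>flour</recipe>"   # <ingredient> is never closed
mixed_case = "<P>text</p>"                      # names are case sensitive

for sample in (invoice, recipe, broken, mixed_case):
    print(root_tag(sample))
```

The same parser accepts both vocabularies and rejects the last two samples purely on syntactic grounds; what an `invoice` or a `recipe` means is left to the application that consumes the parsed tree.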