From:  "Clark C . Evans" <cce@c...>
Date:  Fri May 11, 2001  8:50 pm
Subject:  YAML Draft 0.1

With quite a bit of work, I've tried to come up with
a first pass at this proposal.  It's at www.yaml.org,
below is the current text version.  Your comments
would be very cool.

Thanks!

Clark


+---------------------------------------------------------------+
|                  Welcome to YAML, Draft 0.1                   |
+---------------------------------------------------------------+
| YAML is a straightforward markup language, offering an        |
| alternative to XML, borrowing ideas from C, HTML, Perl, and   |
| Python.                                                       |
|                                                               |
|   * YAML texts are brief and readable.                        |
|   * YAML is very expressive and extensible.                   |
|   * YAML has a simple stream based interface.                 |
|   * YAML uses data structures native to your programming      |
|     language.                                                 |
|   * YAML is easy to implement, perhaps too easy.              |
|   * YAML has a solid information model, no exceptions no      |
|     mess.                                                     |
|                                                               |
+---------------------------------------------------------------+
|                         Key Concepts                          |
+---------------------------------------------------------------+
| YAML is founded on several key concepts from very successful  |
| languages.                                                    |
|                                                               |
|   * YAML handles whitespace much as HTML does. In YAML,       |
|     sequences of spaces, tabs, and carriage return characters |
|     are folded into a single space during parsing. This       |
|     wonderful technique makes markup code readable by         |
|     enabling indentation without affecting the canonical form |
|     of the content.                                           |
|   * YAML uses backslash escape sequences similar to those in  |
|     C. In YAML, the backslash, \ , is used as an escape       |
|     indicator. As in C, \n represents a new line, \t          |
|     represents a tab, and \\ represents the backslash itself. |
|     In addition, since whitespace is folded, YAML introduces  |
|     \s to represent additional spaces that are part of the    |
|     content and should not be folded. Further, a \ character  |
|     at the end of a line serves as a continuation marker,     |
|     allowing content to be broken into multiple lines without |
|     introducing unwanted whitespace.                          |
|   * YAML uses data typing similar to Perl's. In YAML, there   |
|     are three fundamental types of data: scalars, which are   |
|     indicated by a dollar ($) sign; maps/hashes, which are    |
|     indicated by a percent (%) sign; and lists/vectors, which |
|     are indicated by an at (@) sign. Also like Perl, all node |
|     names (variables) begin with one of these indicators. As  |
|     a result, YAML's in-memory representation uses your       |
|     language's native map, list, and string constructs rather |
|     than inventing its own object model.                      |
|   * YAML uses block scoping similar to Python. In YAML, the   |
|     extent of a node is indicated by its child's nesting      |
|     level, i.e., what column it is in. Skeptical as you may   |
|     be, ask anyone who has worked with Python, and you will   |
|     hear that it makes the code more readable and less        |
|     error-prone. Try it. It makes life easy.                  |
|                                                               |
+---------------------------------------------------------------+
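
For anyone who wants to poke at the folding and escape rules above, here is a
rough Python sketch. The names decode_content, ESCAPES and TOKEN are made up
for illustration and are not part of the draft. Two things are assumptions on
my part: unknown escapes are left untouched, and whitespace around the whole
value is dropped before folding.

```python
import re

# Escape letters named in the draft; \r appears in the XML compatibility section.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "s": " ", "\\": "\\"}

TOKEN = re.compile(
    r"\\(?P<cont>[ \t]*\r?\n[ \t]*)"  # backslash ending a line: continuation
    r"|\\(?P<esc>.)"                  # backslash escape such as \n or \s
    r"|(?P<ws>[ \t\r\n]+)"            # run of whitespace to fold
)


def decode_content(raw):
    """Fold whitespace and expand escapes in one pass (hypothetical helper)."""

    def replace(match):
        if match.group("cont") is not None:
            return ""   # continuation joins the lines with no space
        if match.group("ws") is not None:
            return " "  # any whitespace run folds to one space
        # Assumption: an unknown escape is kept exactly as written.
        return ESCAPES.get(match.group("esc"), match.group(0))

    # Assumption: whitespace around the whole value is not significant.
    return TOKEN.sub(replace, raw.strip())


if __name__ == "__main__":
    raw = "afternoon. \\nIf Joe isn't around, try his house\\\n      keeper"
    print(decode_content(raw))
```

Doing everything in a single pass keeps an escaped backslash at the end of a
line from being mistaken for a continuation marker.

+---------------------------------------------------------------+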
|                            Example                            |
+---------------------------------------------------------------+
| Below is an example of an invoice expressed in YAML.          |
|   $invoice           00034843                                 |
|   $date              12-JAN-2001                              |
|   %buyer                                                      |
|       $given-name    Chris                                    |
|       $family-name   Dumars                                   |
|       %address                                                |
|           $line1     458 Wittigen's Way                       |
|           $line2     Suite #292                               |
|           $city      Royal Oak                                |
|           $state     MI                                       |
|           $postal    48046                                    |
|   @order                                                      |
|       %product                                                |
|           $id        BL394D                                   |
|           $desc      Grade A, Leather Hide Basketball         |
|           $price     $450.00                                  |
|           $quantity  4                                        |
|       %product                                                |
|           $id        BL4438H                                  |
|           $desc      Super Hoop (tm)                          |
|           $price     $2,392.00                                |
|           $quantity  1                                        |
|   $comments                                                   |
|       Mr. Dumars is frequently gone in the morning            |
|       so it is best advised to try things in late             |
|       afternoon. \nIf Joe isn't around, try his house\        |
|       keeper, Nancy Billsmer @ (734) 338-4338.                |
|   %delivery                                                   |
|       $method        UZS Express Overnight                    |
|       $price         $45.50                                   |
|   $tax               0%                                       |
|   $total             $4237.50                                 |
+---------------------------------------------------------------+
|                       Information Model                       |
+---------------------------------------------------------------+
| The information model is similar to XML, although it has many |
| significant differences.                                      |
| Document  The starting production for YAML is a List.         |
| List      An ordered sequence of zero or more Nodes           |
| Node      An ordered tuple having an optional Name followed   |
|           by a mandatory Value                                |
| Name      Identical to the Name production in the XML 1.0     |
|           specification.                                      |
| Value     Exactly one of String, Map, or List                 |
| String    A sequence of zero or more characters. A character  |
|           is identical to the character defined in the Char   |
|           production of the XML 1.0 specification.            |
| Map       An un-ordered sequence of zero or more Nodes such   |
|           that each Node's Name is unique within the          |
|           sequence. There may be only a single node without a |
|           name in each map.                                   |
+---------------------------------------------------------------+
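
One possible reading of this model in Python, using only native types: a
String is a str, a List is a list of (name, value) tuples, and a Map is a dict
whose keys are names, with None standing in for the one allowed anonymous
node. The type aliases and the make_map helper are hypothetical names for this
sketch, not anything the draft defines.

```python
from typing import Dict, List, Optional, Tuple, Union

Name = Optional[str]                 # None stands for an anonymous node
Value = Union[str, "YList", "YMap"]  # exactly one of String, List, or Map
Node = Tuple[Name, Value]            # optional Name followed by a Value
YList = List[Node]                   # ordered, names may repeat
YMap = Dict[Name, Value]             # unordered, names are unique


def make_map(nodes):
    """Build a Map from nodes, rejecting duplicate (or two anonymous) names."""
    result = {}
    for name, value in nodes:
        if name in result:
            raise ValueError("duplicate node name in map: %r" % (name,))
        result[name] = value
    return result


# Part of the invoice example, written directly as native structures.
invoice = [
    ("invoice", "00034843"),
    ("buyer", make_map([
        ("given-name", "Chris"),
        ("family-name", "Dumars"),
    ])),
    ("order", [
        ("product", make_map([("id", "BL394D"), ("quantity", "4")])),
        ("product", make_map([("id", "BL4438H"), ("quantity", "1")])),
    ]),
]
```

Because a dict key can be None only once, the "single node without a name"
rule falls out of the same uniqueness check as the named nodes.

+---------------------------------------------------------------+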
|                   Common XML Compatibility                    |
+---------------------------------------------------------------+
| Although the syntax is distinctly different, a restricted     |
| subset of YAML can be used to provide an isomorphic image of  |
| a Common XML text. This involves a few conventions layered    |
| upon YAML. Following are the simple mapping conventions.      |
| <x/>                     %x             An XML element can be |
|                           @             represented in YAML   |
|                                         using a map node with |
|                                         an anonymous list     |
|                                         child.                |
| <x>text</x>              %x             An XML text node can  |
|                           @             be represented in     |
|                            $ text       YAML using an         |
|                                         anonymous string node |
|                                         in the context of an  |
|                                         anonymous list node.  |
| <x att="value"/>         %x             An XML attribute node |
|                           $att value    is represented in     |
|                           @             YAML using a named    |
|                                         string node in the    |
|                                         context of a map      |
|                                         node.                 |
| <x><y/></x>              %x             An XML parent/child   |
|                           @             element relationship  |
|                            %y           can be represented in |
|                             @           YAML by placing the   |
|                                         element's             |
|                                         representation within |
|                                         the anonymous list    |
|                                         node.                 |
| <x a="val">text<y/></x>  %x             Of course, these all  |
|                           $a val        play together.        |
|                           @                                   |
|                            $ text                             |
|                            %y                                 |
|                             @                                 |
| This mapping has an abbreviated form, which is the default    |
| conversion, although the more verbose form lends itself       |
| better to generic processing. Thus, an XML to YAML-X          |
| converter should offer both forms.                            |
| <x/>                     $x             If there are no       |
|                                         children and no       |
|                                         attributes, an XML    |
|                                         element can be        |
|                                         written as a named    |
|                                         string node with zero |
|                                         characters.           |
| <x>text</x>              $x text        If an XML element has |
|                                         only a single text    |
|                                         node child with no    |
|                                         attributes, then it   |
|                                         can be represented    |
|                                         using a named string  |
|                                         node.                 |
| <x att="value"/>         %x             If an XML element     |
|                           $att value    with attributes lacks |
|                                         children, the         |
|                                         anonymous list node   |
|                                         may be omitted.       |
| <x><y/></x>              @x             An XML element with   |
|                           $y            children and no       |
|                                         attributes may be     |
|                                         represented as a YAML |
|                                         list node.            |
| When converting text nodes and attribute values from XML,     |
| significant whitespace must be escaped using \r for carriage  |
| return, \n for new line, \t for a tab, \s for an additional   |
| space, and \\ for a backslash. By default, the conversion     |
| should wrap content as described in the serialization section |
| below. The following examples show how specific text nodes    |
| could be converted.                                           |
| <x>                      $x \ntext      In this case, a new   |
| text</x>                                line had to be        |
|                                         escaped.              |
| <x>long line</x>         $x             String content is     |
|                           long          converted here using  |
|                           line          multiple indented     |
|                                         lines. No escaping    |
|                                         here due to YAML's    |
|                                         whitespace folding.   |
|                                         The YAML version      |
|                                         contains one          |
|                                         significant space     |
|                                         between "long" and    |
|                                         "line".               |
| <x>nospace</x>           $x             Here multiple lines   |
|                           no\           are also used,        |
|                           space         however a trailing    |
|                                         escape \ indicates    |
|                                         that the line break   |
|                                         does not induce a     |
|                                         significant space.    |
| <x>a \ esc</x>           $x a \\ esc    Of course, \ in       |
|                                         content must be       |
|                                         escaped               |
| <x>                      $x             A bit more            |
|  text with 2  sp          \n text with  complicated.          |
| </x>                      2\s sp\n                            |
+---------------------------------------------------------------+
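
To make the verbose mapping and the escaping rules concrete, here is a small
Python sketch built on the standard xml.etree.ElementTree module. The function
names escape_text and to_yaml_verbose are invented for this example. It covers
the verbose form only, does no word wrapping, and assumes that in a run of
spaces every space but the last is written as \s (which is how I read the last
example in the table). A single leading or trailing space is not handled.

```python
import re
import xml.etree.ElementTree as ET


def escape_text(text):
    """Escape XML character data for use as YAML string content."""
    text = text.replace("\\", "\\\\")  # backslash first, so it is not doubled twice
    text = text.replace("\r", "\\r").replace("\n", "\\n").replace("\t", "\\t")
    # Assumption: in a run of spaces, all but the last become \s.
    return re.sub(r" {2,}", lambda m: "\\s" * (len(m.group()) - 1) + " ", text)


def to_yaml_verbose(elem, indent=0):
    """Return the verbose-form lines for an element, one column per level."""
    pad = " " * indent
    lines = [f"{pad}%{elem.tag}"]                  # element: a map node
    for name, value in elem.attrib.items():
        lines.append(f"{pad} ${name} {escape_text(value)}")  # attribute
    lines.append(f"{pad} @")                       # anonymous list of children
    if elem.text:
        lines.append(f"{pad}  $ {escape_text(elem.text)}")   # text node
    for child in elem:
        lines.extend(to_yaml_verbose(child, indent + 2))
        if child.tail:                             # text following the child
            lines.append(f"{pad}  $ {escape_text(child.tail)}")
    return lines


if __name__ == "__main__":
    print("\n".join(to_yaml_verbose(ET.fromstring('<x a="val">text<y/></x>'))))
```

+---------------------------------------------------------------+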
|                           Encoding                            |
+---------------------------------------------------------------+
| A YAML text may use the UTF16 or ISO 8859-1 character         |
| encoding. YAML explicitly allows MIME headers to specify      |
| alternative encodings and provide document-level meta-data,   |
| including a YAML version number.                              |
|                                                               |
| A YAML Parser should check for a UTF16 byte order mark. If it |
| is found, then the YAML text is encoded using UTF16,          |
| surrogate pairs excepted. Otherwise, the parser should assume |
| 8 bit ISO LATIN, 8859-1. The default is not UTF8 since UTF8   |
| is not a simple single-byte encoding. Thus, a parser must     |
| support both UTF16 and ISO 8859-1 and is not required to      |
| support any other encodings.                                  |
|                                                               |
| The parser should identify the first line of the text         |
| starting with an indicator, ($@%). All lines leading up to    |
| this point are collectively called the header; this line and  |
| all following lines are collectively called the body. The     |
| header should be examined for MIME header fields.             |
|                                                               |
| If MIME header fields are present, the parser should verify   |
| that a transfer encoding other than 7bit, 8bit, or binary is  |
| not used. Specifically, if base64 or quoted-printable is      |
| used, the parser must exit gracefully as YAML forbids         |
| transfer encodings. Also, if the Content-Type is multipart,   |
| the parser must exit as support for multi-part content is     |
| forbidden with version 1.0 of YAML.                           |
|                                                               |
| Further, the parser should examine the Content-Type, and      |
| should exit gracefully if the charset is not supported by the |
| parser. Thus, other encodings may be supported by a given     |
| parser, but parsers are only required to support UTF16        |
| (excepting surrogate pairs) and ISO 8859-1. Finally, the      |
| parser must check for an X-YAML-Version field and should      |
| assume version 1.0 if the MIME header is missing or this      |
| specific header field is absent. A parser may make these MIME |
| header fields available through its API, but this is not a    |
| requirement.                                                  |
|                                                               |
| If content before the first indicator exists, but does not    |
| "look" like a MIME header, then the parser may issue a        |
| warning message. Specifically, any line in the header having  |
| whitespace followed by an indicator ($@%) is an error and     |
| must be reported. Finally, if a header exists, then the line  |
| immediately before the body must be a blank line as specified |
| by the MIME specification.                                    |
+---------------------------------------------------------------+
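
The checks in this section could be sketched roughly as follows in Python,
using the standard codecs and email.parser modules. The function names and the
three constants are hypothetical. The sketch raises ValueError where the text
says the parser must exit, skips the optional warning for non-MIME header
content, and only accepts the two required charsets.

```python
import codecs
from email.parser import HeaderParser

INDICATORS = ("$", "@", "%")
ALLOWED_TRANSFER = {"7bit", "8bit", "binary"}
SUPPORTED_CHARSETS = {None, "utf-16", "iso-8859-1"}  # None: no charset given


def decode_bytes(data):
    """UTF16 if a byte order mark is present, otherwise ISO 8859-1."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")  # this codec reads and drops the BOM
    return data.decode("iso-8859-1")


def split_header(text):
    """Split lines into (header, body) at the first line starting with $@%."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(INDICATORS):
            return lines[:i], lines[i:]
    return lines, []


def check_header(header):
    """Return the YAML version string, or raise ValueError."""
    for line in header:
        stripped = line.lstrip()
        if stripped != line and stripped.startswith(INDICATORS):
            raise ValueError("indented indicator in header: %r" % line)
    if header and header[-1].strip():
        raise ValueError("header must end with a blank line")
    msg = HeaderParser().parsestr("\n".join(header))
    transfer = msg.get("Content-Transfer-Encoding", "7bit").strip().lower()
    if transfer not in ALLOWED_TRANSFER:
        raise ValueError("transfer encoding not allowed: " + transfer)
    if msg.get_content_maintype() == "multipart":
        raise ValueError("multipart content is not allowed")
    if msg.get_content_charset() not in SUPPORTED_CHARSETS:
        raise ValueError("unsupported charset")
    # Version 1.0 is assumed when the field (or the whole header) is absent.
    return msg.get("X-YAML-Version", "1.0").strip()
```

+---------------------------------------------------------------+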
|                  Serialization Format / BNF                   |
+---------------------------------------------------------------+
| This section contains the BNF productions for the YAML        |
| syntax. Much to do...                                         |
+---------------------------------------------------------------+
|                        Parser Behavior                        |
+---------------------------------------------------------------+
| This section describes how a parser should parse YAML. Much   |
| to do...                                                      |
+---------------------------------------------------------------+
|               Emitter Behavior / Canonical Form               |
+---------------------------------------------------------------+
| This section describes how an emitter should write YAML into  |
| canonical form. Includes a specific word-wrapping algorithm.  |
| Minimal content length of 20 characters, and does its best    |
| to word-wrap by 76 columns.                                   |
+---------------------------------------------------------------+
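
Since the algorithm is not written down yet, here is only a guess at its shape
in Python, leaning on the standard textwrap module. The name wrap_content and
its parameters are invented. The big assumption is that "minimal content
length" means deeply indented content still gets at least 20 characters per
line, even if that pushes it past column 76.

```python
import textwrap


def wrap_content(text, indent, columns=76, min_width=20):
    """Wrap already-escaped content, indenting each line by `indent` spaces."""
    # Assumption: the 20 character minimum wins over the 76 column target.
    width = max(min_width, columns - indent)
    lines = textwrap.wrap(text, width=width, break_long_words=False)
    return [" " * indent + line for line in lines]


if __name__ == "__main__":
    text = ("Mr. Dumars is frequently gone in the morning so it is best "
            "advised to try things in late afternoon.")
    print("\n".join(wrap_content(text, 6)))
```

Breaking only at existing spaces fits the folding rule, since each line break
reads back as the single space it replaced.

+---------------------------------------------------------------+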
|                        Implementations                        |
+---------------------------------------------------------------+
| To do... an implementation in C, C++/STL, Python, Java, and   |
| ...                                                           |
+---------------------------------------------------------------+
|                            Credits                            |
+---------------------------------------------------------------+
| This work is the result of long, thoughtful discussions on    |
| the SML-DEV mailing list. Specific contributors include...    |
| (to do)                                                       |
+---------------------------------------------------------------+
|                         Some thoughts                         |
+---------------------------------------------------------------+
|  1. These are very preliminary thoughts on the subject;       |
|     feedback is very welcome.                                 |
|  2. Implementations needed... Clark is happy to write the     |
|     Python, C, and perhaps even a C++ implementation. Any     |
|     takers?                                                   |
|  3. Was thinking hard about using # for a comment indicator,  |
|     or perhaps as a numeric indicator. Benefits? In any case, |
|     the BNF should leave all of these special characters open |
|     to future versions.                                       |
|                                                               |
+---------------------------------------------------------------+
|                              FAQ                              |
+---------------------------------------------------------------+
|  1. Don't the indicator characters need to be escaped in the  |
|     content? Answer: No.                                      |
|                                                               |
+---------------------------------------------------------------+

