I note that, like many nascent specs, TOML does not document the escape sequences accepted in strings. Nor does it exhaustively specify integer and float formats - rather ironic for a spec that advertises "TOML is designed to be unambiguous and as simple as possible."
The limitation on array types seemed fairly arbitrary at first glance, but after thinking it over I realized it aids compatibility with languages that do not support heterogeneous arrays. Though as far as the types go, I would add booleans and perhaps non-quoted strings for single-word values.
Now that the technical criticism is out of the way, holy crap this guy is arrogant.
As proper nouns become more common, they first lose any capitalization in the middle of the word, and then finally capitalization of the initial letter. It's human language. It happens.
Yes, and I think it's arrogance on the part of Wordpress (there I did it) folks to insist that everyone capitalize it in the prescribed manner. Especially since they weren't consistent from the get-go. They even went so far as to make Wordpress (trolol) itself filter content to be capitalized if someone tries using the lower case p. http://justintadlock.com/archives/2010/07/08/lowercase-p-dan...
It's to do with protecting their trademark, though. Human language turning proper nouns into ordinary words is something companies don't like at all. In the case of WordPress, there's a lot of potential for abuse if anybody can use the name for their own system.
Hehe, that reminds me of iphones auto-correcting "iphone" to "iPhone". Jeez that would irritate me, I'm trying to write a text message, not look like an iDouche...
Ugh. Off topic, but I dislike this Perl/Ruby tendency of calling hash tables "hashes". When I see the word hash, I always think of a value (i.e. a hash code) and not a data structure. Why couldn't they call it a hash map, hash table, map, table, dictionary, etc. like all the other languages?
I agree with that. 'map' or 'dictionary' are the best choices I think (or 'associative array', but why bring arrays into it). That's the interface, of which a hash table is just one possible implementation.
I've never liked 'dictionary'. The analogy isn't at all apparent to me. A dictionary explains what words mean; the thing we're talking about doesn't explain what keys mean. (Someone who spends most of his time writing Python here.) 'map' or 'mapping'.
It's about the operations. One Does Not Simply (tm) read a dictionary. One instead performs a “lookup” for a particular item. The dictionary is designed to make this lookup fast and reliable, which matches the purpose of these data structures in software.
A dictionary maps words to their definitions. The words are the keys, the definitions are the values. Seems reasonable to me. Though as another predominantly pythoner, I do prefer map as well.
In a dictionary the value (meaning) is often (partially) implied by the key (word), by etymology etc. In the data structure there need be no relationship between the key and value other than the fact that they are a key-value pair in this instance. It introduces messy cultural concepts into what should be a clean, abstract concept.
I think dictionary is a useful high-level analogy. Small key objects mapping to potentially large, and often structured, value objects. (By structure, I mean the definition in a dictionary often includes fields like pronunciation and origin.)
Sure, so go the Lua route: table. Or the python route: dictionary. If neither of those do it for you, how about "mapping"?
Hash (and hash map, hash table etc) leak too much implementation detail. What if you want a tree-based mapping instead? I like how in C++ it's map (for ordered, rb-tree based maps) and unordered_map (for unordered, hash table based maps).
In my experience, making obviously bad things difficult or impossible improves reliability. This idea certainly resides within my cranial cavity, but that doesn't necessarily make it wrong.
The obviously bad part is that you pollute the global namespace for no reason other than laziness. When someone comes across code that uses a "Table" object interchangeably with "Dictionary" and "Hash", then he's going to have to look through the source code to find this bizarre line only to find out that you renamed a built-in container for no good reason.
So naturally people talk of Hashes etc. I understand where you're coming from, but it's really not very important, and it would be more confusing to talk of Hash Tables as learners would naturally look for HashTable in the stdlib.
That was the question: why is the class named Hash instead of HashMap or Dictionary? Was it done intentionally, or is it just an accident because someone did not know English very well?
I agree Map or Dictionary would have been fine too, but it's so widely used it needs to be easy to type, so two words is not great (HashMap). However, I suspect it was just named that way following Perl (written by an English speaker). Obviously it's far too late to change it now, and I can't say it bothers me or most Ruby users. It's something you get used to very quickly.
Though even HashMap isn't bad, because typing is a solved problem - with auto-completion and touch typing, two words really aren't an issue in my mind.
People get used to living with all kinds of things, but that doesn't make them any better. Yes I'm aware that this applies equally to my typing comment as to you having got used to hash.
Because we need a decent human readable format that maps to a hash, and the YAML spec is like 600 pages long and gives me rage. No, JSON doesn't count. You know why.
I don't know why, and I would love it if someone could explain.
Other than comments, I see no difference between the two.
Also, "human readable" is not accurate; it should be "hacker readable" - IT folks are the only target audience for these files.
[owner]
name = "Tom Preston-Werner"
organization = "GitHub"
bio = "GitHub Cofounder & CEO\nLikes tater tots and beer."
dob = 1979-05-27T07:32:00Z # First class dates? Why not?
{
"owner": {
"name": "Tom Preston-Werner",
"organization": "GitHub",
"bio": "GitHub Cofounder & CEO\nLikes tater tots and beer.",
"dob": "1979-05-27T07:32:00Z"
}
}
In JSON, that datetime won't deserialize to a datetime instance in your language with a conforming parser. Further, JSON has no comments (a killer for a configuration format).
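To make that concrete, here's a small Ruby sketch (any conforming parser behaves the same way): the dob value from the JSON example above comes back as a plain String, and converting it into a Time is left to application code.

```ruby
require "json"
require "time"

doc = JSON.parse('{"owner": {"dob": "1979-05-27T07:32:00Z"}}')
dob = doc["owner"]["dob"]

dob.class         # => String: the parser has no way to know this is a date
Time.iso8601(dob) # application code must opt in to the conversion itself
```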
JSON wasn't invented, it was discovered, from a long evolution of programming languages. The punctuation isn't ceremony. It's the amount needed for it to be concise (clear and terse, not just terse).
The difficulty level is hardly extreme. It is not an unreasonable challenge to learn that writing an array of elements requires opening and closing brackets.
The issue here might be that JSON has become widely used for two things:
Data marshalling/transfer
Config formats
For the latter, as they are typically written by hand, it's not particularly appropriate: the syntax is noisy, multiple levels of bracket nesting tend to lead to errors even if you understand it perfectly well in principle, and of course there are no comments, no datetimes, etc.
I imagine this is intended as a saner version of YAML for configs.
I didn't view bracketing as the enemy (which seems to be the focus of a lot of config syntaxes) but rather the combination of multiple types of bracketing, plus start-and-stop usage of shift keying. I only have two types of brackets, the sequence [ type and the long string {" type, and you can "feel" when you're writing a long string because of that sudden need to use the shift key.
It isn't a markup language. I'd like to correct this mistake that was started by YAML. :/ http://en.wikipedia.org/wiki/Markup_language (Using the backronym "YAML Ain't Markup Language" only helped it grow, making more people confused as to what a markup language is.)
I like it, though. More grepable than JSON or YAML, with the way it handles nested keys using dot notation.
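For instance (a made-up fragment, not taken from the spec), a key-group header carries the whole path on one self-describing line:

```toml
[servers.alpha]
ip = "10.0.0.1"
role = "frontend"
```

A grep for "servers.alpha" lands directly on the group, whereas in the equivalent JSON "servers" and "alpha" sit on separate lines of nesting, so a single grep hit loses its context.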
I've always been a fan of the .INI syntax but the lack of a standard (which I think Microsoft should have championed) made the format hard to use consistently. There have been attempts at standardization [1] but, alas, they never spread widely enough. In light of the above, I'm glad to see an INI-derived format with a real spec -- not necessarily because it might replace JSON but because it might replace INI.
Speaking of INI, for the longest time the killer app for INI files for me was persistent data storage in batch scripts (.bat/.cmd files in Windows 9x/NT). Using a command line utility like [2] or a similar program from IBM that sadly wasn't legally redistributable, you were able to achieve persistence with minimum effort, which would otherwise be difficult to program in batch. I even wrote a portable clone of inifile.exe for MS-DOS and Linux to be able to reuse my scripts more easily. TOML would surely benefit from the same.
The biggest concern with JSON seems to be the lack of comments. So what voodoo is Sublime Text 2 performing? Why can't we just use that?
{
// Sets the colors used within the text area
"color_scheme": "Packages/Color Scheme - Default/Monokai.tmTheme",
// Note that the font_face and font_size are overriden in the platform
// specific settings file, for example, "Preferences (Linux).sublime-settings".
// Because of this, setting them here will have no effect: you must set them
// in your User File Preferences.
"font_face": "",
"font_size": 12,
// Valid options are "no_bold", "no_italic", "no_antialias", "gray_antialias",
// "subpixel_antialias", "no_round" (OS X only) and "directwrite" (Windows only)
"font_options": [],
// Characters that are considered to separate words
"word_separators": "./\\()\"'-:,.;<>~!@#$%^&*|+=[]{}`~?",
// Set to false to prevent line numbers being drawn in the gutter
"line_numbers": true
}
Some JSON implementations support comments, others don't. If you know the one you use supports (and will continue to support) comments, go ahead and use it. It just won't be portable.
Douglas Crockford himself suggests you "Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser." That sounds like a reasonable workaround.
I don't know specifically about Sublime Text 2, but from my own experience writing a configuration library which accepts JSON as an input, you usually strip out those comments before feeding the resulting content to your appropriate json_decode function.
1. Read the contents of your JSON file.
2. Strip out the comments with some regex foo or such.
3. Feed the remaining contents to your JSON parser.
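The three steps above can be sketched in Ruby. Note the regex only handles whole-line // comments; a "//" inside a string value would need a real tokenizer, so treat this as illustrative, not robust.

```ruby
require "json"

# Step 1: the raw file contents (inlined here instead of File.read)
raw = <<~CONF
  {
    // Sets the colors used within the text area
    "font_size": 12,
    "line_numbers": true
  }
CONF

cleaned = raw.gsub(%r{^\s*//.*$}, "") # step 2: strip comment lines
config  = JSON.parse(cleaned)        # step 3: feed to the parser
config["font_size"] # => 12
```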
Given Jekyll's enormous backlog of issues and pull requests[0], can we expect this to be maintained or supported any bit beyond the late night drunken brain fart that this is?
Parker Moore and I (along with many contributors) have been spending quite a bit of time on Jekyll recently. Over the last 30 days we've merged 17 pull requests and closed 62 issues. We're ramping up for a 1.0 release and there's a brand new website in the works. You can check it all out on the master branch.
Tens (or possibly hundreds) of thousands of people use Jekyll now. It's interesting to note that Jekyll started out as a "brain fart" as well. Just one amongst hundreds of blog engines. I wrote it because I was dissatisfied with everything on the market, and I thought I could do something different and better, to serve my own needs. I open sourced it, because I thought others might get a kick out of it.
I'd wager that most of the great things we use today started as nearly ephemeral emanations from someone's mind, often late at night, or helped along by a snifter of brandy. The funny thing is, if you never try out your crazy ideas, you'll never know which ones might have changed the world.
The pull has 19 people asking for integration and has some stellar comments:
"seriously? year long pull request with two lines of changes?"
"I normally would think that the github gem features for paying users would get a lot of attention from the folks at github..."
I even tried getting it pulled via pre/post-sales emails to enterprise@github.com (I'm an enterprise customer), which was met with a "yeah, I'll tap him on the shoulder" to integrate - a year later, nothing.
That project isn't actively maintained at the moment (nor is it an official GitHub project), but I'll see what I can do tomorrow to get it merged in and released. Sorry for the frustration!
Cool. I'm very happy that you're back actively working on Jekyll! Guess my statement was a bit outdated then. Take it with the appropriately sized grain of salt.
Note: there's nothing wrong with releasing brain farts; quite the contrary. I didn't at all mean to imply that you shouldn't do that.
> I'd wager that most of the great things we use today started as nearly ephemeral emanations from someone's mind, often late at night, or helped along by a snifter of brandy.
- No native support for numbers, dates, booleans or lists. The latter can be implemented using subelements, but it's so cumbersome that you skimped on that and used a non-typed string instead (the database ports).
- Redundant verbosity. Root elements, closing tags, way too much crap to be manually inserted.
- XML parsers are huge, complex beasts which have no place in many smaller applications.
- Being XML, it leaves way too many possibilities for crappy developers. Namespaces in config files, oh joy!
Most of your points are environment-specific, and I think you forgot the strongest of them - "XML APIs usually suck". In .NET they are non-issues.
And about being cumbersome and verbose, the point I tried to make is that you don't have to be zealous and put every small piece of data in a separate element. No reason not to put data in attributes or even in comma/whitespace separated strings, if that piece of data can be extracted in one short line of code.
How so? .NET can't magically discover the types of values or prevent developers from abusing the format.
> you don't have to be zealous and put every small piece of data in a separate element.
But then you're layering a complex format with a custom application-specific parser, with an unknown syntax (e.g. spaces vs commas, are ranges supported, etc). It obviously can be done, but it's a mess.
Agree. And if XML supported unnamed closing tags, it'd lose a lot of its rep for verbosity. Although in this case you'd just be replacing </servers> with </>, in other documents it would be a lot more noticeable.
I will note this isn't a valid XML document: you have no root node.
Go "back"?! There are lots of places where XML is alive and well, and config files are one of them. And you can see why - empty elements with attributes look rather concise, without all that punctuation noise JSON has.
# line 43
array = $1.split(",").map {|s| s.strip.gsub(/\"(.*)\"/, '\1')}
You should recurse into coerce here, or you'll just lose types. (Also you're assuming arrays of strings.)
array = $1.split(",").map {|s| coerce(s) }
--
You're also not dealing with nested key groups. (eg. [servers.alpha]).
--
That being said, naïve string parsing is a terrible way to build a new markup language implementation. It's the reason the Markdown landscape is such a mess[1]. What this really needed is a formal grammar.
[1]: I actually tried to fix that by writing a formal lexer & informal parser for Markdown in a side-project of mine[2]. It's not quite there yet, because for practicality reasons I wrote my own parser instead of a formal AST-generating parser.
Yup array handling is weak. I was going to recurse into coerce, but then the examples made it seem like only strings will be accepted in arrays (he put "8000" in there rather than just 8000). I'll get clarification.
Don't have time to work on it now, but it looks like you'll need to recurse while parsing arrays. Right now, only arrays of strings that don't contain commas are handled correctly.
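A rough sketch of what that recursion could look like (hypothetical code, not the actual parser; escaped quotes inside strings are not handled): split the array body on top-level commas only, so nested arrays and quoted strings containing commas survive, then coerce the scalar tokens.

```ruby
# Coerce a scalar token into a Ruby value.
def coerce(token)
  case token
  when /\A"(.*)"\z/m    then $1
  when /\A-?\d+\z/      then token.to_i
  when /\A-?\d+\.\d+\z/ then token.to_f
  when "true"           then true
  when "false"          then false
  else token
  end
end

# Parse an array literal, recursing into nested arrays.
def parse_array(src)
  src = src.strip
  body = src[1..-2] # drop the surrounding [ ]
  items = []
  buf = +""
  depth = 0
  in_str = false
  body.each_char do |ch|
    in_str = !in_str if ch == '"'
    depth += 1 if ch == "[" && !in_str
    depth -= 1 if ch == "]" && !in_str
    if ch == "," && depth.zero? && !in_str
      items << buf.strip # a top-level comma ends the current element
      buf = +""
    else
      buf << ch
    end
  end
  items << buf.strip unless buf.strip.empty?
  items.map { |t| t.start_with?("[") ? parse_array(t) : coerce(t) }
end

parse_array('[[1, 2], ["a,b", "8000"]]') # => [[1, 2], ["a,b", "8000"]]
```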
What's this fragmentation you speak of? Surely you won't be forced to use one particular format you don't like. An API usually supports multiple formats.
Good point, I didn't notice that one. Certainly an interesting case. The ones that exist but shouldn't aren't, I think, as big an issue, but it's certainly not an easy problem to solve here.
What is wrong with JSON? Everything already supports it.
JSON has two drawbacks: a lack of comments (although you could add "#" keys in relevant places) and no binary support (arbitrary conventions include base64) - but TOML doesn't support binary anyway.
It isn't a friendly form of human input. My error rate is 50%+; you have to lint on save to catch mistakes that are invisible to the naked eye.
No ability to override, extend or reference keys. This is most useful in config objects where, e.g., in a dev object you want to override the username and password for a database connection without repeating all the other parameters.
Lack of comments is pretty much a deal breaker for configuration. I see a lot of undocumented JSON used for configuration and I find it difficult to believe that is something we want for the future.
Lack of comments makes JSON much better for data exchange than formats with comments.
What is the difference between "JS Object notation" and JSON? Google searches show they are the same thing. JSON definitely does not have comments http://www.json.org/
About the use of markup languages as config files: I see that in most Python apps, the config file is just another Python script, not a markup language. This makes sense in a dynamic language and feels natural. I understand it is a habit to use YAML for config in Ruby apps. Is it not possible to just use a Ruby script as the config file, since the script can be loaded dynamically? What are the pros and cons of using a markup language as a config file vs. just using the app's language (Python/Ruby)?
Using script files to store config is convenient, but in some circumstances it could give malicious parties a chance to inject arbitrary executable code into your environment, in ways that parsing a pure data file could not.
It is also common for ruby configs to be script files. Rails, for instance, has the config/initializers folder which is a set of ruby scripts that will be run at startup. It comes down mostly to preference.
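A hypothetical sketch of "the app language as config" in Ruby (the Config class and `set` helper are made up for illustration): evaluate a config snippet against a tiny DSL object instead of parsing a data format. The upside is full expressiveness; the downside, as noted above, is that the config file can run arbitrary code.

```ruby
# Minimal DSL object that collects key/value settings.
class Config
  attr_reader :settings

  def initialize
    @settings = {}
  end

  def set(key, value)
    @settings[key] = value
  end
end

cfg = Config.new
# In practice this string would come from File.read("config.rb"):
cfg.instance_eval <<~RUBY
  set :name, "Tom Preston-Werner"
  set :port, 8000 + 1   # expressions work, unlike in a data format
RUBY
cfg.settings # => {:name=>"Tom Preston-Werner", :port=>8001}
```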
This is quite nice, but there are a few things that I miss:
1. A way to have multi-line values for non array types
2. A more flexible number syntax (e.g. allow hex and binary integers, allow exponents on floats, allow NaN and +/-Inf)
3. Make it possible to have an extra comma after the last element on an array (as in Python)
4. Add a way to "include" another config file
#1 is important because some projects require all lines to have a max width of 80 characters, including in config files.
#2 is important for scientific/engineering projects. I think the current simple format shows that this format is a little too web centric. If this is going to be used for non-web stuff this is a must.
#3 is something that helps when putting this sort of configuration file in version control. Without it, adding an extra entry to a multi-line array creates a diff of two lines rather than one (since you must add a comma to the line above the one that you inserted). This is something I miss in JSON and which Python did just right (IMHO).
#4 would be useful in cases in which you want to provide a base configuration file for example.
Also, maybe I missed it but it is not super clear what would happen if you redefine an existing entry (I hope it is possible). Finally, is order important?
Agreed, I'd much rather have a normalized subset of YAML without the object serialization stuff (I don't even understand why it's there: why take a format intended to be read by humans and then muck it up with complex and dangerous object serialization notation).
I agree, and high quality YAML parsers are generally available in every language one might want to use. I don't believe I've ever encountered a situation where I was unable to obtain one. Well, Rust comes to mind, but then Rust is really young, and you could probably make one easily by just wrapping libyaml. That said, I might write a TOML parser in Python just for kicks.
I wondered the same thing. "No, you can't mix data types, that's stupid" leaves it ambiguous.
If you parse the outer array as just "array of arrays" (as each element is an array), you're not "mixing". But if we're supposed to be parsing it as "arrays of arrays of _type_", then we are mixing.
It's unspecified, I guess, but if you want to read into the spirit of it, which is to make it trivially-supportable by type-nazi languages such as haskell, you either get a [[Int]] or a [[String]].
Like JS from which it sprang, it lacks an integer type. Fortunately, parsers written for languages that do have integers can usually parse them correctly.
(If you don't know why this might matter, try opening your browser's Javascript console and evaluating 10000000000000001)
That's my peeve, though. I suspect that Tom is probably more concerned with readability. TOML also looks like it can be parsed a line at a time and doesn't really need to do any recursive parsing, so you could probably parse a stream of it as it arrives, which I imagine is trickier with JSON.
I'd agree if JSON had a different name, but given that it is called "JavaScript Object Notation" on the main page (http://json.org/) there's an implicit expectation that it's somehow related to javascript.
No comments. Lack of essential data types, forcing you to make the contents of strings part of a hidden unspecified semantic (this parses as date, that parses as time, etc). Constrained by the limitations of JS floats (they aren't even bigdecimal). Excessive significant punctuation. Insignificant white space (permitting a difference between valid, and pretty-printed form). Looks like executable code and tempts you to parse it with eval.
Text editors will sometimes insert end-of-line characters in the name of word-wrap.
Using the end-of-line as a comment terminator would require significant refactoring of JSON parsers, which were previously at liberty to lump CR and LF together with SP and TAB. A starting and ending token, on the other hand, fits the pattern already required of a JSON parser.
This reminds me of a new project I'm working on called Leewh. It's based on Wheel and kinda has the same overall function, but I needed something to get my project rolling quickly and using .ini and JSON syntax separately felt... well... too square, I guess.
I figured I'll come up with something more well rounded.