Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert layers to YAML #947

Merged
merged 10 commits into from
Sep 16, 2014
Merged

Convert layers to YAML #947

merged 10 commits into from
Sep 16, 2014

Conversation

pnorman
Copy link
Collaborator

@pnorman pnorman commented Sep 13, 2014

Closes #711

Although requiring a script is not ideal, the JSON MML file couldn't be sanely edited anyways. The advantages are

  • Sane merge conflict resolution
  • Not always a merge conflict for edits to the same SQL
  • Reasonable code reviews of SQL changes. Currently I don't review most SQL changes since they're very hard to get a handle on when escaped into a single line of JSON
  • Easier changing of database settings. With YAML aliases, they only need to be specified once

I think it'd be easiest to merge this then rebase other merges on top of it - the one-time conversion from JSON to YAML is not trivial, as a great deal of work had to be done by hand to clean up the YAML.

Requires PyYAML. If this dependency is an issue on a particular
platform, pretty much any language allows you to do this.

A ruby example is

    ruby -ryaml -rjson -e 'puts JSON.pretty_generate(YAML.load(ARGF))' < project.yml > project.json
This one-time conversion was done with the json2yaml node module,
and then substantially reformatted by hand, using mappings and alias
to avoid repeating projection and extents information for each layer,
and to get consistent ordering of the properties of each layer. The
latter does not matter semantically, but makes editing with a text
editor cleaner.

By using the scripts/yaml2mml.py script, the YAML can be converted
back into JSON, which is still what TileMill uses.

A JSON-aware diff of the original project.mml and the new generated
one shows that the only differences are the new "_parts" property,
used for aliases and ignored by TileMill, and layer extents.

The script can be invoked with

    scripts/yaml2mml.py < project.yaml > project.mml

The project.mml file remains checked in to git, as TileMill requires
it and this allows someone to clone the repo, and immediately have
it work. Merge conflicts in project.mml are trivial to solve, as
the entire file is just regenerated by running the script again.

This does mean that the .MML file should not be edited directly,
but this is fine, as editing the JSON by hand is extremely difficult
anyways! The HOT OSM style has a similar workflow, making use of cartocc.
By using aliases, we can avoid having to specify the same information
repeatedly, saving lines, avoiding mistakes, and making it easier to
use a differently named database.

The last is evident in that the landcover extents changed slightly
in this, because they were different than all the other layers.
Block scalars allow you to use newlines, and are far more readable

This commit also catches some extents that were missed previously.
A lot of the SQL was badly formatted, with random tabs, caps, and
lines of 5000 characters. This commit cleans up about half of them
but is not complete.
* Add indentation if necessary for complex function calls, WHERE parenthesis, and CASE statements
* One space before and after = etc
* Name SQL subqueries after the layer name (but use underscores)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about spaces after commas? We are currently inconsistent in that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been making that consistent

@matthijsmelissen
Copy link
Collaborator

Great work, this will certainly make all our work much easier.

Did you make any changes to the SQL apart from whitespace? Does this need a review?

On coverting the file I get:
Traceback (most recent call last):
File "./scripts/yaml2mml.py", line 2, in
import sys, yaml, json
ImportError: No module named yaml

I think we need to add the package python-yaml to the dependencies.

(SELECT
way
FROM planet_osm_line
WHERE man_made='cutline'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here (and at some other places) are spaces missing around the equals sign. Not really important, but I guess we'd better fix it now.

(SELECT way, highway, name
FROM planet_osm_line
WHERE highway in ('bridleway', 'footway', 'cycleway', 'path', 'track', 'steps')
and name is not null
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gotten this far, just some guided search and replaces

@pnorman
Copy link
Collaborator Author

pnorman commented Sep 14, 2014

I think the fact that we can now add comments and do a proper code review is being well demonstrated with the fact that @math1985 has been able to do one :)

@matthijsmelissen
Copy link
Collaborator

I checked the code, and I think the yaml code is equivalent with the old mml code - with the exception of one typo (space in front of 'construction') and a code block called '_parts'. What does this do?

@pnorman
Copy link
Collaborator Author

pnorman commented Sep 14, 2014

The parts within it are used for YAML aliasing.
Pulling an example from the file, we get this YAML

# Various parts to be included later on
_parts:
  # Extents are used for tilemill, and don't actually make it to the generated XML
  extents: &extents
    extent: *world
    srs-name: "900913"
    srs: "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0.0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs +over"
  osm2pgsql: &osm2pgsql
    type: "postgis"
    dbname: "gis"
    key_field: ""
    geometry_field: "way"
    extent: "-20037508,-20037508,20037508,20037508"
Layer:
  - id: "citywalls"
    name: "citywalls"
    class: ""
    geometry: "linestring"
    <<: *extents
    Datasource: 
      <<: *osm2pgsql
      table: |-
        (SELECT 
            way 
          FROM planet_osm_line 
          WHERE historic = 'citywalls')
        AS citywalls
    advanced: {}

The <<: *extents key merges the keys from the &extents mapping into that location, avoiding having to specify them again, and same for <<: *osm2pgsql.
The idea is taken from the MapProxy documentation.

Fortunately, an understanding of YAML isn't needed to use them when adding a layer - you just copy from existing layers. It's probably possible to use the first occurrence of an osm2pgsql layer as an anchor and copy mappings from that, but this way is more robust and clearer what's being included.

The YAML Reference Card has the YAML syntax (and is itself YAML), not that I see any need to use more YAML features.

The _parts in JSON is ignored by carto and doesn't impact the output XML, and TileMill also appears to ignore it.

@matthijsmelissen
Copy link
Collaborator

Thanks for the corrections.

We're still not consistent in the use of spaces after commas: personally I would prefer to add spaces after all commas. What do you think?

There are also still a couple of keywords that need capitzalizing, by the way.

@pnorman
Copy link
Collaborator Author

pnorman commented Sep 14, 2014

We're still not consistent in the use of spaces after commas

Well, that's largely because I wanted to get this PR in quicker than I could clean up all of the SQL.

@Ircama
Copy link
Contributor

Ircama commented Nov 6, 2016

project.yaml includes this comment:

# Extents are used for tilemill, and don't actually make it to the generated XML

Is the _parts section only needed by Tilemill or also to the OSM rendering through Mapnik?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move to using YAML for project layers file
4 participants