Cartometric Blog

A scrapbook of GIS tricks, with emphasis on FOSS4G.

PostGREsql/PostGIS Implementation of Google’s Encoded Polyline Algorithm

with 3 comments

[Edit 30 Jan, 2014]

I added an additional PostGREsql method to perform Polygon encoding by concatenating polygon geometries (delimiter: †) and their inner rings (delimiter: ‡) together into one massive encoded block of ring features. I also provided an example JavaScript method demonstrating how to bring the amalgamated polygon feature encodings into your Google Map.

By some uncanny twist of the fates, I’ve elected to use, had to use, and/or been asked to develop applications that use Google Maps ASCII Encoded Polyline expressions. In previous encounters, I’ve used a PHP class to handle the encoding task, and most recently I wrote a Python method to decode these expressions so that I could return a 3rd-party’s encoded geometries to WKT and import them into a spatially aware database.

So far so good.

However one thing has always bugged me about using the PHP solution–I don’t like using a piece of middleware to handle what I consider to be a responsibility of the data layer. Mark McClure’s page, which is basically the seminal authority on this topic, provides external links to implementations for Perl, Ruby, PHP (note: I prefer the PHP class linked, above), Java, and Mathematica. Also, by searching Stack Overflow, you can find implementations of the algorithm in both C# and VB.Net. But for all my efforts searching, I could never dredge up an implementation for either MySQL or PostGREsql/PostGIS. Bummer.

Looking up, it seems version 2.2 of PostGIS might include some built-in Google encoding conversion methods. While this is cool enough for a hat tip, unfortunately, it’s too inconvenient to wait that long, and even then, there’s no guarantee the implementation will work the way I expect with complex Polygon geometries; for instance, maybe it will encode only the exterior ring of Polygons, ignoring MultiPolygons completely, etc. For that matter, it’s equally possible there could be some bugs. So with this said, and even though the previously-mentioned PHP implementation does the job, my boss was cool-enough to let me take a crack at implementing the algorithm as a PostGREsql/PostGIS function, and then share the results with the world. Since some initial testing confirms my PostGIS implementation works, I’ll just post the code parts and hope others find it useful.

For what it’s worth, if anyone finds a bug or has recommendations for improvements, please don’t hesitate to drop me a line.


Sample query calling the first encoding function on the EXTERIOR RING of Polygon geometries:
(Also works on single-part LINESTRING features.)

 * Note that the encoding method can accept a LINESTRING only, which
 * is the geometry type used to represent the ring parts of a Polygon.
 * To help understand this, and why, please see the trailing discussion
 * section, which elaborates on this situation further.
  GoogleEncodeLine(ST_ExteriorRing(wkb_geometry)) as Google
  FROM polygons_wgs84
  WHERE ST_GeometryType(wkb_geometry) = 'ST_Polygon'
  LIMIT 10 ;


[Added 30 Jan, 2014]

Sample query calling the second encoding function on Polygon and MultiPolygon geometries:
(Preserves multi-part polygons and their inner-ring parts, a.k.a. “holes”.)

 * This encoding method will accept Polygon and MultiPolygon geom types.
 * The output returned is an amalgamation of Polyline encodings, where
 * individual geometries and their interior rings are concatenated
 * together using string delimiters, †, and ‡, respectively.
  GoogleEncodePolygon(wkb_geometry) as GooglePolygon
  FROM polygons_wgs84
  LIMIT 10 ;


Implementation functions to execute/save in your PostGREsql instance:

[Added 30 Jan, 2014]

 * Pass in either a Polygon or MultiPolygon geometry. Returns
 * an array of ASCII-encoded Polygon feature parts, including
 * multi-part geometries and their interior rings.
 ng INT;        -- Store number of Geometries in the Polygon.
 g INT;         -- Counter for the current geometry number during outer loop.
 g2 GEOMETRY;   -- Current geometry feature isolated by the outer loop.
 nr INT;        -- Store number of internal ring parts in the Polygon.
 r INT;         -- Counter for the current inner-ring part.
 r1 GEOMETRY;   -- Exterior ring part isolated BEFORE the inner loop.
 r2 GEOMETRY;   -- Inner-ring part isolated within the inner loop.
 gEncoded TEXT; -- Completed Google Encoding.
 gEncoded = '';
 ng = ST_NumGeometries(g1);
 g = 1;
     g2 = ST_GeometryN(g1, g);
     if g > 1 then gEncoded = gEncoded || chr(8224); END IF;
     -- Get ExteriorRing now; if there are any holes, get them later in the loop..
     r1 = ST_ExteriorRing(g2);
     gEncoded = gEncoded || GoogleEncodeLine(r1);
     nr = ST_NRings(g2);
     if nr > 1 then
       -- One (1) is because interior rings is one-based.
       -- And nr-1 is because ring count includes the boundary.
       FOR r IN 1..(nr-1) BY 1 LOOP
         r2 = ST_InteriorRingN(g2, r);
         gEncoded = gEncoded || chr(8225) || GoogleEncodeLine(r2);
       END LOOP;
     END IF;
 RETURN gEncoded;
$$ LANGUAGE plpgsql;


 * First of two methods. Pass in a geometry (LINESTRING only).
 * Returns ASCII-encoded point array for use in Google Maps.
  p INT; np INT;
  deltaX INT;
  deltaY INT;
  enX VARCHAR(255);
  enY VARCHAR(255);
  gEncoded TEXT;
  gEncoded = '';
  np = ST_NPoints(g);

  IF np > 3 THEN
    g = ST_SimplifyPreserveTopology(g, 0.00001);
    np = ST_NPoints(g);

  pt1 = ST_SetSRID(ST_MakePoint(0, 0),4326);

    pt2 = ST_PointN(g, p);
    deltaX = (floor(ST_X(pt2)*1e5)-floor(ST_X(pt1)*1e5))::INT;
    deltaY = (floor(ST_Y(pt2)*1e5)-floor(ST_Y(pt1)*1e5))::INT;
    enX = GoogleEncodeSignedInteger(deltaX);
    enY = GoogleEncodeSignedInteger(deltaY);
    gEncoded = gEncoded || enY || enX;

    pt1 = ST_SetSRID(ST_MakePoint(ST_X(pt2), ST_Y(pt2)),4326);
RETURN gEncoded;
$$ LANGUAGE plpgsql;


 * Second of two methods. Accepts a signed integer (LON or LAT
 * by 1e5) and returns an ASCII-encoded coordinate expression.
  e VARCHAR(255);
  s BIT(32);
  b BIT(6);
  n INT;
 e = '';
 s = (c::BIT(32))<<1;

 IF s::INT < 0 THEN
   s = ~s;
   END IF;

 WHILE s::INT >= B'100000'::INT LOOP
   b = B'100000' | (('0'||substring(s, 28, 5))::BIT(6));
   n = b::INT + 63;
   e = e || chr(n);
   s = s >> 5;
 e = e || chr(s::INT+63);

$$ LANGUAGE plpgsql;


[Added 30 Jan, 2014]

JavaScript method demonstrating how to add Polygon encodings to a Google Map object:
(This client implementation works for either the single or the multi-part polygons.)

 * JavaScript! Pass-in an encoded text block created by either
 * PostGIS method, GoogleEncodePolygon() or GoogleEncodeLine(),
 * and render it in your Google Map object. If you don't want
 * the map to zoom to each rendering, just remove the "bounds"
 * variable and any references to it.
function renderEncoded(encoded_path)
   var bounds = new google.maps.LatLngBounds();
   var $encodedGeoms = encoded_path.split("†");
   for (var i=0; i<$encodedGeoms.length; i++)
       var encodedGeom = $encodedGeoms[i];
       var $encodedRings = encodedGeom.split("‡");
       var polyPaths = [];
       for (var j=0; j<$encodedRings.length; j++)
           var ptarray = google.maps.geometry.encoding.decodePath($encodedRings[j]);
       var polygonObject = new google.maps.Polygon(
         paths: polyPaths,
         strokeColor: '#890000',
         strokeOpacity: 1.0,
         strokeWeight: 2


And some additional discussion..

There are two “gotchas” when it comes to implementing the encoding algorithm with respect to Polygons:

1) Polygons, as geometries, can be composed of many rings. The outer ring is considered to be the boundary, and various inner rings are often called “holes”. So this is a specified, understood, and accepted built-in many-to-one relationship between polygons and their internal ring geometries.

And 2) It’s not rare to find polygon tables containing both Polygon and MultiPolygon data types. I think this happens because ESRI allows it, and so in an effort to play well with others, other GIS systems have accommodated it. At least, I know this is true for MySQL and PostGIS.

Here’s why this makes trouble–Google’s encoding algorithm is only intended to represent individual point arrays as a singular geometry. Basically, as long as your first point equals your last point, it’s considered to be a closed geometry, and you can add it and render it in a Google Map as a polygon. The algorithm itself isn’t designed to represent nested arrays, which would be necessary to render either a Polygon with “holes” or a MultiPolygon, which could potentially define many polygons with holes of their own! As such, I suspect there could be considerable disagreement as to how a Polygon-to-Google-Encoded method should actually handle Polygons..

The only solutions I can imagine for this problem would require “faking” a one-to-many relationship by perhaps delimiting together several encodings to account for MultiPolygons and/or single feature Polygons with interior rings. But this starts to get weird. So to keep things somewhat simple for the sake of the post, I chose to stay true to the algorithm’s intent and return a single encoded geometry expression from my method. And the sample query demonstrates this by calling the method against the outermost ring (i.e. the boundary) of a Polygon geometry type, which PostGREsql regards as a LineString, anyway.

[Added 30 Jan, 2014]

Since I wanted to handle the more complex geometries, I wrote the wrapper method GoogleEncodePolygon(), to first iterate over ST_NumGeometries() and gain access to any multi-part features, then second, iterate over ST_NRings() using ST_InteriorRingN()–you could also do this using ST_DumpRings()–and gain access to any interior rings of the Polygon geometries, themselves. Then, for each ring part, I call GoogleEncodeLine(), and concatenate together all those expressions into one massive block of “compound” expressions. I chose to delimit each geometry encoding using an extra-special character that would never be used by Google’s algorithm; for example chr(8224), which corresponds to “†”. I then further delimit the internal ring parts using another special character, chr(8225), which corresponds to “‡”, and return all these concatenated together as a compound encoding expression. Then, on the client-side (a JavaScript example is provided above), I merely split the compound expression against my delimiters, loop over the various expressions, and add them to the map individually. Note if you are attaching attributes to your features, you’ll need to remember to include them explicitly to each unique Polygon added to your map.

Written by elrobis

January 27th, 2014 at 12:20 pm

PostGIS: query all multipolygon parcels with at least one hole

without comments

I was writing some code to iterate over Well Known Text expressions for polygon features, and I decided I needed to test the most complex edge-case I could think of–multipolygon geometries where at least one of the bound polygons has a hole (i.e. an interior ring).

I ended up with the following query. This seems like the kind of thing I’ll want to reuse later, so I’m noting it here. For good measure, I also use a rudimentary technique to sort the output with the most complicated geometries in the table at the top of the list. Basically, the more “text” it takes to describe the geometry using Well Known Text, the larger and more complex I figure it must be!

  SomePrimaryId,   /* your primary key, i.e. ogc_fid, etc. */
  SomeUniqueId,    /* your descriptive id, i.e. a parcel number */
  ST_NumGeometries(wkb_geometry) AS num_geoms,
  ST_NRings(wkb_geometry) AS num_rings,
  ST_AsText(ST_Centroid(wkb_geometry)) AS center,
  Char_Length(ST_AsText(wkb_geometry)) AS len,
  ST_AsText(wkb_geometry) AS wkt
FROM SomePolygonTable
  ST_NumGeometries(wkb_geometry) > 1
  ST_NRings(wkb_geometry) > ST_NumGeometries(wkb_geometry)
ORDER BY Char_Length(ST_AsText(wkb_geometry)) ASC ;


Just for the sake of promoting caution, I’m not certain this is a definitive approach for identifying the largest geometry in a table, as the length of the binary representation and the length of the readable text representation do not correspond one-to-one. Moreover, a feature could have more vertices that required less precision to express (fewer decimal position), than a geometry with fewer vertices that needed more precision, and then you have to ask, which is bigger, fewer vertices and more text, or more vertices that coincidentally did not require as much text? My conclusion is, the “most complicated geometry” is probably relative to the one asking the question. However for my purposes, this was close enough to put the most complicated stuff at the top of the list.

Written by elrobis

November 8th, 2013 at 10:26 am

Install httplib2 to your Preferred Python Runtime after ArcGIS Unceremoniously Hijacks the First Attempt

without comments

We’re going to use Google Maps Engine for some stuff, and so I thought I’d throw my first codes at it using Python. Well.. Google’s Python example requires the httplib2 library, so I needed to install that. When I did—following the library’s download/instructions—for some reason, the library went into service against ArcGIS’s embedded Python runtime, rather than my favored 2.7 instance, which I use for every single thing. ::grimaces::

To fix this, I just ensured the new library’s script was called from my preferred Python runtime rather than let Windows, or some black magic, decide which instance it should integrate with httplib2.

Breaking it down:

1) cd into the unpacked httplib2 directory:

C:\>cd C:\Users\elijah\Downloads\httplib2-0.8

2) Next, you can qualify which Python runtime you want to execute the supplied install script—in my case, this worked to install httplib2 to that specific runtime. This was the exact command I used:

C:\Users\elijah\Downloads\httplib2-0.8>C:\Python27\python.exe install

Voila! Now when I pull up IDLE, relative to my preferred instance of Python 2.7, I can import httplib2 without issues.

It didn’t take too long to figure this out—but it seemed like an easy post should someone else be stumped and on the wrong foot.

Written by elrobis

August 5th, 2013 at 9:42 am

Posted in Uncategorized

osm2pgsql help and usage

without comments

I wanted a more convenient place to read the osm2pgsql help output, so this seemed as good a place as any for it. /E

osm2pgsql -h
osm2pgsql SVN version 0.80.0 (32bit id space)
        osm2pgsql [options] planet.osm
        osm2pgsql [options] planet.osm.{gz,bz2}
        osm2pgsql [options] file1.osm file2.osm file3.osm
This will import the data from the OSM file(s) into a PostgreSQL database
suitable for use by the Mapnik renderer
   -a|--append          Add the OSM file into the database without removing
                        existing data.
   -b|--bbox            Apply a bounding box filter on the imported data
                        Must be specified as: minlon,minlat,maxlon,maxlat
                        e.g. --bbox -0.5,51.25,0.5,51.75
   -c|--create          Remove existing data from the database. This is the
                        default if --append is not specified.
   -d|--database        The name of the PostgreSQL database to connect
                        to (default: gis).
   -i|--tablespace-index        The name of the PostgreSQL tablespace where
                        all indexes will be created.
                        The following options allow more fine-grained control:
      --tablespace-main-data    tablespace for main tables
      --tablespace-main-index   tablespace for main table indexes
      --tablespace-slim-data    tablespace for slim mode tables
      --tablespace-slim-index   tablespace for slim mode indexes
                        (if unset, use db's default; -i is equivalent to setting
                        --tablespace-main-index and --tablespace-slim-index)
   -l|--latlong         Store data in degrees of latitude & longitude.
   -m|--merc            Store data in proper spherical mercator (default)
   -M|--oldmerc         Store data in the legacy OSM mercator format
   -E|--proj num        Use projection EPSG:num
   -u|--utf8-sanitize   Repair bad UTF8 input data (present in planet
                        dumps prior to August 2007). Adds about 10% overhead.
   -p|--prefix          Prefix for table names (default planet_osm)
   -s|--slim            Store temporary data in the database. This greatly
                        reduces the RAM usage but is much slower. This switch is
                        required if you want to update with --append later.
                        This program was compiled on a 32bit system, so at most
                        3GB of RAM will be used. If you encounter problems
                        during import, you should try this switch.
      --drop            only with --slim: drop temporary tables after import (no updates).
   -S|--style           Location of the style file. Defaults to /usr/local/share/osm2pgsql/
   -C|--cache           Now required for slim and non-slim modes:
                        Use up to this many MB for caching nodes (default: 800)
   -U|--username        Postgresql user name
                        password can be given by prompt or PGPASS environment variable.
   -W|--password        Force password prompt.
   -H|--host            Database server hostname or socket location.
   -P|--port            Database server port.
   -e|--expire-tiles [min_zoom-]max_zoom        Create a tile expiry list.
   -o|--expire-output filename  Output filename for expired tiles list.
   -r|--input-reader    Input frontend.
                        libxml2   - Parse XML using libxml2. (default)
                        primitive - Primitive XML parsing.
                        pbf       - OSM binary format.
   -O|--output          Output backend.
                        pgsql - Output to a PostGIS database. (default)
                        gazetteer - Output to a PostGIS database suitable for gazetteer
                        null  - No output. Useful for testing.
                        Include attributes for each object in the database.
                        This includes the username, userid, timestamp and version.
                        Note: this option also requires additional entries in your style file.
   -k|--hstore          Add tags without column to an additional hstore (key/value) column to postgresql tables
      --hstore-match-only       Only keep objects that have a value in one of the columns
      -                         (normal action with --hstore is to keep all objects)
   -j|--hstore-all      Add all tags to an additional hstore (key/value) column in postgresql tables
   -z|--hstore-column   Add an additional hstore (key/value) column containing all tags
                        that start with the specified string, eg --hstore-column "name:" will
                        produce an extra hstore column that contains all name:xx tags
   -G|--multi-geometry  Generate multi-geometry features in postgresql tables.
   -K|--keep-coastlines Keep coastline data rather than filtering it out.
                        By default natural=coastline tagged data will be discarded based on the
                        assumption that post-processed Coastline Checker shapefiles will be used.
      --number-processes                Specifies the number of parallel processes used for certain operations
                        Default is 1
   -I|--disable-parallel-indexing       Disable indexing all tables concurrently.
      --unlogged        Use unlogged tables (lost on crash but faster). Requires PostgreSQL 9.1.
      --cache-strategy  Specifies the method used to cache nodes in ram.
                                Available options are:
                                dense: caching strategy optimised for full planet import
                                chunked: caching strategy optimised for non-contigouse memory allocation
                                sparse: caching strategy optimised for small extracts
                                optimized: automatically combines dense and sparse strategies for optimal storage efficiency.
                                           optimized may use twice as much virtual memory, but no more physical memory
                                   The default is "chunked"
   -h|--help            Help information.
   -v|--verbose         Verbose output.
Add -v to display supported projections.
Use -E to access any espg projections (usually in /usr/share/proj/epsg)

Written by elrobis

December 28th, 2012 at 9:21 am

Posted in Uncategorized

Resident “lunatic” of LinkedIn

without comments

Just updated my LinkedIn profile.. This is just TOO FUNNY right now!

I totally had to save this for posterity. Unfortunately it will probably disappear before you can say “Jack Robinson”. 8)

Written by elrobis

December 21st, 2012 at 11:04 am

Posted in Uncategorized

mapnik rundemo.exe error: run from within the demo/c++ folder?

without comments

So I just installed the mapnik 2.0.1 binaries for Windows, and I ran into a “gotcha.” I couldn’t figure this out by Googling, so hopefully this post will help someone.  (Be sure to note the embarrassing conclusion.)

Specifically, following a fresh mapnik install, rundemo.exe told me this:

usage: ./rundemo <mapnik_install_dir>
Usually /usr/local/lib/mapnik
Warning: ./rundemo looks for data in ../data/,
Therefore must be run from within the demo/c++ folder.

Note the first bit:  usage: ./rundemo <mapnik_install_dir>


Well ok, that’s fine. Except this didn’t work, either:

C:\mapnik-2.0.1rc0\demo\c++>rundemo C:\mapnik-2.0.1rc0
running demo …
looking for ‘shape.input’ plugin in… C:\mapnik-2.0.1rc0/input/
looking for DejaVuSans font in… C:\mapnik-2.0.1rc0/fonts/DejaVuSans.ttf
### Configuration error: Could not create datasource. No plugin found for type ‘shape’ (searched in: C:\mapnik-2.0.1rc0/input/)

Hmm, but that is my install path??


Anyway, I finally figured it out. This works:

C:\mapnik-2.0.1rc0\demo\c++>rundemo C:/mapnik-2.0.1rc0/lib/mapnik
running demo …
[... yadda yadda ...]


Now for the embarrassing part—I was juxtaposing between this install tutorial and this other install tutorial, and clearly I did not pay close attention to the second. It turns out, if my eyes were peeled, I might have noticed this:

Then you should be able to run the demo program:

cd c:\mapnik-2.0.1rc0\demo\c++ rundemo ..\..\lib\mapnik


I’d like to quote Donald Knuth and re-state: “Software is hard”, but this is just a good, old-fashioned case of I-was-dumb. Nevertheless, I was stumped and Googling didn’t help, so maybe this will save some the confusion. :/


“You know those guitars that are like ..double guitars?!?!”

Written by elrobis

December 18th, 2012 at 12:27 pm

Posted in Uncategorized

Tagged with

osm2pgsql and windows errors: failed to start MSVCR90.dll, Connection to database failed, etc..

with one comment

I’m following the BostonGIS tutorial(s) to learn how to setup an OpenStreetMap tile server on Windows (XP 32, cos’ that’s what’s on the desk), and I’m running into headache after headache. So this post notes the gotcha’s I’m encountering and how I’m fixing them (optimistically assuming I fix all of them).

“Thar be dragons..” 8)


Part I: “Failed to start MSVCR90.dll”

First, as recommended I tried using the HOTOSM installer, but when I launch it using the example command provided on the BostonGIS site, I got the error above. I’d logged out/in, rebooted, ad nauseam, to no avail. Also the problem can’t be my redist libs–this install of XP is only a couple months old, and I have all the redists libs I know of (2005, 2008, 2010—plus, when I try reinstalling a redist it says I have the latest). I consider uninstalling and reinstalling Visual Studio to be more trouble than just trying something else to get the OSM download into PostGIS, so..

Next I tried the latest “alpha” build of osm2pgsql (rather than the all-in-one HOTOSM installer), and that  got me around the MSVCR/redist issue. That is, it’s launching..


Part II: the BostonGIS osm2pgsql example command

Error: Connection to database failed: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket “/tmp/.s.PGSQL.5432″?

…and then…

Error: Connection to database failed: fe_sendauth: no password supplied

…and then…

Out of memory for node cache dense index, try using “–cache-strategy sparse” instead

…and then…

Out of memory for sparse node cache, reduce –cache size

I’ve benefitted greatly from the BostonGIS tutorials, and I’m very thankful for them. (Thank you Regina, Leo, et. al., sincerely!) Having said that, though, here’s a few notes discussing how I addressed the above, for which a couple items are conspicuously missing in the BostonGIS osm2pgsl example command.

So take a look at this:

osm2pgsql missouri.osm.bz2 -H localhost -d OSM -U postgres -P 5432 -W -S -C 512 –slim –hstore –cache-strategy sparse

What’s of-note:

-H localhost (solves the first error)

-W (solves the second error, i.e. lets you enter your db password after ENTER)

–cache-strategy sparse (gets past the third, and into the fourth error)

–slim (from everything I read, you just need this on a 32-bit system)

-C 512 (wicked-small cache size)


21 Minutes. I’m not talking about an episode of Friends.

So those are the extra command parameters that ultimately got me to end-game:

Osm2pgsql took 1317s overall

Setting that wicked-small cache size finally did the trick (it actually finished while I was writing this). At first I started it larger (12000, look under “Parameters”), then I started backing off (4096, 2048 look under “Loading data into your server”), when all the examples I found didn’t work, I just tried reducing by the common multiples—again, 1024 didn’t work, but 512 did.

I need to do some research, but I guess that cache size suggests my hard disk must be ..wholly ..awful. :/ I’m not a gear-hound (clearly, I’m using XP), but that’s my best-guess.

Written by elrobis

December 18th, 2012 at 9:55 am

Posted in Uncategorized

Decode Google Map encoded points as Well Known Text (WKT) with Python

without comments

I had close encounter of the 5th kind yesterday.. here’s the gist..

It started when someone “gave” me a GIS dataset (..of polygons, kind of..) that a colleague of theirs, way back in ancient history, chose to pre-cook as ASCII-encoded point pairs. Their intention was almost certainly to use the pre-cooked data in Google Maps. Anyway, being arguably sane, I wanted to return this data to a more GIS-normal format so I could put it in a database like MySQL or Post and use it for other stuff.

I considered a few different approaches to this problem, including creating a Google Map that could load-in all of the polygons from their encodings, then iterate over the polygons, interrogate the polygon point pairs, and finally concatenate WKT features from the points and save the geofeatures into a MySQL table. This approach offered the advantage of using Google’s existing Maps API to do the decoding for me. But let’s face it, that’s lame, uninspired, and not inventive.. it wasn’t even interesting.

Besides..  I wanted to use Python.

I expected to find a Python recipe for this looming in the misty www, but I didn’t. However, I did find a JavaScript recipe by trolling around in Mark McClure’s website. Specifically, he provides a Polyline Decoder utility, and when I viewed the page source, I found the JavaScript code that actually does the decoding (opening in FireFox will show you the code, or IE should prompt you to download the file).

Long story short, the following Python method is an adaptation of Mark McClure’s JavaScript method (twisted a little to return WKT features rather than the pure point array). If you’re somewhat comfortable with Python, you should be able to copy/paste the method right into your Python file and start calling it; just pass-in the encoded point string and let the method do the rest.

Best / Elijah


def decodeGMapPolylineEncoding(asciiEncodedString):
    print "\nExtrapolating WKT For:"
    print asciiEncodedString

    strLen = len(asciiEncodedString)

    index = 0
    lat = 0
    lng = 0
    coordPairString = ""

    # Make it easy to close PolyWKT with the first pair.
    countOfLatLonPairs = 0
    firstLatLonPair = ""
    gotFirstPair = False

    while index < strLen:
        shift = 0
        result = 0

        stayInLoop = True
        while stayInLoop:                                                # GET THE LATITUDE
            b = ord(asciiEncodedString[index]) - 63
            result |= (b & 0x1f) << shift
            shift += 5
            index += 1

            if not b >= 0x20:
                stayInLoop = False

        # Python ternary instruction..
        dlat = ~(result >> 1) if (result & 1) else (result >> 1)
        lat += dlat

        shift = 0
        result = 0

        stayInLoop = True
        while stayInLoop:                                                # GET THE LONGITUDE
            b = ord(asciiEncodedString[index]) - 63
            result |= (b & 0x1f) << shift
            shift += 5
            index += 1

            if not b >= 0x20:
                stayInLoop = False

        # Python ternary instruction..
        dlng = ~(result >> 1) if (result & 1) else (result >> 1)
        lng += dlng

        lonNum = lng * 1e-5
        latNum = lat * 1e-5
        coordPairString += str(lonNum) + " " + str(latNum)

        if gotFirstPair == False:
            gotFirstPair = True
            firstLatLonPair = str(lonNum) + " " + str(latNum)

        countOfLatLonPairs += 1

        if countOfLatLonPairs > 1:
            coordPairString += ","

    # The data I was converting was rather dirty..
    # At first I expected 100% polygons, but sometimes the encodings returned only one point.
    # Clearly one point cannot represent a polygon. Nor can two points represent a polygon.
    # This was an issue because I wanted to return proper WKT for every encoding, so I chose
    # To handle the matter by screening for 1, 2, and >=3 points, and returning WKT for
    # Points, Lines, and Polygons, respectively, and returning proper WKT.
    # It's arguable that any encodings resulting in only one or two points should be rejected.
    wkt = ""
    if countOfLatLonPairs == 1:
        wkt = "POINT(" + coordPairString + ")"
    elif countOfLatLonPairs == 2:
        wkt = "POLYLINE(" + coordPairString + ")"
    elif countOfLatLonPairs >= 3:
        wkt = "POLYGON((" + coordPairString + "," + firstLatLonPair + "))"

    return wkt

Written by elrobis

October 20th, 2012 at 1:00 pm

PostGIS: count all features of each GeometryType in a spatial table

without comments

Sometimes, just when you think you’ve got something figured out –you get reminded that you really don’t.  :/

As you may know, ESRI allows for single and muli-part geometries to live in the same FeatureClass. So, if I have a shapefile of roads, there might be both LINESTRING and MULTILINESTRING features in that dataset. I live just loose-enough not to care about that. But I do need to be aware of it when I’m cobbling data in PostGIS.

In thise case, I was getting a PostGIS error tying to do a Dissolve-By-SQL, so I thought why not get a quick count of each GeometryType in the dataset? Maybe I was running into issues single and multi-part geometries were blurred together. It took me an embarassing chunk of time to get this right, so I figured I’d post the recipe  in case I needed a reminder later.

  GeometryType( wkb_geometry ) as geomType,  
  COUNT( wkb_geometry ) as featureCount
FROM anyGeoDataTable
WHERE wkb_geometry IS NOT NULL
GROUP BY GeometryType( wkb_geometry );


[Update 4.16.2012]
I wanted to do the same thing in MySQL the other day. It’s essentially the same query, but note the need to use CONVERT() function to make the output properly render in MySQL Workbench:

  CONVERT( GeometryType( shape ) USING utf8 ) as geomType,
  COUNT( shape ) as featureCount
FROM anyGeoDataTable
GROUP BY GeometryType( shape );


Without applying the CONVERT() function, MySQL Workbench just shows “blob” in the return set.

Anyway.. back to cracking some nut..  :]

Written by elrobis

January 21st, 2012 at 9:17 pm

Posted in PostGIS

Tagged with ,

Prepare a Shapefile for OpenScales using ogr2ogr and PostGREsql

with one comment

This post explains how to import GIS data (a shapefile, in this case) into a database (PostGREsql) so it can be consumed by most any mapping API. I have OpenScales in mind, but this approach will support any mapping app with functions for rendering feature overlays using geodetic coordinates (i.e Longitude and Latitude). In many cases, you’ll need to translate your feature data out of a projected/cartesian system and into a geodetic/spherical system; so I’ll include a demonstration of that.

Quick and Dirty Summary

This approach has two parts. First, we’ll use GDAL’s ogr2ogr utility to import a shapefile into our database. Second, we’ll use a few SQL commands to translate our data from a projected to a geodetic system, as well as optimize the table for fast query speeds.


The following prerequisites will need to be met in order to follow along:

1) GDAL is installed. If you need to install GDAL, check out my earlier post titled Install GDAL on Windows. Alternatively, you could install FWTools, which is admittedly easier, but that package is no longer maintained and it’s becoming out-of-date as GDAL/OGR continues to evolve.

2) PostGREsql is installed, and the PostGIS extension is enabled. If you need to install PostGREsql and PostGIS, check out the tutorial at Boston GIS demonstrating how to acquire and install PostGREsql with PostGIS. It won’t hurt to reveiw their entire tutorial, but I deviate from their approach once installation is complete (look for their sub-heading, Loading GIS Data Into the Database).

3) You have some geodata. I think the typical reader will have their own shapefile, pesonal geodatabase, or otherwise, but if you need something to follow along, here’s a US States shapefile projected to NAD83 Albers Equal Area Conic:

Loading Geodata into PostGREsql / PostGIS

To push shapefile data into your geodatabase, you can run an ogr2ogr script like this:

ogr2ogr -f “PostGreSQL” PG:”host= user=youruser dbname=yourdb password=yourpass” “E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp” -nln usstates -nlt geometry

For deeper reading on ogr2ogr utility flags (like -nln and nlt), check out the usage notes for ogr2ogr. Also, it may be worth you while to peruse the OGR PostgreSQL driver page, as well as the Advanced Driver Information page. In the meantime, here are a few quick notes regarding my script:

-f “PostGreSQL” PG:”host= user=youruser dbname=yourdb password=yourpass”  This tells OGR you’re exporting to PostGreSQL with the following connection string. Notice that my connection string is wrapped in double-quotes (“).

“E:\4_GIS\01_tutorials\usstates_nad83_aeac\usstates_albers.shp”   This is the path to my shapefile  input data. Once again, I wrapped this value in double-quotes (“). I do this to prevent the console from introducing linebreaks into the argument value and confusing the parser.

-nln usstates  The -nln flag means “rename the table on export”. In other words, my PostGREsql db will get a new table named usstates, and not one named usstates_albers.

-nlt geometry  This one’s particularly important for polygon data. It tells OGR “accept any geometry you encounter and store it in the feature’s geometry column”. Oftentimes, a polygon dataset will have polygons and multipolygons in the same table. For example here’s a narrow column of Well Known Text (WKT) geometries from the albers shapefile so you can see what I mean:

MULTIPOLYGON (((-1827806.2165497246 1227…..
POLYGON ((-1148108.0134327484 649421.311…..
MULTIPOLYGON (((1949555.0881544715 75264…..
POLYGON ((-199237.01920416657 704421.540…..
POLYGON ((-519870.38897088548 372373.616…..

If you run the ogr2ogr script noted above without -nlt geometry, you’ll get an error like this:

ERROR 1: Terminating translation prematurely after failed
translation of layer usstates_albers (use -skipfailures to skip errors)

By default, OGR refuses to mix geometry types in a table, so -nlt geometry allows you to duck that requirement and store both Polygon and Multipolygon features in the same table. You could optionally instruct OGR to “explode” Multipolygons into individual Polygons using the -explodecollections flag, as depicted in the following screenshot, but I don’t recommending that solution for the intended use case. For example, if a map user clicks on Michigan’s Upper Penenssula, I want the whole state to be selected, not just the UP. I’m not saying you can’t make that happen after exploding multifeatures; rather, it’s just not the approach I favor.

Without -nlt geometry, ogr2ogr will throw an error if it attempts to export polygons and multipolygons into the same table. Alternatively, you can use the flag -explodecollections (not recommended in this case) to translate your multipolygons into several polygons.

Assuming you used the script like the one I initially provided, you should be able to open pgAdmin III (the PostGREsql admin GUI that insalls with the database) and see your new usstates table:

Post-Processing your Geodata with SQL Instructions

With pgAdmin III open, expand the Tools menu and launch the Query tool. You’ll use the Query tool and the following SQL instructions to prep your data for production. I’ll start by listing all the queries together, then I’ll provide some deeper explaination in the text that follows.

SELECT SRID(wkb_geometry) FROM usstates;
SELECT * FROM spatial_ref_sys WHERE srid = 900925;
SELECT st_asText(wkb_geometry) FROM usstates;
UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 
SELECT SRID(wgs84geom) FROM usstates;
SELECT st_asText(wgs84geom) FROM usstates;
VACUUM usstates;
CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Basically, the steps emphasized in blue do the actual work, while the steps in black are more for sanity checks. Steps 4 and 5 perform the geometry transformation, and the last two steps do some house-cleaning and table optimization. Now I’ll provide a one-by-one discussion of each step.

1) SELECT SRID(wkb_geometry) FROM usstates;

Here we’re getting the SRID for the features in this layer (which is 900925 on my system). By default, OGR will store feature geometries in a field called wkb_geometry. Also, your PostGIS installation includes a table named spatial_ref_sys that stores coordinate system definitions necessary for the database to remain “spatially aware” of your new table as well as the other spatial datasets the system is managing. Consider this, if you want to select points from one layer that fall inside polygons from another layer, PostGIS needs to understand the coordinate systems for both datasets so that it can align their features for analysis. So when we run the SRID() function on the table’s geometry field, wkb_geometry, PostGREsql will return the unique identifier for the coordinate system used to define the features in our table.

2) SELECT * FROM spatial_ref_sys WHERE srid = 900925;

In this step we answer the question, “Does the SRS established at import makes sense for the data?” This statement queries the PostGIS spatial_ref_sys table for the coordinate system whose ID is referenced in the previous step. Check the srText field for a readable version of the coordinate system. Mine begins with “PROJCS["North_America_Albers_Equal_Area_Coni.."  That's what I expected, and that's a good thing.

3) SELECT st_asText(wkb_geometry) FROM usstates;

Now I like to do a quick query to see the WKT for some of my features. The geometry field wkb_geometry was created by ogr2ogr when it imported the shapefile into PostGREsql. If you don't like this name, you can use the creation option -lco GEOMETRY_NAME=geom in your ogr2ogr import script to set the name of the geometry field at import time. As shown in the image, the WKT for my features looks like I would expect.

Querying feature geometries using the ST_asText() function on the geometry column, wkb_geometry.

4) ALTER TABLE usstates ADD COLUMN wgs84geom GEOMETRY;

This instruction adds a new column to the table, which I’ll use to store the feature geometries for my US States in geodetic coordinates. The column will be named wgs84geom and will expect data of type GEOMETRY. In other words, this field will store a permanent “cast” of our feature geometries in the WGS84 coordinate system, which is very popular due to its use by the famous Global Positioning System (GPS).

Note: GIS coordinate systems are complex beasts, and it’s easy to get lost in their particulars. Nevertheless, one distinction is very important, and that’s the difference between projected systems and geodetic systems. Projected systems are two-dimensional. —these are the X/Y grids you used for trigonometry exercises in High School. On the other hand, geodetic systems define coordinate geometries within a three-dimensional, spherical space.

Both systems are roses by many names. For instance, projected systems may be called “Cartesian” or “Geometric”. And Geodetic systems may be called “Sexigesimal” or “Geographic”. The PostGIS community may more often refer to features as being geometry, or geography data and mean projected vs. geodetic coordinates, respectively. So if you run across language like this, realize people intend for geometry to imply X and Y coordinates in a cartesian space, and for geography to mean familiar longitude and latitude coordinates.

In this image, state boundaries are drawn in Albers Equal Area Conic, which is a projected coordinate system.


Here the state boundaries are "represented" in NAD83 geographic coordinates. NAD83 geographic (EPSG 4269), is similar, if not nearly identical, to WGS84. As you can see, geodetic systems are difficult to represent in 2D space; as such, unprojected maps of large areas tend to look "suspect".

5) UPDATE usstates SET wgs84geom = st_Transform(wkb_geometry, 4326); 

With the new column ready to go, you can now wield an UPDATE statement and the st_Transform() function to translate feature geometries from their projected coordinates to their geographic WGS84 coordinates. The st_Transform() function expects two arguments, the source geometry field to transform, and the EPSG code for the output coordinate system. WGS84 is a fundamentally-popular coordinate system, and it’s EPSG code of 4326 is easy to find. If you do not know the EPSG code for your preferred coordinate system, head over to and do some quick research. 

Note: We could optionally perform any coordinate system transformations in our queries by calling st_Transform on the geometry field right in the query. However, by casting our feature geometries in advance, we remove calculation overhead and get a subtle efficiency gain. This can particularly improve response times for spatial queries.

6) SELECT SRID(wgs84geom) FROM usstates;

Like the second step, this query is only intended to confirm whether the SRID established in the previous step makes sense for the data. It should return 4326.

7) SELECT st_asText(wgs84geom) FROM usstates;

Also like the third step, here I’m querying the feature geometrires as WKT to make sure they’re defined by longitude/latitude coordinate pairs.

Well Known Text (WKT) for features transformed into the WGS84 coordinate system.

8) VACUUM usstates;

Once you’re finished with the geometry transformation, call a VACUUM instruction for the usstates table. PostGREsql likes to have a deep knowledge of its feature tables so that it can optimize queries. To this end, the VACUUM command instructs PostGREsql to “gather fresh intel” on your table so that it can make better decisions. This step is particularly necessary for tables with a large number of features as well as tables experiencing a lot of maintenance (i.e. frequent feature INSERT and UPDATE activity).

9) CREATE INDEX usstates_wgs84_idx ON usstates USING GIST(wgs84geom);

Finally —if you intend to perform queries on this table, particularly spatial intersection queries on the new geometry column, you’ll want to create a spatial index referencing that column. Here, usstates_wgs84_idx is just a naming convention that implies TableName_FieldName_ThisIsAnIndex. To create an index, call the GIST() function on a table and pass in the table column you intend to search on —for instance, wgs84geom.

After following along with this post, you should have learned how to 1) use ogr2ogr to populate a PostGREsql database with shapefile data, 2) leverage PostGIS functions to perform a coordinate system transformation in the database, and 3) apply PostGREsql optimization functions to optimize the table for production use.

I hope you found this beneficial. Thanks for reading.


Written by elrobis

November 19th, 2011 at 5:55 pm

Posted in GDAL/OGR

Tagged with , , ,