Cartometric Blog

A scrapbook of GIS tricks, with emphasis on FOSS4G.

ogr2ogr: Export Well Known Text (WKT) for one feature to a CSV file

without comments

Perhaps you’re looking for this?

ogr2ogr -f “CSV” “E:\4_GIS\NorthArkCartoData\UnitedStates\MO_wkt” “E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp” -sql ” SELECT * FROM usstates WHERE STATE_NAME = ‘Missouri’ ” -lco “GEOMETRY=AS_WKT ” -lco “LINEFORMAT=CRLF” -lco “SEPARATOR=SEMICOLON”

My buddy at work needed a way to get the WKT geometry definition for a single feature in a shapefile. I thought, “surely this is something we can do with OGR?” Lo’ and behold, yes it was. :)

The script above uses OGR SQL to interrogate a shapefile for one lone feature and, when it’s found, exports the record to a comma separated values (CSV) file (or in this case, a semicolon delimited file). Here’s a quick break down:

-f “CSV” “E:\4_GIS\NorthArkCartoData\UnitedStates\MO_wkt”  This means CREATE the directory MO_wkt and export the derived output to this directory. I wrapped the file path in double-quotes (“) so linebreaks could not be introduced by the terminal and confuse the argument parser.

GOTCHA! DO NOT create the MO_wkt directory before running the script. Let OGR create that directory. However, relative to my example, E:\4_GIS\NorthArkCartoData\UnitedStates\ could be on your file system already.

“E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp”  This is the shapefile you want to interrogate. Yet again, the file path is wrapped in double-quotes (“).

-sql “  SELECT * FROM usstates WHERE STATE_NAME = ‘Missouri’  “   This is the OGR SQL expression I used to grab the entire record (*) for features where the state_name field was ‘Missouri’. Notice how the search term (‘Missouri’) is wrapped in single-quotes (‘). If the field you’re searching on is a VARCHAR or a TEXT field (obviously a name requires non-numeric characters), you’ll need to wrap your search term in single-quotes. If you were searching on a numeric field, you would not wrap your search term in quotes. Also, just like the file paths shown above, I recommend always wrapping long values and expressions in double-quotes to avoid confusing the argument parser. Next, and this is a matter of user-preference, but notice how I added extra spaces between the double-quotes and my SQL expression (i.e. “  SELECT..  “). This is totally legit and it may help to distinguish the SQL query from the rest of the OGR script. 

I could have exported a subset of the fields using an expression like:

“  SELECT STATE_ABBR as abbr, STATE_NAME as name FROM usstates WHERE STATE_NAME = ‘Missouri’  ”

This expression selects only the STATE_NAME and the STATE_ABBR fields and renames them name and abbr, respectively, by applying the AS operator. The fields for the subset are comma-separated, but the last field is not followed by a comma (this is basic SQL syntax, but I’m explaining it in case the readership includes some first-timers). If you don’t want to rename the fields in your selection, just omit the “AS rename” from your expression. —when exporting to CSV, the geometry will be exported automatically according to the geometry flag used, which we set next. Unfortunately, though, this means there’s nothing you can do to omit the geometry from your selection.

These last few parameters use -lco flags to signalize various ”creation options” used by the CSV driver. The creation options are unique for each dataset OGR can export, and in this case I’m demonstrating a particular recipie. If you want to know more about the CSV creation options, check out the OGR CSV driver page. Alternatively, if you want to export to a different file format, visit the OGR formats page, select the appropriate driver, and read-up on the various creation options.

-lco “GEOMETRY=AS_WKT “  This creation option means “Export the Geometry value as Well Known Text”. Other options that might interest you include GEOMETRY=AS_XYZ, GEOMETRY=AS_XY or GEOMETRY=AS_YX.

-lco “LINEFORMAT=CRLF”  This creation option means “use a Windows-formatted linebreak”. If you’re running Linux you should use “LINEFORMAT=LF”. Honestly, I can’t remember if we really needed this flag so you may want to try skipping this option.

-lco “SEPARATOR=SEMICOLON”  This flag specifies the delimiter used to separates fields and values from one another in the CSV. We chose SEMICOLON because Polygon WKT already has a bunch of commas in it, so using the SEMICOLON delimiter helped us to visually identify the WKT we were really looking for. Your other options include SEPARATOR=COMMA and SEPARATOR=TAB.

Alternatively, do this with ogrinfo..

You could also use ogrinfo to dump a feature’s WKT right into the console. This approach required using one of the so-called special fields recognized by OGR SQL. To learn more about OGR SQL, its particulars, and these special fields, visit the OGR SQL help page. Without further adieu, here’s my ogrinfo script:

ogrinfo “E:\4_GIS\NorthArkCartoData\UnitedStates\USStates.shp” -geom=yes -sql ” SELECT OGR_GEOM_WKT FROM usstates WHERE STATE_NAME = ‘Missouri’ “

In this case, the OGR SQL expression requests only the “OGR_GEOM_WKT” special field. Because we’re using ogrinfo, our output will stay in the console. However, this was ultimately undesirable for our task, as “real world” Polygon WKT typically spills beyond the console buffer. Moreover, copying the WKT geometry out of the console introduced hundreds of unwanted linebreaks into the WKT. So, for these reasons, we prefered to export our WKT results to a CSV because it allowed us to easily see, select, and copy the desired WKT data without any fuss using good ol’ Notepad with word wrap.

I hope you like. :)

/Elijah

P.S. If you don’t have OGR and want (correction, you NEED) to install it, check out my earlier post describing how to install GDAL/OGR on a Windows system.

Written by elrobis

November 18th, 2011 at 5:59 pm

Posted in GDAL/OGR

Tagged with , ,

Leave a Reply