Friday, February 10, 2006

Python Geoprocessing: Shape2Text

As I discussed here, I was developing a Python script that will convert any shapefile to a text file. This text file will then be manipulated by an external application (the lat/long of each vertex will be adjusted), and the manipulated text file then needs to be converted back into a shapefile. I developed the first third of this process, which converts the original shapefile to a text file. Work on the other two thirds is nearing completion as well.

Well, I need to complete this project and have not really touched or thought about it for over a month. Consider this post an effort to remember and to organize my thoughts...and perhaps even organize and tighten my existing code up a bit. Like most new (budding) programmers, I am indebted to the amazing samples and assistance out there for both ESRI geoprocessing scripting and Python programming in general. If someone out there is undertaking a similar task, perhaps this will be of help to them.

There are actually three scripts that need to be developed. First is shape2text, which converts the shapefile to a text file. Second is text2exchange, which converts the manipulated text file into a standard exchange-formatted text file. Third is exchange2shape, which converts the standard exchange text file to a shapefile.

Python Geoprocessing Script: shape2text

So, let's focus on the first third here: converting from shape2text. Sounds like a simple enough procedure, right? Nope. Here are the problems that I had to overcome:
  1. Convert polygon and line shapefiles to a table of X/Y coordinates
  2. Provide flexibility for users to select the shapefile attributes they would like to be preserved in the text file (and so also in the resulting manipulated shapefile).
  3. Account for the possibility of inner circles (donut holes) and multipart shapes (such as a chain of islands as one feature).
  4. The format of the text file needs to be standardized, as the corresponding text2exchange and exchange2shape scripts will need a standard format to work with.
  5. The external program that will manipulate the text file by adding adjusted latitude and longitude fields might also remove some vertices (points). New vertices will never be added, but it is possible that vertices will be removed.
  6. The external application requires a fixed set of fields. The number of attributes users select to carry over is not fixed, so this flexibility must be built in such a way that the text file manipulated by the external program always contains the same fixed set of fields. (The game would be too easy without such restrictive rules, eh??)
  7. All of these processes need to happen seamlessly, in sequence, from within ArcMap, and I would prefer to program this in Python as opposed to VBA/ArcObjects.
So, how to read through a polygon or polyline shapefile and write to a text file containing XY vertices? By using data cursors to access the easy-to-use geometry object. Documentation on the geometry object is provided by ESRI in the ArcGIS Desktop Help 9.1. Scroll about halfway down and you will even find sample code (which I used successfully) to access the XY of each vertex in a polygon and print it to the screen. The provided script even shows you how to detect multipart and inner circle polygons. Recording these anomalies in a text file and then dutifully recreating them, however, is another matter entirely. This was very difficult, but I was able to complete it with the help of the great folks over at the ArcView-L Listserv.

So, how to account for inner circles and multipart shapes? Well, the sample script provided by ESRI (above) shows how to detect them. What I did was to have the script write a new line in the text file containing only the word NEW. These NEW lines are preserved through the text2exchange script. Preview a sample output file of three state polygon features converted to text. The exchange2shape script generates the adjusted shapefile by reading the exchange file line by line. When it hits a line containing the word NEW, it starts a new part if the FID of the next vertex does not change, and a new feature if the FID does change. The geometry object will automatically create either an inner circle or a multipart shape depending on the orientation (direction) of the vertices in the polygon. This is actually easy as long as the original order in which shape2text read the vertices is preserved. The geoprocessing capabilities then do all the work for you.
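A minimal sketch of the NEW-marker logic described above, in plain Python with no geoprocessor involved. The line layout (fid,pid,x,y, comma separated) and the function name group_parts are assumptions made for illustration; the real exchange file layout may differ.

```python
def group_parts(lines):
    """Group vertex lines into a list of (fid, parts) in file order.

    Assumed (hypothetical) layout: each vertex line is "fid,pid,x,y",
    and a line containing only NEW precedes each part.
    """
    features = []          # [(fid, [part, part, ...]), ...]
    start_new_part = True  # the first vertex always opens a part
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == "NEW":
            start_new_part = True
            continue
        fid, _pid, x, y = line.split(",")
        vertex = (float(x), float(y))
        if not features or features[-1][0] != fid:
            # FID changed: start a new feature with one part
            features.append((fid, [[vertex]]))
        elif start_new_part:
            # Same FID after a NEW line: start a new part (ring or island)
            features[-1][1].append([vertex])
        else:
            # Ordinary vertex: extend the current part
            features[-1][1][-1].append(vertex)
        start_new_part = False
    return features


sample = [
    "NEW", "0,1,0.0,0.0", "0,2,0.0,10.0", "0,3,10.0,10.0", "0,4,0.0,0.0",
    "NEW", "0,5,2.0,2.0", "0,6,4.0,2.0", "0,7,4.0,4.0", "0,8,2.0,2.0",
    "NEW", "1,9,20.0,20.0", "1,10,20.0,30.0", "1,11,30.0,30.0", "1,12,20.0,20.0",
]
features = group_parts(sample)
```

Here feature 0 ends up with two parts and feature 1 with one. Whether a second part becomes a hole or an island is left to the geometry object, as described above.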

So, how to handle the possible removal of vertices by the external application? By creating a new autonumber field, named pid (point id). (See sample output text file) This field can then be used to properly sort the vertices in the same relative order in which they were read. If points or entire features are removed by the external application, the exchange2shape will still draw the adjusted shapes as they should be drawn as long as the relative order remains.
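To illustrate the role of pid, here is a small sketch that puts the surviving vertices back into reading order. The row layout (fid, pid, x, y) and the name restore_order are hypothetical. The one detail worth noting is that pid is compared as an integer, since a plain text sort would place "10" before "2".

```python
def restore_order(rows):
    """Return rows sorted by pid, compared as integers.

    Each row is assumed (hypothetically) to be (fid, pid, x, y),
    with pid possibly still a string as read from the text file.
    """
    return sorted(rows, key=lambda row: int(row[1]))


# pids 1..11 were written; the external program dropped a few and shuffled the rest
rows = [
    ("0", "10", 5.0, 5.0),
    ("0", "2", 0.0, 10.0),
    ("0", "1", 0.0, 0.0),
    ("0", "4", 10.0, 0.0),
    ("0", "11", 0.0, 0.0),
]
ordered_pids = [row[1] for row in restore_order(rows)]
```

The gaps left by removed vertices do not matter; only the relative order of the remaining pids is used.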

So, how to provide the flexibility for a variable set of fields while creating a text file with a set number of predefined fields for the external application to consume? Create two text files, one of which has those minimum fields required by the external application. Preview a sample of the text file which shape2text creates specifically for the external program. The text2exchange script then recombines the two text files based on the pid number.
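The recombination step can be sketched as a simple lookup on pid. The column layouts are assumptions for illustration only (pid first in both files, comma separated), and recombine is a hypothetical name, not the actual text2exchange code.

```python
import csv
import io


def recombine(fixed_text, attr_text):
    """Append user-selected attributes to each fixed-field row, matched on pid.

    Assumed (hypothetical) layout: both texts are comma separated
    and carry pid in the first column.
    """
    attrs = {}
    for row in csv.reader(io.StringIO(attr_text)):
        if row:
            attrs[row[0]] = row[1:]

    merged = []
    for row in csv.reader(io.StringIO(fixed_text)):
        if row:
            # Rows removed by the external program simply never show up here
            merged.append(row + attrs.get(row[0], []))
    return merged


fixed_text = "1,45.0,-93.0\n2,45.1,-93.1\n4,45.3,-93.3\n"   # pid 3 was removed
attr_text = "1,Minnesota\n2,Minnesota\n3,Minnesota\n4,Minnesota\n"
merged = recombine(fixed_text, attr_text)
```

Driving the loop from the fixed-field file means the output contains only the vertices the external program kept.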

So, how to enable users to select the fields they wish to preserve in the adjusted shapefile at the end of the process?
First, the script reads in the user-selected field names as a series of comma-separated values.
fields = sys.argv[2]
Second, a function is called to split the values into a list of field names.
def parseKeyFields(fields):
     StandardFields = fields.split(",")
     return StandardFields
Third, as the data cursor reads the original shapefile, it records the index number of those field names that match the StandardFields list (above).
for item in StandardFields:
     if str(field.Name) == item:
          StandardFieldIndex[n]= str(fieldNum+2)
          n = n + 1
This way, the script has a list that is used to determine whether a particular field gets written to the larger text file. No user-selected fields are written to the fixed fields text file which will be manipulated by the external program.
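The snippets above depend on the cursor and field objects, so here is a self-contained sketch of the same matching idea using a plain list of field names. The names parse_key_fields and matching_indexes are hypothetical, and so is the offset parameter, which stands in for the +2 used above on the assumption that it shifts past leading columns in the text file.

```python
def parse_key_fields(fields):
    """Split "A, B,C" into ["A", "B", "C"], ignoring stray spaces and empty items."""
    return [name.strip() for name in fields.split(",") if name.strip()]


def matching_indexes(layer_field_names, wanted, offset=0):
    """Return the (offset) positions of the layer fields the user asked to keep."""
    wanted_set = set(wanted)
    return [
        index + offset
        for index, name in enumerate(layer_field_names)
        if name in wanted_set
    ]


wanted = parse_key_fields("STATE_NAME, POP2000")
layer_fields = ["FID", "Shape", "STATE_NAME", "AREA", "POP2000"]
indexes = matching_indexes(layer_fields, wanted)
shifted = matching_indexes(layer_fields, wanted, offset=2)
```

Field names the user typed that do not exist in the layer are silently skipped in this sketch; a real script might want to warn about them.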

So, how to coordinate all of these activities from one ArcMap button? Here is what must seamlessly happen.
  • shape2text
  • launch external program
  • text2exchange
  • exchange2shape
I am using VBA for this, creating a geoprocessing object within the VBA code instead of the more traditional method of invoking a geoprocessing tool, with parameters, from a button on a toolbar. This allows the VBA code to call each script and to wait until it has finished running before going to the next line of code.
Dim gp As Object
Set gp = CreateObject("esriGeoprocessing.GpDispatch.1")
gp.shape2text InputForm.ComboBox1.Text, _
     InputForm.TextBox1.Text
Works like a charm...

4 comments:

Anonymous said...

This is a great blog! I hope you keep on updating things for the future.

Mag said...

thanks there!
it would be great if ESRI publish its API.

MG said...

Hello, great blog although I cannot download your script... Could you upload it again? It would help me a lot. Thank you ;)
