Saturday, September 13, 2008

ArcMap2GMap Fixes: Thanks for Feedback!

Since I released ArcMap2GMap a couple of weeks ago, users have repeatedly pointed out two persistent bugs. I want to document these errors and their fixes, both of them quick, here. The script has been updated on the ESRI ArcScripts page with these fixes, so a fresh download should resolve these issues.
Bug #1: Process Status Message Not Appearing in ArcGIS 9.2
  • For whatever reason, the form text specifying the three procedure steps (which alternated while processing to inform the user which step was currently running) was not appearing in ArcGIS 9.2. This gave the appearance that nothing was happening.
    • The fix was to add a Me.Repaint command each time the label's caption is changed. My understanding was that VBA should handle this automatically, but that was obviously not the case here. This creates a bit of unnecessary overhead, as the entire form and its contents need to be redrawn just to update the status label.
Bug #2: String Length Limitation While Passing to Geoprocessor
  • There is a string length limitation when passing text strings as parameters to a Python script. The first Python script is called by the VBA using the following command: gp.ArcMap2GMap apiKey, tbxName.Text, tbxTitle.Text, layerInfo. The layerInfo string (last parameter) contains the names of all included layers, as well as all user specified options, including color, thickness, messagebox text, etc. I do not know exactly what the length limitation is, but if too many layers are selected this string is truncated and of course errors ensue.
  • This was brought to my attention by a user needing to create a page with 27 layers.
    • The fix was to have the VBA write the layerInfo contents to a text file and then have the Python script, once invoked, read the contents back from that file. A rough sketch of the Python side of this workaround is below.
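Here is a minimal sketch of the reading side, assuming a hypothetical file name (layer_info.txt) and a semicolon-delimited layer list; the released script's actual file location and format may differ.

    # Minimal sketch of the Python side of the text-file workaround.
    # The file name (layer_info.txt) and the semicolon-delimited format
    # are illustrative assumptions; the released script may differ.
    import os
    import sys

    def read_layer_info(scratch_dir):
        """Read the layer options that the VBA form wrote to disk."""
        path = os.path.join(scratch_dir, "layer_info.txt")
        f = open(path)
        try:
            contents = f.read().strip()
        finally:
            f.close()
        # One entry per layer: name, color, thickness, info-window text, etc.
        return [entry for entry in contents.split(";") if entry]

    if __name__ == "__main__":
        # The short parameters (API key, page name, title) still fit on the
        # command line; only the long layer string moves to the text file.
        api_key, page_name, page_title, scratch_dir = sys.argv[1:5]
        layers = read_layer_info(scratch_dir)
        print("%d layer entries read" % len(layers))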
I surely do appreciate all the feedback!

Thanks everyone!

Tuesday, September 09, 2008

Texas Statewide Historical Maps & Positional Accuracy Pt. 2

Continuation from Pt. 1

Our Geographic Accuracy Measurement Procedures - Overview
  • We calculate 5 positional accuracy measurements for each georeferenced map. Texas is divided into 4 quadrants, with each quadrant receiving an independent positional accuracy measurement. The fifth measurement is the average accuracy of the 4 quadrants.
  • Each quadrant is further subdivided into 4 sections. At least 4 sample points are taken from each of these sections, with a minimum of 20 points from each quadrant. However, the NSSDA recommends 25 points, and that is what we aim for. We therefore end up using (ideally) between 80 and 100 sample data points to measure the accuracy of each map.
    • The exact points we use of course differ for each map, but here are our geographic references in priority order. First, clearly identifiable county boundaries. Second, coordinates provided by hash marks labeled on map edges. Third, city locations. City locations have the lowest priority here because we use them primarily to georeference, and it is best for the sample points to be different from the points used to georeference. Natural features, such as rivers or lakes, are never used.
Our Geographic Accuracy Measurement Procedures - Step by Steps
  • Four new feature classes are created for each map to hold the 20 to 25 sample points from each quadrant.
  • Point features are created for each sample data point. The coordinates as specified by the map are hand-entered in the attribute table.
  • The actual X/Y coords (in meters) are generated for each feature in the four quadrant feature classes using ArcMap's Add XY Coordinates tool.
  • This data is entered into the horizontal accuracy calculation spreadsheet. Our own customized templates use different NSSDA multipliers, based on the calculated RMSE ratio. This is discussed in more detail in the previous blog post. A simplified sketch of the calculation is below.
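Here is a bare-bones sketch of what the spreadsheet does for a single quadrant, assuming the sample points come in as (map_x, map_y, true_x, true_y) tuples in meters; our customized elliptical-error multipliers for RMSE ratios below 0.6 are not reproduced here.

    # Bare-bones sketch of the per-quadrant NSSDA calculation. Sample
    # points are assumed to be (map_x, map_y, true_x, true_y) tuples in
    # meters. Our in-house normalized elliptical-error multipliers, used
    # when RMSE(min)/RMSE(max) drops below 0.6, are not reproduced here.
    import math

    def nssda_horizontal_accuracy(points):
        n = float(len(points))
        rmse_x = math.sqrt(sum((mx - tx) ** 2 for mx, my, tx, ty in points) / n)
        rmse_y = math.sqrt(sum((my - ty) ** 2 for mx, my, tx, ty in points) / n)
        ratio = min(rmse_x, rmse_y) / max(rmse_x, rmse_y)
        if ratio >= 0.999:
            # Case 1: RMSE(x) == RMSE(y)
            accuracy = 1.7308 * math.sqrt(rmse_x ** 2 + rmse_y ** 2)
        elif ratio >= 0.6:
            # Case 2: approximated circular standard error
            accuracy = 2.4477 * 0.5 * (rmse_x + rmse_y)
        else:
            # This is where our customized multiplier table takes over.
            raise ValueError("RMSE ratio %.3f is below 0.6" % ratio)
        return ratio, accuracy

    # The map-wide figure reported in the metadata is simply the mean of
    # the four quadrant results, e.g. for the Colton's Texas map (in km):
    quadrants = [6.988, 8.1, 12.4, 8.44]
    print("Entire map: %.3f km" % (sum(quadrants) / len(quadrants)))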
An Example! Enough Talk, Here is an Example...
  • Title: Colton's Texas (1855)
    • Original Map Citation: Colton, Joseph H. Colton's Texas. New York: J. H. Colton & Co., 1861.
    • Entire map tested 8.982 kilometers horizontal accuracy at 95% confidence level.
    • Download: Georeferenced Map - No metadata included yet as we are still ironing out taxonomy and copyright issues. You betcha there will be postings about these issues, as well as about interface and file format issues.
  • The Horizontal Positional Accuracy statement for the Colton's Texas (1855) map is pasted below. As specified above, I am unable to post a complete metadata record at this time as other portions are incomplete. You can see the relatively high accuracy of the NE quadrant as opposed to the SE quadrant.

Horizontal_Positional_Accuracy:
Horizontal_Positional_Accuracy_Report:

Each georeferenced map image file was divided into four quadrants at centerpoint 31.688, -98.634. The horizontal accuracy of each quadrant was calculated independently using the NSSDA (National Standard for Spatial Data Accuracy) at the 95% confidence level. Where possible, a minimum of 20 sample points from each quadrant were used to measure horizontal accuracy. The horizontal accuracy of the entire map is the mean average of these four measurements.

Quantitative_Horizontal_Positional_Accuracy_Assessment:
Horizontal_Positional_Accuracy_Value: 8982
Horizontal_Positional_Accuracy_Explanation:
Entire map tested 8.982 kilometers horizontal accuracy at 95% confidence level. Mean average of the four quadrants.

Quantitative_Horizontal_Positional_Accuracy_Assessment:
Horizontal_Positional_Accuracy_Value: 6988
Horizontal_Positional_Accuracy_Explanation:
NW Quadrant tested at 6.988 kilometers at the 95% confidence level.

Twenty-four points were used to test the positional accuracy. The calculated RMSE ratio was 0.3344, and the normalized elliptical error table was used to determine error at the 95% confidence level.

Quantitative_Horizontal_Positional_Accuracy_Assessment:
Horizontal_Positional_Accuracy_Value: 8100
Horizontal_Positional_Accuracy_Explanation:
NE quadrant tested at 8.1 kilometers at the 95% confidence level. Twenty-four points were used to test the positional accuracy. The calculated RMSE ratio was 0.699, and the normalized circular error table was used to determine error at the 95% confidence level.

Quantitative_Horizontal_Positional_Accuracy_Assessment:
Horizontal_Positional_Accuracy_Value: 12400
Horizontal_Positional_Accuracy_Explanation:
SE quadrant tested at 12.4 kilometers at the 95% confidence level.

Twenty-four points were used to test the positional accuracy. The calculated RMSE ratio was 0.609, and the normalized circular error table was used to determine error at the 95% confidence level.

Quantitative_Horizontal_Positional_Accuracy_Assessment:
Horizontal_Positional_Accuracy_Value: 8440
Horizontal_Positional_Accuracy_Explanation:
SW quadrant tested at 8.44 kilometers at the 95% confidence level.

Ten points were used to test the positional accuracy. The calculated RMSE ratio was 0.400, and the normalized elliptical error table was used to determine error at the 95% confidence level.

Monday, September 08, 2008

Texas Statewide Historical Maps & Positional Accuracy Pt. 1

We have been undertaking a project, called the Texas Time Machine (TTM), which requires us to georeference statewide historic maps of Texas.

Brief overview of TTM:
Preparing historical materials for use within GIS for geographic analysis requires a large time commitment and high level of expertise. Once prepared, the researcher still needs to understand the fundamentals of operating a large desktop GIS application. TTM resolves this by (1) compiling a collection of prepared historic materials, and (2) enabling interaction with these materials within Google Maps and Google Earth (as well as desktop GIS applications).

TTM provides 4 ways to view geographically referenced maps, statistics, and images. (1) Via Google Map overlays, (2) Via downloadable Google Earth KMZ files, (3) Via downloadable GIS Data, and (4) Via downloadable un-georeferenced images.
I want to focus this post specifically on how we are georeferencing 150-year-old maps whose geography encompasses over 250,000 square miles.

Georeference Scanned Map Image
  1. The intended coordinate system of the original cartographer needs to be determined. In consultation with our Cartographic Archivist Librarian, we discovered that for the majority of the maps dating to the 19th century, a Mercator system was intended.
  2. Control points need to be used for georeferencing. For the first order, we use control points from the state outline, namely the Panhandle, westernmost tip, easternmost tip, and southernmost tip. Then, we overlay a uniform 5x5 grid shapefile (25 standard polygon features) over the ungeoreferenced image within ArcMap. One city per cell is used as a control point. Cells without cities indicated on the scanned map will not contain control points.
NSSDA & Historical Measurements
We are adhering to the National Standard for Spatial Data Accuracy (NSSDA). The two primary resources we followed are:
  • ‘Geospatial Positioning Accuracy Standards, Part 3: National Standard for Spatial Data Accuracy’
  • ‘Positional Accuracy Handbook: Using the National Standard for Spatial Data Accuracy to measure and report geographic data quality’.
Both resources can be accessed here.

The first work above provides a general overview of the process and specific case studies one can learn from. There are two cases for measuring horizontal accuracy.
  • The first case is on page 3-10. This case demonstrates how to calculate error at 95% confidence when the x-axis error is equal to the y-axis error, i.e. RMSE(x) == RMSE(y) (Root Mean Square Error). I do not anticipate this being applicable, as our maps are not consistently drawn to scale.
  • The second case is on page 3-11. This case is entitled ‘Approximating Circular Standard Error When RMSE(x) != RMSE(y)’. However, the details of the case demonstrate how to calculate error when RMSE(min)/RMSE(max) is between 0.6 and 1.0. This implies a nearly consistent error across the x- and y-axes.
    • The formula provided is: Accuracy ~ 2.4477 * 0.5 * (RMSE(x) + RMSE(y)). This is in effect the average of the two errors (added and divided by 2), multiplied by the 95% confidence factor designated by the ‘Generalized Circular Probable Error’ table (JSTOR access, page 170). Both cases are written out just after this list.
    • This case continues to explain that the circular standard error at 39.35% confidence may be approximated at 0.5 * (RMSE(x) + RMSE(y)).
    • The big question for us is how these numbers can be adjusted to accommodate cases where RMSE(min)/RMSE(max) is less than 0.6.
  • The second work above is a handbook/workbook that makes it easy to apply the first case specified in the first work, namely where RMSE(x) == RMSE(y). As stated above, this is not the case with our maps because cartographers could not draw them to scale 150 years ago. However, this second work provides print and downloadable versions of a spreadsheet that we modified for our use, namely by adjusting the final multiplier based on the RMSE ratio.
  • Both works provide template language to include in the GIS metadata, as well as specific metadata fields where positional accuracy should be reported.
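To keep the two cases straight, here they are written out as formulas (my reading of the standard; the 1.7308 in case 1 is just 2.4477 divided by the square root of 2):

    \[
    \text{Case 1 } (RMSE_x = RMSE_y):\quad
      \mathrm{Accuracy}_{95\%} = 1.7308 \times RMSE_r
      = 1.7308 \times \sqrt{RMSE_x^2 + RMSE_y^2}
      = 2.4477 \times RMSE_x
    \]
    \[
    \text{Case 2 } (0.6 \le RMSE_{\min}/RMSE_{\max} \le 1.0):\quad
      \mathrm{Accuracy}_{95\%} \approx 2.4477 \times 0.5 \times (RMSE_x + RMSE_y)
    \]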
Our Geographic Accuracy Measurement Procedures
  • Ha! I am tired of writing at the moment and will lay out our specific in-house procedures tomorrow. I will also include some snippets from one of our metadata records.

Thursday, August 28, 2008

GIS Librarian (-ish) Positions x2

1.
Geographic Information Systems Specialist (Bucknell University)
"Bucknell University seeks to hire a Geographic Information Systems Specialist for the Library and Information Technology organization... The primary responsibilities of this position are to develop, expand, and support a GIS user community by assisting students and faculty in the selection and use of appropriate GIS technologies; working with faculty and students in designing and executing projects using GIS; and providing instructional support in GIS."
While this position does not have the title Librarian and while a library degree is not required, this sounds quite similar to my responsibilities here. And, of course, this position does indeed reside in Bucknell's Library. With a salary of $40,000 - $60,000 this position sounds pretty fantastic for a GIS professional looking for an academic position.

2.
Data Service Librarian (New York University)
"New York University is seeking an energetic, creative, and knowledgeable librarian to select, acquire, manage, and deliver numeric and geospatial data collections to support campus research and scholarship."
Some snippets:
  • The librarian will build numeric and spatial data collections and facilitate access to additional data resources across the sciences
  • Reporting to the Data Service Coordinator, the Data Service Librarian works to develop appropriate description for managing research data collections; investigates new sources for metadata; keeps abreast of new and evolving metadata standards such as the Data Documentation Initiative (DDI) and Federal Geographic Data Committee (FGDC) standards.
  • The incumbent will develop and maintain awareness of data-centered initiatives across the sciences, attending professional meetings, workshops and conferences for training and continuing professional development.
  • Requirements: Basic familiarity with software for statistical and geospatial analysis (e.g. SAS, SPSS, Stata, R, GIS applications).
This really sounds like an exciting position. We are continually expanding our numeric (non-spatial) data services, and positions such as this one, which straddle both numeric and spatial data services, are the thing of the future. My opinion is that GIS technology is becoming more and more commonplace, and within 5 years (or so) it might not be so necessary to maintain a professional strictly in GIS or strictly in non-spatial statistics. Great long-term opportunity for a highly experienced librarian.

Specific salary and benefits are not mentioned, and considering the cost of living in NYC, especially anywhere within an easy commute, this could well be a huge factor.

ArcMap to Google Map Polygons

ArcMap2GMap: download
+ (not yet tested for 9.3)
Samples:
+ Presidential Election Data 2004
+ Hodge Podge Sample of Stuff From My Computer
+ Health Resources

Finally completed a major update for the ArcMap2GMap script that exports ArcMap layers to a standalone Google Map webpage.

This latest update now includes support for choropleth polygon layers using the GPolygon object. Previous versions included support for point and line geometries. The choropleth map is hard-coded to generate 4 equal-interval classes based on the attribute selected by the user, but we do plan to provide more flexibility with this in the future. A rough sketch of the classing is below. I want to whole-heartedly thank my GRA, Shivkumar Chandrashekhar, for all of his assistance with this project.
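As a rough illustration (not the exact code in the script, and the function and color names are made up), the four-class equal-interval classing boils down to something like this:

    # Sketch of the hard-coded four-class equal-interval classification.
    # Attribute values are assumed to already be in a Python list; the
    # real script pulls them from the layer's attribute table.
    def equal_interval_class(value, min_val, max_val, n_classes=4):
        """Return a class index 0..n_classes-1 for a choropleth value."""
        if max_val == min_val:
            return 0
        width = (max_val - min_val) / float(n_classes)
        idx = int((value - min_val) / width)
        return min(idx, n_classes - 1)   # clamp the max value into the top class

    # Example: assign one of four hypothetical fill colors per feature.
    COLORS = ["#ffffcc", "#a1dab4", "#41b6c4", "#225ea8"]
    values = [3.2, 10.5, 55.0, 71.9, 99.1]
    lo, hi = min(values), max(values)
    for v in values:
        print("%6.1f -> %s" % (v, COLORS[equal_interval_class(v, lo, hi)]))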

This version includes all features of previous versions, including:
  • multi-layer support
  • geocoding
  • proximity searching (top 10 closest visible points displayed)
  • driving directions
Students at our university are restricted from registering DLLs, so we could not compile the VBA forms. This means the MXD provided in the download must still be used.

The major issue we needed to resolve to include this polygon support was the complexity of the vertices in a polygon shapefile. Even the simplest polygon shapefiles may have thousands of vertices, which will time out any browser on virtually any computer. We have two point-reduction methods in place to help resolve this.
  1. First, each polygon feature's vertices are filtered through a Douglas-Peucker algorithm. The code for this can be viewed in the DP.py script; an illustrative sketch of the idea follows this list.
  2. Second, after the algorithm is run, each polygon feature class is dissolved using ArcMap's geoprocessing dissolve tool. This effectively removes shared boundaries between features with identical color representation.
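For readers who have not seen the algorithm before, here is a compact sketch; it is not the actual DP.py shipped with the download, just the same idea in miniature.

    # Illustrative Douglas-Peucker sketch (not the actual DP.py shipped
    # with the script): recursively keep only vertices that deviate from
    # the chord between the segment endpoints by more than the tolerance.
    import math

    def point_line_distance(p, a, b):
        """Perpendicular distance from point p to the line through a and b."""
        (px, py), (ax, ay), (bx, by) = p, a, b
        dx, dy = bx - ax, by - ay
        if dx == 0 and dy == 0:
            return math.hypot(px - ax, py - ay)
        return abs(dy * px - dx * py + bx * ay - by * ax) / math.hypot(dx, dy)

    def douglas_peucker(points, tolerance):
        """Return a simplified copy of a list of (x, y) vertices."""
        if len(points) < 3:
            return list(points)
        # Find the vertex farthest from the chord between the endpoints.
        max_dist, index = 0.0, 0
        for i in range(1, len(points) - 1):
            d = point_line_distance(points[i], points[0], points[-1])
            if d > max_dist:
                max_dist, index = d, i
        if max_dist <= tolerance:
            return [points[0], points[-1]]   # everything in between is dropped
        # Otherwise split at the farthest vertex and recurse on both halves.
        left = douglas_peucker(points[:index + 1], tolerance)
        right = douglas_peucker(points[index:], tolerance)
        return left[:-1] + right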
We are holding an open workshop on campus on September 24. If anyone tries out the script and has any comments, please leave them here or if you do not want them on the permanent blog record, leave them in the IM client to the right.

Saturday, August 23, 2008

GIS-Related Dissertations/Theses Late 2007 & 2008


As always, there are oodles of great GIS-related theses and dissertations out there. After taking a long and deep gander at the latest batch, here are the ones that caught my eye. For previous installments, click here.

Summer 08 Class Final Projects: Highlights


Been working my fingers to the bone teaching 9 credits last Spring, 6 credits this Summer, and getting ready to teach 6 more this Fall (next week, gulp!).

This past Summer was exciting and really stood out, however, as many of the final projects from both courses were exceptionally great. I taught Seminar: Advanced GIS Topics for Real Estate Research and Understanding Geographic Information Systems (an intro to GIS course).

Real Estate Project Highlights:
  • Automated Foreclosure Selection Model. This project incorporated MLS and foreclosure listings and conducted a fairly sophisticated comparable market analysis to whittle down the hundreds of available foreclosure listings to a select few worthy of consideration and further investigation. The analysis would have been sufficient for an excellent grade, but they went the extra mile and automated the process with a custom toolbar, geoprocessing models, and some VBA programming.
  • Site Selection (suitability analysis) for a New Mixed-Use Development in Arlington, Texas. Of note here is the clever way they included traffic pattern data in their analysis.
  • Analyzing the Correlation Between Crime and Property Values. What set this project apart was the student's in-depth use of SPSS and linear statistical analysis in conjunction with ArcGIS.
Understanding GIS Project Highlights
  • Estimating Surface Runoff Volume. Wow is all I can say about this one. This project used ArcGIS to calculate the Soil Conservation Service Curve Number for a local area in Fort Worth. Landuse, zoning, aerial images, and city-defined drainage areas were used in this analysis.
  • Site Selection for an Environmentally Friendly Park in Dallas. Completed by a graduate Landscape Architecture student, this project focused on soil runoff, amount of sunlight, and various other parameters necessary to create a Green park.
  • Analyze the Relationship Between Geology and Oil Fields in Texas. This project made extensive use of the Geologic Atlas of Texas to search for common geologic types underlying oil fields. This in itself was a great project, but the student went the extra mile and created a Google Maps web page showcasing the results, which was fantastic.
Like I said, this was a most excellent summer in terms of the quality of student final projects. There were of course many other great projects, but these are the six that stick the most in my head at the moment. ;)

Wednesday, August 20, 2008

Planning Upcoming Workshop: Predicting 2008 Local Voting Results


One of the three GIS workshops we have planned for the Fall 2008 semester is entitled 'Predict 2008 Voting Patterns Across Neighborhoods in Texas'. It is not scheduled until late October, so it is oh so far from being done, but here is how we are planning to go about it.

By Texas neighborhoods, I really mean that each workshop participant will estimate how voters will vote in each Voter Tabulation District (VTD) in Texas. These can then be grouped to form neighborhoods, especially in urban areas.

There will be three parts to the exercise.
  1. Participants will first predict voting by using the previous two presidential election results (2000, 2004) and the previous two gubernatorial results (2002, 2006).
    1. Data source: Texas Legislative Council: Redistricting FTP Site
    2. Participants will build an estimation layer by compiling a weighted average of the four election datasets provided, as well as rates of change between these elections. Everyone will be able to specify which attributes are included in the analysis and what each weight will be. A rough sketch of this step follows the list below.
  2. Participants will then explore the bivariate correlation between various demographic attributes and previous election results, such as income or Hispanic population.
    1. We will use the Linear Regression (bivariate) ArcMap extension, written by Michael Sawada, of the University of Ottawa.
    2. After exploring these tabular relationships, participants can decide whether to use any of these demographics to adjust their estimation layer created in the first step.
  3. Participants will then have the option to enable random occurrences to adjust their estimation layer for them. This one should be a lot of fun, as a random number generator will specify a last-minute political scandal, natural disaster, or economic crisis that will further adjust the estimates.
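To give a flavor of step 1, here is a minimal sketch, assuming per-VTD results for the four past elections have already been joined into one record per VTD; the field names and weights are hypothetical placeholders, and the workshop version will run against the Texas Legislative Council data inside Model Builder.

    # Sketch of the step-1 estimation layer. Per-VTD results for the four
    # past elections are assumed to be joined into one record per VTD;
    # field names and weights below are hypothetical placeholders.
    def estimate_vtd(row, weights):
        """Weighted average of past vote shares plus a crude trend nudge."""
        shares = [row["pres_2000"], row["gov_2002"], row["pres_2004"], row["gov_2006"]]
        base = sum(w * s for w, s in zip(weights, shares)) / float(sum(weights))
        trend = shares[-1] - shares[0]       # crude 2000 -> 2006 rate of change
        return max(0.0, min(100.0, base + 0.5 * trend))

    # Example VTD with made-up numbers; recent elections weighted more heavily.
    vtd = {"pres_2000": 41.0, "gov_2002": 44.5, "pres_2004": 43.0, "gov_2006": 47.5}
    print("Estimated share: %.1f%%" % estimate_vtd(vtd, weights=[1, 1, 2, 2]))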
Of course, this will all be automated using a combination of Python scripting and Model Builder, so participants can concentrate on their research and the enormous potential that GIS lends to this type of analysis.

It is scheduled for one week before election day, so we are hoping this hot topic will be a further draw for students and faculty.

Just for kicks, I created a Google Maps webpage showing % votes for Bush in 2004 and total votes for each candidate by VTD for Tarrant County, TX.

Tuesday, August 19, 2008

Mapping Oak Cliff's Realities, Possibilities


Dallas Morning News Article: Mapping Oak Cliff's Realities, Possibilities (08/16/08)

This recent article spotlights student Charles Jackson's use of GIS in an undergraduate project on the Oak Cliff neighborhood in Dallas, Texas. Charles is a student I have assisted with many GIS projects. For this one in particular, he used GIS to create a Google Map webpage of various resources available in this low-income area. Charles' online project is entitled Oak Cliff Interactive.

Background information about this project and the technology used can be found in a blog entry I wrote last year.

Been a While...

So, it has been a long, long while since my last post here. Why? Ah, oodles of reasons. Most of all, I just kicked a 9-month compulsive WoW habit that impeded and threatened many facets of my work and personal life.

To those of you out there who were negatively affected by my obsessive gaming habits, I sincerely apologize. To everyone, I fully intend at this point to resume writing on this blog and others, as it is a great pleasure to me. I have been quite busy working on numerous GIS projects and teaching various classes, so there is plenty of good stuff to write about.