Wednesday, February 28, 2007

ACRL Poster Presentation: ArcIMS, Google Maps, & Information Literacy

As I wrote a couple of days ago, Gretchen Trkay, the Information Literacy Librarian, and I needed to get the handouts in this week for our ACRL poster presentation, OneBook Meets Google Maps: Engaging Freshmen in English Composition Library Instruction.

As you can see from the 4 handouts below, our presentation focuses on how we integrated ArcIMS 9.1 services with Google Maps to develop a web application for freshman English Composition courses. Here is the web application: Mapping the Afghan Population of the US.

Well, we submitted our four handouts this afternoon, and it feels good to get those documents complete and shipped out.

Here are the 4 handouts:
I am really looking forward to presenting the results of our collaboration in Baltimore next month. I have done quite a number of conference presentations, but never a poster presentation before. Should be fun.

Monday, February 26, 2007

Free Batch Geocoding: Juice Analytics Excel Geocoding Tool

As Meg Stewart stated in the GIS @ Vassar blog, "Geocoding in ArcGIS is not pleasant. Success rates hit a high of about 70 percent, in my experience." We all feel your pain, especially those of us in education who must explain this geocoding unpleasantness at least once a week.

The GIS @ Vassar post points out the fantastic BatchGeocode.com, which makes up a major component of my Powerful Geospatial Suite of Free GIS.

However, lately I have been directing students to download and use the Juice Analytics: Excel Geocoding Tool. I first learned of this tool via Ogle Earth last year, but for whatever reason, it is only lately that I have really begun appreciating it.

The Excel Geocoding Tool is a downloadable Excel file with super-easy-to-use macros built into it, so of course you need to enable macros to use this tool. There are two worksheets. The first worksheet prompts you to choose which geocoding service to use (Yahoo! API or geocoder.us). Yeah...who's going to use geocoder.us when compared to the Yahoo! Maps API? The second sheet has fields for address, city, state, and zip. Paste tens, hundreds, or thousands (I have never done more than 2,000) of addresses into this sheet, click Geocode All Rows, and voila. At lightning speed compared to BatchGeocode.com, the lat/long of each address appears in the first two columns. BatchGeocode.com's 500-address limit is nowhere to be seen with this tool. Another advantage is that this Excel tool provides the geocoding precision level for each address, which is truly, truly wonderful. Not nearly as powerful as ArcMap's geocoding score, but for each geocoded address, this tool specifies whether the supplied lat/long matches the specific address, the street, the zip code, or the city. Just like BatchGeocode.com, the Excel Geocoding Tool can even create KML files.
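For those who prefer scripting to spreadsheets, the same batch pattern is easy to sketch in Python. To be clear, this is a hypothetical sketch of the pattern, not the tool's actual code; the geocode() function and file names are stand-ins for whatever geocoding service and files you prefer.

import csv

def geocode(address):
    # Hypothetical stand-in for a geocoding web service call.
    # Returns (lat, lon, precision), where precision is one of
    # "address", "street", "zip", or "city".
    raise NotImplementedError

reader = csv.reader(open("addresses.csv"))
writer = csv.writer(open("geocoded.csv", "wb"))
for fields in reader:  # each row: address, city, state, zip
    lat, lon, precision = geocode(", ".join(fields))
    writer.writerow([lat, lon, precision] + fields)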

I introduced my Intro to GIS class to the Excel Geocoding Tool last week, and as you can imagine, it was a big hit.

Sunday, February 25, 2007

March 2007 Conference Presentations

If anyone wants to drop by and say hello, I will be holding the following two conference presentations during March:

Texas Map Society 2007 Spring Meeting
Location: Nacogdoches, TX
Dates: March 23-24
Presentation: Historical Maps & GIS: Peeling the Cartographic Layers
I will be discussing our library's current endeavors to georeference and digitize historical Texas maps and distribute the maps and vector data via Google Maps and Google Earth.

ACRL 13th National Conference
Location: Baltimore, MD
Dates: March 29 - April 1
Poster Presentation: OneBook Meets Google Maps: Engaging Freshmen in English Composition Library Instruction (co-presented with Gretchen Trkay)
Our poster presentation will highlight our integration of GIS and demographic data into library instruction sessions for 15 freshman composition courses last semester. We developed the following web application, Mapping the Afghan Experience in the US. Click here for more information about this project.

I will post more details about these presentations as the dates get closer. Like most folks, I tend to wait until the last possible moment. We actually need to submit electronic copies of all handouts for the ACRL poster presentation this week, so I will post more details within the next two days...

ArcMap2GMap for ArcGIS 9.0, 9.1, and 9.2

As I just posted earlier, I have devised band-aid solutions that allow all of my scripts to operate in ArcGIS 9.whatever, but I must create separate scripts for each version.


Just created ArcMap2GMap scripts for ArcGIS 9.2 and 9.0, in addition to 9.1. Click here to download from ArcScripts.

Native Support for Geoprocessor in ArcGIS 9.2...Doh!

OK, so I finally re-installed ArcGIS 9.2 after my rash initial installation went awry. Now I am experiencing first-hand that all of the Python scripts I developed for ArcGIS 9.0 & 9.1 do not work in ArcGIS 9.2.

Why? As the ESRI documentation states here, "At ArcGIS 9.2, there is native Python support for geoprocessing scripting."

What does this mean in practical terms? Replace the COM connection code at the tippety-top of your scripts.


Replace:

import win32com.client
gp = win32com.client.Dispatch("esriGeoprocessing.GpDispatch.1")

With:

import arcgisscripting
gp = arcgisscripting.create()

Now, how to make a single script that will work with either ArcGIS 9.1 or 9.2? I really do not know. The only method that I am aware of to detect the version number is to pull that info from the registry. For me, that is way, way not worth it. Until a nice solution comes along, I will create different scripts for different versions.
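That said, here is one idea I have not yet tried, so treat it as an untested sketch rather than a working solution: skip version detection entirely by attempting the 9.2 import first and falling back to the COM dispatch on 9.0/9.1.

# Untested sketch: prefer the 9.2 native module, fall back to COM
try:
    import arcgisscripting
    gp = arcgisscripting.create()
except ImportError:
    import win32com.client
    gp = win32com.client.Dispatch("esriGeoprocessing.GpDispatch.1")

If arcgisscripting only ships with 9.2, the ImportError branch should land older installations on the COM geoprocessor without any registry spelunking.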

While I am discussing Python scripting differences, there is still an unresolved issue (for me) using the SearchCursor to access the geometry object for points between 9.0 and 9.1/9.2. I sent a query a while back to the ArcView-L list, but no one was able to help, which is a shame, because the last time I posted a scripting query (about constructing multipart polygons and inner circles) I received a fantastic answer within a couple of days from Nathan Warmer (ESRI).

Anyway, here is a snippet that highlights the difference:

rows = gp.SearchCursor(inputFC)
row = rows.Next()
# For each row
while row:
    feat = row.shape
    LUArray = feat.GetPart()
    # The following line is required for 9.1 & 9.2
    pnt = LUArray
    # The following 2 lines must be removed for 9.1 & 9.2; they are essential for 9.0
    # LUArray.Reset()
    # pnt = LUArray.Next()
    row = rows.Next()
...

I do see that it is no longer necessary to store point features in object arrays, but I have been unable to devise a solution that would work seamlessly across all ArcGIS versions.
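If I ever revisit this, one possibility (again, untested) would be to duck-type whatever GetPart() returns instead of branching on the ArcGIS version: probe for the object array's Next method, and otherwise treat the result as the point itself.

feat = row.shape
part = feat.GetPart()
# Untested sketch: 9.0 hands back an object array here, while
# 9.1/9.2 hand back the point directly
if hasattr(part, "Next"):
    part.Reset()
    pnt = part.Next()
else:
    pnt = part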

Anyway, just thought I would post this difference here while I had it in my mind.

Friday, February 23, 2007

Plan the Route: Arlington TX Largest City Without Public Transit

"Arlington [Texas] remains the largest metropolitan city in the country without a public transit system for the general population."
City of Arlington, 2001

So, I live, work, and play in the largest city in the United States without public transportation, eh? Now, to be honest, I bike everywhere on my Trek 7100, but not everyone is fortunate enough to live so close to life's many amenities and everyday destinations.

The great upside to this is that I have worked with oodles of students from our City & Regional Planning GIS program wanting to plan public transit routes through Arlington, TX. Working with these students over the past few years has given me a real appreciation of how ArcGIS's Spatial Analyst and, lately (since 9.1), Network Analyst extensions can be used. One of the students I helped devise a method for implementing a bus routing algorithm in ArcMap actually won an award for his work. Very exciting. Of course, his project was planning a bus route through our fair city of Arlington.

So, I thought I'd try my hand at holding a GIS workshop where participants use GIS to plan a new bus route through Arlington, TX. Now, the primary purpose of my workshops is not to demonstrate advanced analysis techniques to GIS students, but to increase awareness of and interest in GIS and the library's services and collections. This means that in order to hold such a workshop, I needed to automate everything. A student (or faculty member) who beforehand would have difficulty spelling GIS needs to be able to create a bus route within 1 to 1.5 hours.

So I fired up my Pywin32 IDE and just about finished the Python geoprocessing yesterday. Everything is completely, 100%, automated. Today at noon, a couple of my colleagues here in the library agreed to meet with me to do some quick usability testing.

As soon as the testing is complete, and I put the finishing touches on the user interface, I will post all of the code here.

I divided the analysis into two forms. The forms are simple VBA forms that launch the scripts and pass the parameters. First, workshop participants develop their cost matrix using the following intuitive criteria: number of commuters, number of single mothers, income, and road class. Second, participants select 4 categories of bus stops, such as airports, colleges, grocery stores, etc. They can then assign specific stops within these categories or allow the GIS to pick a specific stop for them. A cost raster (also an output of the first step) is used to select the stop(s) for the participant. Then, because I could not help myself, the Python code calls ArcMap2GMap and plots the bus stops and the derived route on a Google Map that they can take with them. Works like a charm. The only downside is that the second step takes over 3 minutes to process, so I will need to put on all of my charm to keep everyone engaged and entertained while we wait for the process to finish. The results are totally worth it, though, as the default browser window opens automatically with Google Maps displaying their results.
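For the curious, the heart of the second step is a standard Spatial Analyst least-cost-path sequence. Here is a rough sketch of the idea only; the file paths are hypothetical, and the tool parameters are from memory of the 9.1 toolbox, so double-check them against the ESRI documentation.

import win32com.client
gp = win32com.client.Dispatch("esriGeoprocessing.GpDispatch.1")
gp.CheckOutExtension("spatial")

# Hypothetical inputs: the cost raster built in step one from the
# participant's weighted criteria, plus two chosen bus stops
costRaster = "C:/transit/cost"
startStop = "C:/transit/stop1.shp"
endStop = "C:/transit/stop2.shp"

# Accumulated-cost and backlink surfaces out from the first stop...
gp.CostDistance_sa(startStop, costRaster, "C:/transit/costdist", "#", "C:/transit/backlink")
# ...then trace the least-cost route back from the second stop
gp.CostPath_sa(endStop, "C:/transit/costdist", "C:/transit/backlink", "C:/transit/route")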

Then, last week, Dr. Ardeshir Anjomani, coordinator of our City & Regional Planning department's GIS program, agreed to give a brief introduction to the topic and to highlight various student public transit projects that have come out of his department. It's going to be fantastic...

If anyone is in the area, here are the workshop details:

Title: Public Transportation in Arlington: Plan the Route
Introduction: Dr. Ardeshir Anjomani, coordinator of SUPA's City & Regional Planning GIS Program, will give a brief introduction and highlight various student endeavors to plan public transit routes.
Time: Thursday, March 1, 2007, 3pm - 5pm
Location: Central Library, Room B20 (basement)
Description: Learn how to use Geographic Information Systems to plan a public transportation (bus) route through Arlington, TX.
Flyer: http://gis.uta.edu/arlingtonTransit.pdf

Wednesday, February 14, 2007

TIGER Positional Accuracy & Geocoding: 2 Articles

The following two recent articles focusing on positional accuracy and geocoding are very intriguing, and I learned quite a bit about the science behind both. I usually advise our students to use our local parcel boundary feature class for geocoding (as I discuss here), but I'll focus first on these articles.
  • Song, Wenbo; Haithcoat, Timothy L.; Keller, James M. "A Snake-based Approach for TIGER Road Data Conflation." Cartography and Geographic Information Science 33(4), October 2006, pp. 287-298. [abstract]

  • Zimmerman, Dale L.; Fang, Xiangming; Mazumdar, Soumya; Rushton, Gerard. "Modeling the probability distribution of positional errors incurred by residential address geocoding." International Journal of Health Geographics 2007, 6:1. [full-text]
The first article describes a new conflation method for improving the positional accuracy of TIGER files. As described in the article, conflation is the combining of attribute-rich TIGER data with positionally superior local data. Traditional conflation methods include (1) feature matching, (2) map alignment (rubber-sheeting), and (3) attribute transfers, and they traditionally operate between TIGER and other vector files. The article's new method is based on snakes. Snakes [background information] are active (dynamic) contour models based on image or 3D data. See Mark Schulze's nifty Java tool to learn more about snakes. This new method extends conflation to operate between TIGER data and snakes derived from raster orthophotos. The article gave me fantastic insight into the creation of the improved vector street files we enjoy here at the university as a result of the hard work of local government analysts (and their contractors).

The second article presents the results of comparing three geocoding methods: (1) Automated Geocodes, (2) E911 Geocodes, (3) Use of an Orthophoto. The automated geocoding was accomplished using TIGER data and ArcGIS 9.1. The E911 geocoding used the local 911 listing. The orthophoto method overlaid parcel boundaries over an orthophoto to enhance the E911 geocode. As expected, the orthophoto method was the most accurate, and was termed the "gold standard". The real beauty of this article is the analysis of the error measurements and their attempts to model the errors.

So, how does this impact how I do my job? I reckon the best course is to continue advising students to geocode against our local government-produced parcel shapefile or cadastral data. And if I were not fortunate enough to work in the GIS-rich DFW Metroplex, with access to these high-quality files? I guess I would then try to work with local faculty and governments on conflation techniques to improve the accuracy of TIGER data.

FYI, here is a good resource to scan the recent table of contents of the major GIS-related journals.

Monday, February 12, 2007

Python Script: Shape to Text to Shape

Except for ironing out a few remaining buglets and cleaning up the code, I have just about completed a script that will convert a shapefile of any geometry into a text file, and then will convert that text file back into a shapefile. (I previously wrote about the development of this script here and here.)

Script configured for ArcGIS 9.1: Download
This will not work for ArcGIS 9.0 and has not yet been tested for 9.2. For a narrative of my ArcGIS 9.2 woes, go here.

Purpose

This script has two potential uses: edit an existing shapefile, or create a new shapefile by authoring the text format directly.

Description

There are actually two scripts, shp2exch.py and exch2shp.py, which are launched independently of each other.

I initially developed the script so that faculty in our Earth & Environmental Science department can disassemble shapefiles, apply tectonic rotation formulas to the resulting text files, and then reassemble the shapefiles.
  • shp2txt converts points, polylines (including multipart), and polygons (including multipart and inner circles) to a text file using an exchange format. The text file can be edited by hand or of course another application can be developed to edit these text files.
  • txt2shp converts an existing text file in the exchange format to a shapefile.
Instructions

Download and unzip the contents. Open ArcMap or ArcCatalog 9.1. View the ArcToolbox pane. Right-click on ArcToolbox and select Add Toolbox. Browse to the directory where you unzipped the scripts, and click once on exchanger, and click Open.

Now you can expand the exchanger toolbox to launch the two scripts.

There are also two executable files that launch forms external to ArcGIS. However, ArcGIS still needs to be installed on the system, and everything runs slower. You can give it a shot, but I advise using the toolbox within ArcGIS.

The Exchange Format

The text file uses a CSV (comma-separated values) file structure. I used a .csv extension so that the file can be launched directly into Excel by double-clicking.

The text file uses an exchange format with the following structure:
  • First line contains all of the attribute (field) names in the shapefile, plus xLatitude, xLongitude, and geometry
  • Second line contains the field type of the attributes
  • The third line through the last line contain the X and Y values for each vertex.
    • In addition, each new feature, part, or inner circle is prefaced with a line beginning with NEW and the attributes for that feature or part.
Here is an example of a text file representing a U.S. states polygon shapefile. Note the multipart polygon of Hawaii.
Here is an example of a text file representing freeway lines in North Central Texas.
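Since those example files live off-blog, here is a tiny, entirely hypothetical fragment to give a feel for the structure (the field values and layout are illustrative only; the example files linked above are authoritative):

STATE_NAME,xLatitude,xLongitude,geometry
String,Double,Double,polygon
NEW,Hawaii
20.2,-155.8
19.1,-155.9
19.4,-154.8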

Known Issues
  1. Coordinate values are preserved because the XY units are carried in the text file. However, I have been having trouble preserving the complete projection information, so the resulting shapefile's projection is not defined.
  2. Currently only works with shapefiles, and not geodatabase feature classes. This is due to poor planning, as the script currently relies on the FID field and does not look for an OBJECTID field. (See the sketch after this list.)
  3. Probably does not work with ArcGIS 9.2. After last week's troubles, I do not expect to install and fix any 9.2 bugs within the next couple of weeks.
  4. Code is currently a bit sloppily written. Needs to be cleaned up a bit.
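Issue 2, at least, should be a straightforward fix: rather than hard-coding FID, ask the geoprocessor's Describe object for the object ID field name. Something along these lines (untested):

# Untested sketch: let Describe supply the OID field name so the same
# code handles shapefiles (FID) and geodatabase feature classes (OBJECTID)
desc = gp.Describe(inputFC)
oidField = desc.OIDFieldName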

Saturday, February 10, 2007

Ouch: ArcIMS 9.1 & ArcGIS Desktop 9.2

What a day I had a couple of days ago. Quite unproductive, as I spent the entire day correcting a mistake. My mistake, of course. The bottom line is that ArcIMS 9.1 and ArcGIS Desktop 9.2 do not mix. Hear me now and believe me later.

This was one of those days that leave your head spinning and asking yourself what the heck is your place in this world. What the heck did I accomplish today? What the heck do I do every day?

Here is what happened.

Our campus really takes advantage of our ESRI site license, and over the last 3 years more and more departments have requested that ArcGIS be installed in their labs. Including the library, there are now 16 labs, spread all over campus, that maintain the ArcGIS software. We have made it a priority that these labs all run the same version, rather than having the latest version in only some labs. Why? Compatibility and consistency. So, OIT (campus computing services) finally completed their testing of ArcGIS 9.2 on our license server, and I was (literally) first in line to get my copy to install on my desktop and my staff's machines as well.

Now, this will give away the ending to my little narrative, but bear in mind that our GIS server is running ArcSDE 9.1 & ArcIMS 9.1:
  • 9am : Begin uninstallation of 9.1 from my desktop
  • 945am : Begin installation of 9.2 on my desktop
  • 1030am : Success! Next on my to-do list is to tweak one of our ArcIMS services.
  • 1035am : Fire up ArcMap 9.2 and connect to the MXD file running on the server. The ArcIMS service I needed to update is an MXD-based ArcMap Image Service.
  • 1115am : Complete updates to MXD file
  • 1120am : Launch ArcIMS Administrator on the server and restart the service. ArcIMS Administrator does not crash, but hangs there longer than normal with that hourglass that never seems to run out of sand.
  • 1130am : ArcIMS Administrator finally threw up an error message that told me (as hindsight knew it would) that the version of the MXD file is not compatible. Woops. Big woops. Should have seen that one coming. And, of course, the service was removed which means that our applications depending on that service are also down. That is no good.
  • 1135am : Realize that all I need to do is to resave the MXD file as 9.1 compatible, restart the service, and all should be OK.
  • 1136am : Remote into my laptop in the GIS lab, fire up ArcMap 9.1, and attempt to open the MXD file. Not going to happen, as I last saved it using 9.2. Not compatible.
  • 12pm : Run upstairs for my 12-2pm reference desk duty, feeling a tad bit frustrated. Was told that the reference staff had planned a meeting at 2pm and would be so grateful if I could work the ref desk from 2-4pm today instead. I race back downstairs at top speed.
  • 1205pm : Scan ESRI Forums for methods to save a 9.1 compatible MXD file from 9.2. I found nothing. Explore the properties in ArcCatalog...nothing helpful. Explore the Save As dialog in ArcMap...nothing.
  • 1230pm : Begin recreating the MXD file remotely using my laptop in the lab, switching back and forth between ArcMap 9.2 and ArcMap 9.1 to recreate all 28 layers, symbology and all, in my laptop's copy of 9.1.
  • 1pm : Finish recreating MXD file.
  • 105pm : Remote to server, launch ArcIMS Administrator and recreate the service. Success! Even completed my tweaks.
  • 110pm : Notice that the automated geoprocessing scripts (most notably the script running the Friday Night Hangout processing) have been crashing. Good grief. They too are not compatible. I will need to troubleshoot and update the code to 9.2, but after the morning I had, today was not going to be the day. This too I should have foreseen. Ah well. The Friday Night Hangout site is one of the most popular services that students play around with, even though it is not a course requirement, so I should still get that ironed out ASAP.
  • 130pm : Begin uninstallation of ArcGIS Desktop 9.2 from my desktop
  • 2pm : Run upstairs for my 2-4pm reference desk duty
  • 345pm : Leave ref desk early. Begin installation of ArcGIS Desktop 9.1
  • 350pm : Run to business library for meeting with management faculty
  • 430pm : Installation complete. Check the geoprocessing scripts and all is well with my world again.
Went home and my wife asked me how my day was, and I told her.......

Friday, February 09, 2007

Value of Library-Based GIS Program

I want to jot down some of my ideas here about assessing and perhaps even proving the value of providing GIS services in an academic library.

Why Discuss This
... Where the money is allocated reflects our values
... How the money is allocated creates our future

Ever since a colleague of mine attended the ARL 2006 Library Assessment Conference, she cannot stop telling everyone what a fantastic speaker Chancellor John Lombardi (University of Massachusetts-Amherst) was and how powerful his message was. Lombardi's dilemma is whether to allocate the 50 million dollars necessary to make the needed physical renovations to the library, or to convert the space into a 26-story study hall. (See libraryassessment.info for more background on Lombardi's presentation.) Chuck the books into an off-site storage facility and go library-less like the University of Phoenix, eh?

This really set the library community (and our library in particular) abuzz, forcing us non-assessment-focused librarians to ask ourselves:
What is the value of the academic library?

As a GIS Librarian, I want to turn this valuation lens upon my own little corner and ask:
What is the value of GIS services in an academic library?
How do we measure this and report these measurements as proof of value to the library administration?

But let us get down to nuts and bolts. Assessment really serves 3 purposes, and these are my purposes:
  1. Evaluate how effectively we are providing GIS services
  2. Prove/describe current value of GIS services
  3. Prove/describe potential future value
What are library-based GIS services?

The goal of libraries is to bring together technology, services, and collections in one central location, and GIS services mirror this perfectly.

GIS Librarians seek to bring together in one place (1) geospatial technology, (2) instruction/assistance services, and of course (3) data collections. The strength of a library's GIS program can be analyzed by examining the strengths/weaknesses in these three areas. For example, at UT Arlington our strengths are definitely in service and technology. We are extremely aware of this imbalance and are slowly but surely correcting it. (Now, I will never admit to this. If you ask me about this weakness I will claim someone hijacked my blog and put this sentence there.)

What is the value of these library-based GIS services?

As with all assessment in libraries these days, it is oh so difficult to measure outcomes and impacts. We have been, and continue to be, measuring inputs, processes, and outputs (see here for an informal example), which are easy to measure. However, the strong emphasis on outcomes and impacts has made us all rethink how we do assessment.

So, how the heck can I place a value on GIS services? What can I measure? Even though I keep throwing out the term GIS services, I really mean technology, services, and collections. Of course there is some overlap, but I think it best to approach these three components individually.
  • Geospatial Technology
    • Expensive. GIS software is expensive. Thank goodness, on our campus the library is not responsible for paying for the major ESRI and Leica Geosystems applications, but we do provide numerous smaller GIS applications and extensions. The hardware is also expensive. Low-end or hand-me-down computers are not adequate. Geospatial analysis requires high-end processors (dual core?), sufficient memory, video cards, etc. Add on GPS receivers, tablets, and any other hardware the library wants to provide.
    • Customized configurations. It is very time-consuming to create the images for these computers, as the ideal computer is loaded with numerous software packages. See here for an example of what I mean.
    • So, how do we measure outcomes and impacts of these extra hardware costs, software costs, and servicing costs? Here is an idea:
      • For the GIS labs with high-end computers, how about setting hardware capability benchmarks on the library's regular PCs and then counting the number of times that students surpass these benchmarks on the high-end computers? This would demonstrate the value of the extra hardware expense.
        • Here is how this could work. Install the GIS software on a regular library PC and try various geospatial procedures using different-sized files, looking for the breaking point. For example, bring in various large raster images (hundreds of megs to a gig) and see when performance suffers to the point where analysis is no longer possible. Another example is to bring in TINs of varying sizes. The hard part then would be devising a way to count how many times these benchmarks are surpassed. The easiest way is simply to record a notch every time a staff member is aware of it.
      • Another method is to compare the hardware your library provides to comparable institutions and institutions that you aspire to. These comparable institutions do not necessarily need to be library-based, but may be department-run GIS labs as well.
  • Geospatial Instruction/Assistance Services
    • Also expensive. Staff salaries are not cheap. It would be difficult to hire an entry-level GIS Librarian at a standard $35,000 salary. If hired at such a salary, I imagine it would be difficult to keep her/him.
    • Continued professional development/training is vital in this field. The ESRI Virtual Campus training included with a site license is wonderful, but the times I have attended live training have always been fantastic and well worth the money. Training such as this is not cheap, though. And not only training: the time spent playing around is also expensive, but oh so necessary. When I go a couple of weeks without a few hours to explore some new facet of GIS, I begin to feel out of touch.
    • So, how do we measure the outcomes and impacts of these GIS services? It is my belief that a library must provide a high level of geospatial instruction/assistance, as service is one of the three pillars of a library. But how to prove it? Here are some ideas:
      • In this case the measurable benchmark would be learning outcomes as devised by the GIS staff. Learning outcomes can be devised for class instruction or even for 1-on-1 interactions.
        • For an example of a basic GIS instruction session, see my first attempt.
        • An example of a 1-on-1 learning outcome might be that the student will understand and/or be able to duplicate the content of the instruction.
        • These learning outcomes would then be measured using print and online surveys that classes fill out after instruction and that students may fill out sometime after a 1-on-1 session. Perhaps a raffle could be held each semester to encourage students to reflect on 1-on-1 assistance in an online survey.
      • Ask GIS users (including faculty, staff, and students) what made them decide to use GIS in their research. This would generate extremely valuable outcome data as I know off-hand that more and more students and faculty are exposed to and using GIS as a direct result of my workshops and word of mouth.
      • Count repeat GIS instruction requests by faculty for their courses. Last Fall 06 semester, I was invited to provide GIS instruction to over 40 classes on campus. This semester looks to be close to that number, with most classes repeating from last semester or last spring. This is valuable data.
      • Count acknowledgments in theses/dissertations/papers/books
  • Data Collections
    • To assess data holdings, I am most interested in how well our collections align with the data needs of our students and faculty. The connections between a library's data holdings and possible impacts on student or faculty research are too tenuous to nail down.
    • How can I measure this? To be honest, I am having the toughest time coming up with adequate measurement tools for the data collections.
      • Examine faculty vitae to ensure that our geospatial data holdings meet their research interests.
      • Examine syllabi.
      • Usage statistics. Count the number of times particular datasets are accessed.
      • Compare holdings to comparable institutions.
      • Count the number of times library staff are faced with a user whose data needs are not met by our collections.
Now, do not let this fool you. I am not currently implementing all of these measures. Now that I have taken the time to jot these things down, I will share them with others and finalize an implementation plan for this summer. OK, now I can get back to something a bit more enjoyable. ;)