Wednesday, May 03, 2006

DataFerrett & TheDataWeb

I had not used DataFerrett for quite some time when a student came by yesterday to ask about it. We extracted variables from the 'Loan Application Register Dataset' of the Home Mortgage Disclosure Act in SPSS format. The student will return this weekend to analyze these variables geospatially in GIS. Thank goodness I remembered enough about it to help the student, but I want to take this moment to jot down some notes.

If you share my insatiable thirst for a never-ending supply of datasets, DataFerrett is a must-have. As an aside, one of my students this semester emailed this website to me, suggesting that I join: http://www.dataaddictsanonymous.com. What a sense of humor, eh? Pushah!

Anyway, DataFerrett is a fantastic free data mining tool (a downloadable Java application) developed by the U.S. Census Bureau and the Centers for Disease Control and Prevention. It is part of TheDataWeb, a "network of online data libraries that the DataFerrett application access the data through." You can run complex queries against large datasets and export the results as spreadsheet, SAS, SPSS, Stata, or delimited text files. SQL queries and tables can be generated within DataFerrett, as well as charts, graphs, and maps. Now, with access to ArcGIS we do not need the mapping capabilities within DataFerrett, but it is very cool that they are there.
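
Since delimited text is one of the export options, here is a minimal sketch of what a first pass over such an extract might look like in Python before it goes into GIS. The sample rows are made up, and the column names (tract, loan_amount) are my own placeholders, not the headers DataFerrett actually writes, so adjust them to match your file.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a delimited-text extract; the column
# names (tract, loan_amount) are hypothetical, not DataFerrett's own.
SAMPLE_EXTRACT = """tract,loan_amount
0101.00,120
0101.00,95
0102.00,210
"""


def total_by_tract(handle, key="tract", value="loan_amount"):
    """Sum a numeric column per census tract from a delimited file."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[row[key]] += float(row[value])
    return dict(totals)


totals = total_by_tract(io.StringIO(SAMPLE_EXTRACT))
for tract, amount in sorted(totals.items()):
    print(tract, amount)
```

With a real extract you would pass an open file handle instead of the in-memory sample, and the per-tract totals could then be joined to a tract layer in ArcGIS.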

Want to test drive the latest DataFerrett? Download BetaDataFerrett.

Here is a list of data topics available: http://www.thedataweb.org/topics.html. The complete Census 2000 Summary File 3 dataset (long form, sampled data) down to the Census Tract level is available through DataFerrett. If you are using American FactFinder to extract Census data, DataFerrett will give you a way to circumvent FactFinder's restrictions, such as the limit of 7,000 geographies per query.
