Sunday, February 21, 2016

Spatial Analysis with GeoDa: Part II - Importing Data and Tools

GeoDa opens as a "floating bar," which you will find nice once you start doing analysis and realize that multiple linked windows can be arranged around it.  The maps and graphs are interactive: as I'll show in later posts, selecting features in one window will highlight the same features in the other windows.

When I learn a new piece of software, I always go from left to right.  
File Menu
The "File" menu allows you to import data, save and load projects (self-named *.gda files), and export selected data. In addition, there is a nice Project Information option that tells the title, data source and type, project name, number of observations and fields.

Data Formats
Users can import a wide array of file formats: shapefile, SQLite/SpatiaLite, .csv, .xls, .dbf, .json, .gml, .kml, and MapInfo files.  Remember, we are analyzing vector data, so points, lines, and polygons. Also remember that map projections matter, since some spatial weights are created based on distance!

GeoDa does a great job of offering multiple file types to import.
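
To see why the projection point matters for distance-based weights, here is a minimal Python sketch (not GeoDa code). It compares a naive distance computed directly on longitude/latitude degrees with a great-circle distance in kilometers. The function names, the sample points, and the 6371 km Earth radius are my own assumptions for illustration.

```python
import math

def degree_distance(p, q):
    """Naive Euclidean distance computed directly on (lon, lat) degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lon, lat) points.

    radius_km is an assumed mean Earth radius.
    """
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two hypothetical pairs of points, each one degree of longitude apart.
equator_pair = ((0.0, 0.0), (1.0, 0.0))
north_pair = ((0.0, 60.0), (1.0, 60.0))

for name, (p, q) in (("equator", equator_pair), ("60N", north_pair)):
    print(name, degree_distance(p, q), round(haversine_km(p, q), 1))
```

In degrees both pairs look equally far apart, but on the ground the pair at 60 degrees north is only about half as far apart as the pair on the equator. A distance threshold set on unprojected coordinates would treat them the same.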
Tools Menu: Spatial Weights
Spatial weights are used to model spatial relationships. Using GeoDa, we can create spatial weights based on contiguity/bordering (think chess moves: rook or queen), distance, and the number of nearest neighbors.  Imagine a grid or matrix that has a row and a column for every feature.  The cells are populated with 0/1 for weights based on contiguity (1 where a feature borders another, 0 otherwise) or with distances for distance-based weights.
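
The matrix idea can be sketched in a few lines of plain Python. This is only an illustration on a regular grid of cells, not how GeoDa builds weights from real polygons. The function name and the row-by-row cell numbering are assumptions of mine.

```python
def contiguity_matrix(n_rows, n_cols, rule="queen"):
    """Return an N x N list-of-lists of 0/1 weights for a regular grid.

    Cells are numbered row by row starting at 0. "rook" counts shared
    edges only; "queen" also counts shared corners.
    """
    if rule not in ("rook", "queen"):
        raise ValueError("rule must be 'rook' or 'queen'")
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbor
                    if rule == "rook" and dr != 0 and dc != 0:
                        continue  # rook ignores corner-only contact
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i][rr * n_cols + cc] = 1
    return w

rook = contiguity_matrix(3, 3, "rook")
queen = contiguity_matrix(3, 3, "queen")
center = 4  # middle cell of the 3 x 3 grid
print(sum(rook[center]), sum(queen[center]))  # prints: 4 8
```

Each row of the matrix describes one feature: the row sum is that feature's number of neighbors. The center cell has four neighbors under the rook rule and eight under the queen rule.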

Tips:
  • Generally, do not go above 2nd order of contiguity: 1st order contiguity is neighbors, 2nd order is neighbors of neighbors.  Anything beyond this becomes extremely difficult to interpret.
  • The GeoDa Center also has PySAL, an open source Python library that can be used to create spatial weights and perform spatial analysis.
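
To make "neighbors of neighbors" concrete, here is a small plain-Python sketch. The five areas form a made-up chain (A borders B, B borders C, and so on), and the function name and the include_first flag are my own, not GeoDa or PySAL options.

```python
# Hypothetical 1st order contiguity: five areas in a row, A-B-C-D-E.
first_order = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

def second_order(neighbors, include_first=False):
    """Return the 2nd order neighbors (neighbors of neighbors) of each area.

    With include_first=True the 1st order neighbors are kept as well.
    """
    result = {}
    for area, first in neighbors.items():
        reach = set()
        for nb in first:
            reach |= neighbors[nb]  # everything my neighbors touch
        reach.discard(area)         # an area is not its own neighbor
        if include_first:
            reach |= first
        else:
            reach -= first
        result[area] = reach
    return result

print(sorted(second_order(first_order)["C"]))  # prints: ['A', 'E']
```

Even in this tiny chain, the 2nd order set for C already reaches both ends of the study area, which hints at why higher orders get hard to interpret.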
The first option is "Select," for when you have already created weights.  The second option is "Create."  Here you will find a couple of options to examine spatial relationships in your data.  Which one you choose should be based on the phenomenon you are studying. Like other types of analysis, you will also want to examine how different spatial weights affect your results.
Connectivity Histogram
Another one of GeoDa's cool features is a histogram that shows the number of features with a specific number of neighbors.  It can also help you clear up any questions you have about different types of contiguity and how spatial relationships are modeled.

On the histogram at right, the bar/bin for two neighbors is selected.
On the map at left, the corresponding county is highlighted. Selecting other bars would highlight more features.
Users can also see the distribution of the spatial weights from the histogram.
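
The logic behind the connectivity histogram is simple enough to sketch in plain Python. The neighbor lists below are made up for illustration (including one "island" with no neighbors), and the function name is my own.

```python
from collections import Counter

# Hypothetical neighbor lists for five areas; "E" is an island.
neighbors = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
    "E": set(),
}

def connectivity_histogram(neighbors):
    """Count how many areas have 0, 1, 2, ... neighbors."""
    return Counter(len(nbs) for nbs in neighbors.values())

hist = connectivity_histogram(neighbors)
for k in sorted(hist):
    print(f"{k} neighbors: {'#' * hist[k]} ({hist[k]})")

# "Selecting a bar" amounts to picking the areas with that neighbor count.
two_neighbor_areas = sorted(a for a, nbs in neighbors.items() if len(nbs) == 2)
print(two_neighbor_areas)  # prints: ['A', 'D']
```

A bar at zero neighbors is worth a second look, since islands and other disconnected features can cause trouble for contiguity-based weights.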
Shape
In case you have tabular data, you can create points from this menu. You can also create a bounding box or grid.  Next time, we'll look at the Table and Map toolbars.

Want blog or YouTube updates?  You can follow me @jontheepi: https://twitter.com/jontheepi

Wednesday, February 3, 2016

Spatial Analysis with GeoDa: Part I - Introduction

GeoDa (https://geodacenter.asu.edu/software) is a free and open source cross-platform program for exploratory (spatial) data analysis, or EDA/ESDA, and maximum likelihood spatial regression. It has been downloaded nearly 150,000 times and is available on Windows, OS X, and Linux.  ASU's GeoDa Center is home to Luc Anselin, known for, e.g., Anselin's local Moran's I, a local indicator of spatial association or LISA.

Update #1: It looks like an older version of GeoDa's source code is available (circa 2014) but not more current versions: https://code.google.com/archive/p/geoda/source

Why use GeoDa?
You are interested in spatial analysis of vector data (points, lines, polygons) and statistics.  This includes looking for clusters in count or rate data (features with similar attribute values) and performing regression (asking why a certain pattern exists), along with examining observed/predicted values, residuals, and diagnostics. Spatial statistics are commonly used in many fields including health, criminology, and pretty much everything!

If you are using GIS for a problem, at some point, you should consider spatial statistics.  The human brain and eye can only see so much.  Some patterns aren't easily apparent.

Spatial analysis can come at a cost ($), and this is why GeoDa is so great!  It is free, open source, and has great capabilities.  It even includes some advanced options which you can't currently find in ArcGIS.

Features
GeoDa includes the ability to make choropleth maps, graphs, and Thiessen polygons; create spatial weights using queen and rook contiguity (which requires a high level license in ArcGIS); graph features by number of neighbors; 'brush' linked graphs; and run LISAs and regression. We will dive deeper into features later--there is a lot to cover.

A list of GeoDa's features can be found at: https://geodacenter.asu.edu/general-features.  Also, here is a list of its modeling features: https://geodacenter.asu.edu/node/397.

Examples of Use
In 2014, I wrote about a simple use case: examining health insurance rates at the county-level:
http://opensourcegisblog.blogspot.com/2014/04/exploring-health-insurance-estimates-by.html.

More to come...
This is the first part in a series that explores GeoDa's functions and spatial statistics. If there is something you would like to see, leave it in the comment section below.
