Showing posts with label SatScan. Show all posts

Tuesday, December 8, 2015

QGIS and FOSS GIS Wishlist for 2016

Happy holidays! I am certainly thankful for QGIS this year as it showed significant improvements to its capabilities and the user experience.  Many other free and open source GIS projects also improved including a major update of GRASS GIS and gvSIG graduating from incubation. In addition, SaTScan continues to get easier to use while providing advanced spatio-temporal statistics. GeoDA has reached nearly 150,000 downloads, and LAStools continues to rock!

Thanks to QGIS and  ALL FOSS GIS Developers!
I wanted to take a moment to talk about my QGIS wishlist for 2016.  In the coming year, I hope to get more involved, and I am aiming to create some plugins. You can check out the QGIS roadmap and submit requests for new features at: http://hub.qgis.org/projects/quantum-gis/roadmap.

What are your wishes for QGIS in 2016?  Feel free to leave them in the comment section below!

Of course, the core QGIS developers are always hard at work, and some of these may not be scheduled for the next release or any future one, but it is always good to dream! These are on the advanced feature end and, although not critical, would be nice to have.

My QGIS wishlist for 2016:
  • Continued commitment to cartography (definitely happening)
  • Full-funding goals reached for crowdsourced QGIS plugins and projects.
  • More maps in the QGIS Flickr Showcase (Do your part!)
  • Continued improvements to the Print Composer
  • Error-free or near-error-free releases of QGIS.
    • I worry as more features are added, more bugs could creep in!
  • Ability to join points to lines - visualizing data by street segments can be extremely cool!
  • More spatial analysis tools integrated directly into QGIS core
    • Might include linear directional mean, standard distance, or others...
  • Ability to create an address locator from reference data
    • Online locators have limitations (number of records that can be (batch) geocoded) and can't be used for confidential data
Some other/non-QGIS wishes
How to contribute
Lastly, there are many ways to contribute to QGIS: http://qgis.org/en/site/getinvolved/index.html. Also, if you use QGIS, whether for school, business, government, or non-profit, please consider a donation!  https://www.qgis.org/en/site/getinvolved/donations.html

Monday, March 30, 2015

FOSS GIS Version Checks - March 2015

Keeping software up-to-date is extremely important, and free and open source (FOSS) GIS software is no exception.  Typically, updates bring fixes, better stability, sometimes performance improvements or security patches, and even new features, some of which can be game changers!  Keep an eye on FOSS GIS websites or subscribe to their e-mail lists to keep up-to-date.

Here's a quick list of a few free and open source GIS programs, related software packages and libraries, and their version numbers.

Desktop GIS
GRASS GIS 7.0.0 LTS
QGIS 2.8.1 Wien LTR
OpenJump 1.8.0

Remote Sensing/Image Processing
Orfeo Toolbox 4.4
Opticks Image Processing 4.12.0

LIDAR
Fusion LIDAR 3.42
FugroViewer 2.0
LAStools (March 2015)

Spatial Analysis
GeoDA 1.6.7
GWR 4.0
SaTScan 9.4.1

Web map development
Leaflet 0.7.3 - an open source JavaScript library for mobile web maps
Open Layers 3.4
GeoServer 2.7.0

QGIS Visual Changelog makes learning about new features a breeze!
Many open source GIS programs have a roadmap, 'wiki', or version log. These exhaustive sources of information can give you the heads-up on when an update will be released and what features the new version may contain.  They also contain lists of bugs, potential fixes, and the progress toward the fix.

Lastly, updates for paid software are also important and sometimes may require you to update your license agreement, depending on when you purchased the software. Additional fees may apply.

Tuesday, February 17, 2015

SaTScan 9.4 released, better than ever!

SaTScan is a program for detecting clusters over space, time, and space-time.  It is available for Windows, Mac OS X, and Linux. SaTScan 9.4 was recently released and it is better than ever!  The data import wizard can now read shapefiles, and a graphing feature has been added to help examine temporal trends. Visit the link for a fuller rundown of the new features.

The Import Wizard now reads shapefiles.
In previous posts, I've covered the types of files you will need and how to aggregate data in preparation for importing it. Since version 9.2, SaTScan has had the ability to export *.kml and *.shp so that the most likely clusters can be viewed in GIS software. (Aside: Google Earth Pro is now free! https://www.google.com/work/mapsearth/products/earthpro.html)

Below is an example looking at clusters of low immunization rates in California from the journal Pediatrics. Free full-text: http://pediatrics.aappublications.org/content/135/2/280.full.pdf+html

In SaTScan, using lat/long coordinates allows users to export to *.kml and *.shp.
Google Earth opens the *.kml automatically when a run is complete.
A few tutorials are being made (http://www.satscan.org/tutorials.html) and sample data is available. Before running an analysis, be sure to read the expertly written user's guide (http://goo.gl/rHg7M6) and the long and varied bibliography of analyses conducted with SaTScan: http://www.satscan.org/references.html

Update #1 (2/20/15)
Scan statistics can also be implemented in R with the SpatialEpi package and rsatscan.

Monday, January 19, 2015

Using R to Prepare a Case File for SatScan

SaTScan requires several different types of files for analysis: 1) a case file with a column for the geographic unit, the day, month or year (see documentation), and the number of cases (you can aggregate the data into any geographic unit, large or small); 2) a geographic coordinate file (cartesian or lat/long) with the name of the unit (e.g. census tract) and the x and y of its centroid; and 3) a population file with the estimated population of each unit over the time period, by year.

In this post, I will describe creating a case file using code in R.  The goal is to create a sum of homicides by month, year (just 2013 for this example), and police beat/post.  We won't worry about any other specifics (e.g. degree) or related types of crimes (e.g. shootings).

To ready yourself for data preparation, read Richard Block's tutorial or the more extensive SatScan manual.

I use crime data from Chicago's Open Data Portal.  The same code can be applied to other types of data, such as health data.  A few key points: 1) the file is victim-based, and we want to convert it into incidents; 2) not every post has a homicide; and 3) the reference post list contains 275 posts.  So, we will end up with a data set with 3300 rows (275 x 12 months), or simply a row for each post-month.

If you want to skip ahead and just look at the code, go to: http://goo.gl/pmOi1u.


At the top: What you start with.  Bottom: After processing in R

Overview of Steps: See the code for further details

Step #1: Two files are imported: 1) a victim-based file of all crimes, which is narrowed down to just homicides (you could also add in shootings) and 2) a 'reference' file or simply a list of the police beats/posts in Chicago.

Step #2:  The data are summed up so that each row contains the total number of victims, then grouped again into incidents by using two different count variables.

Step #3: The list of police beats gets a column variable for each month in the year and is then expanded by reshaping the data from wide to long.  This serves as a 'reference list' for matching purposes.

Step #4:  The two data sets are matched, and the 'unmatched' records are also kept.  These are post-months that don't have a homicide, so each count value is replaced with a zero.

Step #5: To ensure the code has worked, I check the total number of rows (3300) and spot check various posts to make sure the data have been grouped into incidents and posts correctly.
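
The steps above can be sketched in a few lines. This is a minimal illustration in Python rather than the R used in the linked code, and it is not the author's script: the records, post numbers, and the function name build_case_file are all made up for the example, and the output column order (post, count, date) and the YYYY/MM date format are assumptions to check against the SaTScan user guide.

```python
from collections import defaultdict
from itertools import product

YEAR = 2013

# Hypothetical victim-based records: (case_id, post, month).
# Case "A1" has two victims but is a single incident.
victims = [
    ("A1", "0111", 1),
    ("A1", "0111", 1),
    ("B2", "0111", 3),
    ("C3", "0222", 3),
]

# Stand-in for the 275-post reference list.
reference_posts = ["0111", "0222", "0333"]


def build_case_file(victims, posts, months=range(1, 13)):
    """Return one (post, incident_count, date) row per post-month."""
    # Step 2: collapse victims into incidents by keeping distinct case ids.
    incidents = defaultdict(set)
    for case_id, post, month in victims:
        incidents[(post, month)].add(case_id)

    # Steps 3-4: expand the reference list to every post-month and
    # fill in zero where no homicide was matched.
    rows = []
    for post, month in product(posts, months):
        count = len(incidents.get((post, month), ()))
        rows.append((post, count, f"{YEAR}/{month:02d}"))
    return rows


rows = build_case_file(victims, reference_posts)

# Step 5: one row per post-month (3 posts x 12 months here; 275 x 12 = 3300 in the post).
print(len(rows))
```

Counting distinct case ids is what turns victims into incidents; counting rows instead would give the number of victims.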

Whether in R or using for-fee software (e.g. SAS, STATA), preparing data for SaTScan is relatively straightforward, but there are a number of steps.

Update #1 (2/18/15)
Scan statistics can also be implemented in R with the SpatialEpi package and rsatscan.

Wednesday, January 7, 2015

FGBASE: Fast Grid-Based Spatial Data Mining

FGBASE is new open source software for using scan statistics on gridded data.  Unlike SaTScan, FGBASE currently runs only on Mac OS X (10.6, 10.7, and 10.8), not on Windows, and its source code can be downloaded here: http://www.fgbase.org/download-fgbase/.  The software was specifically created for environmental epidemiology but has potential applications in any field of study concerned with finding clusters.

Analyzing aggregate data, using either software package, helps to speed up the computationally intensive calculations for finding spatial, temporal, or spatiotemporal clusters.
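
As a rough illustration of what aggregating to a grid means, the sketch below counts points per square cell. It is only an assumption about the general idea, not how FGBASE or SaTScan do it internally; the function name, coordinates, and cell size are made up.

```python
from collections import Counter
import math


def aggregate_to_grid(points, cell_size):
    """Count points per square grid cell, keyed by (column, row) index."""
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical case locations in arbitrary map units.
points = [(0.2, 0.3), (0.8, 0.1), (1.5, 0.4), (1.6, 1.9)]
cells = aggregate_to_grid(points, cell_size=1.0)
print(dict(cells))
```

The cluster search then works on a handful of cell counts instead of every individual point.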

Comparison of FGBASE and SaTScan


                      FGBASE      SaTScan
Operating system(s)   Mac OS X    Windows, Linux, Mac OS X
Open source code      Yes         No
Geographic output     In app      New: Export to KML or SHP
Sample data sets      Yes, 1      Yes, several
Documentation         TBD         Extensive
Publications          1           Extensive, hundreds

Although FGBASE comes with some sample data (available at: http://www.fgbase.org/user-data/), the program was only recently released.  Aside: The data set is different from the one used in the published paper, so you will notice differences when looking at your screen.  Details on which data sets you will need and how they are structured are available at: http://www.fgbase.org/user-data/.

Clusters can be examined using a data-driven approach that answers the question: where are the clusters?  Or, a hypothesis-driven approach can be used: are there clusters relative to one or more sources of exposure, where entities (factories, etc.) may be responsible for the clustering of cases?

A stock screenshot of FGBASE. Source: IJHG
I downloaded and installed FGBASE.  I will check back in with more impressions in a few months. Adding documentation, with a tutorial, or even a short YouTube video could greatly aid users.  I also plan to blog about getting data into SatScan and interpreting results later in the year.  Since FGBASE's source code is public, hopefully this will speed further development of the program and aid troubleshooting.

Read more at the International Journal of Health Geographics:
http://www.ij-healthgeographics.com/content/pdf/1476-072X-13-46.pdf

See also:
Treescan
R: Spatial Epi Package
There is also an experimental SaTSViz plugin in QGIS, but I have not had a chance to look at it.

Sunday, May 5, 2013

Space-Time Cluster Analysis with SatScan

For more information: Visit the latest post on SatScan: http://opensourcegisblog.blogspot.com/2015/02/satscan-94-released-better-than-ever.html

Original post
Numerous basic and advanced techniques exist for finding spatial and temporal clusters.  Searching for clusters has broad applications for any field of scientific inquiry!

Unlike spatial models in other free and paid software, SatScan's statistics allow for several probability distributions, including Poisson (count data and rates) and binomial, to name two.  There is also the ability to treat data as continuous.  You won't find an easier way to do this than with SatScan!

SatScan is a free program but requires several steps to get data into it for analysis.  For most analyses you will need three files in a text delimited format -- without column headers (such as variable names).

The three files: 1) a case file with a column for the geographic unit, the day, month or year (see documentation), and the number of cases (you can aggregate the data into any geographic unit, large or small); 2) a geographic coordinate file (cartesian or lat/long) with the name of the unit (e.g. census tract) and the x and y of its centroid; and 3) a population file with the estimated population of each unit over the time period, by year.
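
To make the file layout concrete, here is a minimal Python sketch that writes three space-delimited text files with no header row. All values are made up, and the column order in each file, the date format, and the .cas/.geo/.pop file names are assumptions; check them against the SatScan manual before using anything like this.

```python
import csv
import os
import tempfile


def write_satscan_file(path, rows):
    """Write rows as space-delimited text with no header line."""
    with open(path, "w", newline="") as f:
        csv.writer(f, delimiter=" ").writerows(rows)


# Hypothetical data for two geographic units.
cases = [("0111", 1, "2013/01"), ("0222", 0, "2013/01")]          # unit, cases, date
coordinates = [("0111", 41.88, -87.63), ("0222", 41.79, -87.60)]  # unit, lat, long
population = [("0111", 2013, 12000), ("0222", 2013, 9500)]        # unit, year, population

out_dir = tempfile.mkdtemp()
for name, rows in [
    ("homicides.cas", cases),
    ("posts.geo", coordinates),
    ("posts.pop", population),
]:
    write_satscan_file(os.path.join(out_dir, name), rows)

print(out_dir)
```

The unit identifier must be spelled identically in all three files, since that is how the records are linked.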

After this slightly painful process, which, once learned, can easily be repeated, one can perform complex spatial analysis and adjust key parameters such as the population at risk and the maximum size of the cluster.  Time units are important, and you will have to make key decisions as to how long a cluster may take to develop, depending on the problem of interest.

SatScan can look for purely spatial, purely temporal, and space-time clusters, as well as spatial variation in temporal trends.  SatScan uses 'scan' statistics, with a scanning window (in space) or cylinder (in space-time), to find and differentiate potential clusters.
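
To give a feel for the scanning window, here is a simplified Python sketch of a purely spatial scan using the Poisson log likelihood ratio. It is an illustration under stated assumptions, not SatScan's implementation: the data and function names are made up, it only looks for high-rate clusters in circular windows, and it skips the Monte Carlo step needed to get a p-value.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for a window with more cases than expected (else 0)."""
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    cases_out = total_cases - cases_in
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / (total_cases - expected_in))
    return llr


def most_likely_cluster(units, max_pop_share=0.5):
    """units: list of (name, x, y, cases, population). Returns (llr, member names)."""
    total_cases = sum(u[3] for u in units)
    total_pop = sum(u[4] for u in units)
    best = (0.0, [])
    for center in units:
        # Grow a circular window by adding units in order of distance from the center.
        by_distance = sorted(
            units, key=lambda u: math.hypot(u[1] - center[1], u[2] - center[2])
        )
        cases_in = pop_in = 0
        members = []
        for name, _, _, cases, pop in by_distance:
            cases_in += cases
            pop_in += pop
            if pop_in > max_pop_share * total_pop:
                break  # window would exceed the maximum cluster size
            members.append(name)
            expected = total_cases * pop_in / total_pop
            llr = poisson_llr(cases_in, expected, total_cases)
            if llr > best[0]:
                best = (llr, list(members))
    return best


# Hypothetical centroids: units A and B are neighbors with high case counts.
units = [
    ("A", 0, 0, 10, 100),
    ("B", 1, 0, 9, 100),
    ("C", 5, 5, 1, 100),
    ("D", 6, 5, 2, 100),
    ("E", 10, 0, 1, 100),
]
llr, members = most_likely_cluster(units)
print(members, round(llr, 2))
```

The max_pop_share argument plays the role of the maximum cluster size setting mentioned above: windows stop growing once they would hold more than that share of the population at risk.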

SatScan's output includes *.txt and/or *.dbf files of the results and clusters.  The *.gis file can be joined to the shapefile of the geographic units you are using to show risks and different clusters.  This part is straightforward and less painful.  You will need to take your time selecting parameters and interpreting results!

Two good articles to read are: 1)  Block's Tutorial and Review   and 2) Visual Analytics of Space-Time Statistics.  The SatScan manual on its website also has a great list of references.

Additional Article:
http://medicine.plosjournals.org/archive/1549-1676/2/3/pdf/10.1371_journal.pmed.0020059-L.pdf