
Wednesday, January 7, 2015

FGBASE: Fast Grid-Based Spatial Data Mining

FGBASE is a new open-source program for running scan statistics on gridded data.  Unlike SaTScan, which runs on Windows, Linux, and Mac OS X, FGBASE currently runs only on Mac OS X (10.6, 10.7, and 10.8).  Also unlike SaTScan, its source code is available for download here: http://www.fgbase.org/download-fgbase/.  The software was created specifically for environmental epidemiology but has potential applications in any field of study concerned with finding clusters.

With either software package, analyzing aggregated data speeds up the computationally intensive calculations needed to find spatial, temporal, or spatiotemporal clusters.
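
To see why aggregation helps, here is a minimal sketch in base R of binning point locations into grid cells. The simulated points and the 10 x 10 grid are my own assumptions for illustration; this is not FGBASE's code or its sample data.

```r
set.seed(1)

# Simulated case locations on a unit square (made-up data for illustration).
cases <- data.frame(x = runif(500), y = runif(500))

# Assign every point to a cell of a 10 x 10 grid.
n_cells <- 10
breaks  <- seq(0, 1, length.out = n_cells + 1)
col_id  <- cut(cases$x, breaks, include.lowest = TRUE, labels = FALSE)
row_id  <- cut(cases$y, breaks, include.lowest = TRUE, labels = FALSE)

# Count cases per cell; empty cells are kept by fixing the factor levels.
grid_counts <- table(
  factor(row_id, levels = 1:n_cells),
  factor(col_id, levels = 1:n_cells)
)

# A cluster search can now work with 100 cell counts instead of 500 points.
dim(grid_counts)
sum(grid_counts)
```

The number of cells stays fixed as the number of cases grows, which is where the speed-up comes from.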

Comparison of FGBASE and SaTScan


|  | FGBASE | SaTScan |
| --- | --- | --- |
| Operating system(s) | Mac OS X | Windows, Linux, Mac OS X |
| Open source code | Yes | No |
| Geographic output | In app | New: Export to KML or SHP |
| Sample data sets | Yes, 1 | Yes, several |
| Documentation | TBD | Extensive |
| Publications | 1 | Extensive, hundreds |
           
FGBASE was only recently released, but it already comes with a sample data set (available at: http://www.fgbase.org/user-data/).  Aside: The sample data set is different from the one used in the published paper, so what you see on your screen will differ from the results in the paper.  The same page explains which data sets you will need and how they should be structured.

Clusters can be examined using a data-driven approach, which answers the question: where are the clusters?  Alternatively, a hypothesis-driven approach asks: are there clusters relative to one or more sources of exposure, where entities (factories, etc.) may be responsible for the clustering of cases?
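
The difference between the two approaches can be sketched on a toy grid in base R. Everything here is an assumption made for illustration (the simulated counts, the planted cluster, the 3 x 3 window, the hypothetical source cell, and the 1.5-cell distance cutoff); it is a naive stand-in, not the scan statistic used by FGBASE or SaTScan.

```r
set.seed(42)
n <- 10

# Made-up case counts per grid cell, with extra cases planted in rows 2-4, columns 6-8.
counts <- matrix(rpois(n * n, lambda = 2), nrow = n)
counts[2:4, 6:8] <- counts[2:4, 6:8] + 5

# Data-driven: slide a 3 x 3 window over the grid and keep the window with the most cases.
best_total <- -Inf
best_cell  <- c(NA, NA)
for (i in 1:(n - 2)) {
  for (j in 1:(n - 2)) {
    total <- sum(counts[i:(i + 2), j:(j + 2)])
    if (total > best_total) {
      best_total <- total
      best_cell  <- c(i, j)  # top-left corner of the window
    }
  }
}
best_cell

# Hypothesis-driven: compare cells near a hypothetical source with all other cells.
source_row <- 3  # hypothetical factory location (row)
source_col <- 7  # hypothetical factory location (column)
dist_to_source <- sqrt((row(counts) - source_row)^2 + (col(counts) - source_col)^2)
near <- dist_to_source <= 1.5
c(near = mean(counts[near]), far = mean(counts[!near]))
```

The first part searches everywhere without a prior guess; the second part only asks whether counts are higher close to a location chosen in advance.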

A stock screenshot of FGBASE. Source: IJHG
I downloaded and installed FGBASE and will check back in with more impressions in a few months.  Documentation with a tutorial, or even a short YouTube video, would greatly aid users.  I also plan to blog about getting data into SaTScan and interpreting results later in the year.  Since FGBASE's source code is public, this will hopefully speed further development of the program and aid troubleshooting.

Read more at the International Journal of Health Geographics:
http://www.ij-healthgeographics.com/content/pdf/1476-072X-13-46.pdf

See also:
TreeScan
R: SpatialEpi Package
There is also an experimental SaTSViz plugin in QGIS, but I have not had a chance to look at it.

Thursday, December 11, 2014

R GeoNames API

R has a package (geonames) for connecting to the GeoNames API.  If you are not familiar with GeoNames, be sure to check out this previous post with a few examples.  The documentation for the package can be found at: http://cran.r-project.org/web/packages/geonames/geonames.pdf

Step #1: Getting a Free Account
You will need to visit http://www.geonames.org/login to create a free account.  Note: GeoNames has both free and premium services.

Step #2: Activating the Account
After receiving the confirmation e-mail, log in and click the activation link.  Look midway down or towards the bottom of the page.  It is easy to miss!  If you do not activate the account, you will receive an error message in R such as '401 Unauthorized'.

Step #3: Installing and Loading the Package in R
The simplest way to install a package in R is to go to the toolbar at the top, select "Install Packages", choose a download source or simply select "OK", and then scroll down until you find "geonames" in all lowercase letters.

  • After the package installs, you will also have to Load Package from the same toolbar.  
  • Loading the package from the toolbar will have to be done each session--unless you write a short bit of code to do the same automatically.
  • Also make sure you have admin privileges on the computer you are working on.
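
The "short bit of code" mentioned above could look something like the sketch below. The helper name ensure_package is hypothetical (it is not part of R or of geonames), and the CRAN mirror URL is just one common choice.

```r
# Hypothetical helper: install a package from CRAN only if it is missing,
# then load it for the current session.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  library(pkg, character.only = TRUE)
}
```

Putting ensure_package("geonames") at the top of a script replaces both toolbar steps, so nothing has to be clicked at the start of each session.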


Step #4: Connecting to the API
In R, you will have to write two lines setting "options" to access the API.  Simply replace "your username" with your own username!  You will also have to set the host, which is currently api.geonames.org.  Please note that some older documentation and websites still list an older host address.  It is also possible, although not likely, that the host will change again in the future; if it does, the new address would be listed here.

options(geonamesUsername="your username")
options(geonamesHost="api.geonames.org")
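
If you would rather not type your username into every script, one option is to read it from an environment variable. This is only a sketch: the function name set_geonames_options and the variable name GEONAMES_USERNAME are my own inventions, not something the geonames package defines.

```r
# Hypothetical helper: set both geonames options in one call.
# GEONAMES_USERNAME is an assumed environment variable name.
set_geonames_options <- function(username = Sys.getenv("GEONAMES_USERNAME")) {
  if (!nzchar(username)) {
    stop("No GeoNames username found; set GEONAMES_USERNAME or pass one in.")
  }
  options(geonamesUsername = username, geonamesHost = "api.geonames.org")
  invisible(username)
}
```

Calling set_geonames_options() once per session has the same effect as the two options() lines above, and it stops with a clear message if the username is missing.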

Step #5: Test Your Connection
In R, simply type:

source(system.file("tests","testing.R",package="geonames"),echo=TRUE)

If everything is set up correctly, R will pause for a few seconds and return geographic data like that in the screenshot below.

Running the code above, you can test your connection.
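
If the connection fails (for example with the '401 Unauthorized' error from Step #2), it can help to catch the error rather than let a script stop. Below is a minimal sketch using base R's tryCatch; the wrapper name try_geonames is hypothetical.

```r
# Hypothetical helper: run any GeoNames call and report failures instead of stopping.
# f is the function to call; ... are the arguments passed on to it.
try_geonames <- function(f, ...) {
  tryCatch(
    f(...),
    error = function(e) {
      message("GeoNames request failed: ", conditionMessage(e))
      NULL  # signal failure to the caller without raising an error
    }
  )
}
```

For example, try_geonames(geonames::GNsearch, q = "london", maxRows = 1) should return a data frame when the account is working and NULL, with a message, when it is not.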

Step #6: GeoNames Structure
Remember that geographic data can have different hierarchies (see Place Hierarchy Webservices) and can be accessed in different ways.  Be sure to read through the GeoNames and R package documentation to be certain you are getting your desired result.  The geonames package provides a number of functions, each wrapping a GeoNames web service.

Examples of functions in the package:

  • GNcities() - returns cities within a bounding box
  • GNearthquakes() - returns recent earthquakes
  • ...and many more!
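
As a rough sketch of how the two functions listed above are called: both take a bounding box given as north, east, south, and west. The wrapper fetch_box and the box coordinates are my own assumptions for illustration; check the package documentation for the full argument lists.

```r
# Illustrative bounding box in decimal degrees; the numbers are my own choice.
bbox <- list(north = 40, east = -75, south = 37, west = -78)

# Hypothetical wrapper: query cities and recent earthquakes for one box.
# Requires the geonames package plus the options from Step #4.
fetch_box <- function(bbox, max_rows = 5) {
  stopifnot(bbox$north > bbox$south)
  list(
    cities = geonames::GNcities(
      north = bbox$north, east = bbox$east,
      south = bbox$south, west = bbox$west,
      maxRows = max_rows
    ),
    earthquakes = geonames::GNearthquakes(
      north = bbox$north, east = bbox$east,
      south = bbox$south, west = bbox$west,
      maxRows = max_rows
    )
  )
}
```

Calling fetch_box(bbox) should return a list of two data frames, one per web service.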

Step #7: Getting Geographic Information from Wikipedia, Saving...
One function in the package allows you to search Wikipedia articles and retrieve geographic information (e.g. lat, long, elevation).  For example, the following code looks for up to 10 articles matching "oriole" and stores them in an R data frame named results:

results<-GNwikipediaSearch("oriole", maxRows = 10)

Example of results, with rank and geographic information, lat, long, and elevation.
You can also type "summary(results)" (without the quotes; remember 'results' is the name of the data set) to view a summary of all of the variables.  The data can also be exported from R, for example with write.csv().  Keep in mind that you can use R to perform exact or fuzzy merging with a data set of place names you already have.
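
Here is a minimal sketch of the merging and exporting ideas using base R only. To keep it runnable without an account, the results data frame below is a hand-typed stand-in with approximate coordinates, not real API output, and my_places is a made-up list of place names. The fuzzy match uses adist() (edit distance) and simply takes the closest title for each place.

```r
# Hand-typed stand-in for GNwikipediaSearch() output (approximate values, illustration only).
results <- data.frame(
  title = c("Oriole Park at Camden Yards", "Oriole, Maryland", "Oriole Beach, Florida"),
  lat   = c(39.28, 38.17, 30.37),
  lng   = c(-76.62, -75.83, -87.09),
  stringsAsFactors = FALSE
)

# Made-up list of place names with slightly different spellings.
my_places <- data.frame(
  place  = c("Oriole Park at Camden Yard", "oriole beach florida"),
  visits = c(12, 3),
  stringsAsFactors = FALSE
)

# Exact merge: only rows whose names match character for character are kept.
exact <- merge(my_places, results, by.x = "place", by.y = "title")

# Fuzzy merge: edit distance between every place and every title, keep the closest title.
d     <- adist(my_places$place, results$title, ignore.case = TRUE)
best  <- apply(d, 1, which.min)
fuzzy <- data.frame(
  my_places,
  results[best, ],
  distance = d[cbind(seq_along(best), best)],
  row.names = NULL
)

# Export the matched table to a CSV file in a temporary folder.
out_file <- file.path(tempdir(), "matched_places.csv")
write.csv(fuzzy, out_file, row.names = FALSE)
```

The distance column is worth checking by eye: taking the closest title always returns some match, even when nothing in the results is a sensible one.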