Thursday, December 11, 2014

R GeoNames API

R has a package (geonames) for connecting to the GeoNames API. If you are not familiar with GeoNames, be sure to check out this previous post with a few examples. The documentation for the package can be found at: http://cran.r-project.org/web/packages/geonames/geonames.pdf

Step #1: Getting a Free Account
You will need to visit  http://www.geonames.org/login to create a free account.  Note: GeoNames has free and premium services.

Step #2: Activating the Account
After receiving the confirmation e-mail, log in and click the activate link. Look midway or towards the bottom of the page; it is easy to miss! If you do not activate the account, you will receive an error message in R such as '401 Unauthorized'.

Step #3: Installing and Loading the Package in R
The simplest way to install a package in R is to go to the toolbar at the top, select "Install Packages", choose a download source (or simply select "OK"), and then scroll down until you find "geonames" in all lowercase letters.

  • After the package installs, you will also have to Load Package from the same toolbar.  
  • Loading the package from the toolbar will have to be done each session--unless you write a short bit of code to do the same automatically.
  • Also make sure you have admin privileges on the computer you are working on.
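If you would rather skip the toolbar, the same install-and-load step can be written as a short bit of code and run at the start of each session. This is only a minimal sketch: the CRAN mirror URL is an assumption, and any mirror you normally use should work just as well.

```r
# Install geonames from CRAN only if it is not already installed.
# The mirror below is an assumption; substitute your preferred one.
if (!requireNamespace("geonames", quietly = TRUE)) {
  install.packages("geonames", repos = "https://cloud.r-project.org")
}

# Load the package for the current session.
library(geonames)
```

Saving these lines at the top of your script means you do not have to use "Load Package" by hand every time.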


Step #4: Connecting to the API
In R, you will have to write two lines setting "options" to access the API. Simply replace "your username" with your GeoNames username. You will also have to set the host, which is currently api.geonames.org. Please note that some older documentation and websites incorrectly list an older address. It is possible, although not likely, that the host could change in the future; if it does, the new address will be listed here.

options(geonamesUsername="your username")
options(geonamesHost="api.geonames.org")

Step #5: Test Your Connection
In R, simply type:

source(system.file("tests","testing.R",package="geonames"),echo=TRUE)

If everything is set up correctly, R will pause for a few seconds and return geographic data like that in the screenshot below.

Running the code above, you can test your connection.
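If the test script is more than you need, a lighter check is to make one small request and catch any error (such as the '401 Unauthorized' mentioned in Step #2). The helper below is a hypothetical sketch, not part of the geonames package; check_geonames is a name made up for this post, and the bounding box mirrors the example used in the GeoNames web service documentation.

```r
# Hypothetical helper (not part of the geonames package):
# returns TRUE if one small GeoNames request succeeds, FALSE otherwise.
check_geonames <- function() {
  tryCatch({
    # Ask for a single city inside an arbitrary bounding box.
    geonames::GNcities(north = 44.1, south = -9.9,
                       east = -22.4, west = 55.2, maxRows = 1)
    TRUE
  }, error = function(e) {
    # Show the error text (for example an authorization problem).
    message("GeoNames request failed: ", conditionMessage(e))
    FALSE
  })
}
```

After setting the two options from Step #4, calling check_geonames() should return TRUE when the account is active and the host is correct.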

Step #6: GeoNames Structure
Remember that geographic data can have different hierarchies (see Place Hierarchy Webservices) and be accessed in different ways. Be sure to read through the GeoNames and R package documentation to be certain you are getting your desired result. The geonames package provides a number of functions for this.

Examples of package functions:

  • GNcities() - returns cities within a bounding box
  • GNearthquakes() - returns recent earthquakes
  • ...and many more!
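As a rough sketch of how the two functions listed above might be called, the code below passes the same bounding box to each. The coordinates are arbitrary illustration values, and the calls only run if a username has already been set with options() as in Step #4. The geonames:: prefix is used so the snippet works whether or not the package has been loaded.

```r
# Arbitrary bounding box (illustration values only).
north <- 44.1
south <- -9.9
east  <- -22.4
west  <- 55.2

# Only contact the API if a username has been set (see Step #4).
if (!is.null(getOption("geonamesUsername"))) {
  # Up to 5 cities inside the bounding box.
  cities <- geonames::GNcities(north = north, east = east,
                               south = south, west = west, maxRows = 5)
  # Recent earthquakes inside the same bounding box.
  quakes <- geonames::GNearthquakes(north = north, east = east,
                                    south = south, west = west)
  print(head(cities))
  print(head(quakes))
}
```

Check the package documentation for the full argument list of each function before relying on the defaults.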

Step #7: Getting Geographic Information from Wikipedia, Saving...
One function in the package allows you to search Wikipedia articles and retrieve geographic information (lat, long, and elevation). For example, the following code looks for up to 10 articles matching "oriole" and stores them in an R data frame named results:

results<-GNwikipediaSearch("oriole", maxRows = 10)

Example of results, with rank and geographic information, lat, long, and elevation.
You can also type summary(results) to view a summary of all of the variables; remember that 'results' is the name of the data set. The data can also be exported from R. Keep in mind you can use R to perform exact or fuzzy merging with a data set of place names you have.
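Here is one way the summary, export, and merging steps might look using only base R. To keep the sketch runnable without an API call, it uses a small stand-in for results; the column names (title, lat, lng) and the rough coordinates are assumptions for illustration, not actual API output, and my_places is a hypothetical list of your own place names.

```r
# Stand-in for the data frame returned by GNwikipediaSearch().
# Column names and values are illustrative assumptions.
results <- data.frame(
  title = c("Baltimore", "Oriole Park at Camden Yards"),
  lat   = c(39.29, 39.28),
  lng   = c(-76.61, -76.62),
  stringsAsFactors = FALSE
)

# Hypothetical place names you already have (the second is misspelled).
my_places <- data.frame(
  place = c("Baltimore", "Oriole Park at Camdon Yards"),
  stringsAsFactors = FALSE
)

# View a summary of every variable.
print(summary(results))

# Export to a CSV file (written to a temporary folder here).
out_file <- file.path(tempdir(), "geonames_results.csv")
write.csv(results, out_file, row.names = FALSE)

# Exact merge: only names that are identical will match.
exact <- merge(my_places, results, by.x = "place", by.y = "title")

# Fuzzy merge: agrep() finds approximate matches for each name;
# keep the first match, or NA if nothing is close enough.
fuzzy_idx <- vapply(my_places$place, function(p) {
  hit <- agrep(p, results$title, max.distance = 0.1)
  if (length(hit) > 0) hit[1] else NA_integer_
}, integer(1), USE.NAMES = FALSE)
fuzzy <- cbind(my_places, results[fuzzy_idx, c("title", "lat", "lng")])
print(fuzzy)
```

The exact merge keeps only "Baltimore", while the fuzzy version also picks up the misspelled ballpark name. The max.distance value is a judgment call and worth tuning against your own place names.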
