Difference between revisions of "Locality Georeferencing and Generation"

From xBio:D Wiki

Revision as of 19:24, 1 July 2015

Introduction

This section describes the procedures used to find coordinates for collecting localities, known as georeferencing, and the manner in which to format locality names for maximal utility within DEA2 and the xBio:D system. Other locality name formatting and georeferencing methodologies may be used, but a locality name must be unique to a collecting locality. For example, Columbus is not an acceptable locality name, since it may mean Columbus, Ohio; Columbus, Georgia; or any number of separate towns. Place names are a strictly controlled vocabulary to maintain consistency, and county/municipality, state/province, and country/territory names must match the existing name within xBio:D, which may be checked from HOL.

The steps below give a broad overview of producing new localities but are brief. Please refer to this georeferencing guide, File:Georef Guid OSUC.docx, for more detailed information on the xBio:D preferred georeferencing protocol.

Helpful Resources

DEA 1 http://hymfiles.biosci.ohio-state.edu/DEA/index.html
Google http://www.google.com/
DEA help http://osuc.osu.edu/osucWiki/Data_Entry_Assistant_%28DEA%29_Help
Gazetteers http://osuc.osu.edu/osucWiki/Geographic_Resources_/_Gazetteers
Procedures http://osuc.osu.edu/osucWiki/Data_Entry_Assistant_%28DEA%29_Procedures
Statoids http://www.statoids.com/
State/Province codes http://osuc.osu.edu/osucWiki/State_/_Province_Codes_for_Countries
Google Translate http://translate.google.com
Google Earth
Township information http://www.earthpoint.us/TownshipsSearchByDescription.aspx


General Georeferencing Information

After setting localities and extracting the processed file, the Excel spreadsheet produced by DEA2 contains two new worksheets: Main and Localities. The Localities worksheet holds all of the specimen records that were previously skipped for not having a matching locality name in DEA2. For these specimen records, new locality names must be created along with their corresponding locality information. This information includes geopolitical units that contain the collecting locality, properly formed locality name text, and WGS 84 coordinate information.

If more than one specimen record has the same locality name, only one record needs to contain the geopolitical hierarchy and the coordinate information, but do not forget to copy the locality name to all of the records (see below).

Localities worksheet


Place Name Identification and Lookup

If the current locality is in the United States, USGS-GNIS is probably the best method for verifying the authenticity of the name, locating the county in which the locality resides, and obtaining the coordinates. Outside of the U.S., many countries have their own gazetteers (see Geographic Resources / Gazetteers), but in general, GEOnet is the best way to locate locality information. If a place name may be misspelled, use Fuzzy Gazetteer, or simply type the place name and the country into Google. Often Google will suggest the correct place name.

  • Note: If a place name on the label does not match the correct name from a gazetteer, make a note of it in the locality_comments column of the Localities worksheet. Record any other discrepancies, concerns, and/or rationales in the locality_comments column as well.


Identifying the Geopolitical Hierarchy

When the locality information is gathered for a locality, the geopolitical hierarchy must also be obtained. The reference for political divisions is the Statoids web site; alternatively, type the parent division (e.g. the country) into Hymenoptera Online and browse the subordinate divisions for the accepted spelling. Regardless of how the geographic division is spelled in GEOnet or on the label, use the spelling from Statoids or Hymenoptera Online. The political division type is given as its English translation unless the division type is in a Romance language (French, Spanish, etc.), in which case it is kept as is (e.g. Indonesian Propinsi -> Province, but Ecuadorian Provincia -> Provincia). The Statoids site also provides the history of the divisions and alternate divisions, which helps in determining the current political division for the locality. If a locality is not unequivocally located within a given division, do not use the uncertain division in the geopolitical hierarchy. After obtaining the locality coordinates, they may be entered into Google Earth to find the current political division for the point (available for some countries). Only towns or their equivalent can be used in the place column; townships and similar 3rd level divisions should therefore not be part of the geographic hierarchy.


Locality Name Creation

[Town / National Park / Reserve], [coordinates on label], [elevation], [field code], [habitat], [generalized locality term], [further locality information], [geopolitical hierarchy]


The format for newly created locality names is given above; follow this convention when creating a locality name. If a feature or place has an English equivalent, use the English equivalent as long as the qualifier is not part of the formal name (e.g. Parque Nacional Henri Pittier -> Henri Pittier National Park, Cerro de la Equis -> Equis Hill, Wadi Saluki -> Wadi Saluki (no English equivalent for wadi)). Place names with qualifiers that do not occur as the first element of a locality name may have their qualifier name abbreviated (e.g. Kruger National Park, South Africa; Skukuza, Kruger N.P., Mpumalanga Prov., South Africa). Also, if the 1st level geographic divisions for a country have standardized abbreviations as defined in State / Province Codes for Countries, use the state-level abbreviations. In the case of United States localities, omit United States from the locality name and leave the state code at the end. Political division types should be abbreviated if an appropriate abbreviation is available (e.g. Province -> Prov., State -> St., Município -> Mpio., etc.). If the specimen data includes coordinates, include those coordinates in the locality name. As a general rule, always consult locality names that are already in DEA to use as a template for a new locality name. Many new locality names are merely slight variations of existing localities present in DEA.
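The element order in the template above can be sketched in code. This is only an illustration, not part of DEA2: the class `LocalityParts`, its field names, and the function `build_locality_name` are hypothetical, and assigning "sec. 5+7" to the further locality information slot is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LocalityParts:
    """Hypothetical container mirroring the bracketed elements of the template."""
    place: Optional[str] = None              # town / national park / reserve
    label_coordinates: Optional[str] = None  # coordinates as printed on the label
    elevation: Optional[str] = None
    field_code: Optional[str] = None
    habitat: Optional[str] = None
    generalized_term: Optional[str] = None
    further_info: Optional[str] = None
    hierarchy: List[str] = field(default_factory=list)  # smallest division first


def build_locality_name(parts: LocalityParts) -> str:
    """Join the elements that are present, in template order, separated by commas."""
    ordered = [
        parts.place,
        parts.label_coordinates,
        parts.elevation,
        parts.field_code,
        parts.habitat,
        parts.generalized_term,
        parts.further_info,
        *parts.hierarchy,
    ]
    return ", ".join(p.strip() for p in ordered if p and p.strip())


example = LocalityParts(
    place="Defiance Township",
    further_info="sec. 5+7",
    hierarchy=["Defiance Co.", "OH"],
)
print(build_locality_name(example))
```

Elements that are absent are simply skipped, so the same function covers short names such as a park followed only by its country.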


Coordinate / Elevation Information

When the coordinates for a locality are found, fill in the coordinate information. This includes latitude (format: DD MM SS), latitude direction (N or S), longitude (format: DD MM SS), longitude direction (E or W), locality precision (POINT or POLYGON), coordinate source (GEOnet, USGS-GNIS, etc.), elevation (in meters), and max elevation (in meters).

Coordinate information


Latitude & Longitude (columns: lat, lat_dir, long & long_dir)

The latitude and longitude for the locality can come from the label, a geographic gazetteer, the internet, literature, personal communication with collector, or Google Earth. A coordinate column must be in the format DD MM SS where DD are the degrees, MM are the minutes, and SS are the seconds. All of the coordinate parts must be a number which may include decimals. Thus, the coordinate, 16°37.341'N 102°34.467'E, would have the lat column as 16 37.341, the lat_dir column as N, long column as 102 34.467, and the long_dir column as E. The directional columns, lat_dir and long_dir, specify the direction of the coordinates, which can only be N, S or E, W, respectively. Negative values within a coordinate column are forbidden.
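As a minimal sketch of the column split described above, the following splits a label coordinate into the lat, lat_dir, long, and long_dir values. The function names and the regular expression are assumptions made for illustration; they are not part of DEA2, and label formats other than degree/minute/second notation with a trailing direction letter are not handled.

```python
import re

# Hypothetical pattern: degrees, optional minutes, optional seconds, then N/S/E/W.
# A leading minus sign does not match, so negative values are rejected.
_COORD = re.compile(
    r"""^\s*(?P<deg>\d+(?:\.\d+)?)\s*°?\s*
        (?:(?P<min>\d+(?:\.\d+)?)\s*['′]?\s*)?
        (?:(?P<sec>\d+(?:\.\d+)?)\s*["″]?\s*)?
        (?P<dir>[NSEW])\s*$""",
    re.VERBOSE,
)


def split_coordinate(text):
    """Return (column_text, direction) for one coordinate, e.g. ("16 37.341", "N")."""
    m = _COORD.match(text)
    if m is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    parts = [p for p in (m["deg"], m["min"], m["sec"]) if p is not None]
    return " ".join(parts), m["dir"]


def coordinate_columns(lat_text, long_text):
    """Build the four worksheet values from a latitude and a longitude string."""
    lat, lat_dir = split_coordinate(lat_text)
    lon, long_dir = split_coordinate(long_text)
    if lat_dir not in "NS" or long_dir not in "EW":
        raise ValueError("latitude must be N or S and longitude must be E or W")
    return {"lat": lat, "lat_dir": lat_dir, "long": lon, "long_dir": long_dir}


print(coordinate_columns("16°37.341'N", "102°34.467'E"))
```

The numeric parts are kept as text rather than converted to floats, so the digits entered in the worksheet are exactly those on the label.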

Locality Precision (column: loc_prec)

There are two types of localities, points and polygons. A point, POINT, is a locality that has a small margin of error for the specified coordinates, while a polygon, POLYGON, is a locality that has a large margin of error. Generally, a locality is a point if the margin of error can be bounded by an area smaller than a county. The quantifiable bounding area used to determine a point is about 325 sq. mi. (roughly 18 mi. by 18 mi.). Any values defining a specific amount of error for a coordinate are not stored in the database.
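The size threshold can be written as a small check. This is a sketch only: the function name is hypothetical, the error area is assumed to be a rectangle measured in miles, and treating an area exactly at the threshold as a point is an assumption. Note that 18 mi. by 18 mi. is 324 sq. mi., just under the stated figure.

```python
POINT_MAX_AREA_SQ_MI = 325  # bounding area given in the text


def locality_precision(width_mi, height_mi):
    """Return "POINT" or "POLYGON" for a rectangular margin of error in miles."""
    area = width_mi * height_mi
    return "POINT" if area <= POINT_MAX_AREA_SQ_MI else "POLYGON"


print(locality_precision(18, 18))  # 324 sq. mi.
print(locality_precision(40, 40))  # 1600 sq. mi.
```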


Coordinate Source (column: source)

The resource that was used to obtain the coordinates is the coordinate source. If the coordinates are taken directly from a gazetteer, the source is the name of the gazetteer used (e.g. GEOnet, USGS-GNIS, CGNDB, etc.). Some localities are specified relative to a place (e.g. 7mi NW of Columbus). For these, follow a major road in the specified direction for approximately the specified distance (slightly shorter, to allow for error in tracking the road exactly) using the measuring tools available in Google Earth to get a more accurate coordinate. Then, in the locality comments, specify which source the coordinates were derived from (e.g. derived from USGS-GNIS). In some cases coordinates for the locality will be directly on the specimen label. In this instance, use a gazetteer to verify the locality, but use the coordinates on the label and enter label as the source.

Elevations (columns: elevation & max_elevation)

The elevation for a locality is determined only by the elevation given on the specimen label(s). If a single elevation is given, enter that value in meters into the elevation column. If the elevation is given as a range, enter the lower value into the elevation column and the higher value into the max_elevation column. To convert the elevation from feet into meters, simply type this string into Google: x ft to m, where x is the elevation in feet.
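The same conversion and column split can be sketched in code, as an alternative to the Google query. The function name is hypothetical, and rounding to one decimal place is an assumption; the text does not say how converted values should be rounded.

```python
FEET_TO_METERS = 0.3048  # exact definition of the international foot


def elevation_columns(values, unit="m"):
    """Map one label elevation, or a two-value range, to the worksheet columns."""
    if unit not in ("m", "ft"):
        raise ValueError("unit must be 'm' or 'ft'")
    factor = FEET_TO_METERS if unit == "ft" else 1.0
    meters = sorted(round(v * factor, 1) for v in values)
    if len(meters) == 1:
        return {"elevation": meters[0], "max_elevation": None}
    if len(meters) == 2:
        # Lower value goes to elevation, higher value to max_elevation.
        return {"elevation": meters[0], "max_elevation": meters[1]}
    raise ValueError("expected a single elevation or a two-value range")


print(elevation_columns([1000], unit="ft"))
print(elevation_columns([2000, 1500]))
```

Sorting the values means a range written high-to-low on the label still ends up with the lower value in the elevation column.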

Locality Comments

The locality comments column is where any information that helps link the locality to the label is entered. Record any discrepancies, concerns, and/or rationales in the locality_comments column of the Localities worksheet as well, for example if something on the label is spelled wrong, the locality has different names, or the coordinates are derived from a different source.

Examples of comments:
  • derived from [gazetteer]/WWW
  • xxy is a variant name/alternate spelling/short name/misspelling of xxx
  • “[gazetteer] coords. adjusted to place over land” (if gazetteer puts coordinates in middle of body of water)

Georeferencing Tips

  • When to use /, -, commas, or +:
    • / separates habitats and field codes (if there is more than one). Ex: ground litter / cloud forest
    • - means “or” something; it can also show a range of numbers. Ex: oak-pine-soybean; sec. 5-7
    • , separates different elements that are not habitats. Ex: Pine Creek, Knob River, etc.
    • + means “and” something (not often used). Ex: sec. 5+7
  • Township sections should be entered as “sec. x+y” to include two separate adjacent sections, or “sec. x-y” for sections x through y (x and y not adjacent, but collecting may have occurred in intermediate sections). E.g. the label “Defiance Tp. Secs. 5 and 7 Defiance Co. O.” would yield the following locality: Defiance Township, sec. 5+7, Defiance Co., OH
  • If the locality is in the USA it ends with the state abbreviation EX: NM, CA, OH
  • If the locality is not in the USA, it ends with the full spelling of the country name
  • Common abbreviations:
    • “nr” for near
    • “Twp” for township
    • “km” for kilometers
    • “mi” for miles
    • “m” for meters
    • “ft” for feet
  • Place names with qualifiers that do not occur as the first element of a locality name may have their qualifier name abbreviated (e.g. Kruger National Park, South Africa; Skukuza, Kruger N.P., Mpumalanga Prov., South Africa)
    • Recreational Area/Mountains/etc. is abbreviated if not at the beginning of the locality, but is spelled out if at beginning
    • Roads/Avenues/Drives are part of the name of the feature, and therefore should be spelled out.
  • Populated places in USGS-GNIS & GEOnet are Towns
  • If a feature or place has an English equivalent, use the English equivalent as long as the qualifier is not part of the formal name (e.g. Parque Nacional Henri Pittier -> Henri Pittier National Park, Rio Pisque -> Pisque River, Cerro de la Equis -> Equis Hill, Wadi Saluki -> Wadi Saluki (no English equivalent for wadi)).
  • U.S. localities should end with their appropriate state codes (CA, OH, ID, etc.) (not end with United States)
  • Political division types should be abbreviated if an appropriate abbreviation is available (e.g. Province -> Prov., State -> St., Município -> Mpio., etc.).
  • Brazil, Australia, Canada, and Mexico use division abbreviations found at http://osuc.osu.edu/osucWiki/State_/_Province_Codes_for_Countries (from the wiki)
  • Don’t use statoids for places in Japan – Wikipedia is usually better
  • Cities in east Asia mean something very specific (we don’t treat them as towns)
  • Directions from localities should be abbreviated (e.g. N of Papallacta, 4km N Sálakos)
  • Prefecture is state level in Japan
  • East Pakistan is Bangladesh
  • W.T. is Washington Territory and should be entered as WA (Washington)
  • Sylvania Cal is located in Nevada (already in DB)
  • Cols. O is an old abbreviation for Columbus, OH
  • Do not add a space in mile/kilometer markers on highways (e.g. mi318, km58), nor between the number and the unit in a distance and direction from somewhere (e.g. 82mi E, 30km W)
  • Any uncompleted locality should be highlighted, then add a row at the bottom of the file using the same color highlight to describe the problem. E.g. “can't find anything on Lebang Hara, except that it may be somewhere in West Borneo” and also cite any relevant links that may help in fully determining the locality. Uncertainties can also be pointed out using this system.
  • Lowland/ mid-elevation/etc. forest is a qualifier making the term one habitat entity
  • Use the “savannah” spelling of the word (vs. savanna) for localities.
  • Guano, unless specified (e.g. seagull guano), is a habitat rather than association.
  • “Mouse nest” is an association (specific enough, most mice are contained within one genus). “Nest” would be a habitat.
  • Barrenando is formally defined as “drilling”, but can be thought of as burrowing, as in “barrenando en caña de azucar”, which translates to “burrowing in sugar cane”.
  • Georeference labels from Chile collected by Sharkey, labels from Argentina collected by Archangelsky, and labels collected by C. C. King as if the coordinates on the label were not there (these coordinates were added after the fact and are not necessarily accurate).
  • Always abbreviate township to “twp.” for every case.
  • Convention for highway outside of US: for a specific name use “Hwy. xx” e.g. Highway 62/Hwy62 would become Hwy. 62. Use “hwy.” for a non-specific name, e.g. nr. hwy., side of hwy. For Highway in US use:
    • State highway (white circle) Ex: OH-315
    • US highway (white shield) Ex: US-650
    • Interstate (blue shield) Ex: I-62
    • County road (white square), different based on state Ex: Farm-to-Market Road 791 (Texas)
  • “Shores” and “seeps” as in Silver Creek shore/seep should be formatted within a locality as “Shore/Seep of Silver Creek”
  • Near should always be abbreviated as “nr.” no matter where it appears in the locality
  • The junction of two roads is abbreviated as “jct.”. If “Road” appears in the name of either road, it should not be abbreviated.
  • Vicinity should not be abbreviated.
  • E.g. locality is in Alamosa Co. rather than adjacent Costilla Co. or for parks/mountains/ boundary related things locality is contained in Alamosa Co. rather than adjacent Costilla Co.
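
A few of the purely mechanical tips above (near -> nr., junction -> jct., no space between a number and km/mi, no space in highway markers) can be sketched as text substitutions. The function name is hypothetical and the sketch covers only these four rules; the remaining tips depend on judgment or on lookups and are not attempted here.

```python
import re


def normalize_locality_text(text):
    """Apply a handful of the abbreviation and spacing conventions to a string."""
    # "near" is always abbreviated as "nr."
    text = re.sub(r"\bnear\b", "nr.", text, flags=re.IGNORECASE)
    # The junction of two roads is abbreviated as "jct."
    text = re.sub(r"\bjunction\b", "jct.", text, flags=re.IGNORECASE)
    # No space between a distance and its unit: "82 mi E" -> "82mi E"
    text = re.sub(r"(\d+(?:\.\d+)?)\s+(km|mi)\b", r"\1\2", text)
    # No space in highway markers: "km 58" -> "km58"
    text = re.sub(r"\b(km|mi)\s+(\d+)", r"\1\2", text)
    return text


print(normalize_locality_text("near hwy., 82 mi E of km 58"))
```

Output from a helper like this would still need to be checked by eye against existing locality names in DEA.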