NYS ITS GIS Program Office
Geographic Information Systems Clearinghouse
GIS Data Set Details
|Data Set Name||Description||Theme||Metadata|
|Address Geocoder||2 geocoder web services using NYS Address (Sam) Points and NYS Streets||
- Deciding between Spatial Location and Postal Addresses
- Geocoding Help 101
- Geocoding in ArcMap
- Geocoding in QGIS
NYS GIS Program Office Geocoding Services
Last Updated: 11/27/2017
Geocoding is the use of technology and reference data to return a geographic coordinate when a street address is entered. Geocoding is used when lists of addresses need to be placed on a map, when an address is entered in an application to center a map or return information for that location, or any other time you have an address and need a geographic coordinate. An address is entered either manually or by bulk input from a database or other source. The geocoder then compares the entered address to a set of reference data and returns a coordinate pair and standardized address for each input address it is able to match. The GIS Program Office geocoder uses a series of combinations of reference data and configuration parameters to optimize both the likelihood of a match and the quality of the results. The reference data supporting the geocoder is stored according to the Federal Geographic Data Committee (FGDC) standard.
While there is only one way of storing an authoritative address using the FGDC Address Standard (https://www.fgdc.gov/standards/projects/address-data), addresses entered by users can vary widely. The NYS GIS Program Office has tried to allow as much flexibility as possible in the user input addresses that can result in a successful geocode. We do this by incorporating multiple versions of address reference data. Each combination of reference data and configuration parameters is referred to as a locator. Groups of locators are combined into composites. Geocoding against the separate composite locators in a specific sequence achieves the best result. In order of preference, the reference data includes Street Address Mapping (SAM) address points, street segments, municipalities, and zip codes.
One way that we allow maximum flexibility is to allow multiple city names. The post office has a preferred 'city' name for each address, but we also will evaluate reasonable alternates such as the municipality the point is in (which often differs from the postal 'city' name), as well as other alternate place names. The description of each locator below explains how these alternate names come into play.
Now that the SAM project (see: http://gis.ny.gov/streets/) is complete and the GIS Program Office has address points in every county, we have incorporated these authoritative address points into the composite locators. Since the SAM points are complete statewide, this eliminates the need for the locators to use the old Navteq Address Points as their underlying source data. Since these address locators have been removed, the GIS Program Office has reorganized the two composite locators.
We have also added additional address locators to improve geocoding results. These locators have been optimized for speed and flexibility to return the most accurate and standardized results as often as possible. For example, the locators built using street segments with no address number are designed to return a hit along a street segment. Locators built with street segments can also geocode to intersections (e.g. Lark St & Madison Ave, Albany NY).
The GIS Program Office has found a few limitations of the address locators and has made ESRI aware of them. Here is a list of the limitations and the current workarounds until they are corrected:
- The geocoding service is only supported between 8 am and 4 pm, Monday through Friday. If an issue is encountered during off hours, please notify the GIS Program Office and we will look into the issue as soon as possible.
- In some cases, the State field is causing a high score for an Address Point in New York City where not appropriate. An example of this is if you enter 620 Madison St, Syracuse NY 13210 the geocoder returns 620 Madison St, New York, NY 11221. To avoid this error when geocoding with the Street_and_Address_Composite, the GIS Program Office recommends not mapping the State field. If you enter 620 Madison St, Syracuse 13210, the correct location is returned. The State field can still be mapped in the Street_NoNum_and_ZipCode_Composite as there are no longer any address point locators within this composite to cause this issue.
- When geocoding addresses in New York City that contain a hyphen (e.g. 123-01 Roosevelt Ave, New York, NY), the geocoder inserts an extra space into the address between the Prefix Address Number and the Address Number (e.g. 123- 01), which may prevent the address locator from finding a matching address point. After some research, we have discovered two workarounds for this scenario (a and b), along with one additional requirement (c):
a. The preferred method of geocoding hyphenated addresses in NYC is using the Single Field entry method. This requires creating a concatenated address field and using that new field for your input addresses. This entry method seems to disregard the extra space that the geocoder inserts and can find a matching address point without modifying the source data that you are geocoding.
b. If you do not have the option to use the Single Field entry method (available in ArcGIS 10.2.2 and newer) and you know you have addresses in NYC that contain hyphens, insert a space between the hyphen and the number that follows it. This will allow the geocoder to find corresponding address points if they are present in the locator reference data.
c. When geocoding addresses with hyphens that have a leading 0 after the hyphen (such as the address in this example), the leading 0 will need to be removed prior to geocoding the address. If the leading 0 is not removed, the geocoder will return a Street Segment hit even if an Address Point exists in the source data.
- When geocoding to addresses in New York City, the geocoder may find an Address Point in an incorrect location rather than the correct Street Segment if an address point is missing from the geocoder's source data. For example, if the point for 235 East 11th St, New York NY 10003 (in Manhattan) was missing, the address might geocode to 235 East 11th St, New York NY 11218 (in Brooklyn - notice the different Zip Codes). This is due to the fact that all Address Points in New York City have a City Name of "New York" and there are many duplicate Street Names within the 5 boroughs of New York City. Please be sure to review the Match Address for records that are geocoded with the 1B_SAM_AP_CTName or the 1C_SAM_AP_PlaceName locators to be sure that they geocoded to the correct location. There is a workaround for this scenario as well:
a. When geocoding addresses in New York City, remove the City Name and leave the Street Address and Zip Code (e.g. 235 East 11th St, 10003). This will allow the geocoder to fall back to the correct Street Segment locator if no Address Point is present in the correct Zip Code.
- Fractional Suffix Address Numbers (e.g. ½, ¾, etc) do not geocode to existing Address Points when using the multi-line geocoding option in ArcGIS 10.1 and newer. However, if you have upgraded to ArcGIS 10.2.2 and can use the Single Field geocoding option, these addresses geocode to the matching Address Point as long as one is available. In general, the Single Field geocoding option provides better geocoding results than the Multi-Field geocoding option. At least for now, however, the Single Field option runs considerably slower.
- When geocoding from an Excel file, fields are copied into the geocoding result feature class as part of the batch geocoding process. If the geocoding result is saved as a Shapefile, numbers that are larger than 8 digits get rounded up to the nearest 10. ESRI is aware of this issue and has told us that currently the only workaround is to convert the Excel table into a file geodatabase table or a dbf and then use that table to geocode. Doing this will prevent the number fields from being rounded.
- Geocoding to street intersections will only work if the municipality name is included. There is a setting within the locators that will allow a match without the municipality. However, when we tried changing this setting, it greatly degraded the geocoding results so the setting was changed back to not geocode without the municipality.
- The Geocoding Service is unable to locate addresses outside of New York State. The GIS Program Office is looking into alternatives to allow for geocoding addresses outside of New York, but currently only addresses within the State will geocode.
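The New York City address workarounds listed above can be applied to input data before it is sent to the geocoder. The following is a minimal Python sketch of that preprocessing, not part of the geocoding service itself. The function names and the comma-separated single-line format are assumptions made for illustration.

```python
import re

# A hyphenated house number at the start of a street address, e.g. "123-01".
_HYPHENATED = re.compile(r"^(\d+)-(\d+)\b")


def strip_leading_zero(street: str) -> str:
    """Item c: remove leading zeros after the hyphen ("123-01" -> "123-1")."""
    return _HYPHENATED.sub(
        lambda m: f"{m.group(1)}-{int(m.group(2))}", street.strip(), count=1
    )


def space_after_hyphen(street: str) -> str:
    """Workaround b: insert a space after the hyphen ("123-1" -> "123- 1")."""
    return _HYPHENATED.sub(r"\1- \2", street.strip(), count=1)


def single_line(street: str, city: str = "", state: str = "", zip_code: str = "") -> str:
    """Workaround a: concatenate the non-empty parts into one input field.

    Leaving out `state` or `city` mirrors the State field and NYC City Name
    workarounds. The comma separator is an assumption, not a documented format.
    """
    parts = [p.strip() for p in (street, city, state, zip_code)]
    return ", ".join(p for p in parts if p)
```

For example, `single_line(strip_leading_zero("123-01 Roosevelt Ave"), "New York", "NY")` builds one concatenated field for Single Field entry, while `space_after_hyphen` is only needed when Single Field entry is not an option.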
The first composite locator (Street_and_Address_Composite) is made up of the following set of locators which are most likely to return a high quality hit. The locators are listed in the order in which they will be accessed along with a brief description of the locator's source data. These six locators will generate the majority of the results when geocoding addresses. **Please note the changes to each of the two Composite locators below.**
|Locator Name||Source Data||Description|
|1A_SAM_AP_ZipName||SAM Address Points||SAM address points using the postal zip code name for the city name in the locator.|
|1B_SAM_AP_CTName||SAM Address Points||SAM address points. The city or town name is used for the city name in the locator.|
|1C_SAM_AP_PlaceName||SAM Address Points||SAM address points. The city name is populated using the NYS Villages and Indian Reservations, the Census Designated Places and Alternate Acceptable Zip Code Names from the USPS. These names do not exist everywhere so there will be a limited number of points in this locator.|
|3A_SS_ZipName||NYS Street Segments||NYS Street Segments dataset using the postal zip code name for the city name in the locator. The location is interpolated from an address range on the street segment. The city name can be different for the left and right sides of the streets.|
|3B_SS_CTName||NYS Street Segments||NYS Street Segments using the city or town name for the city name in the locator. The location is interpolated from an address range on the street segment.|
|3C_SS_PlaceName||NYS Street Segments||NYS Street Segments using an alternate place name for the city field. This field is populated using the NYS Villages and Indian Reservations, the Census Designated Places and Alternate Acceptable Zip Code Names from the USPS. These areas do not exist everywhere so there will be a limited number of segments with this attribute. The location is interpolated from an address range on the street segment.|
Any address that does not successfully geocode to the first composite can then be run through the second composite locator (Street_NoNum_and_ZipCode_Composite), recognizing that hits from this locator will not be spatially accurate. This composite locator is made up of the following locators.
|Locator Name||Source Data||Description|
|4A_SS_NoNum_ZipName||NYS Street Segments||NYS Street Segments dataset using the postal zip code name for the City name in the locator. The location is placed on a street segment with the matching name. Please note this may or may not be the correct street segment.|
|4B_SS_NoNum_CTName||NYS Street Segments||NYS Street Segments dataset using the city or town name for the city name in the locator. The location is placed on a street segment with the matching name. Please note this may or may not be the correct street segment.|
|4C_SS_NoNum_PlaceName||NYS Street Segments||NYS Street Segments dataset using the alternate place name for the city name in the locator. This field is populated using the NYS Villages and Indian Reservations, the Census Designated Places and Alternate Acceptable Zip Code Names from the USPS. These areas do not exist everywhere so there will be a limited number of segments with this attribute. The location is placed on a street segment with the matching name. Please note this may or may not be the correct street segment.|
|5_ZipCodePts||Zip Code boundaries||Point placed at the centroid of the Zip Code boundaries.|
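The two-step sequence described above, trying the first composite and falling back to the second only when nothing matches, can be sketched in Python as follows. This is an illustration under stated assumptions: each geocoder is represented by a hypothetical callable that takes an address string and returns a result dictionary, or `None` when there is no match. How those callables reach the web services is left out.

```python
from typing import Callable, Optional, Tuple

# Hypothetical shape: a geocoder takes an address and returns a result or None.
Geocoder = Callable[[str], Optional[dict]]

FIRST = "Street_and_Address_Composite"
SECOND = "Street_NoNum_and_ZipCode_Composite"


def geocode_with_fallback(
    address: str, first: Geocoder, second: Geocoder
) -> Optional[Tuple[str, dict]]:
    """Return (composite name, result), or None if neither composite matches."""
    hit = first(address)
    if hit is not None:
        return FIRST, hit
    # Hits from the second composite are not spatially accurate, so the
    # composite name is returned alongside the result to let callers flag them.
    hit = second(address)
    if hit is not None:
        return SECOND, hit
    return None
```

Returning the composite name with the result makes it straightforward to separate high quality hits from street-name-only or Zip Code centroid hits afterwards.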
Currently, the geocoding service will return all of the results when using the Find Tool within ArcGIS. The user will then be responsible for choosing which of the results they want to keep. The SAM Address Points are the most accurate data available and should be picked anytime a result is returned from one of the SAM address point locators. If the geocoding service is used in the ESRI batch tool, the locator will return a Match from the first locator it comes to in the cascading order. If multiple results have the same score, whether from different locators or within the same locator, the first result is returned and it is coded as a Tie.
The locators will output a field named 'User_fld' which should be used in conjunction with the Loc_Name field. When the Loc_Name field contains one of the Address Point locators (1A, 1B or 1C), this field will contain a 1, 2, 3, 4 or 5. When the Loc_Name field contains anything other than the Address Point locators, the 'User_fld' will either be NULL or "0". The numeric values correspond with the type of Address Point that was located:
2. Primary Structure Entrance
4. Parcel Centroid
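A result's Loc_Name and 'User_fld' values can be read together as in the Python sketch below. It is only an illustration: the function name is hypothetical, and only the two codes labelled in the list above are named. The remaining codes are reported without a label rather than guessed.

```python
from typing import Optional

# Loc_Name prefixes of the three SAM Address Point locators.
ADDRESS_POINT_PREFIXES = ("1A_", "1B_", "1C_")

# Only the codes described in the list above; other codes are not guessed.
POINT_TYPES = {2: "Primary Structure Entrance", 4: "Parcel Centroid"}


def address_point_type(loc_name: str, user_fld) -> Optional[str]:
    """Return the Address Point type, or None for non-Address-Point hits."""
    if not loc_name.startswith(ADDRESS_POINT_PREFIXES):
        return None  # User_fld is NULL or "0" for the other locators
    try:
        code = int(user_fld)  # accepts 2 as well as "2"
    except (TypeError, ValueError):
        return None
    return POINT_TYPES.get(code, f"Unlabelled type {code}")
```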
There is also a locator available that will allow the user to find Incorporated and Unincorporated Places throughout New York State.
|Locator Name||Source Data||Description|
|NYPlace||Municipality Centroid Points||This locator contains points placed at the centroid of NYS Cities, Towns, Villages, Indian Reservations, Unincorporated Places, and Neighborhoods.|
Using the Locators in ArcGIS Desktop
The composite locators and the Place Name locator are available as web services through the following updated links. For documentation on how to add these locators to ArcGIS, please reference Adding the Statewide Geocoding Web Service. If you would like these locators to be your default locators in ArcGIS, copy DefaultLocators.xml to C:\Users\<username>\AppData\Roaming\ESRI\Desktop10.X\Locators, where <username> is your username (sometimes it is username.NYS) and X should be replaced by the version of ArcGIS you are running.
Using the Locators in QGIS
The composite locators and the Place Name locator web services are also available to be used in QGIS through a customized geocoding plugin. For documentation on how to add and use these locators in QGIS, please reference Geocoding in QGIS.
Updated Composite Locators:
Change is Coming
In cooperation with the GIO, the GIS Program Office is working on a new geocoder which uses all of the locators described above, but with a much simpler interface. We intend to embed logic in the geocoder so that only the best result is returned. This will eliminate the need for developers and users to interpret the many results which may be returned from the various locators to decide which one to use.
If you wish to be notified about any changes or updates to the address locator services, please send an email to email@example.com and let us know. Your name and email address will be added to our distribution list and we will notify you if any changes are being made that will affect the locator web services.
Developers coding against the web service should include code which returns the best response from the results returned. The locators are numbered in the order of spatial quality. Match score should not be used to choose a result from the many that may be returned. SAM points are preferred over street segments with address numbers, which are preferred over street segments without address numbers, and so on. These locators are designed to only return valid hits, so the actual match score is of little consequence.
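One way to follow this guidance is to rank candidates by the numeric and letter prefix of the locator name rather than by score. The Python sketch below assumes each candidate is a dictionary carrying a "Loc_Name" key. That structure and the function names are illustrative assumptions, not the service's documented response format.

```python
import re
from typing import Iterable, Optional, Tuple

# Locator names start with a number and an optional letter, e.g. "1A_", "5_".
_PREFIX = re.compile(r"^(\d+)([A-Z]?)_")


def locator_rank(loc_name: str) -> Tuple[float, str]:
    """Lower ranks mean better spatial quality; unknown names sort last."""
    match = _PREFIX.match(loc_name)
    if match is None:
        return (float("inf"), "")
    return (int(match.group(1)), match.group(2))


def best_candidate(candidates: Iterable[dict]) -> Optional[dict]:
    """Pick the candidate from the highest-quality locator, ignoring score."""
    candidates = list(candidates)
    if not candidates:
        return None
    # min() keeps the first of equally ranked candidates.
    return min(candidates, key=lambda c: locator_rank(c["Loc_Name"]))
```

With this ordering, a 1B_SAM_AP_CTName hit is chosen over a 3A_SS_ZipName hit even when the street segment result carries the higher match score.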
By ITS Policy, these services can only be used with the HTTPS protocol.
1220 Washington Avenue
Bldg. 5, Flr. 1, State Campus
Albany, New York 12226