Improving a street-based geocoding algorithm using machine learning techniques

Kangjae Lee, Alexis Richard C. Claridades, Jiyeong Lee

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Address matching is a crucial step in geocoding, yet it is also a bottleneck for geocoding accuracy, because perfect matches depend on precise input. Matches must still be established despite inevitable input problems such as misspellings, abbreviations, informal and non-standard names, slang, or coded terms. This study therefore proposes an address geocoding system that uses machine learning to improve address matching for street-based addresses. Three machine learning methods are tested to identify the one with the highest accuracy. The performance of address matching with machine learning models is compared against multiple text similarity metrics that are commonly used for word matching. Extreme gradient boosting with optimal hyper-parameters was the most accurate machine learning method in the address matching process, and it outperformed the similarity metrics whether evaluated on training data or input data. The machine learning address matching process achieved high accuracy and can be applied to any geocoding system to precisely convert addresses into geographic coordinates for various research and applications, including car navigation.
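
As a rough illustration of the similarity-metric baseline mentioned in the abstract, the Python sketch below scores a misspelled street name against a short reference list. The abstract does not name the metrics or features the authors used, so the three metrics here (normalized edit distance, character-bigram Jaccard overlap, and the difflib ratio), the function names, and the sample street names are all assumptions chosen for illustration. In a machine learning setup, a feature vector like the one returned by `similarity_features` could plausibly be passed to a classifier such as gradient boosting instead of being averaged.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Levenshtein distance rescaled to [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the two strings' character-bigram sets."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 1.0


def similarity_features(query: str, candidate: str) -> list[float]:
    """Illustrative feature vector: three similarity scores in [0, 1]."""
    q, c = query.lower().strip(), candidate.lower().strip()
    return [
        edit_similarity(q, c),
        bigram_jaccard(q, c),
        SequenceMatcher(None, q, c).ratio(),
    ]


def best_match(query: str, candidates: list[str]) -> str:
    """Baseline matcher: pick the candidate with the highest mean similarity."""
    return max(candidates, key=lambda c: sum(similarity_features(query, c)) / 3)


# Hypothetical reference street names and a misspelled input (not from the paper).
REFERENCE_STREETS = ["Main Street", "Maple Avenue", "Market Street"]
print(best_match("Mian Stret", REFERENCE_STREETS))
```

Averaging the scores is only the simplest way to combine them; a trained model can instead learn which metrics matter most for particular kinds of input error, which is the kind of comparison the abstract describes.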

Original language: English
Article number: 5628
Journal: Applied Sciences (Switzerland)
Volume: 10
Issue number: 16
State: Published - Aug 2020

Keywords

  • Address
  • Alias
  • Geocoding
  • Machine learning
