HERE Maps API - Converting any data file to KML

From Nokia Developer Wiki

This article explains how to read address data from an arbitrary file format, and create a KML file for display on a map.

Article Metadata
Code Example
Tested with
Device(s): Firefox, Internet Explorer, Google Chrome, Opera
Compatibility
Platform(s): Web Browser
Dependencies: HERE Maps 2.5.3
Article
Keywords: HERE Maps, JavaScript, KML
Created: jasfox (18 Jan 2012)
Last edited: jasfox (20 Dec 2013)

Introduction

Keyhole Markup Language (KML) is an XML notation for geographic applications. The advantages of using KML are numerous, and have been listed in a previous article. A typical enterprise may wish to add some markers representing addresses onto a map for their website, but without necessarily learning too much about geocoding or KML. It is likely that the address data they have is already held in a file or spreadsheet somewhere.

This example aims to take the pain out of creating a KML dataset. It aims to take any data format and attempt to locate addresses given by specified fields from the file. The addresses are then transformed into KML <Placemark> elements with associated <name> and <description> elements taken from other fields from the same record. The generated KML data can be inspected and edited using the editor example given here and then displayed using the code from the How to display KML data example. In summary, this article demonstrates a real world use of the geocoding service and is an example of making sequential asynchronous JavaScript calls to obtain longitude and latitude.

Defining the issue

For a typical data file, the first line is a header line that defines the fields of the records below it. Each field is separated from the next by a separator character. For a CSV file, for example, the separator is a comma (,); for other data formats it may be a space ( ), a pipe (|) or some other character. The header line ends with a terminating character (typically a new line). Each subsequent line of data then holds one record of separated fields, with each field corresponding to the field definition at the same position in the header.

For example, consider the following spreadsheet:

serial  name             unused          address_street        address_city  address_district  description
13556   Millenium Stage  Something       401 Bay Drive         New York                        This is an example
3243    The Chambers     Something else  53 Bothwell Street    Watford       Hertfordshire     Some more text here
9954    4th Dimension    More data       226 Myrtle Ave        Toronto
8645    E.K.B.R.         Even more data  Invalidenstrasse 117  Berlin                          Hier gibts etwas

Comma Separated Values Format

Comma Separated Values (CSV) format is a simple de facto standard for spreadsheet data. The table above would be represented as follows:

 serial, name , unused,address_street, address_city, address_district, description
13556, Millenium Stage, Something, 401 Bay Drive,New York,, This is an example
3243, The Chambers, Something else, 53 Bothwell Street, Watford,Hertfordshire, Some more text here
9954, 4th Dimension, More data, 226 Myrtle Ave , Toronto,,
8645, E.K.B.R, Even more data, Invalidenstrasse 117, Berlin,, Hier gibts etwas

Tab Separated Values Format

Tab Separated Values (TSV) format is similar to CSV, with each column separated by a tab character as shown below:

serial	name	unused	address_street	address_city	address_district	description
13556	Millenium Stage	Something	401 Bay Drive	New York		This is an example
3243	The Chambers	Something else	53 Bothwell Street	Watford	Hertfordshire	Some more text here
9954	4th Dimension	More data	226 Myrtle Ave	Toronto		
8645	E.K.B.R	Even more data	Invalidenstrasse 117	Berlin		Hier gibts etwas

Other proprietary formats

The Right Move Estate Agent Data Format is a proprietary format popular amongst estate agents in the United Kingdom. A simplified extract for the properties defined above would look something like the data below. It has been chosen as an alternative format to illustrate the problem.

#HEADER#
Version : 3
EOF : '^'
EOR : '|'
 
Property Count : 4
Generated Date : 19-May-2010 12:29
 
#DEFINITION#
AGENT_REF^NAME^SOME_OTHER_FIELD^ADDRESS_1^ADDRESS_2^ADDRESS_3^DESCRIPTION|
 
#DATA#
13556^Millenium Stage^ Something^401 Bay Drive^New York^^This is an example|
 
3243^The Chambers^ Something else^ 53 Bothwell Street^Watford^Hertfordshire^Some more text here|
 
9954^4th Dimension^ More data^ 226 Myrtle Ave^Toronto^^|
 
8645^E.K.B.R^Even more data^Invalidenstrasse 117^Berlin^^Hier gibts etwas|
#END#

It can be seen that to parse data from an arbitrary file, the problem can be split into several parts:

  • Deciding where the header line starts - there may be no preamble, but potentially the data could have a prefix or some white space before it.
  • Deciding what the field and line terminators are
  • Deciding where the data starts - there may be no gap between the header line and the data, but potentially there could be some other extraneous information.
  • Deciding which fields are needed in the KML - Some columns of data will be irrelevant, empty or not used.
  • Deciding which fields form the address to geocode. Obviously the names of the field headers could differ as well.
  • Deciding what to do if the geocoding fails. The quality of the address data may be poor, or cover various addressing standards. Should the house number come before or after the street name for example?

GeoCode - a KML generator

The KML Generator reads data in an arbitrary format from a text box, geocodes the addresses it contains and places the resulting locations on a map. The data is then displayed in KML format.


Initialisation

The following values need to be used for the formats described above:

                           CSV     Tab Delimited  Right Move
Start of Header Indicator  blank   blank          #DEFINITION#\n
Start of Data Indicator    \n      \n             #DATA#\n
Record Separator           \n      \n             |\n\n
Field Separator            ,       \t             ^

Note:

  • The character sequence \n may be used to indicate a new line
  • The character sequence \t may be used to indicate a tabbed white space
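
As a rough illustration of how the four values in the table could drive the parsing step, the sketch below cuts a text into a list of header names and an array of raw records. The function name splitHeaderAndRecords and its options object are hypothetical and are not taken from the working example. Trailing markers such as #END# would need extra handling.

```javascript
// Hypothetical helper: cuts the raw text into header names and raw records
// using the four indicator/separator values described in the table above.
function splitHeaderAndRecords(text, opts) {
  var rest = text;

  // Skip any preamble in front of the header line, if an indicator was given.
  if (opts.headerStart) {
    var headerAt = rest.indexOf(opts.headerStart);
    if (headerAt === -1) throw new Error("Start of header indicator not found");
    rest = rest.slice(headerAt + opts.headerStart.length);
  }

  // The header runs up to the start of data indicator.
  var dataAt = rest.indexOf(opts.dataStart);
  if (dataAt === -1) throw new Error("Start of data indicator not found");

  // Assumption: a visible record terminator (e.g. "|") may also end the header.
  var terminator = opts.lineSep.trim();
  var headerLine = rest.slice(0, dataAt).trim();
  if (terminator && headerLine.endsWith(terminator)) {
    headerLine = headerLine.slice(0, -terminator.length);
  }
  var headers = headerLine.split(opts.fieldSep).map(function (h) {
    return h.trim();
  });

  // Everything after the indicator is data; drop chunks that are only whitespace.
  var records = rest
    .slice(dataAt + opts.dataStart.length)
    .split(opts.lineSep)
    .filter(function (r) { return r.trim().length > 0; });

  return { headers: headers, records: records };
}

// Usage with the CSV settings from the table (blank header indicator).
var csvText = "serial,name,address_city\n13556,Millenium Stage,New York\n3243,The Chambers,Watford\n";
var parsedCsv = splitHeaderAndRecords(csvText, {
  headerStart: "", dataStart: "\n", lineSep: "\n", fieldSep: ","
});
console.log(parsedCsv.headers, parsedCsv.records);
```

Records are left untrimmed on purpose, so that trailing tab separators in a tab-delimited file still mark empty fields.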

Each record will be transformed into a KML <Placemark>. Obviously the decision as to which fields to add to the KML will depend on the data provided. The following KML elements are supported:

  • ID Field: This translates to the <Placemark> id attribute, a unique indicator for each marker.
  • Style URL Field: This translates to the <styleUrl> within the <Placemark> - it can be used to change the appearance of the marker by adding an associated <Style>
  • Description Fields: This translates to the <description> within the <Placemark>, it is able to support HTML tags such as <b> or <h2>
  • Name Fields: This translates to the <name> within the <Placemark>, it must be plain text
  • Address: This will hold the address for which the geocode attempt was successful. See addressing attempts for details.
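
To make the mapping from fields to KML elements concrete, the following sketch serialises one marker into a <Placemark>. The functions escapeXml and toPlacemark and the shape of the marker object are assumptions made for illustration only; the article itself relies on the saveMapObjects(map) method from the How to create a KML data file example. Note that KML lists coordinates as longitude first, then latitude.

```javascript
// Hypothetical helper: escapes the five characters that are special in XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds one <Placemark> from a plain marker object.
function toPlacemark(marker) {
  return [
    '<Placemark id="' + escapeXml(marker.id) + '">',
    "  <name>" + escapeXml(marker.name) + "</name>",
    "  <address>" + escapeXml(marker.address) + "</address>",
    // The description may contain HTML, so it is wrapped in CDATA.
    // Assumption: the description never contains the sequence "]]>".
    "  <description><![CDATA[" + marker.description + "]]></description>",
    "  <styleUrl>" + escapeXml(marker.styleURL) + "</styleUrl>",
    "  <Point><coordinates>" + marker.longitude + "," + marker.latitude + "</coordinates></Point>",
    "</Placemark>"
  ].join("\n");
}

// Usage: the coordinates are placeholder values, not real geocoding results.
console.log(toPlacemark({
  id: "3243",
  name: "The Chambers",
  address: "53 Bothwell Street Watford Hertfordshire",
  description: "<b>Some more text here</b>",
  styleURL: "#1",
  latitude: 10,
  longitude: 20
}));
```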


By default HERE Maps will display the <name> and the <description> of the marker in an infobox when clicked. This may be styled by defining an associated <BalloonStyle>; defining a <Style> is not within the scope of this article.

All these variables must be initialised, and read in from the form as shown:

var headerStart, dataStart, lineSep, fieldSep, addressFields, descriptionFields, nameFields, idFields, styleURLFields;
var dataInput = document.getElementById('dataInput').value;

// Any \n sequence typed into the form is converted into a real newline character.
headerStart = document.getElementById('headerStart').value; // "#DEFINITION#\n" for Right Move.
headerStart = headerStart.replace(/\\n/g, "\n");
dataStart = document.getElementById('dataStart').value; // "#DATA#\n" for Right Move.
dataStart = dataStart.replace(/\\n/g, "\n");
lineSep = document.getElementById('lineSep').value; // "|\n\n" for Right Move.
lineSep = lineSep.replace(/\\n/g, "\n");

fieldSep = document.getElementById('fieldSep').value;
addressFields = [];

// Each of these address strategies will be tried in turn.
addressFields.push(document.getElementById('addressAttempt1').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt2').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt3').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt4').value.split(fieldSep));

// Any fields added here will be appended to the <description> element.
descriptionFields = document.getElementById('descriptionFields').value.split(fieldSep);
// Any fields added here will be appended to the <name> element.
nameFields = document.getElementById('nameFields').value.split(fieldSep);
// This field will be the id of the <Placemark> element.
idFields = document.getElementById('idField').value.split(fieldSep);
// These fields will make up the <styleUrl> of the <Placemark>
styleURLFields = document.getElementById('styleURLFields').value.split(fieldSep);

// Convert any typed \t sequence into a tab character. This is done last,
// so the definitions above are split on the separator exactly as it was typed.
fieldSep = fieldSep.replace(/\\t/g, "\t");
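
The conversion of typed escape sequences could also be factored into a single helper, which avoids repeating the replace() calls for each form field. The name unescapeSeparator is hypothetical, and the sketch assumes that only \n and \t need to be supported.

```javascript
// Hypothetical helper: turns the two-character sequences \n and \t, as typed
// into a form field, into a real newline and a real tab character.
function unescapeSeparator(value) {
  return value.replace(/\\n/g, "\n").replace(/\\t/g, "\t");
}

// Usage: these are the Right Move settings as a user would type them.
var typedLineSep = "|\\n\\n";
console.log(JSON.stringify(unescapeSeparator(typedLineSep)));
```

Using a global regular expression converts every occurrence, however many times the sequence appears.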

Geocoding

A call to nokia.places.search.manager.geoCode() is required to process the address of each record. We need to wait for the function to finish (by passing onSearchComplete as a callback function) and then add the marker to the map if a location was found. The methods for adding the marker and holding the state of the KML data are the same as in the How to create a KML data file example, and have been removed from the code snippet for clarity. It is important to notice that the next call to doNextGeoCode() is made only once the previous search has finished, hence the records are processed sequentially.

// Search Manager taken directly from playground examples.
var onSearchComplete = function (data, requestStatus) {
  if (requestStatus == "ERROR") {
    // Try again with geocoding the same data, using alternate fields
    // to define the address.
    lat.innerHTML = "Not Found: " + address;
    addressingAttempt++;
    if (addressingAttempt == addressFields.length) {
      // There are no more fall back addressing options.
      // Move on to the next record regardless.
      addressingAttempt = 0;
      currentRecord++;
    }
  } else {
    var markerData = {};
    // Since we have an address we can add the current data to the map
    // as the addressing data has been found to be valid.
    markerData.coords = data.location.position;

    markerData.id = id;
    markerData.title = map.objects.getLength() + 1;
    markerData.description = description.trim();
    markerData.name = name.trim();
    markerData.address = address.trim();
    markerData.styleURL = styleURL.trim();
    addMarker(markerData);
    // Center on the new marker and start to process the next record.
    map.setCenter(data.location.position);
    addressingAttempt = 0;
    currentRecord++;
    lat.innerHTML = "Found: " + address;
  }
  // Find the next address, either a new record or using new address fields.
  doNextGeoCode();
};

doNextGeoCode() kicks off the geocoding process: the current record is split into fields, and the address, description, name etc. are calculated and held as global variables. The search manager is called for each request, and its callback then chains back to doNextGeoCode() to process the next record. If no further addressing strategies can be tried, the record is not added - the code could be altered here to alert the user if necessary. Once all the records have been processed, the KML is generated using the saveMapObjects(map) method taken directly from the How to create a KML data file example.

// Obtains the Longitude and Latitude of the next record
// based upon the data in the chosen fields of that record.
function doNextGeoCode() {
  if (currentRecord < data.length) {
    var dataRecord = data[currentRecord].splitCSV(fieldSep);

    address = getFieldsFromDefinition(addressFields[addressingAttempt], headers, dataRecord);
    description = getFieldsFromDefinition(descriptionFields, headers, dataRecord);
    name = getFieldsFromDefinition(nameFields, headers, dataRecord);
    id = getFieldsFromDefinition(idFields, headers, dataRecord);
    styleURL = getFieldsFromDefinition(styleURLFields, headers, dataRecord);

    // Assuming we have an address to try, we should geocode it.
    if (address != "") {
      nokia.places.search.manager.geoCode({
        searchTerm: address,
        onComplete: onSearchComplete
      });
    } else {
      // Otherwise we need to try another addressing strategy.
      addressingAttempt++;
      if (addressingAttempt == addressFields.length) {
        // Since we have run out of addressing strategies,
        // try the next record.
        addressingAttempt = 0;
        currentRecord++;
      }
      doNextGeoCode();
    }
  } else {
    // We can generate the KML
    saveMapObjects(map);
  }
}
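
The chaining pattern used here can be shown independently of the HERE Maps API. In the sketch below the geocoder is passed in as a function, so that a stub can stand in for nokia.places.search.manager.geoCode(). The names geocodeSequentially and stubGeocode, and the callback signature (a position or null), are assumptions made for this illustration and are not part of the original code.

```javascript
// Hypothetical sketch: processes records one at a time, trying each candidate
// address in turn, and only starts the next request from inside the callback.
function geocodeSequentially(candidatesPerRecord, geocode, onFinished) {
  var results = [];
  var record = 0;
  var attempt = 0;

  function next() {
    if (record >= candidatesPerRecord.length) {
      onFinished(results); // All records processed.
      return;
    }
    var candidates = candidatesPerRecord[record];
    if (attempt >= candidates.length) {
      results.push(null); // No strategy worked, skip this record.
      record++;
      attempt = 0;
      next();
      return;
    }
    var address = candidates[attempt];
    if (!address) {
      attempt++; // Empty address, move on to the next strategy.
      next();
      return;
    }
    geocode(address, function (position) {
      if (position) {
        results.push({ address: address, position: position });
        record++;
        attempt = 0;
      } else {
        attempt++;
      }
      next(); // Chain to the next request only after this one has completed.
    });
  }

  next();
}

// Usage with a stub geocoder; the lookup table holds placeholder coordinates.
var stubLocations = { "Toronto": { latitude: 1, longitude: 2 } };
function stubGeocode(address, callback) {
  callback(stubLocations[address] || null);
}
geocodeSequentially([["226 Myrtle Ave Toronto", "Toronto"]], stubGeocode, function (results) {
  console.log(results);
});
```

Because the stub calls back synchronously, each step adds a stack frame; a real geocoder answers asynchronously, so this is not a concern there.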

The splitting of each data record into fields is achieved by a helper function added to String.prototype, which also handles fields wrapped in double quotes:

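String.prototype.splitCSV = function (sep) {
  // Split naively on the separator, then walk backwards and re-join
  // any fragments that belong to the same quoted field.
  for (var foo = this.split(sep = sep || ","), x = foo.length - 1, tl; x >= 0; x--) {
    if (foo[x].replace(/"\s+$/, '"').charAt(foo[x].length - 1) == '"') {
      if ((tl = foo[x].replace(/^\s+"/, '"')).length > 1 && tl.charAt(0) == '"') {
        // A complete quoted field: strip the outer quotes and unescape "".
        foo[x] = foo[x].replace(/^\s*"|"\s*$/g, '').replace(/""/g, '"');
      } else if (x) {
        // The tail of a quoted field: merge it with the preceding fragment.
        foo.splice(x - 1, 2, [foo[x - 1], foo[x]].join(sep));
      } else {
        foo = foo.shift().split(sep).concat(foo);
      }
    } else {
      // Note: the result of this replace() is discarded,
      // so unquoted fields are left unchanged.
      foo[x].replace(/""/g, '"');
    }
  }
  return foo;
};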
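
For readers who prefer not to extend String.prototype, the same idea can be sketched as a plain function that scans the record character by character. This is an alternative written for illustration, not the code used by the working example; the name splitFields is hypothetical, and unlike splitCSV it trims surrounding white space from every field.

```javascript
// Hypothetical alternative to splitCSV: splits one record into fields,
// honouring double-quoted fields and "" as an escaped quote.
function splitFields(line, sep) {
  sep = sep || ",";
  var fields = [];
  var current = "";
  var inQuotes = false;
  var i = 0;

  while (i < line.length) {
    var ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // Escaped quote inside a quoted field.
        i += 2;
      } else if (ch === '"') {
        inQuotes = false; // Closing quote.
        i++;
      } else {
        current += ch; // Separators inside quotes are ordinary text.
        i++;
      }
    } else if (ch === '"') {
      inQuotes = true; // Opening quote.
      i++;
    } else if (line.startsWith(sep, i)) {
      fields.push(current.trim()); // End of field.
      current = "";
      i += sep.length;
    } else {
      current += ch;
      i++;
    }
  }
  fields.push(current.trim());
  return fields;
}

// Usage: the quoted address contains the separator itself.
console.log(splitFields('9954, 4th Dimension, "226 Myrtle Ave, Apt 2" , Toronto,,'));
```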

Addressing Strategies

The quality of the address data in an arbitrary data set is by definition unknown, and may not match the geocode data held by Nokia. To mitigate this, the generator attempts to form a recognised address from different sections of each record. Several text boxes are displayed, each holding one addressing strategy: a list of header names, separated by the field separator, which are used to build up the address. When the address is created using the getFieldsFromDefinition() method, the value of each field corresponding to a header name is added in turn. This means you can try various addressing strategies, from the most specific to the least specific. For example, using the fields defined in the example CSV file:

 serial, name , unused,address_street, address_city, address_district, description
13556, Millenium Stage, Something, 401 Bay Drive,New York,, This is an example
3243, The Chambers, Something else, 53 Bothwell Street, Watford,Hertfordshire, Some more text here
9954, 4th Dimension, More data, 226 Myrtle Ave , Toronto,,
8645, E.K.B.R, Even more data, Invalidenstrasse 117, Berlin,, Hier gibts etwas

The following addressing strategy would be used:

  1. address_street,address_city,address_district
  2. address_city,address_district
  3. address_district

Note that each field must be separated by the standard field separator - in the case above a comma. If the geocoding service recognizes the full address, the marker is added; otherwise a more general strategy is used. It would also be possible to alter the order of the fields (e.g. put the house number before or after the street) or to add a hard-coded field, since any element of the addressing strategy that does not correspond to a field in the header record is added to the address verbatim. For example, if you have a list of cities in Spain, the following addressing strategy would avoid placing "Toledo" in Toledo, Ohio, USA:

  1. address_city,Spain


addressFields = [];

// Each of these address strategies will be tried in turn.
addressFields.push(document.getElementById('addressAttempt1').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt2').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt3').value.split(fieldSep));
addressFields.push(document.getElementById('addressAttempt4').value.split(fieldSep));

function getFieldsFromDefinition(definition, headerFields, dataRecord) {
  var result = "";
  for (var defFieldCount = 0; defFieldCount < definition.length; defFieldCount++) {
    for (var headerFieldCount = 0; headerFieldCount < headerFields.length; headerFieldCount++) {
      if (headerFields[headerFieldCount] == definition[defFieldCount]) {
        // The definition names a header: append the value of that field,
        // provided the record is long enough to contain it.
        if (headerFieldCount < dataRecord.length) {
          result = result + dataRecord[headerFieldCount] + " ";
        }
        break;
      }
      if (headerFieldCount == headerFields.length - 1) {
        // No header matched: treat the definition as hard-coded text.
        if (headerFieldCount <= dataRecord.length) {
          result = result + definition[defFieldCount];
        }
      }
    }
  }
  return result.trim();
}
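
The effect of the fall-back strategies on a single record can be illustrated with a compact variant of the same lookup. The function buildFromDefinition is a hypothetical rewrite rather than the article's code: it looks up each name with indexOf(), treats unknown names as hard-coded text and joins the non-empty parts with single spaces.

```javascript
// Hypothetical variant of getFieldsFromDefinition(): header names are replaced
// by the matching field value, anything else is kept as hard-coded text.
function buildFromDefinition(definition, headers, record) {
  return definition
    .map(function (token) {
      var column = headers.indexOf(token);
      if (column === -1) return token; // Not a header name: use the text itself.
      return column < record.length ? record[column] : "";
    })
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; })
    .join(" ");
}

// Usage: the three strategies from the list above applied to the Toronto record.
var exampleHeaders = ["serial", "name", "address_street", "address_city", "address_district"];
var exampleRecord = ["9954", "4th Dimension", "226 Myrtle Ave", "Toronto", ""];
var exampleStrategies = [
  ["address_street", "address_city", "address_district"],
  ["address_city", "address_district"],
  ["address_district"]
];
exampleStrategies.forEach(function (strategy) {
  console.log(JSON.stringify(buildFromDefinition(strategy, exampleHeaders, exampleRecord)));
});
```

The last strategy yields an empty string for this record, which is the case where doNextGeoCode() skips the geocoding request altogether.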

getFieldsFromDefinition() is also used to build up the data held in the other KML elements. Most of these elements hold simple ASCII text. The <description> element is the only element which is capable of holding formatted HTML. As such, it is possible to use the same "hard-coded field" method in the description definition to add HTML directly to the <description>. For example, to add an image to the <description>, the following description definition could be used:

<img src="http://www.example/path_to.image/^IMAGE_FIELD^"/>^SOME_OTHER_DESC

Where IMAGE_FIELD and SOME_OTHER_DESC are header definitions, the field separator is a circumflex ^ and the HTML is hard coded into the description definition.

Working Example - Right Move

A working example of a KML generator can be found here:

http://heremaps.github.io/examples/examples.html#generate-kml__generate-kml-file-from-data


Once the data is generated, it may need to be edited using a KML Editor.

The following <Style>, which changes the colour of the marker, was manually added to the generated KML file after processing. The colour of the marker is different for one, two, three, four and five bedroom properties.

<Style id='1'><!-- i.e.  A style for properties with 1 bedroom-->
<IconStyle>
<Icon>
<href>http://www.developer.nokia.com/Community/Wiki/images/0/02/000000.png</href>
</Icon>
</IconStyle>
<BalloonStyle><text><![CDATA[<h2>$[name]</h2><p>$[description]</p>]]></text></BalloonStyle>
</Style>
... etc...

The final result may be seen by loading the following samples of KML data:

Note: in order to load a KML file successfully, the generated KML file should be hosted on the same domain as the JavaScript or the results may be unpredictable. Some browsers will automatically prohibit cross-domain access.

For example, if you are hosting at example.com, the final line of the JavaScript to load the KML will need to be:

kml.parseKML("http://example.com/" + "generated_kml_data_file.kml")
and both the KML loading HTML and the generated_kml_data_file.kml should be placed on http://example.com/

Tip: The KML reader of the HERE Maps API for JavaScript will ignore elements it is unable to interpret. KML readers vary and are more or less forgiving in the strictness of interpreting the KML specifications. It is recommended that you validate your KML syntax yourself through an online validator such as: http://feedvalidator.org

Encoding the extended character set

Under the strictest definition of KML, the data in the file must not contain any characters which lie beyond the simple ASCII character set (0-127). This presents problems for the following:

  • Any language which does not use the Roman alphabet (e.g. Russian, Chinese)
  • Any language which uses accents on its characters, such as German (ä ö ü ß) or Polish (ą ć ę ł ń ó ś ź ż)
  • Any situation which requires currency symbols such as ¥ £
  • The single (') and double (") quote marks and the ampersand symbol (&), which have special meaning in KML.

In order to create well formed KML, the data must be modified to use the Unicode character codes or HTML codes for the extended character set. Examples are given below:

Example Data                 Escaped format for <description>   Escaped format for other fields
São Paulo                    S&atilde;o Paulo                   S&amp;atilde;o Paulo
北京                         &#21271;&#20140;                   &amp;#21271;&amp;#20140;
Köpenicker Straße            K&ouml;penicker Stra&szlig;e       K&amp;ouml;penicker Stra&amp;szlig;e
ul. Stefana Wyszyńskiego 23  ul. Stefana Wyszy&#324;skiego 23   ul. Stefana Wyszy&amp;#324;skiego 23
65-536 Zielona Góra          65-536 Zielona G&#243;ra           65-536 Zielona G&amp;#243;ra
£ 500,000                    &pound; 500,000                    &amp;pound; 500,000
&                            &amp;                              &amp;amp;

A library function for transforming extended characters into KML readable equivalents has been added. It is called on the data of each element prior to creating the KML data set, so that the overall KML file remains well formed. The data is singly encoded within the CDATA section and doubly encoded outside of the CDATA section. In order to maintain the readability of the file (and to reduce its size), the encoding only takes place when an extended character is found. This function could instead be added as a String.prototype extension, but not all browsers will support it.

markerData.id = toUnicode("&amp;#", map.objects.get(i).$data.id);
markerData.latitude = map.objects.get(i).coordinate.latitude;
markerData.longitude = map.objects.get(i).coordinate.longitude;
markerData.description = toUnicode("&#", map.objects.get(i).$data.description);
markerData.name = toUnicode("&amp;#", map.objects.get(i).$data.name);
markerData.address = toUnicode("&amp;#", map.objects.get(i).$data.address);

function toUnicode(prefix, input) {
  var output = "";
  var splitInput = input.split("");
  for (var i = 0; i < splitInput.length; i++) {
    var currentChar = splitInput[i];
    // Encode any character beyond ASCII (0-127), plus & (38) and ' (39).
    if (currentChar.charCodeAt() > 127 || currentChar.charCodeAt() == 38 || currentChar.charCodeAt() == 39) {
      output = output + prefix + currentChar.charCodeAt() + ";";
    } else {
      output = output + currentChar;
    }
  }
  return output;
}
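
A variant of the same encoding that iterates over whole code points is sketched below. It is offered as an assumption-laden alternative rather than as the article's implementation: the name encodeNonAscii is hypothetical, and for...of and codePointAt() require an ES2015 capable browser. Working on code points means that characters outside the Basic Multilingual Plane produce one numeric reference instead of two surrogate halves.

```javascript
// Hypothetical variant of toUnicode(): replaces every character beyond ASCII,
// plus & (38) and ' (39), with a numeric character reference.
function encodeNonAscii(prefix, input) {
  var output = "";
  for (var ch of input) { // for...of walks the string one code point at a time.
    var code = ch.codePointAt(0);
    if (code > 127 || code === 38 || code === 39) {
      output += prefix + code + ";";
    } else {
      output += ch;
    }
  }
  return output;
}

// Usage: single encoding for the <description>, double encoding elsewhere.
console.log(encodeNonAscii("&#", "São Paulo"));
console.log(encodeNonAscii("&amp;#", "São Paulo"));
```

Unlike the table above, this produces numeric references throughout (for example &#227; rather than &atilde;), which is what toUnicode() itself generates.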

Summary

It should be possible to generate KML data for a map from an arbitrary data set, and once it is in a standard format it becomes an elementary exercise to display the data using standard techniques.

This page was last modified on 20 December 2013, at 18:28.