Taking a GPS watch on a single run does not provide enough data to reasonably evaluate its GPS accuracy. For my testing I measured a course and ran it repeatedly with each device, gathering hundreds of data points on each.

1 The Course

To gather the data for this test I marked my usual running route at quarter-mile intervals. The course is a little challenging for GPS, with lots of twists, tree cover, power lines, and one bridge that I go under. The bridge carries a four-lane road, so it's wide enough to cause most watches to briefly lose GPS signal. However, I believe the course is reasonably representative of real-world conditions, and probably less challenging than running in a city with skyscrapers. There is a turnaround at each end of the course, and I set the mark an eighth of a mile from the end. That way I can evaluate how well the watches handle an about-turn.

This is the course I use to evaluate the accuracy of GPS Running Watches.

2 Course Measurement

The route was measured using the USATF course certification process. Some highlights:

I used the Jones counter on a mountain bike with hybrid road tires inflated to their maximum pressure. Tires were warmed and pressure was checked before and after certification.
A steel tape measure was used to lay out the course, with the length adjusted for ambient air temperature.
The calibration runs were all within the required 3 counts.
I did NOT use the 0.1% calibration adjustment as I did not require "short course prevention."
The measurement ride started at one end with the wheel on the end of the asphalt. The first measurement was after 1/8 of a mile, then every 1/4 mile until the last marker that was at 1/8 mile. Permanent markers were placed at each interval.
All tangents were taken within the required 12 inches of the apex, aiming for approximately 6 inches which reflects my usual distance from the edge of the asphalt.

In addition, I verified the course using a surveyor's wheel, measuring each segment of the course twice. All measurements were within a few inches, confirming the accuracy.
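The arithmetic behind the tape adjustment and the counter calibration can be sketched as follows. This is only an illustration: the expansion coefficient (0.0000116 per °C), the 20 °C reference temperature, the function names, and all example numbers are assumptions rather than values from my measurement notes. In line with the list above, no 0.1% short course prevention factor is applied.

```python
# Assumed constants for a steel tape; check the values for your own tape.
STEEL_EXPANSION_PER_C = 0.0000116  # assumed linear expansion coefficient
REFERENCE_TEMP_C = 20.0            # assumed temperature at which the tape reads true


def corrected_tape_length(nominal_length, temp_c):
    """True length of a distance laid out with a steel tape at temp_c.

    A warm tape has expanded, so the ground distance it marks out is
    slightly longer than the nominal reading.
    """
    return nominal_length * (1.0 + STEEL_EXPANSION_PER_C * (temp_c - REFERENCE_TEMP_C))


def counts_per_mile(calibration_counts, calibration_length_miles):
    """Average the calibration rides and convert to counter counts per mile."""
    mean_counts = sum(calibration_counts) / len(calibration_counts)
    return mean_counts / calibration_length_miles


def counts_for(distance_miles, constant):
    """Counter reading expected after riding distance_miles."""
    return distance_miles * constant


# Hypothetical example: four rides over a 0.2 mile calibration course.
constant = counts_per_mile([3000, 3002, 3001, 3001], 0.2)
quarter_mile_counts = counts_for(0.25, constant)
```

With a constant like this, each quarter-mile marker is placed where the counter has advanced by `quarter_mile_counts` since the previous marker.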

3 Data Gathering

On first use, each watch is left switched on, with GPS acquired and a full view of the sky, for ~45 minutes. This ensures the watch can download the full almanac and ephemeris data (it should only take 12.5 minutes, but I want to be sure). To prevent per-run startup problems, each device is turned on, satellites are acquired, and the device is then left for around 5 minutes before the run starts. This counters the problem of a device claiming to have acquired the satellites when it only has a minimal lock. The watches are worn on the wrist, or held in the hand in roughly the same orientation as they would be on the wrist. The lap button is pressed as I pass over each marker; if it is not possible to press at the correct time, the lap button is not pressed until the next marker. (The software rejects the data from laps with missed markers.)

4 Initial Data Preparation

Some preparation was required to allow for subsequent analysis.

From the initial data, the location of each lap marker (point) was estimated. This was based on the clustering of latitude and longitude values. These estimates are only used for determining the nearest lap marker, so accuracy is not critical.
Each point was then named for convenience.
A master list of segments (laps) was then constructed that defines the pairs of points and the correct sequence.
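A rough Python sketch of this preparation is shown below. My own software is written in C#, so this is an illustration rather than the actual code: the marker names, coordinates, and function names are hypothetical, and a plain centroid stands in for whatever clustering is used to estimate each marker.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, adequate for short distances


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def centroid(points):
    """Estimate a marker location as the mean of the lap-press positions near it."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


def nearest_marker(lat, lon, markers):
    """Return the name of the marker closest to (lat, lon)."""
    return min(markers, key=lambda name: haversine_m(lat, lon, *markers[name]))


# Hypothetical named points and the master list of valid segments, in order.
MARKERS = {"A": (35.0000, -80.0000), "B": (35.0036, -80.0000), "C": (35.0072, -80.0000)}
MASTER_SEGMENTS = [("A", "B"), ("B", "C"), ("C", "B"), ("B", "A")]
```

Because the estimated locations are only used to pick the nearest marker, a few metres of error in a centroid does not matter as long as the markers are much farther apart than that.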

5 Data Extraction

The software uses the TCX format for analysis. Many devices provide TCX files by default, but others require conversion.
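As an illustration of what is pulled from each file, the sketch below reads the recorded distance and the first GPS position of every lap from a TCX document using Python's standard library. It assumes the standard Training Center Database v2 namespace; the function name and the sample data are hypothetical, and my own software does the equivalent in C#.

```python
import xml.etree.ElementTree as ET

# Namespace used by TCX (Training Center Database v2) files.
TCX_NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}


def read_laps(tcx_text):
    """Return (distance_m, first_lat, first_lon) for each lap that has a GPS fix."""
    root = ET.fromstring(tcx_text)
    laps = []
    for lap in root.iterfind(".//tcx:Lap", TCX_NS):
        # Direct child of Lap: the distance the device recorded for the lap.
        distance = lap.find("tcx:DistanceMeters", TCX_NS)
        # First Position in document order: the first trackpoint with a fix.
        position = lap.find(".//tcx:Position", TCX_NS)
        if distance is None or position is None:
            continue  # skip laps with no distance or no GPS position
        lat = float(position.find("tcx:LatitudeDegrees", TCX_NS).text)
        lon = float(position.find("tcx:LongitudeDegrees", TCX_NS).text)
        laps.append((float(distance.text), lat, lon))
    return laps


# Hypothetical one-lap file for demonstration.
SAMPLE = """<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
  <Activities><Activity Sport="Running">
    <Lap StartTime="2014-01-01T12:00:00Z">
      <DistanceMeters>402.3</DistanceMeters>
      <Track><Trackpoint>
        <Position><LatitudeDegrees>35.0</LatitudeDegrees><LongitudeDegrees>-80.0</LongitudeDegrees></Position>
      </Trackpoint></Track>
    </Lap>
  </Activity></Activities>
</TrainingCenterDatabase>"""

laps = read_laps(SAMPLE)
```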

6 Data Analysis

The analysis process uses an application I wrote in C#.

Files are scanned to validate that they have approximately the right number of lap markers for the distance.
Each file is read and the first GPS location of each lap is mapped to a known point. The predicted sequence is used first and the point is checked to see if it is close to the expected point. If it is not close enough then a simple selection is used to map to the nearest point.
Each pair of points is then mapped to the appropriate master lap. If no master lap exists then the pair is rejected. (This is usually due to missed lap markers.)
The list of pairs of points is then added to lists based on device.
The lists of pairs of points are then analyzed by device and condition.
Each list is summarized by:
- The absolute value of the arithmetic mean.
- The standard deviation from the arithmetic mean.
- The standard deviation from the known true value.
For the overall course, the data is calculated for each segment and then combined. The mean (trueness) is the mean of the segment means. The standard deviation (precision) is the square root of the average of the squared standard deviations.
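The summary statistics can be sketched in Python as follows. This is an illustration rather than my C# code: the function names are hypothetical, errors are expressed as a fraction of the true segment length, and the use of the population form of the standard deviation (dividing by n) is an assumption.

```python
import math


def segment_stats(measured, true_distance):
    """Summarize one segment's recorded distances against its known length.

    Returns (trueness, sd_from_mean, sd_from_true) as fractions of the
    true distance. Population standard deviations are assumed here.
    """
    errors = [(m - true_distance) / true_distance for m in measured]
    n = len(errors)
    mean = sum(errors) / n
    sd_from_mean = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    sd_from_true = math.sqrt(sum(e ** 2 for e in errors) / n)
    return abs(mean), sd_from_mean, sd_from_true


def combine(per_segment):
    """Combine per-segment statistics into overall course figures.

    Trueness is the mean of the segment means; each standard deviation is
    the square root of the average of the squared segment values.
    """
    n = len(per_segment)
    trueness = sum(s[0] for s in per_segment) / n
    sd_from_mean = math.sqrt(sum(s[1] ** 2 for s in per_segment) / n)
    sd_from_true = math.sqrt(sum(s[2] ** 2 for s in per_segment) / n)
    return trueness, sd_from_mean, sd_from_true
```

Averaging the squared standard deviations, rather than the standard deviations themselves, keeps a segment with a large spread from being under-weighted in the overall precision figure.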

7 Footpod Analysis

The footpod data was collected with no calibration in the watch (factor = 1.0), and then retrospectively calibrated. This allowed me to provide both raw and calibrated data. I tried several calibration methods:

A fixed calibration for each shoe I used in the testing period. This calibration was based on the average of three runs with the shoes.
A simple average of the three prior runs.
A weighted average of the three prior runs using weights of 3, 2, 1 for the three prior days.
The average of the prior day's run.

Calibration	Trueness (Average Distance Error)	Standard Deviation (From mean)	Standard Deviation (From true)
No calibration	2.50% (132.2 Ft/Mile, 25.0 m/Km)	3.83% (202.5 Ft/Mile, 38.3 m/Km)	4.58% (241.8 Ft/Mile, 45.8 m/Km)
Fixed calibration	0.02% (1.2 Ft/Mile, 0.2 m/Km)	1.38% (72.7 Ft/Mile, 13.8 m/Km)	1.38% (72.7 Ft/Mile, 13.8 m/Km)
Simple 3 day average	0.15% (8.0 Ft/Mile, 1.5 m/Km)	2.47% (130.4 Ft/Mile, 24.7 m/Km)	2.47% (130.6 Ft/Mile, 24.7 m/Km)
Weighted 3 day average	0.10% (5.5 Ft/Mile, 1.0 m/Km)	2.41% (127.1 Ft/Mile, 24.1 m/Km)	2.41% (127.2 Ft/Mile, 24.1 m/Km)
Yesterday's average	0.03% (1.4 Ft/Mile, 0.3 m/Km)	2.51% (132.3 Ft/Mile, 25.1 m/Km)	2.51% (132.3 Ft/Mile, 25.1 m/Km)
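The retrospective calibration can be sketched as below. This is a minimal illustration under stated assumptions: a run's calibration factor is taken as true distance divided by raw footpod distance, the prior factors are ordered most recent first so that the weight of 3 falls on the most recent run, and the function names and numbers are hypothetical.

```python
def run_factor(true_distance, raw_distance):
    """Calibration factor implied by one run: true distance / raw footpod distance."""
    return true_distance / raw_distance


def weighted_factor(prior_factors, weights=(3, 2, 1)):
    """Weighted average of prior factors, ordered most recent first (assumed)."""
    pairs = list(zip(prior_factors, weights))
    return sum(f * w for f, w in pairs) / sum(w for _, w in pairs)


def calibrate(raw_distance, factor):
    """Apply a calibration factor to a raw footpod distance."""
    return raw_distance * factor


# Hypothetical factors from the three prior runs, most recent first.
priors = [1.02, 1.00, 0.98]
weighted = weighted_factor(priors)            # weights 3, 2, 1
simple = weighted_factor(priors, (1, 1, 1))   # simple three-run average
yesterday = priors[0]                         # prior run only
```

The fixed per-shoe calibration uses the same arithmetic, except that the factor is computed once from three runs in that shoe and then held constant.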

8 GPS Satellite Analysis

I calculated the theoretical GPS error for each point using the agi.com C# library. I followed their cookbook for GPS Accuracy Analysis, which consists of:

Create a constellation of satellites using AGI's SEM formatted almanac.
Modify the constellation based on AGI's outage data.
Configure a GPS receiver model.
- Set Elevation Angle Constraint to between 5 and 40 degrees.
- Set the latitude/longitude to that recorded by the point.
- I set the receiver to have 12 channels, which is the norm for a GPS watch.
Evaluate the recorded GPS accuracy, Dilution of Precision and the number of satellites in view.

9 Sources of Error

There are a number of possible sources of error.

The temperature adjustment for the steel tape used ambient air temperature and it's possible that the asphalt was warmer.
I may have shifted my weight between the front and rear of the bike between calibration and measurement. However, I attempted to keep my weight on the rear of the bike at all times.
The certification may not have followed the shortest path. A sample of the measurements was validated a second time and the variation was within a couple of inches.
The weather, satellite positions, etc., may all change accuracy. However, the large number of samples should average out these variations.
The runs may not have followed the shortest path. This happens occasionally when I have to go around someone, but my course is quiet enough that this occurs quite rarely.
For the turnaround laps, my path may have varied more than other types of lap. I took care to have one foot reach the marker without going past it.
I may not have pressed the lap button directly over the lap marker. I place my hand on the button ahead of the marker and press as I go over, so I believe these errors should be low.
The devices may have a delay between the button press and the recording of the data. However, I am analyzing the distance recorded between recorded lap markers, so this error should be the same for all markers.
There may be differing processing during the creation of the TCX files. For instance, some tools will perform elevation correction. However, the reported accuracy of the testing matches what I generally see reported by the watches in use.

GPS Testing Methodology
