Testing the Location Telemetry of a (Flutter) Mobile App

This is a report on a comparative field test I planned and performed, which tested the "Location Telemetry" of my Senior Software Engineering Capstone team's Flutter App.

Route walked for test in Balboa Park

Motivation

For the last 8 months I have been collaborating on the development of a cross-platform evacuation drill app. This app instructs and guides participants through an evacuation drill. It also records and eventually uploads the location and survey datasets that are generated by participants. These datasets go on to support Evacuation Traffic research.

Our app is replacing a previous workflow for drill participants which included:

  1. Answering "roundtable" discussion questions verbally
  2. Following written instructions during the drill
  3. Using an exercise tracking app to record and export a location dataset

My team had previously tested and ensured that the app could administer surveys individually and record the results. We had also tested and ensured that the app could deliver novel instructions as prepared by the drill coordinators.

However, we had not yet tested the app's ability to generate and upload location datasets (also known as trajectory datasets: location + time). In order to test this functionality of the app I planned and carried out a field test. To determine the efficacy of our app in comparison with the previous workflow, I planned the field test as a comparison with the previously used exercise tracking app.

Tools

I currently have three operational smartphones:

  • an iPhone 11
  • an iPhone SE (1st gen)
  • a moto g(7)

I used all three devices in the test.

To load our app onto the devices I used USB links to my development machine and installed the app using the flutter run command.

To load the exercise tracking app onto the devices I visited the relevant App Store pages and installed the app.

Procedure

There were several concerns which shaped my testing procedure:

  1. I needed to compare the results I generated against each other.
    • To generate a collection of datasets which could be meaningfully compared, I would need to track my location while repeatedly walking an identical route.
  2. A known issue was causing early test results to show the output from the moto g(7) as approximately 60 ft below what should have been recorded.
    • To determine the exact offset I would need to gather results from a route with varied elevation, so that the elevation signal would stand out from any low-level noise.
  3. There was a danger of a device battery running out of charge part-way through the test.
    • I needed to mitigate the possibility of gathering much more data with one app than with the other, so I would need to alternate which app I was testing on each lap.

These concerns informed my search for a test location. My priorities for the test location were for it to be:

  • near public transit for ease of access
  • a ~0.5km loop that I could walk repeatedly in a reasonable amount of time
  • a minimum of 10 meters of elevation change (which would create datasets from which I could determine the Android elevation offset)
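
The elevation offset mentioned above could be estimated roughly as follows. This is a minimal Python sketch, not the analysis actually used in this test. It assumes two tracks were recorded at the same time on different devices, and that each track has been exported as a time-sorted list of (seconds, elevation in meters) pairs. The function names `elevation_at` and `estimate_offset` are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def elevation_at(track, t):
    """Linearly interpolate elevation at time t.

    track is assumed to be a list of (seconds, meters) pairs with
    strictly increasing timestamps. Returns None outside the recording.
    """
    times = [point[0] for point in track]
    i = bisect_left(times, t)
    if i < len(track) and times[i] == t:
        return track[i][1]
    if i == 0 or i == len(track):
        return None  # t lies outside the recorded interval
    (t0, e0), (t1, e1) = track[i - 1], track[i]
    return e0 + (e1 - e0) * (t - t0) / (t1 - t0)


def estimate_offset(reference, test):
    """Median of (test - reference) elevation over the overlapping time span."""
    diffs = []
    for t, elevation in test:
        ref = elevation_at(reference, t)
        if ref is not None:
            diffs.append(elevation - ref)
    if not diffs:
        raise ValueError("tracks do not overlap in time")
    return median(diffs)
```

The median is used instead of the mean so that a few noisy samples do not dominate the estimate. A route with real elevation change helps here because a constant offset then shows up as a consistent vertical shift of the whole profile and not as something indistinguishable from sensor jitter.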

I eventually settled on this route in Balboa Park (I currently live in San Diego, CA):

Route walked for test in Balboa Park

To comparatively test both apps, I planned to walk the route 10 total times: 5 times running each app, exporting the results after each complete lap.

Results

In the end, I generated the following quantity of location datasets:

Device       Exercise Tracking App   Evac. Drill App
iPhone 11    1                       1
iPhone SE    2                       3
moto g(7)    3                       3

Number of datasets generated, by device and app.

Upon returning to my development machine I compiled the results from both apps and began to visually inspect the GPS data. I generated several comparative displays, the most interesting of which I have included below.

There is clearly something awry with our telemetry!
These results look ok, but there is a small error in our app's data. I only gathered one dataset for each app, as this phone was my bus pass home and the battery was nearly dead!
Here we see clear errors in our app's data.
Wow, here we can see our app's data is really wacky. But the exercise app's data is a bit wonky as well…

I then extracted the elevation data and created elevation vs. time plots for each device:

Our data is less smooth, but generally looks ok…
A false-start trial is evident in this view (in green)
The level of noise in this data was completely unexpected, but it was replicated in later trials.

These results led to the following conclusions:

Conclusions

  1. Our app needs an “Acquiring GPS Signal” feature to filter out initial noise.

The exercise tracking app employs such a feature, and its absence from our app is evidently the reason for the erroneous GPS data at the start of the iPhone SE datasets. The feature will need to start streaming location data and determine when the signal has settled, using a decaying timer that resets whenever a data point differs significantly from the previous one.
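
The resetting timer described above can be sketched as follows. This is an illustrative Python version applied to an already recorded track, not the Dart code the app would need, and in the app the same logic would run on the live location stream. The function names and the thresholds `max_jump_m` and `settle_s` are assumptions chosen for the example, not measured values.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def first_settled_index(fixes, max_jump_m=15.0, settle_s=10.0):
    """Return the index of the first fix at which the signal counts as settled.

    fixes is a list of (seconds, latitude, longitude) tuples. The timer
    restarts whenever a fix jumps more than max_jump_m from the previous
    fix, and the signal is settled once settle_s seconds pass without a
    restart. Returns None if the track never settles.
    """
    if not fixes:
        return None
    stable_since = fixes[0][0]
    for i in range(1, len(fixes)):
        t, lat, lon = fixes[i]
        _, prev_lat, prev_lon = fixes[i - 1]
        if haversine_m(prev_lat, prev_lon, lat, lon) > max_jump_m:
            stable_since = t  # large jump: restart the timer
        elif t - stable_since >= settle_s:
            return i
    return None
```

Fixes before the returned index would be discarded, or hidden behind an "Acquiring GPS Signal" message, and recording would start from that index onward.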

  2. Noise in GPS data varied more by device than by app.

While the trajectory smoothing algorithms do set the exercise app results apart, this difference is smaller than the difference between results from different devices. This is significant because it means that the data our app generates is worth using for the intended application of Civil Engineering Evacuation Traffic research.
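
The comparison above was made by visual inspection. One way it could be put into numbers is to score each dataset by how far its elevation samples stray from a smoothed version of themselves. The following Python sketch assumes each dataset has been reduced to a plain list of elevation values sampled at a roughly constant rate. The metric, the function names, and the window size are illustrative assumptions, not part of the original analysis.

```python
from math import sqrt


def moving_average(values, window=5):
    """Centered moving average; the window is truncated at the ends of the list."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def noise_rms(values, window=5):
    """Root-mean-square deviation of the samples from their moving average."""
    if not values:
        raise ValueError("need at least one sample")
    smoothed = moving_average(values, window)
    squared = sum((v - s) ** 2 for v, s in zip(values, smoothed))
    return sqrt(squared / len(values))
```

Computing `noise_rms` for every dataset and then grouping the scores once by device and once by app would show which grouping has the larger spread. Datasets sampled at very different rates would need resampling first for the scores to be comparable.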

  3. Researchers may need to calculate elevation data from GPS data in post-processing if the device altimeter outputs poor-quality data.

The exercise app calculates elevation data from GPS data if the device lacks a barometric altimeter. Researchers using our evacuation drill app can perform the same calculation when post-processing the datasets. However, this will take time and money. Future development teams may choose to implement an automatic post-processing feature to reduce the time spent performing these corrections.


Upon being presented with these test results our Project Partners deemed our evacuation drill app's Location Telemetry performance to be adequate for their research needs.

Subsequent testing will need to be performed when the recommended features are added, and on a regular basis as app development continues to ensure that unrelated features do not corrupt this functionality.