In The Deep Rough - Lessons Learned From Logging On The Golf Course

March 6, 2024 · 8 min read

Kevin is a seasoned Android engineer and now an engineering manager, brings 13 years of experience in the IoT realm, focusing on developing innovative software for devices such as hearing aids, irrigation systems, and golf sensors. Throughout his career, Kevin has extensively utilized cross-platform technologies like Xamarin, Kotlin Multiplatform, and Rust. Kevin leads by example, inspiring his team through dedication and a hands-on approach, ensuring they achieve excellence together.

Lessons Learned From Logging

Intro

Writing software can be extremely complex. This complexity can sometimes make finding and resolving issues incredibly challenging. At Arccos Golf, we often run into these kinds of problems. One of the unique things about golf is that a round of golf is like a snowflake or a fingerprint, no two are alike. A golfer could play the same course every day without ever replicating a round identically.

Thus, trying to write software rules about the golf domain inevitably leads to bugs. And since no two rounds of golf are the same, trying to reproduce a user issue they encounter on the golf course is nearly impossible. So, what have we done to attempt to track down some of these issues? You guessed it, logging.

Logging has proven to be an indispensable tool in my workflow, especially when developing new features. This article aims to guide you through key questions that shape a successful logging strategy along with some considerations around performance. Concluding with practical insights, it features a few case studies from Arccos Golf, demonstrating how logging has been instrumental in resolving real-world bugs.

Figure 1: The Arccos app showing several shot detection modes available (Phone & Link)

Should you log?

When trying to track down a bug or build a new feature, and you're considering logging, the first question to ask yourself is, “Is logging the right choice?”. There are many things to consider when deciding whether to add logging to a particular feature. Some of those considerations are:

Do I have any other tools at my disposal that could be useful if this feature fails? This could be an analytics platform that shows screenviews, and perhaps logs metadata in another way besides traditional logging.
Will adding logging harm the user in any way? This includes privacy, security, and performance.
How will I actually get or view these logs if something goes wrong?

What to log?

Logging should be strategic, focusing on areas that yield the most significant insights.

Identifying Critical Workflows: Determine which parts of your app are crucial for both users and your business. For instance, in a finance app, logging transaction processes is key.
Focusing on Error-Prone Areas: Analyze past incidents and pinpoint sections of your app that are more susceptible to errors. For example, areas with complex database interactions or integrations with 3rd party SDKs might require more intensive logging.

What About Performance?

One of the primary challenges with logging is its impact on performance, a concern that becomes more pronounced when dealing with extensive string creation. To mitigate this, consider the following tips:

Method Calls in Logs: Be wary of incorporating method calls within your log statements. These methods can be opaque, masking the complexity or time-consuming operations they perform internally.
Log Sparingly: Practice judicious logging. Over-logging, particularly in loops, can severely degrade performance. Log only what is essential for debugging or monitoring.
Asynchronous Logging: If your logging involves file operations or third-party libraries, always ensure that these tasks are executed on a background thread, thus preserving the main thread's responsiveness and application performance.

Implementing these strategies will help you strike a balance between obtaining valuable insights from logs and maintaining optimal application performance. I have found that you develop an intuition about what to log the more you practice and learn about the intricacies of your system.

How Do I Access The Logs?

The most straightforward and easiest method to access your applications logs is utilizing a third-party software tool like Shipbook, which offers the convenience of remote, real-time access to your logs.

Finally, I wanted to showcase a few stories illustrating how logging has helped us solve real-world production issues, along with some lessons learned about logging performance.

The 15-Minute Mystery

Our Android mobile app faced an intriguing issue. We noticed conflicting user feedback reports: one showed normal satisfaction, while another indicated a significant drop. The key difference? The latter report included golf rounds shorter than 15 minutes.

Upon investigating these brief rounds, we found that their feedback was much lower than usual. But why? There were no clear patterns related to device type or OS.

The trail of breadcrumbs started when we examined user comments on their rounds, many of which mentioned, "No shots were detected." Diving into the logs of these short rounds, a pattern quickly emerged. We repeatedly saw this line in the logs:

    [2023-12-01 14:20:09.322] DEBUG: Shot detected with ID: XXX but no user location was found at given shot time

This means that we detected a user took a golf shot but we didn’t know where they were on earth to place the shot at a particular location. This was unusual because we had seen log lines like this in our location provider which requests the phones GPS location:

    [2023-12-01 14:20:08.983] VERBOSE: Received GPS location from system with valid coordinates

So, we were clearly receiving location updates at regular intervals but we couldn’t associate them when a shot was taken by the user. After some further analysis, we discovered this line:

    [2023-12-01 14:20:09.321] VERBOSE: Attempting to locate location for timestamp XXX but requested source: “link” history is empty. Current counts: [“phone”:60, “link”:0]

We have a layer above our location providers that handles serving locations depending on which method the user selected for shot detection mode (either their Phone or their external hardware device “Link”). It was attempting to find a location for “Link” even though all of these rounds should have been in phone shot detection mode. Finally, we located this log line:

    [2023-12-01 14:14:33.455] DEBUG: Starting new round with ID: XXX and shot detection mode: Link … { metadata: { “linkConnected”: false, linkFirmwareVersion: null }... }

Once we analyzed this log line it became immediately obvious - The app was starting the round with the incorrect shot detection mode. Some rounds were started with shot detection mode of Link even if the Phone was selected in the UI (Figure 2).

Figure 2: The Arccos app showing a round of golf being played and tracked

We eventually identified the issue and it was due to some changes in our upgrade pathing code if users had certain firmware and prior generations of our Link product. Thankfully, this build was early in its incremental rollout and we were able to patch it quickly.

This experience highlighted the crucial role of widespread effective logging in mobile app development. It allowed us to quickly identify and fix an issue, reinforcing the importance of comprehensive testing and attentive log analysis.

When Too Much Detail Backfires

Dealing with hardware is especially difficult given you can rarely easily get information off of the hardware device. We often rely on verbose logging during the development phase to diagnose communication issues between hardware and software. This approach seemed foolproof as we added a new feature to our app, implementing detailed logging to capture every byte of data exchanged with the hardware of our new Link Pro product. In the controlled environment of our office, everything functioned seamlessly in our iOS app.

While on the course testing our app faced an unforeseen adversary: it began to get killed by the operating system. The culprit? Excessive CPU usage. Our iOS engineer, armed with profiling tools, discovered a significant CPU spike during data sync with the external device. Our initial assumption was straightforward – perhaps we were syncing too much data too quickly.

To test this theory, we modified the app to sync data less aggressively. This change did reduce app terminations, but it was a compromise, not a solution. We wanted to offer our users real-time experience without interruptions. Digging deeper into the profiling data, we uncovered the true source of our problem. It wasn't the Bluetooth communication overloading the CPU; it was our own verbose logging.

The moment we disabled this extensive logging, the CPU usage dropped dramatically, bringing it back to acceptable levels. This incident was a stark reminder of how even well-intentioned features, like detailed logging, can have unintended consequences on app performance. We decided to use a remote feature flag paired with a developer setting to be able to toggle detailed verbose logging of the complete data transfer only when necessary.

Through this experience, we learned a valuable lesson: the importance of balancing the need for detailed information with the impact on app performance. In the world of mobile app development, sometimes less is more. This insight not only helped us optimize our Link Pro product but also shaped our approach to future feature development, ensuring that we maintain the delicate balance between functionality and efficiency.

Afterword

In conclusion, our experiences at Arccos Golf have demonstrated the invaluable role of logging in software development. Through it, we’ve successfully navigated the complexities of writing golf software, turning unpredictable challenges into opportunities for improvement. Tools like Shipbook have been instrumental in this journey, offering the ease and flexibility for effective log management. I hope I’ve illustrated that logging is more than just a troubleshooting tool; it's a crucial aspect of understanding and enhancing application performance and user experience.

Intro​

Should you log?​

What to log?​

What About Performance?​

How Do I Access The Logs?​

The 15-Minute Mystery​

When Too Much Detail Backfires​

Afterword​