If you read our blog update about our time at Global EMC doing our EMC and safety testing, you may recall we’d noticed something strange: our new firmware seemed a little unstable (it stopped “broadcasting”). We’d thought it was a software bug and have spent an ungodly amount of effort trying to trace what’s going on, which has been exacerbated by the holiday and flu season.
To help track down the issue, Lorenzo had written some neat code that implements a watchdog timer (a way of ‘rebooting’ our devices when they become unresponsive), and this also sets the LED to be constantly lit red if this happens, as a quick visual indicator.
Something I noticed last weekend was that I could consistently make this LED go red by rapidly changing the ambient temperature the device was exposed to (it was about -15°C here at the time, so it was easy: I wander around with a Wimoto Climate in my backpack all the time, and going from outside to inside was enough to trigger the issue).
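For readers who haven’t met a watchdog timer before, here is a rough sketch of the idea in Python. It is a toy simulation, not Lorenzo’s firmware: the `Watchdog` class, the `run` helper and the five-tick timeout are all made up for illustration.

```python
class Watchdog:
    """Toy model of a watchdog timer (hypothetical, not the real firmware)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks  # assumed timeout, in ticks
        self.counter = 0
        self.led_red = False  # latched once the watchdog has fired

    def feed(self):
        """Called by healthy firmware to show it is still responsive."""
        self.counter = 0

    def tick(self):
        """Advance one tick; return True if the watchdog fired (device reset)."""
        self.counter += 1
        if self.counter >= self.timeout_ticks:
            self.counter = 0
            self.led_red = True
            return True
        return False


def run(ticks, hang_at=None, timeout_ticks=5):
    """Simulate firmware that feeds the watchdog until it hangs at `hang_at`."""
    wd = Watchdog(timeout_ticks)
    hung = False
    resets = []
    for t in range(ticks):
        if t == hang_at:
            hung = True  # firmware becomes unresponsive
        if wd.tick():
            hung = False  # the reset reboots the firmware
            resets.append(t)
        if not hung:
            wd.feed()
    return wd, resets
```

Calling `run(20, hang_at=10)` models a device that locks up at tick 10: nothing feeds the watchdog after that, so it fires once the timeout elapses, the device restarts, and the red LED flag stays set as the visual clue.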
We had been working with our silicon partner, Nordic Semiconductor, since before Christmas trying to figure out why the devices would randomly, spontaneously reset, and at least now we had a working theory as to what was causing the reset (rapid temperature changes). We set up some A-B test environments in the office (using our office fridge’s freezer and a ceramic heater), and we were able to prove consistently that fast temperature variations would trigger the watchdog. More interestingly, older versions of our firmware and even our development kits did not exhibit the issue, so the plot thickened.
Armed with the above information, we’ve been working with Nordic’s great support team (first their tech support in Norway, and now one of their Application Engineers in California) on a fix. It turns out that a constellation of factors (silicon revision, SDK version and Bluetooth stack version) led to the low-frequency clock becoming unstable. Because the low-frequency clock is onboard the chip and is an RC (Resistor-Capacitor) oscillator, it’s temperature sensitive. Under normal circumstances, the main system clock (HFCLK) periodically wakes up and “recalibrates” the low-frequency clock. In newer versions of the SDK it does this based on temperature changes at the silicon die level: it takes 35 microseconds to measure the temperature of the die and 17 milliseconds to recalibrate the clock, so checking the temperature first and recalibrating only when it has changed is more power friendly.
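To put rough numbers on why the temperature check is the cheaper route, here is a small Python model. Only the 35-microsecond and 17-millisecond figures come from Nordic; the `busy_time` function, the 0.5°C threshold and the idea of a fixed list of wake-ups are our own assumptions for illustration, and this is not how the SDK is actually written.

```python
TEMP_MEASURE_S = 35e-6  # from above: time to read the die temperature
CALIBRATE_S = 17e-3     # from above: time to recalibrate the low-frequency clock


def busy_time(temps_c, threshold_c=0.5, temperature_triggered=True):
    """Return (calibrations, seconds_busy) over a series of wake-ups.

    temps_c holds the die temperature seen at each wake-up. The threshold
    is a hypothetical value, not a figure from Nordic.
    """
    calibrations = 0
    busy = 0.0
    last_cal_temp = None
    for temp in temps_c:
        if temperature_triggered:
            busy += TEMP_MEASURE_S  # always pay for the cheap measurement
            unchanged = (
                last_cal_temp is not None
                and abs(temp - last_cal_temp) < threshold_c
            )
            if unchanged:
                continue  # temperature is stable, skip the expensive step
            last_cal_temp = temp
        calibrations += 1
        busy += CALIBRATE_S
    return calibrations, busy
```

With ten wake-ups at a steady 20°C, the temperature-triggered version calibrates once and is busy for about 17.35 ms, versus 170 ms if it recalibrated every time. The measurement itself is roughly 485 times shorter than a calibration (17 ms / 35 µs). A device going quickly from indoors to -15°C would see the temperature move at nearly every wake-up and would recalibrate far more often.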
Nordic’s initial advice was to use a later revision of the silicon, which isn’t really an option when you have thousands of Wimotos already made and thousands more chips sitting at your contract manufacturer, although we’ll be using a later silicon revision in later builds. Another option would be to use an external low-frequency clock, and we’ve already designed that into later hardware revisions, although the on-chip one is expected to work too.
We were going to start shipping last week but have suspended shipping anything until we get 100% clarity on this firmware issue. Why is it a big issue, even if it happens intermittently or randomly? Well, it upsets the data-logging capability of our devices and also currently clears any set alarms, although we can work around the alarm issue by persisting them to local Flash memory if we have to. Whilst the data-logger is not necessarily a feature everyone will use, it is an important feature and one we promised we’d deliver. It also uses the realtime clock, although we haven’t implemented the “this alarm happened at this date/time” feature in the apps yet.
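As an illustration of the alarm workaround, here is a minimal Python sketch of how alarm settings could be packed into a small record with a checksum, so they can be restored after a reset and a corrupted record can be spotted. The record layout (low threshold, high threshold, enabled flag) and the function names are hypothetical; our real firmware is not written in Python and may store alarms differently.

```python
import struct
import zlib

# Hypothetical record layout: low threshold, high threshold, enabled flag.
_FMT = "<ffB"
_SIZE = struct.calcsize(_FMT)


def pack_alarm(low, high, enabled):
    """Serialise an alarm and append a CRC32 so corruption can be detected."""
    payload = struct.pack(_FMT, low, high, 1 if enabled else 0)
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_alarm(blob):
    """Return (low, high, enabled), or None if the record is invalid."""
    if len(blob) != _SIZE + 4:
        return None
    payload = bytes(blob[:_SIZE])
    (crc,) = struct.unpack("<I", bytes(blob[_SIZE:]))
    if zlib.crc32(payload) != crc:
        return None  # record damaged, e.g. a write interrupted by a reset
    low, high, enabled = struct.unpack(_FMT, payload)
    return low, high, bool(enabled)
```

After a watchdog reset, the device would read the record back with `unpack_alarm` and re-arm the alarm only if the checksum matches.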
We have a number of options as things stand today:
(1) Find a software workaround in the current firmware. This is our preferred option, of course, and we’re somewhat hopeful this can be done (with no major downsides). We’re a little bit at the mercy of Nordic in this respect as it’s their silicon and they have the expertise to fix this. However, our Application Engineer is working with us today to try and find a solution. I’m keeping my fingers crossed that either today or Monday, we’ll have a fix, and now that we know the nature of the problem, it’ll be very quick to test and then burn this firmware to boards and get some out of the door! This is “Plan A”.
(2) Ship our old firmware. This still works with our apps, but we lose what we call “concurrent broadcast and peripheral mode”. In simple terms, this means that we lose the ability to work with both an iPhone and a gateway (‘Mesh’ with a capital ‘M’) at the same time, and it also limits our next feature, meshing (‘mesh’ with a small ‘m’). We also have to test that our over-the-air firmware updates and boot loader will still function properly, so Lorenzo will do that today. This is currently “Plan B”.
(3) Throw away ~2,000 Climates and ~300 Sentrys and make new ones. This is actually an option we’ve considered, even though it’s incredibly expensive (tens of thousands of dollars). One of the challenges, though, is replacing the existing system-on-a-chip with a revision that has newer Nordic nRF51822 silicon in it. The semiconductor industry works very slowly and we’re looking at a 90-day “factory lead-time” as the SoC is built for us, which isn’t really a viable option. Time is actually a bigger deterrent to us at this point than money.
We’ll provide another update on Monday (or sooner) regarding this as it’s a critical path issue. We’ll also make a decision one way or the other on Monday the 19th.
Test Flight Feedback
We had fifteen or so people sign up to play with our iOS app via Test Flight, plus a couple of people not running iOS8 to whom we’ve gotten the app manually (our app runs on iOS7; Apple’s Test Flight is an iOS8-only feature, but we have a procedure to get the app to iOS7 folk for beta-testing purposes).
The biggest and loudest feedback was that the “hamburger” menus were confusing in Build 205, so in Build 206 we’ve:
– Removed the hamburger menu icons at the top of the screen
– Expanded the “wave” area at the bottom of the screen
– Built a full button bar in the “wave” area that has buttons for the drawer menus that contain the list of Wimotos you have (right drawer), and the settings and other information (left drawer)
– Added a quick shortcut to add additional Wimotos directly from the list of Wimotos in the right drawer menu
– Generally tightened-up the UI and UX
Build 207 will be the first release in the App Store, which should happen next week (the approval counter unfortunately resets with every new build). Build 208 adds an app walkthrough, which we think is also important, and that should be the first “App Update” push if we can’t sneak it into the Apple review process.
Build 100 of the Android app should hit Google Play within the next few days. It’s still trailing the iOS app in terms of functionality, but it’s better that we get it out there and improve it rather than wait for it to reach parity and then release it. Build 200 will be the first build that mimics the iOS functionality and is currently 2-3 weeks away. Luckily there are no Google Play shenanigans to contend with for “appcessories” makers 😉