Another trip through the mirror

The problem

In our new application, we are integrating Logi Info, a reporting UI tool, to handle a page of interactive reports. The software is essentially aimed at doing the same thing we are, but it is easier to deal with and offers more advanced options out of the box since it’s a specialized tool rather than a coding framework. The integration relies on embedding their reports into our own Dashboard.

Logi proved to be a persistent thorn in our sides for some time. The tool was adopted throughout the company, and while other teams were getting it to work, we were hitting brick walls over and over, largely due to the complexity of our environment. The first issue was getting it to embed at all, which required a bit of finagling and a secondary, more complex method of embedding when the simpler static HTML embedding didn’t quite work out. After that we had to deal with the issue of getting the server to talk to our API. Our API uses a security token, and we had to transfer that token from the user’s local machine running JavaScript back to the Logi server sitting on our end so it could authenticate with the web API sitting on the same server. (Yes, we had to have the server send a token to the user so it could be sent back to us so the server could talk to itself. Programming is weird.) Logi has parameter passing and can be configured for HTTPS to allow for secure transport of the token, so it seemed like an easy task. Then came our protracted battle with the security token.

Left my golden token in my other jacket

In testing, we could pass a token to Logi, have it render a debug report to tell us what it received, and see the token being passed. Yet we were, seemingly at random, getting 403 errors for an incorrect token. In addition, Logi seemed to render stable reports, but when we tried to summon up debug information in the report, we kept getting the equivalent of 404 errors for missing debug files. We had a contractor working with us who had previously worked for Logi Analytics (who own Logi Info), but his focus was on the design aspect and he did not have a solution for us, even after talking to his contact in the company. As such, we set about trying to figure out the problem over the next couple of months.

This process was frustrated by several solutions that appeared to work temporarily but later showed the same seemingly random behavior of missing files or an inconsistently dropped token. We figured that the token might be getting dropped because it was stored in a temporary or semi-temporary variable on the server, so we had it pass the token to another variable in the user session. That worked for a day, then the problems started again. We noticed that the token was being passed correctly every time, but sometimes it did not show up in debug when the report was actually rendered, so another developer wrote a plugin that set the token value when the user session was created, right after it got the token and before the report began to render. This worked for a few days, then the problems appeared again. We thought it might be an issue with user sessions, since certain update information that we were passing to the server was not getting updated properly, so we put effort into invalidating sessions immediately and moving to a session-less mode. This is only quasi-possible in Logi anyway, and once again it did not solve things.

From hell’s heart I stab at thee

Around this time, Logi came back around to me as my primary task when another developer was pulled onto other pressing work. After a stare-down that lasted some time while I tried to wrap my head around what could be going on, I noticed an issue that seemed to explain our missing token. Sometimes when the browser made a report request it would get a response directly, and sometimes it would get a 302 redirect and then get a report. This seemed to correspond to when Logi would hiccup and lose information. After looking through the packets to see if this was significant, I noted that the 302 redirects seemed to include only some of the parameters we were passing to the report request. After a quick Wikipedia search, I found that when a POST is answered with a 302, browsers in practice follow the redirect with a GET, which drops the POST body and everything in it. The partial information that was being written to the new URL must have been Logi partially translating the POST for the redirect.
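To make the redirect behavior concrete, here is a minimal sketch in Python, not anything taken from Logi or our application. It starts a throwaway local server that answers a POST with a 302 carrying only one parameter in the new URL, then follows the redirect with urllib, which handles a 302 the way most browsers do. The handler, the /report path, and the reportId and token parameter names are all made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = []  # (method, path, body) for every request the server receives


class Handler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a report server that sometimes redirects."""

    def _record(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode() if length else ""
        seen.append((self.command, self.path, body))

    def do_POST(self):
        self._record()
        # Bounce the request, copying only one parameter into the new URL.
        self.send_response(302)
        self.send_header("Location", "/report?reportId=42")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        self._record()
        payload = b"report"
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    seen.clear()
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/report"
        # An empty ProxyHandler keeps the request on localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        opener.open(url, data=b"reportId=42&token=secret", timeout=5).read()
    finally:
        server.shutdown()
        server.server_close()
    return list(seen)


if __name__ == "__main__":
    for method, path, body in run_demo():
        print(method, path, repr(body))
```

The first request the server sees is the POST with the token in its body. The second is a GET to the redirect target with an empty body, so anything the server did not copy into the new URL, the token included, is gone.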

This seemed odd and looked like a problem with Logi itself. Since this behavior was buried in some DLL somewhere, I wouldn’t have much recourse but to file a ticket and see what happened. Being leery of posting tickets for only half-investigated issues (don’t you hate those?), I continued to look into this. After asking myself what could cause the website to want to redirect rather than just send back the data, I remembered that for us the application continuously loaded data slowly, while other departments reported that their reports would load slowly once and then cache and run more quickly. Figuring that a bad cache mechanism might be deleting some data and forcing the report to run again (and thus redirect when the original vanished), I delved into Logi’s cache files. Sure enough, the cache files were appearing during rendering and rapidly vanishing once the report rendered. The documentation I found noted that these temp files should stick around for at least 1 hour by default and be garbage collected only once every 5 minutes.
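I did this check by eye, but it could be scripted. The following is a rough sketch under stated assumptions: it polls a directory (the path is whatever your cache folder happens to be, nothing here is Logi-specific) and flags files that vanish sooner than the documented retention period. The function names are invented for this example, and the lifetimes are only as precise as the polling interval.

```python
import os
import time


def short_lived_files(snapshots, min_lifetime_s):
    """Find files that disappeared before min_lifetime_s had elapsed.

    snapshots: list of (timestamp, set_of_file_names) in time order.
    Returns {file_name: observed_lifetime_in_seconds}.
    """
    first_seen = {}
    flagged = {}
    previous = set()
    for ts, names in snapshots:
        for name in names - previous:
            first_seen[name] = ts  # file appeared in this snapshot
        for name in previous - names:
            lifetime = ts - first_seen[name]  # file is gone in this snapshot
            if lifetime < min_lifetime_s:
                flagged[name] = lifetime
        previous = set(names)
    return flagged


def poll_directory(path, interval_s=1.0, rounds=60):
    """Take a listing of `path` every interval_s seconds, `rounds` times."""
    snapshots = []
    for _ in range(rounds):
        snapshots.append((time.time(), set(os.listdir(path))))
        time.sleep(interval_s)
    return snapshots
```

With a one-hour retention period, something like short_lived_files(poll_directory(cache_dir), 3600) run while a report renders would list every temp file that was cleaned up early.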

This cache issue explained both our missing debug files (which would have been in the temp files that were erroneously deleted) and the mysterious 302s that were dropping our token and causing our 403s. I worked through some of IIS’s settings to see whether Windows was refusing to cache the files due to size restrictions, but none of the settings seemed to affect anything. At this point we submitted a ticket to Logi, but this time we had a specific problem that we could describe and had narrowed our issue down to one thing. Sure enough, the support team recognized the issue and directed us to a patch for a bug in the version of Windows our server was running. We applied the patch and we finally, finally fixed the issue.

In conclusion

I’m not sure what, if any, moral there is to this story, aside from the fact that sometimes chasing bugs is like having a serpent drag you through wormholes across the universe. Perhaps it’s just that these trips down the rabbit hole suck, but you’re guaranteed to learn something you never knew you wanted to know.