{"id":3911,"date":"2016-05-13T22:48:18","date_gmt":"2016-05-14T04:48:18","guid":{"rendered":"http:\/\/augerhandle.net\/blogs\/jumpingfish\/?p=3911"},"modified":"2016-05-13T22:54:47","modified_gmt":"2016-05-14T04:54:47","slug":"two-days-in-a-row","status":"publish","type":"post","link":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/2016\/05\/13\/two-days-in-a-row\/","title":{"rendered":"Two Days In A Row"},"content":{"rendered":"<h3>1. Thursday<\/h3>\n<p>We had tested the logic. We made some changes and tested again. And when we deployed into production, we tested again, pushing a few small datasets thru the route to make sure that everything worked as expected, which it did.<\/p>\n<p>\u201cLet me give you a complete data dump now,\u201d Vyas said.<\/p>\n<p>And we ran it thru the system.<\/p>\n<p><em>\u201cThe route is processing the input files,\u201d<\/em> I reported. And then a few moments later, <em>\u201cIt\u2019s generating the output files.\u201d<\/em> And then finally, after a few more moments, <em>\u201cThe output files are being picked up by the listener.&#8221;<\/em><\/p>\n<p>After all the output files were picked up, we waited a moment, and then he confirmed, \u201cI got the data in our system.&#8221;<\/p>\n<p>But a few minutes later someone chimed in on one of our Skype channels, \u201cWe\u2019re getting a bunch of bogus messages without a time tag.\u201d And soon after that, there was a cascade of automated notifications and alarms sent out by email.<\/p>\n<p>Although in our post mortem we weren\u2019t so sure that those alarms were related to our errors, and although the root cause of the problem was the format of the input files, it\u2019s indisputable that it was the execution of my code that unleashed those furies.<\/p>\n<p>\u201cSorry guys,\u201d I later said.<\/p>\n<p>\u201cTomorrow,\u201d\u00a0Vyas\u00a0said, \u201cwe\u2019ll turn the system on.&#8221;<\/p>\n<h3>2. Friday<\/h3>\n<p>The next morning, I Skyped Vyas my plan. 
&#8220;I\u2019ll manually process a few of the oldest files. If they run ok, then we can turn the system on.&#8221;<\/p>\n<p>\u201cAwesome,\u201d he said.<\/p>\n<p>Moments later I was again reporting the progress.<\/p>\n<p><em>\u201cThe route is processing the input files,\u201d<\/em> I said. And then, <em>\u201cIt\u2019s generating the output files.\u201d<\/em> And finally, <em>\u201cThe output files are being picked up by the listener.\u201d<\/em> (Sound familiar?)<\/p>\n<p>This time, I could see the results showing up in the output queue. The message count kept rising. I kept watching. The curve kept going up. As the count reached 1200, it was clear that the worker was not pulling anything out of the queue.<\/p>\n<p>\u201cHmm\u2026\u201d Vyas said. \u201cI\u2019m seeing empty payloads.&#8221;<\/p>\n<p>There was again an error of some sort, but worse, this one was blocking all incoming data from any customers.<\/p>\n<p>It was Friday afternoon. The room was dark and mostly empty. Being relatively new to the team, I was woefully unequipped to debug the problem. Fortunately, there were a few generous souls still hanging around.<\/p>\n<p>After about two hours of spelunking, we came up with a workaround. An hour later, sitting alone in the darkness and quiet, I put the final touches on a trouble ticket for the outage. I had also come up with a credible explanation of the root cause which, again, absolved my code of responsibility.<\/p>\n<p>But absolved or not, the indisputable fact remains: my stuff broke things in production two days in a row.<\/p>\n<p>TGIF!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. Thursday We had tested the logic. We made some changes and tested again. And when we deployed into production, we tested again, pushing a few small datasets thru the route to make sure that everything worked as expected, which it did. \u201cLet me give you a complete data dump now,\u201d Vyas said. 
And we [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/posts\/3911"}],"collection":[{"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/comments?post=3911"}],"version-history":[{"count":5,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/posts\/3911\/revisions"}],"predecessor-version":[{"id":3916,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/posts\/3911\/revisions\/3916"}],"wp:attachment":[{"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/media?parent=3911"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/categories?post=3911"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/augerhandle.net\/blogs\/jumpingfish\/wp-json\/wp\/v2\/tags?post=3911"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}