Wednesday, March 19, 2014

Why a "Startlingly Simple Theory" is so Startlingly Wrong

A gentleman by the name of Chris Goodfellow posted to Google+ his theory about what happened to Malaysia Airlines Flight 370.  This post was subsequently reposted on Wired.com and quoted by the Christian Science Monitor.  The Wired piece describes Mr. Goodfellow:
Chris Goodfellow has 20 years experience as a Canadian Class-1 instrumented-rated pilot for multi-engine planes.
There are so, so many things that are factually wrong with Goodfellow's piece that I honestly have to question if he is or ever was actually a pilot or has ever flown an aircraft himself. Regardless of his professional or technical experience with airplanes I cannot in good conscience leave the myriad falsehoods, misstatements, and faulty conjectures he made unchecked.

My background? I am a former US Air Force test pilot with over 5,000 hours in 35+ aircraft, including over 3,000 hours in heavy, multi-engine jets.  I have corroborated much of what's written here with a current 777 pilot.


A nose gear tire fire


The author asserts that a nose gear tire fire could have been the source of a fire onboard. The nose gear area is not pressurized on a 777, so a fire would vent smoke out of the wheel well, not into the aircraft.  

Assuming there was a fire, it would have to burn completely out of control to the point where it penetrated the pressure bulkhead, and that would have caused an explosive decompression which would in turn probably blow out the fire.  This all assumes that a fire could survive and grow in an unpressurized area like the nose gear well, which is not likely.

An overheated nose gear tire after takeoff is an almost unheard of event, especially on a commercial aircraft.  The temperature at takeoff that night was probably in the 80s, given the high in the mid-90s that day and the low of 75.  Those are not extreme conditions, so an overheated nose gear tire leading to a fire is even more exceptionally unlikely.


Declaring an emergency with ATC


Ignoring the source of the fire, if the pilots were acting rationally and thought for even a second that they had an electrical fire, or a fire of any sort, they would have advised Air Traffic Control (ATC) of their situation and declared an emergency.  Getting on the ground quickly is pretty much the only thing you care about when you have an inflight fire.  Time is life with a fire. 

The pilots would have communicated their problem to ATC because that's what pilots are trained to do, and they can lend a helping hand in situations like the one proposed.  

Telling ATC that you have an emergency will make you their top priority over all other aircraft, get you immediate radar vectors to a landing field if you like, have them clear other aircraft out of your way, and alert the field you're heading to so they can get the fire trucks rolling.  All competent commercial pilots know this and act accordingly.  But no emergency radio calls were made by MH370.


Oxygen masks and smoke goggles


With a fire or any type of smoke in the aircraft the pilots also would have immediately donned their oxygen masks (with goggles or face masks so they could still see) to prevent incapacitation and allow them to start running the emergency checklists.  

In fact, getting your oxygen mask on is probably the first critical action or "boxed" step (or "boldface" as military pilots would say) when handling smoke in an aircraft inflight.

This isn't a "no-no", as the author suggests, and his statement as such would make any professional pilot question his credentials.  Utterly to the contrary, this is critical to a pilot's immediate survival and ability to deal with the emergency, and should be reflexive with well-trained pilots.  There is a more than ample supply of oxygen for the pilots in such circumstances.  

The author mentions carrying a "smoke hood" in his briefcase.  These devices typically have a reflective mylar hood and a small, self-contained oxygen generator attached to them.  They are designed for crew members to be able to walk around an aircraft performing emergency duties when there's smoke.  They usually contain only a few minutes worth of oxygen, do not typically protect the wearer when the aircraft is depressurized, and are not the type of equipment a commercial airline pilot would use in the cockpit.  


Handling an electrical fire


Turning various parts of the aircraft electrical system off to isolate and remove power from the source of the fire is what the checklist guides you through, and the pilots would have practiced this many times in the simulator.   

The checklist is designed to keep essential electrical components powered to the maximum extent possible, components such as the flight instruments (so you can fly the aircraft), the VHF radio (so you can communicate), and the transponder (so ATC can track you and help you).  

Perhaps the author's experience is limited to small aircraft that have a single "master power" switch that turns everything off, but that's not how a 777's electrical system is designed or how it's managed in an emergency.  Even with a large majority of the aircraft's electrical system depowered certain key components would still operate on battery backup, and I venture to guess that includes at least one radio.


"Aviate, navigate, and communicate"


If the pilots had a fire that didn't immediately kill them they undoubtedly would have run these emergency procedures, and they take some time, more than enough time to at least make a radio call.  

Even if the emergency checklist had them turn off power to the transponder and all radios, this would most likely be called out in the checklist and in recurrent training, giving them an opportunity to preemptively make a radio call prior to losing all communication.  

And for that matter, if they needed to, they could repower the system long enough to squeeze off a "Mayday" call if they forgot to do so beforehand.

Alternatively, they could have set the transponder to the Emergency setting, which replies with a special code that tells ATC you're having an emergency, even if you can't talk via the radio.  If the pilots had an emergency they could have alerted ATC, but this simply didn't happen.


How NOT to handle a fire


The author makes a point about how supremely experienced the captain of the aircraft was.  If he'd had an electrical fire a pilot with his level of airmanship and experience would have executed the checklist, communicated with ATC, and probably initiated a rapid descent.  But that's not what the author suggests happened.

He suggests that the pilots may have climbed to 45,000 feet to "quell a fire."  Besides the fact that this is well above the maximum service ceiling of a 777 (not "at the top of its operational ceiling"), no experienced, competent pilot would ever do that in this situation, and it wouldn't help put out an electrical fire anyway.

In order to take advantage of the "lowest level of oxygen" at 45,000 feet the pilots would have had to depressurize the aircraft, further endangering all their passengers and increasing the likelihood that they, too, would become hypoxic and incapacitated.  They would have been compounding their problems.  

About this the author says:
"That is an acceptable scenario."
No, dammit, it's not.  It is the MOST unacceptable scenario imaginable.  This phrase alone probably gobsmacked every pilot who read this piece.  It is inconceivable that these pilots would have climbed to try to put out a fire, and stating that as fact seriously calls into question the author's credibility.

To the contrary, Mr. Goodfellow, those pilots would have been descending like a bat out of hell to land if they had a serious fire, not climbing to "quell a fire".  And you can't dive at high speed to extinguish a fire that's inside the aircraft.  

I will give Mr. Goodfellow credit for finally stating the obvious near the end of the article, though it's completely at odds with the previous "acceptable scenario" statement:  
"Fire in an aircraft demands one thing: Get the machine on the ground as soon as possible."
Unfortunately none of what the aircraft apparently did in terms of climbs, descents, and turns, is consistent with competent crew actions in fighting an inflight fire.  

But for argument's sake let's assume there was a fire and this crew was so incompetent that they did climb to 45,000 feet and then dive to 25,000 feet to try to put it out.


Continued flight after a fire 


If the aircraft suffered a catastrophic electrical fire severe enough to kill or incapacitate the pilots and immediately disable all communication systems, it most likely would have caused other aircraft systems that are required for flight to fail as well, including the autopilot.  And it sure wouldn't have been sending ACARS pings to satellites the entire time, either.

The aircraft would not have continued flying for several more hours; it would have crashed much, much sooner.

Look to any of the recent inflight fires on cargo aircraft for examples of how quickly airplanes come down when they have uncontained fires, even with pilots wearing oxygen masks still at the controls.  Mr. Goodfellow mentions similar situations near the end of the post but they run contrary to nearly everything else he proposed.

The author suggests the aircraft would have crashed when the fire "destroyed the control surfaces."  How an electrical fire near the cockpit could propagate through the aircraft structure, leaving it intact and the airplane still flying, yet somehow destroy the metal or composite control surfaces exposed to the slipstream on the backs of the wings, horizontal stabilizers, and vertical tail must be an amazing tale.  Perhaps he'll cover his hypothesis on that in a follow-up piece.


Choosing an emergency airfield


The author indicates that this entire theory came to him when he saw on the news that the aircraft made a left turn after its last communication with ATC.  From this vague and imprecise information he concluded, in fact, "instinctively knew" that the pilot was heading to another airport due to an emergency.  That is pretty fantastic intuition.

He describes how pilots are "drilled to know what is the closest airport of safe harbor" while in flight.  While this may be true of a single-engine Cessna tooling along at 90 knots and 5,000 feet, he overstates the case for pilots operating a modern jet airliner at 37,000 feet and 500 knots groundspeed.  The graphical moving maps in modern jets do a nice job of displaying airports near an aircraft, relieving the pilots of this impractical task.

I'd much rather my pilots be drilled on and instinctively know to put on their oxygen masks, declare an emergency, and descend if they have a fire.


Direct-To


If they needed to find the closest suitable airport for landing they wouldn't have to rely on their memory -- they would use the flight management system (FMS).  With a button press or two they'd have a list of the nearest airports, in order from closest to farthest, the distance and direction to each one, and possibly the current weather as well.  

If the autopilot was functioning they could instantly steer "direct-to" that airport via the FMS and not even worry about turning the aircraft themselves.

But he proposes they chose to fly to Langkawi, an airport fully on the opposite side of the Malay Peninsula.  That airport was almost certainly not the closest suitable field to them at the time of the supposed fire, so they would have had to have either intentionally selected a more distant airport via the FMS or manually flown the aircraft towards Langkawi.

This makes no sense and is again antithetical to the training commercial pilots receive on how to handle emergencies.  The automation in the aircraft would have guided them to a different field.  Time is life, get the jet on the ground.

The aircraft actually overflew a very suitable emergency field as it climbed to its cruising altitude. Sultan Mahmud Airport, Kuala Terengganu (airport code WMKN) was directly behind them on the east coast of Malaysia.  

It has an 11,400' runway with an overwater approach, an Instrument Landing System, and would have been half-again closer than the airfield at the crux of the author's theory.  The pilots could have gotten there quickly without overflying any land mass at all.

ATC could have advised them of the same information regarding nearest airports in short order if the FMS was unavailable or the pilots didn't have time to operate it.  Barring the runway being closed or otherwise unavailable at WMKN this would have been a much better, closer option than the canard suggested by the author. 


Final thoughts


That there was an airport with a 13,000 foot runway more or less in line with the last known direction of flight is hardly conclusive.  Furthermore, the facts at hand are completely at odds with the actions professional pilots would have been expected to take in the scenario Mr. Goodfellow proposed.

As an experienced pilot myself I can say that this is one of the most bizarre mysteries involving airplanes I've ever witnessed.  Perhaps some day we'll know what actually happened; then again, perhaps we won't.

However, in the vacuum of hard facts and evidence before us, it is reckless and arrogant to so confidently proffer "the answer" as this author has, especially when there is no solid evidence to support his hypothesis, when what he does suggest is riddled with factual and procedural errors and terrible lapses of airmanship and judgment, and when not a single one of the dozens of demonstrable, overt actions indicating a crew fighting an inflight fire or other aircraft emergency occurred.  


This "analysis" is terribly, terribly flawed and flat-out wrong in many places.  In the bigger picture it does much more harm than good by spreading misinformation and unfounded conjecture.  I hope to chalk this piece up to Internet trolling at its finest, lest it be an embarrassment to the community of actual professional pilots.


Epilogue


In a follow-up post to the original Mr. Goodfellow wrote (emphasis added):
Opening a new thread here for MH370 comments to continue as we reached the max of 500 on the other post. WOW!
Continue posting your comments - all good. Somehow I believe this is coming to an end. The reported sighting over the Maldives coincides with the time line well. The aircraft is probably a small distance west of Maldives. I believe it will be found in next 24 hours hopefully. Finally searching in the right place on that extended line which goes over the Maldives I suggested in the first post. 
I have purposely declined all media interviews with all major networks CNN FOX CBS ABC NBC MSNBC ESPN. I do not wish to join the talking heads on 24 hour news channels. I wrote the original post because I felt the speculation was running amok. I have done two radio interviews with BBC in London because I respect the way they have handled it. We discussed primarily why this had become such a huge social media story to begin with.
If the plane is found in the area suggested I will probably fly to New York and hold a news conference at which time I will let the media beat me up on my terms :)
I'm tuning out tonight but before I do I want the families who have missing loved ones to know they are in our prayers and everyone around the world who has joined this story embraces you with love and compassion.

He wrote that piece because he felt "speculation was running amok"?  How utterly ironic -- this tripe is nothing but speculation, and ill-informed, error-crusted speculation at that, and now it has run amok.  I look forward to the news conference in New York (winky face).

Wednesday, July 23, 2008

My First Lesson - production.log?

Okay, so right off the bat I'm not going to do what I said I was going to do, but hear me out. I haven't actually made this mistake before but, true to the spirit of avoiding self-inflicted wounds, this one was self-inflicted, so it counts!

Standard scenario: try to do something as the admin in a production application and discover a bug. "Hmmm, wonder why that happened -- let's check the log." Jump into a terminal window, tail -f production.log, repeat the faulty action and...what the hell, why is nothing being written to the log?

Well, almost nothing. While staring at the log the cron job that runs ar_sendmail wrote to the log just fine, but the application itself wasn't, regardless of what I was doing with the app in a browser.

Without the benefit of time travel and 20/20 hindsight I couldn't check this blog, so I started searching for an answer. The probable culprit in most of the relevant posts (and there weren't many, which surprised me) was a permissions problem on the log file itself.

Not to toot my own horn (so to speak [as if a Viper pilot is going to read this]), but I don't limit myself to being a n00b in just one area. In addition to being wet behind the ears in Rails I'm also a rank (as in, "Man that stuff stinks! It's rank!") beginner with Linux, bash, chmod and a whole bunch of other stuff that, frankly, I don't know well enough to even know that I don't know it.

But enough about me -- back to the topic at hand. I took a look at the permissions of all the files in the /log directory (using ls -l) and quickly noticed that the permissions for production.log didn't match the development.log or mongrel.log files.

Being razor sharp I decided to make 'em the same. But how? Meet our friend chmod. Not being an expert here I just copied what I saw in some of the stuff I found online:

chmod 0666 production.log

Now the permissions at least matched the other files in terms of rw-rw-rw-. I tried the app again but still, nothing was written to the log.

Going back to the /log directory with another ls -l it dawned on me that the owner and group for production.log (chris sudo) were different from the other files (root root). But how to change that?

Meet our other new friend, chown, cleverly named for what it does, "change owner". A bit more of the google and I found the syntax (I know, I know, how about man chown? I'm a newbie, remember?):

chown root:root production.log

Now the permissions, owner and group all match the other non-FUBAR'd files. A quick click in the app and, thankfully, it's working again.
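For anyone retracing these steps, here is a minimal sketch of the same compare-and-fix routine in one place. It assumes GNU coreutils (the -c option to stat and the --reference option to chmod are GNU-specific) and works in a throwaway directory with stand-in files rather than a real Rails log directory. The starting modes are made up for illustration.

```shell
# Sketch only: a scratch directory stands in for the app's /log directory.
logdir=$(mktemp -d)
touch "$logdir/development.log" "$logdir/production.log"
chmod 0666 "$logdir/development.log"
chmod 0644 "$logdir/production.log"   # simulate the file that doesn't match

# Show octal mode, owner:group and name side by side (GNU stat syntax).
stat -c '%a %U:%G %n' "$logdir"/*.log

# Copy the mode from a known-good neighbour instead of typing it by hand.
chmod --reference="$logdir/development.log" "$logdir/production.log"

# GNU chown accepts --reference in the same way, but changing the owner
# usually needs sudo, so this sketch only touches the mode.
prod_mode=$(stat -c '%a' "$logdir/production.log")
echo "production.log mode is now $prod_mode"
```

The stat line shows everything ls -l does for this purpose, just in a form that's easier to eyeball when one file is the odd one out.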

To quote David Byrne, "but, how did I get here?" Again, in the true spirit of this blog, I was too smart for my own good. I'd seen numerous "best practices" articles about not letting the production.log file get too huge, and all suggested using the *nix logrotate command to customize how the log file gets truncated, copied, compressed or whatever other stuff you want to do to it.

Wanting to join the club of People Who Produce Professional Rails Apps (yet really don't know what the f*ck they're doing), I dived right in, head first (watch out for rocks). Here's the logrotate.conf script I came up with (stored in ~/cfg):

/home/me/myapp/shared/log/production.log {
    rotate 4
    weekly
    create 0666 root root
    compress
    notifempty
    missingok
    dateext
    nomail
}

Note the 0666 root root after create. That wasn't there before and was, I'm pretty sure, the source of the problem. When the production.log file was being re-created with plain, unadorned create it didn't have the appropriate permissions, owner or group to allow the Rails process to write to it. Or at least, that's my newbie suspicion.

I'll know for sure that it works once the log rotates and actually functions properly. If I hadn't run out of steam last night after getting this far I would have done the truly professional thing and figured out how to run my logrotate.conf file on the spot to be sure.
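Running the config on the spot could look something like the sketch below. It assumes logrotate is installed and uses its -d (debug, a dry run that changes nothing), -f (force a rotation now) and -s (alternate state file) options. Everything lives in a scratch directory with a cut-down, hypothetical config, so the real production.log and ~/cfg/logrotate.conf are not touched. The owner and group are left off create so it runs without root.

```shell
# Scratch area so nothing real gets rotated (all paths here are stand-ins).
workdir=$(mktemp -d)
log="$workdir/production.log"
conf="$workdir/logrotate.conf"
echo "some log line" > "$log"

# Cut-down config: no owner/group after create, so root is not required.
cat > "$conf" <<EOF
$log {
    rotate 4
    weekly
    create 0666
    missingok
}
EOF
chmod 0644 "$conf"   # logrotate ignores configs writable by group or others

if command -v logrotate >/dev/null 2>&1; then
    # Dry run: report what would happen without changing any files.
    logrotate -d -s "$workdir/state" "$conf"
    # Real run: rotate now, regardless of the weekly schedule.
    logrotate -f -s "$workdir/state" "$conf"
    if [ -e "$log.1" ]; then result="rotated"; else result="not-rotated"; fi
else
    result="skipped"   # logrotate is not installed on this machine
fi
echo "result: $result"
ls -l "$workdir"
```

If the freshly created production.log in that listing shows the mode you asked for, the create line is doing its job. Pointing the same two commands at the real config would need whatever privileges the root root part demands.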

But, as is the nature of the software development business, that's a rabbit hole I may run down some other time, undoubtedly in the wee hours, after a long day, with a deadline looming, and a tiny voice in my head saying, "Haven't I seen this before?...."

Time for bed -- I've got to spend some time above Armstrong's Line tomorrow morning.

Lessons Learned (and Mistakes Repeated) - Adventures in Ruby on Rails

Tell me if this sounds familiar: you run into some totally goofy problem with your Rails application, a problem that you're completely confident you've seen (and conquered) previously after a few hours of using the google on the internets, finding a solution, then spending several more hours tweaking your app, the deployment or one of the other five-dozen moving parts in the typical RoR app.

But the elation lasts about 5 seconds because you realize:
  • You didn't write down how you fixed it last time, or
  • You didn't bookmark the site where you found the answer, or
  • You did bookmark the site, but have no idea what the hell the bookmark is called, or
  • You did bookmark it, you do find the bookmark, but the site's gone belly-up
Yeah, I thought so. Well, this is my attempt to save myself, and anyone else who stumbles across this blog, from suffering so many self-inflicted forehead slaps.

Being a relative newcomer to Ruby on Rails, and despite my expensive attempts to get smart (measured in money spent on books, Peepcode screencasts and PDFs and, actually, really good training) I somehow still manage to make the same dumb mistakes regularly.

The band-aid I'm going to place over that gaping wound of jackassery is the listing here of those foul-ups and how I fixed them. Perhaps the next time I get that deja vu feeling all over again I can spare my cranium an open-handed "Dooh!".