Our Worldship Broke!

Jim Beall


"The reason I asked to speak with you here, in this place, is to tell you that something has broken."

Perhaps we are meeting in the heart of the Tabernacle, with you in the vestments of the High Priest and me in the raiment of the ArchDeacon of Engineering. Or maybe we are on the bridge of our great vessel and you're wearing the glittering dress uniform of a ship captain, with me your engineer. There are countless other possibilities — from business suits to no clothes at all! — but my fear in every case will be the same.

It's not my fault!

"Please don't excommunicate, execute, or recycle me!"

I am not going to try to blame our ancestors. Whether I am reading from scripture, logs, or reports, I will attempt to convince you that failures have occurred before, and they simply happen no matter what. After all, we have been travelling in the vacuum of space towards our destination star for a very long time.

"Raise not your staff to me, I beseech you, Your Eminence! Lord Captain, please sheath your sword! I meant no disrespect to the Designers. Their near-zero operational failure rate is miraculous, but even 'near-zero' is not zero, especially over centuries of operation. The reliability level that they did achieve merits admiration, if not adoration."

#

The worldship designers may or may not use religious tracts, but they would certainly rely on Redundancy, Diversity, and Margin when choosing and sizing essential systems.

Redundancy has long been recognized as a critically important design element. Indeed, the mantra of the nuclear engineer is, "Redundancy is good. Redundancy is good." Worldship designers would be expected to hold it in even higher esteem. Nuclear power plants generally have two one-hundred per cent capacity, physically independent groups of systems (each called a "train") for each safety function. A worldship might have three or more. Redundancy allows one safety train to be removed from service for inspection, testing, and maintenance. If one train fails during an accident, another full-capacity safety train is there to save the day and, on more than one occasion, it has.

Diversity is an important social imperative, but it's an even more important design one. No matter how reliable a given machine may be, relying on only one design creates vulnerability to the phenomenon called "common mode failure." Extrapolating from an historical scenario, if four helicopters are needed to complete a desert operation, an inadequate sand screen design on the engine intake would doom the mission no matter if eight (or eight times eight!) identically vulnerable choppers were dispatched. Similarly, a materials defect (e.g., tainted lubrication oil) could simultaneously fail all machines that used it. Even diversity in location is important, as demonstrated during the 2011 accident at the Japanese Fukushima Daiichi Nuclear Power Station. The designers of the coastal Fukushima plant had placed all emergency power sources in basements, despite flooding being a possible common mode failure risk. The extended duration of worldship transits would make their creators even more sensitive to design diversity.

A typical nuclear power plant taps its own generator output for normal ac power, and can also connect to the grid through a separate transformer. Worldships would similarly tap the main propulsion drive (e.g., fusion or anti-matter "torch"), but by a great variety of diverse methods such as magnetic coupling, photovoltaic, thermovoltaic and even thermophotovoltaic. The intent would be to provide multiple copies (trains) of every chosen power design, each of sufficient size to provide the necessary output. Where defective lubricant might fail all the magnetic coupling-driven generators, the "solid state" photovoltaic trains would be unaffected. Nuclear plants supplement diesel generators with gas turbines for on-site emergency ac power diversity. Worldship emergency power design would doubtless include multiple long-lived, battery-style fission plants, fuel cells, and the like.
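
The helicopter and lubricant examples can be put in rough numbers. What follows is only a sketch under assumed conditions, not a real reliability analysis: the function names and every probability in it are invented for illustration. It treats a common mode event as something that fails every unit of one design at once, and treats different designs as having independent common mode events.

```python
from math import comb


def p_at_least_k_work(n, k, p_fail):
    """Probability that at least k of n units work, each failing independently."""
    p_ok = 1.0 - p_fail
    return sum(comb(n, i) * p_ok**i * p_fail**(n - i) for i in range(k, n + 1))


def mission_success(n, k, p_fail, p_common):
    """Identical units: one common mode event (probability p_common) fails all n."""
    return (1.0 - p_common) * p_at_least_k_work(n, k, p_fail)


def diverse_success(fleets):
    """Several different designs, any one of which can do the whole job alone.

    fleets is a list of (n, k, p_fail, p_common) tuples, one per design.
    Common mode events are assumed independent between designs.
    """
    p_every_design_fails = 1.0
    for n, k, p_fail, p_common in fleets:
        p_every_design_fails *= 1.0 - mission_success(n, k, p_fail, p_common)
    return 1.0 - p_every_design_fails


# Hypothetical numbers: 10% random failure per unit, 5% chance of a design flaw.
eight_identical = mission_success(8, 4, 0.10, 0.05)
sixty_four_identical = mission_success(64, 4, 0.10, 0.05)
two_designs = diverse_success([(8, 4, 0.10, 0.05), (8, 4, 0.10, 0.05)])

print(f"8 identical units:      {eight_identical:.4f}")
print(f"64 identical units:     {sixty_four_identical:.4f}")
print(f"two designs of 8 units: {two_designs:.4f}")
```

Under these assumptions, no number of identical units can lift the success probability above 1 - p_common (0.95 here), while a second, independent design can.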

Margin is another vital design element, both in building codes and operational hardware. Each nuclear plant safety train is nominally capable of supplying one-hundred per cent of needed power or fluid flow but, in practice, can provide more, sometimes much more. US naval history is replete with wartime stories of propellers turning at more-than-possible rpm. What those events had in common was that scared engineers called on those margins. One peacetime example reportedly took place during USS Enterprise (CVN-65) sea trials. Admiral Hyman G. Rickover, with one eye on the increasingly restive civilian vendor representatives on the bridge, kept adding rpm to the maximum flank bell. According to the story, after a couple of the Admiral's "Two more turns, Captain," one vendor rep suddenly announced, "One more turn, Admiral, and they're your reduction gears." Rickover then reduced speed, confident that he had learned both the limiting propulsion component and its design margin.

On a worldship, the designers would craft their margins to be synergistic. For example, if the ship's radiators experienced a failure beyond their design margin, the propulsion drive would necessarily be limited to the heat output the remaining radiators could shed. Full drive thrust would be impossible. The effect would be to lengthen the trip from, say, five hundred years to seven hundred. As long as life support and other key systems had that much margin, the worldship could still safely reach its destination, albeit later than planned.
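
The five-hundred-to-seven-hundred-year example reduces to a simple comparison. In the sketch below, the system names and margin figures are hypothetical; each margin is expressed as the multiple of the design transit time that the system can support.

```python
def required_margin(design_years, extended_years):
    """Endurance needed, as a multiple of the original design transit time."""
    return extended_years / design_years


def systems_short_of_margin(design_years, extended_years, system_margins):
    """Return the names of systems whose margin cannot cover the longer trip.

    system_margins maps a system name to its endurance as a multiple of the
    design transit (1.5 means good for 1.5 times the planned duration).
    """
    need = required_margin(design_years, extended_years)
    return sorted(name for name, margin in system_margins.items() if margin < need)


# Hypothetical margins, invented for illustration only.
margins = {"life_support": 1.5, "drive_fuel": 1.6, "spare_parts": 1.3}

shortfalls = systems_short_of_margin(500, 700, margins)
print("Margin needed:", required_margin(500, 700))
print("Systems short of margin:", shortfalls)
```

With these invented figures the stretched trip needs a margin of 1.4, so the spare parts inventory would be the item to worry about.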

#

Once you have calmed down, you will have questions. Hopefully, you have spared me so that I might answer them. Otherwise, you will have to summon another.

“Are you sure? How did you learn of this?”

“By the will and word of the Designers, Your Eminence.” If your rank is military, I would cite the applicable standing orders. No matter what, however, my answer would be steeped in the design elements of Monitorable and Testable.

#

Monitorable systems allow operators to discern system status. Well-designed systems provide continuous affirmation of operability, and clearly announce failures or other variances from expected performance. System sensors would monitor a great many parameters. The classics of temperature, pressure, flow, level, voltage, current, etc. would be joined by ones such as continuity, tension, torque, thickness, flux, field strength, and a vast host of others. Oversight routines would interpret and weave the streams together into qualitative depictions (e.g., green, amber, and red), yet allow human inspection of the quantitative data upon demand.
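
A minimal sketch of such an oversight routine follows. The sensor names, limits, and readings are all hypothetical; the point is only that the routine reports the worst color while keeping the per-sensor detail available for inspection.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Alarm limits for one sensor; values outside the amber limits are abnormal."""
    low_red: float
    low_amber: float
    high_amber: float
    high_red: float


def classify(value, band):
    """Map one quantitative reading to green, amber, or red."""
    if value <= band.low_red or value >= band.high_red:
        return "red"
    if value <= band.low_amber or value >= band.high_amber:
        return "amber"
    return "green"


def system_status(readings, bands):
    """Return the worst color overall plus the color of each sensor."""
    severity = {"green": 0, "amber": 1, "red": 2}
    per_sensor = {name: classify(v, bands[name]) for name, v in readings.items()}
    overall = max(per_sensor.values(), key=severity.get, default="green")
    return overall, per_sensor


# Hypothetical limits and readings for one coolant loop.
bands = {
    "coolant_temp_K": Band(280.0, 290.0, 330.0, 340.0),
    "loop_pressure_kPa": Band(80.0, 90.0, 120.0, 130.0),
}
readings = {"coolant_temp_K": 300.0, "loop_pressure_kPa": 125.0}

overall, detail = system_status(readings, bands)
print(overall, detail)
```

Here the temperature is within its normal band and the pressure sits between its amber and red limits, so the loop as a whole shows amber.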

Some control panels feature a layout that imitates the displayed system ("mimic bus") to simplify operator recognition. For example, plastic shapes of pipes, pumps, and turbines might depict a system, with the switches to operate valves inserted in their proper places, and with indicator lights showing position and gauges showing flow. As is the case at nuclear plants, our worldship will doubtless have systems too complicated for classic mimic bus treatment. However, the designers would know that the multi-generational nature of the ship rendered "user friendly" (here, "operator friendly") an absolute requirement. They would likely use expandable three-dimensional holograms, easily accessible and possibly even triggered by alarms.

Testable systems enable operators to determine status, whether to follow up on alarms or to routinely confirm operability in the absence of alarms. "Trust but verify" may have been a 1980s signature phrase of President Ronald Reagan, but the principle has always been a crucial design element. Operators are taught to trust their indications, but to verify them to the maximum extent possible. Well-designed systems facilitate both troubleshooting and operability confirmation.

#

Eventually I convince you that the failure is real, which leads inevitably to your next question.

"What do you propose to do?"

"It will be my duty and honor to lead the repair effort myself, Your Eminence."

#

While they were not successful in this case, worldship designers would strive to minimize the need for human-effected repairs. They would do so by automating as many maintenance and repair activities as possible, and also by preferentially selecting Passive (vs. active) and Static (no moving parts) design elements.

Passive components are those that do not have to change to fulfill their design mission, while active ones must. For example, the pump that must turn on, rotate its internals, and not overheat is far more likely to fail than the pipe that will transport the pump’s output. One illustration of the probabilistic difference is that US nuclear regulations require designs to preserve safety during an accident even if any one active component anywhere in the facility fails during the first few hours. In contrast, those same regulations presume all passive components remain operable during that same period. Instead, a design must be able to survive a single passive failure during the long-term cooling phases that follow, which could last months or even years.

Static aspects greatly reduce failure risk. For example, a battery that needs only a single breaker to close is far more reliable than a diesel generator that requires a great many internal moving parts to operate, as well as all the external components in its fuel and cooling systems. Worldship designers would probably make extensive use of electromagnetics and magnetohydrodynamics. Electromagnetic pumps, for example, do not rotate vanes or impellers in the flowpath, but use electric power to produce magnetic forces to move electrically conducting fluids (including liquid metals and plasma). Similarly, radiators would use heat pipes, whose absence of moving parts makes them superior to systems using pumps and condensers.

#

Once you agree that human-performed repairs are necessary, you have additional questions.

“How extensive will the effort be and how long will it take? Is it dangerous? Are we in danger until the repairs are complete?”

My answers will vary according to the situation, of course. Except in catastrophic cases (like large object collisions), however, I will be able to tell you that a backup system (Redundancy) is already doing the failed system’s job, as I have personally confirmed (Monitorable). Thus, the present unavailability of the broken system would constitute not so much a risk as a reduction in Margin.

“There is little danger, Your Eminence, and Scripture is clear on how to proceed.”

#

Whether it be Scripture, Starfleet Technical Specifications, or something else entirely, a comprehensive database would exist containing repair instructions for every failure the designers could envision, no matter how unlikely. The instructions would not be limited to the spoken or printed word (languages change over time) but would also take the form of YouTube-style hologram sequences. Raw materials would be retrievable, probably from vaults layered in the bow for shielding. Also in the front would be ice, not only for shielding, but also for biosphere backup and even emergency heat sink purposes. Other items there would include spare parts, especially ones impossible or very time consuming to replicate. Fabrication facilities, such as 3-D printers and forges, would be used to produce everything else when needed.

While my answer as to how long repairs would take would depend on the specifics of the failure, it would be influenced by the design attributes of Accessibility, Modularity, and Standardization.

Accessibility anticipates the need for servicing and repair, by providing spatial separation between components and an absence of physical interference. This is sometimes not achieved, most often when design modifications are made after initial installation. Late during the construction of one nuclear plant, engineers identified that component accessibility had been severely compromised in one area of the reactor building. They ended up having to compile charts listing what pipes would have to be cut to access valves in other systems deeper within the crowded compartment. Such drastic measures vastly complicate and lengthen repair activities.

Modularity simplifies maintenance and repair by grouping functionally linked components into one easily replaceable unit. It requires far less system down time to change out a multi-component module than it does to identify precisely which individual component (or components!) has failed, gain access to it, sever it from the system, and replace it without damaging other parts in the process. In system areas involving adverse thermal conditions, radiation levels, or vacuum, swapping out modules may be the only way repairs can be accomplished.

Standardization shortens repair times because it allows a parts inventory to be practical. That is, it is far quicker to use existing spares than to fabricate each part necessary for every repair. The overall fabrication process would involve identifying the necessary stock, retrieving the materials, manufacturing the parts, inspecting the finished parts against tolerances and specifications, and then performance testing the parts. Particularly if time is important, it is far superior to pull a proven part from inventory, use it, and later employ the fabrication process for its inventory replacement. Only standardizing parts can make this possible or, at the least, reduce the number of items to be fabricated each time.

#

You are relieved to learn that the failure has added no appreciable risk to our ship, our world. Nonetheless, you want to know how soon all can be returned to the way it was, the way it should be.

"What personnel will you use? And, when can you begin?"

My answer will be the summation of many factors, including failure extent, collateral damage, fabrication (versus replacement) needs, repair complexity, and training requirements. All of those aspects except for training would be relatively straightforward, in that they could be readily calculated. How I would go about choosing and preparing personnel for executing the task itself would depend on the existing Training Programs, Simulators, and — most importantly — on Social Engineering.

#

Training Programs of a sound and effective nature would be a worldship requirement, absent sentient robots and/or cold sleep storage of pre-trained human experts. After all, even with careers lasting fifty years, a five hundred year transit means that those standing watch and making repairs when the ship reaches the destination star will be ten or more generations removed from the ones who received their training before departure. Adequate training can be accomplished by a variety of approaches, including apprenticeships and shadowing, as well as by schools and testing. To sustain competence over long periods, however, programs would have to include periodic verification and demonstrations of expertise, as well as formal refresher training periods.


Simulators would not only be the key element in achieving and maintaining expertise, but would also be vital in preparing for non-routine evolutions and repair activities. US nuclear operators have benefited enormously from the federal mandate after the 1979 Three Mile Island accident for nuclear plants to have site-specific simulators. Before that, operator training relied on far less realistic methods, with perhaps a few hours on a remote vendor simulator that usually did not precisely replicate the plant that they would operate. The growth in computer power now allows current operators to learn how to respond to almost any failure in the plant. More than that, however, nuclear plant simulators have been used as a powerful investigatory tool, including verifying procedure accuracy. Worldships would have far more powerful simulators, closer to the holodecks of Star Trek fame than the ones of today. Such machines would be capable of simulating any place aboard the ship, allowing rehearsals of repair activities as well as control room scenarios.

Social Engineering would be pervasive in its effects, including how to prepare for a complex repair evolution. It is, quite frankly, the "long pole in the tent," the "800 lb. gorilla," or any other such analogy. Has the worldship had a stable culture throughout the long transit? After all, three or five hundred years is a long time. Is the culture a technological one? Or, do "we" live in an artificial, low-tech society, established as such in an attempt to increase stability? Maybe the ship contains cultures at multiple levels in some sort of class system. These choices matter!

Ideally, a major repair effort would involve three or more large teams, so that the work, once begun, could proceed until completion without interruption. Remote monitoring would also be continuous, as just one part of quality control and assurance activities. Materials, modules, and supplies would constantly be staged into the work area, with inspectors verifying that all is proceeding as planned. These are just some of the many jobs that would require specialized training separate from that of those actually performing the on-site labor.

How deep is the pool of technically literate and competent workers? Will the repair leader be able to simply choose from lists of qualified and experienced individuals to fill the organizational slots? Or, will repairs have to wait until enough personnel get screened for aptitude, become educated, receive basic training, and only then begin to prepare for the task?

Once personnel are selected, how many will stick out the potentially rigorous training? How many will agree to do the probably uncomfortable (and possibly dangerous) work? How will their compliance be ensured? Will they be naval officers and ratings self-selected for fidelity to duty? Will they be clerics under vows of obedience? Or, might the rewards be designed to attract the top athletes of the day?

Will the repair procedures and requirements rely on rites and liturgy? Or, would the simulators have become a central part of a free-wheeling, holodeck-style gaming culture? Factors such as those will dictate how long the training will take for any evolution, including a major repair activity.

#

I give you my answer and, to my profound relief, you accept it.

"Very well, I approve. What are you going to do now?"

"Thank you, Your Eminence! I am off to St. Tesla's to meet with the abbot. I have a pilgrimage to plan, to the Fourth Radian Magnetic Coupling."

"May the Designers watch over you."

They have, for all these many centuries.

###



Copyright © 2015 Jim Beall


Jim Beall (BS-Math, MBA, PE) has been a nuclear engineer for over 40 years, a war gamer for over 50, and an avid reader of science fiction for even longer. His experience in nuclear engineering and power systems began as a naval officer. Experience after the USN includes design, construction, inspection, enforcement, and assessment with a nuclear utility, an architect engineering firm, and the US Nuclear Regulatory Commission (USNRC).