Making systems - 7: Elements of making a system

7.1 Introduction
7.2 Objective
7.3 Model
7.3.1 Artifacts
7.3.2 Tasks
7.3.3 Team
7.3.4 Tools
7.3.5 Operations
7.4 Using this model

7.1 Introduction

The previous chapter defined what a system is. In this chapter, I turn attention to how to make that system. “Making” includes the initial design and building of the system, as well as modifications after the initial version has been implemented.

Making the system is a human activity. Building a system correctly, so that it meets its purpose, requires a team of people to work together. Building systems of more than modest complexity will involve multiple people, usually including specialists who can work on one topic in depth and people who can manage the effort. It involves people with complementary skills, experiences, and perspectives. Such systems take time to build, and people will come and go on the team. Systems that have a long life that leads to upgrades or evolution will involve people making modifications who have no access to the people who started the work.

This chapter provides a model to organize and name the things involved in the making of a system—the activities, the actors, and what they work with. Later chapters provide details on each part of this model. This model includes both elements that are technical, such as the steps to design some component, and elements that are about managing the effort, such as organizing the team doing the work or planning the work. Note that this model does not attempt to cover all of managing a system project—there is much more to project management than what I cover here.

The model presented in this chapter only serves to name and organize. I do not recommend specific approaches for each of the elements of the model here; I only describe attributes that good approaches should have. Later parts of this book address ways to achieve many of these things. For example, the team that is designing a system should have an organization (a desirable attribute), but I do not address which organizational structures one can choose from.

The assembly of all the parts involved in making a system is itself a system. In those terms, this chapter presents the purpose (Chapter 9) of the system-making system and a high-level concept for how to organize the high-level components (Chapter 11) in that system.

7.2 Objective

This model of making captures the activities and elements involved in executing the project to make or update a system.

The approach used for making the system should:

Build a good system. This means the system should have a clear design, be safe and secure, and be maintainable.
Be cost and time efficient. A system usually has a budget or a deadline; the work should proceed without wasting either time or money.
Keep the workers who build the system satisfied.
Satisfy the customers who will sponsor or use the system.
Position the organization building the system for future work, if appropriate.

7.3 Model

The making model has five main elements:

Artifacts: the things created that make up the system and its records
Tasks: the activities that are performed to make artifacts
Team: the people who perform tasks
Tools: things that the team uses in performing tasks
Operations: how the team manages the work to be done

7.3.1 Artifacts

The artifacts are the things that are created or maintained by the work to make the system.

The artifacts have three purposes. First, the artifacts include the system’s implementation: the things that will be released or manufactured and put in users’ hands. The artifacts should maintain the implementation accurately, and allow people to identify a consistent version of all the pieces for testing or release. Second, the artifacts are a communication channel among people in the team, both those in the team in the present and those who will work on the system later. These people need to understand both what the system is, in terms of its design and implementation, and why it is that way, in terms of purpose, concept, and rationales. Finally, the artifacts are a record that may be required for future customer acceptance, incident analysis, system certification, or legal proceedings. Those evaluating the system this way will need to understand the system’s design, the rationales for that design, and the results of verification.

The artifacts should be construed broadly. They include:

Records of the system’s purpose (Chapter 9).
Documents recording the system’s concept and design.
The system’s implementation.
Verification records.
Rationales for design choices.
Plans, defects, analyses, and activity logs.
Procedures and processes to guide work.
Information about the team and roles.

Artifacts other than the implementation are valuable for helping a team communicate. Accurate, written documentation of how parts of the system are expected to work together (their interfaces and the functions they expect of each other) is necessary for a team to divide work accurately.

Many engineers focus solely on the implementation artifacts, especially in startup organizations that are trying to move quickly, and do not produce documents recording purpose, design, or rationales. If the organization is successful and the system they are building enters service, at some point this other information will be required: as the team membership turns over, as the complexity of the system grows, or as the team finds flaws that need to be corrected. The startups I have observed have all had to reconstruct such information after the fact; the reconstructed information is less accurate and costs more than it would have if it had been recorded from the beginning.

The system artifact graph is the collection of all the artifacts that the team works with. It includes every artifact as well as the relations between them that show how each one derives from others. Among other things, for each component, the artifact graph includes its purpose, specification, and design, as well as implementation and verification artifacts. I discuss this more in Chapter 15.
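As a rough illustration of the artifact graph idea, here is a minimal Python sketch. The `Artifact` class, the `kind` labels, the `downstream` helper, and the power-system artifact names are all assumptions made for this example; they are not taken from the book's later treatment of the topic.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """One node in a hypothetical artifact graph."""
    name: str
    kind: str  # e.g. "purpose", "specification", "design", "implementation"
    derives_from: list[str] = field(default_factory=list)


def downstream(graph: dict[str, Artifact], name: str) -> set[str]:
    """Return every artifact that derives, directly or indirectly, from `name`."""
    found: set[str] = set()
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for artifact in graph.values():
            if current in artifact.derives_from and artifact.name not in found:
                found.add(artifact.name)
                frontier.append(artifact.name)
    return found


# Illustrative artifacts for a single component (all names are made up).
artifacts = [
    Artifact("power-purpose", "purpose"),
    Artifact("power-spec", "specification", ["power-purpose"]),
    Artifact("power-design", "design", ["power-spec"]),
    Artifact("power-impl", "implementation", ["power-design"]),
    Artifact("power-test-report", "verification", ["power-spec", "power-impl"]),
]
graph = {a.name: a for a in artifacts}

# Which artifacts might need another look if the specification changes?
affected = downstream(graph, "power-spec")
print(sorted(affected))
```

Storing only the "derives from" relation keeps each artifact's record small; questions such as "what is affected by this change" are then answered by walking the graph.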

Finally, the artifacts should be under some kind of configuration management. Artifacts will evolve as work progresses. One artifact may be a work in progress, meaning others may want to review or comment but that they should not count on the artifact’s contents being stable. An implementation artifact may reflect some design artifact; when the design artifact is revised, people must be able to see that the implementation reflects an older version of the design. When the implementation artifacts are packaged up and released, the resulting product needs to have consistent versions of all the implementation parts.
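One way to picture the "implementation reflects an older version of the design" check is the following sketch. It assumes a simple integer version number per artifact and a record of which source version each artifact was based on; the class, the field names, and the status labels are hypothetical and do not describe any particular configuration management tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedArtifact:
    """An artifact with a version and a note of what it was built against."""
    name: str
    version: int
    status: str = "in-progress"  # or "released"; these labels are assumptions
    based_on: Optional[tuple[str, int]] = None  # (source name, source version)


def is_stale(artifact: VersionedArtifact, store: dict[str, VersionedArtifact]) -> bool:
    """True if the artifact reflects an older version of its source artifact."""
    if artifact.based_on is None:
        return False
    source_name, source_version = artifact.based_on
    return store[source_name].version > source_version


design = VersionedArtifact("controller-design", version=2, status="released")
impl = VersionedArtifact("controller-impl", version=5, based_on=("controller-design", 2))
store = {a.name: a for a in (design, impl)}

stale_before = is_stale(impl, store)  # implementation matches design version 2
design.version = 3                    # the design is revised
stale_after = is_stale(impl, store)   # implementation now lags the design
print(stale_before, stale_after)
```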

7.3.2 Tasks

These are the individual activities that team members perform. The tasks use and generate artifacts. I rely on the colloquial definition of “task” and do not try to formalize the term here.

Systems projects usually have vast numbers of tasks. These include tasks for designing, building, and verifying the system; they also include tasks for managing the project, reviewing and analyzing parts of the system, and approving designs and implementations.

There are usually far more tasks to be worked on than people to do them. Tasks also usually have dependencies: something needs to be designed before it is implemented, or one part of the system should be designed before another.
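Task dependencies of this kind form a directed graph, and a valid working order is a topological ordering of that graph. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the task names are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks that must finish before it can start.
depends_on = {
    "design power system": set(),
    "design structure": set(),
    "implement power system": {"design power system"},
    "integrate": {"implement power system", "design structure"},
}

# One valid order in which a single person could work through the tasks.
order = list(TopologicalSorter(depends_on).static_order())
print(order)

# The tasks that can be started right now, before anything is finished.
sorter = TopologicalSorter(depends_on)
sorter.prepare()
ready_now = set(sorter.get_ready())
print(ready_now)
```

There is usually more than one valid order, so the printed order can legitimately differ between runs or Python versions. Only the dependency constraints are guaranteed.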

Tasks, in themselves, need to be known and tracked. People on the team need to know what they can be working on, and who is doing other tasks that might relate to their work. Managers need to be able to track what is being done, what tasks are having problems, and ensure that tasks are coordinated and completed.

Operations, discussed below, addresses questions of what tasks are needed and which ones should be performed in what order.

7.3.3 Team

These are the people who do the tasks. They are not an amorphous group of indistinguishable and interchangeable parts; each person will have their own abilities and specialties. Each person will also have their own authority, scope, and responsibilities.

The team should be organized. This means:

Everybody knows who is on the team.
Each person knows what they should work on—the scope of their responsibility—as well as what they should not work on.
Each person knows who has what responsibility and authority, so that they know who to talk with when they have a question or request.
Each person knows who they should inform of both technical and administrative matters, so that when one person is making a discovery or decision they know who they should tell.
Each person knows the process by which decisions are made, so that they know which decisions have been made and which have not.

In addition, the team needs to be staffed with enough of the right people to get work done. This means that people with management responsibility need to know who is on the team and their respective strengths, as well as the workload each one has and the overall plan for moving the project forward.

7.3.4 Tools

These are things that the team uses to get its tasks done. The tools are not part of the system being produced, though they are often systems in their own right. An end user of the system being produced will not use these tools, either directly or indirectly.

The tools include things like:

Tools for storing and tracking artifacts as they are created and updated.
Software build tools and hardware design tools (e.g. CAD systems).
Verification and analysis tools.
Testing infrastructure, including physical equipment, procedures, and analysis software.

7.3.5 Operations

Operations is about organizing the work that the team does. Its primary function is to ensure that the right tasks are done by the right people at the right time.

Operations sets up “a set of norms and actions that are shared with everyone” in the project [Johnson22, Chapter 2]. It gives people in the team a shared set of rules and procedures for doing their work, and it uses those procedures to manage a plan and tasks that coordinate that work. When people share a set of rules and procedures, they can each have confidence in how others are working and in the results that others produce.

There are two primary objectives for operations: making sure the work proceeds efficiently, and ensuring product quality. Operations has secondary objectives, including keeping the organization informed of progress and needs.

Ensuring the project runs efficiently implies several things.

Avoiding unnecessary work and rework.
Avoiding delays where some team members are blocked because some other part is not yet ready.
Ensuring resources are available when needed.
Organizing communication within the team so that needed information is shared.
Ensuring that the project can continue to operate into the future by providing for communication with future team members.

Ensuring quality means:

Making sure that design and implementation are firmly based in system purpose.
Ensuring that work is checked for meeting correctness and quality standards.
Managing work to account for uncertainties and unknowns.
Ensuring that work involving multiple people is coordinated so that they do not work divergently.

I look at operations through the lens of the tasks that people on the team will do. Operations is about tracking what tasks need to be done, who is working on them, and how those tasks are going. It is also about organizing tasks so work proceeds with as few interruptions as possible. Operations is, in a way, a feedback control system that keeps the flow of tasks running smoothly.

Operations is more than overseeing tasks, however. It is equally about guiding the team through its work, working out what tasks need to be done and looking forward to plot out how those can be best scheduled.

Determining the tasks to be done begins with the list of artifacts that the team will develop. Developing this list begins in turn with the structure of the system, particularly the components in it. For each component, there is some pattern of artifacts: purpose, specification, design, and implementation for example. Then there are questions of how the team should tackle those artifacts—one after another, or in multiple iterations, or something else. The project also establishes checkpoints or milestones, where the artifacts will be reviewed, some results are made available, or decisions are made.

How to schedule all the tasks comes next. The scheduling must respect how tasks depend on each other. Tasks must be arranged so that long tasks are started early enough that people don’t have to sit idle waiting for them to be done. The project must also account for who can do different tasks and ensure that the right people are available when needed.

Obviously, the team does not know what all the components will be, and thus what all the tasks will be, early in the project. Operations is an exercise in the unknown, tracking how the work ahead changes as the team works out the structure of the system or when people find problems with artifacts already developed. Requests for changes add to the complexity.

The following model divides the work of operations into parts to make the job more tractable. These parts are interdependent. I discuss this model in more detail in Chapter 20.

The development methodology determines the guidelines for doing work: incrementally or not, and how to focus effort. The life cycle and procedures define the steps that the team will follow for each part of the work. The life cycle and procedures depend on the definitions of what artifacts the project will produce, since they define the tasks for building those artifacts. These three parts define policy for how the team will work. They are defined early in the project, and change slowly once the project is moving.

The methodology and life cycle combine with the components in the system to define the tasks that need to be done. As system structure is developed, more tasks are revealed. As problems are found or change requests received, other tasks get added.
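The way tasks fall out of combining a life cycle pattern with the known components can be sketched as follows. The four-artifact pattern mirrors the example pattern mentioned earlier in this section; the function name, the task identifiers, and the component names are assumptions for illustration only.

```python
# Hypothetical life cycle pattern: artifacts expected for each component, in order.
ARTIFACT_PATTERN = ["purpose", "specification", "design", "implementation"]


def tasks_for(components: list[str]) -> list[dict]:
    """Expand each component into one task per artifact in the pattern.

    Each task depends on the task for the preceding artifact of the same component.
    """
    tasks = []
    for component in components:
        previous = None
        for artifact in ARTIFACT_PATTERN:
            task_id = f"{component}/{artifact}"
            tasks.append({"id": task_id, "depends_on": [previous] if previous else []})
            previous = task_id
    return tasks


# Early in the project only one component is known.
early = tasks_for(["power"])
# Later, design work reveals two more components, so more tasks appear.
later = tasks_for(["power", "battery", "solar-array"])

print(len(early), len(later))  # 4 12
```

A real project would also add tasks for reviews, defects, and change requests, which this sketch leaves out.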

Planning then looks forward at the tasks that have yet to be done, aiming to understand the medium- to long-term direction for the work. It includes both tasks that are defined at the moment as well as placeholders for work that can be expected. For example, part way into a project to build a small spacecraft, there could be defined tasks to specify and design the electrical power system and expected work to verify its integration with the rest of the spacecraft systems. The plan looks at how tasks depend on each other, the resources needed to complete each one, and whether some tasks will take a long time to complete. It produces a plan for how to order the tasks for efficient execution; this plan provides information to scheduling activities that predict when the project might complete different milestones.

Tasking is the final step, deciding which specific tasks to do next. For example, when someone becomes available to take on another task, this determines which tasks are the best options for them.

All of these depend on information that supports project operations: records of the methodology, life cycle patterns, and procedures; tracking of tasks and schedule information; and tracking of who is available for what kind of work and when.

Development methodology. This defines how the project will organize the overall flow of work in general terms. It defines in what order people will do work, how and how often they will plan that work, how they will break up complex tasks, and so on. Many people are aware of basic development methodology approaches like waterfall, spiral, or agile; these provide basic patterns that can be used to inspire the development methodology that a project actually uses.

The development methodology defines things like: “The project defines periodic intermediate milestones, and then works toward those milestones,” or “the project builds components in multiple iterations, focused on reducing risk and most essential function first.” I discuss development methodologies more in Section 20.3 and Chapter 22.

Life cycle. This defines the overall patterns of actions that the team will perform as it does the project. It defines phases of work and how one phase should happen before another. A typical phase is made up of many tasks; it covers (for example) the work of designing some artifact that is part of one component. The life cycle also defines milestones, which provide planned times when checks on work are done in a phase.

A life cycle pattern says things like: “First work out purpose, then specifications, then design, then implementation. At the end of each of these phases, have a review with one person designated to approve moving forward.”
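A pattern like the one quoted above can be read as a small state machine: a component sits in one phase until a review by the designated approver lets it move on. The sketch below is one possible reading under that assumption; the class name, the phase list, and the approver name are hypothetical.

```python
PHASES = ["purpose", "specification", "design", "implementation"]


class ComponentLifeCycle:
    """Tracks a component's phase; a phase ends only with an approved review."""

    def __init__(self, component: str, approver: str):
        self.component = component
        self.approver = approver  # the one person designated to approve
        self.phase_index = 0
        self.complete = False

    @property
    def phase(self) -> str:
        return PHASES[self.phase_index]

    def review(self, reviewer: str, approved: bool) -> str:
        """Record a review outcome; advance only if the approver says yes."""
        if self.complete:
            raise ValueError("life cycle already complete")
        if reviewer != self.approver:
            raise PermissionError(f"{reviewer} is not the designated approver")
        if approved:
            if self.phase_index == len(PHASES) - 1:
                self.complete = True
            else:
                self.phase_index += 1
        return self.phase


cycle = ComponentLifeCycle("power", approver="lead-engineer")
cycle.review("lead-engineer", approved=False)  # stays in "purpose"
cycle.review("lead-engineer", approved=True)   # moves on
print(cycle.phase)  # specification
```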

There are many different life cycle patterns, and usually an organization or a project will need to pick one—and then customize the life cycle to meet its specific needs. Sometimes the life cycle will be determined by external requirements; for example, NASA defines a common life cycle for all its projects [NPR7120].

Procedures. While the life cycle defines in general what to do, the procedures define how to do some tasks. They provide specific instructions for how to do particular actions or tasks. The instructions might take the form of a checklist, a flow chart, or a narrative.

People on the team need to know how to do things that require coordination. While team members should be able to do most of their work independently, at some point they will need to work together. The work will go more smoothly if everyone understands when they need to work together and how to do it.

There are also some tasks that are procedurally complex, even when only one person is involved. For these tasks it is helpful to have written down the steps to perform—which serve in effect as a checklist.

Procedures should be defined for tasks where getting the actions right is critical or where the task is complex. In the example below, checking a document artifact into a repository is simple, but needs to be done correctly. Performing a design review and approval has potentially many steps to go through: communicating the design to others for review, an approval decision by a designated team member, and changing the status of the design documents to show that they have been released. When the life cycle defines a point in the project when something should be checked, such as during a review, procedures ensure that all the needed checks actually happen.

Documented procedures help the team perform tasks accurately, helping to make sure that steps aren’t missed. They also help the team do those tasks in compatible ways so that one person’s work can build on another’s.

I have seen teams that try to operate without some ground rules for working together. This can work quite well for teams up to three or four people, and when the artifacts they produce do not need high assurance (that is, when what they produce is not safety- or security-critical). On larger teams that have not written down their basic process rules, I have always seen failures to communicate or consult. These failures sometimes led to errors in the system that had to be corrected later once found. Sometimes they led to one person damaging another person’s work, requiring time and effort to recreate overwritten designs.

Documenting procedures also provides a way for the project to learn and improve. If some procedure is not working well, the team can identify which procedure is the problem and then change it. As long as team members then follow the revised procedure, the team’s ability to work should improve over time. Contrast this to not documenting a procedure: some people may have opinions on how to do it better, and they may start doing it the new way, but not everyone will know about the change, and people may forget it after a little while. This makes learning slower and less reliable.

Plan. The plan defines the overall intended path forward to a completed system, along with selected milestones along the way. It is a current best estimate of the general steps needed to move the project toward that goal.

A plan records the approach the team intends to take to build the system. It lays out the phases of work expected, in coarse to medium granularity. In doing so, it records decisions like the flow from specification to design to implementation to verification. It records when the team decides to investigate different ways to design some component, perhaps prototyping some of the ways. It documents expected dependencies and parallelism.

The plan is, therefore, a record of how parts of the life cycle pattern are applied to this specific project. Just as there are many patterns that a project can choose to use, there are many different ways to organize the project’s work. I discuss these choices in depth in Chapter 20.

A plan is not necessarily a schedule. A schedule is usually taken to mean a sequence of events with a high confidence of accuracy and completeness. A plan, on the other hand, reflects the uncertainties that come with developing a complex system. In the beginning, the plan can be specific about a few things in the near term but must be vague about the longer term until enough design has been completed to fill out later work. As a project progresses and more and more becomes known, the plan should converge to something like a schedule.

A plan is broader than a list of specific tasks. It consists of a number of work phases, and dependencies among them. This information then guides the specific tasks, as discussed in the section on tasking below.

Plans are used in prospect, in the moment, and in retrospect. They should provide guidance on what direction the work will likely go in the future, even when that direction has uncertainty. They are used in the present to track what is happening now. They provide history of what has been done, to understand how the team’s work compares to predictions and to provide accountability for everyone responsible for working on the project.

I have never encountered a project that had a single plan for the whole duration of the work. Plans have always been dynamic. Early in a project, we would know that we needed to develop a concept for the system but did not yet know enough to sketch out the work involved in building that concept. Later we had a general structure for the system, but there were technical questions to resolve; once resolved, we would know what we were building. Later in the project, we would find defects or we would get a change order, resulting in unanticipated work.

Tasking. This is the day-to-day definition of tasks to be done, their assignment to team members to perform, and tracking their progress.

Tasking involves continuous decision-making: the choice of which tasks should be performed next, or which tasks should be interrupted to deal with higher-priority tasks. These choices merge several streams of potential tasks: ones that derive from the nearest parts in the plan; ones made newly urgent by a change in what is known about the system; ones about fixing errors that have been discovered; and tasks related to new outside requests.

The team will need to keep track of both the potential tasks and the ones that have been assigned and are being worked on. This implies record-keeping artifacts.

The criteria for deciding about tasks should be encoded in procedures, as discussed above. The procedure for choosing tasks can be viewed as a control system that responds to project events to affect the set of tasks assigned for work, with the aim of making the project’s execution run efficiently. “Efficiently” means meeting the goals set out above for operations: ensuring that the right work is done, that people aren’t blocked from getting work done, and that the work follows orders or dependencies needed for high-quality work.
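A very small version of such a task-selection procedure might look like the sketch below. It assumes each candidate task carries a required skill, a numeric priority (higher meaning more urgent), and a set of prerequisite tasks; these fields, the `next_task` function, and the example tasks are all hypothetical, and a real project would encode richer criteria.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    skill: str
    priority: int  # higher number = more urgent (an assumed convention)
    depends_on: set[str] = field(default_factory=set)


def next_task(candidates: list[Task], done: set[str], skills: set[str]) -> Optional[Task]:
    """Pick the most urgent task that is unblocked and matches the person's skills."""
    ready = [
        t for t in candidates
        if t.name not in done and t.depends_on <= done and t.skill in skills
    ]
    return max(ready, key=lambda t: t.priority, default=None)


# Candidates merged from several streams: the plan, a defect, an outside request.
candidates = [
    Task("design radio", "rf", priority=2),
    Task("implement radio", "rf", priority=3, depends_on={"design radio"}),
    Task("fix battery defect", "power", priority=5),
    Task("answer customer query", "systems", priority=1),
]

# Someone with RF and systems skills becomes available; nothing is done yet.
choice = next_task(candidates, done=set(), skills={"rf", "systems"})
print(choice.name)  # design radio
```

The battery defect has the highest priority but needs a skill this person lacks, and the radio implementation is blocked on its design, so the design task is chosen.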

How the tasking control system works depends on the development methodology used in the project. Agile development, for example, often focuses on making tasking decisions at regular intervals (for each “sprint”); other methodologies focus on making tasking decisions continuously.

Support. The decisions made during operations take into account several kinds of supporting information. These include:

The project’s budget. This accounts for the money and other resources made available to build the system and how much has been used to date. This can be combined with the estimates in the plan to determine whether the project has enough resources to complete, or to reach intermediate milestones.
Risk. This is a list of potential problems that could affect the project’s execution. These risks are things like “part X may not be delivered on time” or “regulator Y may object to a design decision”. Good management practice keeps track of these, checks in on them regularly, and works to either eliminate the risk or mitigate the outcome.
Uncertainty. This is a list of the technical uncertainties in the system—things like “are batteries of energy density X available” or “the control algorithm that maintains output within range Y is not yet determined”. These uncertainties can be addressed by including work in the plan to investigate the questions. Finding answers will allow the plan to become a more accurate estimate of the path forward, as well as leading to choices in the system design.
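The budget check described in the first item above is simple arithmetic: compare what remains of the budget with the estimated cost of the work still in the plan. The sketch below shows that comparison with made-up figures; the function name and the returned field names are assumptions.

```python
def budget_outlook(total_budget: int, spent: int, remaining_estimates: list[int]) -> dict:
    """Compare what is left of the budget with the estimated cost of planned work."""
    remaining_budget = total_budget - spent
    estimated_remaining_cost = sum(remaining_estimates)
    return {
        "remaining_budget": remaining_budget,
        "estimated_remaining_cost": estimated_remaining_cost,
        "margin": remaining_budget - estimated_remaining_cost,
        "sufficient": remaining_budget >= estimated_remaining_cost,
    }


# Made-up figures in arbitrary currency units.
outlook = budget_outlook(total_budget=1000, spent=400, remaining_estimates=[250, 200, 100])
print(outlook["margin"], outlook["sufficient"])  # 50 True
```

Because plan estimates carry uncertainty, a small positive margin like this one would normally be read as a warning rather than as comfort.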

Sidebar: Resource-constrained projects

Traditional project planning approaches grew out of projects, such as building construction, that focus first on time and budget. This kind of project treats the completion date as the driving factor in organizing work, assumes that in general as many workers can be brought in as are needed to complete the work quickly, and that parallelism between tasks is limited primarily by dependencies between tasks. For example, in building a house, one contractor typically brings in a team to frame the structure, while another brings in a team to add the electrical wiring or plumbing into the structure. Each of these teams can bring in as many people as needed to get the work done, and then those people go on to another construction project elsewhere when their part is done.

This model of project planning leads to tools organized around a graph of dependencies between tasks. These tools usually provide analyses like critical path analysis, which shows the longest path through the graph of tasks and therefore the hardest constraint on how quickly the work can be completed. Planning the project well often hinges on understanding the dependencies between tasks and the critical path through them.
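To make critical path analysis concrete, here is a minimal sketch that computes the longest path through a small task graph. It assumes each task has a known duration and that unlimited workers are available, as the traditional model does. The task names echo the house-building example, but the durations and the `critical_path` function are invented for illustration and do not represent any particular planning tool.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations: dict[str, int], depends_on: dict[str, set[str]]):
    """Return (total duration, task list) for the longest path through the graph."""
    finish = {}  # earliest finish time of each task
    via = {}     # predecessor on the longest path leading to each task
    for task in TopologicalSorter(depends_on).static_order():
        preds = depends_on.get(task, set())
        # A task starts when its slowest prerequisite finishes.
        latest = max(preds, key=lambda p: finish[p], default=None)
        start = finish[latest] if latest is not None else 0
        finish[task] = start + durations[task]
        via[task] = latest
    end = max(finish, key=lambda t: finish[t])
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = via[node]
    return finish[end], path[::-1]


# Durations in days (made up).
durations = {"frame": 10, "wire": 4, "plumb": 6, "inspect": 2}
depends_on = {
    "frame": set(),
    "wire": {"frame"},
    "plumb": {"frame"},
    "inspect": {"wire", "plumb"},
}

length, path = critical_path(durations, depends_on)
print(length, path)  # 18 ['frame', 'plumb', 'inspect']
```

Wiring is not on the critical path here: it could slip by two days without delaying the inspection, while any slip in framing or plumbing delays the whole job.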

Most complex technical system projects, on the other hand, do not fit this model well. Each person working on the project needs to understand the context of their work, and there is usually a substantial cost to add someone to the project—largely in them learning about how the project works and how the system is organized. The collection of trained people on the team constitutes a valuable resource that the organization tries to keep around to maintain the system or to work on similar systems.

This situation calls for a different approach to planning work. While dependencies are certainly important, there are often many tasks that any one person can work on (and it is common to expect some degree of multitasking). In this case, getting the order of operations precisely right is not as important. It is more important to ensure that everyone can stay busy and that any major dependencies are accounted for.

7.4 Using this model

This chapter has presented a model for thinking about the work involved in making a system. This model, in itself, does not prescribe any particular way of managing building a system; it only names the topics that need to be addressed and provides some objectives by which an approach can be judged.

In Part IV, I go into more depth about each of the elements in this model.

Those who manage a project will need to decide how they will go about organizing their work. As I noted earlier, how a project is organized and run is itself a system, and the techniques discussed in this book apply as much to designing and operating the project’s operations as they do to designing, building, and operating the system product. Chapter 6 and Part III discuss the model for what a system is.

Part V discusses how the work of building a system can be organized around the life cycle of a project. Chapter 23 introduces the idea of a life cycle. It also introduces the idea that a life cycle model provides a basis for working out the tasks that need to be done to build the system. Subsequent chapters discuss each of the phases of a life cycle, along with the artifacts and activities that go into each one.

Part XII discusses ways to organize the team that will do the work.

Part XIV presents approaches for planning and organizing the tasks that need to be done.