Making systems - 19: Teams

19.1 Purpose
19.2 Model of teams
19.2.1 Communication
19.2.2 Groups
19.2.3 Trust
19.2.4 Authority and responsibility
19.2.5 Division of labor
19.3 Using model of teams
19.3.1 Team culture and structure as a control system
19.3.2 When to use the team model
19.3.3 Example: conflicting instructions leading to inconsistent design
19.4 Directory

Business enterprise (or any other institution) has only one true resource: man. It performs by making human resources productive. It accomplishes its performance through work. To make work productive is, therefore, an essential function. But at the same time, these institutions in today’s society are increasingly the means through which individual human beings find their livelihood, find their access to social status, to community, and to individual achievement and satisfaction. To make the worker achieving is, therefore, more and more important and is a measure of the performance of an institution.

— Peter F. Drucker [Drucker93, Chapter 4]

Building a complex system involves a team of people to do the work. The people in the team fill many different roles: developers, managers, customer and regulatory interfaces, support staff, among others.

A team of more than perhaps three or four is not an amorphous blob of anonymous people; it is organized so that each person has a role. The way a team is organized may arise spontaneously or deliberately, but it will end up with an organization. A well-functioning project will design its team organization and take deliberate actions to maintain its good function.

In this chapter I discuss the issues to be addressed when deciding how a team should be organized, including its structure, roles, and communication.

19.1 Purpose

Building a complex system requires many people to share the work. One person cannot do all of the work: they will be overwhelmed, it will take too long to complete the system, and the project will likely require skills no one person has.

In the model of making systems (Chapter 16), the team consists of a group of people who do the tasks that create the artifacts that make up the system. The team is informed by the project’s operations—plans, procedures, life cycle—and uses tools to do its tasks.

The team is a social entity. The people in the team work together and interact constantly. How well they get along with each other influences how well they get work done.

A team is, however, less than a complete society. The team’s social structure is relevant only to the work they do on the shared project. The team structure does not define how the people on the team organize the rest of their lives: that falls to community and family interaction. This means that the social structures in a team are simpler than those of a complete society.[1]

It is generally understood that the structure of a system is homomorphic to the structure of the organization that is building the system [Conway68]. This means that people must work to ensure that the structure of the team and the structure of the system are compatible, possibly by organizing the team around the system structure when possible. Doing so requires having an understanding of what the system structure is, and the hierarchical component breakdown (Chapter 41) provides part of that understanding. In the other direction, the team’s organization will inevitably bias how the system is organized and built; being aware of the two organizations helps one to see unhelpful bias reflected in the system organization.

One can look at the purpose and needs of the team from the point of view of the people in the team and of the customers, organization, and funders who want to see the system built (see Section 16.2 for a discussion of these stakeholders).

Members of the team (Section 16.2.2, Section A.2.2) generally look for satisfaction in their work, enough help to get the work done, and a working environment that gives them a secure sense of how to do their work. Team members generally want team cohesion, when the people have developed bonds and trust that allow them to work together without friction. That is, they are motivated first by how the project affects them. The needs of the stakeholders are a secondary concern, mainly in how meeting those needs contributes to satisfaction and compensation.

Other stakeholders (Section 16.2.1 through Section 16.2.5) look to the team to build the system efficiently and accurately. They are motivated by the value that having the system completed will bring and by the cost of building it. The needs of the team members are secondary, in the ways that the well-being of the team contributes to the cost or benefit of building the system.

Meeting the stakeholder needs involves:

The team making system parts that are consistent with each other, so that the parts can be integrated into the whole;
The team doing its work accurately, so that few errors need to be fixed and as little work as possible needs to be redone;
The team using resources efficiently, both in terms of who on the team does different work and how the team uses external resources like tools;
Team resilience so that the project keeps going even when people have problems; and
Team sustainability so that the team continues working at good efficiency without burning out.

An effective team balances these two classes of need—those of the people on the team and those of external stakeholders. The needs can be in conflict when the need to build a system efficiently and rapidly means that someone on the team has to do a task that they don’t enjoy. More often, both classes of need can be met by organizing the team and its culture. A team member’s satisfaction increases when they have confidence that their work is contributing to the project’s success, which comes in part by assigning tasks to the most appropriate people, avoiding duplication and rework, and ensuring that people communicate well. In general, when a team is able to use resources—people, tools, funding—effectively, the team members’ confidence in the project will increase.

19.2 Model of teams

The following is a model for reasoning about teams. I will use this model in a later section (Section 19.3) to discuss how a team’s structure and culture can be understood, and how that can be used to manage a team.

The model begins with people. Teams are fundamentally social structures, made up of a group of people, each of whom has their own skills and experience. These people are sharing the work of building the system and of the needed supporting activities.

The role of the team in a project is to do the work of building the system. The work can be understood in terms of time-limited tasks and ongoing roles. A task is a particular piece of the work, with an intended result and a limited duration. A role is an ongoing assignment of responsibility, which leads to performing tasks within the scope of that responsibility.

Consider a team from the point of view of one team member. That team member has tasks to do, and roles for which they are responsible. They need to know what tasks they should be doing, what roles they are responsible for, and what they are not responsible for (so that they can refer instead to others who do have the appropriate role). As they do their tasks, they need input: how their task fits with other tasks, including ones that other people are doing, and how parts of the system are supposed to work. They will have questions to ask of others. In the course of doing a task, they will make decisions—about concept, about design, about implementation. These decisions will in turn affect others. From time to time they will find problems, both technical and social, and will need to identify who to work with to resolve the problems.

At the same time, the team member sees themselves as part of the group. They will need to understand the team’s culture and norms. They will want their social needs met, developing trustful relationships with others they work with. The personal relations that someone has with others on the team influence who they choose to work with and who they will avoid, and influence how well they work together when they need to.

How someone works in a team can be expressed in terms of the team’s basic structure. The elements of this structure include:

Communication: how people ask questions and convey information about the work being done.
Groups: how people join together to work closely on some part of the project.
Trust: how the people on the team relate to each other.
Authority and responsibility: the scope of responsibility that each person on the team has.
Division of labor: how different people take on different kinds of tasks.

The objectives of the team and other stakeholders are emergent properties that arise from the low-level interactions among people on the team, following the structure.

Some of these elements deal with separating people from each other, while others deal with uniting them [Durkheim33, Chapter 3, pp. 115-122]. Authority and division of labor are about how each person has their own role and is expected to stay within the bounds of that role. Communication, groups, and trust, on the other hand, are about how people are joined together to achieve more than they could individually. A team needs both to function well: the ability to work in a group depends in part on each person knowing their role.

19.2.1 Communication

The communication elements of the model describe how people on the team share information about the project.

The work of building the system will be divided up amongst the team members. When one person, for example, designs one component, they will need to communicate with the people designing related components (using the model of relations in Chapter 12). Similarly, a team member who is handling planning and tasking (Section 20.2) will communicate with many other team members to track progress and status.

There are four general times when people will need to communicate:

When they are looking for information that another person may have. For example, when someone finds they need to know how some component is going to behave.
When they have information that will affect someone else’s work. For example, when one person decides on a component design, and that component interacts with another component.
When they need a decision or action. For example, when someone has completed a proposed design and procedures indicate that the design should be reviewed and approved before moving to implementation, or when someone has a team problem that needs to be resolved at a higher level.
When a decision or action has results. For example, when reviews are done, or when action is being taken on a team problem.

Communication can push information from where it is generated or known to people who need that information. Communication can also pull information from someone who has it, by asking them a question.

Communication can happen interactively or asynchronously. Interactive communication happens when two people are communicating directly with each other. Asynchronous communication happens when one person makes information available and another finds that information later. Documentation is a way for one person to communicate with another over long periods of time.

Communication happens when a decision or action is needed, or when one of them has produced a result that others need to know about.

Communication patterns can thus be characterized by:

When people communicate: interactively, over long periods of time, immediately after information is generated, in response to a need.
With whom people communicate: who knows what information, who asks whom to request information, how people find the person or document artifact from which to get information. How do people make decisions about who to communicate with? What incentivizes people to choose a good person to talk to?
Is the communication direct from one person to another, or indirect through an intermediate person or document?
What kind of communication is needed: normal technical communication, normal operational communication, or exceptional communication.

These communication patterns are encoded in team culture, in procedures that people use to do tasks, and in how people are organized into groups.

19.2.2 Groups

Many people like to work together: interacting regularly, sharing work, building social bonds. Working in a group is helpful when people are working on closely-related tasks or have closely-related roles. How closely depends on the person; some people are gregarious and gravitate toward groups while others reserve their interactions for fewer, more trusted people.

People can come together as a group when doing tasks together, or closely-related tasks requiring lots of interaction. They can do so spontaneously based on the work, or because a group is organized deliberately. People can also come together based on shared interests, experience, or work discipline.

A group is more than just people who communicate a lot. A group generally gives its members some sense of identity and shared purpose.

One person can be part of multiple groups. It is common, for example, for one person to be part of one group that has been deliberately organized to work on a collection of components, while being part of a second deliberately-organized group based on work discipline, as well as being part of ad hoc, informal groups based on social interactions.

Groups can promote trust. When the people in a group behave respectfully toward each other and demonstrate behavior in line with team norms, the high level of interaction within a group provides a way for the group members to establish trust. When trust develops within a group, it can also promote feelings of trust for people outside the group: if person A recommends person B to person C, and C trusts A, then C is more likely to assume that B is trustworthy.

Groups can also promote distrust. If two people within one group don’t get along, they can create a rift among more people. A group also runs the risk of in-group identity turning into out-group dislike, expressing itself as teams working in silos because they lack trust for people in the out-group.

Sometimes people need to form a group with people they don’t get on with. This happens when there is a need for them to work together that overrides their relations with each other.

Groups can be characterized by:

Who is in the group? What is the basis for determining who is in the group?
What is the commonality between people in the group?
Is the group established deliberately, or does it come together spontaneously?

19.2.3 Trust

Trust is a condition describing part of the relations between people in the team.

Trust arises from social norms and respect. By norms, I mean standards of behavior both for interaction between people and for technical work, to which everyone on the team is expected to conform. By respect, I mean each one believing that the others have worth or value, and acting accordingly.[2] Trust is the confidence that others will follow the team’s norms, and act and communicate with respect.

Trust starts by one person learning from experience that they can trust another person. Trust arises from demonstrated behavior. People may enter into a working relationship with someone with a predisposition to trust them, but that is different from demonstrated reasons for trust. A team culture that incentivizes people to behave in trustworthy ways can result in that predisposition when someone learns that someone they trust also trusts a third person. A team, however, cannot meaningfully incentivize trusting someone; the team can only incentivize someone behaving in a way that can earn someone else’s trust.

Because trust comes out of experience working together, not everyone in a large team will know everyone else well enough to have a trusting relationship. In those cases, trust operates at a level of groups rather than individuals: person A believes that the people in group C are trustworthy based on reputation and team cultural norms. This is a weaker form of trust but just as essential for a well-functioning team.

Ideally, trust is reciprocal but it does not have to be.

When person A trusts person B, the two of them can work together more effectively compared to when they do not trust each other. A can share work with B and expect that B will follow the team’s norms about doing accurate work and communicating well. B can expect that A will delegate a task and then respect B enough to avoid micromanaging them. As long as the trust remains, both A and B have less anxiety about the work being done, both are more productive, and both get greater satisfaction than they would otherwise.

Lack of trust leads to the opposite results. If A assumes that B will not behave in ways that accord with team norms, then A will believe that they need to check on B’s work more often. A and B will share less information with each other and will be less willing to share work. Poorer communication will lead to errors in the work, and result in more work and greater anxiety for both parties.

A breakdown of trust can happen between groups as well as between individuals. When a team has a breakdown of trust, they do not communicate. Factions within the team stop coordinating their work, hiding information from each other. I was part of one large multi-company software project with teams at several sites; the teams would try to undermine each other in order to get their version of some software component accepted into the system. After a few years the project ended and the product languished. As another example, specific failures on the Boeing CST-100 Starliner crew capsule have been blamed in part on team mistrust:

Neither team trusted one another, however. When the ground software team would visit their colleagues in Texas, and vice versa, the interactions were limited. The two teams ended up operating mostly in silos, not really sharing their work with one another. The Florida software team came to believe that the Texas team working on flight software had fallen behind but didn’t want to acknowledge it. (A Boeing spokesperson denied there was any such friction.)

—Eric Berger in Ars Technica [Berger24].

Trust can be characterized by:

The degree to which each person on the team trusts each other person.
The degree to which each person believes people they do not know well to be trustworthy, based on reputation.
The norms that people in the team are expected to follow, on which trust is based.

19.2.4 Authority and responsibility

While the previous model elements—communication, groups, and trust—are about people uniting to work together, the next two elements are about how people are different from each other.

In effective teams, each person does the right work. They know what is expected of them, and what is beyond the scope of their authority.

Authority and responsibility deal with how the project’s work is split among the team members; that is, the role that each person has.

I treat authority and responsibility as two parts of the same thing. Authority is the right to make decisions or do work on some topic. Responsibility is the obligation to do that work, and to do it well. The two go together: responsibility without authority is perverse, while authority without responsibility means bad decisions.

A role is associated with some scope of work. The scope defines what subjects the person is responsible for. The scope can be defined many ways as long as its meaning is clear enough that everyone will interpret it the same way. Scope for technical work might be based on system component (“person A is responsible for the design of component X”). It might be based on discipline (“person B is responsible for all security analyses”). It can also be based on a procedure (“person C is responsible for making orders from vendor Y”), or on operational work (“person D maintains the plan for meeting the Z milestone”).

The scope defines the right to make decisions or take actions. If one person has the role of designing component X, they are responsible for ensuring that component X is well designed and they have the authority to work out what that design is.

Conversely, if some topic is outside someone’s identified role, they must refrain from making decisions or taking responsibility for that topic.

A role is different from a task. A role is a long-term, ongoing responsibility to do work associated with a scope. That work may include tasks that fall within that scope, but a task has limited duration and a concrete intended result. The person who has a role is often responsible for doing work that is not part of a specific task. For example, the person responsible for design of some component will handle the task to create the design or tasks to correct errors in the design, but they are also responsible for answering questions about that component from other people on the team.

The goal is that every element of the work has someone who is responsible for it, every person has something they are responsible for, and that it is clear to everyone who is responsible for what.

Communication. When someone has authority to make decisions, at some point they need to communicate those decisions (or their effects) to others who will use that information in making their own decisions and taking their own actions.

Sharing roles. More than one person may take on a role. For example, a role that involves providing support to a customer may have more work than one person can do. When people share a role, they have a responsibility to coordinate their work so they give consistent answers or make consistent decisions. That they share the role should be clear to all the people involved and to people who may need to work with them.

Inadvertent overlap in roles can lead to errors. If two people both believe they have the authority to do a certain bit of work, but they are not aware they are sharing the role, they can make conflicting decisions, tell others conflicting information, or produce conflicting artifacts. Each of these situations can lead to errors in the system being built or in the way the team operates.
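The goal stated above (every element of the work has an owner, every person owns something, and sharing is explicit rather than accidental) can be checked mechanically once role assignments are written down. The following is a minimal sketch, not a prescribed tool; the names `Role`, `check_roles`, and the `shared` flag are hypothetical, introduced only for illustration, and the scopes reuse the examples from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A scope of work and the people who believe they hold it (hypothetical model)."""
    scope: str
    holders: set[str] = field(default_factory=set)
    shared: bool = False  # True only when the holders know they share the role


def check_roles(scopes, people, roles):
    """Return (unowned scopes, people with no role, inadvertent overlaps)."""
    by_scope = {}
    for role in roles:
        by_scope.setdefault(role.scope, []).append(role)

    # Work that nobody is responsible for.
    unowned = sorted(s for s in scopes
                     if not any(r.holders for r in by_scope.get(s, [])))

    # People who are responsible for nothing.
    assigned = set().union(*(r.holders for r in roles))
    idle = sorted(set(people) - assigned)

    # Scopes claimed by several people who have not agreed to share the role.
    overlaps = []
    for scope, entries in by_scope.items():
        holders = set().union(*(r.holders for r in entries))
        declared = len(entries) == 1 and entries[0].shared
        if len(holders) > 1 and not declared:
            overlaps.append(scope)
    return unowned, idle, sorted(overlaps)


roles = [
    Role("design of component X", {"A"}),
    Role("security analyses", {"B"}),
    Role("design of component X", {"C"}),  # C also believes they own X
    Role("customer support", {"D", "E"}, shared=True),
]
scopes = ["design of component X", "security analyses",
          "customer support", "orders from vendor Y"]
people = ["A", "B", "C", "D", "E", "F"]

unowned, idle, overlaps = check_roles(scopes, people, roles)
print(unowned)   # ['orders from vendor Y']
print(idle)      # ['F']
print(overlaps)  # ['design of component X']
```

The deliberately shared support role is not reported, while the two separate claims on component X are. A check like this only sees what has been recorded; it cannot detect an overlap that exists solely in two people's heads.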

Delegation. Authority and responsibility can be delegated. Delegation means that one person confers some part of their role to someone else, possibly for a defined period or with restrictions on the kind of authority granted. The delegation might transfer the role from one person to another, so that the first person no longer fills the role (perhaps temporarily). Alternately, the role may be shared with the other person, in which case both people are responsible for the work and for coordinating with each other. A delegated role might also be rescinded.

One way to use delegation is for one person to have the overall role for some system component, and for that person to delegate responsibility for specification to someone skilled in specification, delegate design to a designer, and so on. The person with overall responsibility for the component typically reserves authority to review and approve work that has been delegated to others.

Sidebar: Delegation and micromanagement

Projects involving many people require sharing work. If someone doesn’t share work, then they will be overwhelmed, will take too long to get work done, and will be a single point of failure in the project.

Delegating or sharing work implies a dynamic between the two people involved. Person A delegating the work defines the work that Person B, the delegatee, is to do. Person B does the work and periodically gives progress updates. Once the work is delegated, Person B can proceed independently and Person A can turn their attention to other things.

One way this can go wrong is if Person A doesn’t let Person B get on with the work independently, and instead tries to micromanage the work. Learning the habit of managing loosely takes time and effort, and it requires trust between the two people involved. That trust in turn depends on Person A having confidence that Person B will follow shared norms in doing the work.

Another way this can go wrong is if Person B isn’t able to complete the work independently. If Person B finds a problem with the work, such as a design error, that is beyond their scope, they can raise the issue to Person A and jointly resolve the problem. If Person B is unable to do the work, perhaps because they don’t understand the problem or find they lack a necessary skill, they can raise the issue and jointly handle the problem. If Person B tries to muddle through, however, they stand a good chance of not doing the work needed, leading to Person A needing to check their work in detail and possibly redo the work.

In other words, sharing work requires having clear expectations of how to define delegated work and when to raise exceptions.

Resilience. A well-functioning team is able to handle problems when they arise. A team’s resilience depends in part on how authority is structured within the team.

There are several kinds of problems a team will encounter:

Someone being unavailable for a while or leaving the team; for example, when someone becomes ill for a while. Handling someone becoming unavailable involves allowing for redundancy in role assignments, and having one or more people who can take over when someone is unavailable. This implies that the person who takes over will know what they need to know—meaning there is communication before something happens.
Someone making technical mistakes. When this happens, the mistake must be detected, then resolved, then steps taken to reduce the risk of similar mistakes happening again. This typically involves defining procedures to review work to catch mistakes, and assigning someone the role to check that the reviews have happened as well as assigning others roles to perform reviews. (This is related to Section 8.2.6—Principle: Build in checks.)
Someone making operational mistakes. This is similar to a technical mistake, but the effects of an operational mistake can affect how a team works, not just the system itself. Some operational mistakes reported by whistleblowers have legal implications. In many organizations, handling these mistakes includes giving someone a role to hear reports of problems from anyone in the organization, and then act on the reports.
People having disagreements. Handling disagreement usually involves giving some third party a responsibility to hear about disagreements and authority to resolve them.

There are patterns in common to how many of these problems can be planned for. Providing redundancy in how authority is organized is at the core: planning in advance for someone to take over important roles when needed, building in checks of work, and assigning roles that create alternative communication paths to resolve problems. All of these in turn depend on communication so that someone can take over a role or check work.

Formally, these kinds of structures add nuance to the definitions of scope for the roles that need to be resilient. For example, three kinds of roles are defined to catch and resolve technical mistakes: the role to do the work, the role to check it, and the role to ensure that the check is done. These imply a limitation on the authority of the first role to make any arbitrary decision about the work, because the work must be checked by someone else. They also add a responsibility to ensure that the work is reviewable (for example, adequately documented) and that the relevant artifacts are communicated to reviewers. Similarly, having someone who can take over a role implies that the backup is responsible for keeping current and stepping in when needed—and also refraining from acting on the role when the regular person is doing their job.
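The three-role structure and the backup arrangement can be written down as a small data structure with checks for the gaps that reduce resilience. This is a sketch under stated assumptions: `WorkItem`, `resilience_gaps`, `acting_person`, and the field name `assurer` (for the role that ensures the check is done) are hypothetical names chosen for this illustration, not terms from any particular process standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkItem:
    """Hypothetical record of who fills each resilience-related role."""
    name: str
    doer: str                      # role to do the work
    checker: Optional[str]         # role to check it
    assurer: Optional[str]         # role to ensure that the check is done
    backup: Optional[str] = None   # who takes over if the doer is unavailable


def resilience_gaps(item):
    """List the ways this assignment falls short of the structure in the text."""
    gaps = []
    if item.checker is None:
        gaps.append("no one checks the work")
    elif item.checker == item.doer:
        gaps.append("doer checks their own work")
    if item.assurer is None:
        gaps.append("no one ensures the check happens")
    if item.backup is None:
        gaps.append("no backup for the doer")
    elif item.backup == item.doer:
        gaps.append("backup is the same person as the doer")
    return gaps


def acting_person(item, unavailable):
    """The backup acts only while the regular person is unavailable."""
    if item.doer not in unavailable:
        return item.doer
    if item.backup is not None and item.backup not in unavailable:
        return item.backup
    return None  # nobody can act: the role is a single point of failure


design_x = WorkItem("design of component X",
                    doer="A", checker="B", assurer="C", backup="D")
solo = WorkItem("build scripts", doer="E", checker="E", assurer=None)

print(resilience_gaps(design_x))        # []
print(resilience_gaps(solo))
print(acting_person(design_x, set()))   # A
print(acting_person(design_x, {"A"}))   # D
print(acting_person(solo, {"E"}))       # None
```

`acting_person` encodes the restraint described above: the backup is returned only when the regular person is unavailable, so the two never act on the role at the same time.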

19.2.5 Division of labor

Division of labor is the principle that people do different kinds of work, meaning they have different authority and responsibility. This is desirable because different people have different skills and experience, and because work should not be duplicated unnecessarily.

Division of labor in systems-building is different from the classical usage of the term. The original usage was about a serial production system or assembly line, where one person does one step, hands the result to someone else who does a second step, and so on until the product is complete. (Smith, for example, uses the example of making pins [Smith22, Book I, Part 1].) The argument is that a worker’s specialization leads to increased skill that improves their productivity, and that avoiding the cost of switching from one task to another eliminates wasted time.

Division of labor is directly related to roles as discussed in the previous section. The roles define the units of labor to be divided among the team members.

Systems work divides labor in more ways than just serial production. Work can be divided by component, with a hierarchical structure from system to lowest-level component. It can be divided into supporting roles, such as planning and team management, versus system-building roles. Not all roles need full time attention, leading to one person taking on multiple roles. Some roles are associated with specific procedures, such as coordinating purchasing.

Someone in the team has a role to decide how roles are assigned. This might be one person for a small team, or the role might be divided up and distributed to multiple people. These people should follow well-understood norms and procedures for making the decisions about who is assigned what role, including communicating those decisions to everyone affected. The way roles are assigned should take advantage of the way people differ: in their likes, their skills, their experience, and their desired growth.

The way work or roles are divided affects how people grow their skills and experience. If people are assigned work based only on their current skills, they will not grow. Giving people tasks that stretch them can lead to improved skills, but can also lead to them doing the work badly and learning bad habits. Learning works best when someone being stretched can get mentorship from someone with relevant skills or experience.

19.3 Using model of teams

The high-level objectives such as efficient and accurate system-building or team cohesion are properties that emerge from the details of how people on the team interact. The structure and norms of the team can be designed and managed to promote these objectives, and the model above provides a way to think about the structure.

Note that these properties emerge from how people actually behave, not from how the team is designed or how it is supposed to work. That is, the outcomes depend on the mental models that each team member has of how they work in the team, and the habits that come from their mental models.

Achieving desirable outcomes therefore means getting two things right: designing and maintaining a good intended structure for the team, and the team taking that structure on board and behaving accordingly.

19.3.1 Team culture and structure as a control system

Leveson et al. [Leveson11] discuss how to design systems so that they produce desired emergent behaviors while avoiding undesired behaviors. Their approach treats the problem as a control system, where a control process monitors and shapes the behavior of a lower-level process in ways that lead to the desired high-level results.

The control system in this approach consists of a controller, which monitors the state of the team (the controlled process) and makes decisions about actions the team should take. One or more people in the team take on the role of being the controller. The controller has a process model, which includes the controller’s beliefs about what the team should achieve, how the team is structured, and how all the people in the team are doing. The controller gets feedback from the team in the form of observed behavior and of things team members tell them. Once the controller determines that it is time to act on some issue, the controller can take steps (control actions) to change the team’s behavior.
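The loop described above (feedback updates the controller's process model, and the controller issues control actions when the model shows an issue) can be sketched in a few lines. This is an illustrative toy, not the method of [Leveson11]: the classes `Feedback` and `Controller`, the numeric measures, and the threshold rule for deciding when to act are all assumptions made for the example. In a real team the process model is a set of beliefs held by people, not a dictionary of numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """One observation reaching the controller (hypothetical structure)."""
    source: str   # who reported it or was observed
    topic: str    # what the observation is about
    value: float  # an assumed numeric measure for the topic


@dataclass
class Controller:
    # Process model: what the controller believes is acceptable (limits)
    # and what it currently believes about the team (beliefs).
    limits: dict[str, float]
    beliefs: dict[str, float] = field(default_factory=dict)

    def observe(self, fb):
        """Update the process model from feedback."""
        self.beliefs[fb.topic] = fb.value

    def decide(self):
        """Return a control action for each topic believed to exceed its limit."""
        return [f"address {topic}"
                for topic, value in sorted(self.beliefs.items())
                if topic in self.limits and value > self.limits[topic]]


controller = Controller(limits={"open review requests": 5,
                                "unanswered questions": 3})
controller.observe(Feedback("B", "open review requests", 8))
controller.observe(Feedback("C", "unanswered questions", 1))

actions = controller.decide()
print(actions)  # ['address open review requests']
```

The sketch makes one property of the approach visible: the controller acts on its beliefs, so if feedback never arrives for a topic, no action is taken on it regardless of what is actually happening in the team.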

The social norms and habits of respect come from the example set by the team’s leaders: those team members who have greater scopes of authority, or who are recognized as experienced in their discipline. In practice, one team member who is working on the details of some component has little influence to create the team’s norms, but can cause disorder and disrespect that spreads. The establishment and following of positive norms is a collective action problem that requires some degree of compulsion [Olson65]. Preferably, the compulsion is in the form of rewards for following good examples of adhering to norms, but sanction is needed to back up the rewards.

The model in this chapter can serve to organize the design of this control system.

Who is responsible? The responsibility for making the team work is spread over everyone in the team.

Looking at the team as a control system, there are two reciprocal classes of roles: the roles that fill controller functions and those for everyone on the team (the controlled process).

Everyone on the team has the role of being a team member. This role has the responsibility of following team norms and procedures. In terms of a control system, each person is responsible for accepting and following instructions from the controller, and for providing feedback about work and about how the team is functioning. In particular, when anyone on the team detects that there is a problem in how the team is functioning, they are responsible for communicating about the issue with someone whose role includes resolving the issue.

The controller part of the control system can be broken down into three classes of roles. These are:

The observer role: a person who receives feedback in the control system, meaning they observe how team members are doing their work and are responsible for deciding when there may be an issue to resolve.
The decider role: a person who is responsible for deciding how to respond to an issue; that is, for deciding on a control action that should address whatever situation has occurred.
The exceptional role: a person who is responsible for detecting problems with the normal control system roles or for receiving reports about them. This role comes into play when the normal observer and decider roles are not handling a problem. When someone reports a problem with the control system, it is sometimes called a skip-level or whistleblower report.

These roles can be used to support many different team structures. For example, a traditional hierarchical department/team structure can be represented by each department’s or team’s manager filling the observer and decider role for their department or team. The manager over a manager can fill the exceptional role to address problems with a manager’s work. Separately, many organizations create an explicit whistleblower function to address potential corruption or illegal behavior; the people in this function then fill part of the exceptional role.

Process model. This model is how the controller understands both the objectives for the team and the state of the team.

The process model includes the team’s objectives and how well the team is meeting them; its structure, and how people are working with that structure (or not); the roles each person on the team has and how they are progressing on the work associated with those roles; and generally how well each person is doing.

The observer and decider roles use this information to determine when part of the team is not working as it should, and to decide what steps to take to make things work better.

Unlike the control system for a machine, the process model for a team accounts for the well-being of the people on the team.

The process model also needs to consider what people on the team actually understand of the team’s culture and procedures, and their roles. In managing a team of skilled and well-meaning people, I have found that miscommunication or misunderstanding is the most likely source of problems.

Control decisions. Those people who have the decider and exceptional roles are responsible for deciding when there is an issue to be addressed, and what actions to take. Sometimes when there is some indication of an issue, the choice will be to wait and gather more information.

Some issues may appear in one part of the team but, on investigation, will be found to have causes in other parts of the team. If the decider or exceptional role is shared among multiple people, decisions will require deciders working together.

Problems in team execution can arise because the team has outgrown its current structure, not because any one person is behaving wrongly. I discuss this further below.

Control actions. The control actions are how people influence the team to keep it working on track.

The example set by a team’s leaders is perhaps the most important influence. If someone the team looks up to does some activity one way, others are likely to follow: if that person is seen to carefully follow a procedure for getting design reviews, for example, others will be motivated to do so as well.

This raises the question of who is considered a leader. Leadership is a social construct; it is not necessarily an explicit role that someone in the team is given. People who are given roles with extensive scope of authority and responsibility are often seen as leaders. Others who are understood to have experience, who mentor other team members, and who establish social connections are also treated as leaders. Having this level of social influence in the team comes with a responsibility to model desired behavior, and should be considered when taking action to fix a team performance issue.

Instructions from one person to another based on scope of authority are a second kind of control action. If someone has a role of managing a subteam, the manager (who has a decider role) can instruct someone on the team to change their behavior. The instruction need not be hierarchical; for example, when two people are peers working on designing related components and are expected to come to agreement on how those components will interact, one of them can inform the other that they will not agree to some part of an interface design.

As I noted in the sidebar on delegation and micromanagement above, there are choices to be made about how instruction should be given. It can be directive, telling the recipient exactly what they should (or should not) do. This is appropriate when related to following a procedure that requires precision, such as operating test equipment that has the potential to cause injury. In other situations this can turn into micromanagement and inhibit the recipient’s ability to improve and work independently. On the other hand, the instruction can take the form of letting someone know that there is a problem and letting them work out how to address the problem. This approach helps the recipient learn and grow, especially if they can discuss potential solutions to get feedback. However, if the recipient is not able to figure out how to make a change, this approach will leave a problem unresolved.

Next, a decider can address an issue through education. Mentoring someone about part of their work can improve their work in the long term as well as addressing an immediate issue.

Finally, sometimes the appropriate control action to address an issue is to change the team’s structure. This can happen as the team grows and authority and communication structure reaches a scalability limit. It can also happen as a project transitions from one phase of work to another—for example, when moving from the initial design and implementation into preparing for placing the system in operation.

Team member behavior. The people on the team make up the controlled process in this approach. A controlled process receives communication from the controller with input that is intended to change the process’s behavior. The process then generates feedback to the controller as it goes about its behaviors.

The team is made up of people, not machines, so they hear and respond to communication from people in the controller role in a human way. When the team’s structures are designed, the ways that people communicate should be worked out so that those who are giving instructions know how to communicate to people, and so that people on the team know how to tell when instructions are being delivered.

People react when they receive instructions or assignments, of whatever kind. In a well-functioning team, the team members will act to confirm the instruction they think they are receiving, then adjust their work behavior accordingly: changing the roles they are working on, adjusting some technical work, and so on.

However, real humans don’t always neatly follow this behavior. Sometimes they misunderstand the instructions. Sometimes they ignore the instructions. Sometimes they develop resentment at the instructions. The communication between people with a controller role and the people on the team must include checks to catch misunderstandings, and continuous communication so that leaders understand how people in the team are feeling about their work.

The controller needs feedback from people on the team in order to continue to make accurate control decisions. In a team, this means that people on the team are providing information. The people have a responsibility to keep those overseeing their work informed of their progress. They are also responsible for communicating when they are dissatisfied with their work situation, or when they observe issues with the project.

Feedback. The people forming the control system have several ways to get feedback and observe the team. Some of these mechanisms can be designed into the system formally; others are informal behavior by the people with control roles.

Getting explicit feedback and reporting from team members is the first formal mechanism. As people on the team make progress on different tasks, they inform the controllers of the work completed, the problems found, and the steps yet to do. These reports can take many forms: updates to a task tracking system, regular status communications, and informal discussions. Explicit reporting has the advantage that it can occur regularly and in a form that encourages documentation of status. It has the disadvantage that it can become impersonal, and the reports can become inaccurate (especially optimistic) over time because team members want to look good to their teammates.

Someone in a controller role can complement these explicit reports with regular informal communication. Some organizations have advocated “management by walking around”, in which a manager informally talks with those they oversee, without a regular schedule. This interaction ideally happens in person, so that the manager and the team member can treat each other as people and build up social bonds. In-person communication also has the advantage of the full range of communication methods, such as body language. These informal communications have the disadvantage of not producing a documented record of what was learned, and if done clumsily they can lead to a team member feeling like they are being constantly monitored.

A well-designed team will account for problems in communication between a team member and those who oversee their work directly. A team can build in periodic “skip level” communication, where a team member can discuss their work and their state of mind with someone other than their direct managers, in order to detect and resolve problems with a manager. A well-functioning team will also provide feedback channels for team members to report larger or more systemic problems. In many industries, organizations are required to provide a way for anyone on the team to report corrupt or illegal behavior, for example.

Whatever feedback mechanisms a team uses, the channels should be designed to address bias and sampling problems. For example, if someone only reports on progress when they complete a task, there is no way to detect when they are having problems completing some task; in other words, reporting only at completion biases information toward good news. Having only one path for information to come from a team member through a manager to higher-level controllers can introduce bias as the manager digests the information they receive and passes along a summary. This is one reason to combine multiple ways to get feedback.

Working with the control system. The team’s structure has to be designed and redesigned. The structure is expected to get the system built efficiently and accurately, and to keep the team satisfied and productive throughout.

Achieving this end only happens when the team is organized deliberately. While historically the organization choices have been made initially based on experience and then by incremental changes, it is possible to do better by explicitly designing and analyzing the team’s structure as a system.

The control system approach allows one to use techniques for designing and analyzing systems that have important emergent properties. In particular, the STAMP model of accident causes [Leveson11, §4.5] and the STPA hazard analysis technique [Leveson11, Chapter 8] provide a sound basis for analyzing how the team is organized. They provide a disciplined methodology for determining what hazards the team could face, such as duplicating work, failure to communicate, disagreements between people, or errors in the work. They also provide a structure for reasoning about how these hazards can occur and how to design the control system to eliminate the underlying causes or to handle them when they occur.

As one example, the STPA hazard analysis methodology calls for identifying and addressing cases where multiple controllers can generate control actions for one controlled process. In a team, this happens when two people have some kind of controller role for one team member. These controllers can give conflicting instructions, or they can give instructions that have unexpected side effects. The hazard analysis methodology includes identifying cases when this can occur and then defining how the multiple controllers will coordinate their decisions to avoid conflicts [Leveson11, §4.5.3].
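The first step, finding where multiple controllers act on one team member, is simple bookkeeping once the control relationships are written down. The following is a minimal sketch of that one step only, not of the STPA methodology itself; the function name, the pair representation, and the people named are all invented for illustration.

```python
from collections import defaultdict


def find_shared_control(controls):
    """Return {member: [controllers]} for members who have more than one controller.

    `controls` is an iterable of (controller, member) pairs.
    """
    controllers_of = defaultdict(set)
    for controller, member in controls:
        controllers_of[member].add(controller)
    return {
        member: sorted(controllers)
        for member, controllers in controllers_of.items()
        if len(controllers) > 1
    }


# Hypothetical control relationships; the names are invented.
controls = [
    ("architect_1", "developer_a"),
    ("architect_2", "developer_a"),
    ("group_lead", "developer_b"),
]
shared = find_shared_control(controls)
print(shared)
```

Each member the function reports is a place where the team design needs to say how the listed controllers will coordinate. The code can only find the overlap; deciding how to coordinate remains a design decision for people.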

Team structure and planning. The structures and procedures that a team follows are related to but separate from how the team plans its work. The interactions within the team’s control system are continuous and immediate. They serve to maintain the social bonds that keep a team together. In building social cohesion and team culture, a team’s structure makes the team able to plan and carry out its work.

19.3.2 When to use the team model

The purpose of having a model of teams is to provide a language for describing how a real team is organized, and to provide tools for working out how a team’s organization might need to change.

There are four times to use the team model:

In ongoing team operation;
When forming a new team;
When a team’s structure needs maintenance; and
As a team outgrows its structure.

Ongoing team operation. A team’s structure and culture should determine how the team works and interacts. The purpose of treating the team as a control system is to ensure that it continues to work well, and to provide a basis for adjusting the team’s structure or its members to meet that end.

In ordinary operation, the assignment of roles and tasks has a great effect on how the team is functioning. A good assignment of roles will have at least one appropriate person covering each needed role, and will spread the workload as evenly as possible across the team. In control system terms, to whom a task or role should be assigned is the control decision. It is based on the decider’s understanding of each person’s ability, current workload, and interests. Communicating the assignments is the control action, and then team members do the work.

Once assignments have been made, those with control roles monitor progress. One person may become busier than expected, and workloads will need to be adjusted in response. Someone may have unexpected trouble doing some task, which needs to be detected so that that person can be given help.

The team’s culture and social cohesion also need to be monitored and managed. In control system terms, the controller sets the expected norms and procedures and communicates those to the team. The control actions that communicate this information take many forms: documented procedures, documents of team charters and cultural norms, and the examples set by leaders. The team, as the controlled process, will observe all this input and respond in their behavior. The controller is then responsible for watching how the team members work together, learning how they feel about each other, identifying when some people aren’t meeting the expected cultural norms or getting along, and making adjustments accordingly.

Team member evaluations are a part of many organizations’ procedures. These provide an opportunity to give people feedback on how they are doing at performing tasks and how they are fitting into the team’s culture. Having clearly-defined cultural norms and work assignments enables people to give feedback that measures a team member’s work against criteria that everyone should understand in the same way. (And if the feedback process reveals that some people do not understand the criteria in the way they were intended, then this is feedback to the team that the documentation needs to be improved.)

When people detect that the team is not operating as planned, they initiate corrective action. The kind of action depends on the kind of problem. If one person is not working as expected, the actions can be focused on that person: giving them suggestions or education, changing their work assignments, or in the worst case moving them out of the team. If the problem is between multiple people, then the next step is to determine why they are not working well together in order to address the working relationship between them. Sometimes, however, investigation will reveal that the team’s structure or culture needs to be improved. I discuss that below.

Forming a new team. A new team is an opportunity to design the team’s structure and culture. While this is often left to chance, with the first team members jumping into technical work, spending effort early in the process to plan how the team will work pays off in a project that functions well as it proceeds and starts to face organizational challenges.

The way that a team starts out affects how it continues to work years later. The habits that a team forms in its early days continue to influence how people work, and changing these habits is difficult and slow (Section 8.1.5, Principle: Team habits). This means that some effort early on will pay off for a long time.

The model in this chapter provides a way to organize the thinking about how a team should work. How should authority and work be divided among people? Should the team have a hierarchical department structure, or a matrix organization? What cultural norms are expected for people’s behavior toward each other? What procedures should the team follow to do different parts of the work?

The structure of a team is not just a theoretical construct. To work, it must fit the abilities and experience of the people involved. A structure that requires perfection will not work, because nobody can work perfectly. A structure that people can’t understand, or that is too complicated, won’t be followed, or worse, people will try to follow it but do so in some odd way.

The team’s design provides an opportunity to think about how to make the team resilient. What functions should be shared? How can people working on one part of the system help each other? How do those people who are responsible for maintaining team operation keep aware of how the team is doing? What should happen when there is a serious problem within the team?

The decisions about how the team will work should be documented for everyone to read and follow. Having these documented—and brief—helps get everyone into agreement. Going through a process of building draft documents and getting team feedback helps build consensus early on. It also makes the task of adding new people to the team easier (they can read the documents) and smoother (each new person gets the same information others do).

Documenting the rationale for why the team is designed the way it is, along with analyses of how the team’s structure will meet its objectives, provides a basis for maintaining the team’s structure as it grows or as the work changes. It also helps people understand the spirit of the structure, helping them interpret the intent behind the documented structure.

A team typically grows and goes through distinct phases where it needs different kinds of structure; I discuss this below. The initial team structure will likely be simpler than what the team will need a few months later. However, the initial design for the team’s structure should include some thinking about how the team will work as it grows. Because the social organization that is a team has inertia in its habits, starting out the team’s structure in a way that can grow into what it needs to be later will help avoid reorganizations that upset team operations and affect productivity.

Maintaining team function. Sometimes the planned team structure doesn’t work the way it was expected to.

When this happens, the response should be to work out why the structure isn’t working, and then determine how to change the structure to work better.

The tools for systems accident analysis are available to analyze why a team is not functioning. The STAMP methodology provides an organized way to determine possible reasons that a team is not functioning as desired [Leveson11, Figure 4.8]: people with the decider or exceptional role are not providing the needed instructions for some reason; those people do not have an accurate understanding of the state of the team; they are not getting accurate feedback from the team; the team is not getting instructions or acting on them as expected; or people are receiving conflicting control actions. An analysis following this kind of methodology can reveal where the underlying problems are, and in turn suggest ways to change the team structure or culture.

For example, suppose a team has people who are not following defined procedures for getting component designs reviewed and approved before moving on to implementation. An analysis might find that some people are not aware of the procedure (a problem with the control action/controlled process), suggesting that documentation and education about the procedure should be improved. On the other hand, the problem might come from one group being under pressure to deliver quickly, while those who are supposed to do reviews or give approval are not able to respond at the needed pace. This would suggest that streamlining reviews or adding reviewing resources would address the problem, or that the group needs schedule relief.

As a different example, consider a team organized into groups based on the component hierarchy. This team is having trouble with component integration: components that interact have passed design and implementation reviews, but when they are combined for verification they do not function as expected. This situation could arise from many sources. The organization into groups might be inhibiting communication between groups, leading to interfaces being designed that do not meet the needs of components being built by different groups. This might, in turn, come from flawed procedures that do not account for cross-group reviews; or it might come from group managers that don’t get along; or it might come from a problem with how interface design artifacts are managed. An issue like this could also show related problems, such as the project management not detecting the problems quickly or accurately until there is a crisis. I have found that in situations like this there are often several small changes that need to be made together to address the problem.

Team growth. A team’s organization generally starts small and informal, as a very small group starting to investigate a customer’s need or a potential system project. As the project moves forward, the team grows and its needs for structure change. The team also changes as people join and leave, and as people move from role to role.

I have found that most teams go through phases as they grow—rather than showing smooth changes over time. These changes arise from the combination of complexity growth, development of group relationships, and the growth in understanding of the work ahead.

Small groups (of just a few people) have been observed to go through a development sequence [Tuckman65][Tuckman77]. These small groups begin as the group forms, and the people work out how they should relate to each other and how to get work done. As time goes by they develop into a cohesive group that gets work done and where people trust each other. (The studies do not discuss how this process can fail, leading to a group that does not cohere or disbands.)

The interpersonal complexity of a team grows with the size of the team. The number of potential connections between team members is $O(n^{2})$ in the size of the team. In my anecdotal experience, the amount of time spent on coordinating work within the team grows in line with the number of connections. If there is no structure to the team, at some point the amount of time and effort spent on communication will exceed the amount spent doing work building the system.
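To make the growth concrete: the number of distinct pairs among n people is n(n-1)/2. The short sketch below tabulates that count for a few team sizes; the function name and the sizes chosen are only for illustration.

```python
def potential_connections(team_size: int) -> int:
    """Number of distinct pairs of people in a team of the given size."""
    if team_size < 0:
        raise ValueError("team size cannot be negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (3, 5, 10, 20, 50):
        print(f"{size:3d} people -> {potential_connections(size):5d} potential connections")
```

Growing a team from 5 to 50 people multiplies the head count by ten, but raises the number of potential connections from 10 to 1225.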

When a project starts, the nature of the system to be built is not well understood. The team has to go through a process of working out the purpose of the system, developing concepts, and eventually beginning to design. Along the way, the team gets increasing understanding of the work ahead.

In practice, the combination of these causes leads the team to change its organization over time. At the beginning, the initial exploration of what the project might be (working out a purpose and finding some initial concepts) is typically done by a small group. This small group will go through a process of learning to work together, but typically the group can self-organize and does not need hierarchy for much. As the work progresses and a few more people join the project, they will initially try to fit into the self-organized small group. These additions will alter the interpersonal relationships, but at some point the complexity of using consensus will necessitate creating some initial structure. The team will settle into this structure. But as the team continues to grow, it will initially accommodate people into the structure but eventually reach another point where more structure is needed to manage complexity.

The message is that a project should expect its team organization to change over time. Almost every project I have been part of has been resistant to addressing a need for changing team structure, and has put off dealing with it until a crisis occurs. In every case this cost the organization time and money, needlessly setting back the project. A project’s leadership should be alert to the need to periodically reorganize the team so that this can be done before it causes problems.

19.3.3 Example: conflicting instructions leading to inconsistent design

I worked on two projects that had problems building their systems because someone on the team got conflicting instructions on the objectives for some component they were supposed to be building.

In one case, a software developer was tasked with implementing a particular CPU scheduling algorithm in a real-time operating system kernel. This scheduling algorithm had been chosen in order to make certain system safety properties work, and to enable some high-level control features. The developer in question did not understand the assignment, and reached out to someone else—someone not authorized to make decisions about the CPU scheduling algorithm. The developer got advice from the other source and implemented a different scheduling algorithm. The other algorithm could not provide basic safety and control features the system needed. As this project was being executed on a cost-plus contract, the developer’s organization had to pay for someone to remove the work the developer had done and implement the correct algorithm.

In another case, one senior system architect (systems engineer) was responsible for a particular feature set of the system. The system architect was working with a pair of developers to work out a design for those features. A second senior system architect, who was not responsible for that part of the system, was having a conversation with the developers and instructed them to design the features in a particular way. This conflict in instructions to the developers led to confusion that took several days to detect and resolve.

Both these problems reflect two common team design flaws. First, both are instances of conflicting control ([Leveson11, §4.5.3]), in which a controlled process (the developer) receives conflicting control actions. Second, in both cases design authority (Section 19.2.4) had been assigned, but developers got instructions from someone else. In the first case the developer sought out advice from an inappropriate source; in the second, a senior person gave instructions outside of their authority.

The techniques for addressing a potential system hazard apply to the conflicting authority: first try to eliminate the conditions that can lead to a hazard, then make it unlikely to happen, reduce the likelihood of it causing a problem, and then try to limit the damage when it does happen.

The first line of defense is thus to organize the project so that conflicting decisions and authority do not occur, or make it unlikely. This is most easily done by having for each part of the system exactly one person authorized to make decisions, and making that information clearly available to everyone on the team. Note that this does not mean that only one person is allowed to design; rather, it means that one person has responsibility for the design. The responsible person can and should delegate the design effort as much as possible to the people actually doing the work, and the responsible person should focus on setting objectives for the design, guiding the design, and checking that the results are acceptable.

Theoretically, a team can avoid conflicting decisions or directions by having a few people operating in a way where they reach consensus before making decisions. In practice consensus algorithms work well enough for computer systems but people find it hard to work that way: communication happens informally, people are in a hurry, or someone has a good idea they get enthusiastic about and don’t wait to share it with others for agreement first.

The second line of defense is to have regular review points in the project when discrepancies can be caught.

19.4 Directory

Two of the first things people on the team need to know are their own roles and who else is on the team. Once they have that information, they can communicate with others to learn other things they need to know.

Consider the following scenarios.

Person A is working on some component. That component has an interface with another component, and so person A needs to coordinate how they implement their part of that interface with someone working on the other component.
Person B has finished a design for an update to a component. Project procedures say that they need to have the design reviewed and approved before moving on to implementing the design. Person B needs to find out who the reviewers and approver will be.
Person C discovers an ambiguity in the specification for a component, and they are concerned that this ambiguity may lead to a flaw in the designs that follow from the specification. Person C needs to find the people responsible for the specification so they can discuss the potential problem and find a resolution to the ambiguity.

For all these scenarios, the people need to determine who on the team is responsible for some part of the system beyond what they are working on themselves.

To meet this need, the project should maintain some kind of directory of people on the team. This should record:

Who each person is;
Where they are located or how they can be contacted;
The roles and authority each one fills, including what parts of the system they work on; and
How they fit into the team’s structure.

This information is generally fairly simple, but it must be kept current. If people come to believe that the directory is likely out of date they will not trust it.
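As one way to picture such a directory, the sketch below records the four kinds of information listed above and answers the question behind the three scenarios: who holds a given role for a given part of the system? Everything in it is an assumption made for illustration, including the field names, the example people and components, and the 90-day threshold for flagging entries to re-check. A shared document or spreadsheet can serve the same purpose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DirectoryEntry:
    name: str                    # who the person is
    contact: str                 # where they are or how to reach them
    group: str                   # how they fit into the team's structure
    roles: dict[str, list[str]]  # component -> roles the person holds for it
    last_verified: date          # when this entry was last confirmed current


def find_people(directory, component, role=None):
    """Names of people with any role, or one specific role, on a component."""
    matches = []
    for entry in directory:
        held = entry.roles.get(component, [])
        if held and (role is None or role in held):
            matches.append(entry.name)
    return matches


def stale_entries(directory, today, max_age_days=90):
    """Names of entries not confirmed within max_age_days (assumed threshold)."""
    return [e.name for e in directory
            if (today - e.last_verified).days > max_age_days]


# Invented example data.
directory = [
    DirectoryEntry("Person A", "a@example.org", "kernel group",
                   {"scheduler": ["developer"]}, date(2024, 1, 10)),
    DirectoryEntry("Person D", "d@example.org", "architecture group",
                   {"scheduler": ["approver", "reviewer"],
                    "network stack": ["reviewer"]}, date(2023, 6, 1)),
]

print(find_people(directory, "scheduler", role="approver"))
print(stale_entries(directory, today=date(2024, 2, 1)))
```

The `last_verified` field reflects the point about currency: an entry that nobody has confirmed recently is a candidate for review before people stop trusting the directory.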

Sidebar: Summary

Teams are made up of the people who do the project’s work.
A team has structure.
- How and when people communicate.
- How people are grouped.
- Trust and social relations.
- Role, authority, and responsibility.
- Division of labor.
Team structure can be designed and analyzed.
Team structure and system design are closely related.
A team can be treated as a control system.
- Some people responsible for guiding team behavior.
- Detect when team is having issues and work out corrections.
Team structure changes with team size.