Systematic social observation (SSO) is the direct observation of social phenomena in their natural settings. It is often a group enterprise with many researchers using a systematized protocol to gather quantified data. Its application in criminology has been sparse compared to official data or survey methods, but it offers unique measurement capacity that can prove valuable for many issues in criminology.

Below, SSO is first defined, and then the method is described along with issues of validity and reliability. Next, a review of the application of SSO to criminological questions offers some insight into the variety of settings in which the approach has been used, and finally this research paper concludes with speculation regarding the future configuration of SSO and its role in contributing to an understanding of important questions in criminology.

Definition Of Systematic Social Observation

Many individuals engage in social observation, for example, standing on a street corner and people watching or observing audience members at a public event. Viewing persons in natural social settings (or even the setting itself, e.g., assessing the quality of neighborhood housing in a location to which one wishes to move) is an act of observation. What distinguishes SSO is the systematic application of rules and protocols that structure the act of observation. Those explicit rules determine what constitutes relevant information for the observer and how that information is to be structured (coded) for subsequent analysis. Explicit rules for observation and coding make possible meaningful comparisons of events observed by multiple observers and at different times and places (Reiss 1971). One might say that SSO is structured watching with a purpose. SSO is a powerful tool for the study of human behaviors. Its power derives from the focus and structure it brings to observation of human behavior in its natural settings.


In the late 1920s and early 1930s, scholars studying early childhood social development practiced systematic social observation, using methods “… designed … to ensure consistent recordings of the same events by different observers …” (Arrington 1943, p. 83). Systematic social observation came to criminology at the hand of Albert J. Reiss, Jr., who encouraged social scientists to shed some “nonsensical” views about the limits and benefits of different forms of observing social phenomena (Reiss 1968, 1971). Reiss objected to the notion that direct observation of social phenomena in their natural setting was work for solo researchers using qualitative methods, while survey research was suitable as a group enterprise with many researchers using a systematized protocol to gather quantified data. Reiss argued that both direct social observation and survey research were in fact forms of observation that must confront the same set of challenges to producing interpretable information, that both were amenable to either solo or group practice, and that both could be used effectively for discovery or validation of propositions about social phenomena. Beyond these insights, Reiss’s important contribution to criminology in this area was the development and practice of the techniques of SSO. Reiss demonstrated how SSO could be used to answer important questions about what influences police-citizen interactions, with implications for theories about police-citizen relationships and for public policies concerning justice, race relations, and crime control.

Suitability Of SSO

What makes SSO especially valuable to researchers gathering data directly in the natural setting are precision of the observations and the independence of the observer from that being observed (Reiss 1971, p. 4). For example, some classic qualitative field research pioneered researcher access to the police occupation, but the necessarily selective samples of these solo researchers appear to have overstated the uniformity of police practice (Skogan and Frydl 2004, p. 27). SSO researchers have observed considerable variation in the way police use their authority, and some have shown the high degree of variability that may be found with the same officer over time. Precision is also accomplished through the sequencing of events and the detailing of context, matters that may not be well documented by official records or accurately recalled by participants when interviewed – for example, how police encounters with the public escalate into rebellion or violence (Sykes and Brent 1983). Sometimes SSO precision derives from the application of complex standards or expectations to the practices of persons with obligations to perform in particular ways. For example, the extent to which legal actors conform to constitutional standards or a professional standard can be assessed. Further, SSO can be used to determine the extent to which justice officials comply with the preferences of citizens they encounter or whether citizens comply with the preferences of justice officials in everyday situations (e.g., Mastrofski et al. 1996).

SSO is especially desirable when the question demands detailed knowledge of situations, conditions, or processes that are not otherwise well illuminated or where there is reason to question the validity of knowledge based on other forms of data collection. SSO may also be useful in studying people who might find it difficult to provide an objective or accurate account of what the researcher wishes to know (such as their behavior and the context of that behavior in highly emotional situations). Where there are strong temptations to omit, distort, or fabricate certain socially undesirable features, such as illegal, deviant, or otherwise embarrassing situations, SSO offers an independent account. This is, for example, a limitation of survey-based citizen self-reports of encounters with police to deal with a problem caused by the survey respondent, and especially problematic if there is systematic variation in the degree of error across important subgroups within the sample, for example, according to race.

While much of the SSO research has focused at the level of individual persons as decision-makers, the 1980s saw the beginning of studies that use an ecological unit, such as the neighborhood block face, as the unit of SSO analysis. Noting that neighborhood residents find it difficult to offer accurate descriptions of their neighborhood’s social and physical environment, Raudenbush and Sampson (1999) highlighted the value of an “ecometric” approach that uses SSO in conjunction with neighborhood survey research to more fruitfully characterize the state of neighborhood physical and social structure.

SSO may be especially well suited to situations and events where all of the relevant actors and events pertinent to the phenomenon of interest can be observed from start to finish in a limited, well-defined time period. For example, the police decision on how to deal with a traffic violator is clearly bounded in time and place. To the extent that the decision is heavily influenced by the context of the immediate situation (e.g., the offense, the evidence, the driver’s demeanor), the decision on how to treat the traffic offender can be captured by SSO. The method has been especially useful in understanding informal sanctions, which are not completely captured in official records. In general, SSO lends itself to observing phenomena that occur either with high frequency, such as drivers’ noncompliance with speed limits on public highways, or at predictable times and places, such as criminal trials, scheduled meetings between probation officers and offenders, or even field tests of prison security systems. Events that occur less frequently, such as acts of social disorder in public places, may require considerably more observation time to obtain reliable estimates (Raudenbush and Sampson 1999), or they may be so infrequent and unpredictable as to make SSO simply impractical, such as the police use of lethal force or the life course of criminality in a sample of individuals. One of the most frequent uses of SSO has been investigating how criminal justice workers operate in the context of role expectations generated by their organization or profession. Primarily focused on police, SSO research in criminology has been very concerned with how officers negotiate the tension between the formal (legal, bureaucratic, and professional) standards set for them and those that issue from the occupational culture. SSO could also be applied to role conformance in the context of informal or illegitimate organizations, such as gangs.

SSO is often used in conjunction with other forms of observation. Some studies have used SSO to measure the extent to which treatment conditions in randomized trials have been maintained (Sherman and Weisburd 1995, p. 685). SSO data have been linked to other forms of data collection on research subjects, such as census data (on neighborhoods), survey interviews of police officers, and follow-up interviews with citizens who were observed in encounters with police. And sometimes SSO is used to supply data not otherwise available, such as objective measures of the physical and social disorder in urban neighborhoods (Raudenbush and Sampson 1999).

Perhaps the most frequent reason that criminologists have turned to SSO is their dissatisfaction with the data they could obtain by other means, such as official records and survey research (Buckle and Farrington 1984, p. 63). Self-report and victim surveys have a number of biases and limitations, but since these are different from those inherent in SSO, SSO can provide an alternative perspective on specific phenomena relative to those approaches (Parks 1984).

Unit Of Analysis And Sampling

Planning the selection of what is to be observed is an essential element for SSO. Like survey interviewing, SSO requires a careful focusing of what is to be observed and makes it possible to estimate parameters and evaluate error. The first step requires establishing the unit of analysis. Given that much SSO focuses on social interactions, there are three distinct approaches (McCall 1984, pp. 268–269). One uses a time period as the unit of analysis, observing what happens within each discrete time segment, such as what behaviors police officers display in a 15-minute segment of time or the level of social disorder on a street segment during a one-hour period. Another uses a behavior or act as the unit, tracking the sequencing of different behaviors over time, such as the behavioral transactions between officers and citizens who engage each other. The third approach is to socially construct an event as a unit of observation, such as a face-to-face encounter between a police officer and citizen (Reiss 1971) or a public meeting between police and members of a neighborhood organization (Skogan 2006).

Once the unit of analysis is decided, the researcher must consider the sampling frame. The same principles of sampling apply to SSO as to any other data collection method, such as survey research. The researcher must consider where and when the units of interest may be found and determine an efficient method of capturing a representative sample. An example of a straightforward sampling strategy is an SSO study of shoplifting that randomly selected shoppers entering a store, systematically varying the location of observers among entrances to the store (Buckle and Farrington 1984).
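A sampling plan of this general kind can be sketched in a few lines of Python. The sketch is illustrative only and does not reproduce the procedure of the shoplifting study: the entrance labels, the number of sessions, and the numbers of arrivals and followed shoppers are all hypothetical assumptions.

```python
import random

def rotate_entrances(n_sessions, entrances):
    """Assign each observation session to an entrance in fixed rotation."""
    return [entrances[i % len(entrances)] for i in range(n_sessions)]

def select_shoppers(n_arrivals, n_to_follow, rng):
    """Randomly choose which arrivals (by order of entry) an observer follows."""
    return sorted(rng.sample(range(1, n_arrivals + 1), n_to_follow))

rng = random.Random(42)  # fixed seed so the plan can be reproduced
entrances = ["north", "south", "east"]  # hypothetical entrance labels

# One row per session: where the observer stands and which arrivals to follow.
plan = [
    {"session": i + 1, "entrance": e, "follow": select_shoppers(50, 5, rng)}
    for i, e in enumerate(rotate_entrances(6, entrances))
]
```

Fixing the rotation in advance keeps observer placement independent of what observers happen to see, while the random draw of arrivals removes observer discretion over whom to follow.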

SSO researchers often use more complex sampling strategies focusing on geographic space. They have sampled police beats and specific days and times within them, oversampling places and times where higher levels of police-citizen encounters were expected. Some researchers rely upon the observed subjects making their own choices as to where observers conduct their observations. This makes sense when the object of study is a specific research subject, but when the object of study is the geographic entity itself, an independent sampling plan is required. For example, a study of public order in a park required researchers to conduct hourly park patrols to observe and record activities of persons by location within the park (Knutsson 1997). Some researchers have used a smaller geographic unit than a police beat or park. Several studies use the face block to apply SSO to the measurement of disorder on public streets, defined in terms of traces of physical and social disorder (trash, graffiti, loitering, public intoxication) (Sampson and Raudenbush 1999). In one study, observers drove through street segments videotaping what was viewable from the vehicle (ibid.). Others have performed live observation at the “epicenter” of each block face (best location to observe the most activity), randomly selecting short periods of time for observation from that location and recording them on check sheets (Weisburd et al. 2006). But observers could focus on single addresses, as might be done if one were interested in observing the extent of different kinds of desired and undesired social activity at crime hot spots. Even smaller spatial units have served as the sampling frame. A study of the relationship between crowding and aggression in nightclubs selected high traffic areas within the establishment (10 m²) to observe levels of patron aggression for 30-min time periods (Macintyre and Homel 1997).
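Oversampling busy places and times means that sampled units no longer represent equal shares of the population, so estimates are usually weighted by the inverse of each unit's selection probability. The following Python sketch illustrates the idea under stated assumptions: the strata names, beat labels, stratum sizes, and sampling fractions are hypothetical and are not taken from any of the studies cited here.

```python
import random

def draw_stratified_sample(strata, fractions, rng):
    """Sample each stratum at its own rate and attach inverse-probability weights."""
    sample = []
    for name, units in strata.items():
        n = max(1, round(len(units) * fractions[name]))
        weight = len(units) / n  # each sampled unit stands in for this many units
        for unit in rng.sample(units, n):
            sample.append({"unit": unit, "stratum": name, "weight": weight})
    return sample

# Hypothetical frame: beat-shift combinations grouped by expected activity.
strata = {
    "high_activity": [("beat_A", shift) for shift in range(20)],
    "low_activity": [("beat_B", shift) for shift in range(80)],
}
# Oversample the stratum where more police-citizen encounters are expected.
fractions = {"high_activity": 0.5, "low_activity": 0.1}

sample = draw_stratified_sample(strata, fractions, random.Random(7))
```

In this sketch the weights within each stratum sum to the stratum's size, so weighted totals recover the scale of the full sampling frame even though high-activity units are overrepresented in the raw sample.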

While much of the extant SSO research must develop time- or area-based sampling frames that capture unpredictable or unscheduled events, some SSO studies have focused on scheduled events, such as the delivery of therapeutic community programs in correctional institutions or the previously mentioned police-community neighborhood meetings. Sampling of regularly scheduled events is common in research on educational practices and physician behavior, a practice easily replicated for certain aspects of the legal process of interest to criminologists.

Sometimes the practicalities of conducting successful field observation make the research vulnerable to sample biases. In cases where consent of those to be observed must be secured, a clear bias is introduced when those who refuse to be observed differ in their behaviors from those who are willing to be observed. The practical requirements of observation can introduce bias as well. For example, the observation of disorder on Chicago block faces required light that was sufficient for observation only between 7 am and 7 pm (Sampson and Raudenbush 1999), meaning that researchers were unable to measure many forms of disorder that mostly occur in the darkness. It would also be challenging to observe many aspects of law enforcement inquiry and exchanges in the investigative and prosecutorial processes, because much of the effort is not limited to face-to-face encounters, but rather occurs through telephone and computer, modes of communication that may necessitate very different sampling frames and observational methods. Particularly challenging are studies that require a sampling of cases rather than individual decision-makers, inasmuch as it is difficult to track and observe the behavior of many different persons who may be involved in making decisions about a case.


Principles that apply to other forms of research also apply to the creation of instruments for structuring and recording SSO. Sometimes the instrument takes the form of a tally sheet or log for recording the frequency at which phenomena were observed, such as counting disorderly elements at block faces (Sampson and Raudenbush 1999) – or the timing and duration of events, such as police presence in a hot spot (Sherman and Weisburd 1995). Often the instrument takes the form of a questionnaire that is directed to the observer. A study of police use of force, for example, might ask observers to code a series of close-ended questions about the citizens involved in an incident (their personal characteristics, their appearance, their behavior), the behavior of police (how much and types of force used), and other features of the situation (location of the event, its visibility, the presence of bystanders).
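An observer-directed questionnaire with close-ended items can be represented as a structured record whose fields accept only predefined codes. The Python sketch below is a hypothetical illustration rather than any published instrument: the field names, the force categories, and the location codes are assumptions chosen to mirror the kinds of items described above.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical closed-ended response set; a real protocol would define its own.
LOCATION_TYPES = {"street", "residence", "business", "other"}

class ForceLevel(Enum):
    """Illustrative ordinal codes for the highest level of force observed."""
    NONE = 0
    VERBAL_COMMAND = 1
    PHYSICAL_RESTRAINT = 2
    IMPACT_OR_WEAPON = 3

@dataclass(frozen=True)
class EncounterRecord:
    """One coded police-citizen encounter, as filled in by the observer."""
    encounter_id: str
    citizen_resisted: bool
    highest_force: ForceLevel
    location_type: str
    bystanders_present: int

    def __post_init__(self):
        # Reject anything outside the instrument's permitted response codes.
        if not isinstance(self.highest_force, ForceLevel):
            raise ValueError("highest_force must be a ForceLevel code")
        if self.location_type not in LOCATION_TYPES:
            raise ValueError(f"unknown location_type: {self.location_type!r}")
        if self.bystanders_present < 0:
            raise ValueError("bystanders_present cannot be negative")

record = EncounterRecord("E-001", True, ForceLevel.PHYSICAL_RESTRAINT, "street", 2)
```

Validating codes at the moment of entry is one way to catch recording errors before they reach analysis, although it cannot detect a valid code that was applied to the wrong behavior.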

SSO instruments have the desired effect of focusing observers’ attention on items selected for observation. Field researchers have demonstrated a substantial capacity to recall the relevant features of long sequences of these events, given the repetitive use of the protocols. Nonetheless, greater complexity in the coding system heightens the risk of error. The accuracy of such recall is undoubtedly variable, but researchers have not assessed most of the correlates of recall accuracy (e.g., observer characteristics, instrument characteristics, and the observational setting).

Recording Observations And Use Of Technology

Two issues arise in recording of phenomena observed through SSO: (a) whether it is contemporaneous with the observation or later and (b) whether technological recording devices are employed. Resolving these issues requires choosing the highest priority and what must suffer as a consequence. The more contemporaneous the recording of an observation, the less the vulnerability to recall error and various forms of bias, but in many cases, the act of recording may increase the reactivity of the observed parties to the process of being observed, as, for example, when observers posing as shoppers follow actual shoppers to observe whether they are shoplifting (Buckle and Farrington 1984, 1994).

Employing technological aids is usually intended to increase the accuracy or detail of observation from that which would be otherwise available. Handheld electronic recording devices have been used in observing police-public interactions and in observing the social and physical environment of neighborhoods. Audiotaping of calls for service to police telephone operators has been used to gather data on police workload. Videotaping neighborhood block faces from a slow-moving motor vehicle has been used to observe neighborhood disorder. Use of handheld personal digital devices allows contemporaneous observation and recording of brief, frequent events and is most practical when the number of aspects to be observed per event is small, which minimizes the interference of recording events occurring in close succession with observing them (McCall 1984, p. 272).

The major advantage of initially recording events electronically and then encoding those records later for analysis is not only the elimination of recall problems but also that more detailed and accurate observations may be made, and the testing of interobserver reliability is facilitated. Further, field researchers may in some instances feel safer when they make the initial recording of their observations from the security of a moving vehicle or when events are recorded with a remote and unobtrusive device (time-lapse photography of a street corner). Nonetheless, there are a number of drawbacks that have concerned other researchers. Video recordings cannot exactly replicate what all of the available senses would communicate to an observer who was there “in the moment.” The use of drive-by photography can be expensive; it raises ethical concerns because denizens of the neighborhood may find it intrusive and perhaps anxiety producing; it may raise legal and human subject protection issues if alleged criminal acts are recorded; and it may cause significant reactivity among those being observed. In other instances, however, the pervasiveness of already-present surveillance technology that records observations in readily shared (digital) formats (closed circuit television in public and mass-private settings or in-car police video cameras to record traffic stops) may afford researchers a relatively unobtrusive source of data that does not encounter these problems. Dabney, Hollinger, and Dugan (2004), for example, used augmented video surveillance to study shoplifters. The researchers followed a sample of drugstore customers at a single location equipped with high-resolution video surveillance cameras. They were able to determine which customers engaged in shoplifting and coded data on customers’ personal characteristics, as well as behavior.
Increasingly, much technology-based surveillance derives its unobtrusiveness, not from its being unknown to subjects, but from its being taken for granted (Shrum et al. 2005, p. 11). And with the advent of nonlinear editing packages for digital video, the data itself (traditionally analyzed in quantitative or text format) can be readily manipulated and analyzed as images (Shrum et al. 2005, p. 5).

Error, Reliability, And Validity

SSO data are subject to the same range of threats that befall other methods. Error can be introduced by the observer, and issues of reliability and validity of the method must be addressed. Observers can introduce error intentionally (cheating) or unintentionally (bias or reactivity). Cheating is rarely reported in SSO, although its frequency is unknown. It seems likely that most instances of SSO cheating go undetected. Shirking, a more subtle form of cheating, may occur if observers attempt to reduce their workload by failing to record events that would require writing extensive narratives and structured coding. There has been no direct systematic assessment of the extent and impact of this form of shirking in SSO, but one researcher examined the effects of time on the job (a proxy for burnout) on researcher productivity and found that productivity was not significantly related to time on the job (Spano 2005, pp. 606–608).

Sources of unintended biases in SSO are the mindset and prejudices that observers bring to the field or develop on the job. These may affect what they observe and how they interpret it. The research exploring these issues for SSO does not offer clear and consistent findings. Reiss (1968, 1971b, pp. 17–18) found that an observer’s professional background (law student, sociology student, or police officer) did have consequences for some types of information, but not others. A later study attempted to determine whether a statistical relationship between observed police orientation to community policing and officer success in securing citizen compliance could be attributed to observers’ own views on community policing (Mastrofski et al. 1996, p. 295). A clear association was not found between the observers’ attitudes and the effects that their analysis produced.

Some types of observation judgment are undoubtedly more vulnerable to personal bias than others. Some research, for example, required field observers to judge whether police officers applied excessive force against citizens. But a more effective strategy may be to bifurcate the process into (a) recording narrative accounts of what happened (without asking the field observer to make a judgment about excessive force) and (b) having separate, specially trained experts review these accounts and make an independent judgment about whether they constitute a violation of some standard, legal or otherwise.

No responsible researcher can casually dismiss the risks of reactivity in SSO, but a number suggest that it is dependent on the context of the observational setting and the nature of the relationship between observer and observed. In SSO of police it has been argued that reactivity to the observer can be reduced by the observer downplaying any evaluative role and emphasizing one’s naivety as a “learner” (Mastrofski et al. 1998; Reiss 1968). Yet even this approach may generate more “teaching” or show-off activity in police subjects. Observed officers who engage in such teaching have been reported making contact with citizens to illustrate elements of police work to the observer (Mastrofski and Parks 1990, p. 487). And some types of observers (e.g., females in the presence of male police officers) may produce more of this effect than others (Spano 2007, p. 461).

One of the distinct advantages of SSO over solo field research is that it facilitates testing and improvement of the reliability of observations. Early on, much attention was given to the use of multiple observers and estimating their inter-rater reliability. Where many researchers can independently observe the same phenomenon by having multiple observers on scene or by using video recordings, the testing of inter-rater reliability is accomplished by measuring the extent of agreement among the pool of observers for the same set of events. Sometimes disparate independent observations of the same event are resolved by a process of discussion and negotiation. Where multiple independent observations of the same event are not possible, and that is often the case in situations where having more than one observer would be too disruptive, observers might be tested by using their detailed narrative descriptions to determine (a) if they are properly classifying phenomena according to the protocol and (b) the extent of agreement among persons who use those narratives to make classifications. For example, this has been done for characterizing a wide range of police and citizen behaviors in predicting citizen compliance with police requests (McCluskey 2003, pp. 60–74).
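Agreement between two observers who coded the same events is commonly summarized by simple percent agreement and by Cohen's kappa, which discounts the agreement expected by chance. The Python sketch below computes both for two coders; the category labels and the ten coded events are invented for illustration, and the studies cited above may have used other statistics.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of events to which both observers assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of events")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two observers (Cohen's kappa)."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Expected agreement if each observer coded independently at their own base rates.
    categories = set(codes_a) | set(codes_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when no disagreement is possible
    return (p_observed - p_expected) / (1.0 - p_expected)

# Invented codes for ten encounters viewed by two observers.
observer_a = ["comply", "comply", "resist", "comply", "resist",
              "comply", "resist", "comply", "comply", "resist"]
observer_b = ["comply", "comply", "resist", "comply", "comply",
              "comply", "resist", "resist", "comply", "resist"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
```

Kappa falls below raw agreement whenever some agreement would be expected by chance alone, which is why reporting percent agreement by itself can overstate reliability for items with a dominant category.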

Recently, SSO researchers have broadened their reliability concerns to incorporate measurement accuracy and stability. Raudenbush and Sampson (1999) apply psychometrics to the development of “ecometrics” to better understand the error properties of SSO data gathered from observing physical and social disorder in urban neighborhoods. They adapt three psychometric analytic strategies (item response modeling, generalizability theory, and factor analysis) to illuminate the error structure of their observational data and to make judgments about the best ways to limit different sources of error in future observational studies. For example, they find that physical disorder can be more reliably measured at lower levels of aggregation than social disorder, due to the much lower frequency of the latter in their observations (Raudenbush and Sampson 1999, p. 30).

While textbooks often note that field studies are less vulnerable to validity problems than surveys because the method places the observer “there” while events are unfolding, they are of course only valid insofar as they produce data that measure what the researcher intends for them to measure. This suggests that issues of validity are related to the protocols, rules, and frameworks that guide the observation. For example, the validity of observations is weakened when observers make inferences about motives or psychological conditions, which cannot be directly observed.

SSO Contributions To Criminology And Future Opportunities

SSO data have made major contributions in two areas of criminological research: the behavior of rank-and-file police officers and the measurement of disorder in urban neighborhoods. SSO has dominated the empirical research on the discretionary choices of police officers: making stops and arrests, issuing citations, using force, assisting citizens, and displaying procedural justice (Skogan and Frydl 2004, ch. 4). One of SSO’s special contributions has been the scope of explanatory elements made available for the researchers’ models. These include many details of not only the officer’s behavior but also the context in which it occurs (nature of the participants, their behavior, the location, and the neighborhood). SSO has also been instrumental in detailing the nature of the process of police-citizen interaction, illuminating the interactive quality of temporally ordered micro-transactions or stages that may occur in even a relatively short police-citizen face-to-face encounter (Sykes and Brent 1983). And SSO has also allowed researchers to observe elements of organizational control and community influence on the work of police officers. For example, researchers can learn more about the influence of police supervisors on subordinates’ practices (Engel 2000) and the dynamics of police-community interaction and their consequences when police and neighborhood residents deal with each other at community problem-solving meetings (Skogan 2006).

A second area where SSO research has concentrated is the examination of neighborhood physical and social disorder. It has been used to test the impact of police interventions in hot spots, showing that police interventions in these “micro-places” not only reduce crime and disorder, they also diffuse those benefits to nearby areas (Sherman and Weisburd 1995; Weisburd et al. 2006). The largest project in this area has focused on Chicago neighborhoods and has produced a number of insights relevant to the testing and development of theories of the role of neighborhood disorder in causing crime in urban neighborhoods (Raudenbush and Sampson 1999). Using SSO-based measures of “objective” disorder described earlier in this research paper, researchers have examined the sources and consequences of public disorder. The research has demonstrated the importance of “collective efficacy” in predicting lower crime rates and observed disorder, controlling for structural characteristics of the neighborhood (Sampson and Raudenbush 1999). Collective efficacy also predicted lower levels of crime, controlling for observed disorder and the reciprocal effects of violence. The researchers found that the relationship between public disorder and crime is spurious, with the exception of robbery, which is contrary to the expectations of the well-known “broken windows” theory of neighborhood decline.

In general, SSO has afforded precision that has in many cases shown the phenomena of interest to be more complex than other forms of data collection had indicated. For example, SSO researchers have found rich variation among police officers in their patterns of discretionary choice and even noted the instability of those patterns for individual officers over time. And the independence of SSO observers from the phenomenon of interest has provided a means to understand the contributing factors to the social construction of phenomena, such as the contributions of a neighborhood’s racial profile in assessing its level of disorder (Sampson and Raudenbush 2004).

There are many opportunities to expand the use of SSO to increase knowledge and understanding. Largely untapped is the observation of crime and disorder, especially at the micro-level, where observers have the opportunity to make detailed observations of offenders in the act. SSO studies of shoplifting and aggressive or disorderly behavior in bars and clubs show that this is feasible where observers can easily blend into the environment. Where that is not possible, access to unobtrusive surveillance technologies appears to offer opportunities for detailed observation that reduce reactivity concerns. It is highly likely that criminologists will take advantage of the ubiquity of electronic surveillance to capture events that would otherwise be costly to observe. For example, the growing sophistication of surveillance and identification technology may make it possible to use facial identification software to gather data for a network analysis of persons who frequent hot spots. This includes not only the growing use of video recording devices by government and private sector organizations but the now ready availability of miniaturized recording devices to the general public (through cell phone recording devices).

In searching for efficient ways to use SSO, criminologists will likely capitalize on the growing body of evidence about the predictability of crime and disorder occurring in small geographic spaces. Because much “street” crime is so highly concentrated in a relatively small portion of addresses or face blocks, the location of observers or observational devices can very efficiently generate lots of information on what occurs, especially in public areas. In addition, given heightened levels of obtrusive surveillance in public places, SSO should prove an excellent way to understand how security and surveillance operate, why certain methods are effective, and the collateral impacts of various methods of monitoring and control designed to increase public safety.

Another venue for SSO to be used fruitfully is in experimental studies. SSO can be used to measure key aspects of the process that presumably operate to link treatments to outcomes. For example, if the physical redesign of bars and serving practices of bartenders are intended to reduce violence in those establishments, do patrons in fact alter their patterns of behavior in the ways that are expected to produce less violence (Graham et al. 2004)?


Four decades ago Albert Reiss showed criminologists the utility of systematic social observation, but it remains a method used infrequently. This is due in no small part to two things. First, criminologists are rarely exposed to training and opportunities to do SSO during their course of study. Second, those who know a little about it may often expect that it requires more time and resources than they have available. This may indeed be the case, but many projects could be undertaken on a smaller scale with a narrower scope of questions than the better-known, large SSO projects, especially if technological advances are leveraged for both recording and coding purposes. Some researchers may decline to use SSO because of reactivity concerns, but the available evidence suggests that these problems are often manageable and may be no more severe in any event than found with other data gathering methods.

Increased use of SSO will undoubtedly attract and stimulate greater scrutiny of its limitations, as well as its advantages. The error properties of most SSO data sets have been underexplored, and more attention here is needed. Expanding the use of SSO and more comprehensively assessing its strengths and limits could be fruitfully combined into a more comprehensive assessment of other methods of gathering data on crime and justice phenomena.

SSO deserves the consideration of researchers because of its many advantages. It offers enhanced prospects of validity for the study of crime and justice phenomena, and it increases confidence in reliability because of the researcher’s direct access to the phenomenon of interest and greater control and transparency of data encoding. It affords greater precision in capturing details of the phenomenon and its context, such as the sequencing of what happens before, during, and after those events. In many cases it may be the least problematic method for acquiring information. Criminology, which has strong roots in the traditions and methodologies of sociological research, remains heavily reliant on the use of sample surveys and official records (McCall 1984, p. 277; Reiss 1971). But as the field matures and diversifies intellectually, more of its researchers may with justification be inclined to make systematic social observation the method of first, not last, resort.


