Observations on film art : Society for Cognitive Studies of the Moving Image


Cognitive scientists 1, screenplay gurus 0

Kristin here:

The annual meeting of the Society for Cognitive Studies of the Moving Image is currently taking place at Elte University in Budapest. It runs from June 8 to 11. It must have started off very well. Barbara Flueckiger posted this comment on the opening keynote address on Facebook: “Just attended an absolutely fascinating and inspiring talk by Talma Hendler, a professor of psychiatry and neuroscience: ‘Brain Shows Where the Drama Is. A Call for an Empirical Neurocinematic Agenda.’”

Last week David posted a new web essay on “common-sense film theory” as a way of participating from afar in the work going on in Budapest. Now it’s my turn to post a SCSMI-related item.

The mirage of the second-act desert

Twelve years ago I published Storytelling in the New Hollywood (Harvard University Press). It included a claim that most classical Hollywood feature films have four acts, lasting roughly 25-30 minutes each, adding up to a film of from 100 minutes to two hours. This claim ran counter to the old notion, popularized by Syd (Screenplay) Field, that Hollywood movies (and, according to Field, all feature narrative films) have three acts of 30, 60, and 30 minutes respectively.

My book analyses ten films in detail, dividing them into their four parts: Setup, Complicating Action, Development, and Climax (usually including an epilogue). I also have an appendix listing ninety additional films that I timed by large-scale part, ten for each decade from the 1910s to the 1990s. My conclusion was that most of these films stuck to the four-part structure, with each part lasting within five minutes either way of a half-hour. Some of the exceptions were actual three-act films (e.g., Adam’s Rib) and others were longer films. Amadeus, one of my ten main examples, is 160 minutes long and has five large-scale parts. (“Large-scale part” was the rather cumbersome term I employed in the book, trying to avoid the misleading word “act.” I have to admit, though, that there’s little chance of people discontinuing “act” and switching to “large-scale part.”)

I also defined the “turning point” that Field says divides acts more concretely than he does, specifying that it usually results from moments when the main characters formulate or modify the goals that drive the narrative forward. (I expand upon the relationship of goal shifts and turning points in a previous entry.)

Unlike Field and other screenwriting “gurus,” I allow for exceptions to my four-part schema. The Pink Panther doesn’t have a turning point until over 52 minutes in; the lengthy early part of the film consists of a bedroom-farce situation of male sexual frustration that makes absolutely no progress toward the putative subject of the film, the theft of a valuable jewel. As far as I can tell, It’s a Mad, Mad, Mad, Mad World has no turning points at all. I remember thinking it was quite funny when I saw it as a child.

Some screenwriters have read the book, and some who teach screenwriting courses assign it as a textbook. At least it’s still in print, which is something to be grateful for at my age.

Since the book was published, it certainly hasn’t knocked out the conventional view that films have three acts. Not infrequently I read a review in Variety or another specialist journal that refers to the last quarter of the film as the “third act.” The idea that films fall into chunks lasting 30 minutes, 60 minutes, and 30 minutes, no matter how counterintuitive or inaccurate, is too entrenched to be dislodged, apparently.

Still, I have the consolation that I’m apparently right. Other people have tried out my claim and found it to work. Creative Screenwriting gave a very kind review to my book (in its March/April 2002 edition, not online). D. K. Holm, the reviewer, was inspired to test out my system on the next three films he saw in theaters. He declared that it worked as predicted for The Shipping News, In the Bedroom, and even Ali, a five-act film. (My claim is that longer films don’t stretch their four acts. They add more, roughly 30 minutes long, for every half hour beyond the standard two-hour feature. Roughly half an hour apparently seems to create a pleasing balance as far as Hollywood practitioners are concerned—whether they’re aware of it or not.)
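For readers who like to see a rule of thumb spelled out, the arithmetic in that parenthesis can be sketched in a few lines of Python. This is only an illustration of the heuristic as stated here, not anything taken from the book: the function names, the constants, and the round-half-up choice are all assumptions, and real films are messier (witness the exceptions mentioned earlier).

```python
# A hypothetical sketch of the "add a part per extra half hour" heuristic.
# Names and rounding behaviour are assumptions made for illustration only.

STANDARD_PARTS = 4       # Setup, Complicating Action, Development, Climax
STANDARD_MAX_MIN = 120   # the "standard two-hour feature"
PART_LEN_MIN = 30        # "roughly 30 minutes" per additional part


def estimate_parts(running_time_min):
    """Guess the number of large-scale parts from running time in minutes."""
    extra_min = max(0, running_time_min - STANDARD_MAX_MIN)
    # Round to the nearest whole part, with halves rounding up.
    extra_parts = int(extra_min / PART_LEN_MIN + 0.5)
    return STANDARD_PARTS + extra_parts


def average_part_length(running_time_min):
    """Average length in minutes of each estimated part."""
    return running_time_min / estimate_parts(running_time_min)


if __name__ == "__main__":
    for label, minutes in [("110-minute feature", 110), ("Amadeus", 160)]:
        print(label, estimate_parts(minutes), average_part_length(minutes))
```

On this sketch a 110-minute film gets four parts of 27.5 minutes each, and the 160-minute Amadeus gets five parts averaging 32 minutes, which squares with the figures given above.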

Real scientists at work

Now James E. Cutting, whom we enjoyed meeting and talking with at last year’s SCSMI conference (David devoted part of his report on the conference to James’s work), has collaborated with two of his graduate students, Kaitlin L. Brunick and Jordan E. DeLong, on a quantitative study related to Storytelling. They set out to test whether the four large-scale parts of a classical film are reflected on the stylistic level. Specifically, do shot lengths and the use of non-cut shot transitions (fades, dissolves, and the like) vary in a patterned way across or among the parts?

It’s exciting to see cognitive scientists study claims that I’ve made on the basis of close film analysis. David has had his claims about staging and attention in There Will Be Blood confirmed by Tim Smith’s eye-tracking research on the same sequence. Now the results of Cutting, Brunick, and DeLong’s work relating to the four-part structure have been published in Projections (Vol. 5, issue 1, Summer 2011), subscriptions of which are included in SCSMI membership. Luckily I didn’t know this article was on the way, or I might have been in some suspense to find out whether my model worked. (This and other articles on film are also available online, linked from this page that lists the work on film published by James and his colleagues.)

As the authors’ starting point, they accepted my four-part structure. Or as they put it:

We find Thompson’s argument persuasive and her data remarkably clean. Nonetheless, the division into acts is built on a detailed analysis of the narrative, and we work from the physical properties of film. Many observers might agree on act divisions, but these divisions would not necessarily be reflected in any physical measure of a film’s shots and transitions. Thus, without prejudice as to what we might find, we sought data in shot lengths and in shot transitions that might corroborate Thompson’s analysis. (p. 4)

The team used 150 films they had previously studied in their work on editing patterns and their possible correlations to human attention. Of these, some were eliminated because they were too long, so that the statistical work could concentrate on films with four parts. I cannot claim to be able to follow all the statistical manipulations the data from these films were put through—though I can tell from the information in the footnotes that the degree of probability for some of their results indicated an amazingly small chance of error.

The figure at the top is a “scatter plot” of shot lengths for 143 Hollywood films from the period 1935 to 2005. These figures have been statistically adjusted in ways that allow the films to be compared despite their differing lengths. (I have no idea how this sort of thing is done. You’ll have to see the explanation in the article.)

One striking revelation of this chart is that the lengthiest shots in the films occur at the beginning, end, and at the one-quarter, two-quarter, and three-quarter points. Moreover, the “scallop” pattern of rising and falling average shot lengths shows a “tendency toward intensification of films near the middles of acts, where shot lengths become slightly shorter.” The ASLs were found to be as much as 1.1 seconds shorter than in the early and later portions of the acts.
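To give a rough feel for how films of different lengths might be put on a common footing, here is a toy Python sketch. I stress that this is not the procedure used in the article, which I have not tried to reproduce: the function name, the choice of bins, and the made-up shot lists are all assumptions for illustration. The idea is simply to express each shot's position as a fraction of the film's running time, then average shot lengths within equal slices of that normalized timeline.

```python
# A hypothetical sketch: mean shot length per slice of a normalized timeline.
# Not the article's method; data and names are invented for illustration.


def binned_mean_shot_lengths(shot_lengths, n_bins=4):
    """Return the mean shot length (seconds) in each of n_bins equal slices.

    Each shot is assigned to the slice containing its midpoint, measured as
    a fraction of total running time. Empty slices are reported as None.
    """
    total = sum(shot_lengths)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    elapsed = 0.0
    for length in shot_lengths:
        midpoint = (elapsed + length / 2) / total      # fraction in (0, 1)
        index = min(int(midpoint * n_bins), n_bins - 1)
        sums[index] += length
        counts[index] += 1
        elapsed += length
    return [s / c if c else None for s, c in zip(sums, counts)]


# Two invented "films": the second is twice as long but has the same shape.
short_film = [8, 8, 2, 2, 2, 2, 8, 8]
long_film = [16, 16, 4, 4, 4, 4, 16, 16]

if __name__ == "__main__":
    print(binned_mean_shot_lengths(short_film))
    print(binned_mean_shot_lengths(long_film))
```

Both invented films show longer shots in the outer slices and shorter ones in the middle, even though one runs twice as long as the other. A real analysis would of course need far more shots, finer slices, and proper statistics.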

Luckily I was right about the four parts. But beyond that the authors came up with results supporting my claims about the functions of the plot’s third part, the Development. Here’s how.

As part of the study, the authors tackled the question of where non-cut shot changes fall in relation to the quarters of films. Non-cut changes are fades, dissolves, and the like. It turns out, not surprisingly, that quite a few of them come early on, since the exposition may move among times and places setting up the basic premises of the narrative. But the rest tend to come in the third quarter, the Development. That makes sense to me. By my definition, the Development is essentially the stretch of action where most of the major premises, goals, and obstacles have been established. Before the climax can begin, the protagonist usually struggles against obstacles, typically provided by the antagonist. Often relatively little happens in the Development to forward the action, at least in comparison with the other three parts. At the Development’s end, some vital last premise is introduced that allows the action to move into the climax portion, where the plot moves forward relatively quickly and no more major premises are introduced.

The Development might contain extra fades and dissolves for two reasons. First, a surprising number of films have a montage sequence shortly after the middle turning point. The one depicting Michael Dorsey’s rise to fame as Dorothy Michaels in Tootsie is one such. Second, since the Development often is the section where time passes until the point where the climax can begin, one would expect fades or dissolves to cover temporal gaps. I’d have to go back and watch a bunch of films to confirm those hunches, but they seem logical.

My main purpose in writing Storytelling in the New Hollywood was to show that the principles of classical narrative construction formulated in the 1910s and used throughout the studio era are still operative today. Despite film reviewers’ complaints that action and special effects have come to substitute for story appeal in modern mainstream films, the kinds of films they find so shapeless do usually stick to the same four-part structure, contain goals and conflict and the rest of the principles that have been in effect for decades. Cutting, Brunick, and DeLong’s work bolsters this claim about classical narrative’s stability over time. They find that “there are no obvious differences in these patterns for films of different eras.” (p. 8.)

This issue of Projections has other articles applying a cognitive-science approach, and with luck future issues may contain some of the papers we have been unable to attend at this year’s conference.

James, Kaitlin, and Jordan have set up an online overview of their research in progress, “Shot Structure and Visual Activity: The Evolution of Hollywood Film.” The graphics and type are quite small, but right-click on the page and click “Marquee Zoom,” which brings up the little magnifying glass that allows you to enlarge it multiple times.

For online applications of the idea of four-part structure, see David’s essay on Mission: Impossible III and his blog entry on Source Code.

Jordan DeLong, Kaitlin Brunick, and James Cutting at the annual convention of SCSMI, Roanoke 2010.

Thursday | June 9, 2011 | Film comments, Film scholarship, Film technique: Editing, Film theory: Cognitivism, Hollywood: Artistic traditions, Narrative strategies, Readers' Favorite Entries, Society for Cognitive Studies of the Moving Image | No Comments »

Madison calling Budapest: Can you read me?

Can you look at this picture without smiling? I think it’s hard, for reasons that relate to one thread of the essay here. Thanks to Levi Buchhuber, age eight months, and Jim Cortada, grandpa.

DB here:

Next week several dozen bright, energetic researchers will be crowding into conference rooms at Elte University in Budapest for the annual meeting of the Society for Cognitive Studies of the Moving Image. Full details on the event are here. As usual, I expect that a hell of a time will be had by all.

I’ve discussed the purposes and projects of the members of this dynamic bunch on earlier occasions. If you want a rundown, I’d suggest reading the items in chronological order:

*I sketch out the SCSMI project in this entry. There are more ruminations in two run-ups to the 2008 Madison SCSMI get-together (here and here), and one a year later summing up that event.

*For a report on the wonderful 2009 Copenhagen convention, go here.

*I try to sum up the wide-ranging 2010 Roanoke powwow here, while a recent blog, “Molly Wanted More,” can be considered an echo of that event.

*For an utterly fun introduction to some of the research on display at SCSMI, head to one of our most popular entries, the guest blog by Tim Smith called “Watching you watch THERE WILL BE BLOOD.”

This year’s paper line-up is especially enticing, and the prospect of seeing so many old friends is even more thrilling. Alas, for reasons beyond our control, Kristin and I aren’t able to attend. As the date draws near, my need to stay home saddens me more than I had expected it would. I must content myself with directing you, with all the fervor I can muster, to the event. Many of our members have told me that their first visit was life-changing, providing them a whole new social network that would encourage their research. Moreover, I notice that every year several participants tell me that they think this one was the best session yet. We’re just getting better, and we’re not going away! I should also alert you to the likelihood that many of the papers will be published in the SCSMI-affiliated journal Projections.

I thought, though, that I might participate a little at long range. So I’ve posted a web essay that sets out, in less technical terms, what my proposed paper for the convention would have tried to say. The essay, “Common Sense + Film Theory = Common-Sense Film Theory?,” reflects an effort to rethink ideas about filmic comprehension that I set out in Narration in the Fiction Film in 1985. This book was one of the first efforts to explore how findings in cognitive science might help us make progress in understanding cinematic storytelling.

I’d stand by much of what I argued there, but in the light of further thinking and later research (much of it conducted by SCSMI members), I wanted to float some ideas that recast and correct my arguments in the book. (Yes, I hope I’ve learned something in twenty-five years.) Some of my more recent notions are available in Poetics of Cinema and under the Film theory: Cognitivism category on this blogsite, but the conference provided a good occasion to submit to the sort of friendly but pointed critique at which my SCSMI colleagues excel.

Of course, give me another twenty-five years and I’ll probably find fault with what I say now. Others won’t need so much time.

Start with this question, which I think is one of the most fascinating we can ask:

What enables us to understand films?

Continue reading here.

To my SCSMI cohort: I wish you a superb gathering. See you next year, at Sarah Lawrence!

Joe Anderson, co-founder of the Society for Cognitive Studies of the Moving Image, opens the 2010 session at Roanoke.

Tuesday | May 31, 2011 | Film comments, Film theory: Cognitivism, Narrative strategies, Society for Cognitive Studies of the Moving Image | No Comments »

Molly wanted more

The Crime of M. Lange.

DB here:

I was watching Snow White and the Seven Dwarfs some years ago with a friend’s three-year-old daughter. Molly hadn’t seen the movie before, and she watched it in a fascinated silence. At the end, Snow White and the prince leave the dwarfs and ride off into the distance.

At this point Molly cried, “More!”

This surprised me. How could she know, on her first pass, that the story was ending?

In and out

It has no name that I’m aware of, but it’s one of the most common conventions of movie storytelling. At the end of the film, the story world closes itself off from us. The characters turn away, perhaps walking into the distance. We may get a distant long shot of the scene. In some cases the camera accentuates this withdrawal by craning or tracking away from the action.

At the end of The Wild One, the biker hero smiles at the woman he’s met and goes out into the street. After exchanging a glance with the ineffectual sheriff, he swings his motorcycle around, his back to us. Cut to an extreme long shot as he rides off.

Once you notice this sort of ending, you’re likely to think about beginnings. Sure enough, we find some symmetry. A movie often visually brings us into the story world. Most common is an inward progression, moving from a large view to the central space of action. As far back as 1919, Griffith started Broken Blossoms with an overall view of the harbor and an explanatory title.

Then we have a gradual entry into the Chinese neighborhood, moving steadily from long shots to closer views. The young ladies we encounter aren’t major characters in the plot, but we’re still slowly drawn into the story world.

At the film’s end, Griffith’s cutting will manage a parallel withdrawal, from the lovers to the temple and back to the waterfront view.

In fact, Snow White starts with a comparable, if somewhat smoother shift inward, from the exterior of the Queen’s castle then, via camera movement and dissolves, to the Queen at her mirror.

This last shot reminds us that alternatively, a film can start with the characters coming forward, as if to meet us. This is the way The Wild One begins.

Any sort of combination is possible. The Silence of the Lambs starts with Clarice Starling climbing up a hill toward us and pausing long enough for us to register her as a protagonist.

She then turns and runs off into the forest. If this were an ending, we might see her go off further and further into the distance. But this is an opening, so we follow her with a tracking shot forward, letting her lead us into the story action. Soon we’re back to the frontality and intimacy of an opening passage, like the shot of the Queen.

At the end, Jonathan Demme gives us a pair of “farewell” shots. The first occurs when Clarice, who’s been talking to Dr. Lecter on the phone, hears the line go dead. We pull away from her.

The second farewell shot takes place on Lecter’s end, on a Caribbean island. It shows him rising to follow Chilton and walking away from us, to be swallowed up into the crowd.

The Silence of the Lambs takes leave of its protagonists in two alternative ways: If the camera doesn’t move away from our characters, it seems, then the characters move away from the camera. They may even seal themselves off, as when a flight attendant pinches shut the curtain at the end of The Hunt for Red October.

So when we speak of films’ “openings” or “closings,” it seems that we are often talking about a world that initially invites us in but will finally expel us, however slowly and politely.

See it here!

Researchers play Jeopardy! Their answers always take the form of a question. Most film critics work in the declarative mode (“The performances are gripping”), but ideally the academic film researcher works in the interrogative mode. What is this pattern of beginnings and endings doing in movies? How does it work? How did it become common? And how do we learn it?

Take the first question, about the purposes and functions of the device. At one level its neat symmetry mimics the sort of frame that we find in other sorts of narrative, particularly oral storytelling. Fictional stories need to be set off from the surrounding flow of discourse. When we hear “Once upon a time,” we all know a fairy tale is starting. There are equivalents in other oral traditions, as in “Here is a tale . . . .” (Yoruba) and “See it here!” (Hausa).

Likewise, oral storytelling traditions use formulaic final lines. In our fairy tales, it’s often “And they lived happily ever after.” Native American folk tales may finish with the storyteller simply saying, “The end” or “Tied up.” African oral epics sometimes use the formula “That is what I know” or “Let us leave the words right here.”

In fact, I cheated a little with my examples from Snow White. The film actually is bracketed by a literal opening and closing.

In film, however, the symmetrical structure is only part of the answer. While a film can copy the literary idea of opening and closing a written text, the medium as a whole offers something more. Thanks to the visual nature of movies, the widening or closing-off of the story world can mimic the act of our entering or backing out of a tangible situation. That’s what we see in Snow White and my other examples. In a sense we greet the characters, and after spending some time with them we bid them farewell. The sense of entering and leaving their world is harder to capture so concretely in literature or theatre.

So we need a more specific account of the convention. Surely you’ve thought about the most obvious candidate. The entry/ leave-taking pattern mimics our activities as perceiving, socially inclined people. For creatures like us, to encounter a new situation or setting simply involves approaching it or letting it approach us, then becoming part of it and fixing our attention on its details. At a party, we amble closer to a knot of people we want to talk with, or someone comes up to us. Sooner or later that encounter ends or trails off, and we withdraw or turn away or watch when others depart. More poignantly, the extreme long-shot option can recall moments when a car or bus or train carried us away from our loved ones. In any case, thanks to moving images the salient features of this very common experience can be made tangible for viewers.

This convention, we might say, is just natural. Molly, who had already logged three years of social experience, could plausibly make the analogy with ease. She recognized that her encounter with the world of Snow White was coming to an end–the characters were leaving her–and could express her regret.

This common-sense answer is, I think, basically right. But it needs to be fortified against some objections.

For one thing, not all films use this convention. Lots of films start with close-ups of particular items in an environment, plunging us into details without a gradual entry (below, Back to the Future; Muriel, ou le temps d’un retour).

And lots of films don’t end with marks of withdrawal or closure. They end on close views, often of the characters or a significant object (below, Le Silence de la mer; Sorry, Wrong Number).

If this pattern were biased by our natural proclivities, wouldn’t it be more common than it is? And if it comes so naturally, why wasn’t it present at the start of cinema, as soon as people started telling stories? Most of the storytelling techniques we now consider very user-friendly, like cutting and camera movement, emerged after many years of moviemaking. It’s akin to a problem in the history of painting: Why did painters need centuries to learn to imitate the way things look?

Moreover, the naturalness can look pretty unnatural. Nobody lies down in the middle of the road to let a horde of bikers sweep by, as in the beginning of The Wild One. The high angle at the end of the movie doesn’t imitate anybody’s likely point of view; it mimics, if anything, a godlike perspective we almost never get. Similarly, the cuts and dissolves that link the views in Broken Blossoms and Snow White don’t have any parallel in our real experience. Tied to our bodies, we can only move toward or away from situations in real time, step by step. The time-compression and freedom of position we get on film are far from natural. In sum, the concrete ways in which this immersion/ withdrawal pattern shows up in actual movies suggest that the films aren’t imitating the literal vantage points we might assume in the real world. There is a lot of contrivance going on.

Real life, amped up

So the convention has some roots in our perceptual and social experience, but filmmakers have streamlined and sharpened it for our uptake. Another example of this process, which I discuss here, involves actors’ eye behavior. Film actors start with normal patterns of looking and blinking, but they modify them in order to signal emotional states and to concentrate our attention on the drama. Similarly, the schema of entering/ leaving a milieu derives from our common experience, but it can be simplified and stylized through cinematic techniques that have no correspondence in our normal experience. All that matters is that the result preserves the core of that schema.

As filmmakers simplify our perceptual and social experiences, they amplify them as well. Movie characters stare at each other more intently than we do in normal life. The actors are stripping off everyday noise (blinks, averted looks) in order to create cleaner, exaggerated signals of mutual attention to the dramatic situation. Likewise, a steep high angle like that ending The Wild One or the pointedly closed curtains of The Hunt for Red October exaggerates the sense of departure and conclusion, giving the action a weight it wouldn’t have in ordinary life.

Why don’t all films use the entry/ retreat convention? I think we should consider such conventions to be tools. They arise in particular filmmaking traditions, perhaps through trial and error, and get refined for different purposes. Some filmmakers might never use them, or knowingly refuse them to create different reactions, or play games with them (as Resnais does at the start of Muriel). But when filmmakers want certain effects, these tools are ready to hand. If you want to ease your audience into the story world, a slow entry pattern works very well. It’s so familiar that the viewer can overlook its contrivance and concentrate on the story information.

So Jeopardy!-style, we have new questions. Has the opening/ closing device spread because it is a very accessible, spontaneously understood convention? Or is it because filmmakers in other cultures have mechanically copied Hollywood? Perhaps it’s like a tool that originates in particular cultural circumstances but which can be useful all over the world. The Phillips-head screw was devised for the US automotive industry but is now a universal gadget. Over time, such tools become more widespread, perhaps dominant. But a tool can always be refused if the task changes.

There’s still a mystery, though. Assuming that viewers understand these image-clusters as signaling beginnings and endings, how did that understanding come about? How, and when, do people learn the conventions? If they are quickly learned, is that because they play into inclinations of our minds?

My best guess for now is this. We tend to think that all artistic conventions are equally hard to learn, but that’s probably not the case. The conventions of Cubist painting demand more effort and knowledge than the conventions of perspective drawing, and that’s probably not just because we’ve seen more perspective pictures across our lives. Likewise, mastering the conventions of Structural Film demands more time and trouble than mastering those of popular genre filmmaking. It seems likely that some conventions rise to dominance because they fit most viewers’ prior inclinations rather well. For example, we are creatures who interact socially in a face-to-face manner. Shot/ reverse-shot editing is unfaithful to our ordinary experience in many ways; cuts provide instant changes of viewpoint we can’t achieve in real time and space. But shot/ reverse-shot cutting preserves the familiar social pattern of turn-taking and conversational flow, so it’s comparatively easy to learn.

Still, I don’t have a full answer. My questions bring us back to Molly. If she had seen Snow White before, we might say that she remembered that the film ended at this point. But she hadn’t seen the film before. So does her demand for more indicate that she generalized from her experience of other films? Did she already have a degree of “narrative competence,” allowing her to recognize certain types of cues? Is her competence full or spotty? (Maybe she gets endings, but not characters’ intentions, or goal orientation.) And how did she acquire that competence? Quickly or slowly?

Maybe some researchers have already explored how children master narrative framing of this sort. (I’d welcome correspondence on the matter.) My hunch is that Molly grasped the cinematic convention, perhaps on limited exposure, because she recognized the core perceptual and social experiences it preserves. Leveraging this insight, perhaps, was some understanding of narrative architecture generally. More basically, it may be that as social animals we’re tuned by evolution to detect fairly quickly some constant features of interactions with our peers.

In any case, I’m betting that Molly wasn’t simply mastering a convention cut off from everyday life, like algebraic equations. To learn anything, you already need to know a lot. Cinematic storytelling isn’t, as some semiologists once thought, a highly arbitrary sign system. It piggybacks on our experience of the world–our knowledge, certainly, but also our most routine ways of sensing and thinking. Not least, understanding movies taps skills associated with our social intelligence. More on this in a future entry, I hope.

This is only an anecdote, but I hope it provokes readers to think about the broader issues I raise. I floated these ideas in my closing address for the convention of the Society for Cognitive Studies of the Moving Image, on 5 June 2010. I thank the members of the audience for their contributions to that discussion. Incidentally, this year’s meeting is coming up, in Budapest this June. For more on SCSMI and the sort of issues broached in this entry, go to this category on this site.

My basic argument for a “moderate constructivism” in understanding film conventions is developed in the 1996 essay “Convention, Construction, and Cinematic Vision,” reprinted in Poetics of Cinema (2008), 57-82. I discuss the entry/ withdrawal pattern as a holistic strategy in a later essay in that collection, “Three Dimensions of Film Narrative,” 94-95. An earlier, more fancily titled piece puts the pattern in the context of a different argument: “Neo-Structuralist Narratology and the Functions of Filmic Storytelling” in Marie-Laure Ryan, ed., Narrative across Media: The Languages of Storytelling (2004), 209.

It’s worth mentioning that, like Snow White, the images at the start and conclusion of Le Silence de la mer are enframed by a book–in this case, shots of the original book by Vercors. This iconography is of course common when a film wants to emphasize the literary origins, though in some films, like Dreyer’s, the device of the film as a replay of an already-written text plays a more complicated narrative role. I talk about that in my book The Films of Carl Theodor Dreyer (1981).

The research on children’s understanding of stories is vast and detailed. Alison Gopnik briskly summarizes the evidence that even very young children have surprising narrative competence in The Philosophical Baby (2009), chapters 1 and 2. The pioneering source here, at least for an amateur like me, was Katherine Nelson, ed., Narratives from the Crib (1989).

The Bad Lieutenant: Port of Call New Orleans. As in many films, the final shot addresses the viewer in a comparatively overt way.

Wednesday | April 27, 2011 | Film comments, Film theory, Film theory: Cognitivism, Narrative strategies, Readers' Favorite Entries, Society for Cognitive Studies of the Moving Image | No Comments »

Now you see it, now you can’t

DB here:

We usually respond to films spontaneously, but afterward we can think about our responses and figure out why we reacted as we did. When we’re fooled by a mystery, for instance, we can re-watch the film and trace exactly how we were misled. Now that Shutter Island is out on DVD, fans will be dissecting its visual gimmicks. Even when a film isn’t a mystery, a lot of critical analysis involves what we might call a rational reconstruction of how the whole shebang works. Novice screenwriters crack open a movie like The Apartment or The Godfather to peer into the fine mesh of plot construction, to tease out all the setups and plants and twists that seem inevitable only after the fact.

This is film research, we might say, at the personal level. Not in the sense of your or my unique identity, but rather at the scale we see the world. We don’t see atoms or gravity. We evolved to sense and think about middle-sized social and physical phenomena, like places and objects and, especially, other humans. We’re aware of the world because we sense ourselves as individual agents, guided by intentions and desires and beliefs. We’re used to talking about films at this level. When we track action and character, note surroundings and time passing, or ask about the purposes of a plot device or theme, we are working at the level of personhood.

But life assigns us to other levels too. There is the subpersonal level. All kinds of things are happening to you now that you can’t be aware of. You can’t watch the cells in your retina detect this sentence, or the neurons in your brain firing to make sense of it, or the flow of signals to your hand urging the mouse to scroll onward. A huge amount of our mental activity takes place behind the scenes that flit through our consciousness. We can’t pay attention to the man behind the curtain—partly because there is nobody there.

There’s also the suprapersonal level, the level of collective behavior patterns. Now we’re talking about people as parts of large-scale forces, like groups and cultures and societies. Historians have traditionally worked at this level. For example, some researchers have traced how film audiences, en masse, have responded to movies.

More strikingly, many scientists now study “self-organization”—the emergence of patterns of order that don’t seem to be willed or intended by individuals or groups. We find impressive instances in nature: fish swim and birds flock in intricate patterns that no one fish or bird could imagine or dictate. Such self-organization is even more striking in human activities like traffic flow or online networks. Is there a sort of “physics of society”? No doubt people have intentions and quirks, but often we can bracket those out and see shapes in the data that no one could have designed. The classic instance is a power law. A remarkable example is Vilfredo Pareto’s discovery that income distribution in any society tends to settle out as 20 percent of the people controlling 80 percent of the wealth. Mark Buchanan sums up the suprapersonal viewpoint this way: “Think patterns, not people.”
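To make the 80/20 figure concrete, here is a minimal sketch of how a power law produces it. It assumes incomes follow an idealized Pareto distribution with shape parameter `alpha`, which is an illustrative assumption rather than anything claimed above about real societies. Under that assumption the richest fraction `p` of people holds a share `p ** (1 - 1/alpha)` of the total. The names `top_share` and `ALPHA_80_20` are hypothetical.

```python
import math


def top_share(p, alpha):
    """Share of the total held by the richest fraction p, for a Pareto law.

    Assumes an idealized Pareto distribution with shape alpha > 1
    (an illustrative assumption, not an empirical claim).
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the total to be finite")
    return p ** (1 - 1 / alpha)


# The shape value at which the top 20 percent hold exactly 80 percent.
ALPHA_80_20 = math.log(5) / math.log(4)  # roughly 1.16

if __name__ == "__main__":
    print(top_share(0.20, ALPHA_80_20))
    # A steeper law (larger alpha) spreads the total more evenly.
    print(top_share(0.20, 3.0))
```

The point of the sketch is only that a single parameter fixes the whole pattern: nobody has to intend the 80/20 split for it to fall out of the distribution.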

We know we can study films at the personal level. How can we study films subpersonally and suprapersonally too?

Reverse-engineering a movie

Mon Oncle.

My answer comes after four days earlier this month at the annual convention of the Society for Cognitive Studies of the Moving Image. We met in Roanoke, Virginia, in a massive nineteenth-century hotel made over into a convention center. You know the place has things under control when every PowerPoint presentation works flawlessly.

What ideas unite the film scholars, psychologists, and philosophers who gathered here? Roughly, the members explore moving-image media through empirical methods. Empirical inquiry can include classic scientific method (hypothesis/ experiment) or methods of aesthetic, historical or quantitative analysis. Typically the goal is not interpretation of a particular film or TV show or videogame but rather understanding of some general aspects of these media. Not explication, we might say, but rather explanation. Further, most members of the Society are interested in ways that film can be illuminated by areas of modern psychological research, such as neuroscience, cognitive science, and evolutionary psychology.

Some of our philosophers would say that they pursue conceptual analysis rather than empirical inquiry. Still, they join our meetings because the sorts of concepts they want to analyze are the ones that the film folk and the psychologists deploy—concepts like artistic intention or the nature of genre. Many of our liveliest sessions have come from disputes between Filmies, Psychos, and Philosophes.

For more on what we do, I’ve offered some background in earlier entries on this site. I previewed the 2008 convention here and here. I previewed the 2009 convention here and here.

Previews, but not followups. The big problem covering these events is that they’re so busy I have no time to blog during them, and by the time they’re over I’m usually en route to Il Cinema Ritrovato in Bologna. So I tended to make those entries introductions to the cognitive perspective, rather than surveys of who said what. This year, because our gathering was earlier than usual, I’m trying to sum up the event reasonably soon after it ended. We had simultaneous sessions, sometimes three at once, so I attended fewer than half of the talks. I’ll try to mention presentations that seem relevant, even if I didn’t hear them. Conversely, I heard some presentations (e.g., Lisa Broad on possible worlds) that were stimulating but don’t quite fit into my thesis here.

There were plenty of talks that developed arguments at the level I called personal. That is, they analyzed how films were designed to achieve certain effects. This calls for “reverse engineering”: starting from plausible viewer responses and then looking for creative choices made by the filmmakers that seemed to fulfill particular functions.

Take as an example Carl Plantinga’s paper on how we strike up moral attitudes toward characters. He was interested in how we achieve what Murray Smith calls “allegiance”—a “pro-attitude” toward certain characters. Is it just a matter of wishing good things for them, or admiring their positive traits? Is it a matter of sympathy for their situation?

Carl argued that all these factors play a role, but they aren’t enough to assure our siding with a character. He argued that allegiance also involves moral judgments, or rather moral intuitions. Films exploit two facts about these moral intuitions: they must be summoned up quickly, without much thought, and they are often driven by emotion rather than ideas.

Oddly enough, our moral intuitions are not necessarily driven by moral standards! Carl drew on Anthony Appiah’s analysis of moral judgments as influenced by how a situation is framed, ordered, and primed—basic cognitive cues that influence responses. For example, Legends of the Fall sets up two brothers, one conventionally moral and the other not. Yet it’s the wild, violent Tristan who earns our sympathies, because he displays vitality, youth, beauty, sensitivity, and closeness to nature. The upshot is that we rationalize a moral judgment on non-moral grounds. Carl got several questions about his conception of morality and the possibility that our moral intuitions are tied to things we value, like beauty.

Malcolm Turvey offered a paper on gags in Jacques Tati. Pointing to prior work on how Tati’s gags are integrated into a shot’s composition (scattered so that we may miss them) and linked to one another (through overlap, Kristin has suggested), Malcolm went on to argue that the gags themselves don’t obey the conventions of classic comedy. They exhibit strategies of misinterpretation, blockage, ellipsis, fragmentation, and concealment that are highly original and unusually challenging. Malcolm is working on a book on “ludic modernism,” and he sees Tati as fitting into this tradition.

Malcolm’s precise and persuasive account was framed by a strong attack on the tendency of cognitive film theory to concentrate on ordinary, even undistinguished films and ignore problematic and avant-garde instances. He pointed to passages in Steven Pinker’s writings that mock experimental art, and he urged cognitive film researchers to “call Pinker out” for his borderline Philistinism. He remarked that psychological researchers could learn as much from Tati’s work as from ordinary films…and maybe discover new things.

A similar sort of rational reconstruction at the level of personal response was found in many other papers I heard. Jason Gendler dissected the misleading narration in The Blue Gardenia. Rory Kelly wondered why viewers tend to forget the water-utility plotline at the start of Chinatown. James Fiumara considered why modern startle effects are comparatively rare in classic horror films. Torben Grodal isolated a group of “disgust-driven phobic films” (Taxi Driver, Blade Runner, Se7en) that sink so deeply into disgust that melancholy drives out empathy. Lennard Hojberg studied circular camera movements that express the dizziness of love as based on embodied vision.

Some researchers used quantitative procedures to capture a film’s regularities. Monika Suckfuell exposed some very complex patterns in a short film, Father and Daughter. These create a distinct emotional tone through what she calls “distance editing.” The patterns are combinations of thematic units like problem-solving or humor, and the recurring combinations arouse both comprehension and pleasure. In another quantitative study, Tseng Chiaoi and John Bateman sought to tie concrete uses of filmic elements with more abstract aspects of meaning. (Chiaoi and John had done a presentation on film-based discourse semantics at last year’s event.) Concentrating on characters’ action patterns, Chiaoi used computer software to trace their structures in television commercials and the war genre.

Squeezing the stimulus

Bopping after a session: Tseng Chiaoi and Paul Taberham.

The talks I’ve mentioned, along with several others, analyze processes that we can access by re-viewing films, studying their form and materials, and examining the genres and stylistic traditions to which they belong. But other talks concentrated on the subpersonal areas, the parts of our responses that we can’t access so easily.

For instance, how do we mentally stitch together various shots to create a unified space for a scene’s action? Film scholar Todd Berliner and psychologist Dale Cohen presented an account of how we achieve an illusion of spatial continuity. Initially the brain grasps shots as if they were pieces of space selected by the viewer, and then it builds up a model of the whole space—say a porch in front of a house’s front door. When a character moves his arm a certain way toward an offscreen area, our model of porches and doors makes it most probable that he is pressing the doorbell.

In such ways we run “beyond the information given.” Filmmakers count on our doing that, so they give us feedback (say, the sound of a doorbell, or a shot of a door opening) that confirms our model of the space. Most mainstream films build in such redundancy. But the unity of these spaces is undermined by some contemporary exhibition technology. Todd and Dale suggested that the mental models we build of space exclude the movie theatre, so that surround sound and 3D become problematic.

They also got several good questions. How concretely specified are these model spaces? How do they develop in the course of a scene? Might our perception of continuity come down to a lack of perception of discontinuity—that is, maybe we operate on very simple default assumptions and don’t build up many models of the space.

Todd and Dale’s presentation was in the Helmholtz tradition; they even invoked “unconscious inference” as part of the story. Another tradition, represented by SCSMI founders Joseph and Barbara Anderson, invokes James J. Gibson’s ecological approach to perception, which argues that such modeling and inference-making isn’t really happening. Things are much more direct: Perception is data-driven, and needs top-down correction only in rare cases (like nighttime or fog).

Other perceptual researchers try for a more parsimonious research strategy: How much information about the visual world can we squeeze out of the stimulus? This question was raised by Jordan DeLong, who has been exploring how we can identify emotional arousal through very “low-level” information. Using a corpus of 150 films (more on this later), he looked at shot lengths, the distribution of shot-lengths across a film, and a purely physical measure of visual activity (essentially the change from frame to frame) to see if they correlate with genres we associate with high levels of arousal, like action and adventure films. Jordan’s study is preliminary, but there is the possibility that certain purely physical features are reliable indices to levels of arousal—even if people don’t notice those features and are much more fastened on characters and their actions.

The current projects of Tim Smith’s research team exemplify the parsimonious strategy. Tim is a long-time participant in SCSMI, and his talks show how a research program can expand and enrich itself.

We know people look at certain areas of a shot. We also know that our attention is directed, driven by features of the stimulus. What features? We filmies would pick out shot composition, color, movement, lighting, shot scale, etc. We can access those middle-level variables through expert introspection and analysis. But can those features be further decomposed?

Tim thinks so. We can consider any of these technical qualities as made up of luminance, color channels, and other low-level physical aspects of vision. Through signal-detection methods, Tim seeks to pinpoint what the crucial variables are. The results of his work are soon to be published, and I don’t want to give the game away. I’ll just say that he has shown through eye-tracking that certain low-level features are more important than others in engaging viewers’ attention.

One implication of Tim’s findings is that what I’ve called “intensified continuity” seems to have an optimal grip on spectators. This technique, he remarked, “almost paralyzes the eyes,” yielding “an illusion of active vision with passive eyes.” More generally, his work seems to back up James Cutting’s remark that “There is no such thing as voluntary attention sustained for more than a few seconds at a time.” Most of our attention is at the mercy of the outside world, which means that filmmakers need to engage us at every moment—either with narrative, or with something else we’ll find arresting.

Dan Levin, Pia Tikka; in the background William Brown, Carl Plantinga, and Dirk Eitzen.

Dan Levin gave the keynote address at SCSMI two years ago, and he showed how we vastly overrate our ability to spot outrageous changes in the world or on the movie screen. (He’ll have a field day with the tricks in Shutter Island, such as the one surmounting this blog.) This time around Dan talked about Theory of Mind in the movies, the ways that films exploit our (species-specific?) inclination to attribute beliefs, desires, and goals to the creatures we see in the world, and in films. His paper scanned mandatory, bottom-up cues, middle-level activity (the sort of organization of visual space Todd and Dale discussed), and then “controlled cognition,” such as our narrative expectations. So for Dan it’s not all in the stimulus. Once we’ve picked out certain aspects of it, our Theory of Mind system locks on them.

What aspects? Chiefly, signals of intention and marked eye direction. When we think we’re watching an intentional agent, like a movie character, we tend to see the agent’s eye direction as giving us a clue to his or her aims. Using films he has made, Dan tests people for how eyeline-matched cuts are read, and he varies the cues to see how people construe them differently.

Interestingly, when he showed the same shots in a different order, about two-thirds of the subjects didn’t notice that the order was different. That is, whether the object of the glance or the person glancing came first, gaze deflection remained the primary cue to understanding the situation. This suggests to me that storytelling cinema doesn’t absolutely need the classic pattern of person looking/ person looked at/ person looking, but the extra shot makes sure we all understand (redundancy again).

A similar sort of top-down/ bottom-up theory was proposed by Dan Barratt, who gave it a more computational spin. Like Tim, he works on eye movements; like Todd and Dale he’s interested in how we construct space; like Dan, he seeks out intentional factors. It’s not all in the stimulus, but we won’t know how much unless we keep squeezing.

Ripples in the flow of films

The Society broke new ground in what I called the “suprapersonal” realm, that of large-scale patterns of activity that aren’t explicitly coordinated by individuals or groups. Several researchers are probing this in relation to films. Films are, in effect, deposits of human behavior; they are artifacts resulting from choice. What if we find patterns of choice that we can’t plausibly trace to coordinated decision-making?

Chris Atherton raised the issue by considering how to study style statistically. Citing Barry Salt as a pioneer in the area, he focused on the work of the Cinemetrics group and offered suggestions on how to better collect data and track patterns at various levels of generality. More sharply, he posed the question of function. How can you measure that? One implication was that in a big body of films we can disclose order that can’t wholly be explained as the sum of individual choices. Those choices matter, but so do forces we have yet to determine, including historical processes. Chris’s reflections chimed nicely with those of our keynote speaker.

James E. Cutting is a distinguished perceptual psychologist at Cornell. He’s written a fine book on motion perception and has done a quantitative historical study of the creation of the canon of French Impressionist painting. He’s a former dancer and a sensitive appreciator of art and music (and film). He’s the ideal person to analyze ticklish aesthetic issues.

You may have run across him recently because his research on Hollywood films was picked up and trumpeted in the press. “Solved: The Mathematics of the Hollywood Blockbuster,” read one headline. Needless to say, James was doing something more subtle. You can read summaries here and here.

In his lively address, “Attention, Intensity, and the Evolution of Hollywood Film,” James explained two of his areas of interest: the ebb and flow of change across a film, and how that change is tied to human pickup. James studies these topics with a big database: 150 movies from 1935 to 2005. He emphasizes widely seen films belonging to five genres and chosen from the highest-rated titles on IMDB. But there’s a micro- side too. He and his research team went through every film frame by frame, coding each one along many dimensions.

Naturally, he takes on the vexed issue of Average Shot Length. You can read the ongoing discussions of this concept on Yuri Tsivian’s Cinemetrics site. What interests James is less ASL in itself than the ways in which comparable patterns of shot lengths cluster in certain parts of the film. A film’s ASL may be 8 seconds, but in some passages several shots might have similar lengths, say 12 seconds each. Moreover, those patterns may “ripple through the film,” recurring at certain intervals and different scales (shot clusters, scenes, or other chunks).
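The difference between an average and a pattern is easy to see in a toy example. The sketch below is not James's procedure; it swaps in a plain lag-1 autocorrelation as a stand-in for his more elaborate analysis, and the two shot lists are invented for illustration. Both hypothetical "films" have an ASL of 8 seconds, but only one of them groups shots of similar length together.

```python
def shot_lengths(cut_times):
    """Turn a sorted list of cut timestamps (seconds) into shot durations."""
    return [b - a for a, b in zip(cut_times, cut_times[1:])]


def average_shot_length(lengths):
    """ASL: total running time divided by the number of shots."""
    return sum(lengths) / len(lengths)


def lag_autocorrelation(lengths, lag=1):
    """Correlation between each shot's length and the shot `lag` places later.

    Positive values mean neighbouring shots tend to be alike (clustering);
    negative values mean long and short shots tend to alternate.
    """
    n = len(lengths)
    mean = sum(lengths) / n
    variance = sum((x - mean) ** 2 for x in lengths)
    if variance == 0:
        return 0.0  # every shot is the same length; nothing to correlate
    covariance = sum(
        (lengths[i] - mean) * (lengths[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Invented data: same shots, different ordering.
clustered = [4, 4, 4, 4, 12, 12, 12, 12] * 2
alternating = [4, 12] * 8

if __name__ == "__main__":
    for name, shots in (("clustered", clustered), ("alternating", alternating)):
        print(name, average_shot_length(shots), lag_autocorrelation(shots))
```

The ASL cannot tell the two orderings apart; the autocorrelation can. Repeating the calculation at larger lags is one crude way to look for patterns that recur at longer intervals.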

Not every film shows such patterns; film noirs seem random in their patterns of shot lengths. But many movies, especially those of the last fifty years, do display these patterns, typically clusters of short shots for physical action, clusters of longer ones for conversations. This clustering tendency is on the increase, even outside the action genre.

The finding leads James to ask about the pacing of visual change, which raises the prospect of the sort of tension/ release dynamics we find in music. The patterns he finds don’t look arbitrary because they match the so-called 1/f or pink-noise pattern. This pattern has been detected in the natural world, in heartbeats, and in brain activity. It’s also been discovered in reaction times to tasks. In effect, the 1/f pattern captures not continual attentiveness but rather an alternation of intense concentration, moments of slower pickup, and moments of sheer mind-wandering. A nontechnical explanation of the 1/f pattern is here; a technical one is here.
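For the numerically curious, "1/f" means that when a series is broken into slow and fast fluctuations, the power at each frequency falls off in proportion to one over that frequency. On a log-log plot that is a straight line with slope -1. The sketch below only illustrates the definition; it is not the analysis James's team ran. It builds an artificial series with a known spectrum, then recovers the slope with a naive Fourier transform. All function names are hypothetical.

```python
import cmath
import math
import random


def synth_series(n, exponent, seed=0):
    """Sum of cosines whose power falls off as 1 / f**exponent (random phases)."""
    rng = random.Random(seed)
    freqs = range(1, n // 2)
    phases = {k: rng.uniform(0, 2 * math.pi) for k in freqs}
    return [
        sum(k ** (-exponent / 2) * math.cos(2 * math.pi * k * t / n + phases[k])
            for k in freqs)
        for t in range(n)
    ]


def power_spectrum(series):
    """Naive discrete Fourier transform: (frequency, power) for f = 1 .. n/2 - 1."""
    n = len(series)
    mean = sum(series) / n
    spectrum = []
    for k in range(1, n // 2):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        spectrum.append((k, abs(coeff) ** 2 / n))
    return spectrum


def spectral_slope(spectrum):
    """Least-squares slope of log(power) against log(frequency)."""
    xs = [math.log(f) for f, _ in spectrum]
    ys = [math.log(p) for _, p in spectrum]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# exponent=1 is the pink-noise case; exponent=0 is a flat ("white") spectrum.
pink_slope = spectral_slope(power_spectrum(synth_series(128, exponent=1)))
flat_slope = spectral_slope(power_spectrum(synth_series(128, exponent=0)))

if __name__ == "__main__":
    print("pink:", pink_slope, "flat:", flat_slope)
```

A slope near -1 marks the pink-noise case and a slope near 0 marks a series with no preferred time scale. Real data, such as a film's sequence of shot lengths, would give a much noisier spectrum than this synthetic one.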

As for intensity, James is seeking to measure visual activity in the shots. How much change, in effect, can there be from frame to frame? This can be captured by correlating each frame with its mates. James and his team find that from 1935 to 2005, there’s been a steady increase of frame-to-frame visual activity across all genres. The images became busier, with more movement. Today, he suggests, Hollywood is exploring ways to raise the frame-to-frame visual activity, not only through lots of movement of characters (the action film comes to mind) but also through “queasicam” handheld movements.
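One simple way to turn "correlating each frame with its mates" into a number is sketched below. It assumes each frame has been reduced to a flat list of pixel brightness values, and it defines activity as one minus the correlation between adjacent frames. That definition is an assumption made for illustration; the exact index used in the published study may differ.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length lists of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A uniform frame has no variance, so correlation is undefined.
        # Returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def visual_activity(frames):
    """One value per adjacent pair of frames: 0 = identical, larger = busier."""
    return [1 - pearson(f, g) for f, g in zip(frames, frames[1:])]


# Toy "frames" of four pixels each: a held image, then an abrupt change.
held = [1, 2, 3, 4]
changed = [4, 3, 2, 1]

if __name__ == "__main__":
    print(visual_activity([held, held, changed]))
```

Averaging these values over a whole film gives a single figure that can be compared across decades or genres, which is the kind of comparison described above.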

In combination with decreasing ASLs, Hollywood seems to be asking: How briefly can I show you this and still get the point across? So far, James suggests, films like Mission: Impossible III and The Bourne Ultimatum seem to be the busiest at the visual level. But animated films score about the same.

James has a caveat: The size of the screen matters. Even a big home-theatre screen doesn’t duplicate the breadth of a theatre screen, which activates not only our central vision but our peripheral vision too. That’s why bumpy shots that you can tolerate on a computer monitor may make you queasy at the multiplex. Interestingly, the Bourne films and Cloverfield, James suggests, were more popular on IMDB after the DVD versions came out. Perhaps people were better able to assimilate them on a smaller display.

At one level, James is interested in the subpersonal factors. He grants that you don’t notice cuts but can attend to them if you shift your focus from the story. But the widespread patterns he discloses aren’t easy to ascribe to deliberate planning. Nobody but an avant-gardist would decide to have shots of similar lengths at points 14 minutes or 25 minutes apart throughout the film. But James finds such correlations at levels beyond chance.

I can’t pretend to understand everything mathematical in James’ argument, but I think that his discoveries open up a new way to think about pacing in film. My first impulse is to think about historical causes, as Chris suggests: filmmakers learning from each other, converging on optimum choices. But I also like to entertain the possibility that this optimum carries a resonance even beyond the flux of history. Perhaps like fish and fowl, filmmakers are obeying and viewers are fulfilling, completely unawares, deep rhythms built into nature and numbers.

A tangled databank

Traditional humanists would decry a lot of what goes on at SCSMI meetings. The appeal to general explanations, the recourse to biology and evolution, the use of quantitative and experimental methods would all smack of “scientism.” But more and more, humanists are starting to turn away from the endless reinterpretation of canonical or non-canonical artworks. Many are also quietly defecting from the Big Theory that dominated the 80s and 90s. In film publishing, I’m told, editors have come to an informal moratorium on books on Deleuze. Possibly more people write them than read them.

Committed to a theory of permanent revolution in Theory, humanists are seeking new pastures. Some have discovered neuroscience, others evolutionary psychology. Franco Moretti has launched quantitative studies of the literary marketplace. For many converts, the reconciliation with science is just a bandwagon to hop onto, and they will jump off when a newer one trundles past. But other scholars have been committed from early on. The prospect of “consilience,” the compatibility between the sciences and the arts, is something the literary Darwinists like Brian Boyd, Jonathan Gottschall, Joseph Carroll, and like-minded souls were defending long before it became fashionable.

Film theory, as Joe Anderson is fond of pointing out, has a long and intense fascination with experimental psychology. Hugo Münsterberg, Rudolf Arnheim, and Sergei Eisenstein saw no conflict in studying film art with tools and findings derived from the sciences. That interest was lost in the 1960s, for a variety of reasons. But some of us have persisted. An explicitly “cognitive” perspective has been developing in film studies for the last twenty-five years, and SCSMI has nurtured this tradition since 1997. Our commitment is deep. We’re making headway. We’re not going to go away.

There is grandeur in this view of life, and cinema.

Our Society owes a great debt to Stephen Prince of the Virginia Tech School of Performing Arts and Cinema, who hosted this year’s convention. He also gave a splendid paper on the research traditions behind precinematic optical toys, as a way of thinking about modern CGI.

I found these books on suprapersonal patterns helpful: Mark Buchanan, The Social Atom: Why the Rich Get Richer, Cheats Get Caught and Your Neighbor Usually Looks Like You (London: Cyan, 2007); Steven Strogatz, Sync: The Emerging Science of Spontaneous Order (New York: Theia, 2003); Philip Ball, Critical Mass: How One Thing Leads to Another (New York: Farrar Straus Giroux, 2004).

Todd Berliner and Dale Cohen’s paper, “The Illusion of Continuity: Active Perception and the Classical Editing System,” is scheduled for publication in the Journal of Film and Video in 2011. Tim Smith’s paper is in press at Cognitive Computation as P. J. Mital, T. J. Smith, R. Hill, and J. M. Henderson, “Clustering of gaze during dynamic scene viewing is predicted by motion.” You can check Tim’s DIEM project for video demonstrations, and his blog, Continuity Boy.

The original article by James Cutting, Jordan DeLong, and Christine Nothelfer, “Attention and the Evolution of Hollywood Film,” appeared in the March issue of Psychological Science. Access is through subscription.

As for Shutter Island, thanks to Justin Daering for pointing out the double sleight-of-hand. More thoughts on the film and Scorsese’s expressionist/ impressionist tendencies are in our backfile.