Operation Span Task

Version: v1 (current)

A complex span task measuring working memory capacity through concurrent processing and storage demands.

Overview

The Operation Span (OSPAN) task is one of the most widely used measures of working memory capacity. Unlike simple span tasks (e.g., digit span) that only require storage, complex span tasks like OSPAN require participants to simultaneously process information (solve simple math problems) while remembering items (letters or words) for later recall.

This dual-task design taps into the core executive functions of working memory: maintaining goal-relevant information while performing concurrent processing. Individual differences in OSPAN scores correlate with reading comprehension, fluid intelligence, and academic performance, making it a powerful cognitive assessment tool.

The task is extensively used in cognitive psychology, educational research, and clinical neuropsychology to measure working memory capacity and executive function.

Scientific Background

Classic Findings:

Capacity Limits: Most adults recall 3-5 items correctly under dual-task conditions
Storage-Processing Trade-off: More complex processing reduces storage capacity
Individual Differences: OSPAN scores predict fluid intelligence (r ≈ 0.5)
Age Effects: Working memory capacity declines with age, and the decline is evident in OSPAN scores
Training Effects: Working memory training can improve OSPAN scores (though transfer is debated)

Key Mechanisms:

Controlled Attention: Ability to maintain task goals in the face of interference
Dual-Task Coordination: Switching between processing and storage operations
Interference Resistance: Protecting memory traces from overwriting during processing

Seminal Papers:

Turner & Engle (1989): Original OSPAN development
Engle, Tuholski, Laughlin, & Conway (1999): Working memory, short-term memory, and fluid intelligence
Unsworth, Heitz, Schrock, & Engle (2005): Automated OSPAN version

Why Researchers Use This Task

Working Memory Assessment: Gold-standard measure of complex working memory capacity
Intelligence Research: Strong predictor of fluid intelligence and reasoning
Educational Studies: Correlates with reading comprehension and academic achievement
Clinical Assessment: Sensitive to ADHD, schizophrenia, and frontal lobe dysfunction
Cognitive Training: Baseline and outcome measure for working memory interventions

Current Implementation Status

Fully Implemented:

✅ Math operation verification (processing component)
✅ Letter recall (storage component)
✅ Configurable set sizes (2-6 items)
✅ Practice trials for both components
✅ Partial credit scoring (correct items in correct positions)
✅ Processing speed and accuracy tracking

Partially Implemented:

⚠️ Limited to letters as to-be-remembered items (not words or spatial locations)
⚠️ Fixed math operations (addition/subtraction only)

Not Yet Implemented:

❌ Spatial OSPAN variant (locations instead of letters)
❌ Adaptive difficulty adjustment based on performance
❌ Running span variant

Configuration Parameters

Task Structure

Parameter	Type	Default	Description
Set Sizes	array	[3, 4, 5]	Number of items per set (typical: 2-6)
Trials Per Set Size	number	3	How many trials at each set size
Recall Items	string	'letters'	What to remember ('letters', 'words')
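The total number of main-task trials follows from the two structure parameters above: one trial per set size, repeated Trials Per Set Size times. A minimal sketch of that expansion is shown below. The function name `build_trial_schedule` is hypothetical, and shuffling the order of set sizes is an assumption for illustration; this page does not state how the task orders its sets.

```python
import random


def build_trial_schedule(set_sizes, trials_per_set_size, seed=None):
    """Return one set size per trial, in shuffled order (ordering is an assumption)."""
    rng = random.Random(seed)
    # Repeat every set size the requested number of times.
    schedule = [size for size in set_sizes for _ in range(trials_per_set_size)]
    rng.shuffle(schedule)
    return schedule


# Defaults from the table above: set sizes [3, 4, 5], 3 trials each.
schedule = build_trial_schedule([3, 4, 5], 3, seed=0)
print(len(schedule))  # 9 trials in total
print(sum(schedule))  # 36 letters presented across the main task
```

With the defaults this gives 9 trials and 36 to-be-remembered letters, which is a quick way to estimate session length before changing either parameter.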

Processing Component

Parameter	Type	Default	Description
Operation Type	string	'math'	Type of operation ('math', 'reading')
Operation Timeout (ms)	number	5000	Max time to verify each operation
Min Operation Accuracy	number	0.85	Minimum processing accuracy to prevent ignoring task

Timing Parameters

Parameter	Type	Default	Description
Operation Display (ms)	number	0	Auto-advance operation (0 = self-paced)
Letter Display (ms)	number	800	Duration to show each letter
ISI (ms)	number	200	Inter-stimulus interval

Practice Configuration

Parameter	Type	Default	Description
Practice Mode	string	'mandatory'	Practice availability
Practice Operations Only	number	15	Practice trials for operations alone
Practice Letters Only	number	3	Practice trials for letters alone (set sizes 2-3)
Practice Combined	number	3	Practice trials combining both (set size 2)

Keyboard Shortcuts

Researchers can customize the keyboard bindings used during the task:

Parameter	Type	Default	Description
Show keyboard hint	boolean	True	Display an on-screen hint showing the configured keys
Letter key	key	Any letter (A-Z)	Key accepted for letter recall input
Letter action label	text	"to recall letters"	Label shown in the keyboard hint
Submit key	key	Enter	Key to submit the recalled sequence
Submit action label	text	"to submit"	Label shown in the keyboard hint

The default "Any letter" setting accepts any letter key A-Z for the recall phase.

Data Output

Markers and Responses

The task records high-resolution timestamps in two separate collections: markers and responses. Several marker types track the multi-step trial structure:

Markers (stimulus_shown):

{
  "type": "stimulus_shown",
  "ts": "2024-01-01T00:00:01.000Z",
  "hr": 1234.56,
  "data": {
    "trial_index": 1,
    "stimulus_id": "operation_span_0_1",
    "set_size": 3,
    "operations": [{"equation": "(2 * 3) + 1 = ?", "correct_answer": 7}],
    "letters": ["F", "H", "Q"],
    "block": "main",
    "is_practice": false
  }
}

Markers (math_response):

{
  "type": "math_response",
  "data": {
    "trial_index": 1,
    "item_index": 0,
    "equation": "(2 * 3) + 1 = ?",
    "correct_answer": 7,
    "user_answer": 7,
    "correct": true,
    "latency_ms": 1420
  }
}

Response Data (letter recall):

{
  "trial_index": 1,
  "stimulus_id": "operation_span_0_1",
  "source": "keyboard",
  "raw_key": "Enter",
  "set_size": 3,
  "operations_presented": [{"equation": "(2 * 3) + 1 = ?", "correct_answer": 7}],
  "letters_presented": ["F", "H", "Q"],
  "math_responses": [{"equation": "(2 * 3) + 1 = ?", "correct": true, "latency_ms": 1420}],
  "letter_response": ["F", "H", "Q"],
  "letter_response_correct": true,
  "partial_credit_unit_score": 3,
  "absolute_score": true,
  "math_accuracy": 1.0,
  "meets_math_criterion": true,
  "block": "main",
  "is_practice": false,
  "recall_latency_ms": 4500
}
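The scoring fields in the response record can be recomputed from the raw fields for auditing. The sketch below is one plausible reading of those fields, not the task's actual implementation: it assumes `partial_credit_unit_score` counts letters recalled in their correct serial position, `absolute_score` is an exact-sequence match, and `meets_math_criterion` compares per-trial math accuracy with the Min Operation Accuracy default of 0.85. The function name `score_trial` is hypothetical.

```python
def score_trial(letters_presented, letter_response, math_responses,
                min_math_accuracy=0.85):
    """Recompute per-trial scores from raw fields (assumed definitions)."""
    # Partial credit: letters recalled in the correct serial position.
    pcu = sum(
        1
        for presented, recalled in zip(letters_presented, letter_response)
        if presented == recalled
    )
    # Absolute (all-or-none): the whole sequence must match exactly.
    absolute = list(letter_response) == list(letters_presented)
    # Math accuracy: proportion of operations answered correctly.
    n_correct = sum(1 for r in math_responses if r["correct"])
    math_accuracy = n_correct / len(math_responses) if math_responses else 0.0
    return {
        "partial_credit_unit_score": pcu,
        "absolute_score": absolute,
        "math_accuracy": math_accuracy,
        "meets_math_criterion": math_accuracy >= min_math_accuracy,
    }


# Two letters swapped and one math error on a set of three.
result = score_trial(
    ["F", "H", "Q"],
    ["F", "Q", "H"],
    [{"correct": True}, {"correct": True}, {"correct": False}],
)
print(result["partial_credit_unit_score"])  # 1 (only "F" is in position)
print(result["absolute_score"])             # False
```

Comparing recomputed values against the stored ones is a cheap consistency check when cleaning exported data.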

Summary Artifact

A JSON file (operation_span_summary_<taskIndex>.json) with aggregated statistics:

{
  "task_kind": "operation_span",
  "task_index": 0,
  "total_trials": 15,
  "overall": {
    "total": 15,
    "valid_responses": 14,
    "correct_recall": 9,
    "recall_accuracy": 0.64,
    "meets_math_threshold": 12,
    "math_threshold_rate": 0.86,
    "total_pcu_score": 47,
    "absolute_span": 9,
    "average_math_accuracy": 0.89,
    "timeouts": 1
  },
  "by_set_size": {
    "size_3": { /* statistics */ },
    "size_4": { /* statistics */ },
    "size_5": { /* statistics */ }
  },
  "practice": { /* same structure if enabled */ },
  "trials": [ /* per-trial data */ ]
}

Key metrics:

total_pcu_score: Partial credit unit score (letters recalled in correct positions)
absolute_span: Number of perfectly recalled sets
average_math_accuracy: Proportion of math problems answered correctly
meets_math_threshold: Number of trials whose math accuracy meets the threshold (default 85%); the per-trial flag is recorded as meets_math_criterion in the response data
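The two rate fields in the example summary appear to be derived from the counts in the same `overall` object. The sketch below assumes both rates use `valid_responses` as the denominator; that is an inference from the example values (9/14 and 12/14), not a documented formula, and `derive_rates` is a hypothetical helper name.

```python
def derive_rates(overall):
    """Recompute the rate fields from the counts (denominator is an assumption)."""
    valid = overall["valid_responses"]
    if valid == 0:
        # No valid responses: the rates are undefined.
        return {"recall_accuracy": None, "math_threshold_rate": None}
    return {
        "recall_accuracy": round(overall["correct_recall"] / valid, 2),
        "math_threshold_rate": round(overall["meets_math_threshold"] / valid, 2),
    }


# Counts copied from the example summary above.
overall = {
    "total": 15,
    "valid_responses": 14,
    "correct_recall": 9,
    "meets_math_threshold": 12,
}
print(derive_rates(overall))
# {'recall_accuracy': 0.64, 'math_threshold_rate': 0.86}
```

If an exported summary does not reproduce under this assumption, check whether timeouts were counted in the denominator.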

Example Research Configurations

Standard OSPAN (Unsworth et al., 2005)

Set Sizes: 3-7 items
Trials: 3 per set size (15 total)
Operations: Simple math (3-9 range)
Timing: Self-paced operations, 800 ms letter display
Scoring: Absolute (sum of the set sizes of all perfectly recalled sets)
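In Unsworth et al. (2005), the absolute OSPAN score is the sum of the set sizes of all perfectly recalled sets, which differs from a simple count of perfect sets. A minimal sketch of both quantities, assuming per-trial records carry the `set_size` and `absolute_score` fields shown under Data Output (the function names are hypothetical):

```python
def absolute_ospan_score(trials):
    """Sum of set sizes over perfectly recalled sets (Unsworth et al., 2005)."""
    return sum(t["set_size"] for t in trials if t["absolute_score"])


def perfect_set_count(trials):
    """Number of perfectly recalled sets, regardless of their size."""
    return sum(1 for t in trials if t["absolute_score"])


trials = [
    {"set_size": 3, "absolute_score": True},
    {"set_size": 4, "absolute_score": True},
    {"set_size": 5, "absolute_score": False},  # any error earns no credit
]
print(absolute_ospan_score(trials))  # 7 (3 + 4 + 0)
print(perfect_set_count(trials))     # 2
```

The two numbers are not interchangeable, so it is worth stating which one is reported when comparing results across studies.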

Short OSPAN (Clinical/Children)

Set Sizes: 2-4 items
Trials: 3 per set size (9 total)
Operations: Very simple math (1-5 range)
Timing: Generous timeouts
Scoring: Partial credit to reward progress

Research OSPAN (Maximum Sensitivity)

Set Sizes: 3-6 items
Trials: 4 per set size (16 total)
Operations: Adaptive difficulty (not yet implemented; see Current Implementation Status)
Analysis: Partial credit scoring for finer discrimination

Participant Experience

Instructions: Learn that you will solve math problems and remember letters
Practice - Operations: Verify if equations are correct (15 trials with feedback)
Practice - Letters: Remember 2-3 letters (3 trials with feedback)
Practice - Combined: Do both tasks together (3 trials, set size 2)
Main Task: For each set:
- See operation (e.g., "3 + 5 = 9"), verify True/False
- See letter to remember (e.g., "F")
- Repeat operation-letter pairs for set size (e.g., 3 times)
- Recall all letters in order, either by typing them (any A-Z key is accepted by default) or by clicking them in the grid
- Press Enter to submit (default)
Repeat: Complete all sets (typically 9-15 total)
Completion: See overall score and processing accuracy

All keyboard bindings are configurable by the researcher in the study configuration. The keys listed above are the defaults.
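The structure of a single main-task set described above can be sketched as an ordered list of events: operation and letter alternate once per item, followed by one recall prompt. The event names and the `set_events` helper are hypothetical and only illustrate the ordering; they are not the task's internal representation.

```python
def set_events(operations, letters):
    """Return the ordered events for one set: (operation, letter) pairs, then recall."""
    if len(operations) != len(letters):
        raise ValueError("each letter must be preceded by exactly one operation")
    events = []
    for operation, letter in zip(operations, letters):
        events.append(("operation", operation))  # verify True/False
        events.append(("letter", letter))        # item to remember
    # One recall prompt at the end, expecting as many letters as the set size.
    events.append(("recall", len(letters)))
    return events


for event in set_events(["3 + 5 = 9", "2 + 4 = 6", "7 + 1 = 8"], ["F", "H", "Q"]):
    print(event)
```

A set of size 3 therefore produces seven events: three operations, three letters, and one recall.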

Design Recommendations

General Guidelines

Set Sizes: Use range 3-6 for adults; 2-4 for children or clinical populations
Processing Difficulty: Keep operations simple to avoid floor effects
Processing Accuracy: Require ≥85% accuracy to ensure participants process operations
Scoring: Choose absolute (all-or-none) or partial credit scoring depending on goals; partial credit gives finer discrimination between participants

Processing Task Design

Use operations requiring 2-4 seconds to solve
Balance true/false answers (50/50)
Avoid patterns (e.g., alternating true/false)
Ensure operations don't facilitate letter encoding (e.g., answers shouldn't spell letters)
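The balance and pattern guidelines above can be checked mechanically when generating a list of operations. The sketch below is one way to do it under stated assumptions: it uses single-digit addition only, makes false equations by offsetting the sum by 1 or 2, and rejects only strict True/False alternation rather than every possible pattern. The names `make_operations`, `is_alternating`, and `is_true` are hypothetical.

```python
import random


def is_alternating(flags):
    """True if every adjacent pair differs (T, F, T, F, ...)."""
    return all(a != b for a, b in zip(flags, flags[1:]))


def make_operations(n, seed=None, low=1, high=9):
    """Generate n addition checks with balanced, non-alternating truth values."""
    rng = random.Random(seed)
    # As close to 50/50 as n allows.
    truth = [True] * (n // 2) + [False] * (n - n // 2)
    while True:
        rng.shuffle(truth)
        # Very short lists cannot avoid alternation, so accept them as they are.
        if n < 3 or not is_alternating(truth):
            break
    operations = []
    for is_true in truth:
        a, b = rng.randint(low, high), rng.randint(low, high)
        # False equations show a sum that is off by 1 or 2.
        shown = a + b if is_true else a + b + rng.choice([-2, -1, 1, 2])
        operations.append({"equation": f"{a} + {b} = {shown}", "is_true": is_true})
    return operations


ops = make_operations(15, seed=1)
print(sum(op["is_true"] for op in ops))  # 7 true, 8 false
```

Small offsets keep false equations plausible, so participants have to compute the sum rather than reject answers that are obviously wrong.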

Population-Specific Adaptations

Children (8-12 years):

Set sizes: 2-3 items
Simple operations: 1-digit addition/subtraction
Generous timeouts
Visual feedback and encouragement
Partial credit scoring

Older Adults (65+):

Set sizes: 2-5 items
Slower pacing
Larger font for operations and letters
Allow self-paced operation verification
Consider shorter session (fewer trials)

Clinical Populations:

Adapt set sizes to functioning level
May need simplified operations
Extended practice
Monitor processing accuracy carefully (participants may ignore the math)

Common Issues and Solutions

Issue	Solution
Low processing accuracy (<85%)	Emphasize equal importance of both tasks; add practice; slow down
Participants ignore math	Implement minimum processing accuracy requirement; provide feedback
Floor effects (all recall errors)	Reduce set sizes, simplify operations, or add practice
Ceiling effects (perfect recall)	Increase set sizes, add more trials, or speed up operations
Participants write letters down	Emphasize no external aids; monitor if in-person

References

Turner, M. L., & Engle, R. W. (1989). Is working memory capacity task dependent? Journal of Memory and Language, 28(2), 127-154.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General, 128(3), 309-331.
Unsworth, N., Heitz, R. P., Schrock, J. C., & Engle, R. W. (2005). An automated version of the operation span task. Behavior Research Methods, 37(3), 498-505.
Conway, A. R. A., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005). Working memory span tasks: A methodological review and user's guide. Psychonomic Bulletin & Review, 12(5), 769-786.
