Operation Span Task
Version: v1 (current)
A complex span task measuring working memory capacity through concurrent processing and storage demands.
Overview
The Operation Span (OSPAN) task is one of the most widely used measures of working memory capacity. Unlike simple span tasks (e.g., digit span), which require only storage, complex span tasks such as OSPAN require participants to process information (solve simple math problems) while holding items (letters or words) in memory for later recall.
This dual-task design taps into the core executive functions of working memory: maintaining goal-relevant information while performing concurrent processing. Individual differences in OSPAN scores correlate with reading comprehension, fluid intelligence, and academic performance, making it a powerful cognitive assessment tool.
The task is extensively used in cognitive psychology, educational research, and clinical neuropsychology to measure working memory capacity and executive function.
Scientific Background
Classic Findings:
- Capacity Limits: Most adults recall 3-5 items correctly under dual-task conditions
- Storage-Processing Trade-off: More complex processing reduces storage capacity
- Individual Differences: OSPAN scores predict fluid intelligence (r ≈ 0.5)
- Age Effects: Working memory capacity declines with aging, evident in OSPAN
- Training Effects: Working memory training can improve OSPAN scores (though transfer is debated)
Key Mechanisms:
- Controlled Attention: Ability to maintain task goals in the face of interference
- Dual-Task Coordination: Switching between processing and storage operations
- Interference Resistance: Protecting memory traces from overwriting during processing
Seminal Papers:
- Turner & Engle (1989): Original OSPAN development
- Engle, Tuholski, Laughlin, & Conway (1999): Working memory, short-term memory, and fluid intelligence
- Unsworth, Heitz, Schrock, & Engle (2005): Automated OSPAN version
Why Researchers Use This Task
- Working Memory Assessment: Gold-standard measure of complex working memory capacity
- Intelligence Research: Strong predictor of fluid intelligence and reasoning
- Educational Studies: Correlates with reading comprehension and academic achievement
- Clinical Assessment: Sensitive to ADHD, schizophrenia, and frontal lobe dysfunction
- Cognitive Training: Baseline and outcome measure for working memory interventions
Current Implementation Status
Fully Implemented:
- ✅ Math operation verification (processing component)
- ✅ Letter recall (storage component)
- ✅ Configurable set sizes (2-6 items)
- ✅ Practice trials for both components
- ✅ Partial credit scoring (correct items in correct positions)
- ✅ Processing speed and accuracy tracking
Partially Implemented:
- ⚠️ Limited to letters as to-be-remembered items (not words or spatial locations)
- ⚠️ Fixed math operations (addition/subtraction only)
Not Yet Implemented:
- ❌ Spatial OSPAN variant (locations instead of letters)
- ❌ Adaptive difficulty adjustment based on performance
- ❌ Running span variant
Configuration Parameters
Task Structure
| Parameter | Type | Default | Description |
|---|---|---|---|
| Set Sizes | array | [3, 4, 5] | Number of items per set (typical: 2-6) |
| Trials Per Set Size | number | 3 | How many trials at each set size |
| Recall Items | string | 'letters' | Type of item to remember ('letters'; 'words' is not yet supported) |
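The session length follows directly from these parameters: the number of trials is the number of set sizes multiplied by the trials per set size. The sketch below illustrates this with the table defaults. The `TaskStructure` class and its method names are hypothetical, and shuffling the trial order is an assumption, since the actual ordering is not documented here.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskStructure:
    """Hypothetical container mirroring the Task Structure defaults."""
    set_sizes: List[int] = field(default_factory=lambda: [3, 4, 5])
    trials_per_set_size: int = 3
    recall_items: str = "letters"

    def total_trials(self) -> int:
        # One trial per (set size, repetition) pair.
        return len(self.set_sizes) * self.trials_per_set_size

    def total_items(self) -> int:
        # Total letters presented across the main task.
        return sum(self.set_sizes) * self.trials_per_set_size

    def trial_order(self, seed: Optional[int] = None) -> List[int]:
        # Assumption: set sizes are presented in shuffled order.
        order = [s for s in self.set_sizes for _ in range(self.trials_per_set_size)]
        random.Random(seed).shuffle(order)
        return order


structure = TaskStructure()
print(structure.total_trials(), structure.total_items())
```

With the defaults this gives 9 main trials and 36 letters in total.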
Processing Component
| Parameter | Type | Default | Description |
|---|---|---|---|
| Operation Type | string | 'math' | Type of operation ('math', 'reading') |
| Operation Timeout (ms) | number | 5000 | Max time to verify each operation |
| Min Operation Accuracy | number | 0.85 | Minimum processing accuracy required, so that participants cannot neglect the operations to focus on recall |
Timing Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Operation Display (ms) | number | 0 | Time after which the operation auto-advances (0 = self-paced) |
| Letter Display (ms) | number | 800 | Duration to show each letter |
| ISI (ms) | number | 200 | Inter-stimulus interval |
Practice Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| Practice Mode | string | 'mandatory' | Practice availability |
| Practice Operations Only | number | 15 | Practice trials for operations alone |
| Practice Letters Only | number | 3 | Practice trials for letters alone (set sizes 2-3) |
| Practice Combined | number | 3 | Practice trials combining both (set size 2) |
Keyboard Shortcuts
Researchers can customize the keyboard bindings used during the task:
| Parameter | Type | Default | Description |
|---|---|---|---|
| Show keyboard hint | boolean | true | Display an on-screen hint showing the configured keys |
| Letter key | key | Any letter (A-Z) | Key accepted for letter recall input |
| Letter action label | text | "to recall letters" | Label shown in the keyboard hint |
| Submit key | key | Enter | Key to submit the recalled sequence |
| Submit action label | text | "to submit" | Label shown in the keyboard hint |
The default "Any letter" setting accepts any letter key A-Z for the recall phase.
Data Output
Markers and Responses
The task records high-resolution timestamps in two separate collections. Multiple marker types track the complex trial structure:
Markers (stimulus_shown):
{
  "type": "stimulus_shown",
  "ts": "2024-01-01T00:00:01.000Z",
  "hr": 1234.56,
  "data": {
    "trial_index": 1,
    "stimulus_id": "operation_span_0_1",
    "set_size": 3,
    "operations": [{"equation": "(2 * 3) + 1 = ?", "correct_answer": 7}],
    "letters": ["F", "H", "Q"],
    "block": "main",
    "is_practice": false
  }
}
Markers (math_response):
{
  "type": "math_response",
  "data": {
    "trial_index": 1,
    "item_index": 0,
    "equation": "(2 * 3) + 1 = ?",
    "correct_answer": 7,
    "user_answer": 7,
    "correct": true,
    "latency_ms": 1420
  }
}
Response Data (letter recall):
{
  "trial_index": 1,
  "stimulus_id": "operation_span_0_1",
  "source": "keyboard",
  "raw_key": "Enter",
  "set_size": 3,
  "operations_presented": [{"equation": "(2 * 3) + 1 = ?", "correct_answer": 7}],
  "letters_presented": ["F", "H", "Q"],
  "math_responses": [{"equation": "(2 * 3) + 1 = ?", "correct": true, "latency_ms": 1420}],
  "letter_response": ["F", "H", "Q"],
  "letter_response_correct": true,
  "partial_credit_unit_score": 3,
  "absolute_score": true,
  "math_accuracy": 1.0,
  "meets_math_criterion": true,
  "block": "main",
  "is_practice": false,
  "recall_latency_ms": 4500
}
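The scoring fields in the response record above can be re-derived from the raw fields, which is useful for checking exported data. The following is a minimal sketch, not the task's own scoring code: `score_trial` is a hypothetical helper, and applying the 0.85 criterion per trial is an assumption.

```python
def score_trial(letters_presented, letter_response, math_responses, min_accuracy=0.85):
    """Re-derive per-trial scoring fields from a letter-recall record."""
    # Partial credit: letters recalled in their correct serial position.
    # zip() stops at the shorter list, so missing or extra letters earn nothing.
    pcu = sum(1 for shown, typed in zip(letters_presented, letter_response)
              if shown == typed)

    # All-or-none: the recalled sequence must match exactly.
    perfect = list(letter_response) == list(letters_presented)

    # Processing accuracy over the operations answered in this trial.
    n_ops = len(math_responses)
    n_correct = sum(1 for m in math_responses if m["correct"])
    math_accuracy = n_correct / n_ops if n_ops else 0.0

    return {
        "partial_credit_unit_score": pcu,
        "letter_response_correct": perfect,
        "math_accuracy": math_accuracy,
        # Assumption: the criterion is checked against each trial's accuracy.
        "meets_math_criterion": math_accuracy >= min_accuracy,
    }


result = score_trial(
    ["F", "H", "Q"],
    ["F", "H", "Q"],
    [{"equation": "(2 * 3) + 1 = ?", "correct": True, "latency_ms": 1420}],
)
print(result)
```

For the record shown above, this reproduces a partial credit score of 3, a perfect recall flag, and a math accuracy of 1.0.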
Summary Artifact
A JSON file (operation_span_summary_<taskIndex>.json) with aggregated statistics:
{
  "task_kind": "operation_span",
  "task_index": 0,
  "total_trials": 15,
  "overall": {
    "total": 15,
    "valid_responses": 14,
    "correct_recall": 9,
    "recall_accuracy": 0.64,
    "meets_math_threshold": 12,
    "math_threshold_rate": 0.86,
    "total_pcu_score": 47,
    "absolute_span": 9,
    "average_math_accuracy": 0.89,
    "timeouts": 1
  },
  "by_set_size": {
    "size_3": { /* statistics */ },
    "size_4": { /* statistics */ },
    "size_5": { /* statistics */ }
  },
  "practice": { /* same structure if enabled */ },
  "trials": [ /* per-trial data */ ]
}
Key metrics:
- total_pcu_score: Partial credit unit score (letters recalled in correct positions)
- absolute_span: Number of perfectly recalled sets
- average_math_accuracy: Proportion of math problems answered correctly
- meets_math_criterion: Whether math accuracy meets the threshold (default 85%)
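As a rough illustration of how these metrics relate to the per-trial records, the sketch below aggregates a list of response records into three of the summary values. `summarize_trials` is a hypothetical helper. Excluding practice trials and pooling all math responses before computing the proportion are assumptions about how the summary is built.

```python
def summarize_trials(trials):
    """Aggregate per-trial response records into summary metrics."""
    # Assumption: practice trials are excluded from the overall summary.
    main = [t for t in trials if not t.get("is_practice", False)]

    # Sum of letters recalled in the correct position across trials.
    total_pcu = sum(t["partial_credit_unit_score"] for t in main)

    # Count of sets recalled perfectly.
    absolute_span = sum(1 for t in main if t["letter_response_correct"])

    # Proportion of all math problems answered correctly (pooled over trials).
    responses = [m for t in main for m in t["math_responses"]]
    n_correct = sum(1 for m in responses if m["correct"])
    avg_math = n_correct / len(responses) if responses else None

    return {
        "total": len(main),
        "total_pcu_score": total_pcu,
        "absolute_span": absolute_span,
        "average_math_accuracy": avg_math,
    }
```

Pooling the math responses weights every problem equally; averaging the per-trial `math_accuracy` values instead would give small sets more weight per problem.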
Example Research Configurations
Standard OSPAN (Unsworth et al., 2005)
Set Sizes: 3-7 items
Trials: 3 per set size (15 total)
Operations: Simple math (3-9 range)
Timing: Self-paced operations, 800 ms letter display
Scoring: Absolute (sum of the set sizes of all perfectly recalled sets)
Short OSPAN (Clinical/Children)
Set Sizes: 2-4 items
Trials: 3 per set size (9 total)
Operations: Very simple math (1-5 range)
Timing: Generous timeouts
Scoring: Partial credit to reward progress
Research OSPAN (Maximum Sensitivity)
Set Sizes: 3-6 items
Trials: 4 per set size (16 total)
Operations: Adaptive difficulty (not yet implemented; see Current Implementation Status)
Analysis: Partial credit scoring for finer discrimination
Participant Experience
- Instructions: Learn that you will solve math problems and remember letters
- Practice - Operations: Verify if equations are correct (15 trials with feedback)
- Practice - Letters: Remember 2-3 letters (3 trials with feedback)
- Practice - Combined: Do both tasks together (3 trials, set size 2)
- Main Task: For each set:
  - See operation (e.g., "3 + 5 = 9"), verify True/False
  - See letter to remember (e.g., "F")
  - Repeat operation-letter pairs for set size (e.g., 3 times)
  - Recall all letters in order by typing on keyboard (default accepts any A-Z key -- configurable by researcher) or clicking from grid
  - Press Enter to submit (default -- configurable by researcher)
- Repeat: Complete all sets (typically 9-15 total)
- Completion: See overall score and processing accuracy
All keyboard bindings are configurable by the researcher in the study configuration. The keys listed above are the defaults.
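The structure of a single main-task set can be sketched as an ordered list of events using the default timing parameters (800 ms letter display, 200 ms ISI, 5000 ms operation timeout). This is an illustrative sketch only: `build_set_timeline`, the event field names, the letter pool, and the placement of the ISI after each letter are all assumptions rather than the task's documented behavior.

```python
import random

# Assumed consonant pool; the implementation's actual pool is not documented here.
LETTER_POOL = list("FHJKLNPQRSTY")


def build_set_timeline(set_size, rng, letter_ms=800, isi_ms=200,
                       operation_timeout_ms=5000):
    """Return the ordered events for one set: operation-letter pairs, then recall."""
    # Sample without replacement so no letter repeats within a set.
    letters = rng.sample(LETTER_POOL, set_size)
    events = []
    for index, letter in enumerate(letters):
        # Self-paced verification, bounded by the operation timeout.
        events.append({"event": "operation", "item_index": index,
                       "max_ms": operation_timeout_ms})
        events.append({"event": "letter", "item_index": index,
                       "letter": letter, "duration_ms": letter_ms})
        # Assumption: the inter-stimulus interval follows each letter.
        events.append({"event": "isi", "item_index": index,
                       "duration_ms": isi_ms})
    # Ordered recall of every letter in the set.
    events.append({"event": "recall", "expected": letters})
    return events


timeline = build_set_timeline(3, random.Random(7))
fixed_ms = sum(e.get("duration_ms", 0) for e in timeline)
print(len(timeline), fixed_ms)
```

For a set size of 3 this produces 10 events with 3000 ms of fixed-duration display time; the operations and recall phase add participant-dependent time on top.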
Design Recommendations
General Guidelines
- Set Sizes: Use a range of 3-6 for adults and 2-4 for children or clinical populations
- Processing Difficulty: Keep operations simple to avoid floor effects
- Processing Accuracy: Require ≥85% accuracy to ensure participants process operations
- Scoring: Choose absolute (all-or-none) or partial-credit scoring depending on the research goals; Conway et al. (2005) recommend partial-credit scoring
Processing Task Design
- Use operations requiring 2-4 seconds to solve
- Balance true/false answers (50/50)
- Avoid patterns (e.g., alternating true/false)
- Ensure operations don't facilitate letter encoding (e.g., answers shouldn't spell letters)
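The balancing guidelines above can be sketched as a small generator: it fixes a 50/50 split of true and false equations, reshuffles if the order strictly alternates, and uses only addition and subtraction. This is one possible approach under stated assumptions, not the task's own generator. The function names, the 1-9 operand range, and the size of the error in false equations are all hypothetical choices.

```python
import random


def is_strictly_alternating(flags):
    """True when every flag differs from the one before it (T, F, T, F, ...)."""
    return all(a != b for a, b in zip(flags, flags[1:]))


def make_operations(n, rng, low=1, high=9):
    """Build n verification items with a balanced, non-alternating truth order."""
    # Half true, half false (one extra true item when n is odd).
    truth = [i % 2 == 0 for i in range(n)]
    rng.shuffle(truth)
    # Reshuffle orders that alternate perfectly; skipped for n < 3,
    # where a balanced order cannot avoid alternating.
    while n >= 3 and is_strictly_alternating(truth):
        rng.shuffle(truth)

    items = []
    for is_true in truth:
        a, b = rng.randint(low, high), rng.randint(low, high)
        op = rng.choice("+-")
        if op == "-" and b > a:
            a, b = b, a  # keep subtraction results non-negative
        answer = a + b if op == "+" else a - b
        if is_true:
            shown = answer
        else:
            # Offset the shown result by 1 or 2 without going below zero.
            offsets = [d for d in (-2, -1, 1, 2) if answer + d >= 0]
            shown = answer + rng.choice(offsets)
        items.append({"equation": f"{a} {op} {b} = {shown}", "is_true": is_true})
    return items


for item in make_operations(6, random.Random(0)):
    print(item["equation"], item["is_true"])
```

A stricter version could also cap the length of runs of identical answers, which this sketch does not do.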
Population-Specific Adaptations
Children (8-12 years):
- Set sizes: 2-3 items
- Simple operations: 1-digit addition/subtraction
- Generous timeouts
- Visual feedback and encouragement
- Partial credit scoring
Older Adults (65+):
- Set sizes: 2-5 items
- Slower pacing
- Larger font for operations and letters
- Allow self-paced operation verification
- Consider shorter session (fewer trials)
Clinical Populations:
- Adapt set sizes to functioning level
- May need simplified operations
- Extended practice
- Monitor processing accuracy carefully (may ignore math)
Common Issues and Solutions
| Issue | Solution |
|---|---|
| Low processing accuracy (<80%) | Emphasize equal importance of both tasks; add practice; slow down |
| Participants ignore math | Implement minimum processing accuracy requirement; provide feedback |
| Floor effects (all recall errors) | Reduce set sizes, simplify operations, or add practice |
| Ceiling effects (perfect recall) | Increase set sizes, add more trials, or speed up operations |
| Participants write letters down | Emphasize no external aids; monitor if in-person |
References
- Turner, M. L., & Engle, R. W. (1989). Is working memory capacity task dependent? Journal of Memory and Language, 28(2), 127-154.
- Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General, 128(3), 309-331.
- Unsworth, N., Heitz, R. P., Schrock, J. C., & Engle, R. W. (2005). An automated version of the operation span task. Behavior Research Methods, 37(3), 498-505.
- Conway, A. R. A., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005). Working memory span tasks: A methodological review and user's guide. Psychonomic Bulletin & Review, 12(5), 769-786.
See Also
- Digit Span - Simple span task without processing component
- N-Back Task - Alternative working memory updating task
- Visual Short-Term Memory - Visual capacity measure