Research study coordinators at 17 sites participating in a cardiac surgery study were trained to administer and score a brief neuropsychological test battery. Results were sent to the study's centralized laboratory for review and feedback. For each site, the average number of examiner errors on the first six protocols was compared with the average on the last six protocols over 12 months. Overall, average errors fell from 4.42 for the first six protocols to 1.83 for the last six, a significant decline. Errors for instruction, administration, and recording decreased significantly over time. Despite ongoing feedback to examiners, scoring errors did not decline significantly overall, which suggests that review of all protocols is necessary to achieve reliable scoring. However, when the number of protocols each examiner had completed was compared with the number of scoring errors per protocol, there was a trend for examiners who had completed more protocols to show greater improvement in scoring.