While in graduate school, I worked as a research associate for a university-based research institute. One of my major projects was a cost-benefit analysis of the state’s investment in secondary school occupational education programs. The research team consisted of two professors, both economists, another graduate student, and me. Although we worked as a team, my role was to do the cost analysis, and my fellow graduate student worked on the benefits calculations.
This was a data-intensive project. We compared the costs of the occupational education programs with the costs of a regular high school program. We found comparable students, based on test scores, and then surveyed both the target group and the control group about their post-high school education and careers. It was a very detailed analysis and generated a ton of data. We finished the project in about 8 months, wrote a final report, and presented the results to state education officials. Doug, my fellow graduate student, used his analysis as the basis for his master’s thesis, graduated, and left the university for a job in the real world while I continued my doctoral studies.
Several months later, I got a call from Doug’s thesis advisor (one of the professors we worked for). He asked if I could stop by his office. He had been poring over the final project report (with an attention to detail that only he could bring to the task) and had found a discrepancy – the N in a statistical table on page 30 of the report didn’t agree with the N in another statistical table on page 204. These tables were part of Doug’s calculations, but Doug was no longer there, so the professor wanted me to find out why the numbers differed. He handed me a 6-inch stack of computer printouts that Doug had generated and asked me to get back to him when I found the answer.
Many hours later, after poring over Doug’s statistical analysis programs in great detail, I found the problem. At one point in the program instructions that defined the target group, he had used the word “OR” rather than the word “AND.” All of his calculations were wrong because he had based them on the wrong group! I ended up having to do the calculations all over again and, of course, the results (which we had already published and presented to the project sponsors several months earlier) were wrong!
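To show how much a single “OR” can change a group, here is a minimal sketch in Python. It is not Doug’s actual program: the records and the two selection criteria (`in_program` and `completed_survey`) are made up for illustration, since I am not reproducing the real conditions here.

```python
# Hypothetical records; the field names and values are invented for illustration.
students = [
    {"id": 1, "in_program": True,  "completed_survey": True},
    {"id": 2, "in_program": True,  "completed_survey": False},
    {"id": 3, "in_program": False, "completed_survey": True},
    {"id": 4, "in_program": False, "completed_survey": False},
]

# Intended definition: a student must meet BOTH criteria.
intended = [s for s in students if s["in_program"] and s["completed_survey"]]

# Mistaken definition: a student who meets EITHER criterion is included.
mistaken = [s for s in students if s["in_program"] or s["completed_survey"]]

n_intended = len(intended)
n_mistaken = len(mistaken)
print("N with AND:", n_intended)
print("N with OR: ", n_mistaken)
```

With “OR,” the group picks up students who meet only one of the two conditions, so the N grows and every statistic computed from that group describes the wrong people.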
I learned several lessons from this experience that have served me well ever since:
1. You have to be very careful in working with statistics.
2. Even though team members have separate responsibilities, it is always good for them to check each other’s work – if I had reviewed Doug’s programming before we wrote the final report, I would probably have found the error and avoided the problem.
3. The crow we all had to eat in going back to the project sponsors with the new results didn’t taste very good.
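The kind of cross-check that would have caught this early can be very simple. The sketch below assumes that every table in a report is supposed to describe the same group, which is not always true; the function name and the sample sizes are hypothetical, chosen only to illustrate the idea.

```python
def sample_sizes_agree(table_ns):
    """Return True if every table reports the same N.

    table_ns maps a table label to the N it reports. This check assumes
    all of the tables are meant to describe the same group.
    """
    return len(set(table_ns.values())) <= 1


# Invented numbers for illustration; the real Ns are not given in this post.
reported = {"table on page 30": 500, "table on page 204": 650}

tables_agree = sample_sizes_agree(reported)
if not tables_agree:
    print("Sample sizes differ across tables:", reported)
```

A check like this does not say which table is right, only that someone needs to look before the report goes out.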