Bill Wilkerson writes:

I am a semi-regular reader of your blog and I teach our applied research course [Dept of Political Science, College at Oneonta]. I have been teaching for 16 years and am generally pretty good at it, but I still feel like I am new at the course although I have taught it four times. I just don’t have a feel for what my students need to get out of the course. My colleagues are not quantitative and few of our students go on to grad school in political science or public policy. (Most that continue their formal educations go to law school or go for an MPA.) I know what they don’t want: anything with numbers. This is true despite the fact that we require stats 101. Numbers seem to be part of the deal with a social science methods course. :-) And it is a comfort and knowledge with numbers that will set them apart from many of their colleagues in the work force.

Anyway, I am curious about your recent discussion of graphs versus tables. Should I be teaching my undergraduate students to graph this data? If so what tool should they use? I used STATA when writing my diss, but have settled on SPSS at Oneonta as that is what we have a license for. I have never used R in any serious way. I have even toyed with using Excel as that is what they are likely to use in an office setting. Thoughts?

My reply: That’s a good question. I have the luxury of being associated with particular topics that our grad students want to learn: multilevel models, statistical graphics, Bayesian statistics, statistical consulting, sample surveys, decision analysis, and the teaching of statistics. I did teach intro stat to the undergrads for several years (which motivated our Teaching Statistics book), but I became so dissatisfied with what I was covering that I’ve taken a break from teaching it until I can redo the entire course–new textbook, new workbook for the students and T.A.’s, etc.

I think you might be in a slightly better position: at least your students are all from the political science department and have some common interests. I have no firm views on your software question. Stata is an excellent tool in social science research, but Dick De Veaux said he’s had lots of success teaching using JMP-In, which is a version of SAS (fixed; thanks to the commenters). But, yeah, I think they should have to make graphs by hand if necessary. I hate, hate, hate the Excel graphs. More generally, when considering “what students need to get out of the course,” maybe you could survey some alumni. This could be an excellent class project, and something the students could be motivated to do! It would also introduce them to some qualitative research techniques. One of my difficulties when setting up our Quantitative Methods in Social Sciences program at Columbia several years ago is that the students just wanted to download data from the internet rather than use their personal expertise (whatever it was, for each student) to learn something new. (It’s sort of like that anecdote about the creative writing teacher who asks the students to write about what they know, and then gets lots of sub-Tarantino screenplays.)

P.S. That last remark is not meant to mock students, rather to indicate the challenge that anyone–student or practitioner–has in trying to connect coursework to real life.

Pencil (or pen) and graph paper are good tools, as Andrew noted. You might check out Gnuplot as another alternative, Bill W. It lacks the statistical power of SPSS, STATA, or R, but it's pretty easy to use, it produces nice graphs, and it runs on all major platforms.

J can be nice, too (it's what I tend to use), but the learning curve is a bit steeper.

I've had good success in using Excel. They can plot the data easily, they can extract the data using IF statements, etc. And they can "program" by turning on the macro function, which gets the more advanced students into playing with the data.

For example, I used to have students put their transcripts on a spreadsheet: columns for semester, department, course no., credits, and grade. Then they write a multipart IF statement to change letter grades into numbers. [I also show them that they can use the CODE() function to get the letter's character code, from which the number follows directly, and the UPPER() function to make sure that they are using capital letters.]

Then they multiply credit hours by the numeric grade and compute the credit-weighted average. I show them how to determine their overall average and their average by semester, and then give them homework: compute their GPA in their major, their GPA in upper-division courses, and plot their GPA vs. semester.
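For readers who want to try the same exercise outside a spreadsheet, here is a minimal Python sketch of it. It assumes a plain 4-point scale with no plus/minus grades, and the transcript rows are made up; the names (`TRANSCRIPT`, `GRADE_POINTS`, `gpa`, and so on) are illustrative and not part of the original assignment.

```python
# Hypothetical transcript rows: (semester, department, course_no, credits, grade)
TRANSCRIPT = [
    (1, "POLS", 101, 3, "a"),   # lowercase on purpose, to show the upper() step
    (1, "MATH", 110, 4, "B"),
    (2, "POLS", 250, 3, "B"),
    (2, "HIST", 120, 3, "C"),
    (3, "POLS", 310, 3, "A"),
]

# Assumed 4-point scale; this lookup plays the role of the multipart IF statement.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def gpa(rows):
    """Credit-weighted average of grade points, or None if there are no credits."""
    total_credits = sum(r[3] for r in rows)
    if total_credits == 0:
        return None
    quality_points = sum(r[3] * GRADE_POINTS[r[4].upper()] for r in rows)
    return quality_points / total_credits


def gpa_by_semester(rows):
    """Map each semester number to the GPA earned in that semester."""
    semesters = sorted({r[0] for r in rows})
    return {s: gpa([r for r in rows if r[0] == s]) for s in semesters}


def gpa_in_major(rows, dept):
    """GPA restricted to courses in one department."""
    return gpa([r for r in rows if r[1] == dept])


def gpa_upper_division(rows, cutoff=300):
    """GPA restricted to courses numbered at or above the cutoff (assumed 300)."""
    return gpa([r for r in rows if r[2] >= cutoff])


if __name__ == "__main__":
    print("overall:", gpa(TRANSCRIPT))
    print("by semester:", gpa_by_semester(TRANSCRIPT))
    print("in major:", gpa_in_major(TRANSCRIPT, "POLS"))
    print("upper division:", gpa_upper_division(TRANSCRIPT))
```

The dictionary returned by `gpa_by_semester` holds the points for the GPA vs. semester plot, whether drawn with software or on graph paper.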

This takes about 2 class periods and gets them working with numbers that they are very interested in, which paves the way for working on substantive data.

Of course, it won't work for all students, but what does?

Mike Maltz

JMP-IN is a wonderful point-and-click stripped down version of SAS. It is perfect for students and not expensive. I would use this.

Minor point of the "who cares" sort: the software probably is JMP – in, and it's from SAS, not SPSS. It is appropriate only for the least sophisticated undergrads who prefer a push button approach.

Mike, if you have them use spreadsheets, do you point them to the research on how to use spreadsheets well? I've linked to some of those reports and suggest what I'm using instead of most spreadsheets in http://www.facilitatedsystems.com/weblog/2006/03/… and the articles it links to.

You might find that J would treat the assignments you described quite easily and be able to help them better as they want to reason with more complex data-driven issues. (I used to be a heavier spreadsheet user until I saw an organization get an unfavorable surprise from one of the types of errors described in those reports.)

Bill, I'm now retired from teaching and no longer torturing students. I took a look at the URLs you referred me to and agree that there are perils to using spreadsheets in the ways they point out. But one of the things I always did was to get the students to graph the data — always. This obviates many of the problems cited.

Right now I'm engaged in cleaning really lousy data — monthly time series of 7 crime types for 1960-2004 for 18,000 law enforcement agencies. I'm visually inspecting each time series, and have written a bunch of VBA macros to facilitate the chore. Zero crime in a month is the real problem: is it a true zero, or is it made up for by reporting an aggregated figure in a subsequent month, or is it a (truly) missing datum? I don't know if J can do the job, but Excel seems to excel at it.
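The zero-month question can be partly triaged in code before looking at the plots. The Python sketch below is a made-up heuristic for illustration only, not the VBA macros mentioned above: it flags a zero as possibly aggregated when the next nonzero month is well above the agency's typical nonzero month. The function name `classify_zeros` and the `spike_ratio` threshold are assumptions, and the example series is invented.

```python
from statistics import median


def classify_zeros(counts, spike_ratio=1.8):
    """Label each zero month in a series of nonnegative monthly counts.

    Heuristic (an assumption, to be tuned or replaced):
      - "possibly aggregated" if the next nonzero month is at least
        spike_ratio times the median of all nonzero months;
      - "true zero or missing" otherwise, which still needs a human look.
    Returns a dict mapping month index to label.
    """
    nonzero = [c for c in counts if c > 0]
    if not nonzero:
        # Nothing to compare against, so every zero stays ambiguous.
        return {i: "true zero or missing" for i, c in enumerate(counts) if c == 0}

    typical = median(nonzero)
    labels = {}
    for i, c in enumerate(counts):
        if c != 0:
            continue
        # First nonzero count after this month, if any.
        following = next((x for x in counts[i + 1:] if x > 0), None)
        if following is not None and following >= spike_ratio * typical:
            labels[i] = "possibly aggregated"
        else:
            labels[i] = "true zero or missing"
    return labels


if __name__ == "__main__":
    series = [12, 10, 0, 23, 11, 0, 9, 13]  # invented monthly counts
    print(classify_zeros(series))
```

A rule like this only narrows down which months to inspect; it cannot tell a true zero from a missing report, so graphing each series is still needed.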

Mike

Mike, you had to get that last pun in, didn't you? :-)

As for J, I'm extremely impressed with what it can do and how easy it can be. I've pretty much given up on spreadsheets in favor of J, and I've still got lots of J skill to acquire. http://jsoftware.com/jwiki is the front page of the J Wiki, where you can begin to explore what it can do. Or simply download and install it, and start working through the J Primer (available in the J help command or online at http://www.jsoftware.com/help/index.htm by clicking on "Pri" at the top of the screen).

If you try J, think of it more as learning to speak a new language than learning a new programming language; that approach seems to pay off wonderfully.

I do hope R Gordon didn't say JMP was unfit for real work.

I've seen elegant implementations of very complex realtime QA/QC frameworks in it that I'd never attempt in another language, and that were the backbone of the process line of a multi-billion-dollar company.

But it's unsophisticated because… why? Because it doesn't make your ears bleed?! Pshaw. It has deep complexity and handles a huge dataspace trivially (think 'all defect reports for every process and wafer in several fabs').

RGordon needs some perspective. Sometimes stuff is easier because it is elegantly designed, not because it's a toy.