Feedback on my Bayesian Data Analysis class at Columbia

Posted on December 7, 2012 9:11 AM by Andrew

In one of the final Jitts, we asked the students how the course could be improved. Some of their suggestions would work, some would not. I’m putting all the suggestions below, interpolating my responses. (Overall, I think the course went well. Please remember that the remarks below are not course evaluations; they are answers to my specific question of how the course could be better. If we’d had a Jitt asking all the ways the course was good, you’d be seeing lots of positive remarks. But that wouldn’t be particularly useful or interesting.) The best thing about the course is that the kids worked hard each week on their homeworks.

OK, here are the comments and my replies:

Could have been better if we did less amount but more in detail.

I don’t know if this would’ve been possible. I wanted to get to the harder stuff (HMC, VB, nonparametric models) which required a certain amount of preparation. And, even so, there was not time for everything.

And also, needs solution for the homeworks.

Nope. I never hand out homework solutions. I much prefer for students to make corrections on their own homeworks. We discussed the solutions each week when we returned the graded papers to the students, but maybe we should have spent more time on that.

I think the class should have a lab session, in which we spend more time going over the homework and the R/stan code.

I agree! A weekly section would’ve been great. For some reason this rarely seems to be done at Columbia, and it’s always difficult to schedule these. But I’ll try again next time.

Maybe a couple of classes on Welcome to Stan.

Better to do in the (hypothetical) section meetings, I think; see comment immediately above.

More developed calculus on the board.

Maybe a little bit more, but I think that a better solution to this problem (students not following the math) would be to have some additional homeworks that are relatively easy, in which students are required to derive various formulas that are analogous to computations in the book. For example, working on the 8-schools-like algebra for a Poisson-gamma model.
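
To give a sense of what such an exercise might look like, here is a sketch of the simplest version of the Poisson-gamma algebra. The notation (counts y_j, rates θ_j, hyperparameters α and β with β as a rate parameter) is assumed here for illustration and is not taken from the course assignments; the hyperparameters are treated as known, which is only the first step of a full hierarchical analysis.

```latex
% Assumed model: y_j | \theta_j ~ Poisson(\theta_j), \theta_j | \alpha, \beta ~ Gamma(\alpha, \beta)
\begin{align*}
p(\theta_j \mid y_j, \alpha, \beta)
  &\propto \theta_j^{y_j} e^{-\theta_j} \cdot \theta_j^{\alpha - 1} e^{-\beta \theta_j}
   = \theta_j^{\alpha + y_j - 1} e^{-(\beta + 1)\theta_j}, \\
\theta_j \mid y_j, \alpha, \beta
  &\sim \mathrm{Gamma}(\alpha + y_j,\; \beta + 1), \\
\mathrm{E}(\theta_j \mid y_j, \alpha, \beta)
  &= \frac{\alpha + y_j}{\beta + 1}
   = \frac{\beta}{\beta + 1} \cdot \frac{\alpha}{\beta}
     + \frac{1}{\beta + 1} \cdot y_j .
\end{align*}
```

The last line shows the posterior mean as a weighted average of the prior mean α/β and the observed count y_j, which is the same kind of partial pooling that appears in the 8-schools example.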

How to choose different models to suit different real-world questions.

I thought we talked about that a lot! But maybe I didn’t offer enough general guidance. Also we didn’t have enough followup on the applied homework questions.

I hope to know more about the history and philosophical consideration of Bayes analysis. How Bayes conflict and interact with frequentists’ methods.

The revised Chapter 4 will help with this.

cannot review ppt in time because ppt is not uploaded in time. cannot write the detailed algorithms. cannot understand all the reading material and do not have time to understand them.

No! I will never post the slides before the class meetings. That’s a disaster, as it encourages students to read the slides rather than the book. You have to read the book, you can’t learn the material from the slides. If the book is too hard for you, maybe you should be taking an easier class.

It would have been better if we can have a chance to make up some missing homework questions.

Overall I was happy with students’ effort level on the homework assignments and would be reluctant to change what we did regarding due dates etc.

Although some questions are repeated later in the other problem sets, I sometimes missed some parts but didn’t have any chance or motivation to recover them.

I had time allocated at the end of class each week to discuss the upcoming homework. Often I didn’t get to that, though. In retrospect I should’ve skipped whatever was necessary during the lecture to make sure to spend a few minutes each week on this preparatory overview.

While the subject matter was very interesting, I would have preferred to have a few more derivations and examples in class. It felt like there was a disconnect between the lectures and the homework.

As noted above, I think the course could benefit from the addition of some more basic homeworks in which students learn the key derivations in each chapter.

Some homework solution or example of R-code were given, it would be better since although I finished homework, I want to know the proper answer or example for future study

I think the solution here is not to hand out homework solutions but to do a better job with the in-class homework review. Also if we had weekly sections, the T.A. could review the past homework more carefully.

The book draft is not quite clear on some of the materials in the last few lectures and understandable information online is hard to obtain, e.g. variational Bayes and Bayesian spline regression. But overall, I learned tons of stuff and it’s a great experience.

The final version of the book should be cleaner.

difference between bayesian method and non-bayesian method.

The new Chapter 4 has some of this. But we try not to spend too much time on this; the assumption is that if you’re taking a course on Bayesian data analysis, you want to learn this stuff for itself rather than as a comparison to anything else.

More about stan!

Stan is great but we already have so much on computation and we only have a fixed amount of time in the class.

Taking a closer look at why Metropolis and HMC work; but I guess we’ll do that in the computing course.

Again, it’s a tough tradeoff. I wanted to make sure all the students learned to program these methods but there’s no time for everything. Given the choice of only the theory or only the programming experience, I wanted to give the latter.
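
For readers who have not seen what "programming these methods" involves, here is a minimal sketch of random-walk Metropolis in Python. It is only an illustration: the course used R and Stan, and the function names, the standard normal target, the step size, and the burn-in length below are all assumptions chosen for this example, not the assignments themselves.

```python
import math
import random


def metropolis(log_density, theta0, n_iter, step, seed=0):
    """Random-walk Metropolis for a one-dimensional target (illustrative sketch)."""
    rng = random.Random(seed)
    theta = theta0
    log_p = log_density(theta)
    draws = []
    n_accepted = 0
    for _ in range(n_iter):
        # Symmetric normal proposal centered at the current value
        proposal = theta + rng.gauss(0.0, step)
        log_p_proposal = log_density(proposal)
        # u lies in (0, 1], so log(u) is always defined
        u = 1.0 - rng.random()
        # Accept with probability min(1, p(proposal) / p(theta))
        if math.log(u) < log_p_proposal - log_p:
            theta, log_p = proposal, log_p_proposal
            n_accepted += 1
        draws.append(theta)
    return draws, n_accepted / n_iter


def log_std_normal(theta):
    """Unnormalized log density of a standard normal (assumed toy target)."""
    return -0.5 * theta * theta


draws, accept_rate = metropolis(log_std_normal, theta0=0.0, n_iter=20000, step=2.5)
kept = draws[2000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
var = sum((d - mean) ** 2 for d in kept) / len(kept)
print(round(mean, 2), round(var, 2), round(accept_rate, 2))
```

The sample mean and variance of the retained draws should come out near 0 and 1 for this target. HMC replaces the random-walk proposal with one that uses the gradient of the log density, which is part of why writing both by hand makes the automation in Stan easier to appreciate.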

I think this course pretty much cover almost all the important and technical part in applied Bayesian analysis. (Probably it would be nice to have a short analytic works that we attempt to apply in our own field based on what we have learned so far.)

I did find that we did a better job teaching Bayes than teaching the principles of applied statistics more generally. There was an applied homework assignment each week but we never really had a chance to explore these problems in depth. Maybe it would be better to have fewer, more serious, applied assignments.

Time for digesting the things we learn

Now that the semester is over, you have lots of time to digest!

I think that the homeworks could have built upon each other much better. One simple way to do this is to suggest ways to design our functions for future use, so that by the end of the course we had a library of Bayesian tools.

Good idea. This could also be done in conjunction with weekly section meetings.

I felt like I was missing a lot of background, and I would have appreciated guidance about what I could do to gain the background knowledge that I felt was missing. I would also have appreciated greater help with R. It was easy to learn the basics, but very, very difficult to do more advanced things.

Again, this could happen in the section meetings.

Basic computational and regression methods.

A couple years ago I taught a two-semester applied stat PhD sequence out of ARM in the fall and BDA in the spring. I thought this made sense, but the first-year stat PhD students seemed to feel that ARM was too easy.

I don’t think preview a chapter and do the homeworks before that class is efficient.

Adding some easier homeworks each week to get students to understand the basic derivations, that would help.

nothing, just too much is covered

Yeah, I understand . . . but I would’ve felt awkward not covering topics such as VB and nonparametrics; these are not my area of expertise but they are important, especially going forward in the research direction.

I would have appreciated 3 things. First, more of a focus on selecting priors. Second, a focus on coding. The appendices on coding HMC and metropolis were incredibly helpful. Considering clean, correct code was never posted for any problem set, any errors I made over the semester kept snowballing and I was never able to learn the proper code. It is not feasible to ask detailed questions about code in class or in office hours. Finally, I would have appreciated more of a focus on hierarchical regressions. Chapter 5 was the most important to me in the course and was briefly considered. The chapter on Hierarchical regression models was completely skipped. There is far too much attention to computation and far too little paid to the substantive benefits and uses of bayesian analysis.

For the first part, I think the focus should be on constructing models, not selecting priors. I say “constructing” (rather than “selecting”) to denote an active process, and of course it is the whole model that must be built (or chosen), not just the prior distribution. I’ll need to make this point clearer next time. Maybe also it could be made clearer in the book. There’s a tendency in statistics teaching to just accept a data model without question and then go on to prior distributions and estimations from there, but that’s not right. It’s rare that a data model is known a priori.

Regarding the student’s other comments, I think we could do better by covering more of the details of coding during section meetings. Beyond that, I’m wary of talking too much about “the substantive benefits and uses of bayesian analysis.” You have to learn by doing.

21 thoughts on “Feedback on my Bayesian Data Analysis class at Columbia”

Bob S on December 7, 2012 at 12:39 pm said:

Any chance that this material makes its way to a MOOC like Coursera or EdX? There’s a distinct lack of Bayesian courses in the online world.
- Willy on December 7, 2012 at 4:31 pm said:
  
  Yes please! I tried going through parts of your books on my own this summer, and as a non-statistician with a technical background, I found it difficult. In particular, not having answers to the questions with each chapter (and not having a TA or prof to grade them) I couldn’t tell if I was doing things correctly or not in many cases. While I sympathize with your concerns with releasing answers, not having them makes self study very challenging. Perhaps you could give answers to a subset of the questions? Or point people to another set of exercises for which answers are available? Your books are still great, but it would be helpful if they were more self-contained, or if there were an online course to help fill in the details.
  - Andrew on December 7, 2012 at 4:53 pm said:
    
    Willy:
    
    A long time ago, I wrote solutions for about 50 of the exercises. They’re in a nice pdf document you can find on the website for the book.
    - Willy on December 8, 2012 at 12:34 pm said:
      
      Thank you!
- andreas on December 8, 2012 at 5:29 pm said:
  
  I second the call for a MOOC. There are benefits to both students and the teachers. In a lot of these MOOCs the teachers get to present their material in a way they wouldn’t be able to do in a traditional university. For example “Neural Networks for Machine Learning” essentially explained neural networks from the personal (and very enlightening) perspective of Geoffrey Hinton, including even unpublished results.
  
  But I know that these courses are a very large investment in terms of time and effort. Bayesian inference shines through on several of them, but the “Bayesian Data Analysis” perspective hasn’t been explored yet.
  - Andrew on December 8, 2012 at 8:18 pm said:
    
    Andreas:
    
    I’d be happy to do a mook, but I’d need a collaborator. I don’t think I could put such a course together by myself.
  - bcnc on December 10, 2012 at 12:28 pm said:
    
    I was in the Coursera Neural Networks course and thought of BDA as well. I don’t know how well this would work given Prof. Gelman’s preference for not giving out HW solutions. I realize that NN didn’t have solutions either (just very large hints) but the TAs invested a lot of time on the discussion forums trying to guide us through them. I am also curious how the programming assignments for BDA would go given how up in arms some of us were over one of the Neural Networks programming hw. One of the things that MOOC participants have come to expect is instant feedback on hw – whether it is incorrect and why and what is the correct answer.
    
    But I agree that Bayesian statistics is one area that is obviously (to me) absent and all I’m (superficially) learning about Bayesian statistics is the computer science perspective of it – Naive Bayes classifiers, Bayesian Networks, etc.

Wesley on December 7, 2012 at 12:52 pm said:

When you talk about “the book”, are you updating BDA, or writing a new book on the topic?

Antonio Pedro on December 7, 2012 at 1:42 pm said:

Just to provide some context: do you have a syllabus or home page for this class?
- Andrew on December 8, 2012 at 8:18 pm said:
  
  No syllabus, I just go through BDA3, section by section.

JSB on December 7, 2012 at 5:19 pm said:

As someone thinking of using Stan for future research (ecology), I am interested how Stan went over with the students.
- Andrew on December 8, 2012 at 8:19 pm said:
  
  They liked it. I also had them program Metropolis and HMC on their own, which made them appreciate Stan that much more.
  - jsb on December 10, 2012 at 12:27 am said:
    
    That is encouraging. I can also provide feedback from my STAN experiences after I use the program in research and course development.

Basil on December 8, 2012 at 12:31 am said:

It’s great to see you reading and reflecting on your evaluations. I think so many teachers just throw these out the window believing that student opinions don’t matter. I like the idea of posting your responses to the evaluations on the blog also. Some of your students may not like your responses, but I believe they feel like their voice was heard. I recently completed an action research project in my algebra II class (Common Core statistics standards) by implementing Garfield and Ben-Zvi’s statistical reasoning learning environment and “The Pit and Pendulum”. I hoped to see some increase in the affective domain after my study, but there was no difference in responses. I’m hoping to present my findings from implementing the study at a conference in North Carolina next year.

Anonymous on December 8, 2012 at 9:29 am said:

Rather off-topic but is “homeworks” becoming standard usage in the USA? I have never heard of homework in the plural.
- Willy on December 8, 2012 at 12:42 pm said:
  
  My understanding is that the plural would be used for multiple homework assignments and the singular is used for individual assignments. So: “The homework is due Tuesday” but: “Homeworks for this course are due every Tuesday.” The singular would also be correct here though, I think. It may be a spoken/blog post vs. NYT situation. My spell check doesn’t approve of “homeworks” for what it’s worth.
  - jrkrideau on December 11, 2012 at 9:10 am said:
    
    Thanks. It seems to be a USA usage perhaps. My informal poll of 6-7 university students and a wandering prof here in Canada received blank looks when I suggested a plural.

Lord on December 8, 2012 at 5:18 pm said:

You could circumvent the limitation on class time by putting some portion of your class online and devoting some of the class towards discussing it, problems, and homework, gradually shifting towards the Khan model.
- Andrew on December 8, 2012 at 8:20 pm said:
  
  Lord:
  
  I do much of this already, except that the class is in the book rather than online. The students all want to have copies of my slides but I don’t want to give them that. Reading the slides is no substitute for working your way through the book.

Riley on December 10, 2012 at 8:09 am said:

Kind of want to see how the Harvard students react to your course…

Christian Kleineidam on December 10, 2012 at 9:18 am said:

>Nope. I never hand out homework solutions. I much prefer for students to make corrections on their own homeworks.

How good is your approach at encouraging students to make corrections on their own homework? How many of them do it?

When you do programming, yes, you learn by doing. The solution of someone who just started with R is, however, nearly always suboptimal.
Having a written solution that demonstrates the proper way to solve the problem in R is highly valuable.
