I received the following by email:
I had a look at the dataset on speed dating you put online, and I found some big inconsistencies. Since a lot of people are using it, I hope this can help to fix them (or hopefully I made a mistake in interpreting the dataset).
Here are the problems I found.
1. Field dec is not consistent at all (boolean for a big chunk of the dataset, in the range 1-10 later). Should this be the field of the decision and dec_o be the decision of the partner? Should dec and match be the same thing? I tried to use match instead of dec, but then I get the following problem.
2. I tried to see if matches are consistent (if my partner decided yes it should mean that in his record I see a match): if I look at the record with iid x and pid y, dec_o=1 should mean that in the record with iid y and pid x I should see a match (in match or dec). This is not in general true. So dec_o is not consistent with the matches.
3. Same thing for like and attr_o (or attr and attr_o)
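The reciprocity check described in item 2 can be sketched in a few lines of Python. This is only an illustration, not the correspondent's actual code: the field names iid, pid, dec, and dec_o come from the email, but the reading that dec is a person's own decision and dec_o is the partner's decision is the correspondent's guess, and the function name and toy rows are invented here.

```python
def find_reciprocity_mismatches(rows):
    """Return (iid, pid) pairs whose dec_o disagrees with the partner's own dec.

    Assumes (as the email does, without confirmation) that dec is the
    decision of person iid about person pid, and dec_o is the decision
    of pid about iid.
    """
    # Index each record by (iid, pid) so the mirrored record is a direct lookup.
    by_pair = {(r["iid"], r["pid"]): r for r in rows}

    mismatches = []
    for (iid, pid), row in by_pair.items():
        mirror = by_pair.get((pid, iid))
        if mirror is None:
            continue  # no mirrored record to compare against
        if row["dec_o"] != mirror["dec"]:
            mismatches.append((iid, pid))
    return mismatches


# Toy rows, invented for illustration only (not taken from the real dataset).
rows = [
    {"iid": 1, "pid": 2, "dec": 1, "dec_o": 1},
    {"iid": 2, "pid": 1, "dec": 1, "dec_o": 1},
    {"iid": 1, "pid": 3, "dec": 0, "dec_o": 1},  # claims person 3 said yes
    {"iid": 3, "pid": 1, "dec": 0, "dec_o": 0},  # but person 3 recorded no
]
mismatches = find_reciprocity_mismatches(rows)
```

On the toy rows, the only pair flagged is the one where the recorded partner decision contradicts the partner's own record. The same lookup would apply to item 3 by comparing attr_o against the mirrored record's attr.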
I sent this to Ray Fisman, the source of the data, who replied:
Saurabh Bhargava used the underlying files and has posted data in a replication file for a study in the Review of Economics and Statistics.
I’m glad somebody put those data in the freezer.