Selecting people you believe are qualified reviewers may increase the quality of your reviews, and ultimately of your paper. If a qualified reviewer with good intentions disagrees with your paper, then listen! My first cross-cultural paper was reviewed by Harry Triandis, the great man in the field. He was very instructive, told me what books to read. He reviewed the revised paper, now a confirmatory study, and published it without revision. I and my colleagues learned a lot, and went on to publish lots of great cross-cultural research.

The point is not to get an easy review, but to get a professional, competent review. People who hate and fear new methods, and know nothing about them but diss them anyway, are an impediment to progress… Washington isn’t the only swamp making life irrational…

Taking a prescribed medication is among the top causes of mortality in the US. ALL safety analysis is *mandated* to be conducted by regression models. More and more really simple demonstrations are coming on-line showing that regression models are not at all accurate (here is a little article on regression; logistic is no better. There are many examples in indexed journals as well as in ODA journals: type "logistic" in the search box on the journal home page).

a) You can select from many data sets already analyzed

I have already published many, many data sets in the ODA journal, so that they can be re-analyzed. These are free to anyone in the Universe. If you want one of mine, please select and use it–the ODA analysis is already published.

If you want me to select, use the article comparing scores on MMPI taxons for many different samples. Or, the data on inter-rater reliability of plant health. I recall that I found the results of the analyses interesting.

If you upload a dataset and tell us the results that you already have, I will plug it into xgboost and report back… it should take a couple of minutes (maybe a bit longer depending on what format the data is in, etc). BTW, if you are familiar with R or python it should take you no more than a couple of hours to get xgboost going.

It occurred to me that, if you wish to collaborate on a comparison of ODA and other methods, perhaps you would be interested in crafting a follow-up to a recent article: https://www.ncbi.nlm.nih.gov/pubmed/26805004

In all seriousness, I would put Bayesian data analysis (as expressed in our book) as a paradigm that is distinct from all the paradigms you listed just there, and at least as important as most.

I don’t use my personal finite time doing anything other than ODA. In gestalt I am interested in finding the model that presents the best combination of predictive accuracy (normed against chance) and parsimony, as indexed by the D statistic. However, models of different complexity may be appropriate based upon statistical power (an exclusion criterion) and theoretical clarity or pragmatic significance (inclusion criteria). I know that the best any present model can do is explicitly identified, so there is no guesswork. That is,

1. If accuracy is defined as in the ODA paradigm, then in training analysis ODA will find the best model.

2. If accuracy is defined as in the ODA paradigm, and (as in novometrics) if one is only interested in validity performance, then CTA software allows the operator to set either of two criteria: (a) find the best model that has identical training and jackknife (or any other validity criterion) performance; or (b) find the best model that has highest jackknife (or whatever) performance with experimentwise (or whatever) p<0.05 (or whatever). The software allows operator control of many constraints, there are ODA articles on this, and of course the book synthesizes the matter…
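For readers unfamiliar with "accuracy normed against chance", here is a minimal sketch. It assumes the ESS (effect strength for sensitivity) index as it is usually described in the ODA literature: mean per-class sensitivity rescaled so that 0 is chance and 100 is perfect classification. The thread itself does not spell out the formula, so treat the function name `ess` and the made-up labels below as illustrative assumptions, not as the ODA software's implementation.

```python
from typing import Sequence


def ess(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Chance-normed accuracy: 0 = chance, 100 = perfect (assumed ESS definition)."""
    classes = sorted(set(actual))
    sensitivities = []
    for c in classes:
        # Predictions made for the observations that truly belong to class c
        preds_for_c = [p for a, p in zip(actual, predicted) if a == c]
        sensitivities.append(100.0 * sum(p == c for p in preds_for_c) / len(preds_for_c))
    mean_sensitivity = sum(sensitivities) / len(sensitivities)
    chance = 100.0 / len(classes)  # 50% for a binary class variable
    return 100.0 * (mean_sensitivity - chance) / (100.0 - chance)


# Made-up example: 4 of 5 class-1 cases and 3 of 3 class-0 cases classified correctly
actual = [1, 1, 1, 1, 1, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0]
print(ess(actual, predicted))  # mean sensitivity 90%, so ESS = 80.0
```

Because each class contributes equally to the mean sensitivity, a model cannot score well simply by always predicting the larger class.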

I look in books or articles for data sets. When I find a data set, sometimes it is analyzed using XYZ method. So, I summarize the findings reported in the article using XYZ, and then re-analyze the data using ODA. If you have such a data set, we certainly could talk about a collaborative paper. I do this for fun, and to learn more about my trade, other methods, and applied results. Right now I am a bit swamped, which is why I must return to work.

If I find a data set that was not analyzed, I only use ODA to take a look.

I can't do everything. I know, I tried, I failed…

Mike, the entire book is about correct fitting, the entire paradigm! I can’t re-write the book here. Perhaps a brief response will satisfy your request, I hope so. :-)

The final Axiom of novometrics mandates replication/validation in order to estimate predictive accuracy–training results are not used as estimates. The most common validation methods are various jackknife, K-fold, Monte Carlo, bootstrap, hold-out, and multi-sample methods (AFAIK, only ODA software performs many of these methods for ALL statistical analyses). The novometric D statistic norms model quality as a function of accuracy and parsimony (I cited an article on this in another response in this thread–IMO it may address all of your concern in two pages–Theoretical aspects of the D statistic).

These are described and used throughout the book. These validation methods are also discussed in a forest of other books and a sea of other articles. It is easiest to read the book, it covers all the bases.

Training is for practice–validation is for real…
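To make the training-versus-validation distinction concrete, here is a minimal sketch of a leave-one-out jackknife around a single-cutpoint rule. This is not the ODA software; the function names, the rule direction ("predict 1 if x <= cutpoint"), and the tiny data set are all made up for illustration.

```python
from typing import List


def fit_cutpoint(xs: List[float], ys: List[int]) -> float:
    """Training step: the cutpoint c with the most correct calls for
    the rule 'predict 1 if x <= c, else predict 0'."""
    values = sorted(set(xs))
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])]
    # max() keeps the first (lowest) cutpoint when several tie
    return max(cuts, key=lambda c: sum((x <= c) == (y == 1) for x, y in zip(xs, ys)))


def training_accuracy(xs: List[float], ys: List[int]) -> float:
    """Percent correct when the model is fit and scored on the same data."""
    cut = fit_cutpoint(xs, ys)
    return 100.0 * sum((x <= cut) == (y == 1) for x, y in zip(xs, ys)) / len(xs)


def jackknife_accuracy(xs: List[float], ys: List[int]) -> float:
    """Percent correct when each observation is classified by a model
    that was fit without that observation (leave-one-out)."""
    hits = 0
    for i in range(len(xs)):
        cut = fit_cutpoint(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += (xs[i] <= cut) == (ys[i] == 1)
    return 100.0 * hits / len(xs)


# Made-up data: a score and a binary class for eight observations
xs = [12, 18, 20, 24, 27, 30, 35, 40]
ys = [1, 1, 1, 1, 0, 0, 1, 0]
print(training_accuracy(xs, ys), jackknife_accuracy(xs, ys))
```

With these made-up values the training accuracy is 87.5% while the jackknife accuracy drops to 62.5%, which is the gap that validation is meant to expose.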

Clearly computers are needed to elucidate the exact distribution for non-directional analysis–but all the computers in the world couldn’t solve the problem for even a moderate N:

Yarnold, P.R., & Soltysik, R.C. (1991). Theoretical distributions of optima for univariate discrimination of random data. Decision Sciences, 22, 739-752.

However, for directional hypotheses there is a closed-form solution:

Soltysik, R.C., & Yarnold, P.R. (1994). Univariable optimal discriminant analysis: One-tailed hypotheses. Educational and Psychological Measurement, 54, 646-653.

Carmony, L., Yarnold, P.R., & Naeymi-Rad, F. (1998). One-tailed Type I error rates for balanced two-category UniODA with a random ordered attribute. Annals of Operations Research, 74, 223-238.

a. Select an example involving binary data that was analyzed by ODA.

b. Construct the data set.

c. Do whatever you wish. The ODA analysis was already done.

Please describe validation procedures for ODA/CTA.

We cannot read a whole book to understand the validation procedures of your method.

To test the hypothesis that one’s manuscripts with fewer pages are more likely to be published…

“To conduct an optimal data analysis, the ODA software would begin by arranging all of the manuscripts (i.e., observations) along a continuum formed by page length, with each manuscript represented by a 0 or 1 depending on its publication status. ODA would then examine all possible cutpoints along the continuum (i.e., midpoints between two successive observations that have different values on the class variable) and would separately evaluate the classification performance achieved across all observations, using each cutpoint that conforms to the directional hypothesis (i.e., for which the lower score on the page-length continuum is associated with acceptance and the higher score on the page-length continuum is associated with rejection). The final ODA model would consist of the cutpoint that matches the directional hypothesis and produces the greatest overall percentage of accurate predictions across both categories of the class variable. For example, the optimal model might be, ‘If page length ≤ 25.5, then predict the manuscript is accepted for publication; otherwise, predict the manuscript is rejected.’ This particular model would be considered optimal because no other cutpoint consistent with the directional hypothesis could achieve a greater overall percentage of classification accuracy with these data.”
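The procedure in the quoted passage can be sketched in a few lines. This is only an illustration under stated assumptions, not the ODA software: the function name and the page counts are made up, and for simplicity the sketch tries every midpoint between adjacent distinct page lengths, a superset of the class-change midpoints the passage describes that still contains the best one.

```python
from typing import List, Tuple


def best_directional_cutpoint(pages: List[float], accepted: List[int]) -> Tuple[float, float]:
    """Return (cutpoint, percent_correct) for the directional rule
    'predict accepted (1) if pages <= cutpoint, else rejected (0)'."""
    if len(set(pages)) < 2:
        raise ValueError("need at least two distinct page lengths")
    values = sorted(set(pages))
    best_cut, best_correct = values[0], -1
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2  # midpoint between successive scores
        # Count correct calls across ALL observations for this cutpoint
        correct = sum((p <= cut) == (a == 1) for p, a in zip(pages, accepted))
        if correct > best_correct:  # strict '>' keeps the lowest cutpoint on ties
            best_cut, best_correct = cut, correct
    return best_cut, 100.0 * best_correct / len(pages)


# Made-up manuscripts: page length and publication status (1 = accepted)
pages = [12, 18, 20, 24, 27, 30, 35, 40]
accepted = [1, 1, 1, 1, 0, 0, 1, 0]
print(best_directional_cutpoint(pages, accepted))  # (25.5, 87.5)
```

Only cutpoints consistent with the hypothesized direction (shorter means accepted) are scored, because the prediction rule itself is fixed to that direction. A non-directional analysis would also have to score the reversed rule at every cutpoint.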

We ran our first-ever *large* experimental MultiODA on a CRAY-2 (NCSA, Urbana). Exponential in N, the problem had a binary class (dependent) measure and three ordered attributes (independent variables) for N=39 (thirty nine), and it red-lighted the CPU forcing a cold boot.

Years later we were able to solve MultiODA problems for uniform random data involving five attributes and N=1,000,000 in several CPU seconds using an IBM3060-400VF supercomputer (UI, Chicago).

Today we get better nonlinear answers to problems involving four attributes and N=3,000,000 in CPU seconds using a 64-bit PC.

