Secret Data Encore

My post "Secret data" on replication provoked a lot of comment and emails, more reflection, and some additional links.

This isn't about rules

Many of my correspondents missed my main point -- I am not advocating more and tighter rules by journals! This is not about what you are "allowed to do," how to "get published" and so forth.

In fact, this extra rumination points me even more strongly to the view that rules and censorship by themselves will not work. How to make research transparent, replicable, extendable, and so forth varies by the kind of work and the kind of data, and is subject like everything else to creativity and technical improvement.  Most of all, it will not work if nobody cares; if nobody takes the kind of actions in the bullet points of my last post, and it's just an issue about rules at journals. Already (more below), rules are not that well followed.

This isn't just about "replication."

"Replication" is much too narrow a word. Yes, many papers have not documented transparently what they actually did, so that even armed with the data it's hard to produce the same numbers. Other papers are based on secret data, the problem with which I started.

But in the end, most important results are not simply due to outright errors in data or coding. (I hope!)

The important issue is whether small changes in instruments, controls, data sample, measurement error handling, and so forth produce different results, whether results hold out of sample, or whether collecting or recoding data produces the same conclusions. "Robustness" is a better overall descriptor for the problem that many of us suspect pervades empirical economic research.

You need replicability in order to evaluate robustness -- if you get a different result than the original authors', it's essential to be able to track down how the original authors got their result. But the real issue is that much larger one.
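
To make the robustness point concrete, here is a minimal Python sketch on simulated data; nothing in it comes from any paper discussed in this post. The variable names, the data-generating process, and the helper functions are all illustrative assumptions. A hidden factor z drives both x and y, so x has no true effect on y, yet the coefficient on x looks sizable until z is added as a control.

```python
import random


def demean(v):
    """Subtract the sample mean from every element."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]


def slope(x, y):
    """OLS slope from regressing y on x and a constant."""
    xd, yd = demean(x), demean(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


def residuals(y, z):
    """Residuals from regressing y on z and a constant."""
    b = slope(z, y)
    yd, zd = demean(y), demean(z)
    return [a - b * c for a, c in zip(yd, zd)]


def slope_with_control(x, y, z):
    """Coefficient on x when y is regressed on x, z, and a constant.

    Uses the Frisch-Waugh-Lovell result: partial z out of both x and y,
    then regress residual on residual.
    """
    return slope(residuals(x, z), residuals(y, z))


def simulate(n, seed=0):
    """Hypothetical data: z drives both x and y; x has no effect on y."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi + rng.gauss(0.0, 1.0) for zi in z]
    y = [zi + rng.gauss(0.0, 1.0) for zi in z]
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate(2000, seed=1)
    print("coefficient on x, no control:  ", round(slope(x, y), 3))
    print("coefficient on x, controlling z:", round(slope_with_control(x, y, z), 3))
```

A replicator who can rerun the original code can make exactly this kind of one-line change in controls and see whether the headline coefficient survives.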

The excellent replication wiki (many good links) quotes Daniel Hamermesh on this difference between "narrow" and "wide" replication:
Narrow, or pure, replication means first checking the submitted data against the primary sources (when applicable) for consistency and accuracy. Second the tables and charts are replicated using the procedures described in the empirical article. The aim is to confirm the accuracy of published results given the data and analytical procedures that the authors write to have used. 
Replication in a wide sense is to consider the empirical finding of the original paper by using either new data from other time periods or regions, or by using new methods, e.g., other specifications. Studies with major extensions, new data or new empirical methods are often called reproductions.
But the more important robustness question is more controversial. The original authors can complain they don't like the replicator's choice of instruments, or procedures. So "replication," which sounds straightforward, quickly turns into controversies.

Michael Clemens writes about the issue in a blog post here, noting
...Again and again, the original authors have protested that the critique of their work got different results by construction, not because anything was objectively wrong about the original work. (See Berkeley’s Ted Miguel et al. [...] secret data post would be read as criticism of people who do large-data work, proprietary-data work, or work with government agencies that cannot currently be shared.  The internet is pretty snarky, so it's worth stating explicitly that is not my intent or my view.

Quite the opposite. I am a huge fan of the pioneering work exploiting new data sets. If these pioneers had not found dramatic results and possibilities with new data, it would not matter whether we can replicate, check or extend those results.

It is only now, that the pioneers have shown the way, that we know how important the work can be, that it becomes vital to rethink how we do this kind of work going forward.

The special problems of confidential government data

The government has a lot of great data -- IRS and census for microeconomics; SEC, CFTC, Fed, financial product safety commission in finance. And there are obvious reasons why so far it has not been easily shared.

Journal policies allow exceptions for such data. So only a fundamental demand from the rest of us for transparency can bring about changes. And it has begun to do so.

In addition to the suggestions in the last post, more and more people are going through the vetting to use the data. That leaves open the possibility that a full replication machine could be stored on site, ready for a replicator with proper access to push a button. Commercial data vendors could allow similar "free" replication, controlling directly how replicators use the data.

Technological solutions are on the way too.  "Differential privacy" is an example of a technology that allows results to be replicated without compromising the privacy of the data. Leapyear.io is an example of companies selling this kind of technology. We are not alone, as there is a strong commercial demand for this kind of data. (Medical data for example.)
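
As a rough illustration of how differential privacy works, the Python sketch below releases a noisy mean using the Laplace mechanism, a standard textbook construction. It is an assumption-laden toy, not a description of what Leapyear.io or any agency actually does: the function names, the clipping bounds, and the privacy parameter epsilon are all hypothetical, the number of records is treated as public, and naive floating-point noise like this is known to be unsafe for real deployments.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean `scale`
    has a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with epsilon-differential privacy.

    Each value is clipped to [lower, upper], so replacing one record can
    move the mean by at most (upper - lower) / n. That bound is the
    sensitivity, and the noise scale is sensitivity / epsilon.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical confidential incomes, in thousands of dollars.
    incomes = [rng.uniform(20.0, 150.0) for _ in range(5000)]
    for eps in (0.1, 1.0, 10.0):
        released = private_mean(incomes, 0.0, 200.0, eps, rng)
        print(f"epsilon={eps}: released mean = {released:.3f}")
```

Smaller epsilon means more noise and stronger privacy. A replicator sees only the noisy answer, never the underlying records.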

Other institutions: journals, replication journals, websites

There is some debate whether checking "replication" should count as new research, and I argued if we want replication we need to value it. The larger robustness question certainly is "new" research. Showing that X's result does not hold out of sample, is sensitive to the precise choice of instruments and controls, and so forth, is genuine, publishable, follow-on research.

I originally opined that replications should be published by the original journal to give the best incentives. That means an AER replication "counts" as an AER publication.

But with the idea that robustness is the wider issue, I am less inclined to this view. This broader robustness or reexamination is genuine new research, and there is a continuum between replication and the normal business of examining the basic idea of a model with new data and also some new methods. Each paper on the permanent income hypothesis is not a "replication" of Friedman! We don't want to only value as "new" research that which uses new methods -- then we become dry methodologists, not fact-oriented economists. And once a paper goes beyond pointing out simple mistakes, to questioning specification, a question which itself can be rebutted, it's beyond the responsibility of the original journal.

Ivo Welch argues that a third of each journal should be devoted to replication and critique.  The Critical Finance Review, which he edits, asks for replication papers.  The Journal of Applied Econometrics has a replication section, and now invites replications of papers in many other journals. Where journals fear to tread, other institutions step in. The replication network is one interesting new resource.

Faculties

A correspondent suggests an important additional bullet point for the "what can we do" list

  • Encourage your faculty to adopt a replicability policy as part of its standards of conduct, and as part of its standards for internal and external promotions. 

The precise wording of such standards should be fairly loose. The important thing is to send a message. Faculty are expected to make their research transparent and replicable, to provide data and programs, even when journals do not require it.  Faculty up for promotion should expect that the committee reviewing them will look to see if they are behaving reasonably. Failure will likely lead to a little chat from your department chair or dean. And the policy should state that replication and robustness work is valued.

Another correspondent wrote that he/she advises junior faculty not to post programs and data, so that they do not become a "target" for replicators. To say we disagree on this is an understatement. A clear voice on this issue is an excellent outcome of crafting a written policy.

From Michael Kiley's excellent comment below

  • Assign replication exercises to your students. Assign robustness checks to your more advanced students. Advanced undergraduate and PhD students are a natural reservoir of replicators. Seeing the nuts and bolts of how good, transparent, replicable work is done will benefit them. Seeing that not everything published is replicable or right might benefit them even more.   

Two good surveys of replications (as well as journals) 

Maren Duvendack, Richard Palmer-Jones, and Bob Reed have an excellent survey article, "Replications in Economics: A Progress Report"
...a survey of replication policies at all 333 economics journals listed in Web of Science. Further, we analyse a collection of 162 replication studies published in peer-reviewed economics journals. 
The latter is especially good, starting at p. 175. You can see here that "replication" goes beyond just can-we-get-the-author's-numbers, and maddeningly often does not even ask that question
 a little less than two-thirds of all published replication studies attempt to exactly reproduce the original findings....A frequent reason for not attempting to exactly reproduce an original study’s findings is that a replicator attempts to confirm an original study’s findings by using a different data set
"Robustness" not "replication"
Original Results?, tells whether the replication study re-reports the original results in a way that facilitates comparison with the original study. A large portion of replication studies do not offer easy comparisons, perhaps because of limited journal space. Sometimes the lack of direct comparison is more than a minor inconvenience, as when a replication study refers to results from an original study without identifying the table or regression number from which the results come.
Replicators need to be replicable and transparent too!
Across all categories of journals and studies, 127 of 162 (78%) replication studies disconfirm a major finding from the original study. 
But rather than just the usual alarmist headline, they have a good insight. Replication studies can suffer the same significance bias as original work:
Interpretation of this number is difficult. One cannot assume that the studies treated to replication are a random sample. Also, researchers who confirm the results of original studies may face difficulty in getting their results published since they have nothing ‘new’ to report. On the other hand, journal editors are loath to offend influential researchers or editors at other journals. The Journal of Economic & Social Measurement and Econ Journal Watch have sometimes allowed replicating authors to report on their (prior) difficulties in getting disconfirming results published. Such firsthand accounts detail the reticence of some journal editors to publish disconfirming replication studies (see, e.g., Davis 2007; Jong-A-Pin and de Haan 2008, 57).
Summarizing
.. nearly 80 percent of replication studies have found major flaws in the original research
Sven Vlaeminck and Lisa-Kristin Herrmann surveyed journals and report that many journals with data policies are not enforcing them. 
The results we obtained suggest that data availability and replicable research are not among the top priorities of many of the journals surveyed. For instance, we found 10 journals (i.e. 20.4% of all journals with such policies) where not a single article was equipped with the underlying research data. But even beyond these journals, many editorial offices do not really enforce data availability: There was only a single journal (American Economic Journal: Applied Economics) which has data and code available for every article in the four issues. 
Again, this observation reinforces my point that rules will not substitute for people caring about it. (They also discuss technological aspects of replication, and the impermanence and obscurity of zip files posted on journal websites.) 

Numerical Analysis

Ken Judd wrote to me,
"Your advocacy of authors giving away their code is not the rule in numerical analysis. I point to the “market test”: the numerical analysis community has done an excellent job in advancing computational methods despite the lack of any requirement to share the code....
Would you require Tom Doan to give out the code for RATS? If not, then why do you advocate journals forcing me to freely distribute my code?...
The issue is not replication, which just means that my code gives the same answer on your computer as it does on mine. The issue is verification, which is the use of tests to verify the accuracy of the answers. That I am willing to provide."
Ken is, I think, reading more "rules and censorship" rather than "social norms" in my views. And I think it reinforces my preference for the latter over the former.  Among other things, rules designed for one purpose (extensive statistical analysis of large data sets) are poorly adapted to other situations (extensive numerical analysis).

Rules can be taken to extremes.  Nobody is talking about "requiring" package customers to distribute the (proprietary) package source code. We all understand that step is not needed.

For heavy numerical analysis papers, using author-designed software that the author wants to market, the verification suggestion seems a sensible social norm to me.  If I'm refereeing a paper with a heavy numerical component, I would be happy to see the extensive verification, and happier still if I could use the program on a few test cases of my own. Seeing the source code would not be necessary or even that useful. Perhaps in extremis, if a verification failed, I would want the right to contact the author and understand why his/her code produces a different result.
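
The distinction between replication and verification can be sketched in a few lines of Python. In this hypothetical setup, `black_box_solver` stands in for an author's proprietary routine (it is plain bisection, chosen purely for illustration), and `verify` checks its answers against problems with known solutions without ever looking inside. All names, tolerances, and test cases are assumptions made for the example.

```python
import math


def black_box_solver(f, lo, hi, tol=1e-12, max_iter=200):
    """Stand-in for a proprietary root finder (here: bisection on [lo, hi])."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # sign change (or exact root) in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # sign change in [mid, hi]
    return 0.5 * (lo + hi)


# Referee-supplied test cases: (function, lower, upper, known root).
TEST_CASES = [
    (lambda x: x * x - 2.0, 0.0, 2.0, math.sqrt(2.0)),
    (math.cos, 0.0, 2.0, math.pi / 2.0),
]


def verify(solver, test_cases, tol=1e-8):
    """Check a solver against known answers without reading its source.

    A case passes only if the returned point is close to the known root
    and the function value there is close to zero.
    """
    report = []
    for f, lo, hi, known_root in test_cases:
        x = solver(f, lo, hi)
        report.append(abs(x - known_root) <= tol and abs(f(x)) <= tol)
    return report


if __name__ == "__main__":
    print(verify(black_box_solver, TEST_CASES))
```

The referee only needs to call the program and supply test cases; the source of `black_box_solver` could stay private.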

Some other examples of "replication" (really robustness) controversies:

Andrew Gelman covers a replication controversy, in which Douglas Campbell and Ju Hyun Pyun dissect Enrico Spolaore and Romain Wacziarg's "The Diffusion of Development" in the QJE. There is no charge that the computer programs were wrong, or that one cannot produce the published numbers. The argument is entirely over specification, that the result is sensitive to specification and controls.

Yakov Amihud and Stoyan Stoyanov, in "Do Staggered Boards Harm Shareholders?", reexamine Alma Cohen and Charles Wang's Journal of Financial Economics paper. They come to the opposite conclusion, but could only reexamine the issue because Cohen and Wang shared their data. Again, the issues, as far as I can tell, are not a charge that programs or data are wrong.

Update: Yakov corrects me:

  1. We do not come to "the opposite conclusion". We just cannot reject the null that staggered board is harmless to firm value, using Cohen-Wang's experiment. 
  2. Our result is also obtained using the publicly-available ISS database (formerly RiskMetrics). 
  3. Why is the difference between the results? We used CRSP data and did not include a few delisted (penny) stocks that are in Cohen-Wang's sample. Our paper states which stocks were omitted and why. We are re-writing the paper now with more detailed analysis.

I think the point that replication slides into robustness, which is more important and more contentious, remains clear.

Asset pricing is especially vulnerable to results that do not hold out of sample, in particular the ability to forecast returns. Campbell Harvey has a number of good papers on this topic.  Here, the issue is once more not that the numbers are wrong, but that many good in-sample return-forecasting tricks stop working out of sample. To know, you have to have the data.
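
The in-sample versus out-of-sample problem is easy to reproduce on simulated data. In the Python sketch below, returns are pure noise, 50 candidate signals are also pure noise, and the signal with the best in-sample fit is then scored on the second half of the sample against the in-sample mean return as a benchmark (one common convention for out-of-sample R²). Every name and parameter is an assumption made for illustration; no real return data are involved.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y, intercept, slope, benchmark_mean):
    """1 - SSE(model) / SSE(constant forecast equal to benchmark_mean)."""
    sse_model = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sse_bench = sum((yi - benchmark_mean) ** 2 for yi in y)
    return 1.0 - sse_model / sse_bench


def best_in_sample_predictor(n_obs=240, n_candidates=50, seed=0):
    """Pick the noise signal with the highest in-sample R-squared.

    Returns (in-sample R2, out-of-sample R2) for that signal. The fitted
    coefficients and the benchmark mean both come from the first half only.
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    half = n_obs // 2
    y_in, y_out = returns[:half], returns[half:]
    mean_in = sum(y_in) / half
    best = None
    for _ in range(n_candidates):
        signal = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        x_in, x_out = signal[:half], signal[half:]
        a, b = ols_fit(x_in, y_in)
        r2_in = r_squared(x_in, y_in, a, b, mean_in)
        r2_out = r_squared(x_out, y_out, a, b, mean_in)
        if best is None or r2_in > best[0]:
            best = (r2_in, r2_out)
    return best


if __name__ == "__main__":
    r2_in, r2_out = best_in_sample_predictor()
    print(f"in-sample R2 of the winning signal: {r2_in:.4f}")
    print(f"out-of-sample R2 of the same signal: {r2_out:.4f}")
```

Because the winner was selected on in-sample fit, its in-sample R² is biased upward, while its out-of-sample R² has no reason to be positive. Running that second check is only possible for someone who has the data.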
