Papers for discussion
An algorithm that finds truth even if most people are wrong [Prelec]: "Crowd" predictions are not necessarily good, but analyzing meta-knowledge of individual predictors can help you pick out the best predictors in the crowd.
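A minimal sketch of the "surprisingly popular" selection rule at the heart of Prelec's approach: pick the answer whose actual vote share exceeds its average predicted vote share by the most. The data below are illustrative, not from the paper.

```python
# Sketch of "surprisingly popular" answer selection (Prelec et al.).
# votes: each respondent's own answer.
# predicted_shares: each respondent's prediction of the fraction of people
#   choosing each answer. Numbers are made up for illustration.
from collections import Counter

votes = ["Philadelphia", "Philadelphia", "Harrisburg", "Harrisburg", "Harrisburg"]
predicted_shares = [
    {"Philadelphia": 0.8, "Harrisburg": 0.2},
    {"Philadelphia": 0.7, "Harrisburg": 0.3},
    {"Philadelphia": 0.6, "Harrisburg": 0.4},
    {"Philadelphia": 0.9, "Harrisburg": 0.1},
    {"Philadelphia": 0.7, "Harrisburg": 0.3},
]

n = len(votes)
actual = {a: c / n for a, c in Counter(votes).items()}
predicted = {a: sum(p.get(a, 0.0) for p in predicted_shares) / n for a in actual}

# The "surprisingly popular" answer: largest gap between actual and predicted popularity.
surprisingly_popular = max(actual, key=lambda a: actual[a] - predicted[a])
print(surprisingly_popular)  # Harrisburg: more popular than the crowd predicted
```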
Extracting the Wisdom of Crowds When Information is Shared [Palley]: Like Prelec's paper, but uses each respondent's prediction of the crowd's average as the proxy for meta-knowledge, instead of their prediction of how much of the crowd would agree with them.
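A toy sketch of the "pivoting" idea discussed in the paper: combine the judges' own estimates with their predictions of the crowd's average to back out the shared information. The simple form below (twice the mean estimate minus the mean predicted average) is one variant; the numbers are illustrative.

```python
# Pivoting sketch: judges who know more than the crowd expect should pull the
# estimate away from the predicted (shared-information) average.
estimates = [42.0, 55.0, 48.0, 60.0, 50.0]       # each judge's own estimate
predicted_avgs = [50.0, 52.0, 49.0, 55.0, 51.0]  # each judge's prediction of the crowd mean

mean_estimate = sum(estimates) / len(estimates)
mean_predicted = sum(predicted_avgs) / len(predicted_avgs)

pivot_estimate = 2 * mean_estimate - mean_predicted
print(mean_estimate, pivot_estimate)
```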
Clustering Similar Stories Using LDA: Good mash-up of ideas, including LDA (Latent Dirichlet Allocation), automatic dimensionality reduction, and clustering.
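A small sketch of that pipeline: LDA topic proportions as the reduced representation, then k-means clustering of stories in topic space. Corpus and parameter choices are illustrative, not the article's.

```python
# LDA topic vectors as reduced dimensions, then k-means over documents in topic space.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

stories = [
    "central bank raises interest rates amid inflation fears",
    "inflation pressures push central banks toward rate hikes",
    "local team wins championship after dramatic overtime finish",
    "star striker scores twice as team clinches the title",
]

counts = CountVectorizer(stop_words="english").fit_transform(stories)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(topics)
print(labels)  # similar stories should land in the same cluster
```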
Attacking machine learning with adversarial examples: Particular mention of image-classifying ANNs, which are especially prone to adversarial noise that's imperceptible to humans.
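One simple way to craft that imperceptible noise is the fast gradient sign method (FGSM), sketched below; `model` stands for any differentiable image classifier and `epsilon` bounds the perturbation size.

```python
# FGSM sketch: nudge each pixel in the direction that increases the loss.
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.01):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Tiny step along the sign of the gradient, clamped to the valid pixel range.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```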
Good Management Predicts a Firm’s Success Better Than IT, R&D, or Even Employee Skills: An NBER study that appears to have been done in R.
SimHash for question deduplication: Very easy intro to SimHash. See also the Wikipedia entry.
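For reference, a minimal SimHash sketch: hash each token to a fingerprint, sum signed bits across tokens, and keep the sign pattern. Near-duplicate questions end up with fingerprints that differ in only a few bits.

```python
# Minimal SimHash: near-duplicates have small Hamming distance between fingerprints.
import hashlib

def simhash(text, bits=64):
    weights = [0] * bits
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, w in enumerate(weights) if w > 0)

def hamming(a, b):
    return bin(a ^ b).count("1")

print(hamming(simhash("how do I reset my password"),
              simhash("how can I reset my password")))  # small distance
```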
Generalized Autoregressive Score models: a framework for fitting time-series models under a wide variety of observation distributions, with the time-varying parameters driven by the score of the likelihood.
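The core recursion, sketched for the GAS(1,1) case: the time-varying parameter is updated by the scaled score of the observation density, which is why the same machinery works across distributions.

```latex
% GAS(1,1) recursion for the time-varying parameter f_t:
f_{t+1} = \omega + A\, s_t + B\, f_t,
\qquad
s_t = S_t \cdot \nabla_{f_t} \log p(y_t \mid f_t)
```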
Inferring and Executing Programs for Visual Reasoning: ML programs generating other ML programs
Deep reinforcement learning from human preferences: Aims to minimize the amount of time a human must spend giving feedback before the system learns the intended behaviour.
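The mechanism is to fit a reward model from human preferences over pairs of trajectory segments and use it as the RL reward; below is a sketch of the standard Bradley-Terry style preference loss, with illustrative network sizes and shapes.

```python
# Sketch: reward model trained from pairwise human preferences over segments.
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))

def preference_loss(segment_a, segment_b, human_prefers_a):
    # segment_*: (timesteps, observation_dim); sum predicted reward over each segment.
    r_a = reward_model(segment_a).sum()
    r_b = reward_model(segment_b).sum()
    # Cross-entropy on the softmax over segment totals = Bradley-Terry likelihood.
    logits = torch.stack([r_a, r_b]).unsqueeze(0)
    target = torch.tensor(0 if human_prefers_a else 1).unsqueeze(0)
    return nn.functional.cross_entropy(logits, target)

loss = preference_loss(torch.randn(20, 8), torch.randn(20, 8), human_prefers_a=True)
loss.backward()
```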
ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections: Trains a simpler ANN "next to" a more traditional ANN for image recognition, getting good results from the simpler ANN with reduced memory requirements.
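A rough sketch of that joint-training setup: a full "trainer" network and a tiny projection network (random-projection bit features plus a small classifier) trained together, with a distillation term pulling the small net toward the big one. Sizes and the exact loss weighting below are assumptions for illustration, not the paper's configuration.

```python
# Joint training of a big trainer network and a tiny projection network.
import torch
import torch.nn as nn
import torch.nn.functional as F

trainer_net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
projection = torch.randn(784, 64)     # fixed random projection (LSH-style features)
projection_net = nn.Linear(64, 10)    # the tiny on-device model

def joint_loss(x, y, distill_weight=1.0):
    big_logits = trainer_net(x)
    bits = (x @ projection > 0).float()          # binary projected features
    small_logits = projection_net(bits)
    distill = F.kl_div(F.log_softmax(small_logits, dim=1),
                       F.softmax(big_logits.detach(), dim=1),
                       reduction="batchmean")
    return (F.cross_entropy(big_logits, y)
            + F.cross_entropy(small_logits, y)
            + distill_weight * distill)
```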
When Correlation Is Not Causation, But Something Much More Screwy
Lessons from Optics, The Other Deep Learning: Phenomena noticed in training deep ANNs, with an analogy to optics.
AI2 (Abstract Interpretation for AI Safety): Using abstract interpretation to certify networks as robust against adversarial attacks.
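AI2 itself uses richer abstract domains (zonotopes as well as boxes); the toy sketch below uses the simplest one, intervals, to propagate sound bounds for every input in an epsilon-ball through a linear + ReLU layer.

```python
# Interval (box) abstract domain: propagate lower/upper bounds through a layer.
# If the certified output bounds of the true class dominate all others, no
# adversarial input exists inside the epsilon-ball.
import numpy as np

def interval_linear(lo, hi, W, b):
    # Split weights by sign so the bounds stay sound.
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def interval_relu(lo, hi):
    return np.maximum(lo, 0), np.maximum(hi, 0)

# Example: bounds for all x in [x0 - eps, x0 + eps] through one layer.
x0, eps = np.array([0.5, -0.2]), 0.1
W, b = np.array([[1.0, -2.0], [0.5, 1.5]]), np.array([0.1, 0.0])
lo, hi = interval_relu(*interval_linear(x0 - eps, x0 + eps, W, b))
print(lo, hi)
```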
KiloGrams: Very Large N-Grams for Malware Classification: Clever use of multiple passes over a large dataset, with an approximating first pass, to reduce computational and memory requirements.
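A sketch of that two-pass idea: an approximate first pass counts hashed n-grams in a fixed-size table to find the frequent buckets, and a second pass keeps exact counts only for n-grams that hash into those buckets. Parameters are illustrative; the paper works at far larger n and corpus sizes.

```python
# Two-pass top-k n-gram counting with an approximating (hashed) first pass.
from collections import Counter

def ngrams(data, n):
    return (data[i:i + n] for i in range(len(data) - n + 1))

def top_kilograms(files, n=8, table_size=2**20, top_buckets=1000, k=100):
    # Pass 1: approximate counts of hashed n-grams in a small fixed table.
    table = [0] * table_size
    for data in files:
        for g in ngrams(data, n):
            table[hash(g) % table_size] += 1
    frequent = set(sorted(range(table_size), key=table.__getitem__)[-top_buckets:])

    # Pass 2: exact counts, restricted to n-grams in frequent buckets.
    exact = Counter()
    for data in files:
        for g in ngrams(data, n):
            if hash(g) % table_size in frequent:
                exact[g] += 1
    return exact.most_common(k)

print(top_kilograms([b"abcabcabcabd", b"xyzabcabcabc"], n=3, table_size=1024,
                    top_buckets=8, k=5))
```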
Quantifying the evolution of individual scientific impact: Digs into the distributions found in the data to produce a really excellent model; a very different style of work from what we'd currently expect of machine-learning approaches.