Lecture notes
Scribing? Get the template.
- Jan 4 - Introduction [PDF]
- Jan 6 - Online Learning/Optimization #1: Weighted Majority, Hedge [PDF]. Also see [L.1]
- Jan 11 - Online Learning/Optimization #2: Online Convex Programming [PDF]. Also see [L.6]
- Jan 13 - Online Learning/Optimization #3: Online SVM [PDF]
- Jan 20 - Online Learning/Optimization #4: More data => faster learning [PDF]. Also see [L.9]
- Jan 25 - Dimensionality Reduction Techniques [PDF]
- Jan 27 - Dealing with Partial Observability/Feedback #1:
epsilon-Greedy and UCB1. [PDF]. Also see [A.9] - Feb 1 - Dealing with Partial Observability/Feedback #2:
unbiased estimators for fun and profit. [PDF] - Feb 3 - Active Learning #1: Background and Pool-based learning [PDF]
- Feb 8 - Active Learning #2: the myopic version-space shrinking strategy [PDF] Also see [B.1]
- Feb 10 - Active Learning #3: When precisely does active learning help? Also see [B.6] [PDF]
- Feb 17 - Nonparametric learning #1: Kernelization [PDF] Also see [N.1]
- Feb 22 - Nonparametric learning #2: RKHS [PDF] Also see [N.1]
- Feb 24 - Nonparametric learning #3: Gaussian Processes [PDF] Also see [N.2]
- Mar 1 - Nonparametric learning #4: Gaussian Process Regression [PDF] [N.2]
- Mar 3 - Nonparametric learning #5: Hyperparameter Optimization and Classification [PDF] Also see [N.2]
- Mar 8 - Nonparametric learning #6: Active Learning in GPs [PDF] [N.2]
- Mar 10 - Nonparametric learning #7: GP Optimization; Active Set methods (coming soon) [N.2]
Relevant Readings
Online Learning and Optimization (in the full information model)
- [L.1] N. Littlestone, M. Warmuth. The weighted majority algorithm. Information and Computation 108(2):212-261, 1994.
- [L.2] Cesa-Bianchi et al. How to Use Expert Advice, JACM 1997. More detail on binary classification in the experts paradigm.
- [L.3] Y. Freund, R. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, JCSS 1997. The paper that introduced the Hedge algorithm.
- [L.4] Freund et al. Using and combining predictors that specialize STOC 1997. Addresses the sleeping experts problem
- [L.5] A. Blum, Y. Mansour. From External to Internal Regret, JMLR 2007. Generalizes from sleeping experts to time-selection functions, among other results
- [L.6] M. Zinkevich. Online Convex Programming and Generalized Infinitesimal Gradient AscentICML 2003.
- [L.7] Logarithmic Regret Algorithms for Online Convex Optimization by Hazan et al. Improved analysis of Zinkevich's algorithm and two other algorithms, showing logarithmic (in T) regret for strictly convex cost functions
- [L.8] A. Kalai, S. Vempala. Efficient Algorithms for Online Decision Problems, JCSS 2005. Another powerful approach to online decision problems, perturbed-follow-the-leader, is presented here.
- [L.9] S. Shalev-Shwartz, N. Srebro. SVM Optimization: Inverse Dependence on Training Set SizeICML 2008
Online Learning and Optimization (in the bandit feedback model)
- [A.1] S. Pandey, D. Chakrabarti, D. Agrawal. "Multi-armed Bandit Problems with Dependent Arms" ICML 2006
- [A.2] S. Bubeck, R. Munos, G. Stoltz, C. Szepesvari. "Online Optimization in X-Armed Bandits" NIPS 2008.
- [A.3] P. Auer et al. The non-stochastic multi-armed bandit problem. The paper that introduced the Exp3 family of algorithms.
- [A.4] J. Langford and T. Zhang. "The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits" NIPS 2007.
- [A.5] R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. "Regret bounds for sleeping experts and bandits" COLT 2008.
- [A.6] D. Lizotte, T. Wang, M. Bowling, D. Schuurmans. "Automatic Gait Optimization with Gaussian Process Regression" IJCAI 2007
- [A.7] R. Martinez-Cantin, N. de Freitas, A. Doucet and J. Castellanos. "Active Policy Learning for Robot Planning and Exploration under Uncertainty" RSS 2007.
- [A.8] B. McMahan, A. Blum. "Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary", COLT 2004.
- [A.9] P. Auer, N. Cesa-Bianchi, P. Fischer. "Finite-time Analysis of the Multiarmed Bandit Problem", Machine Learning, 2002. The paper that introduced the UCB family of algorithms.
- [A.10] P. Auer, R. Ortner. "Logarithmic Online Regret Bounds for Undiscounted
Reinforcement Learning", NIPS 2006 - [A.11] A. Flaxman, A. Kalai, H. B. McMahan, "Online convex optimization in the bandit setting: gradient descent without a gradient", SODA 2005. Presents an algorithm similar to Zinkevich's but for the partial information (bandit) feedback model. Also see [A.8].
Dimension Reduction
- [D.1] L. Saul, K. Weinberger, F. Sha, J. Ham, D. Lee. "Spectral Methods for Dimensionality Reduction"
- [D.2] J. Tenenbaum, V. de Silva, J. Langford. "A Global Geometric Framework for Nonlinear
Dimensionality Reduction", Science 290, 2000 - [D.3] S. Roweis, L. Saul. "Nonlinear Dimensionality Reduction by Locally Linear Embedding", Science 290, 2000
Active Learning
- [B.1] S. Dasgupta. "Analysis of a greedy active learning strategy" NIPS 2004
- [B.2] M. Balcan, S. Hanneke, J. Wortmann. "The true sample complexity of Active Learning", COLT 2008.
- [B.3] S. Dasgupta, D.J. Hsu. "Hierarchical sampling for active learning", ICML 2008.
- [B.4] R. Castro, C. Kalish, R. Nowak, R. Qian, T. Rogers, X. Zhu. "Human Active Learning", NIPS 2008.
- [B.5] R. Castro, R. Nowak. "Minimax Bounds for Active Learning", COLT 2007
- [B.6] S. Dasgupta. "Coarse Sample complexity bounds for active learning", NIPS 2005.
- [B.7] S. Tong, D. Koller. "Support Vector Machine Active Learning with Applications to Text Classification." JMLR 2001.
- [B.8] S. Dasgupta, D. Hsu, C. Monteleoni. "A General Agnostic Active Learning Algorithm", NIPS 2007.
- [C.1] M. Streeter, D. Golovin "An Online Algorithm for Maximizing Submodular Functions" NIPS 2008
- [C.2] F. Radlinski, R. Kleinberg, T. Joachims. "Learning diverse rankings with multi-armed bandits" ICML 2008.
- [C.3] S. Hoi, R. Jin, J. Zhu, M. Lyu "Batch mode active learning and its application to medical image classification", ICML 2006.
- [C.7] M. Seeger, H. Nickisch "Compressed Sensing and Bayesian Experimental Design" ICML 2008.
- [C.9] A. Krause, C. Guestrin, A. Gupta, J. Kleinberg. "Near-optimal sensor placements in Gaussian Processes: Maximizing Information while Minimizing Communication Cost", IPSN 2006
- [C.11] J. Williams, J. Fisher III, A. Willsky. "Performance guarantees for information-theoretic active inference", AISTATS '07.
Nonparametric Learning
- [N.1] B. Schölkopf, A. Smola. "Learning with Kernels" MIT Press 2002. Chapters 2 and 4
- [N.2] C. Rasmussen, C. Williams. "Gaussian Processes in Machine Learning" MIT Press 2006.Available online, free of charge
Some other courses with overlapping content
- Avrim Blum's introductory graduate level and advanced machine learning courses.
- Robert Kleinberg's course on Learning, Games, and Electronic Markets
No comments:
Post a Comment