Here are some of my contributions. Other papers are under review or in progress.
2014
-
Selecting Near-Optimal Approximate State Representations in Reinforcement Learning.
Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko.
To appear in Algorithmic Learning Theory, 2014.
[Bib][Pdf](Discuss) -
Sub-sampling for multi-armed bandits.
Akram Baransi, Odalric-Ambrym Maillard, Shie Mannor.
To appear in European Conference on Machine Learning, 2014.
[Bib][Pdf](Discuss) -
Concentration inequalities for sampling without replacement.
Rémi Bardenet, Odalric-Ambrym Maillard.
To appear in Bernoulli, 2014.
[Bib][Pdf](Discuss)
2013
-
Latent Bandits.
Odalric-Ambrym Maillard, Shie Mannor.
In International Conference on Machine Learning, 2014.
[Bib][Pdf](Discuss) -
Robust Risk-averse Multi-armed Bandits.
Odalric-Ambrym Maillard.
In Algorithmic Learning Theory, 2013.
[Bib][Pdf](Discuss) -
Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation.
Olivier Cappé, Aurélien Garivier, Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz.
In The Annals of Statistics, 2013.
[Bib][Pdf](Discuss) -
Competing with an Infinite Set of Models in Reinforcement Learning.
Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner.
In International Conference on Artificial Intelligence and Statistics, 2013.
[Bib][Pdf](Discuss)
2012
-
Optimal regret bounds for selecting the state representation in reinforcement learning.
Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner, Daniil Ryabko.
In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, 2013.
[Bib][Pdf](Discuss) -
Hierarchical Optimistic Region Selection driven by Curiosity.
Odalric-Ambrym Maillard.
In Proceedings of the 25th conference on advances in Neural Information Processing Systems, NIPS '12, 2012.
[Bib][Pdf](Discuss) -
Online allocation and homogeneous partitioning for piecewise constant mean-approximation.
Alexandra Carpentier, Odalric-Ambrym Maillard.
In Proceedings of the 25th conference on advances in Neural Information Processing Systems, NIPS '12, 2012.
[Bib][Pdf](Discuss) -
Linear regression with random projections.
Odalric-Ambrym Maillard, Rémi Munos.
In Journal of Machine Learning Research, 2012.
[Bib][Pdf](Discuss)
2011
-
Apprentissage Séquentiel : Bandits, Statistique et Renforcement.
Odalric-Ambrym Maillard.
PhD thesis, Université de Lille 1, October 2011. [AFIA PhD Prize 2012]
[Pdf] -
Selecting the State-Representation in Reinforcement Learning.
Odalric-Ambrym Maillard, Daniil Ryabko, Rémi Munos.
In Proceedings of the 24th conference on advances in Neural Information Processing Systems, NIPS '11, pages 2627–2635, 2011.
[Bib][Pdf](Discuss) -
Sparse recovery with Brownian sensing.
Alexandra Carpentier, Odalric-Ambrym Maillard, Rémi Munos.
In Proceedings of the 24th conference on advances in Neural Information Processing Systems, NIPS '11, 2011.
[Bib][Pdf](Discuss) -
Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences.
Odalric-Ambrym Maillard, Gilles Stoltz, Rémi Munos.
In Proceedings of the 24th annual Conference On Learning Theory, COLT '11, 2011.
[Bib][Pdf](Discuss) -
Adaptive bandits: Towards the best history-dependent strategy.
Odalric-Ambrym Maillard, Rémi Munos.
In Proceedings of the 14th international conference on Artificial Intelligence and Statistics, AI&Statistics 2011, volume 15 of JMLR W&CP, 2011.
[Bib][Pdf](Discuss)
2010
-
Finite-Sample Analysis of Bellman Residual Minimization.
Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh.
In Proceedings of the Asian Conference on Machine Learning, ACML 2010, volume 13 of JMLR W&CP, pages 299-314, 2010.
[Bib][Pdf](Discuss) -
Scrambled Objects for Least-Squares Regression.
Odalric-Ambrym Maillard, Rémi Munos.
In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Proceedings of the 23rd conference on advances in Neural Information Processing Systems, NIPS '10, pages 1549–1557, 2010.
[Bib][Pdf](Discuss) -
LSTD with Random Projections.
Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos.
In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Proceedings of the 23rd conference on advances in Neural Information Processing Systems, NIPS '10, pages 721–729, 2010.
[Bib][Pdf](Discuss) -
Online Learning in Adversarial Lipschitz Environments.
Odalric-Ambrym Maillard, Rémi Munos.
In Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, ECML PKDD'10, pages 305–320, Berlin, Heidelberg, 2010. Springer-Verlag.
[Bib][Pdf](Discuss)
2009
-
Compressed Least Squares Regression. [see Linear regression with random projections, 2012, for corrections]
Odalric-Ambrym Maillard, Rémi Munos.
In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Proceedings of the 22nd conference on advances in Neural Information Processing Systems, NIPS '09, pages 1213–1221, 2009.
[Bib][Pdf,Pdf](Discuss) -
Complexity versus Agreement for Many Views.
Odalric-Ambrym Maillard, Nicolas Vayatis.
In ALT 2009, pages 232–246, 2009.
[Bib][Pdf](Discuss)
2005
-
Parallelization of the TD(lambda) Learning Algorithm.
Odalric-Ambrym Maillard, Rémi Coulom, Philippe Preux.
In Proceedings of the 7th European Workshop on Reinforcement Learning, EWRL7, 2005.