Tuesday, May 27, 2008

Measuring Reinforcement Learning

When I started my PhD in Andy Barto's lab in 1988, there were perhaps a handful of folks doing research in the field of modern RL. There was an outpost of ex-students from Andy's lab at GTE Laboratories including folks like Rich Sutton, Chuck Anderson and Judy Franklin. By 1990 or so, there were a few others including Leslie Kaelbling and Peter Dayan. It was a pretty lonely field back then.

But how big is the field of RL now? If I had to guess, I would say that there are several hundred researchers around the world who would self-identify as RL researchers.

So, the objective of this post is to solicit ideas and, more importantly, effort in measuring the size of the field of RL. Here are some ideas for how to collect some data. Any volunteers?
  1. Collect a list of conferences that publish a substantial number of RL papers. These include (in no particular order) ICML, NIPS, AAAI, UAI, COLT, IJCAI, AAMAS, and ECML. What other major venues am I missing?

  2. Establish some simple and noisy methodology for determining when a paper is an RL paper.

    • If the paper publishes keywords, then look for phrases from some list that pretty clearly indicate an RL paper, e.g., reinforcement learning, Q-learning, TD, temporal differences (What others am I missing?). I think it will introduce too much noise to include MDPs and POMDPs.

    • Look for the same list of keywords identified above in the title and abstract.

    • Do we need to do more sophisticated things?

  3. Gather the data for each conference separately by year. An interesting use of this data will be to just get a sense of the publication rate of RL papers in the different conferences. If I could get this data somehow, I would happily create graphs and put them up on this blog. But to serve the main purpose of this post, one would just create and count a list of the unique authors of such papers.
Does anyone have scripts that could easily do this? Not sure this is worth a lot of effort but it sure would be fun to have this data.
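
As a rough starting point, steps 2 and 3 above could be sketched in Python along the following lines. This is only a sketch under stated assumptions: the record format (dicts with `venue`, `year`, `title`, `abstract`, `keywords`, `authors`), the function names, and the sample papers are all made up for illustration, and how to actually obtain such records from each conference is left open.

```python
import re
from collections import Counter

# Phrases from the keyword list in step 2. Matching is deliberately simple and noisy.
RL_PATTERN = re.compile(
    r"reinforcement learning|q-learning|temporal[- ]differences?", re.IGNORECASE
)
# "TD" is matched case-sensitively and as a whole word to limit false hits.
TD_PATTERN = re.compile(r"\bTD\b")


def is_rl_paper(paper):
    """Return True if the title, abstract, or keywords contain an RL phrase."""
    text = " ".join(
        [paper["title"], paper.get("abstract", ""), " ".join(paper.get("keywords", []))]
    )
    return bool(RL_PATTERN.search(text) or TD_PATTERN.search(text))


def measure(papers):
    """Count RL papers per (venue, year) and collect the unique RL authors."""
    counts = Counter()
    authors = set()
    for paper in papers:
        if is_rl_paper(paper):
            counts[(paper["venue"], paper["year"])] += 1
            # Naive normalization: different spellings of a name still count twice.
            authors.update(name.strip().lower() for name in paper["authors"])
    return counts, authors


# Hypothetical records, invented purely to exercise the functions above.
SAMPLE_PAPERS = [
    {"venue": "ICML", "year": 2007, "title": "Q-learning with function approximation",
     "abstract": "", "keywords": [], "authors": ["A. Author", "B. Author"]},
    {"venue": "ICML", "year": 2007, "title": "Kernel methods for ranking",
     "abstract": "We study support vector machines.", "keywords": ["SVM"],
     "authors": ["C. Author"]},
    {"venue": "NIPS", "year": 2006, "title": "Policy evaluation",
     "abstract": "We analyse TD(0) with temporal differences.", "keywords": [],
     "authors": ["B. Author", "D. Author"]},
]

counts, authors = measure(SAMPLE_PAPERS)
for (venue, year), n in sorted(counts.items()):
    print(venue, year, n)
print("unique RL authors:", len(authors))
```

The per-venue, per-year counts would feed the publication-rate graphs, and the size of the author set is the rough head count this post is after. Author names are only lowercased and stripped here, so the same person listed under two spellings would be counted twice.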

7 comments:

Szityú said...

Hi,

I've seen something similar on Yaroslav Bulatov's blog. He used Google Scholar, and measured the density of search phrases like "neural network", "expert systems" or "support vector machine". As you might expect, the popularity of the first two phrases declines, while SVMs thrive. I know it's not exactly what you were looking for, but results for "reinforcement learning" would be pretty interesting.

Unfortunately, the original Python script disappeared a long time ago, but there is a modified version by Konstantin Tretyakov among the comments: here

Maybe you can modify that so that it fits your purposes :-)

best regards,
Istvan Szita

Satinder Singh said...

Hi Istvan,

Thanks for the pointer to Yaroslav's work and the script. I will look into this and see if it can be easily adapted.

One of the interesting challenges I want this thread to discuss is this: what would be a reasonably fair (and convenient) way of deciding who is an RL researcher?

Satinder

Unknown said...

Do you want to include people from the neuroscience/psychology community who use RL to model the brain and behaviour and may even self-identify as RL researchers, but do not (or at best rarely) publish in AI conferences?

-Elliot.

Satinder Singh said...

In Response to Elliot.

Yes, I want to include RL folks in neuroscience, psychology, operations research, adaptive control and even further-out fields such as economics (behavioral economics folks often use RL algorithms to model human game-playing behavior). What is not clear is how to do all this in a feasible manner. Any ideas, anyone?

Satinder

Unknown said...

I wouldn't call AAMAS a major RL venue, but there is a steady stream of publications there from the community.

Satinder Singh said...

Matt, Yes we should include AAMAS. I will update the original post to include it. Thanks. Satinder

Anonymous said...

Other possibilities I can think of are IJCNN and the IEEE Transactions on Automatic Control and on Systems, Man, and Cybernetics.
By the way, I don't know of any easy way to collect all this information.