Monday, November 21, 2011

More Game Theory

Check it out, Stanford is offering a free online game theory class next semester.

Saturday, November 19, 2011

Readings: Anytime Algorithm for Optimal Coalitions

This Monday we will be discussing An anytime algorithm for optimal coalition structure generation by T Rahwan, SD Ramchurn, NR Jennings, and A Giovannucci.

Thursday, November 10, 2011

Readings for Next Week: Electric Vehicles and Robots

Next week we will be discussing Online Mechanism Design for Electric Vehicle Charging by Enrico H. Gerding, Valentin Robu, Sebastian Stein, David C. Parkes, Alex Rogers, Nicholas R. Jennings, as well as Who Goes There? Selecting a Robot to Reach a Goal Using Social Regret by Meytal Traub, Gal A. Kaminka, Noa Agmon.

Friday, November 4, 2011

Final Project

For the final project you have the option of a programming-oriented or a research-oriented final project. Whichever project you decide to do, you must first meet with me to get it approved.

Programming Project

Choose a paper from the AAMAS proceedings and implement the algorithm it describes. In many cases you will need to make some simplifying assumptions. The papers discussed in class are especially good choices.

Alternatively, you can implement any of the algorithms referenced in our textbook that are not already implemented, or provide better visualizations of existing algorithms (for didactic purposes).

A final option is to build a NetLogo simulation of a multiagent problem in a separate domain, one which you are already familiar with, say, because you are doing a PhD thesis on that problem.

Research Project

Your deliverable will be a paper that summarizes existing research on a specific topic, with appropriate citations, and either presents a new algorithm/protocol/etc. along the same lines or organizes these results in a novel way (think "survey paper").

The final paper should be at least 10 pages long. I will be looking to see that you understand the topic at hand and how the various contributions relate to each other. You will be looking at papers beyond the above conferences.

The final projects are due Wednesday May 1.

Next Week's Readings: Cooperation and Teamwork

Next week I will be presenting, and you should be reading, The Evolution of Cooperation in Self-Interested Agent Societies: A Critical Study by Hofmann, Chakraborty, and Sycara, and Empirical Evaluation of Ad Hoc Teamwork in the Pursuit Domain by Barrett, Stone, and Kraus. Both of these papers present models that would be perfect to implement in NetLogo for your final project.

Saturday, October 29, 2011

Next Week's Readings: Smart Grid and Propagators

On Monday we will be discussing Agent-Based Control for Decentralised Demand Side Management in the Smart Grid by Sarvapali D. Ramchurn, Perukrishnen Vytelingum, Alex Rogers, and Nicholas R. Jennings from the AAMAS 2011 proceedings. You can follow that link to find all the other papers from that conference. Remember that one option for your final project is to implement and/or improve upon one of the algorithms in one of these papers, so start reading.

On Wednesday we will be discussing The Art of the Propagator by Gerald Jay Sussman and Alexey Radul. You will also want to view Sussman's talk on this topic.

No class on Friday.

Tuesday, October 25, 2011

HW4: Learning Correlated Equilibria

For this homework you will implement the learning algorithm in this paper by Cigler and Faltings. The algorithm in question allows the agents to reach a correlated equilibrium by using only a learning mechanism, that is, without the need for a separate coordination device.

The game is a broadcasting coordination game. There are N agents and C channels (both sliders), where N >= C. At each time step all agents choose a channel for broadcasting but, due to physical limitations, only one agent can actually transmit on each channel at a time. Thus, if an agent uniquely chooses a channel then he gets a utility of 1; if two or more agents choose the same channel, they all receive a utility of 0.

You will first solve this problem by having all the agents implement the Q-learning algorithm. Note that there is only one state in this game, so the Q-learning will happen only over the actions of the players. Run it for a while and plot the sum of the utilities to see how well they do as a group.
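Your model will be in NetLogo, but the single-state Q-learning logic is easy to see in a language-neutral sketch. Here is one in Python; the names q_learning_step, alpha, and epsilon are my own, and the learning-rate and exploration values are illustrative assumptions, not from the paper:

```python
import random

def q_learning_step(q, n_agents, alpha=0.1, epsilon=0.1):
    """One tick: each agent picks a channel epsilon-greedily from its
    Q-values, unique choices earn utility 1 and collisions earn 0, and
    each agent updates the Q-value of the channel it chose."""
    channels = len(q[0])
    choices = []
    for agent in range(n_agents):
        if random.random() < epsilon:
            choices.append(random.randrange(channels))
        else:
            choices.append(max(range(channels), key=lambda c: q[agent][c]))
    rewards = [1 if choices.count(c) == 1 else 0 for c in choices]
    for agent, (c, r) in enumerate(zip(choices, rewards)):
        # Only one state, so the update has no discounted next-state term.
        q[agent][c] += alpha * (r - q[agent][c])
    return sum(rewards)

# Example: 3 agents, 3 channels, all Q-values start at 0.
q = [[0.0] * 3 for _ in range(3)]
for _ in range(500):
    total_utility = q_learning_step(q, 3)
```

Plotting total_utility over time is the analogue of the "sum of the utilities" plot asked for above.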

Then, you will implement the algorithm from the paper (use the same NetLogo model, just add a toggle switch so the user can choose between the two). The algorithm is described in Section 2 and works roughly as follows (but see the paper for the details):

  1. At each tick, there is a randomly generated integer (signal), from 1 to K, that all agents can observe. They use this signal in their learning.
  2. Each agent keeps a table f(k), for k = 1..K, where f(k) is either the channel the agent will transmit on if it sees signal k, or 0 if it will not transmit at all when it sees that signal. Table f(k) is initialized to random channels.
  3. At each tick, with signal k, the agent transmits on channel f(k). If f(k) = 0 then the agent chooses a random channel to monitor.
  4. The agents get their utility based on the collisions. If the agent gets 0 utility then it sets f(k) = 0 with probability p (slider).
  5. If the agent was monitoring channel c and no one transmitted on it, the agent sets f(k) = c.
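Steps 1-5 can be sketched compactly. Here is one possible reading in Python (again, your model will be in NetLogo; the function name cf_tick and the data layout are my own, and p and K correspond to the slider values):

```python
import random

def cf_tick(strategies, n_channels, K, p):
    """One tick of the signal-based learning rule (steps 1-5 above).
    strategies[i] maps each signal k in 1..K to a channel in 1..n_channels,
    or to 0 meaning 'stay silent and monitor'."""
    k = random.randint(1, K)                   # 1. public random signal
    transmits = [f[k] for f in strategies]     # 3. each agent plays f(k)
    monitors = {}
    for i, c in enumerate(transmits):
        if c == 0:                             # silent agents monitor a
            monitors[i] = random.randint(1, n_channels)  # random channel
    for i, c in enumerate(transmits):
        if c != 0 and transmits.count(c) > 1:  # 4. collision: utility 0,
            if random.random() < p:            #    back off w.p. p
                strategies[i][k] = 0
    used = set(c for c in transmits if c != 0)
    for i, c in monitors.items():              # 5. claim an idle channel
        if c not in used:
            strategies[i][k] = c

# Example: 4 agents, 2 channels, K = 4 signals, step 2's random init.
strategies = [{k: random.randint(1, 2) for k in range(1, 5)}
              for _ in range(4)]
for _ in range(2000):
    cf_tick(strategies, 2, 4, 0.3)
```

Over many ticks the tables tend to settle so that, for each signal, distinct agents hold distinct channels, which is the correlated equilibrium the paper is after.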

The .nlogo model is due on Monday, November 21 @9am.

Wednesday, September 21, 2011

HW3: Security Circumvention Games

For this homework you will build an airport security simulation loosely based on the model presented in GUARDS: Game Theoretic Security Allocation on a National Scale. We simplify their model by using only a few simple types of security circumvention techniques, but also extend it by taking into account the effects that extra screening has on passenger waiting times.

Your model will have a passenger breed of agent. Each tick corresponds to a minute. The number of passengers that arrive each minute is given by random-poisson 1, a Poisson arrival process with mean 1. Each passenger has random 3 luggage items with him (that is, 0, 1, or 2), where luggage is also a breed. When a passenger arrives he is at the Boarding Pass area of the airport.

There are five areas in the airport: Boarding Pass (B), Luggage (L), Security scan (S), Gate (G), and Plane (P). The passengers will traverse these in the order B-S-G-P, while the luggage goes B-L-P.

At each tick, in each area, random-poisson 1 of the earliest-arriving passengers or luggage items (smallest who) that are not currently being screened move to the next area. When there are 30 passengers and their luggage in P then they leave (die).

Each passenger is really a terrorist with probability terrorist-prob (a slider from 0 to 0.1). Each terrorist can carry out an attack of one of three types: 1, 2, 3. He will also do it in one of the areas (B, L, S, G, P); if the area is L then consider his luggage agent as the one that will perform the attack. The terrorist activates his attack 10 minutes after reaching his target area.

There are also 5 screener agents. The user has the choice of where to place each screener agent and which type of attack it screens for. That is, each screener is at one of the locations and can only identify one type of attack. Use 10 dropdown boxes to let the user pick which area (B-L-S-G-P) and which attack (1-2-3) to give each screener. At each tick each screener randomly chooses a passenger/luggage in his area and screens him. If the agent is a terrorist with the same attack type as the screener then he is marked as caught and removed. Each screening takes random-exponential 5 minutes; during this time the agent being screened cannot move to the next area.

Your simulation will let the user choose which types of screeners to put where and will then simulate a whole day (24 hours) of activity, keeping track of how many terrorists are successful and how many planes take off without incident (if 10 minutes go by on a full plane then it was OK; you can delete all those agents). This simulation forms the first part of the homework.

In the original GUARDS paper, the terrorists are assumed to know where the screeners are placed. For the second part you will change the behavior of the terrorists so they know where the screeners are placed, and their types, and calculate their best response given this knowledge. In our simplified game this simply means that your terrorists should set their type of attack to be the one that is least often covered by the screeners.
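That best-response computation is tiny. A sketch in Python (the function name and the tie-breaking rule are my own choices, not from the paper):

```python
from collections import Counter

def best_response_attack(screener_types, attack_types=(1, 2, 3)):
    """Pick the attack type covered by the fewest screeners,
    breaking ties toward the lower type number."""
    coverage = Counter(screener_types)
    return min(attack_types, key=lambda t: (coverage[t], t))

# Example: five screeners covering types 1, 1, 2, 3, 1. Types 2 and 3
# are tied at one screener each, so the terrorist picks type 2.
print(best_response_attack([1, 1, 2, 3, 1]))  # -> 2
```

In your NetLogo model the terrorist would do the equivalent count over the screener agents at setup time.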

This homework is due Monday, October 10 @9am.

Tuesday, September 20, 2011

Learning in Multiagent Systems

This week we will be discussing the chapter on learning in multiagent systems. The first test is scheduled for Monday and will cover up to this chapter.

Wednesday, September 7, 2011

HW2: Graph Coloring

For this homework you will implement a simple distributed hill-climbing algorithm and test its behaviors on various graphs. The goal is for you to become proficient at using the NetLogo link primitives and to gain first-hand experience with the problems and benefits of distributed hill-climbing algorithms.
  1. Read the link primitives section of the NetLogo manual.
  2. Implement a NetLogo model that generates a random graph, by using a simple algorithm: first create num-nodes nodes (a slider), then create num-nodes * edge-ratio edges, each one connected to two randomly chosen nodes.
  3. Implement another button which instead generates the graph using preferential attachment. Specifically: first create num-nodes nodes, then create num-nodes * edge-ratio edges, each one created by picking two nodes chosen with probability proportional to their degree (number of incident edges). For example, if there are 3 nodes, one with two edges and the other two with one edge each (the graph is a line), then the one with two edges gets chosen with probability 2 / 4 = 1/2, while each of the other two nodes is chosen with probability 1/4. The denominator is always the sum of all the degrees (twice the number of edges) and the numerator is the degree of that node.
  4. Implement a num-colors slider and randomly color the nodes using that many colors.
  5. Implement a layout button which calls one of the built-in NetLogo layout methods to make the graph look pretty.
  6. Implement a basic hill-climbing algorithm. On each tick every node looks at the colors of its neighbors and changes its color to one that does not conflict with any of them. If there is no such color then it changes to one that minimizes the number of constraint violations with its neighbors (the min-conflict heuristic). If at any tick none of the nodes changes its color then we stop, since a coloring has been found.
  7. Add a test button which performs a more extensive test. Specifically, for the given number of nodes and edge ratio, it will generate 100 graphs of both the random and preferential attachment types and run the hill-climbing algorithm, plotting the number of time steps it took to find a solution in a histogram: one for random and one for preferential attachment. You will need to set an arbitrarily large cutoff for the 'does not stop' case.
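The two trickiest pieces above are the degree-proportional node selection of step 3 and the min-conflict recoloring of step 6. Here is a Python sketch of both (your model will of course use the NetLogo link primitives instead; the function names and data layout here are my own):

```python
import random

def pick_by_degree(degrees):
    """Choose a node index with probability degree / sum of all degrees.
    The denominator is twice the number of edges; assumes at least one
    edge exists, otherwise every node has probability zero."""
    total = sum(degrees)
    r = random.uniform(0, total)
    acc = 0
    for node, d in enumerate(degrees):
        acc += d
        if r <= acc:
            return node
    return len(degrees) - 1

def min_conflict_color(node, colors, neighbors, num_colors):
    """Return the color for `node` that conflicts with the fewest of its
    neighbors; a conflict-free color trivially wins."""
    def conflicts(c):
        return sum(1 for n in neighbors[node] if colors[n] == c)
    return min(range(num_colors), key=conflicts)

# Example: a 3-node line 0-1-2 where everyone starts with color 0.
# The middle node conflicts with both neighbors and switches colors.
colors = [0, 0, 0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
print(min_conflict_color(1, colors, neighbors, 2))  # -> 1
```

Note that min-conflict is a purely local rule, which is exactly why the distributed version can cycle and needs the 'does not stop' cutoff in step 7.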

Add your name and describe your results in the model description tab. Email me your .nlogo file by Wednesday September 21 @9am.

Tuesday, September 6, 2011

Game Theory in Practice

The Economist has a fun article on Game Theory in Practice about a few companies using agent-based modeling approaches to make predictions about real-world events. These are the same types of models that you will be building, except not as large, of course. Check out the video. Still, I think there is a large chasm to cross from saying that your model predicts the general dynamics of a population to predicting the fall of Hosni Mubarak within a year. Until someone puts his 'annual predictions' up on the web every year, for several years, so everyone can verify them, I shall remain a skeptic about our ability to make such fine-grained predictions.

Oh, and that same issue features pmarca's article on how software is eating the world, which he first mentioned in his WSJ piece. Indeed. Even hardware is software now (cf. 3D printers).

Game Theory for Agent Systems

This week we start our study of game theory as used for building multiagent systems.

Thursday, August 25, 2011

Distributed Constraints

On Friday we will start the next chapter which is concerned with constraint satisfaction and optimization. A lot of real-world problems can, in part, be reduced to a distributed constraint satisfaction problem.

Monday, August 22, 2011

Farmers and Cows

You can grab a copy of the model I built in class. For fun, see if you can make the cows die of starvation and reproduce when well fed.

Saturday, August 20, 2011

HW1: Trackers

Our first project is a simple multiple-tracker multiple-target distributed allocation problem. Since you are learning NetLogo as you do this, I am breaking it up into steps that you can do sequentially while learning different aspects of NetLogo. The finished model consists of randomly moving targets that emit 'utility signals', which are captured by the trackers near them.

  1. Start by creating a target breed. You will have a slider to choose the number of targets we want to create (100 by default). The setup button will create that many targets. The targets are all circles (set shape "circle") and red.
  2. Create the go button and procedure. You will have a regular and a forever version of the go button. The go button makes the targets move randomly. They do this by changing their heading by a random amount (+- 45 degrees) and then moving forward 1.
  3. Change the field so that it does not wrap-around (right-click on it then 'edit'). Make the targets bounce off the walls. Search the "Programming Guide" for this bit of code:
    if patch-at dx 0 = nobody [
      set heading (- heading)
    ]
    if patch-at 0 dy = nobody [
      set heading (180 - heading)
    ]
    which implements bouncing.
  4. On startup, all patches turn white except 100 random patches turn green. The targets bounce off the green patches.
  5. Create a tracker breed, and a slider to choose how many to create (100). The tracker can only see targets that are less than 5 away and follows the nearest one but can only move forward by .8 (towards min-one-of in-radius). Run an experiment with 100 trackers and 100 targets.
  6. At every tick, each target emits 1 unit of utility which gets divided equally among all the trackers that are within 5 of it. Add a plot to show the mean utility received by the trackers over time. Run the experiment again for several thousand steps. You should see the mean utility decrease over time.
  7. Finally, an open problem, change the behavior of your trackers so the mean utility does not decrease so much. Try not to violate the constraints that the trackers can only see up to a distance of 5 around them, or communicate with anything outside that radius, or any of the other 'physical' constraints of the problem as described.
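The utility-sharing rule in step 6 is worth pinning down precisely. A Python sketch of the computation (the function name is mine; the radius of 5 and the 1 unit per target are from the problem statement):

```python
def mean_tracker_utility(targets, trackers, radius=5.0):
    """Each target emits 1 unit of utility per tick, divided equally
    among the trackers within `radius` of it; targets with no nearby
    tracker waste their utility. Returns the mean over all trackers."""
    utilities = [0.0] * len(trackers)
    for tx, ty in targets:
        near = [i for i, (x, y) in enumerate(trackers)
                if (x - tx) ** 2 + (y - ty) ** 2 <= radius ** 2]
        for i in near:
            utilities[i] += 1.0 / len(near)
    return sum(utilities) / len(trackers)

# Example: two trackers both within 5 of a single target split its
# 1 unit of utility, 0.5 each, so the mean is 0.5.
print(mean_tracker_utility([(0, 0)], [(1, 0), (0, 2)]))  # -> 0.5
```

The division by the number of nearby trackers is what makes crowding around one target unprofitable, which is the effect your step-7 strategy should exploit.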

Notice that, as a system designer, you might be concerned with utility loss, the total utility emitted by the targets that is not harvested by anyone, or you might be concerned with social welfare, that all trackers receive roughly the same utility. On the other hand, as the owner of a tracker you might only be concerned with how much utility your tracker consumes. So, if you want to minimize utility loss but cannot control the actions of the trackers, what do you do? You don't have to answer that for this homework, just think about it.

Update (changed date): This homework is due Wednesday, September 7 @9am; just email me the .nlogo file. Make sure to use the "Info" tab to include your name and email, as well as a short explanation of the strategy you used for step 7.

Friday, August 19, 2011

Agent Based Models and the Open ABM

Agent Based models are used in many other scientific disciplines and the techniques, both programming and mathematical, that you will learn in this class can be used in many domains. The Open ABM consortium is a place where scientists from various disciplines can share their models and experiences in building models. They can also learn from each other. You should browse their model selection to see the range of ABMs that others have built, and maybe get some inspiration for your final project.

Thursday, August 18, 2011

Multiagent Models

We start the semester with a quick historical overview of the field and move on to multiagent models. We use the word model in its scientific sense, to mean a mathematical abstraction that captures some aspects of real-world phenomena. You should read the first chapter (Models) of the textbook.

Tuesday, August 16, 2011

Fall 2011 Class

This blog will now become the official site of my multiagent systems class.

Check out our syllabus for grading and other information. The class will follow my textbook but you can also consult the Shoham and Leyton-Brown textbook for another view of the same material. Once we finish the book we will spend a few weeks going over recent publications in the field.

I like to focus on algorithms and agent-based modeling so, to give you a hands-on experience, we will be using NetLogo. In particular, most of the problem sets involve implementing a multiagent model in NetLogo. You can see my own gallery of NetLogo models for the type of work we will be doing.

If you are interested in this type of research, check out my papers.