Add Me!Close Menu Navigation

Problem Solving with Lua

Chapter 11. The project team problem part 1

Let’s pretend for a moment that you’re working as an undergraduate research assistant for one of the most brilliant professors at your school, which makes you more or less a glorified secretary/coffee dispenser. On the first day of class, you are tasked to divide up his CS101 class into pairs of student who will be working together in order to implement an awesome Pokemon clone. Great, yet another boring and unamusing problem to solve. However, soon you find that the problem wasn’t as easy as you originally thought it to be.

The first part of this problem solving series will deal with problems that may not seem intuitive at first, but will have extremely simple and elegant algorithms.

This is part 3 of the Lua tutorial series

Chapters 1–5 → An introduction to Lua.
Chapters 6–10 → The Table Tutorial.
Chapter 11 → The project team problem part 1

1. Introduction

It’s the beginning of a glorious new school year and you’re doing your usual thing as professor Kleinberg’s assistant, sitting in the fluorescent lit basement of a building oddly reminiscent of a grounded Titanic while waiting for the coffee machine across from you to finish buzzing. Suddenly, you hear a flurry of footsteps in the hallway and you perk up, anxiously awaiting for someone to shout “quick, follow me, there’s not enough time to explain,” only to realize that other people exist in the world who need to use the computer lab and that your life isn’t and will never be an action movie.

When the coffee finally finishes brewing, you found yourself inside a spacious room on the fourth floor. You quietly reassure yourself yet again that this is indeed the correct room, for you’ve already made the mistake of confusing the Kleinberg brothers earlier that week that resulted in a lot of embarrassment. While you were daydreaming, the professor came into the room and announced that he has a very special job for you. You immediately snapped back into the present. “Damn,” you thought to yourself, “I should have thought twice before giving up hope on an action movie life.” Feeling extremely badass all of a sudden, you decide to try on the pair of sunglasses that you’ve always carried around in your jacket in case your life ever becomes badass.

“What’s the problem doc?”

“Err…” the professor begins with a quizzical expression “well, I need you to organize my intro CS class into student pairs to work together on their projects, but I can see you’re a bit busy, I’ll just ask insert name of your rival

*sigh* “No, it’s all good”, you sullenly take off your shades. “I’m really not busy right now”

Problem Statement

“Anyways,” explains the professor, “I’ve collected a single time interval (due to the limitation of the survey software) during which each student is free to work, and from the doodle poll I’ve sent out, all students prefer to work in pairs rather than by themselves, so I want you to find as many pairs of students that have at least some overlapping free time in order to work together.”

More formally, if we’re given a set of students S such that for each student S_i, that student is free to work during the time interval [a_i, b_i] (start at a, end at b). Two students can be paired together as long as their time intervals overlap. You are asked to find as many such disjoint pairs as possible.

2. Let’s start with an easy problem first

In order to solve this problem, let’s consider an easier problem first, since we need to be able to start off somewhere along the way. Let’s consider the problem of organizing these student pairs, only that we don’t have to find the maximum number of pairs. Instead, we’re only interested in finding some/any pairing such that the remaining students cannot be paired up with any of the other remaining students. Now the reader may get a bit confused here. If we come up with a way to match students such that the remaining students cannot be matched up with any of the other remaining students, then wouldn’t that matching be maximum anyways?

Not exactly, and bear with me here because working through a few examples will almost always help clear away enough problems to give you a pretty good intuition of how to solve the problem. Take for example, the following four students Bob, Alice, John, and Walt.

Bob Alice John Walt
1:00 to 4:00 3:00 to 4:00 1:00 to 2:00 1:30 to 2:30

Well, one way of doing this is to just match Bob with Alice, and John with Walt. Since Bob is free from 1 to 4 while Alice from 3 to 4, they can easily be matched. Similarly, since John is free from 1 to 2 while Walt from 1:30 to 2:30, they can work together from 1:30 to 2.

How did I come up with that pairing? I just took a single unmatched student (Bob), stuck him onto a list, took another unmatched student (Alice) and check whether she can be paired with anyone already on the list (Bob’s all like, me me me). I then did the same thing with John, who unfortunately can’t join the Bob-Alice team since Bob and Alice are already on the team, so he starts his own team that’s way better than Bob-Alice and his team gets put onto the list. Finally, Walt comes waltzing in (just shoot me) like he owns the place and he finds that, hurrah, John’s already there waiting for him.

This then gives a rather natural method to pair up the students:

while there are still students that haven't been looked at
    pick a student S arbitrarily (and mark him so that he has been looked at now)
    if there exists any teams that lacks a student and can have student S in the team then
        put student S into that team
    otherwise
        create a new team with only student S

report only the full teams 

One of the fundamental tools of problem solving is the ability to rationalize why your method is correct. Here, we may want to show that this method satisfies the two properties that we’re seeking in order to convince ourselves that our method is sound.

A quick sketch of proof of correctness

We usually don’t need to be so formal, I will discuss other forms of weaker verification that are ideal for larger projects. For now, we need to show that

  • Each team consists of only two partners
  • There can be no more available pairings left
(2.1.1): Each team consists of only two partners

This is actually really easy to see. Our method only allows the teams to have two configurations (because if a team already has two partners, the method will never add any further students into that team), a team can be either a pair of student or a partial team with only a single student. Furthermore, at the end of the run, we only return those teams that have two students in them, hence it can be easily seen that after our method returns, the reported teams can only consist of pairs of students. \Box

(2.1.2): There are no eligible pairs left at the conclusion of our method

This is slightly less obvious than (2.1.1) (since 2.1.1 is from the definition of how we defined our method) but is still pretty easy to see. Let’s pretend just for the moment (humor me at least) that after running through our method, there are a few unmatched (“free”) students left, let’s just for now call two of these students Ash and Gary. In order for Ash or Gary to be unmatched, only two possible scenarios could have occurred. We either examined Gary first, found that he has no overlapping free time with anyone else who were already “free”, and in the end, no one else had any overlapping free time with Gary, or if they did, we put them on a different team. This suggests that of the people we examine after Gary, only those that do not have overlapping time with Gary can possibly be unmatched at the end of the run. Alternatively, the exact same thing happens for Ash. Since we can’t simultaneously examine both Gary and Ash at the same exact time, it must be the case that we examined Gary first and then Ash, or we examined Ash first and then Gary. In either case, by examining Gary and proclaiming Ash free, or by examining Ash and proclaiming Gary to be free, then it must be the case that Ash doesn’t have overlapping free time with Gary or Gary doesn’t have overlapping free time with Ash. Since these are the only possible order of examining the students (Gary then Ash or Ash then Gary), these are the only possible outcomes, and in both, we can conclude that Gary doesn’t have overlapping free time with Ash (Since Gary doesn’t have overlapping free time with Ash is the same as claiming Ash doesn’t have overlapping free time with Gary), for all Gary and Ashes in the remaining pool of unmatched students (can you blame poor Oak for asking for Gary’s name?). \Box

Translating the method into Lua

We’ll forgo the efficiency analysis for now, with the observation that we obviously don’t have to look at every single possible partition of the input in order to terminate, so by some standards commonly shared by theoretical computer scientists, our solution is pretty darn efficient.

Anyways, we now need to figure out a way to represent our problem in Lua. One way of doing this is to have a simple list of time intervals such that the index of each interval within that list are the identifiers of the students. For example, an instance of the above example can be written as

students = {{1,4},{3,4},{1,2},{1.5,2.5}}

We now translate the algorithm into concrete Lua line by line.

function create_random_group(students)
    local teams = {}
    local id = 0
    ...
    return teams
end

The first line

while there are still students that haven't been looked at

We can make this check using a while loop, checking if the students table is empty yet or not.

    while #students ~= 0 do

In this analogy, the act of “looking at” a student is equivalent to removing that student from the students list.

Next

        pick a student S arbitrarily (and mark him so that he has been looked at now)

Here, we can define what arbitrary is, the easiest of which is just the first student in the list. As we’ve previously noted, marking a student as having been looked at is equivalent to removing him from the student list.

        local S = table.remove(students,1) -- table.remove returns the first element and then removes it
        id = id + 1 -- since we remove first, id just increments

Next

if there exists any teams that lacks a student and can have student S in the team then

This is slightly misleading, as we don’t need just a single if check, we need to go through every single team so far and check each individually for whether the above condition holds.

for i,team in ipairs(teams) do
    if team lacks a student and our S can be matched in team then
        ...
    end
end

We now need to translate the conditions in the if statement

team lacks a student

This is equivalent to saying that team only contains a single student

#team == 1
and S can be matched in team

We can write this as its own function after we’ve figured out how to represent each team.

match(team[1], {id, S})

We now need to figure out how to represent each team. Since we’re removing students away from the students list, we need to keep track of both the student’s index and his time interval. We could easily do this as a triple. For example, if our students consisted of {{1,4},{3,4},{1,2},{1.5,2.5}}, and a team consisted of student 1 and 2, this team will look like

team = {{1, {1,4}}, {2, {3,4}}}

We can now write the match function. A student matches with another student if and only if their time interval overlaps.

function match(a, b)
    -- a and b looks like {id, {start, end}}
    a = a[2]
    b = b[2]
    -- make a always be the one with the earlier start time
    if b[1] < a[1] then
        local t = b
        b = a
        a = t
    end
    -- if a always starts before b, then in order for a and b to have overlapping interval
    -- it must be that a.end >= b.start
    return a[2] >= b[1]
end

and there we have it, I’ll leave it as an exercise to the reader to finish the full implementation, including the removal of partial teams before returning.

3. Let’s now consider the full problem

Well, we’ve gone on for quite a bit longer than expected so I’ll same the majority for the next part, but let’s see why, as claimed previously, that just randomly choosing students to pair up until it becomes impossible to find any more matches will not give you the maximum number of student pairs.

Why doesn’t arbitrarily choosing students work?

Consider the same exact problem as before

Bob Alice John Walt
1:00 to 4:00 3:00 to 4:00 1:00 to 2:00 1:30 to 2:30

but for some reason, we decided to look at the students in this order {Bob, John, Alice, Walt}. Convince yourself that using the original method, we are no longer guaranteed to find the optimum.

Hint: Bob gets matched to John, what happens next?

Let’s consider a few false starts

The example above suggests that the processing order has something to do with the optimality of the solution. You should try a few of the following methods and find ways under which they do not fulfill being the optimum.

  • Sort the students by decreasing start time
  • Sort the students by decreasing length of interval
  • Sort the students by decreasing finish time
  • Same as above, but with increasing order
  • Sort the students by increasing start time or interval, and for each student, if there are already full teams that overlaps and if any of them start earlier, then break that match and match the current student to it

We’ll look at a method that will find the optimum next time.

Posted By Lee

Woosh

  • Pingback: Chapters 1–5 → An introduction to Lua. | CodingSnake()

  • Jake

    These are really useful thanks, just starting the problem solving one! it would be awesome if you could continue these :D

  • Paul Rivieras

    We’ll look at a method that will find the optimum next time.
    23 May, 2012

    Am, I the only one still waiting for the continuation after 4 years?