Thursday, December 30, 2010

Face Detection and Face Matching

We recorded a sample by having 4 people sit and shake their heads :-) it was fun! This was done in the computer vision laboratory of DSCE college with the help of Prof. Ramesh Babu.

I am now at the stage of figuring out how to grab the co-ordinates of the students' faces in the classroom.

1) We could sample the frames at 1 frame per second (I think this should be more than enough).

2) After that, we should detect the faces. This is the stage where we may have to give up on some faces which are either concealed or not clearly visible. We cannot expect to have all the faces in our frames.

3) We may want to do both face detection and face matching here (the reason is obvious, ain't it?)

I hit on this nice package for face detection.


You can upload a picture and see what happens (either upload a picture or paste the link).


This algorithm does a great job provided all the faces are clearly visible and the picture is of good resolution. The algorithm is available as a Python API client library, which I figured out how to use today.

I am planning to assign this task to two of my bright friends, Vijay and Vijesh.

 

Thursday, December 23, 2010

The Plan so Far



We are going to split our work into three components:

1) Capturing the data from a live classroom

2) Running a face detection algorithm on all the frames in the video and generating a matrix of audience face locations.

3) Analysing the data thus obtained.

We are going to capture a video of a set of 40 people listening to a lecture of 60 minutes duration. We will provide our audience with a sheet of paper and ask them to rate the lecture every 5 minutes for 60 full minutes.

We will then see whether there exists a correlation between the audience rating every 5 minutes (average rating of all audience) and the head movement captured in the camcorder.
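As a sketch of that correlation check, a plain-Python Pearson correlation between the per-window average ratings and a per-window head-movement score might look like this (all numbers below are made up purely for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# 12 five-minute windows over a 60-minute lecture (hypothetical data):
avg_ratings = [8, 8, 7, 6, 5, 5, 4, 4, 5, 6, 6, 7]  # audience ratings per window
head_motion = [1, 1, 2, 3, 5, 5, 7, 7, 5, 4, 3, 2]  # head-movement score per window

r = pearson(avg_ratings, head_motion)
# If the hypothesis holds, r should come out strongly negative:
# more head movement in the windows where ratings drop.
```

With real data one would likely reach for `scipy.stats.pearsonr`, which also reports a p-value for the 12-point sample.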

There are a couple of issues involved in this experiment:

a) How do we ensure that our audience rate the speaker every 5 minutes without fail?

b) Will there be a lot of head movement when the audience put their head down to rate the speaker every 5 minutes?

Your thoughts and inputs are invited....

Follow up Posting Based on Comments

This is a follow-up posting based on the comments I received on my previous posting. I have framed it as an FAQ:

Q) What is Captivation Quotient?

A: We measure this on a 10-point scale. I define it the reverse way: the CQ is 0 if everyone is listening without any head movement, and the score increases towards 10 as head movement increases. I am still not sure what situation should amount to a 10 on 10. Maybe you guys can suggest one :-)

Q) What if students are sleeping or what if a student is nodding his head just because he is understanding every bit?

A: In a given classroom, it is very improbable that everyone is sleeping, so we can discount this case. The way the audience nod their heads when they understand the lecture is quite different from the way they move their heads when the class gets boring. I believe we should be able to detect the difference quite easily.

Q) What if students are moving their heads constantly just because they are taking notes from the slides?

A: Again, the same reasoning holds good. There is a pattern to how we move our heads when we write down notes or nod to acknowledge that we are following the speaker. Our goal is to filter out such patterns in our data.

Monday, November 15, 2010

The Current Team


Punitha Swamy : punithaswamy@gmail.com

Ramesh Babu : bobrammysore@gmail.com

Sudarshan Iyengar : sudarshaniisc@gmail.com

Gauging the Reception of a Lecture

(This blog article is to be polished further, this is version 0.0)

The audience are anxiously waiting for the commencement of the Nobel laureate's lecture at IISc. The lecture finally begins and everyone is captivated to see how the speaker introduces the topic. The first five minutes of a talk (any talk, for that matter) receive the most attention from the audience, thanks to our minuscule attention span. We keep our spine erect and chin up with our eyes focused on the PowerPoint slides, evaluating the oratory skills of the speaker coupled with an analysis of his body language and his accent. We wouldn't like it if the person in the next seat were meddling with his PDA, let alone a phone ringing. I always wished there was a lecture-sensing technology on our mobiles which would put all the mobiles on silent mode (more particularly in a movie theatre), and another leap in engineering, namely boring-lecture-sensing technology, which would make all the mobiles ring in unison once the lecture gets unbearably boring. Man, I have high hopes for 2010 technology, but how I wish!

I normally occupy the last seat in a lecture (more so when the lecture is at IISc, particularly when it is given by a prof from IISc, and I am all the more particular about my love for the last seat if it is Prof. _____________________; I have left a blank for us to fill in our favourite prof's name).

As I sip the tea, munch the biscuits and listen to the speaker, for the first five to ten minutes I am spine-erect, chin-up, eyes-focused and there is nil head movement. But 15 minutes into the talk, I feel I am lost (damn! the speaker neither feels nor senses the reception quotient of his audience). I open up my mobile phone, thankful that I can at least use it, contrary to what I wished at the beginning of the lecture quoting some xy-sensing-technology. I start SMSing, playing, browsing or taking pictures of the speaker (as though I will never get another chance to attend his lecture again in my life).

Rajeev would then start talking to the friend sitting next to him, cracking a joke; Anjaneyulu, with his dish-antenna spectacles, would be discussing his ideas with the person sitting next to him, flaunting that he is the only one understanding the lecture. Miss Stella would turn around to see whether people are listening or looking at the roof while sipping the free tea with biscuits; she would then giggle and feel happy that she isn't the only one not understanding.

At this stage of the talk, the auditory cortex of our audience would shut down completely and their visual cortex would turn hyperactive. This is the situation when, if someone took a snapshot of the entire audience from the stage (the way Deepika Padukone takes a panorama shot of her audience in a TV ad), one would wonder after seeing the photo whether there was anyone on the stage presenting at all.

Cut the crap: is there a possibility of us tracking the head movement of our audience and rating the captivation quotient (CQ: how well our speaker is keeping our audience captivated)? Can this be done live, with a camera kept on the stage which would continuously monitor the head movement of the audience and finally plot the captivation quotient vs. time?

Let us use a 10-point scale: if there is absolutely no head movement, then the captivation quotient is 0; if there is a whole lot of head movement, then the captivation quotient is 10.

If the talk is an hour long and we could plot the captivation quotient vs. time, this would be a great yardstick to measure the reach of the talk...
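A minimal sketch of how raw per-window head-movement measurements could be mapped onto this 0-to-10 scale (the function and variable names here are my own, and a simple linear scaling is just one choice):

```python
def captivation_quotient(movements, max_movement=None):
    """Scale per-window head-movement magnitudes linearly onto 0..10.

    0  -> no head movement at all in that window
    10 -> the largest movement observed (or an assumed ceiling).
    """
    ceiling = max_movement if max_movement is not None else max(movements)
    if ceiling == 0:
        # Nobody moved at all: every window scores a perfect 0.
        return [0.0 for _ in movements]
    return [round(10 * m / ceiling, 1) for m in movements]

# One raw movement value per time window (hypothetical numbers):
cq_series = captivation_quotient([0, 2, 5, 10, 4])
```

Plotting `cq_series` against the window timestamps would give the CQ-vs-time curve described above.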


Simply stated:


If there is a lot of head movement, then that indicates restlessness and that would be a result of a boring lecture. If there is no head movement, then that signifies that everyone is concentrating, which means that the lecture has captivated the interest of the audience.


Any ideas? Do you think the hypothesis that "the interest of the audience in the content of the talk is correlated to their head movement" is right?