
Course Information

Lectures: Thu - 11:00 AM, Fri - 12:00 AM
Labs: Fri - 5:00 PM

Objectives

This course lays the foundation for students to build multimedia systems. Multimedia systems involve automated analysis and fusion of multiple types of data such as text, images, video, audio, and readings from various sensors. The course covers state-of-the-art tools and techniques for multimedia content processing, compression, fusion, summarization, search, and retrieval, applicable to areas such as social media, homeland surveillance, and privacy. The objective of this course is to prepare students to develop systems using the multi-source information that is commonly and readily available as Big Data in the Internet of Things and Smart Cities paradigms.

Outcomes

By taking this course, students will be able to answer the following questions:
How to capture, analyse, and compress multimedia (text, audio, image, and video) data?

How to fuse multimedia data to build multimedia systems?

Prerequisite

CSL201 (Data Structures) for CSE B. Tech Students

Course Requirements

Students are required to attend two lectures per week. In addition, there will be weekly lab sessions. During lab sessions, students are required to solve and implement programming assignments.

Grading Policy

There will be lab exercises, homework assignments, quizzes, a mid-semester exam, a final exam, and a project. The tentative grade distribution is as follows:

Quizzes (top n-1): 10%
Lab Exercises (top n-1): 20%
Mid-semester exam: 20%
Final exam: 20%
Project: 30%
A student must score at least 33% marks to pass the course.
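
As a rough illustration of how the distribution above combines into a course total, here is a minimal Python sketch. It assumes that "top n-1" means dropping the single lowest score and averaging the rest with equal weight, and that every score is expressed as a percentage; neither detail is stated in the policy, and the function names and sample scores are hypothetical.

```python
def best_n_minus_1(scores):
    """Average of the scores after dropping the single lowest one.

    Assumption: "top n-1" means drop the lowest score, weight the rest equally.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


# Component weights from the tentative grade distribution (percent of total).
WEIGHTS = {"quizzes": 10, "labs": 20, "midsem": 20, "final": 20, "project": 30}


def course_total(quizzes, labs, midsem, final, project):
    """Weighted total out of 100; every input score is a percentage (0-100)."""
    parts = {
        "quizzes": best_n_minus_1(quizzes),
        "labs": best_n_minus_1(labs),
        "midsem": midsem,
        "final": final,
        "project": project,
    }
    return sum(WEIGHTS[name] * parts[name] / 100 for name in WEIGHTS)


# Hypothetical student: three quizzes and four lab evaluations.
total = course_total(quizzes=[40, 80, 60], labs=[70, 90, 50, 80],
                     midsem=55, final=65, project=75)
print(round(total, 2), "pass" if total >= 33 else "fail")
```

The actual computation is at the discretion of the instructor; the sketch only shows that the weights sum to 100 and how the 33% pass mark would apply to the weighted total.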

Attendance Requirement

There is no attendance requirement; however, students with more than 75% attendance will be considered punctual for future recommendations. During lectures:
BE SHARP ON TIME

STAY THROUGH THE LECTURE (DON'T LEAVE IN-BETWEEN THE LECTURE)
Students are advised not to engage in any activity during the lecture that might disturb other students or the instructor.

Code of Ethics & Professional Responsibility

Discussions that help the student understand a concept or a problem are encouraged. However, each student must submit original work. Plagiarism or copying in any form will be met with strict disciplinary action. This includes copying from the Internet, textbooks, and any other material for which you do not own the copyright. Copying code (or part of the code) from others, or lending it to them, will be considered plagiarism too. If authorized by the instructor, code reuse is allowed with explicit reference to the source. Students who violate this policy will directly receive an F grade in the course. Remember: a partial submission can fetch you some points, but submitting others' work as your own can result in you failing the course. Please talk to the instructor if you have questions about this policy. All academic integrity issues will be handled in accordance with institute regulations.

Textbooks

Primary Textbook

There is no single textbook for the course. We will rely heavily on web sources for the content. A few possible reference books are given below:

Reference Books

Fundamentals of Multimedia, Authors: Ze-Nian Li, Mark S. Drew, and Jiangchuan Liu, Publisher: Springer, Year: 2014.

Language/Tools

Primarily Python

Teaching Assistant

Pratibha Kumari (2017csz0006@iitrpr.ac.in)

Contact Me

By appointment at Room No. 319, S. Ramanujan Block, Permanent Campus, IIT Ropar [offline] or mukesh@iitrpr.ac.in [online].

Tentative Topics

Audio: Sampling, quantization, time-domain audio features (ZCR, Energy), frequency-domain audio features (MFCC, Spectral), windowing and spectrogram, pitch detection, speaker recognition (GMM and HMM), audio fingerprinting and alignment.
Text: Bag-of-words (BoW), TF-IDF, Text clustering, Bottom-up and top-down clustering, n-grams, sentiment analysis.
Image: Image representation, HoG, SIFT, SVM, ANN, CNN.
Video: Motion vectors, Foreground detection using Adaptive Gaussian Mixture Model (AGMM), Object tracking using particle filters.
Information Fusion/Case Studies
Compression [if time permits]: MP3, MP4, JPEG, Text compression

*The topics may not be covered in the same order.
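
To give a flavour of the time-domain audio features listed above (ZCR and Energy), here is a minimal pure-Python sketch. It is not course material: the frame length, hop size, and the synthetic 440 Hz test tone are assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math


def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames (incomplete tail dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


# Synthetic input (an assumption for illustration): 1 s of a 440 Hz sine at 8 kHz.
sr, freq = 8000, 440
tone = [math.sin(2 * math.pi * freq * n / sr) for n in range(sr)]

# Print both features for the first three 50 ms frames (25 ms hop).
for frame in frames(tone, frame_len=400, hop=200)[:3]:
    print(round(zero_crossing_rate(frame), 3), round(short_time_energy(frame), 3))
```

For a pure tone, the zero-crossing rate is roughly 2 * freq / sr and the energy of a unit-amplitude sine is about 0.5, which makes the sketch easy to sanity-check before trying it on recorded audio.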

Quizzes

There will be 3 quizzes; the top 2 will count towards your grade.

Projects

Projects are to be done individually or in groups of at most two. A list of topics will be added soon. Project requirements:

You should have at least one data fusion component in the project. You need to clearly justify and demonstrate the advantage of using fusion.
The reports must be prepared in the ACM Multimedia LaTeX format. Good quality English is expected in the report.
The code should be submitted through a GitHub or Bitbucket repository. You can create a private repository and show it to me from your login. I will observe the activity on the repository (commits, etc.) to check progress.
The dataset can be submitted on a pen drive or through Google Drive.
You are free to use resources (code) available on the Internet with proper references. However, during evaluation you need to explicitly identify the parts that are your own work.
There will be marks for creativity/novelty in the project.
There will be weekly evaluations of the project after the mid-sem exam.
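
One simple way to meet the fusion requirement above is late (score-level) fusion, where the class probabilities from two modalities are averaged before the final decision. The sketch below is only an illustration of that idea, not a prescribed method: the audio and video probabilities are made-up toy values, the equal weighting is an assumption, and all names are hypothetical.

```python
def late_fusion(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two per-class probability lists (score-level fusion)."""
    if len(prob_a) != len(prob_b):
        raise ValueError("both modalities must score the same classes")
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(prob_a, prob_b)]


def predict(probs):
    """Index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up class probabilities for four samples and two classes (toy data only).
audio = [[0.6, 0.4], [0.45, 0.55], [0.7, 0.3], [0.2, 0.8]]
video = [[0.8, 0.2], [0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]
labels = [0, 0, 0, 1]

audio_acc = accuracy([predict(p) for p in audio], labels)
video_acc = accuracy([predict(p) for p in video], labels)
fused_acc = accuracy([predict(late_fusion(a, v)) for a, v in zip(audio, video)],
                     labels)
print(audio_acc, video_acc, fused_acc)
```

On this toy data each modality misclassifies a different sample, so the fused decision is correct where either one alone is not. A real project would need to show such a comparison on a held-out dataset rather than on hand-picked values.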

Lectures and Calendar

Lectures	Topics	Events
L1-2	Introduction, Python basics	Lab1: Python basics
L3-4	Machine Learning Basics: KNN, K-Means, Naive Bayes, SVM
L5-6	Signals and Systems, Audio Basics, Time domain features	Lab: Audio classification
L7-8	Audio spectral features, MFCC, Artificial Neural Networks
L9-10	Audio Spectrogram, Audio Alignment, and Fingerprinting	Evaluation 1
L11-12	Speaker recognition using Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM)	Lab: Audio classification using ANN
L13-14	Text representation, Bag of Words
L15-16	Clustering, LBG, agglomerative clustering	Lab: Document clustering
L17-18	Sentiment Analysis, n-grams, topic modelling	Lab evaluation 2
L19-20	Image basics, HoG, SIFT
L21-22	Image analysis using Convolutional Neural Networks	Lab: Classification using CNN
L23-24	Video foreground detection using Adaptive Gaussian Mixture Model, Object Tracking using Particle Filters	Lab evaluation 3
L25-26	Information fusion techniques	Lab: BG/FG classification using AGMM
L27-28	Data compression techniques: MP3, JPEG, MPEG	Lab: Audio-visual fusion, Lab evaluation 4

*This is a tentative schedule. It may change as needed, at the discretion of the instructor.


Lab Exercises

There will be weekly lab sessions. Each weekly lab assignment will contain a practice component and a graded component. The TA will help with the practice component. The student has to complete the graded component independently and submit it within two days of the lab session. Any plagiarism incident will be reported to the academic section immediately. There will be four evaluations of 5 marks each, all conducted by viva. Each evaluation may cover one or two lab submissions.

CS507: Multimedia Systems Semester I, 2021-22