Email:
Add [CS698F] to the subject line; emails without it will be ignored.
Class time and location: Wed & Fri 15:30--17:00 @ KD-314
Office hours: By appointment.
- Answer sheet viewing 19-Nov-2017, 20:00 -- 20:30 @ KD314.
- End semester exam 18-Nov-2017, 16:00 -- 19:00 @ KD314.
- Course project presentations and demos 15 and 17 Nov 2017 in class.
- Course project report and code due 14-Nov-2017, 23:59.
- Assignment-2 presentations 1 and 3 Nov 2017 in class.
- Assignment-2 paper selection due 20-Oct-2017, 23:59.
- Course project topic with brief proposal due by Aug 25, 11:59pm by email.
- Assignment-1 papers and topics due by Aug 30, 11:59pm by email.
- Class presentations on Sept 6 and 8.
- Class location changed from KD 101 to KD 314.
With the growth of the web, problems surrounding "big data" have become central to many "industrial strength" solutions, which tackle data at a scale beyond the capabilities of prototypical solutions, e.g., billions of data items such as graphs, images, videos, and documents. Much of this data is semi-structured (graphs, e.g., Facebook, Twitter, LinkedIn networks), unstructured (videos, images, text documents, etc.), or a mixture of the two. Hence the challenges of storage and query processing over this data differ from those of traditional relational database systems, which focused on strictly structured data, even though many of the robust database features for storage and indexing form the core of the new solutions.
Following are the broad objectives of this course:
- This course will first give an overview of traditional data management and
query optimization techniques.
- It will then focus on methods for query processing over mainly "graph shaped"
data, including centralized and distributed solutions (Hadoop, Spark, and others).
- We will read research papers from the top conferences.
- The course will carry a large project component.
- The instructor will also introduce open (challenging) problems in both the theoretical
and practical domains of "graph data management", to give directions for further
research with a purely theoretical focus, a purely practical focus, or a combination of the two.
This is especially useful for developing a postgraduate research plan.
Prerequisites: UG-level course in DBMS; knowledge of UG-level data structures and algorithms;
good knowledge of programming.
- Assignments (2 nos) -- 20%
  - Assignment-1: first week of September
  - Assignment-2: last week of October
- Mid-semester: 20%
  - Presentation of literature survey for course project
    and course project intermediate demo
- Course project implementation: 30%
- Course project report: 10%
- End-semester (in-class written): 20%
  - Questions on papers read through the semester
    and topics covered in the class
There is no dedicated textbook or reference material. Lecture slides/notes will cover the required material, and the instructor
will provide pointers to any other material that students need to read.
Day, Date | Topic | Notes | Readings/References |
Wed, Aug 02 | Introduction | Lecture1 | |
Fri, Aug 04 | Recap | Lecture2 | |
Wed, Aug 09 | Relational Algebra and Query rewriting | Lecture3 | Relevant chapters in standard DBMS book. |
Fri, Aug 11 | Query plans and Indexes | Lecture4 | Relevant chapters in standard DBMS book. |
Wed, Aug 16 | Types of Joins, Cost estimation | Lecture5 | Relevant chapters in standard DBMS book. |
Fri, Aug 18 | Cost estimation remaining, Intro to graphs | Lecture6 | |
Wed, Aug 23 | Graph pattern queries | Lecture7 | |
Fri, Aug 25 | Graph pattern queries (contd.) | Lecture8 | |
Wed, Aug 30 | Cyclicity, Acyclicity, Compression | Lecture9 | BitMat, RDF3X, Compressed bitmaps, Semi-joins proofs |
Fri, Sept 1 | Finish query processing, Intro to Distributed Data Management | Lecture10 | |
Wed, Sept 6 | Distributed Data Management, Assign-1 presentations | Lecture11 | Presentations |
Fri, Sept 8 | Distributed Data Management (contd.), Assign-1 presentations | Lecture12 | Presentations |
Wed, Sept 13 | Assign-1 remaining presentations | | Presentations |
Fri, Sept 15 | Distributed joins (contd.) | Lecture13 | |
Wed, Sept 20 | Midsem exam presentations | | Presentations |
Wed, Oct 4 | Recap of things till midsem | Lecture14 | |
Fri, Oct 6 | Cancelled (instructor unwell) | | |
Wed, Oct 11 | Map Reduce framework | Lecture15 | |
Fri, Oct 13 | Joins with Map Reduce | Lecture16 | |
Wed, Oct 18 | Multi-Joins with Map Reduce | | |
Fri, Oct 20 | Special topic -- Reachability queries | Lecture18 | |
Wed, Oct 25 | Special topic -- Reachability queries (contd.) | Lecture19 | |
Wed, Nov 1 | Assignment-2 presentations | | Presentations |
Fri, Nov 3 | Assignment-2 presentations | | Presentations |
Wed, Nov 8 | Keyword search on database/graphs | Lecture20 | |
Fri, Nov 10 | Review | | |
Wed, Nov 15 | Course project presentations and demos | | |
Fri, Nov 17 | Course project presentations and demos | | |
Sat, Nov 18 | Endsem Exam @ KD314 | 16:00 -- 19:00 | |