GSoC 2017 - Week 5

Ah, I’m writing this blog post after a little delay. Unfortunately, I got flu and fever. Now I’m feeling a lot better, though a bit weak. I’m feeling excited to share my designing journey of the project.

Writing a software is easy, writing an efficient software is tough. To me writing a good and efficient software is important. Thus before starting the implementation for hierarchy storage and querying over the same data, I checked out various implementations for hierarchy storage in the database system.

Considering the use case of our MediaWiki application, where our tables get recreated each time the template description is changed or more pages (with specific template call), I shortlist two approaches.

  1. Adjacency list
  2. Nested set (ruled out nested set with floating values)

Both techniques have their own positives and negatives. Adjacency list being easier to create, but the queries become expensive as compared to nested set implementation for the 4 operations as discussed in the last blog post.

My mentor, Mr Yaron, suggested me to create a document comparing the two techniques. For an exhaustive analysis, and to ensure that I don’t miss out little caveats, I decided to implement both the models of storage and perform those specific 4 tasks and evaluate their performance. This task took 2 days to complete, but it was worth the effort. Coding both the implementations helped me clear some of the misconceptions regarding perceived supreme performance of Nested Set. I had first thought that I could do without creating a tree data structure in case of Nested Set for implementing some of those tasks - like “HOLD WITHIN” and “Drilldown”. But while implementing I realised that I cannot. I also tried to substitute recursive operations with the use of stack and queues for preorder traversal and breadth first search involved in several operations of hierarchy structure table and queries. To improvise the performance of the methods, I used SplStack and SplQueue class to prevent recursive functions. It was difficult to find that these classes exist. For a moment I thought I will have to use some other libraries from Github, which are not the part of core PHP.

I have summarised my finding in the spreadsheet.

comments powered by Disqus