Dr Michael Shekelyan
Lecturer in Computer Science (T&R)
Email: m.shekelyan@qmul.ac.uk
Room Number: Peter Landin Building, 4th Floor
Website: https://shekelyan.science
Office Hours: Email me to arrange an appointment.

Profile

I am a Lecturer (Assistant Professor) in Computer Science. I am part of the theory group and aim to develop practical algorithms & data structures that can be used in a wide range of applications, as well as theoretical foundations that promise substantial practical impact. I participate in many research communities that are interested in learning from data. Most of my work has been presented at the go-to data management conferences (VLDB, ACM SIGMOD/PODS, IEEE ICDE, EDBT), but I also regularly serve as a reviewer for leading machine learning and data mining conferences (NeurIPS, ICML, ICLR, AISTATS, ACM KDD) as well as many journals. My main goal is to progress technology towards reliably and responsibly capturing crucial (factual) insights from data, i.e., reactive interfaces for data exploration (better algorithms, data structures & data approximations), and privacy-preserving analysis & machine learning (better theory, hypothesis tests & randomised algorithms).

Open Position (PhD Studentship)

Privacy-Preserving Algorithms: Unlocking Data Sharing for Medical Sciences & Machine Learning
PhD studentship: covers a 3-year stipend (currently £20,662 per year in 2023/2024) and fees
Supervisor: Dr Michael Shekelyan
Location: London, UK
Applications: open to UK home students (see TGC 5.2 Student eligibility) until 31st January 2024
More details and how to apply
Expected start date: September 2024

Description: Statistical analysis & machine learning approaches can only be as good as the underlying data, and harder tasks call for a larger collection of accurate data.
Collectively, health organisations, companies and service providers collect massive amounts of data from patients, employees and users, but on an individual level, a single hospital, company or service provider can only access its own smaller subset. Data sharing partnerships are challenging due to distrust, varying quality standards, legal obligations and ethical considerations aimed at protecting the privacy of the person whose confidential information is collected. Naive anonymisation techniques may remove names, dates and identifiers, but overlook quasi-identifiers that leverage contextual knowledge, i.e., rare combinations of facts that point to a particular person once additional information is considered. Sophisticated frameworks like differential privacy & federated learning employ randomised algorithms that satisfy formal privacy guarantees & privacy-enhancing technologies that safely aggregate data in a decentralised fashion using cryptographic techniques. The studentship can take many directions, such as mathematical foundations for privacy-preserving algorithms, intuitively explainable formal privacy guarantees, federated learning systems satisfying legal requirements, and other practical or theoretical solutions towards privacy-preserving data sharing.

Contact m.shekelyan@qmul.ac.uk for further information or questions about potential directions.

Undergraduate Teaching

Semester A
Lecturer for ECS519U Database Systems (Mile End campus)

Semester B
Module Organiser for ECS518U Operating Systems (Mile End campus)
Module Organiser for IOT518U Operating Systems (London City Institute of Technology)

Further Material for Operating Systems
Accompanying Quizzes and Links: https://fork.science
Recommended textbook: http://ostep.org/ (goes beyond lecture)

Further Reading for Database Systems

Introduction
[Wikipedia articles: Tabulating machine, Flat-file database, Comma-separated values, Database, Outline of databases.]
Hollerith, H. (1889). U.S. Patent No. 395,781.
Washington, DC: U.S. Patent and Trademark Office. https://www.census.gov/history/pdf/hollerith_patent_01081889.pdf
Idreos, S., Alagiannis, I., Johnson, R., & Ailamaki, A. (2011). Here are my data files. Here are my queries. Where are my results? In Proceedings of 5th Biennial Conference on Innovative Data Systems Research. https://stratos.seas.harvard.edu/sites/scholar.harvard.edu/files/cidr2011.pdf
Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. 2001. Database Systems: The Complete Book. Prentice Hall PTR, USA. Chapter “The Worlds of Database Systems”. http://infolab.stanford.edu/~ullman/fcdb/ch1.pdf
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8). Reading: Addison-Wesley. Chapter 1 - Database Systems. http://webdam.inria.fr/Alice/pdfs/Chapter-1.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Introduction to Databases”.

Navigational (IDS) & Hierarchical (IBM IMS) Database Models (1960s)
[Wikipedia articles: Navigational database, Integrated Data Store, Hierarchical database model, IBM Information Management System.]
Bachman, C. W. (1969). Data structure diagrams. ACM SIGMIS Database: The DATABASE for Advances in Information Systems, 1(2), 4-10. https://dl.acm.org/doi/pdf/10.1145/1017466.1017467
Bachman, C. W. The origin of the integrated data store (IDS): The first direct-access DBMS. IEEE Annals of the History of Computing, 31(4), 42-54. https://tschwarz.mscs.mu.edu/Classes/DB23/HW/bachmanIDS.pdf
IBM. Introduction to IMS - History of IMS: Beginnings at NASA. https://www.ibm.com/docs/en/zos-basic-skills?topic=now-history-ims-beginnings-nasa
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Introduction to Databases” - Section “History of Database Management Systems”.

Entity-Relationship (ER) Model
[Wikipedia articles: Entity–relationship model, Database design.]
Chen, P. P. S. (1976). The entity-relationship model—toward a unified view of data. ACM Transactions on Database Systems (TODS), 1(1), 9-36. https://dl.acm.org/doi/pdf/10.1145/320434.320440
Chen, P. P. S. (1997). English, Chinese and ER diagrams. Data & Knowledge Engineering, 23(1), 5-16. https://bit.csc.lsu.edu/~chen/pdf/ER_C.pdf
Chen, P. (2002). Entity-relationship modeling: historical events, future trends, and lessons learned. Software pioneers: contributions to software engineering, 296-310. https://bit.csc.lsu.edu/~chen/pdf/Chen_Pioneers.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Entity-Relationship Modeling” & Appendix “Alternative ER Modeling Notations”.

Relational Model, Relational Algebra & Relational Schema
[Wikipedia articles: Relational model, Candidate key, Foreign key, Relational algebra, Database schema.]
Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of the ACM, 13(6), 377-387. https://dl.acm.org/doi/pdf/10.1145/362384.362685
Astrahan, M. M., Blasgen, M. W., Chamberlin, D. D., Eswaran, K. P., Gray, J. N., Griffiths, P. P., ... & Watson, V. (1976). System R: Relational approach to database management. ACM Transactions on Database Systems (TODS), 1(2), 97-137. https://dl.acm.org/doi/pdf/10.1145/320455.320457
Codd, E. F. (1990). The relational model for database management: version 2. Addison-Wesley Longman Publishing Co., Inc. https://dl.acm.org/doi/pdf/10.5555/77708
Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom. 2001. Database Systems: The Complete Book. Prentice Hall PTR, USA. Chapter “Relational Database Modeling”. http://infolab.stanford.edu/~ullman/fcdb/ch2.pdf
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8). Reading: Addison-Wesley.
Chapter 3 - The Relational Model. http://webdam.inria.fr/Alice/pdfs/Chapter-3.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapters “Relational Model” and following.

Structured Query Language (SQL)
[Wikipedia articles: SQL, SQL:2023, Category:SQL, Join (SQL).]
Chamberlin, D. D., & Boyce, R. F. (1974, May). SEQUEL: A structured English query language. In Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on Data description, access and control (pp. 249-264). https://dl.acm.org/doi/pdf/10.1145/800296.811515
Oracle (2023). Oracle® Database - SQL Language Reference. https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8). Reading: Addison-Wesley. Chapter 7 - Notes on Practical Languages. http://webdam.inria.fr/Alice/pdfs/Chapter-7.pdf
Guagliardo, P., & Libkin, L. (2017). A formal semantics of SQL queries, its validation, and applications. Proceedings of the VLDB Endowment, 11(1), 27-39. https://www.vldb.org/pvldb/vol11/p27-guagliardo.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapters “SQL: Data Manipulation” and following.

Database Normalisation
[Wikipedia articles: Database normalization, First normal form, Second normal form, Third normal form.]
Kent, W. (1983). A simple guide to five normal forms in relational database theory. Communications of the ACM, 26(2), 120-125. https://dl.acm.org/doi/pdf/10.1145/358024.358054
Newcomer, L. (2005). Normalization for Normal people: Understanding Algorithms for Getting to 3NF. ASEE CoED Journal Volume 15 / Number 1. https://coed.asee.org/wp-content/uploads/2020/09/6-Normalization-for-Normal-people-Understanding-Algorithms-for-Getting-to-3NF.pdf
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8).
Reading: Addison-Wesley. Chapter 11 - Design and Dependencies. http://webdam.inria.fr/Alice/pdfs/Chapter-11.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapters “Normalisation” & “Advanced Normalisation”.

Theory behind Normalisation: Lossless Join Decomposition & Chase Algorithm
[Wikipedia articles: Lossless Join Decomposition, Chase (algorithm).]
Aho, A. V., Beeri, C., & Ullman, J. D. (1979). The theory of joins in relational databases. ACM Transactions on Database Systems (TODS), 4(3), 297-314. https://dl.acm.org/doi/pdf/10.1145/320083.320091
Maier, D., Mendelzon, A. O., & Sagiv, Y. (1979). Testing implications of data dependencies. ACM Transactions on Database Systems (TODS), 4(4), 455-469. https://dl.acm.org/doi/pdf/10.1145/320107.320115
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8). Reading: Addison-Wesley. Chapter 11 - Design and Dependencies. http://webdam.inria.fr/Alice/pdfs/Chapter-11.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Normalization” - Section “Modification Anomalies”.

Enhanced Entity-Relationship (E-ER) Model & Unified Modeling Language (UML)
[Wikipedia articles: Enhanced Entity–relationship model, Unified Modeling Language.]
Markowitz, V. M., & Shoshani, A. (1992). Representing extended entity-relationship structures in relational databases: A modular approach. ACM Transactions on Database Systems (TODS), 17(3), 423-464. https://dl.acm.org/doi/pdf/10.1145/132271.132273
Saiedian, H. (1997). An evaluation of extended entity-relationship model. Information and Software Technology, 39(7), 449-462. https://people.eecs.ku.edu/~saiedian/Pub/Journal/1997-Saiedian-IST-ER.pdf
Marcos, E., Vela, B., & Cavero, J. M. (2003). A methodological approach for object-relational database design using UML.
Software and Systems Modeling, 2, 59-72. https://link.springer.com/content/pdf/10.1007/s10270-002-0001-y.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Enhanced Entity-Relationship Model”.

Object-Relational Database Management Systems (ORDBMS)
[Wikipedia articles: DB-Engines ranking, Object-Relational database, Comparison of object-relational database management systems.]
Akhtar, A. (2023). Popularity Ranking of Database Management Systems. arXiv preprint arXiv:2301.00847. https://arxiv.org/pdf/2301.00847.pdf
Oracle (2023). Oracle® Database - Object-Relational Developer's Guide. https://docs.oracle.com/en/database/oracle/oracle-database/21/adobj
Oracle (2023). Oracle® Database - Database Concepts. https://docs.oracle.com/en/database/oracle/oracle-database/21/cncpt
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Object-Relational DBMSs”.

Transaction Processing & Benchmarks
[Wikipedia articles: ACID, Database transaction, Transaction processing, Two-phase locking, Transaction Processing Performance Council.]
Haerder, T., & Reuter, A. (1983). Principles of transaction-oriented database recovery. ACM Computing Surveys (CSUR), 15(4), 287-317. https://dl.acm.org/doi/pdf/10.1145/289.291
Gray, J. (1981, September). The transaction concept: Virtues and limitations. In VLDB (Vol. 81, pp. 144-154). https://people.eecs.berkeley.edu/~kubitron/courses/cs262a-F21/handouts/papers/theTransactionConcept.pdf
Barthels, C., Müller, I., Taranov, K., Alonso, G., & Hoefler, T. (2019). Strong consistency is not hard to get: Two-Phase Locking and Two-Phase Commit on Thousands of Cores. Proceedings of the VLDB Endowment, 12(13), 2325-2338. https://www.vldb.org/pvldb/vol12/p2325-barthels.pdf
Boncz, P., Neumann, T., & Erling, O. (2013, August).
TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. In Technology Conference on Performance Evaluation and Benchmarking (pp. 61-76). Cham: Springer International Publishing. https://homepages.cwi.nl/~boncz/snb-challenge/chokepoints-tpctc.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Transaction Management”.

NoSQL (Key-Value Store, Graph Database etc.) & Semistructured (XML, JSON)
[Wikipedia articles: NoSQL, Neo4j, Bigtable, Semi-structured Data.]
Stonebraker, M. (2009). The 'No SQL' Discussion Has Nothing to Do With SQL. In BLOG@CACM. https://cacm.acm.org/blogs/blog-cacm/50678-the-no-sql-discussion-has-nothing-to-do-with-sql
Strauch, C., Sites, U. L. S., & Kriha, W. (2011). NoSQL databases. Lecture Notes, Stuttgart Media University, 20(24), 79. https://www.christof-strauch.de/nosqldbs.pdf
Lissandrini, M., Brugnara, M., & Velegrakis, Y. (2018). Beyond macrobenchmarks: microbenchmark-based graph database evaluation. Proceedings of the VLDB Endowment, 12(4), 390-403. https://www.vldb.org/pvldb/vol12/p390-lissandrini.pdf
Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., ... & Gruber, R. E. (2008). Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2), 1-26. https://research.google/pubs/pub27898/
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Semistructured Data and XML”.

Distributed Systems & Basically Available Soft-state Eventually consistent (BASE)
[Wikipedia articles: CAP theorem, Eventual consistency.]
Brewer, E. (2017). Spanner, TrueTime and the CAP theorem. https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45855.pdf
Vogels, W. (2009). Eventually consistent. Communications of the ACM, 52(1), 40-44.
https://dl.acm.org/doi/pdf/10.1145/1435417.1435432
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Replication and Mobile Databases”.

Data Warehousing, Online Analytical Processing (OLAP), Main Memory Database Systems
[Wikipedia articles: Data warehouse, Online analytical processing, In-memory database, Column-oriented DBMS.]
Garcia-Molina, H., & Salem, K. (1992). Main memory database systems: An overview. IEEE Transactions on Knowledge and Data Engineering, 4(6), 509-516. https://pages.cs.wisc.edu/~jhuang/qual/main-memory-db-overview.pdf
Kemper, A., & Neumann, T. (2011, April). HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In 2011 IEEE 27th International Conference on Data Engineering (pp. 195-206). IEEE. https://cs.brown.edu/courses/cs227/archives/2012/papers/olap/hyper.pdf
Chaudhuri, S., & Dayal, U. (1997). An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26(1), 65-74. https://dl.acm.org/doi/pdf/10.1145/248603.248616
Armenatzoglou, N., Basu, S., Bhanoori, N., Cai, M., Chainani, N., Chinta, K., ... & Terry, D. (2022, June). Amazon Redshift re-invented. In Proceedings of the 2022 International Conference on Management of Data (pp. 2205-2217). https://assets.amazon.science/93/e0/a347021a4c6fbbccd5a056580d00/sigmod22-redshift-reinvented.pdf
Hellerstein, J. M., & Stonebraker, M. (Eds.). (2005). Chapter 4: New DBMS Architectures. Readings in database systems. MIT Press. http://www.redbook.io/pdf/ch4-newdbms.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapters “Data Warehousing Concept” and following.

Database Administration, Ethics & General Data Protection Regulation (GDPR)
[Wikipedia articles: Database administration, Big data ethics, General Data Protection Regulation.]
Agarwal, A., George, M., Jeyaraj, A., & Schwarzkopf, M. (2021). Retrofitting GDPR compliance onto legacy databases. Proceedings of the VLDB Endowment, 15(4). https://vldb.org/pvldb/vol15/p958-george.pdf
Shastri, S., Banakar, V., Wasserman, M., Kumar, A., & Chidambaram, V. Understanding and Benchmarking the Impact of GDPR on Database Systems. Proceedings of the VLDB Endowment, 13(7). https://www.vldb.org/pvldb/vol13/p1064-shastri.pdf
Stoyanovich, J., Howe, B., & Jagadish, H. V. (2020). Responsible data management. Proceedings of the VLDB Endowment, 13(12). https://www.vldb.org/pvldb/vol13/p3474-asudeh.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapters “Security and Administration” & “Professional, Legal, and Ethical Issues in Data Management”.

Query Processing & Query Optimisation
[Wikipedia articles: Hash join, Worst-case optimal join algorithm, Query plan, Query optimization.]
Barber, R., Lohman, G., Pandis, I., Raman, V., Sidle, R., Attaluri, G., ... & Sharpe, D. (2014). Memory-efficient hash joins. Proceedings of the VLDB Endowment, 8(4), 353-364. https://www.vldb.org/pvldb/vol8/p353-barber.pdf
Ngo, H. Q., Porat, E., Ré, C., & Rudra, A. (2018). Worst-case optimal join algorithms. Journal of the ACM (JACM), 65(3), 1-40. https://dl.acm.org/doi/pdf/10.1145/3180143
Abiteboul, S., Hull, R., & Vianu, V. (1995). Foundations of databases (Vol. 8). Reading: Addison-Wesley. Chapter 6 - Static Analysis and Optimization. http://webdam.inria.fr/Alice/pdfs/Chapter-6.pdf
Hellerstein, J. M., Stonebraker, M., & Hamilton, J. (2007). Architecture of a database system. Foundations and Trends® in Databases, 1(2), 141-259. https://perspectives.mvdirona.com/content/binary/ArchitectureOfDatabaseSystem.pdf
Moerkotte, G. (2020). Draft for Book on “Building Query Compilers”, Chapter 2 “Textbook Query Optimization”. https://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdf
Vu, T.
(2019, June). Deep query optimization. In Proceedings of the 2019 International Conference on Management of Data (pp. 1856-1858). https://dl.acm.org/doi/pdf/10.1145/3299869.3300104
Ding, J., Marcus, R., Kipf, A., Nathan, V., Nrusimha, A., Vaidya, K., ... & Kraska, T. (2022). SageDB: An Instance-Optimized Data Analytics System. Proceedings of the VLDB Endowment, 15(13), 4062-4078. https://www.vldb.org/pvldb/vol15/p4062-ding.pdf
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Query Processing”.

Database Index Structures
[Wikipedia articles: Database index, B+ tree, Sargable, Category:Database index techniques.]
Comer, D. (1979). Ubiquitous B-tree. ACM Computing Surveys (CSUR), 11(2), 121-137. https://dl.acm.org/doi/pdf/10.1145/356770.356776
Idreos, S., et al. (2018). The periodic table of data structures. IEEE Data Eng. Bull. https://stratos.seas.harvard.edu/files/stratos/files/periodictabledatastructures.pdf
Kraska, T., Beutel, A., Chi, E. H., Dean, J., & Polyzotis, N. (2018, May). The case for learned index structures. In Proceedings of the 2018 International Conference on Management of Data (pp. 489-504). https://dl.acm.org/doi/pdf/10.1145/3183713.3196909
Connolly, T. M., & Begg, C. E. (2005). Database systems: a practical approach to design, implementation, and management. Pearson Education. Chapter “Methodology—Physical Database Design for Relational Databases”.

Turing Awards related to Databases
[Wikipedia articles: Turing Award, Navigational database, Relational database, Parallel database, Ingres (database), PostgreSQL.]
Bachman, C. W. (1973). The programmer as navigator. Communications of the ACM, 16(11), 653-658. https://dl.acm.org/doi/pdf/10.1145/355611.362534
Codd, E. F. (1981). Relational Database: A Practical Foundation for Productivity. Communications of the ACM, 25(2), 109-117. https://dl.acm.org/doi/pdf/10.1145/1283920.1283937
Gray, J. (1999).
What next? A dozen information-technology research goals. In ACM Turing award lectures (p. 1998). https://dl.acm.org/doi/pdf/10.1145/1283920.2159561
Stonebraker, M. (2014). Development of INGRES, Legacy of INGRES, Postgres. https://amturing.acm.org/award_winners/stonebraker_1172121.cfm

Research

Research Interests:
See Michael Shekelyan’s research profile pages including details of research interests, publications, and live grants.