Late days may NOT be used for this assignment.
NOTE: Completion of this proposal requires substantial work and installation of software that might take some time. Start this early. Plan. Do not wait until the last minute!
Project Objectives:
Gain practical experience working with data management tools and real data.
Translate the concepts learned in class to actual data and to the use of the data management tools themselves.
Think about data in the context of real-world problems.
Communicate methods and results to a varied audience.
Learning outcomes for Part 2:
Practice working with NoSQL databases.
Be able to articulate the key differences between NoSQL and relational databases in the context of real-world data problems by integrating concepts learned in class.
NoSQL tools: You may use either MongoDB or Neo4j. You may use another NoSQL tool of a similar type, but to be certain the tool meets the requirements, please consult a member of the teaching team before proceeding with a different tool. MongoDB Atlas is the cloud version of MongoDB; you can create an account and a free “sandbox” there to experiment with small datasets.
Take the same dataset you loaded into your relational databases in Part I and load it into your NoSQL data tool of choice. Then try to perform the same queries and check whether you get the same results.
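As a rough illustration of the load step, the sketch below reads a small CSV into a relational table and into a list of document-style records. It uses Python's built-in sqlite3 module as a stand-in for your relational tool. The table name, column names, and sample rows are hypothetical placeholders for your own data. The MongoDB load itself is shown only as a comment, since it assumes a running server and the pymongo driver.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for your own dataset.
SAMPLE_CSV = """id,name,city,score
1,Ana,Boston,91
2,Ben,Chicago,78
3,Cy,Boston,85
"""

def read_rows(text):
    """Parse CSV text into a list of dicts with typed values."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "id": int(rec["id"]),
            "name": rec["name"],
            "city": rec["city"],
            "score": int(rec["score"]),
        })
    return rows

def load_relational(rows):
    """Load rows into an in-memory SQLite table (stand-in for the relational tool)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people ("
        "id INTEGER PRIMARY KEY, name TEXT, city TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO people (id, name, city, score) "
        "VALUES (:id, :name, :city, :score)",
        rows,
    )
    conn.commit()
    return conn

def to_documents(rows):
    """Build document-style records; MongoDB uses '_id' as the unique key."""
    return [
        {"_id": r["id"], "name": r["name"], "city": r["city"], "score": r["score"]}
        for r in rows
    ]

rows = read_rows(SAMPLE_CSV)
conn = load_relational(rows)
docs = to_documents(rows)

# With a running MongoDB instance and pymongo installed, the documents
# could then be loaded with: collection.insert_many(docs)
```

Parsing the file once and feeding both tools from the same list of rows helps ensure that any later differences in query results come from the tools rather than from the load step.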
Take the paper you wrote initially and expand it to include your work with both the relational databases and the NoSQL data tool. Your Discussion section should now also compare and contrast the NoSQL tool with the relational tools.
Introduction or background: Describe your data generally. What is in the data? Why is it interesting?
Dataset you will use: Describe the data and what is in the dataset; imagine you are describing it to someone who doesn’t know anything about the data. Describe who generated the data (if known). Details such as the attributes, the number of entries, and limitations or special features of the dataset would be helpful here, especially in the context of the problem areas you said you are interested in from your background section.
Methods: Describe your NoSQL data tool. What kind of database is it? How is it different from a relational database tool? Describe how you loaded the data and briefly describe any challenges you had. Show some screenshots, describe your schema, and show some summary statistics and/or initial rows in your data tables.
Results: Show your test queries. Were you able to pull the same results from each table? Describe any differences or anomalies that you noticed in querying the two data tools.
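One possible way to check whether two tools return the same results is to normalize both outputs into a common shape before comparing them, since row order and field names usually differ between tools. The sketch below is only an illustration under stated assumptions. The records and the people table are hypothetical. SQLite stands in for the relational tool. The document side is emulated in plain Python to mimic the output shape of a MongoDB aggregation such as `[{"$group": {"_id": "$city", "n": {"$sum": 1}}}]`.

```python
import sqlite3
from collections import Counter

# Hypothetical records; in practice these come from your own dataset.
RECORDS = [
    {"id": 1, "city": "Boston"},
    {"id": 2, "city": "Chicago"},
    {"id": 3, "city": "Boston"},
]

def relational_counts(records):
    """Count records per city with a SQL GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, city TEXT)")
    conn.executemany("INSERT INTO people VALUES (:id, :city)", records)
    rows = conn.execute(
        "SELECT city, COUNT(*) FROM people GROUP BY city"
    ).fetchall()
    conn.close()
    return rows

def document_counts(documents):
    """Emulate the output shape of a $group stage: one dict per group."""
    counts = Counter(d["city"] for d in documents)
    return [{"_id": city, "n": n} for city, n in counts.items()]

def normalize(pairs):
    """Sort (key, count) pairs so that result order does not affect comparison."""
    return sorted(pairs)

sql_result = normalize(relational_counts(RECORDS))
doc_result = normalize((d["_id"], d["n"]) for d in document_counts(RECORDS))

# True when both tools agree on every group and count.
results_match = sql_result == doc_result
```

When the comparison fails, printing the two normalized lists side by side is a simple way to spot the anomalies worth describing in this section.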
Discussion: Describe the differences, both conceptual and practical, between the two relational database tools you used and the NoSQL tool. Think about which might be better for different circumstances and discuss briefly.
Next Steps: How might this work scale to very large datasets? Which principles and features of NoSQL data tools come into play when working with very large datasets?
Collaboration report: Who was responsible for which components of the activities described in this report? It is not acceptable to say that both members contributed to all parts or to be general in the description — describe tasks as specifically as you can or show them in a bulleted list. Each individual should be responsible for specific tasks, and for full credit, there should be some combination of technical and non-technical tasks for both members even if one member did more of the technical work.