DataAnalytics2014 week3b


1. Lab exercises: datasets and data infrastructure
Peter Fox
Data Analytics – ITWS-4963/ITWS-6965
Week 3b, February 7, 2014

2. Assignment 2 – graded in lab.
• General assignment – read EPI_data, specify a new data subset, create data frames in R and save them into a database
• In RStudio
  – Install package – “rmongodb” (activate it)
  – http://www.r-tutor.com/r-introduction/
• MongoDB - http://www.mongodb.org/
  – http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

3. Install
• Start the MongoDB server (with privileges)
• Use the Mongo command line: mongo or mongodb
> help
> db
> use local #(or test)
At this point you could insert new items in the db but we’ll leave that for later
> for (var i = 1; i <= 25; i++) db.local.insert( { x : i } )
> db.local.find()

4. Then switch to R or RStudio
Read EPI_data in (again!), using the slide from the tab in the Excel spreadsheet (reminder in next slide)
Create two data frames with:
  1. The key indicators from EPI_data (and one independent variable) – I call this dfEPI
  2. The weights assigned to each indicator in the EPI “model” and “sub-models” (you can choose the data structures)
http://www.r-tutor.com/r-introduction/data-frame
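The two data frames described above could be built roughly as follows. This is only a sketch: the toy EPI_data below stands in for the real spreadsheet, and the column names (Country, EPI, ENVHEALTH, ECOSYSTEM, Population07), the values, the weights and the name dfWeights are all placeholders chosen for illustration, not the assignment's answer.

```r
# Toy stand-in for EPI_data. In the lab this would come from reading the
# exported spreadsheet tab (e.g. with read.csv); every value here is made up.
EPI_data <- data.frame(
  Country      = c("A", "B", "C"),
  EPI          = c(60.5, 72.1, 48.3),
  ENVHEALTH    = c(70.0, 85.2, 40.1),
  ECOSYSTEM    = c(51.0, 59.0, 56.5),
  Population07 = c(1.2, 30.4, 8.9),
  Extra        = c(1, 2, 3),          # a column we do not want to keep
  stringsAsFactors = FALSE
)

# 1. Key indicators plus one independent variable (here: Population07)
keep  <- c("Country", "EPI", "ENVHEALTH", "ECOSYSTEM", "Population07")
dfEPI <- EPI_data[, keep]

# 2. Weights of each indicator within its parent (sub-)model.
#    A "long" layout: one row per indicator. Weights are placeholders.
dfWeights <- data.frame(
  indicator = c("ENVHEALTH", "ECOSYSTEM"),
  parent    = c("EPI", "EPI"),
  weight    = c(0.5, 0.5),
  stringsAsFactors = FALSE
)

str(dfEPI)
str(dfWeights)
```

A long table for the weights is just one possible choice; a nested list keyed by sub-model would work equally well, since the slide leaves the data structure open.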

5. The model and sub-models

6. Then switch to R or RStudio
Connect to the running DB (MacOS):
> mongo = mongo.create(host="127.0.0.1",db="local")
> mongo.is.connected(mongo)
> bEPI <- mongo.bson.from.list(as.list(dfEPI))
> mongo.insert(mongo,'dfEPI.EPI',bEPI) #same for the second data frame (weights)
> mongo.get.databases(mongo) # similar cmds in mongo
> mongo.get.database.collections(mongo,"dfEPI")
> cdf=mongo.find.all(mongo,"dfEPI.EPI")
> cdf
> mongo.destroy(mongo) #when done only
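It may help to look at what as.list(dfEPI) actually produces before it is handed to mongo.bson.from.list. The base-R sketch below needs no database; the two-row dfEPI is a made-up placeholder. as.list on a data frame returns one list element per column, so the insert above most likely stores the whole data frame as a single document with one array per column. The row-wise variant at the end is an alternative shape, not what the slide does.

```r
# Placeholder data frame standing in for the real dfEPI
dfEPI <- data.frame(
  Country = c("A", "B"),
  EPI     = c(60.5, 72.1),
  stringsAsFactors = FALSE
)

# Column-wise: this is the structure as.list(dfEPI) gives
colwise <- as.list(dfEPI)
names(colwise)       # one entry per column
length(colwise$EPI)  # each entry holds all rows of that column

# Row-wise alternative: one small named list per row, which would map to
# one document per country if each were converted and inserted separately
rowwise <- lapply(seq_len(nrow(dfEPI)), function(i) {
  as.list(dfEPI[i, , drop = FALSE])
})
rowwise[[2]]$Country
```

Which shape is preferable depends on how the data will be queried later: the column-wise document is the simplest to write and read back whole, while per-row documents allow queries on individual countries.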

7. Back to MongoDB
$ mongodb # command line
show dbs
<results>
use <dbname>
show collections

8. For Assignment
• Show creation of data frames (in R)
• Export into MongoDB (in R)
• Verify that it is there (in mongo)
• Future – query/read it back, use it.

9. Admin info (keep/print this slide)
• Class: ITWS-4963/ITWS-6965
• Hours: 12:00pm-1:50pm Tuesday/Friday
• Location: SAGE 3101
• Instructor: Peter Fox
• Instructor contact: pfox@cs.rpi.edu, 518.276.4862 (do not leave a msg)
• Contact hours: Monday** 3:00-4:00pm (or by email appt)
• Contact location: Winslow 2120 (sometimes Lally 207A announced by email)
• TA: Lakshmi Chenicheri chenil@rpi.edu
• Web site: http://tw.rpi.edu/web/courses/DataAnalytics/2014
  – Schedule, lectures, syllabus, reading, assignments, etc.