DataAnalytics2014 week3b
1. Lab exercises: datasets and data infrastructure
Peter Fox
Data Analytics – ITWS-4963/ITWS-6965
Week 3b, February 7, 2014

2. Assignment 2 – graded in lab
• General assignment – read EPI_data, specify a new data subset, create data frames in R and save them into a database
• In RStudio
  – Install package "rmongodb" (and activate it)
  – http://www.r-tutor.com/r-introduction/
• MongoDB - http://www.mongodb.org/
  – http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

3. Install
• Start the MongoDB server (with privileges)
• Use the Mongo command line: mongo or mongodb
    > help
    > db
    > use local
  (or use test instead of local)
• At this point you could insert new items in the db but we'll leave that for later
    > for (var i = 1; i <= 25; i++) db.local.insert( { x : i } )
    > db.local.find()

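The insert loop on this slide can be tried outside the shell first. This is a minimal plain-JavaScript sketch, assuming a hypothetical in-memory array named `collection` stands in for `db.local`. It only shows which documents the loop produces, not how MongoDB stores them (the server also adds an `_id` field to each inserted document).

```javascript
// Hypothetical stand-in for the db.local collection: a plain array.
const collection = [];

// Same loop as on the slide, with insert() replaced by push().
for (let i = 1; i <= 25; i++) {
  collection.push({ x: i });
}

// Rough analogue of db.local.find(): look at what was stored.
console.log(collection.length); // number of documents
console.log(collection[0]);     // first document
console.log(collection[24]);    // last document
```

In the real shell, `db.local.find()` prints only the first 20 documents; typing `it` shows the rest.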
4. Then switch to R or RStudio
Read EPI_data in (again!)
Using the slide from the tab in the Excel spreadsheet (reminder in next slide)
Create two data frames with:
1. The key indicators from EPI_data (and one independent variable) – I call this dfEPI
2. The weights assigned to each indicator in the EPI "model" and "sub-models" (you can choose the data structures)
http://www.r-tutor.com/r-introduction/data-frame

5. The model and sub-models

6. Then switch to R or RStudio
Connect to the running DB (MacOS):
    > mongo = mongo.create(host="127.0.0.1",db="local")
    > mongo.is.connected(mongo)
    > bEPI <- mongo.bson.from.list(as.list(dfEPI))
    > mongo.insert(mongo,'dfEPI.EPI',bEPI)   # same for the second data frame (weights)
    > mongo.get.databases(mongo)   # similar cmds in mongo
    > mongo.get.database.collections(mongo,"dfEPI")
    > cdf=mongo.find.all(mongo,"dfEPI.EPI")
    > cdf
    > mongo.destroy(mongo)   # only when done

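One thing to check when verifying the export: `as.list(dfEPI)` turns the data frame into a list of columns, so the inserted BSON document most likely holds one array per column rather than one document per row (assuming rmongodb maps each vector to a BSON array). The sketch below illustrates the two shapes in plain JavaScript. The column names, the values, and the `toRows` helper are hypothetical placeholders, not real EPI data or part of any library.

```javascript
// Placeholder values only; these are not real EPI numbers.
// Column-oriented shape: roughly what as.list(dfEPI) describes.
const columns = {
  Country: ["A", "B", "C"],
  EPI: [50.1, 62.3, 71.8],
};

// Hypothetical helper: convert columns into one object per row.
function toRows(cols) {
  const names = Object.keys(cols);
  const n = names.length === 0 ? 0 : cols[names[0]].length;
  const rows = [];
  for (let i = 0; i < n; i++) {
    const row = {};
    for (const name of names) {
      row[name] = cols[name][i];
    }
    rows.push(row);
  }
  return rows;
}

const rows = toRows(columns);
console.log(rows);
```

A row-oriented layout tends to make per-country queries simpler, while the column-oriented document mirrors the data frame directly. Either can satisfy the assignment, since the data structures are left open.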
7. Back to MongoDB
    $ mongodb   # command line
    show dbs
    <results>
    use <dbname>
    show collections

8. For Assignment
• Show creation of data frames (in R)
• Export into MongoDB (in R)
• Verify that it is there (in mongo)
• Future – query/read it back, use it.

9. Admin info (keep/print this slide)
• Class: ITWS-4963/ITWS-6965
• Hours: 12:00pm-1:50pm Tuesday/Friday
• Location: SAGE 3101
• Instructor: Peter Fox
• Instructor contact: pfox@cs.rpi.edu, 518.276.4862 (do not leave a msg)
• Contact hours: Monday** 3:00-4:00pm (or by email appt)
• Contact location: Winslow 2120 (sometimes Lally 207A announced by email)
• TA: Lakshmi Chenicheri chenil@rpi.edu
• Web site: http://tw.rpi.edu/web/courses/DataAnalytics/2014
  – Schedule, lectures, syllabus, reading, assignments, etc.