1. CS5412 / Lecture 9: Machine Learning for Smart Farms. Ken Birman, Spring 2018. http://www.cs.cornell.edu/courses/cs5412/2018sp

2. How will IoT reshape the way machine learning is done? Machine learning for IoT settings has demanding time deadlines not seen in traditional cloud systems. Moreover, the amount of data on the IoT devices could be vastly more than we can hope to download. Our goal today? To understand the resulting flow of data and computing. Data sets are so large in these settings that only really smart management of flows can yield a good solution. This shapes a view focused on the pattern of computation in IoT settings.

3. Why must the cloud evolve? Until now, big data computations have run in big “back-end” systems like the famous MapReduce/Hadoop framework, or on high-performance supercomputers. Big data processing was mostly done in batches, offline. The IoT model demands instantaneous mobile intelligence: vision, speech understanding, control of devices. A batched, offline model won’t work.

4. Today: a very “long” pipeline. [Figure: data acquisition → Global File System (GFS) → Hadoop jobs. Machine learning typically lives at the back. Delay along the pipeline: milliseconds → seconds → hours.]

5. New: move ML to the edge of the cloud. [Figure: the same pipeline, data acquisition → Global File System (GFS) → Hadoop jobs, with delay growing from milliseconds to seconds to hours. ML was at the back. We move data classification and some aspects of learning to the edge, where the delay is milliseconds.]

6. Concrete example: FarmBeats. Can we map farming tasks to Azure’s cloud IoT model?

7. Smart monitoring of crops. Field of oats, or hay: How is the crop growing? Are there signs of drought / insect / virus / fungal / bacterial issues? If so, can we diagnose the exact problem? If we can, what treatment is needed, and exactly where to apply it? Can we learn from this and improve our seed choice for next year? Where should we fertilize or irrigate?

8. Smart herd management. Dairy: cow health and monitoring. Which way should we point the camera? When to take photos/video? How much milk did each cow produce, and of what quality? What did it eat, and how was its appetite? How much time did it spend ruminating, or sleeping? Which cows need routine medical attention? Is a cow close to giving birth? Is it likely to need emergency help?

9. Smart dairy. Milk processing, yoghurt and cheese making. Must monitor temperature and pH. Need to sterilize properly using the correct strength of product, then rinse off. Watch for stuck or runaway fermentations. Check samples for unwanted bacteria, like Listeria (very dangerous!). Maintain a secure and tamperproof audit trail (blockchain?).

10. Smart waste disposal. What about all the runoff and farm waste? Why not collect it and reprocess it into valuable secondary products? Manure contains nitrogen and phosphorus, which can be used to create fertilizer. Waste water can be captured and used for irrigation. Undigested material can be transformed to “bio oil” by heating at high pressure. Residual material after treatment can be composted and plowed back into the fields. Much of the problem with algae blooms could be eliminated by such steps, and farms could also earn more (or spend less) by doing so!

11. Geeky stuff. Recognize cow moods, and relate cow emotional state to milk production. Optimize drone flights over complex terrain to “sail on the wind” and save power. Develop a multispectral image analysis to interpret signs of crop damage. Program a drone to “look more closely” if needed, for example at the underside of leaves or at closeups of blighted ears of oats. Use machine learning to estimate crop maturity and schedule equipment for harvesting. Predict the best choice of crop, and the specific choice of seeds, to plant next year in each parcel of a large field.

12. Do no harm! Smart farming also raises issues of privacy and security: Banks and insurance companies might be eager to “see” private data. There are more and more laws governing food-supply auditing. If farms become dependent on IoT, how can we make the technology robust enough for a wide range of conditions (weather, dust, …)? Farmers aren’t hi-tech specialists. How hard will IoT be to maintain? Can we create versions for very poor rural areas?

13. We need to drill down on a concrete task representative of these. In most of these tasks we see a shared structure: Start with a problem posed in the real world, like a farm or dairy. Work to understand the various dimensions, especially scalability issues tied to big data. If we design without scalability in mind, our solution will fail! Deploy sensors, then design a state machine that understands the sensor events and platform events, and uses functions to perform tasks. Perhaps develop new elastic µ-services that your system will require. Debug this on a real system, like Azure IoT … not an easy job!

14. Crop monitoring. Let’s focus initially on just one case: monitoring a field using drones. What major subsystems would we need? Mapping system: pulls up a topographical map of the field to scan. Basic drone flight control system: “follows” a flight plan. Wind sensing and mapping subsystem: lets the drone “sail on the breeze” (not fight it). Image analysis: “Are these plants healthy or diseased?” Close examination: visit diseased plants, diagnose the issue, document it. Data archive: downloads interesting images, video, etc. and retains them.

15.Functions? Or -Services? Recall that we have a choice: some tasks should run as state machines, keeping their state in a Azure key-value store. Other tasks should be implemented by one (or many)  - Services that would understand our goals and send instructions to our drones . This would feel more like a standard “control center” approach. In a scaled-out IoT setting, a solution needs elements of both kinds. http://www.cs.cornell.edu/courses/cs5412/2018sp 15

16. Functions? Or new µ-services? Why is it so obvious that this isn’t a case for a “pure function” solution? What we’ve described would require an elaborate state machine. It might be very hard to debug such a complex function application. The logic for each state might be complicated, since everything will be event driven. As we “learn current conditions” we run into a big-data problem. A function server isn’t intended for such cases.

17. Should everything be in µ-services? Historically this was a common approach: people built specialized control systems and viewed devices as dumb. But few have the skills to pull it off. In an IoT setting, massive scale brings massive loads! Any µ-services will need to be sharded, fault-tolerant, and highly responsive, and may have to leverage special hardware accelerators. If we think of a function layer as a kind of intelligent “cache” that can shield the µ-services from overload, we are approaching this the right way.
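
To make the “function layer as an intelligent cache” idea concrete, here is a small sketch under stated assumptions: `WindService` is a hypothetical heavy µ-service, `CachingFunctionLayer` is a hypothetical lightweight front end, and the 5-second freshness window is an arbitrary choice for illustration.

```python
import time


class WindService:
    """Hypothetical heavy µ-service; counts how often it is really called."""

    def __init__(self):
        self.calls = 0

    def wind_at(self, cell):
        self.calls += 1
        return {"cell": cell, "speed_mps": 3.5}  # placeholder value


class CachingFunctionLayer:
    """Lightweight front end that answers from a short-lived cache when it can."""

    def __init__(self, service, ttl_seconds=5.0, clock=time.monotonic):
        self.service = service
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache = {}  # cell -> (time fetched, value)

    def wind_at(self, cell):
        now = self.clock()
        hit = self.cache.get(cell)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh enough: the µ-service is not contacted
        value = self.service.wind_at(cell)
        self.cache[cell] = (now, value)
        return value


service = WindService()
layer = CachingFunctionLayer(service)
for _ in range(100):
    layer.wind_at((12, 40))
print("requests seen by the µ-service:", service.calls)
```

Many drone queries for the same map cell within the freshness window reach the µ-service only once, so its load tracks the number of distinct cells rather than the number of drones.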

18. What approach does this lead us to? We will use Azure functions for “lightweight” tasks and actions: ideal for read-only actions like making a quick decision, and OK for reporting events that go into some kind of record or log, but not for serious computing with heavy computation, big data, accelerators, or complex state machine sequences. Then build new µ-services for the heavy-weight tasks, like training a new machine-learned model, or computing the optimal search path given the wind.

19. There won’t be just one! Divide the set of knowledge tasks into groups. Don’t ask one server to do everything. Instead build distinct servers for each category of knowledge tasks. So we would want one µ-service just for “flight planning”, or even two (one for “collision avoidance”), one for “sailing on a breeze”, one for “drone health management”, one for “deciding which photos are worth downloading”, and one for “identifying possible crop damage areas”.

20. Remember: Amazon ended up with hundreds of µ-services per web page! Learn from others who have been down this path before you. The whole game centers on breaking up the task into chunks that are self-contained, but “small” in scope! If you think of this as one big monolithic task, you are certain to be doomed by the complexity of the overall undertaking!

21. How to create new µ-services? We can start with Jim Gray’s suggestion: use key-value sharding from the outset. Within a shard, data will need to be replicated. This leads to what is called the “state machine replication model”, which involves: a group of replicas (and a membership service to track the set); each update occurs as a message delivered to all replicas; the updates are delivered in an identical order at every replica; no matter what happens (failures, restarts), “amnesia” won’t occur.
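
The four properties listed on this slide can be illustrated with a toy, single-process simulation. It is only a sketch: `Replica` and `ReplicaGroup` are hypothetical names, the in-memory `log` stands in for durable storage, and a real system would need an actual ordering and membership protocol to get the same guarantees across machines.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.state = {}
        self.applied = 0  # sequence number of the next update expected

    def deliver(self, seq, key, value):
        # Every replica applies the same updates in the same order.
        if seq != self.applied:
            raise ValueError("update delivered out of order")
        self.state[key] = value
        self.applied += 1


class ReplicaGroup:
    """Toy membership tracking plus ordered delivery, all in one process."""

    def __init__(self, names):
        self.members = [Replica(n) for n in names]
        self.log = []  # ordered update history (stand-in for durable storage)

    def update(self, key, value):
        seq = len(self.log)
        self.log.append((key, value))
        for replica in self.members:  # deliver to all current members
            replica.deliver(seq, key, value)

    def join(self, name):
        # A new or restarted member replays the log, so it has no "amnesia".
        replica = Replica(name)
        for seq, (key, value) in enumerate(self.log):
            replica.deliver(seq, key, value)
        self.members.append(replica)
        return replica


group = ReplicaGroup(["a", "b", "c"])
group.update("field-3/moisture", 0.31)
group.update("field-3/moisture", 0.29)
late = group.join("d")
print(all(r.state == late.state for r in group.members))
```

Since each replica is a deterministic state machine fed the same updates in the same order, all replicas end in the same state, including one that joined late.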

22. Will this scale? Jim Gray’s analysis told us that general database transactions won’t scale. So don’t even consider our sharded key-value service as a database. We’ll want to aim for simple key-value operations, or small computations that can somehow be made fault-tolerant and atomic without scaling issues. This was a sweet spot in Jim’s model.

23. “All sharded, all the time”. In computing classes, we really don’t learn to compute on data that is spread over devices. IoT data will already be sharded when it enters the system, and all computation needs to be parallel and to keep the work sharded. Sharding is a magic formula for scaling, but how can people learn to program in an “all-sharded, all the time” manner?
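
As one small illustration of computing in an “all-sharded” style, the sketch below assigns sensor readings to shards by hashing the sensor ID, has each shard compute a local partial result, and combines only those small partial results. The shard count, sensor IDs, and function names are assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 4  # arbitrary choice for the example


def shard_of(key, num_shards=NUM_SHARDS):
    """Map a key to a shard with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def ingest(readings, num_shards=NUM_SHARDS):
    """Place each (sensor_id, value) reading on the shard that owns the sensor."""
    shards = [dict() for _ in range(num_shards)]
    for sensor_id, value in readings:
        owner = shards[shard_of(sensor_id, num_shards)]
        owner.setdefault(sensor_id, []).append(value)
    return shards


def local_summary(shard):
    """Runs independently on each shard: returns (sum, count) of its values."""
    values = [v for series in shard.values() for v in series]
    return sum(values), len(values)


def global_mean(shards):
    """Combine the per-shard partial results; raw data never leaves a shard."""
    partials = [local_summary(s) for s in shards]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count if count else None


readings = [("s1", 10.0), ("s2", 20.0), ("s1", 30.0), ("s3", 40.0)]
print(global_mean(ingest(readings)))
```

The design point is that `local_summary` could run in parallel on separate machines, and only a pair of numbers per shard has to be moved to compute the global answer.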

24. So, back to our FarmBeats drones. [Figure labels: Azure Function Server. Functions: lightweight, event-triggered programs in containers, with a “pay for what you use” resource model. Message bus or queue. µ-services: some Azure-provided, some “new”.]

25. Revisit our picture. [Figure labels: Azure Function Server. Functions: lightweight, event-triggered programs in containers, with a “pay for what you use” resource model. Message bus or queue. µ-services: some Azure-provided, some “new”.] Moment-by-moment operation of the drone is a good fit for the function programming model. A set of µ-services can own many of our other tasks, each specialized in some sub-task. Divide the job up into distinct kinds of work!

26. Where’s the scale? Our example shows just a few drones monitoring one field. But “at scale” in a full deployment, you want to imagine hundreds of thousands of drones scanning many thousands of fields, and millions more sensors and actuators playing other roles.

27. Let’s peek inside a microservice. The inner structure would depend on design choices the developer would make. This particular example has a load balancer, a cache layer, and a back-end storage layer. [Figure labels: external clients use standard RESTful RPC through a load balancer; cache layer; back-end store; multicasts are used for cache invalidations and updates.]
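
The three layers named on this slide can be mimicked in a few lines. This is a sketch only: all class names are hypothetical, the “multicast” is a plain loop over cache nodes in one process, and round-robin is just one possible load-balancing policy.

```python
import itertools


class BackEndStore:
    def __init__(self):
        self.data = {}
        self.reads = 0  # how many reads actually reached the store

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class CacheNode:
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        if key not in self.cache:  # miss: fall through to the back end
            self.cache[key] = self.store.get(key)
        return self.cache[key]

    def invalidate(self, key):
        self.cache.pop(key, None)


class MicroService:
    def __init__(self, num_cache_nodes=3):
        self.store = BackEndStore()
        self.nodes = [CacheNode(self.store) for _ in range(num_cache_nodes)]
        self._next = itertools.cycle(self.nodes)  # round-robin load balancer

    def get(self, key):
        return next(self._next).get(key)

    def put(self, key, value):
        self.store.put(key, value)
        for node in self.nodes:  # stand-in for a multicast invalidation
            node.invalidate(key)


svc = MicroService()
svc.put("field-3/map", "v1")
answers = [svc.get("field-3/map") for _ in range(9)]
print(answers[0], svc.store.reads)
```

After an update, the invalidation reaches every cache node, so no node keeps serving the old value; each node then refills from the back end on its next miss.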

28. A roll-your-own µ-service of this kind might be hard to build! The solution needs to restart into this configuration after failures, and handle process crashes or reboots of individual components. Data has to be stored and reloaded from files (or other µ-services). We need to manage the service in a consistent manner and program it to self-repair after a crash or disruption.
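
One small piece of this burden, storing state in a file and reloading it after a restart, might look like the sketch below. It covers only a single process and a single file; the function names and the JSON format are assumptions for illustration, and it does not address replication or coordinated restart of the whole service.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write state to a temp file, then rename it over the old file.

    The rename step means a crash mid-write leaves the previous file intact.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path):
    """Reload state after a restart; an absent file means a fresh start."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "service_state.json")
    save_state(state_file, {"last_scan": "field-3", "drones": 4})
    print(load_state(state_file))
```

Even this simple case needs care about partial writes, which hints at why doing the same for a replicated, sharded service by hand is difficult.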

29. Derecho can help. Derecho is Cornell’s software library for automating those kinds of tasks. The design was created with “intelligent edge” use cases in mind. The developer attaches event handlers in various places, and Derecho automates the remainder of the “life cycle”. This greatly simplifies the development challenge.