Traditional data warehousing and data analytics were offline jobs that usually took a long time to complete and produce results. These operations gave us some insight into our business, but only a limited ability to improve or react in a timely manner. One effort to move this approach closer to real time was the Lambda architecture, in which raw data is stored as it arrives while also being analyzed in parallel, producing results with a certain delay. This approach eliminated the long-running jobs by preparing analytic results on the incoming data, but in today's business the right information at the right time can improve our business and lead directly to more profit. Current technologies let us run analytics on data streams and react almost instantly. Faster data requires faster reactions, especially for fraud detection or mission-critical systems. The NoETL philosophy explains why the traditional approach is no longer valid and why we need the technologies we have today. In this presentation we are going to talk about the evolution from monolithic to distributed systems, the pros and cons of both approaches, and what we are able to do with today's technologies. Our main focus is going to be the fast data stack (Spark, Mesos, Akka, Cassandra, Kafka) and how we can use these technologies to create a scalable, fast data pipeline while running real-time data analytics.