Cooperative data exploration

Living in a world of big data comes with a challenge: how do we extract value from the ever-growing flow of information that comes our way? Many great tools can help, but they all demand a lot of CPU and RAM. So how do we ease that burden? One way is to share the data we are working on, and the results of our computations, with others.

Exploration of data from iPhone motion coprocessor (2)

Last week we downloaded data from a fitness tracker (the motion coprocessor in the iPhone) and loaded it into R. Then, with just a few lines of R code, we decomposed the data into a weekly seasonal component and a trend. Today we are going to plot the number of steps per hour for different days of the week. The same data will then be used to check how often there was any activity at a given time.

Understanding Apache Spark’s Execution Model Using SparkListeners

When you execute an action on an RDD, Apache Spark runs a job that in turn triggers tasks, using DAGScheduler and TaskScheduler, respectively. These are low-level details, but they are often useful to understand when a seemingly simple transformation is no longer simple performance-wise and takes ages to complete.


