Engineering Rapleaf: Goodbye MapReduce, Hello Cascading – “Internally, Cascading translates the pipe assembly into a series of MapReduce jobs. The taps specify the input and output formats along with the input and output paths. Cascading manages all the intermediate data necessary to get a sequence of MapReduce jobs to communicate.”
Cascading – “Cascading is a feature rich API for defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster.”
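
The Rapleaf quote describes a pipe assembly being translated into a series of MapReduce jobs, with the intermediate data managed between them. The following is a toy sketch of that idea in plain Java, not Cascading's API or its actual planner. The class `PipeAssemblySketch`, the `Kind` and `Step` types, the `planJobs` and `outputPath` methods, and the `tmp/intermediate-N` path scheme are all hypothetical names invented for illustration. It assumes a simplified rule: each job can contain at most one grouping step, so a second grouping starts a new job.

```java
import java.util.ArrayList;
import java.util.List;

public class PipeAssemblySketch {
    // Hypothetical step kinds: EACH works record by record,
    // GROUP_BY needs a shuffle and therefore a reduce phase.
    enum Kind { EACH, GROUP_BY }

    static final class Step {
        final Kind kind;
        final String name;

        Step(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        @Override
        public String toString() {
            return kind + "(" + name + ")";
        }
    }

    // Splits an assembly into jobs under the simplified rule that a job
    // holds at most one GROUP_BY; a second GROUP_BY starts a new job.
    static List<List<Step>> planJobs(List<Step> assembly) {
        List<List<Step>> jobs = new ArrayList<>();
        List<Step> current = new ArrayList<>();
        boolean hasGroupBy = false;
        for (Step step : assembly) {
            if (step.kind == Kind.GROUP_BY && hasGroupBy) {
                jobs.add(current);
                current = new ArrayList<>();
                hasGroupBy = false;
            }
            if (step.kind == Kind.GROUP_BY) {
                hasGroupBy = true;
            }
            current.add(step);
        }
        if (!current.isEmpty()) {
            jobs.add(current);
        }
        return jobs;
    }

    // The last job writes to the sink path; every earlier job writes to a
    // made-up intermediate path that the next job would read from.
    static String outputPath(int jobIndex, int jobCount, String sinkPath) {
        return jobIndex == jobCount - 1 ? sinkPath : "tmp/intermediate-" + jobIndex;
    }

    public static void main(String[] args) {
        List<Step> assembly = List.of(
            new Step(Kind.EACH, "split-words"),
            new Step(Kind.GROUP_BY, "group-by-word"),
            new Step(Kind.EACH, "count"),
            new Step(Kind.GROUP_BY, "group-by-count"),
            new Step(Kind.EACH, "format"));
        List<List<Step>> jobs = planJobs(assembly);
        for (int i = 0; i < jobs.size(); i++) {
            System.out.println("job " + i + " writes "
                + outputPath(i, jobs.size(), "output/path") + ": " + jobs.get(i));
        }
    }
}
```

In this sketch the caller names only the final sink path, which loosely mirrors the role the quote gives to taps. The intermediate paths are generated by the planner, the part the quote says Cascading manages on the user's behalf.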