SAS Will Soon Release Big Data Analytics for Hadoop
SAS announced this week that it is developing an interactive analytics programming environment for the open-source Hadoop framework based on SAS in-memory technology.
SAS In-Memory Statistics for Hadoop will enable multiple users to simultaneously and interactively manage, explore, and analyze data, build and compare models, and score massive amounts of data in Hadoop. The product is expected to be released in the first half of this year.
"SAS In-Memory Statistics for Hadoop loads Hadoop data once and keeps it in memory for multiple analyses within a session, across multiple users," said Oliver Schabenberger, SAS senior director of analytic server research and development, in a statement. "Compare that to approaches that require writing data to disk. All that data shuffling is extremely inefficient with big data."
The same in-memory analytics technology that powers SAS Visual Analytics also underpins SAS In-Memory Statistics for Hadoop.
"Data scientists, modelers, and statisticians no longer need a patchwork of tools because we're eliminating the need for different analytic programming languages. SAS In-Memory Statistics for Hadoop supports the entire range of analytics, providing a fast, powerful and comprehensive means for collaborative analysis," Schabenberger said.
The statistical and machine learning modeling techniques supported in SAS In-Memory Statistics for Hadoop include clustering, regression, generalized linear models, analysis of variance, decision trees, random decision forests, text analytics, and recommendation systems.