Big Computing: Oracle R Enterprise goes primetime

Thursday, February 9, 2012

Oracle R Enterprise goes primetime

Today Oracle announced the release of the commercial version of Oracle R Enterprise. I first heard about this product when it was released as a beta in 2011. There has often been talk of coupling the wealth of analytics tools in R with the scalability of a database, so it is good to see a company like Oracle dip its toe into the water.

Here is the text of the press release:


Oracle R Enterprise

Integrating Open Source R with Oracle Database 11g

Oracle R Enterprise, a component of the Oracle Advanced Analytics Option, makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with the Oracle Database. R users can run R commands and scripts for statistical and graphical analyses on data stored in the Oracle Database. R users can develop, refine and deploy R scripts that leverage the parallelism and scalability of the database to automate data analysis. Data analysts can run R packages and develop and operationalize R scripts for analytical applications in one step—without having to learn SQL. Oracle R Enterprise performs function pushdown for in-database execution of base R and popular R packages. Because it runs as an embedded component of the database, Oracle R Enterprise can run any R package either by function pushdown or via embedded R while the database manages the data served to the R engine.
Here is the post from the Oracle blog on Oracle R Enterprise:

Announcing Oracle R Enterprise 1.0

Analyzing huge data sets presents a challenging opportunity for IT decision makers, driven by the balance between the maintenance and support of existing IT infrastructure with the need to analyze rapidly growing data stores. In many cases, processing this data requires a fresh approach because traditional techniques fail when applied to massive data sets. To extract immediate value from big data, we desire tools that efficiently access, organize, analyze and maintain a variety of data types.
Oracle R Enterprise (ORE), a component in the Oracle Advanced Analytics Option of Oracle Database Enterprise Edition, emerges as the clear solution to these challenges. ORE integrates the popular open-source R statistical programming environment with Oracle Database 11g, Oracle Exadata and the Oracle Big Data Appliance, delivering enterprise-level analytics based on R scripts and parallelized, in-database modeling.
How do R and Oracle R Enterprise work together?
The powerful R programming environment enables the creation of sophisticated graphics, statistical analyses, and simulations. It contains a vast set of built-in functions which may be extended to build custom statistical packages. The R engine is limited by capacity and performance for large data, but with Oracle R Enterprise, R users bypass these constraints by leveraging the database as the analytics engine directly from their R session.
The components that support Oracle R Enterprise include:
1. The Oracle R Enterprise transparency layer - a collection of R packages with functions to connect to Oracle Database and use R functionality in Oracle Database. This enables R users to work with data too large to fit into the memory of a user's desktop system, and leverage the scalable Oracle Database as a computational engine.
2. The Oracle statistics engine - a collection of statistical functions and procedures corresponding to commonly-used statistical libraries. The statistics engine packages also execute in Oracle Database.
3. SQL extensions supporting embedded R execution through the database on the database server. R users can execute R closures (functions) using an R or SQL API, while taking advantage of data parallelism. Using the SQL API for embedded R execution, sophisticated R graphics and results can be exposed in OBIEE dashboards and BI Publisher documents.
4. Oracle R Connector for Hadoop (ORCH) - an R package that interfaces with the Hadoop Distributed File System (HDFS) and enables executing MapReduce jobs. ORCH enables R users to work directly with an Oracle Hadoop cluster, executing computations from the R environment, written in the R language and working on data resident in HDFS, Oracle Database, or local files.
Using a simple R workflow, R users can seamlessly utilize the parallel processing architecture of ORE and ORCH for scalability and better performance. Analytics and reporting tasks are moved to the Oracle Database, eliminating long approval chains for data movement and dramatically increasing processing speed. R users are not required to learn SQL because the R-to-SQL translation is shipped to the database and processed behind the scenes. The significant benefits to IT include improved data security, data maintenance and audit compliance practices.
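To make the "R-to-SQL translation" idea a little more concrete, here is a rough sketch of function pushdown in Python, with an in-memory SQLite database standing in for Oracle Database. This is not ORE's API or implementation: `TableProxy`, the `flights` table, and its columns are hypothetical names invented for illustration. The only point is the general pattern, in which the client object builds SQL text, the database does the computation, and only the small result travels back.

```python
import sqlite3


class TableProxy:
    """Hypothetical stand-in for a database-backed data frame.

    Filters only accumulate SQL text; no rows are read until an
    aggregate is requested, and only the aggregate value comes back.
    """

    def __init__(self, conn, table, where=None, params=()):
        self.conn = conn
        self.table = table
        self.where = where
        self.params = tuple(params)

    def filter(self, column, op, value):
        # The operator is interpolated into SQL text, so whitelist it.
        if op not in {"=", "<", ">", "<=", ">=", "!="}:
            raise ValueError(f"unsupported operator: {op}")
        clause = f'"{column}" {op} ?'
        where = clause if self.where is None else f"({self.where}) AND {clause}"
        return TableProxy(self.conn, self.table, where, self.params + (value,))

    def to_sql(self, select):
        # Render the accumulated operations as a single SQL statement.
        sql = f'SELECT {select} FROM "{self.table}"'
        if self.where is not None:
            sql += f" WHERE {self.where}"
        return sql

    def mean(self, column):
        # The average is computed inside the database, not in the client.
        sql = self.to_sql(f'AVG("{column}")')
        return self.conn.execute(sql, self.params).fetchone()[0]

    def count(self):
        sql = self.to_sql("COUNT(*)")
        return self.conn.execute(sql, self.params).fetchone()[0]


# Toy data; in a real deployment the table would already live in the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, delay REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 10.0), ("AA", 30.0), ("UA", 5.0), ("UA", 15.0), ("UA", 40.0)],
)

flights = TableProxy(conn, "flights")
ua = flights.filter("carrier", "=", "UA")   # no query has run yet
ua_mean_delay = ua.mean("delay")            # one SQL query, one number returned
ua_count = ua.count()
```

The analyst-facing calls (`filter`, `mean`, `count`) never mention SQL, which is the sense in which the translation happens "behind the scenes". A real transparency layer would have to cover far more of the language than this toy does.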
My old company Revolution Analytics has been in the business of providing commercial R support and tools for almost five years now. I do not believe the Revolution Analytics model and the Oracle model have anything in common. While I was at Revolution I learned of the deep love and strong opinions that the R community has for the project that it so selflessly supports and grows. It is interesting to read the R-bloggers' perspective on this:

Oracle’s strange understanding of R users

February 8, 2012
(This article was first published on Quantum Forest » rblogs, and kindly contributed to R-bloggers)

After reading David Smith’s tweet on the price of Oracle R Enterprise (actually free, but it requires Oracle Data Mining at $23K/core, as pointed out by Joshua Ulrich), I went to Oracle’s site to see what it was all about. Oracle has a very interesting concept of why we use R:
Statisticians and data analysts like R because they typically don’t know SQL and are not familiar with database tasks. R allows them to remain highly productive.
Pardon? It sounds like if we only knew SQL and database tasks we would not need statistical software. File for future reference.

I hope both companies are successful. I believe that the long term survival of commercial software providers hinges on the smart adoption and integration of powerful open source tools.

3 comments:

  1. It may not be very obvious from the press release and the official Oracle Advanced Analytics database option page, but Oracle R Enterprise is available for download from an OTN page: http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/ore-downloads-1502823.html. For more technical info about the product I suggest looking at six training presentations or the online documentation available from the ORE page http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html. For all other questions there is a discussion forum also available from the product page above. Once you have the ORE packages installed you can run the demos. For the list of available demos you can do demo(package="ORE").

    Also we would like to thank everyone involved in pointing out the ambiguity in our blog post regarding the "strange understanding of R users". We have clarified the statement to better reflect our position. For all your future comments and questions, please, do not hesitate to post them directly on our blog or discussion forum so that we can respond to them promptly.

    Thank you!

  2. Thanks for the information Denis.

    Kirk
