SparkIQ Labs Blog

2015-10-16T04:09:54-07:00

emaasit

Reblogged this on Emaasit's Blog.

2016-01-05T11:25:42-08:00

ecocarlisle

Hi,
When I try to run the “.\bin\sparkR” command I get the following error:

‘R’ is not recognized as an internal or external command,
operable program or batch file.

Any ideas?

Thanks,
Jon

2016-01-05T11:35:07-08:00

ecocarlisle

You need to navigate to where R is installed and then execute sparkR from there.
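
If it is unclear where R is installed, R itself can report it. The following is a minimal sketch using only base R (`R.home` and `Sys.which`); the variable names are illustrative, and it assumes the R session (for example, RStudio) sees the same PATH as the command prompt, which may not hold on every machine.

```r
# Directory that contains the R executable for the running session
r_bin <- R.home("bin")
print(r_bin)

# TRUE if an executable named "R" can already be found on the PATH
r_on_path <- unname(nzchar(Sys.which("R")))
print(r_on_path)
```

If `r_on_path` is `FALSE`, running `.\bin\sparkR` from a command prompt will most likely fail with the error above until the directory in `r_bin` is added to PATH.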

Pingback: Launch Apache Spark on AWS EC2 and Initialize SparkR Using RStudio | Mubashir Qasim

2015-11-16T08:59:13-08:00

Sandor

Hi, great tutorial. I was able to follow it easily. However, when I tried to create a Hive context using sparkRHive.init, I got the error:

“Spark SQL is not built with Hive support”

Any tips?

2015-11-17T14:55:22-08:00

palsumitpal

Shouldn't your first line say 1.5.1 rather than “With the recent release of Apache Spark 1.4.1 on July 15th, 2015”?

2015-11-17T15:03:45-08:00

emaasit

Hi Palsumitpal,
At the time of writing this post, the latest release was Spark 1.4.1. Maybe the word “recent” is confusing. I shall remove it. Thanks.

	# Set the system environment variables
	Sys.setenv(SPARK_HOME = "C:/Apache/spark-1.4.1")
	.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

	# Load the SparkR library
	library(SparkR)

	# Create a spark context and a SQL context
	sc <- sparkR.init(master = "local")
	sqlContext <- sparkRSQL.init(sc)

	# Create a SparkR DataFrame from the built-in faithful dataset
	DF <- createDataFrame(sqlContext, faithful)
	head(DF)

	# Create a simple local data.frame
	localDF <- data.frame(name=c("John", "Smith", "Sarah"), age=c(19, 23, 18))

	# Convert local data frame to a SparkR DataFrame
	df <- createDataFrame(sqlContext, localDF)

	# Print its schema
	printSchema(df)
	# root
	#  |-- name: string (nullable = true)
	#  |-- age: double (nullable = true)

	# Create a DataFrame from a JSON file
	path <- file.path(Sys.getenv("SPARK_HOME"), "examples/src/main/resources/people.json")
	peopleDF <- jsonFile(sqlContext, path)
	printSchema(peopleDF)

	# Register this DataFrame as a table.
	registerTempTable(peopleDF, "people")

	# SQL statements can be run by using the sql methods provided by sqlContext
	teenagers <- sql(sqlContext, "SELECT name FROM people WHERE age >= 13 AND age <= 19")

	# Call collect to get a local data.frame
	teenagersLocalDF <- collect(teenagers)

	# Print the teenagers in our dataset
	print(teenagersLocalDF)

	# Stop the SparkContext now
	sparkR.stop()

Installing and Starting SparkR Locally on Windows OS and RStudio


7 thoughts on “Installing and Starting SparkR Locally on Windows OS and RStudio”
