How to read and write files using Spark
- TEXT File
- CSV File
- JSON File
- PARQUET File
- ORC File
- AVRO File
- SEQUENCE File
Reading and writing several file formats in Spark 2.0
After RDD and DataFrame, a third abstraction called Dataset was introduced in Spark 2.0. Dataset is a superset of DataFrame: a DataFrame is simply an untyped Dataset[Row], while a Dataset[T] carries compile-time type information.
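For example, a DataFrame can be turned into a typed Dataset with as[T]. This is a minimal sketch; the Person case class is hypothetical and the spark session is the one pre-created in spark-shell (or built as shown below):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case class used only for illustration
case class Person(name: String, age: Long)

val spark = SparkSession.builder().master("local").appName("example").getOrCreate()
import spark.implicits._

// A DataFrame is an untyped Dataset[Row] ...
val df = Seq(Person("Alice", 30), Person("Bob", 25)).toDF()

// ... while as[Person] yields a strongly typed Dataset[Person],
// so lambdas below work on Person objects, not generic Rows
val ds = df.as[Person]
ds.filter(_.age > 26).show()
```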
In earlier versions of Spark, SparkContext was the entry point. From Spark 2.0 onward, SparkSession is the main entry point for the Dataset and DataFrame APIs. SparkSession wraps a SparkContext internally and combines the functionality of HiveContext and SQLContext. It is available under the name spark in spark-shell.
Creating a SparkSession:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .master("local")
  .appName("Spark Job")
  .getOrCreate()
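With the session in hand, each of the formats listed above can be read and written through the DataFrameReader/DataFrameWriter API. The paths below are hypothetical placeholders; note that Avro support in Spark 2.0 requires the external spark-avro package on the classpath, and sequence files go through the RDD API rather than the DataFrame one:

```scala
// TEXT: each line becomes a row with a single "value" column
val textDF = spark.read.text("/tmp/input.txt")
textDF.write.text("/tmp/out_text")

// CSV: header and schema inference are off by default
val csvDF = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("/tmp/input.csv")
csvDF.write.option("header", "true").csv("/tmp/out_csv")

// JSON: expects one JSON object per line by default
val jsonDF = spark.read.json("/tmp/input.json")
jsonDF.write.json("/tmp/out_json")

// PARQUET: the default format for spark.read.load / df.write.save
val parquetDF = spark.read.parquet("/tmp/input.parquet")
parquetDF.write.parquet("/tmp/out_parquet")

// ORC
val orcDF = spark.read.orc("/tmp/input.orc")
orcDF.write.orc("/tmp/out_orc")

// AVRO: via the external spark-avro package in Spark 2.0
// (from Spark 2.4 onward the built-in format name "avro" works)
val avroDF = spark.read.format("com.databricks.spark.avro").load("/tmp/input.avro")
avroDF.write.format("com.databricks.spark.avro").save("/tmp/out_avro")

// SEQUENCE files use the RDD API via the underlying SparkContext
val seqRDD = spark.sparkContext.sequenceFile[String, String]("/tmp/input.seq")
seqRDD.saveAsSequenceFile("/tmp/out_seq")
```

Each write call fails if the output directory already exists; pass .mode("overwrite") on the writer to replace it.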
Reading and writing several file formats in Spark 1.6
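In Spark 1.6 the same reader and writer calls go through SQLContext instead of SparkSession. A minimal sketch (in spark-shell the sc and sqlContext variables are pre-created; the paths are hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Entry points in Spark 1.6: SparkContext plus a SQLContext built on top of it
val conf = new SparkConf().setMaster("local").setAppName("Spark 1.6 Job")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// The reader/writer API has the same shape as in Spark 2.0
val df = sqlContext.read.json("/tmp/input.json")
df.write.parquet("/tmp/out_parquet")
```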