lokeshkh92/pyspark_sequence_to_orc

pyspark_sequence_to_orc

PySpark script to convert a sequence file to ORC format.

Set the sequence file input_path and the ORC file output_path in the script, then run it with the following command:

=====================================================

PYTHONSTARTUP=PySpark_data_validation.py pyspark2 --master yarn --deploy-mode client --executor-memory=4g --num-executors=3 --executor-cores=2 --driver-memory=2g --name "Analysis-Spark"

======================================================
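The script itself is not reproduced here, so the following is only a minimal sketch of what such a conversion might look like, not the repository's actual code. It assumes the sequence file holds text keys and values. The function names (pair_to_record, convert), the column names (key, value) and the two placeholder paths are hypothetical.

```python
# Hypothetical placeholders: the original script's real paths are not shown.
INPUT_PATH = "hdfs:///path/to/sequence_input"
OUTPUT_PATH = "hdfs:///path/to/orc_output"


def pair_to_record(pair):
    """Turn one (key, value) sequence file record into a tuple of strings."""
    key, value = pair
    return (
        None if key is None else str(key),
        None if value is None else str(value),
    )


def convert(input_path, output_path):
    """Read a sequence file and write its records out as ORC."""
    # Imported here so pair_to_record stays usable without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType, StructField, StructType

    # In the pyspark shell this returns the session the shell already created.
    spark = SparkSession.builder.getOrCreate()

    # Assumed two-column schema; a real dataset may need parsing of the value.
    schema = StructType([
        StructField("key", StringType(), True),
        StructField("value", StringType(), True),
    ])

    pairs = spark.sparkContext.sequenceFile(input_path)
    df = spark.createDataFrame(pairs.map(pair_to_record), schema)

    # Default save mode: the write fails if output_path already exists.
    df.write.orc(output_path)
```

In a Spark environment, calling convert(INPUT_PATH, OUTPUT_PATH) at the end of the file would perform the conversion when the shell loads the script through PYTHONSTARTUP.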
