HCC Hortonworks Community Connection
Error: "java.net.SocketException: Connection reset" and "org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)" when running PySpark using RDDs and DataFrames to load images

hkumar created · Feb 17, 2018 at 05:31 PM
SupportKB

Problem Description:
When running PySpark using RDDs and DataFrames to load images, the following error appears in the Spark executor logs:
17/09/07 10:37:19 ERROR Executor: Exception in task 1.0 in stage 6.0 (TID 118) 
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed) 
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:203) 
at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:207) 
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:125) 
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) 
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313) 
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69) 
at org.apache.spark.rdd.RDD.iterator(RDD.scala:275) 
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) 
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313) 
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277) 
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) 
at org.apache.spark.scheduler.Task.run(Task.scala:89) 
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745) 
Caused by: java.io.EOFException 
at java.io.DataInputStream.readInt(DataInputStream.java:392) 
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:139) 
... 15 more 

Cause:
A review of the Spark application History Server showed that only two tasks were run to process the job, and the entire RDD data was shared between those two tasks.

Spark workers shuffle data across the nodes; shuffling is the process of transferring data between stages. In Spark, the maximum allowed size of an RDD block or shuffle block is 2 GB, and this limit is not configurable. If a block exceeds 2 GB, the task fails and the driver retries it up to the configured retry count, after which the job/stage fails.

Hence, the error above occurs when a shuffle block exceeds 2 GB.

 
Solution:
To resolve the issue, increase the number of partitions, for example to 100 with rdd.repartition(100), to distribute the processing across the nodes and keep each shuffle block below 2 GB.
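As a rough sketch of how a partition count might be chosen (the 100-partition figure above is this article's example; the 2 GB limit is Spark's hard shuffle-block ceiling), the count can be estimated from the dataset size so each partition's shuffle output stays well under the limit. The helper below is a hypothetical, plain-Python illustration, not a Spark API:

```python
import math

# Spark's hard limit on a single RDD/shuffle block (not configurable).
MAX_BLOCK_BYTES = 2 * 1024 ** 3  # 2 GB

def estimate_partitions(total_bytes, target_bytes=256 * 1024 ** 2):
    """Estimate a partition count so each partition stays well below the
    2 GB shuffle-block limit (hypothetical helper, not a Spark API)."""
    if target_bytes >= MAX_BLOCK_BYTES:
        raise ValueError("per-partition target must be below the 2 GB block limit")
    return max(1, math.ceil(total_bytes / target_bytes))

# e.g. a 25 GB RDD at ~256 MB per partition -> 100 partitions,
# matching this article's rdd.repartition(100) example.
n = estimate_partitions(25 * 1024 ** 3)
# In PySpark the count would then be applied with:
#   rdd = rdd.repartition(n)
```

Aiming well below 2 GB per partition (e.g. 128-256 MB) leaves headroom for skewed keys, since repartitioning distributes rows roughly, not exactly, evenly.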

About:
This article created by Hortonworks Support (Article: 000006438) on 2018-02-16 09:56
OS: Linux
Type: Executing_Jobs
Version: 2.5.3
Support ID: 000006438
Tags: solution, hwsupport, Spark

© 2011-2019 Hortonworks Inc. All Rights Reserved.
