Tuning Heap Sizing for a 80 Node HDP cluster

kgautam created · Jul 30, 2018 at 05:15 PM · edited · Jul 30, 2018 at 05:26 PM

Heap sizing and garbage collection (GC) tuning play a central role in how healthy a cluster is and how efficiently its resources are used.
One of the biggest challenges is to fine-tune the heap so that resources are neither underutilized nor overutilized.


The following heap sizes were arrived at after an in-depth analysis of the health of individual services. Remember that this is a baseline: you can add more heap depending on the kind of workload you run.

Key takeaways

1. All the services in HDP are JVM based, and all of them need appropriate heap sizing and GC tuning.
2. HDFS and YARN are the base of all the services, so make sure the NN, DN, RM and NM are given sufficient heap.
3. As a rule of thumb, heaps up to 20 - 30 GB do not require any special tuning; anything above that needs fine tuning.
4. 99% of the time, a cluster that is not working efficiently has services that are struggling with GC and appear RED in the UI.
5. Most services have short-lived connections and interactions, so always give the young generation enough space, on the order of 5 - 10 GB (depending on load and concurrency).
6. HiveServer2, Spark Thrift Server and the LLAP server need special attention because they interact with every component in the cluster; slowness in any component will impact the connection establishment time of these services.
7. Look for "retries/retry" in your logs to find out which services are slow, and fine tune them.
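
The log check in point 7 can be sketched in a few lines of Python. This is only an illustration, not a tool shipped with HDP: the function names, the `*.log` file pattern and the exact regular expression are assumptions, and your services may word their retry messages differently.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed wording: matches "retry", "retries", "retrying" and "retried",
# case-insensitively, but not unrelated words such as "retrieve".
RETRY_PATTERN = re.compile(r"\bretr(?:y|ies|ying|ied)\b", re.IGNORECASE)


def count_retries(lines):
    """Return the number of lines that mention a retry."""
    return sum(1 for line in lines if RETRY_PATTERN.search(line))


def scan_log_dir(log_root):
    """Count retry lines per *.log file under log_root, busiest file first."""
    counts = Counter()
    for path in Path(log_root).rglob("*.log"):
        # errors="replace" keeps the scan going on files with odd encodings.
        with path.open(errors="replace") as handle:
            hits = count_retries(handle)
        if hits:
            counts[str(path)] = hits
    return counts.most_common()
```

Pointing `scan_log_dir` at the directory where your services write their logs (the location depends on your installation) gives a ranked list of files; the ones at the top are a reasonable place to start looking for a slow service.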

HDFS

YARN

MapReduce2

HBase

Oozie


Zookeeper


Ambari Infra


Ambari Metrics - Hbase



Atlas



Spark Thrift Server

Hive

https://community.hortonworks.com/articles/209789/making-hiveserver2-instance-handle-more-than-500-c.html

Ambari and Ambari Views should run with a heap of at least 15 GB to stay fast.
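
As a rough sketch of how the sizing guidance in this article maps onto JVM options, the helper below builds the standard HotSpot flags `-Xms`, `-Xmx` and `-Xmn` for a given heap. The function names and the threshold constant are hypothetical, setting the initial heap equal to the maximum is an assumption (a common practice, not something this article prescribes), and the 30 GB cut-off is simply the upper end of the 20 - 30 GB rule of thumb from the key takeaways.

```python
# Upper end of the 20 - 30 GB rule of thumb; an assumption for illustration.
GC_TUNING_THRESHOLD_GB = 30


def heap_flags(heap_gb, young_gb=None):
    """Build JVM heap flags for a heap of heap_gb GB.

    young_gb, if given, fixes the young generation size with -Xmn.
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be positive")
    # Assumption: initial heap (-Xms) is set equal to the maximum (-Xmx).
    flags = [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]
    if young_gb is not None:
        if not 0 < young_gb < heap_gb:
            raise ValueError("young_gb must be between 0 and heap_gb")
        flags.append(f"-Xmn{young_gb}g")
    return flags


def needs_gc_tuning(heap_gb):
    """Return True when the heap exceeds the rule-of-thumb threshold."""
    return heap_gb > GC_TUNING_THRESHOLD_GB
```

For example, `heap_flags(15)` gives the flags for the 15 GB Ambari heap mentioned above, and `heap_flags(24, 8)` adds an 8 GB young generation. Where these flags are actually set depends on the service and on how your cluster is managed.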

Attachments: nn.jpg, nn-1.jpg, nn-3.jpg, nn-4.jpg, rm-1.jpg, dn.jpg, mr2.jpg, apptimeline.jpg, hbase1.jpg, rs3.jpg, oozie.jpg, zookeeper.jpg, ambariinfra.jpg, ams-hbase.jpg, atlas.jpg, sts1.jpg, sts12.jpg, sts14.jpg