HCC Hortonworks Community Connection

Namenode Restart fails during HDP Upgrade.

Article by Gaurav Sharma · Dec 24, 2016 at 03:05 PM

Short Description:

Namenode restart fails during upgrade due to a delayed exit from safemode.

Article

SYMPTOMS: During an HDP upgrade, the NameNode restart fails, which causes the upgrade itself to fail. The following errors are usually seen:

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 42, in get_value_from_jmx
    return data_dict["beans"][0][property]
IndexError: list index out of range
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 420, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 720, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 101, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 184, in namenode
    if is_this_namenode_active() is False:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 554, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn1 is not listed as Active or Standby, waiting...
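
The IndexError in the first traceback is raised by the expression data_dict["beans"][0], so the "beans" list in the JMX response was empty when it was read. The following is a minimal sketch of that failure mode and a defensive way to read the value. The function name get_value_from_jmx_payload, the sample payloads, and the "tag.HAState" property name are illustrative assumptions, not Ambari's own code.

```python
import json

def get_value_from_jmx_payload(payload, prop):
    """Return prop from the first bean of a JMX JSON payload, or None if absent."""
    data = json.loads(payload)
    beans = data.get("beans", [])
    if not beans:
        # An empty list here is what makes beans[0] raise IndexError.
        return None
    return beans[0].get(prop)

# Hypothetical payloads: one with no beans, one with an assumed HA state property.
empty = '{"beans": []}'
ready = '{"beans": [{"tag.HAState": "active"}]}'

print(get_value_from_jmx_payload(empty, "tag.HAState"))  # None
print(get_value_from_jmx_payload(ready, "tag.HAState"))  # active
```

A None result only indicates that the queried bean was not present in the response at that moment; it does not by itself identify why.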

ROOT CAUSE: Starting with Ambari 2.4, HDP upgrades on large clusters can fail during the NameNode restart. The restart command waits for the NameNode to leave safemode. On a large cluster the NameNode takes longer to leave safemode, and if it has not done so within the timeout configured in the Ambari scripts, Ambari marks the action as failed. The issue is tracked in AMBARI-18786.
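
To see whether the NameNode is still in safemode before retrying the upgrade step, the output of "hdfs dfsadmin -safemode get" can be inspected. The sketch below is one way to do that from Python. The helper names are hypothetical, and it assumes the command prints lines containing "Safe mode is ON" or "Safe mode is OFF" and that the hdfs client is on the PATH of a user allowed to query the NameNode.

```python
import subprocess

def safemode_is_on(report):
    """Return True if any line of the safemode report says safemode is ON."""
    return any("Safe mode is ON" in line for line in report.splitlines())

def fetch_safemode_report():
    """Run the HDFS admin command and return its output as text.

    Assumes the hdfs client is installed and on PATH.
    """
    out = subprocess.check_output(["hdfs", "dfsadmin", "-safemode", "get"])
    return out.decode("utf-8")
```

On a cluster node, safemode_is_on(fetch_safemode_report()) returns True while at least one reported NameNode is still in safemode.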

SOLUTION: Upgrade to Ambari 2.5

WORKAROUND: If upgrading Ambari is not an option, increase the Ambari retry timeout as follows:

1. Increase the timeout in

/var/lib/ambari-server/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py

From this:

@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)

To this:

@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)

2. Restart Ambari server
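
To get a feel for how much extra waiting the new decorator arguments allow, the sketch below estimates the total sleep time of a retry loop with exponential backoff. It is a simplified model, not Ambari's implementation: it assumes one sleep between consecutive attempts, that the sleep is multiplied by backoff_factor after each failure, and that an optional cap (the hypothetical max_sleep_time parameter here) stops the sleep from growing past a limit. Whether and how the installed retry decorator caps the sleep should be checked in resource_management/libraries/functions/decorator.py.

```python
def total_retry_sleep(times, sleep_time, backoff_factor, max_sleep_time=None):
    """Estimate the seconds spent sleeping if every attempt fails (simplified model)."""
    total = 0
    current = sleep_time
    # `times` attempts leave room for at most `times - 1` sleeps between them.
    for _ in range(times - 1):
        total += current
        grown = current * backoff_factor
        # Grow the sleep only while it stays within the assumed cap.
        if max_sleep_time is None or grown <= max_sleep_time:
            current = grown
    return total

print(total_retry_sleep(5, 5, 2))                       # 75
print(total_retry_sleep(5, 5, 2, max_sleep_time=8))     # 20
print(total_retry_sleep(25, 25, 2, max_sleep_time=8))   # 600
```

Under this model, the original arguments wait between 20 and 75 seconds in total depending on the cap, while the workaround's arguments wait at least 600 seconds, which gives the NameNode considerably more time to leave safemode.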


S Mahapatra · May 11, 2018 at 09:07 AM

Hi @Gaurav,

I have a similar issue on Ambari 2.6.1 after the upgrade. My Python version is still 2.6. Can you help me, please?
2018-05-11 17:06:21,660 - Getting jmx metrics from NN failed. URL: http://x.y.z:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 42, in get_value_from_jmx
    return data_dict["beans"][0][property]
IndexError: list index out of range
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 361, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 978, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 99, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 203, in namenode
    if is_this_namenode_active() is False:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 62, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 611, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn1 is not listed as Active or Standby, waiting...

S Mahapatra · May 13, 2018 at 06:46 PM

Resolved. For me, it was a problem with one of the JournalNodes (JN).
