Note: The following commands should work and have been tested in QA:
ncli host start-remove id=X
ncli host get-remove-status
ncli host remove-finish id=X says that the host is MARKED_FOR_REMOVAL_BUT_NOT_DETACHABLE
Please note that the following explains how a node removal works. (Using this procedure without Nutanix Support/Engineering involvement will cause data corruption.)
In older versions of NOS, if you have to remove a node manually, follow this procedure with help from Nutanix Support.
1. nodetool -h localhost ring - find the token of the node to be removed
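The token is the last column of the nodetool ring output. A minimal sketch of pulling it out with awk, shown against a canned sample line (the sample values and exact column layout are assumptions and may vary by Cassandra version):

```shell
# Extract the ring token (last column) from a nodetool ring line.
# The sample line below is fabricated for illustration; in practice,
# pipe `nodetool -h localhost ring` through grep for the node's IP.
sample_line="10.1.1.5  datacenter1  rack1  Up  Normal  1.2 GB  25.00%  O8yDYNgicJraBlfZrHAsORTseZq0MQ"
token=$(echo "$sample_line" | awk '{print $NF}')
echo "$token"    # prints O8yDYNgicJraBlfZrHAsORTseZq0MQ
```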
2. Dynamically remove the node:
ring_changer -node_ring_token=O8yDYNgicJraBlfZrHAsORTseZq0MQ
(you can add -skip_keyspaces="stats")
A fuller invocation that skips additional keyspaces:
ring_changer -node_ring_token="KlbSuLpdEFJDIoUXp1TPVEwmcYrlo9pJlm4yHemOtpmnBowMcyYQAbTcF8Vh" -dynamic_ring_change=true -skip_keyspaces="stats,alerts_keyspace,alerts_index_keyspace,medusa_extentgroupaccessdatamap,pithos"
3. Look at dynamic_ring_changer.out, dynamic_ring_changer.INFO, and the Cassandra logs for any crashes if the removal does not complete.
4. Compact and remove stats if there are Cassandra crashes:
a. Connect to cassandra using 'cassandra-cli'
cassandra-cli -h localhost -p 9160 -k stats
b. Remove the desired rows:
del stats_realtime['PWhm:vdisk_usage'];
del stats_realtime['nYeQ:vdisk_derived_usage'];
del stats_realtime['ha4h:vdisk_perf'];
del stats_realtime['E0Tl:vdisk_frontend_adapter_perf'];
c. Exit
quit;
d. for i in `svmips`; do echo $i; ssh $i "source /etc/profile; nodetool -h localhost compact"; done
5. Run ring_changer again.
6. Verify that all the data has replication 2 (even though the node to be removed is powered off):
for i in `svmips`; do ssh $i "cd data/stargate-storage/disks; find . -name \*.egroup -print|cut -d/ -f6"; done|sort | uniq -c|grep -v " 2 "
It will print any extent groups with replication other than 2.
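The pipeline above counts how many disks hold a copy of each egroup file and filters out the ones seen exactly twice. A self-contained sketch of that counting logic, using made-up egroup file names for illustration:

```shell
# Each .egroup file should appear on exactly 2 disks (RF=2).
# `uniq -c` prefixes each name with its copy count, and
# `grep -v " 2 "` keeps only names whose count is not 2.
# Here 100.egroup has 2 copies (healthy) and 101.egroup has 1.
printf '%s\n' 100.egroup 100.egroup 101.egroup \
  | sort | uniq -c | grep -v " 2 "
# prints the under-replicated entry:  1 101.egroup
```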
7. Now we can verify zeus_config (zeus_config_printer):
- make sure all the disks on the node have data migrated and are marked to remove:
to_remove: true
data_migrated: true
- the node has kOkToBeRemoved and node_removal_ack: 273 (0x111)
0x100 - zookeeper ok to be removed
0x10 -- cassandra ok to be removed
0x1 -curator ok to be removed.
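The ack value can be checked arithmetically: node_removal_ack is a bitmask, and 0x100 | 0x10 | 0x1 = 0x111 = 273 decimal. A quick shell sketch:

```shell
# node_removal_ack is a bitmask of per-component acks:
#   0x100 Zookeeper, 0x10 Cassandra, 0x1 Curator
ack=$((0x100 | 0x10 | 0x1))
echo "$ack"              # prints 273
printf '0x%x\n' "$ack"   # prints 0x111
# Test a single component's bit, e.g. Cassandra's:
[ $((ack & 0x10)) -ne 0 ] && echo "cassandra acked"
```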
8. Now run
ncli host remove-finish id=X
9. nodetool -h localhost removetoken O8yDYNgicJraBlfZrHAsORTseZq0MQ