dedup info:
http://nutanix.blogspot.com/2013/08/introducing-nede-nutanix-elastic-dedup.html
Enabling Dedup
1. Turn on fingerprint-on-write:
- it can be done per container
ncli ctr edit name=xyz fingerprint-on-write=on
- it can be done per vdisk
ncli vdisk edit name=NFS:2389 fingerprint-on-write=on
2. When you upgrade to 3.5.1, enable dedup right away if it is needed. Fingerprinting (FP) is done on write, so
data written before dedup is enabled is not fingerprinted, and we may not get the benefit of dedup if it is
enabled later (unless the vdisk manipulator tool is used).
Enabling it at upgrade time works because Curator converts 16 MB extent groups to 4 MB extent groups, so there
will be a lot of write activity, and those writes get fingerprinted.
3. Set medusa_extent_group_id_map_cache_size_mb=2048 (stargate gflag) to reduce evictions. Make sure the CVM has 24G of memory.
4. Content cache and extent cache are allocated based on CVM memory (ENG-10798)
Monitoring Dedup example:
- 3.5.2 Prism GUI has dedup stats. (Otherwise you have to look around in the 2009 page and the Curator master logs.)
#0. Overall container usage & amount of data which is fingerprinted:
On master curator (data/logs/curator.INFO)
I1203 12:27:53.777602 curator_execute_job_op.cc:2452] ContainerEgroupFileSizeBytes[912] = 4168785526784
I1203 12:27:53.777606 curator_execute_job_op.cc:2452] ContainerEgroupFileSizeBytes[107907] = 1100420612096
I1203 12:27:53.777609 curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[912] = 4136075722752
I1203 12:27:53.777613 curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[107907] = 1100301139968
I1203 12:27:53.777616 curator_execute_job_op.cc:2452] ContainerInternalGarbageBytes[912] = 32709804032
I1203 12:27:53.777621 curator_execute_job_op.cc:2452] ContainerInternalGarbageBytes[107907] = 119472128
I1203 12:27:53.777623 curator_execute_job_op.cc:2452] ContainerFingerprintedBytes[912] = 3699631161344
From the above, it looks like the customer has two containers (912 & 107907), and 912 is the one for which fingerprint-on-write has been turned on (only 912 reports ContainerFingerprintedBytes). Almost 90% of it is fingerprinted (about 3.7T out of 4.1T).
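The percentage above can be recomputed from the log lines with a small script. This is only a sketch: the helper names (container_counters, fingerprinted_pct) are made up for illustration, and using ContainerUntransformedSizeBytes as the denominator is an assumption, not something the Curator log states.

```python
import re
from collections import defaultdict

# Matches counter lines such as:
#   ... ContainerFingerprintedBytes[912] = 3699631161344
COUNTER_RE = re.compile(r"(Container\w+)\[(\d+)\]\s*=\s*(\d+)")


def container_counters(log_lines):
    """Return {container_id: {counter_name: value}} from curator.INFO lines."""
    counters = defaultdict(dict)
    for line in log_lines:
        match = COUNTER_RE.search(line)
        if match:
            name, container_id, value = match.groups()
            counters[int(container_id)][name] = int(value)
    return dict(counters)


def fingerprinted_pct(counters):
    """Fingerprinted bytes as a percentage of untransformed bytes (assumed base)."""
    result = {}
    for container_id, values in counters.items():
        total = values.get("ContainerUntransformedSizeBytes", 0)
        fingerprinted = values.get("ContainerFingerprintedBytes", 0)
        result[container_id] = 100.0 * fingerprinted / total if total else 0.0
    return result


# Lines copied from the curator.INFO excerpt above.
sample = [
    "I1203 12:27:53.777609 curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[912] = 4136075722752",
    "I1203 12:27:53.777613 curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[107907] = 1100301139968",
    "I1203 12:27:53.777623 curator_execute_job_op.cc:2452] ContainerFingerprintedBytes[912] = 3699631161344",
]

pct = fingerprinted_pct(container_counters(sample))
for container_id in sorted(pct):
    print(f"container {container_id}: {pct[container_id]:.1f}% fingerprinted")
```

A container with no ContainerFingerprintedBytes line is reported as 0%, which is how container 107907 shows up here.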
#1. Read Path Live (30sec) Metrics
wget -O- 'http://localhost:2009/h/vars?regex=stargate.content&format=text'
stargate/content_cache_adds 2921
stargate/content_cache_dedup_ref_count 2.4583 <--- effective RAM/Flash savings right now is ~2.5x
stargate/content_cache_evictions_flash 9815
stargate/content_cache_evictions_memory 15969
stargate/content_cache_flash_page_in_pct 4
stargate/content_cache_flash_spills 9815
stargate/content_cache_hits_pct 98 <---- good
stargate/content_cache_lookups 261701
stargate/content_cache_multi_touch_flash_max 21474836480
stargate/content_cache_multi_touch_flash_usage 21474836480
stargate/content_cache_multi_touch_memory_max 1899180856
stargate/content_cache_multi_touch_memory_usage 1900703744
stargate/content_cache_page_in_from_flash 13062
stargate/content_cache_single_touch_flash_max 0
stargate/content_cache_single_touch_flash_usage 0
stargate/content_cache_single_touch_memory_max 474795208
stargate/content_cache_single_touch_memory_usage 473272320
stargate/content_cache_usage_flash_mb 20480 <---- 20G flash
stargate/content_cache_usage_memory_mb 2264 <---- 2.2G RAM
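To pull the interesting numbers out of this dump without reading it by eye, the text output can be parsed into a dict. A minimal sketch, assuming the format is one "name value" pair per line as shown above; parse_vars is a hypothetical helper, and the values are pasted from the sample rather than fetched live.

```python
def parse_vars(text):
    """Parse 'name value' lines from the 2009 /h/vars text output into a dict.

    Anything after the value (for example a trailing '<--- note') is ignored.
    """
    metrics = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].startswith("stargate/"):
            continue
        try:
            metrics[parts[0]] = float(parts[1])
        except ValueError:
            continue  # not a numeric metric line
    return metrics


# Subset of the sample output above.
SAMPLE = """\
stargate/content_cache_dedup_ref_count 2.4583
stargate/content_cache_hits_pct 98
stargate/content_cache_multi_touch_flash_max 21474836480
stargate/content_cache_multi_touch_flash_usage 21474836480
stargate/content_cache_multi_touch_memory_usage 1900703744
stargate/content_cache_single_touch_memory_usage 473272320
"""

MB = 1024 * 1024
metrics = parse_vars(SAMPLE)

# Single-touch plus multi-touch memory pools, in MB.
memory_mb = (
    metrics["stargate/content_cache_single_touch_memory_usage"]
    + metrics["stargate/content_cache_multi_touch_memory_usage"]
) / MB

# How full the multi-touch flash pool is.
flash_fill_pct = (
    100.0
    * metrics["stargate/content_cache_multi_touch_flash_usage"]
    / metrics["stargate/content_cache_multi_touch_flash_max"]
)

print(f"hit pct:         {metrics['stargate/content_cache_hits_pct']:.0f}")
print(f"dedup ref count: {metrics['stargate/content_cache_dedup_ref_count']:.2f}")
print(f"memory in use:   {memory_mb:.0f} MB")
print(f"flash pool fill: {flash_fill_pct:.0f}%")
```

In this sample the two memory pools add up to the reported content_cache_usage_memory_mb, and the flash pool is completely full, which fits the non-zero flash eviction count.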
#2. Medusa ExtentGroupId Map (where SHA1s are kept) hit/miss info:
wget -O- 'http://localhost:2009/h/vars?regex=medusa.cache.extent_group_id_map&format=text'
medusa/cache/extent_group_id_map/current_size_bytes 268430856
medusa/cache/extent_group_id_map/entries 9496
medusa/cache/extent_group_id_map/evictions 293020 <<< to reduce evictions increase the extent_group_id cache.
medusa/cache/extent_group_id_map/hits 28665029
medusa/cache/extent_group_id_map/misses 923660
medusa/cache/extent_group_id_map/insertions 9417038
medusa/cache/extent_group_id_map/max_size_bytes 268435456 ---> 256MB cache size for this map
Medusa ExtentGroupId map hit ratio is 96.9% (28665029/(28665029+923660)).
After increasing the Medusa ExtentGroupId map cache to 2G:
medusa/cache/extent_group_id_map/current_size_bytes 1644951348
medusa/cache/extent_group_id_map/entries 75678 (increased)
medusa/cache/extent_group_id_map/evictions 0 ( zero evictions)
medusa/cache/extent_group_id_map/hits 71550917
medusa/cache/extent_group_id_map/insertions 24404990
medusa/cache/extent_group_id_map/max_size_bytes 2147483648
medusa/cache/extent_group_id_map/misses 875467 (reduced)
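The before/after effect of the larger cache can be compared with the same hits/(hits+misses) formula. A small sketch using the counters from the two dumps above; hit_ratio_pct is a hypothetical helper, and it assumes the two sets of counters were collected over comparable workloads.

```python
def hit_ratio_pct(hits, misses):
    """Cache hit ratio as a percentage; 0.0 when there were no lookups."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# hits / misses copied from the two extent_group_id_map dumps above.
before = hit_ratio_pct(hits=28665029, misses=923660)   # 256MB cache
after = hit_ratio_pct(hits=71550917, misses=875467)    # 2G cache

print(f"256MB cache: {before:.1f}% hits")
print(f"2G cache:    {after:.1f}% hits")
print(f"change:      {after - before:+.1f} points")
```

The hit ratio only tells part of the story; the evictions counter dropping to zero is the more direct sign that the map now fits in the cache.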
#3. Write Path Live (30sec) Metrics
wget -O- 'http://localhost:2009/h/vars?regex=stargate.dedup&format=text'
stargate/dedup_fingerprint_added_bytes 8355840
stargate/dedup_fingerprint_cleared_bytes 8290304
III. Disable Dedup:
ncli ctr edit name=xyz fingerprint-on-write=off
stargate.gflags: stargate_disable_dedup_on_read=true