
Tuesday 15 May 2012

Caching in Cassandra 1.1

In Cassandra 1.1, cache tuning has been completely reworked to make it easier to use. Each key cache hit saves one seek, and each row cache hit saves at least two seeks, sometimes more. The key cache is small relative to the time it saves, so it is worthwhile to size it generously. The row cache saves even more time, but it must store the entire contents of each cached row, so it is extremely space-intensive. It is best to use the row cache only if you have hot rows or static rows.
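To make the seek arithmetic concrete, here is a minimal Python sketch that turns the per-hit figures above into a lower bound on seeks avoided. The function name and the hit counts are hypothetical, chosen only for illustration, and the sketch assumes key cache hits and row cache hits are counted separately.

```python
# Seeks saved per cache hit, as stated in the text above.
SEEKS_SAVED_PER_KEY_CACHE_HIT = 1
SEEKS_SAVED_PER_ROW_CACHE_HIT = 2  # a minimum; the text notes it can be more


def min_seeks_saved(key_cache_hits, row_cache_hits):
    """Return a lower bound on disk seeks avoided for the given hit counts."""
    if key_cache_hits < 0 or row_cache_hits < 0:
        raise ValueError("hit counts must be non-negative")
    return (key_cache_hits * SEEKS_SAVED_PER_KEY_CACHE_HIT
            + row_cache_hits * SEEKS_SAVED_PER_ROW_CACHE_HIT)


# Hypothetical workload: 10,000 key cache hits and 2,000 row cache hits.
print(min_seeks_saved(10000, 2000))  # prints 14000
```

The result is only a floor, since a row cache hit can save more than two seeks.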

Cache configuration settings in "cassandra.yaml"

The main settings are key_cache_size_in_mb and row_cache_size_in_mb in cassandra.yaml.
Here is the part of the cassandra.yaml file that contains the caching-related settings.
# Maximum size of the key cache in memory
# Default value is empty to make it "auto" (min(5% of Heap (in MB), 100MB)). Set to 0 to disable key cache.
key_cache_size_in_mb:



# Duration in seconds after which Cassandra should
# save the key cache. Caches are saved to saved_caches_directory as
# specified in this configuration file.
#
# Saved caches greatly improve cold-start speeds, and are relatively cheap in
# terms of I/O for the key cache. Row cache saving is much more expensive and
# has limited use.
#
# Default is 14400 or 4 hours.
key_cache_save_period: 14400



# Number of keys from the key cache to save
# Disabled by default, meaning all keys are going to be saved
# key_cache_keys_to_save: 100



# Maximum size of the row cache in memory.
# NOTE: if you reduce the size, you may not get your hottest keys loaded on startup.
#
# Default value is 0, to disable row caching.
row_cache_size_in_mb: 0



# Duration in seconds after which Cassandra should
# save the row cache. Caches are saved to saved_caches_directory as specified
# in this configuration file.
#
# Saved caches greatly improve cold-start speeds, and are relatively cheap in
# terms of I/O for the key cache. Row cache saving is much more expensive and
# has limited use.
#
# Default is 0 to disable saving the row cache.
row_cache_save_period: 0



# Number of keys from the row cache to save
# Disabled by default, meaning all keys are going to be saved
# row_cache_keys_to_save: 100



# The provider for the row cache to use.
#
# Supported values are: ConcurrentLinkedHashCacheProvider, SerializingCacheProvider
#
# SerializingCacheProvider serialises the contents of the row and stores
# it in native memory, i.e., off the JVM Heap. Serialized rows take
# significantly less memory than "live" rows in the JVM, so you can cache
# more rows in a given memory footprint.  And storing the cache off-heap
# means you can use smaller heap sizes, reducing the impact of GC pauses.
#
# It is also valid to specify the fully-qualified class name to a class
# that implements org.apache.cassandra.cache.IRowCacheProvider.
#
# Defaults to SerializingCacheProvider
row_cache_provider: SerializingCacheProvider



# saved caches
saved_caches_directory: /var/lib/cassandra/saved_caches
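
The "auto" rule for key_cache_size_in_mb quoted in the comments above, min(5% of Heap (in MB), 100MB), can be worked through with a short Python sketch. The function name is hypothetical, and rounding the 5% figure down to a whole megabyte is an assumption made for illustration; the configuration comment does not say how fractions are handled.

```python
def auto_key_cache_size_mb(heap_size_mb):
    """Key cache size in MB when key_cache_size_in_mb is left empty ("auto").

    Follows the rule from the cassandra.yaml comment:
    min(5% of heap in MB, 100 MB).
    """
    if heap_size_mb <= 0:
        raise ValueError("heap size must be positive")
    # Assumption: 5% of the heap is rounded down to a whole megabyte.
    five_percent_of_heap = heap_size_mb * 5 // 100
    return min(five_percent_of_heap, 100)


# Example heap sizes in MB (hypothetical values).
for heap_mb in (1024, 2048, 8192):
    print(heap_mb, auto_key_cache_size_mb(heap_mb))
```

Under this rule the 100 MB cap is reached once the heap is 2000 MB or larger, so on bigger heaps a larger key cache has to be set explicitly.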

How to enable per-column-family caching in Cassandra 1.1

 

  1. Create a keyspace using cassandra-cli


    create keyspace Keyspace1 with placement_strategy='SimpleStrategy' and strategy_options = {replication_factor:1};   
  2. Switch to the newly created keyspace


    use Keyspace1; 
  3. Create column families in this keyspace with the different caching options

     
     create column family Standard1 with caching = 'ALL';
    
     create column family Standard2 with caching = 'KEYS_ONLY';
    
     create column family Standard3 with caching = 'ROWS_ONLY';
     
     create column family Standard4 with caching = 'NONE';
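
As a quick reference for the four caching values used in the statements above, the following Python sketch maps each value to the caches it enables for a column family. The dictionary and function names are hypothetical. Note that, per the cassandra.yaml comments earlier, the row cache also needs row_cache_size_in_mb set above its default of 0 before any rows are actually cached.

```python
# Which caches each per-column-family `caching` value turns on.
CACHING_OPTIONS = {
    "ALL": ("key cache", "row cache"),
    "KEYS_ONLY": ("key cache",),
    "ROWS_ONLY": ("row cache",),
    "NONE": (),
}


def caches_enabled(caching):
    """Return the caches enabled by a column family's caching option."""
    option = caching.upper()
    if option not in CACHING_OPTIONS:
        raise ValueError("unknown caching option: " + caching)
    return CACHING_OPTIONS[option]


for option in ("ALL", "KEYS_ONLY", "ROWS_ONLY", "NONE"):
    print(option, caches_enabled(option))
```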
     

2 comments:

Alex Zarutin said...

Any ideas of why updating the existing CF fails?

ColumnFamily: index
"Values mapped to user ids."
Key Validation Class: org.apache.cassandra.db.marshal.BytesType
Default column value validator: org.apache.cassandra.db.marshal.BytesType
Columns sorted by: org.apache.cassandra.db.marshal.BytesType
GC grace seconds: 864000
Compaction min/max thresholds: 4/32
Read repair chance: 0.1
DC Local Read repair chance: 0.0
Replicate on write: true
Caching: NONE
Bloom Filter FP chance: default
Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
Compression Options:
sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor


[default@XXX] UPDATE COLUMN FAMILY index WITH caching='ROWS_ONLY';
Syntax error at position 21: mismatched input 'index' expecting set null

SAMARTH GAHIRE said...

Are you able to update other column families? Please try creating a test column family and updating it. I think "index" should not be used as a column family name.