Class DfsBlockCache


  • public final class DfsBlockCache
    extends Object
    Caches slices of a BlockBasedFile in memory for faster read access.

    The DfsBlockCache serves as a Java based "buffer cache", loading segments of a BlockBasedFile into the JVM heap prior to use. As JGit often wants to do reads of only tiny slices of a file, the DfsBlockCache tries to smooth out these tiny reads into larger block-sized IO operations.
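
    One way to picture the smoothing of tiny reads is to round each requested position down to a block boundary and load the whole enclosing block. The following is a minimal sketch, assuming a power-of-two block size; the class and method names (BlockAlign, blockStart, offsetInBlock) are hypothetical and are not part of JGit.

```java
/** Hypothetical helper for illustration; not part of the JGit API. */
final class BlockAlign {
    private final int blockSize;

    BlockAlign(int blockSize) {
        // Assumption: the block size is a positive power of two, so masking works.
        if (blockSize <= 0 || Integer.bitCount(blockSize) != 1) {
            throw new IllegalArgumentException("blockSize must be a power of 2");
        }
        this.blockSize = blockSize;
    }

    /** Start offset of the block that contains the given file position. */
    long blockStart(long position) {
        return position & ~((long) blockSize - 1);
    }

    /** Offset of the position inside its enclosing block. */
    int offsetInBlock(long position) {
        return (int) (position - blockStart(position));
    }
}
```

    With a 64 KiB block size, a small read at position 70000 would be served from the block starting at 65536, so nearby reads can reuse the same cached block.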

    Whenever a cache miss occurs, loading is invoked by exactly one thread for the given (DfsStreamKey,position) key tuple. This is ensured by an array of locks, with the tuple hashed to a lock instance.
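
    The lock array described above is a form of lock striping. A minimal sketch follows, assuming a simple hash mix; StripedLoadLocks and lockFor are hypothetical names, and the hashing used by the actual cache may differ.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical lock-striping sketch; not the JGit implementation. */
final class StripedLoadLocks {
    private final ReentrantLock[] locks; // fixed size, allocated once

    StripedLoadLocks(int stripes) {
        if (stripes <= 0) {
            throw new IllegalArgumentException("stripes must be > 0");
        }
        locks = new ReentrantLock[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    /** Map a (key hash, position) tuple to a lock; equal tuples always share one lock. */
    ReentrantLock lockFor(int keyHash, long position) {
        // Assumption: a simple mixing function, chosen only for illustration.
        int h = 31 * keyHash + Long.hashCode(position);
        return locks[Math.floorMod(h, locks.length)];
    }
}
```

    A loader would typically acquire the lock returned by lockFor, check the cache again, load the block only if it is still missing, and release the lock in a finally block. Unrelated tuples may hash to the same lock, which costs some concurrency but never correctness.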

    It's too expensive during object access to maintain an accurate least recently used (LRU) ordering. Strictly ordering every read adds a lot of overhead that typically doesn't yield a corresponding benefit to the application. Instead, this cache implements a clock replacement algorithm, giving each block at least one chance to have been accessed during a sweep of the cache to save itself from eviction. The number of sweep chances is configurable per pack extension.
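
    The clock idea can be sketched in a few lines. This is a simplified, single-threaded illustration, assuming a fixed number of entries rather than a byte limit; ClockCache and all of its members are hypothetical and do not mirror the JGit internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical clock (second-chance) cache; not thread-safe, not the JGit implementation. */
final class ClockCache<K, V> {
    private static final class Slot<K, V> {
        final K key;
        final V value;
        int chances; // sweeps this entry can still survive

        Slot(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Slot<K, V>> ring = new ArrayList<>();
    private final Map<K, Slot<K, V>> index = new HashMap<>();
    private final int capacity;
    private final int maxChances;
    private int hand; // next ring position the sweep inspects

    ClockCache(int capacity, int maxChances) {
        if (capacity <= 0 || maxChances < 1) {
            throw new IllegalArgumentException("capacity and maxChances must be positive");
        }
        this.capacity = capacity;
        this.maxChances = maxChances;
    }

    /** A hit only refreshes the entry's chances; nothing is reordered. */
    V get(K key) {
        Slot<K, V> s = index.get(key);
        if (s == null) {
            return null;
        }
        s.chances = maxChances;
        return s.value;
    }

    /** Insert a new entry, sweeping for a victim when the cache is full. */
    void put(K key, V value) {
        if (index.containsKey(key)) {
            return; // sketch: an existing entry is kept as-is
        }
        Slot<K, V> fresh = new Slot<>(key, value);
        if (ring.size() < capacity) {
            ring.add(fresh);
        } else {
            // Sweep: an entry with chances left is spared and loses one chance.
            while (ring.get(hand).chances > 0) {
                ring.get(hand).chances--;
                hand = (hand + 1) % capacity;
            }
            index.remove(ring.get(hand).key);
            ring.set(hand, fresh);
            hand = (hand + 1) % capacity;
        }
        index.put(key, fresh);
    }

    boolean contains(K key) {
        return index.containsKey(key);
    }
}
```

    Compared with strict LRU, a hit here writes a single counter instead of relinking a list, which is the overhead the clock approach avoids.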

    Entities created by the cache are held under hard references, preventing the Java VM from clearing anything. Blocks are discarded by the replacement algorithm when adding a new block would cause the cache to exceed its configured maximum size.

    The key tuple is passed through to methods as a pair of parameters rather than as a single Object, thus reducing the transient memory allocations of callers. It is more efficient to avoid the allocation, as we can't be 100% sure that a JIT would be able to stack-allocate a key tuple.

    The internal hash table does not expand at runtime; it is fixed in size at cache creation time. The internal lock table used to gate load invocations is also fixed in size.
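
    The two design points above, a key tuple passed as separate parameters and a table sized once at creation, can be illustrated together. The sketch below is an assumption-laden simplification: FixedBlockTable, its Entry class, and the hash mix are hypothetical, and it omits eviction and concurrency control entirely.

```java
/** Hypothetical fixed-size chained table; not the JGit implementation. */
final class FixedBlockTable {
    private static final class Entry {
        final Object streamKey;
        final long position;
        final byte[] block;
        final Entry next;

        Entry(Object streamKey, long position, byte[] block, Entry next) {
            this.streamKey = streamKey;
            this.position = position;
            this.block = block;
            this.next = next;
        }
    }

    private final Entry[] buckets; // sized once in the constructor, never resized

    FixedBlockTable(int tableSize) {
        if (tableSize <= 0) {
            throw new IllegalArgumentException("tableSize must be > 0");
        }
        buckets = new Entry[tableSize];
    }

    private int slot(Object streamKey, long position) {
        // Assumption: a simple mixing function, chosen only for illustration.
        int h = 31 * streamKey.hashCode() + Long.hashCode(position);
        return Math.floorMod(h, buckets.length);
    }

    /** The tuple arrives as two parameters, so a lookup allocates no key object. */
    byte[] get(Object streamKey, long position) {
        for (Entry e = buckets[slot(streamKey, position)]; e != null; e = e.next) {
            if (e.position == position && e.streamKey.equals(streamKey)) {
                return e.block;
            }
        }
        return null;
    }

    /** Prepend to the bucket chain; the newest entry for a tuple is found first. */
    void put(Object streamKey, long position, byte[] block) {
        int i = slot(streamKey, position);
        buckets[i] = new Entry(streamKey, position, block, buckets[i]);
    }
}
```

    Because the bucket array never grows, a table that is too small for the workload leads to longer chains rather than to a resize pause.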

    • Method Detail

      • reconfigure

        public static void reconfigure(DfsBlockCacheConfig cfg)
        Modify the configuration of the block cache.

        The new configuration is applied immediately, and the existing cache is cleared.

        Parameters:
        cfg - the new block cache configuration.
        Throws:
        IllegalArgumentException - the cache configuration contains one or more invalid settings, usually a limit that is too low.
      • getInstance

        public static DfsBlockCache getInstance()
        Get the currently active DfsBlockCache.
        Returns:
        the currently active DfsBlockCache.
      • getCurrentSize

        public long[] getCurrentSize()
        Get total number of bytes in the cache, per pack file extension.
        Returns:
        total number of bytes in the cache, per pack file extension.
      • getFillPercentage

        public long getFillPercentage()
        Get how full the cache is, as a percentage in the range 0..100.
        Returns:
        a value in the range 0..100 indicating how full the cache is.
      • getHitCount

        public long[] getHitCount()
        Get number of requests for items in the cache, per pack file extension.
        Returns:
        number of requests for items in the cache, per pack file extension.
      • getMissCount

        public long[] getMissCount()
        Get number of requests for items not in the cache, per pack file extension.
        Returns:
        number of requests for items not in the cache, per pack file extension.
      • getTotalRequestCount

        public long[] getTotalRequestCount()
        Get total number of requests (hit + miss), per pack file extension.
        Returns:
        total number of requests (hit + miss), per pack file extension.
      • getHitRatio

        public long[] getHitRatio()
        Get hit ratios, per pack file extension.
        Returns:
        hit ratios, per pack file extension.
      • getEvictions

        public long[] getEvictions()
        Get number of evictions performed due to cache being full, per pack file extension.
        Returns:
        number of evictions performed due to cache being full, per pack file extension.
      • hasBlock0

        public boolean hasBlock0(DfsStreamKey key)
        Quickly check if the cache contains block 0 of the given stream.

        This can be useful for sophisticated pre-read algorithms to quickly determine if a file is likely already in cache, especially small reftables which may be smaller than a typical DFS block size.

        Parameters:
        key - the file to check.
        Returns:
        true if block 0 (the first block) is in the cache.
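
        The per-extension counters returned by getHitCount() and getMissCount() can be combined by the caller, for example into a hit percentage. The helper below is a hypothetical sketch that works on plain long[] arrays; it assumes both arrays use the same index for the same pack extension and treats a missing index as zero.

```java
/** Hypothetical helper for illustration; not part of the JGit API. */
final class CacheStatsSketch {
    private CacheStatsSketch() {
    }

    /** Percentage of requests served from cache per array index; 0 when there were no requests. */
    static long[] hitPercent(long[] hits, long[] misses) {
        int n = Math.max(hits.length, misses.length);
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            // Assumption: a shorter array simply has no data for the higher indexes.
            long h = i < hits.length ? hits[i] : 0;
            long m = i < misses.length ? misses[i] : 0;
            long total = h + m;
            out[i] = total == 0 ? 0 : h * 100 / total;
        }
        return out;
    }
}
```

        With JGit on the classpath, the inputs would come from DfsBlockCache.getInstance().getHitCount() and getMissCount().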