Class DfsGarbageCollector


  • public class DfsGarbageCollector
    extends Object
    Repack and garbage collect a repository.
    • Constructor Detail

      • DfsGarbageCollector

        public DfsGarbageCollector(DfsRepository repository)
        Initialize a garbage collector.
        Parameters:
        repository - the repository from which objects to be packed will be read.
    • Method Detail

      • getPackConfig

        public PackConfig getPackConfig()
        Get configuration used to generate the new pack file.
        Returns:
        configuration used to generate the new pack file.
      • setPackConfig

        public DfsGarbageCollector setPackConfig(PackConfig newConfig)
        Set the new configuration to use when creating the pack file.
        Parameters:
        newConfig - the new configuration to use when creating the pack file.
        Returns:
        this
      • setReftableConfig

        public DfsGarbageCollector setReftableConfig(ReftableConfig cfg)
        Set configuration to write a reftable.
        Parameters:
        cfg - configuration to write a reftable. Reftable writing is disabled (default) when cfg is null.
        Returns:
        this
      • setConvertToReftable

        public DfsGarbageCollector setConvertToReftable(boolean convert)
        Set whether the garbage collector should convert references to reftable.
        Parameters:
        convert - if true, and setReftableConfig(ReftableConfig) has been given a non-null configuration, and a GC reftable does not yet exist, the garbage collector will create one by scanning the existing references and writing a new reftable. Default is true.
        Returns:
        this
      • setIncludeDeletes

        public DfsGarbageCollector setIncludeDeletes(boolean include)
        Set whether the garbage collector will include tombstones for deleted references in the reftable.
        Parameters:
        include - if true, the garbage collector will include tombstones for deleted references in the reftable. Default is false.
        Returns:
        this
      • getCoalesceGarbageLimit

        public long getCoalesceGarbageLimit()
        Get the coalesce garbage limit.
        Returns:
        the coalesce garbage limit in bytes. Garbage packs smaller than this size will be repacked.
      • setCoalesceGarbageLimit

        public DfsGarbageCollector setCoalesceGarbageLimit(long limit)
        Set the byte size limit for garbage packs to be repacked.

        Any UNREACHABLE_GARBAGE pack smaller than this limit will be repacked at the end of the run. This allows the garbage collector to coalesce unreachable objects into a single file.

        If an UNREACHABLE_GARBAGE pack is already larger than this limit, it will be left alone by the garbage collector. This avoids unnecessary disk I/O from reading and copying the objects.

        If limit is set to 0, UNREACHABLE_GARBAGE coalescing is disabled.
        If limit is set to Long.MAX_VALUE, everything is coalesced.

        Keeping unreachable garbage prevents race conditions with repository changes that may suddenly need an object whose only copy was stored in the UNREACHABLE_GARBAGE pack.

        Parameters:
        limit - size in bytes.
        Returns:
        this
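
        The size rule above can be sketched as a single comparison. This is an illustration of the documented behavior only, not the collector's actual code: CoalesceRule and wouldCoalesce are hypothetical names, and treating a pack exactly equal to the limit as "left alone" is an assumption based on the phrase "smaller than this limit".

```java
// Hypothetical helper, not part of JGit: models only the documented size rule.
final class CoalesceRule {
    private CoalesceRule() {}

    /**
     * Returns true if an UNREACHABLE_GARBAGE pack of the given size would be
     * repacked (coalesced) under the given limit, both in bytes.
     */
    static boolean wouldCoalesce(long packSizeBytes, long limitBytes) {
        // limit == 0: no non-negative size is smaller than 0, so nothing is coalesced.
        // limit == Long.MAX_VALUE: every pack below that size is coalesced.
        return packSizeBytes < limitBytes;
    }
}
```

        With a limit of 50 bytes, a 49-byte garbage pack would be coalesced under this sketch while a 100-byte pack would be left in place.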
      • getGarbageTtlMillis

        public long getGarbageTtlMillis()
        Get time to live for garbage packs.
        Returns:
        the time to live in milliseconds. If the value is greater than 0, garbage packs older than this limit will be pruned as part of the garbage collection process; otherwise garbage packs are retained.
      • setGarbageTtl

        public DfsGarbageCollector setGarbageTtl(long ttl,
                                                 TimeUnit unit)
        Set the time to live for garbage objects.

        Any UNREACHABLE_GARBAGE pack older than this limit will be pruned at the end of the run.

        If ttl is set to 0, UNREACHABLE_GARBAGE purging is disabled.

        Parameters:
        ttl - time to live, expressed in the unit given by unit.
        unit - the time unit of ttl.
        Returns:
        this
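
        As a rough sketch of how a time to live in an arbitrary unit relates to the millisecond value reported by getGarbageTtlMillis(), the following uses java.util.concurrent.TimeUnit for the conversion. GarbageTtlRule, toMillis and isExpired are hypothetical names, the lastModifiedMillis input is an assumed pack timestamp, and the strict "older than" comparison at the exact boundary is an assumption rather than confirmed collector behavior.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of JGit: models only the documented TTL rule.
final class GarbageTtlRule {
    private GarbageTtlRule() {}

    /** Converts a TTL given in any unit to milliseconds. */
    static long toMillis(long ttl, TimeUnit unit) {
        return unit.toMillis(ttl);
    }

    /** Returns true if a garbage pack last modified at lastModifiedMillis would be pruned. */
    static boolean isExpired(long lastModifiedMillis, long nowMillis, long ttlMillis) {
        if (ttlMillis <= 0) {
            return false; // a TTL of 0 disables purging, so garbage is retained
        }
        return nowMillis - lastModifiedMillis > ttlMillis;
    }
}
```

        For example, a TTL of one day corresponds to 86,400,000 milliseconds, so under this sketch a garbage pack last modified two days ago would be pruned and one modified an hour ago would be kept.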
      • pack

        public boolean pack(ProgressMonitor pm)
                     throws IOException
        Create a single new pack file containing all of the live objects.

        This method safely decides which packs can be expired after the new pack is created by validating that the references have not been modified in an incompatible way.

        Parameters:
        pm - progress monitor that receives updates during packing, which may take a while depending on the size of the repository.
        Returns:
        true if the repack was successful without race conditions; false if a race condition was detected and the repack should be run again later.
        Throws:
        IOException - if a new pack cannot be created.
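
        Because pack returns false when a race is detected, callers typically need some retry policy. The following is a minimal sketch of one, not something this class provides: RepackRetry, PackAttempt and packWithRetry are hypothetical names, and the fixed delay between attempts is an arbitrary choice.

```java
import java.io.IOException;

// Hypothetical retry wrapper, not part of JGit.
final class RepackRetry {
    private RepackRetry() {}

    /** Stand-in for one repack attempt, for example a lambda that calls pack(pm). */
    interface PackAttempt {
        boolean run() throws IOException;
    }

    /**
     * Runs the attempt until it returns true or maxAttempts is reached,
     * sleeping delayMillis between attempts. Returns true on success.
     */
    static boolean packWithRetry(PackAttempt attempt, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.run()) {
                return true; // repack finished without a detected race
            }
            if (i < maxAttempts) {
                Thread.sleep(delayMillis); // wait before trying again
            }
        }
        return false; // every attempt reported a race
    }
}
```

        A caller might write RepackRetry.packWithRetry(() -> gc.pack(pm), 3, 1000L), where gc is a DfsGarbageCollector and pm a ProgressMonitor. Whether one collector instance can be reused across attempts is not stated here, so creating a new collector for each attempt is the cautious choice.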
      • getSourcePacks

        public Set<DfsPackDescription> getSourcePacks()
        Get all of the source packs that fed into this compaction.
        Returns:
        all of the source packs that fed into this compaction.
      • getNewPacks

        public List<DfsPackDescription> getNewPacks()
        Get new packs created by this compaction.
        Returns:
        new packs created by this compaction.
      • getNewPackStatistics

        public List<PackStatistics> getNewPackStatistics()
        Get statistics corresponding to the packs returned by getNewPacks().

        The elements can be null if the stat is not available for the pack file.

        Returns:
        statistics corresponding to the getNewPacks().
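
        Since individual statistics entries can be null, code that reports on new packs has to guard each element. The sketch below assumes the two lists are index-aligned, which "corresponding to" suggests but does not guarantee. PackReport and describe are hypothetical names, and the generic parameters stand in for DfsPackDescription and PackStatistics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reporting helper, not part of JGit.
final class PackReport {
    private PackReport() {}

    /** Builds one line per pack, tolerating missing or null statistics. */
    static <P, S> List<String> describe(List<P> packs, List<S> stats) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < packs.size(); i++) {
            // Assumption: stats.get(i) belongs to packs.get(i).
            S s = i < stats.size() ? stats.get(i) : null;
            lines.add(packs.get(i) + ": " + (s != null ? s.toString() : "no statistics"));
        }
        return lines;
    }
}
```

        A caller might pass gc.getNewPacks() and gc.getNewPackStatistics() after a successful pack call.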