Class PackParser
- java.lang.Object
- org.eclipse.jgit.transport.PackParser
Direct Known Subclasses: DfsPackParser, FsckPackParser, ObjectDirectoryPackParser
public abstract class PackParser extends Object
Parses a pack stream and imports it for an ObjectInserter.
Applications can acquire an instance of a parser from ObjectInserter's ObjectInserter.newPackParser(InputStream) method.
Implementations of ObjectInserter should subclass this type and provide their own logic for the various on*() event methods declared to be abstract.
-
-
Nested Class Summary
static class PackParser.ObjectTypeAndSize - Type and size information about an object in the database buffer.
static class PackParser.Source - Location data is being obtained from.
static class PackParser.UnresolvedDelta - Information about an unresolved delta in this pack stream.
-
Constructor Summary
protected PackParser(ObjectDatabase odb, InputStream src) - Initialize a pack parser.
-
Method Summary
protected byte[] buffer() - Get a temporary byte array for use by the caller.
protected abstract boolean checkCRC(int oldCRC) - Check the current CRC matches the expected value.
ObjectIdSubclassMap<ObjectId> getBaseObjectIds() - Get the set of objects the incoming pack assumed for delta purposes.
String getLockMessage() - Get the message to record with the pack lock.
ObjectIdSubclassMap<ObjectId> getNewObjectIds() - Get the new objects that were sent by the user.
PackedObjectInfo getObject(int nth) - Get the information about the requested object.
int getObjectCount() - Get the number of objects in the stream.
long getPackSize() - Get the size of the newly created pack.
ReceivedPackStatistics getReceivedPackStatistics() - Returns the statistics of the parsed pack.
List<PackedObjectInfo> getSortedObjectList(Comparator<PackedObjectInfo> cmp) - Get all of the objects, sorted by their name.
boolean isAllowThin() - Whether a thin pack (missing base objects) is permitted.
boolean isCheckEofAfterPackFooter() - Whether the EOF should be read from the input after the footer.
protected boolean isCheckObjectCollisions() - Whether received objects are verified to prevent collisions.
boolean isExpectDataAfterPackFooter() - Whether there is data expected after the pack footer.
protected PackedObjectInfo newInfo(AnyObjectId id, PackParser.UnresolvedDelta delta, ObjectId deltaBase) - Construct a PackedObjectInfo instance for this parser.
protected abstract boolean onAppendBase(int typeCode, byte[] data, PackedObjectInfo info) - Provide the implementation with a base that was outside of the pack.
protected abstract void onBeginOfsDelta(long deltaStreamPosition, long baseStreamPosition, long inflatedSize) - Event notifying start of a delta referencing its base by offset.
protected abstract void onBeginRefDelta(long deltaStreamPosition, AnyObjectId baseId, long inflatedSize) - Event notifying start of a delta referencing its base by ObjectId.
protected abstract void onBeginWholeObject(long streamPosition, int type, long inflatedSize) - Event notifying the start of an object stored whole (not as a delta).
protected PackParser.UnresolvedDelta onEndDelta() - Event notifying the end of the current delta.
protected abstract void onEndThinPack() - Event indicating a thin pack has been completely processed.
protected abstract void onEndWholeObject(PackedObjectInfo info) - Event notifying the end of the current whole object.
protected abstract void onInflatedObjectData(PackedObjectInfo obj, int typeCode, byte[] data) - Invoked for commits, trees, tags, and small blobs.
protected abstract void onObjectData(PackParser.Source src, byte[] raw, int pos, int len) - Store (and/or checksum) a portion of an object's data.
protected abstract void onObjectHeader(PackParser.Source src, byte[] raw, int pos, int len) - Store (and/or checksum) an object header.
protected abstract void onPackFooter(byte[] hash) - Provide the implementation with the original stream's pack footer.
protected abstract void onPackHeader(long objCnt) - Provide the implementation with the original stream's pack header.
protected abstract void onStoreStream(byte[] raw, int pos, int len) - Store bytes received from the raw stream.
PackLock parse(ProgressMonitor progress) - Parse the pack stream.
PackLock parse(ProgressMonitor receiving, ProgressMonitor resolving) - Parse the pack stream.
protected abstract int readDatabase(byte[] dst, int pos, int cnt) - Read from the database's current position into the buffer.
protected PackParser.ObjectTypeAndSize readObjectHeader(PackParser.ObjectTypeAndSize info) - Read the header of the current object.
protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackedObjectInfo obj, PackParser.ObjectTypeAndSize info) - Reposition the database to re-read a previously stored object.
protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackParser.UnresolvedDelta delta, PackParser.ObjectTypeAndSize info) - Reposition the database to re-read a previously stored object.
void setAllowThin(boolean allow) - Configure this index pack instance to allow a thin pack.
void setCheckEofAfterPackFooter(boolean b) - Ensure EOF is read from the input stream after the footer.
protected void setCheckObjectCollisions(boolean check) - Enable checking for collisions with existing objects.
void setExpectDataAfterPackFooter(boolean e) - Set if there is additional data in InputStream after pack.
protected void setExpectedObjectCount(long expectedObjectCount) - Set the expected number of objects in the pack stream.
void setLockMessage(String msg) - Set the lock message for the incoming pack data.
void setMaxObjectSizeLimit(long limit) - Set the maximum allowed Git object size.
void setNeedBaseObjectIds(boolean b) - Configure this index pack instance to keep track of the objects assumed for delta bases.
void setNeedNewObjectIds(boolean b) - Configure this index pack instance to keep track of new objects.
void setObjectChecker(ObjectChecker oc) - Configure the checker used to validate received objects.
void setObjectChecking(boolean on) - Configure the checker used to validate received objects.
protected void verifySafeObject(AnyObjectId id, int type, byte[] data) - Verify the integrity of the object.
-
-
-
Constructor Detail
-
PackParser
protected PackParser(ObjectDatabase odb, InputStream src)
Initialize a pack parser.
- Parameters:
odb - database the parser will write its objects into.
src - the stream the parser will read.
-
-
Method Detail
-
isAllowThin
public boolean isAllowThin()
Whether a thin pack (missing base objects) is permitted.
- Returns:
- true if a thin pack (missing base objects) is permitted.
-
setAllowThin
public void setAllowThin(boolean allow)
Configure this index pack instance to allow a thin pack.
Thin packs are sometimes used during network transfers to allow a delta to be sent without a base object. Such packs are not permitted on disk.
- Parameters:
allow - true to enable a thin pack.
-
isCheckObjectCollisions
protected boolean isCheckObjectCollisions()
Whether received objects are verified to prevent collisions.
- Returns:
- true if received objects are verified to prevent collisions.
- Since:
- 4.1
-
setCheckObjectCollisions
protected void setCheckObjectCollisions(boolean check)
Enable checking for collisions with existing objects.
By default PackParser looks for each received object in the repository. If the object already exists, the existing object is compared byte-for-byte with the newly received copy to ensure they are identical. The receive is aborted with an exception if any byte differs. This check is necessary to prevent an attacker from supplying a replacement object into this repository in the event that a discovery enabling SHA-1 collisions is made.
This check may be very costly to perform, and some repositories may have other ways to segregate newly received object data. The check is enabled by default, but can be explicitly disabled if the implementation can provide the same guarantee, or is willing to accept the risks associated with bypassing the check.
- Parameters:
check - true to enable collision checking (strongly encouraged).
- Since:
- 4.1
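The byte-for-byte comparison described for collision checking can be pictured with a small JDK-only sketch. The class CollisionCheckSketch, its String keys, and its in-memory map are hypothetical stand-ins for a real object database; this is not how JGit stores or looks up objects.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of JGit.
class CollisionCheckSketch {
    // Stand-in for the repository: object name -> object content.
    private final Map<String, byte[]> existing = new HashMap<>();

    /** Accepts an object unless an object with the same name but different bytes exists. */
    void receive(String id, byte[] data) {
        byte[] old = existing.get(id);
        if (old != null && !Arrays.equals(old, data)) {
            // Same name, different content: abort the receive.
            throw new IllegalStateException("collision detected for " + id);
        }
        existing.putIfAbsent(id, data.clone());
    }

    public static void main(String[] args) {
        CollisionCheckSketch db = new CollisionCheckSketch();
        db.receive("abc", new byte[] {1, 2});
        db.receive("abc", new byte[] {1, 2}); // identical copy is accepted
        System.out.println("identical copy accepted");
    }
}
```

The cost mentioned in the description comes from reading the existing object back for every received object that is already present.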
-
setNeedNewObjectIds
public void setNeedNewObjectIds(boolean b)
Configure this index pack instance to keep track of new objects.
By default an index pack doesn't save the new objects that were created when it was instantiated. Setting this flag to true allows the caller to use getNewObjectIds() to retrieve that list.
- Parameters:
b - true to enable keeping track of new objects.
-
setNeedBaseObjectIds
public void setNeedBaseObjectIds(boolean b)
Configure this index pack instance to keep track of the objects assumed for delta bases.
By default an index pack doesn't save the objects that were used as delta bases. Setting this flag to true will allow the caller to use getBaseObjectIds() to retrieve that list.
- Parameters:
b - true to enable keeping track of delta bases.
-
isCheckEofAfterPackFooter
public boolean isCheckEofAfterPackFooter()
Whether the EOF should be read from the input after the footer.
- Returns:
- true if the EOF should be read from the input after the footer.
-
setCheckEofAfterPackFooter
public void setCheckEofAfterPackFooter(boolean b)
Ensure EOF is read from the input stream after the footer.
- Parameters:
b - true if the EOF should be read; false if it is not checked.
-
isExpectDataAfterPackFooter
public boolean isExpectDataAfterPackFooter()
Whether there is data expected after the pack footer.
- Returns:
- true if there is data expected after the pack footer.
-
setExpectDataAfterPackFooter
public void setExpectDataAfterPackFooter(boolean e)
Set if there is additional data in InputStream after pack.
- Parameters:
e - true if there is additional data in InputStream after pack. This requires the InputStream to support the mark and reset functions.
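One plausible reason for the mark and reset requirement is that a parser reading in fixed-size chunks can read past the end of the pack and must then rewind, so the caller still sees the trailing data. The sketch below shows that idea with JDK streams only. MarkResetSketch, consumePack, and the chunk size are hypothetical and do not describe JGit's internal buffering.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not part of JGit.
class MarkResetSketch {
    /** Consumes exactly packLen bytes, although it reads up to chunk bytes at a time. */
    static void consumePack(InputStream in, int packLen, int chunk) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] buf = new byte[chunk];
        int remaining = packLen;
        try {
            while (remaining > 0) {
                in.mark(chunk); // remember where this chunk started
                int n = in.read(buf, 0, chunk);
                if (n < 0) {
                    throw new IOException("unexpected end of stream");
                }
                if (n > remaining) {
                    // Read past the end of the pack: rewind, then take only the pack bytes.
                    in.reset();
                    n = 0;
                    while (n < remaining) {
                        int r = in.read(buf, n, remaining - n);
                        if (r < 0) {
                            throw new IOException("unexpected end of stream");
                        }
                        n += r;
                    }
                }
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7});
        consumePack(in, 5, 4);
        System.out.println(in.read()); // first byte after the 5-byte "pack"
    }
}
```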
-
getNewObjectIds
public ObjectIdSubclassMap<ObjectId> getNewObjectIds()
Get the new objects that were sent by the user.
- Returns:
- the new objects that were sent by the user.
-
getBaseObjectIds
public ObjectIdSubclassMap<ObjectId> getBaseObjectIds()
Get the set of objects the incoming pack assumed for delta purposes.
- Returns:
- the set of objects the incoming pack assumed for delta purposes.
-
setObjectChecker
public void setObjectChecker(ObjectChecker oc)
Configure the checker used to validate received objects.
Usually object checking isn't necessary, as Git implementations only create valid objects in pack files. However, additional checking may be useful if processing data from an untrusted source.
- Parameters:
oc - the checker instance; null to disable object checking.
-
setObjectChecking
public void setObjectChecking(boolean on)
Configure the checker used to validate received objects.
Usually object checking isn't necessary, as Git implementations only create valid objects in pack files. However, additional checking may be useful if processing data from an untrusted source.
This is shorthand for:
setObjectChecker(on ? new ObjectChecker() : null);
- Parameters:
on - true to enable the default checker; false to disable it.
-
getLockMessage
public String getLockMessage()
Get the message to record with the pack lock.
- Returns:
- the message to record with the pack lock.
-
setLockMessage
public void setLockMessage(String msg)
Set the lock message for the incoming pack data.
- Parameters:
msg - if not null, the message to associate with the incoming data while it is locked to prevent garbage collection.
-
setMaxObjectSizeLimit
public void setMaxObjectSizeLimit(long limit)
Set the maximum allowed Git object size.
If an object is larger than the given size, parsing is aborted with an exception.
- Parameters:
limit - the Git object size limit. If zero then there is no limit.
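The limit semantics (zero means unlimited, and only objects strictly larger than the limit are rejected) can be written down as a tiny predicate. SizeLimitSketch and exceeds are hypothetical names, and treating only positive values as an active limit is an assumption of this sketch; it is not JGit's actual check.

```java
// Hypothetical illustration only; not part of JGit.
class SizeLimitSketch {
    /** Returns true when size is over the limit; a limit of zero disables the check. */
    static boolean exceeds(long limit, long size) {
        // Assumption: only a positive limit is treated as active.
        return limit > 0 && size > limit;
    }

    public static void main(String[] args) {
        System.out.println(exceeds(0, Long.MAX_VALUE)); // no limit configured
        System.out.println(exceeds(100, 101));          // one byte over the limit
    }
}
```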
-
getObjectCount
public int getObjectCount()
Get the number of objects in the stream.
The object count is only available after parse(ProgressMonitor) has returned. The count may have been increased if the stream was a thin pack, and missing base objects were appended onto it by the subclass.
- Returns:
- number of objects parsed out of the stream.
-
getObject
public PackedObjectInfo getObject(int nth)
Get the information about the requested object.
The object information is only available after parse(ProgressMonitor) has returned.
- Parameters:
nth - index of the object in the stream. Must be between 0 and getObjectCount() - 1.
- Returns:
- the object information.
-
getSortedObjectList
public List<PackedObjectInfo> getSortedObjectList(Comparator<PackedObjectInfo> cmp)
Get all of the objects, sorted by their name.
The object information is only available after parse(ProgressMonitor) has returned.
To maintain lower memory usage and good runtime performance, this method sorts the objects in-place and therefore impacts the ordering presented by getObject(int).
- Parameters:
cmp - comparison function; if null, objects are sorted by ObjectId.
- Returns:
- sorted list of objects in this pack stream.
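The side effect of an in-place sort on index-based access can be demonstrated without JGit. In this sketch, SortedListSketch and its String entries are hypothetical stand-ins for the parser and its PackedObjectInfo entries, and the natural String order stands in for ordering by ObjectId.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of JGit.
class SortedListSketch {
    private final String[] entries;

    SortedListSketch(String... names) {
        entries = names.clone();
    }

    /** Index-based access, in whatever order the backing array currently has. */
    String getObject(int nth) {
        return entries[nth];
    }

    /** Sorts the backing array itself; a null comparator selects the natural order. */
    List<String> getSortedObjectList(Comparator<String> cmp) {
        Arrays.sort(entries, cmp);
        return Arrays.asList(entries);
    }

    public static void main(String[] args) {
        SortedListSketch p = new SortedListSketch("c", "a", "b");
        System.out.println(p.getObject(0)); // stream order
        p.getSortedObjectList(null);
        System.out.println(p.getObject(0)); // order changed by the in-place sort
    }
}
```

Callers that need the original stream order should therefore read it through getObject(int) before requesting the sorted list.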
-
getPackSize
public long getPackSize()
Get the size of the newly created pack.This will also include the pack index size if an index was created. This method should only be called after pack parsing is finished.
- Returns:
- the pack size (including the index size) or -1 if the size cannot be determined
- Since:
- 3.3
-
getReceivedPackStatistics
public ReceivedPackStatistics getReceivedPackStatistics()
Returns the statistics of the parsed pack.
This should only be called after pack parsing is finished.
- Returns:
- ReceivedPackStatistics
- Since:
- 4.6
-
parse
public final PackLock parse(ProgressMonitor progress) throws IOException
Parse the pack stream.
- Parameters:
progress - callback to provide progress feedback during parsing. If null, NullProgressMonitor will be used.
- Returns:
- the pack lock, if one was requested by setting setLockMessage(String).
- Throws:
IOException - the stream is malformed, or contains corrupt objects.
- Since:
- 6.0
-
parse
public PackLock parse(ProgressMonitor receiving, ProgressMonitor resolving) throws IOException
Parse the pack stream.
- Parameters:
receiving - receives progress feedback during the initial receiving objects phase. If null, NullProgressMonitor will be used.
resolving - receives progress feedback during the resolving objects phase.
- Returns:
- the pack lock, if one was requested by setting setLockMessage(String).
- Throws:
IOException - the stream is malformed, or contains corrupt objects.
- Since:
- 6.0
-
readObjectHeader
protected PackParser.ObjectTypeAndSize readObjectHeader(PackParser.ObjectTypeAndSize info) throws IOException
Read the header of the current object.
After the header has been parsed, this method automatically invokes onObjectHeader(Source, byte[], int, int) to allow the implementation to update its internal checksums for the bytes read.
When this method returns the database will be positioned on the first byte of the deflated data stream.
- Parameters:
info - the info object to populate.
- Returns:
- info, after populating.
- Throws:
IOException - the size cannot be read.
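For readers unfamiliar with the on-disk layout, the following JDK-only sketch decodes a pack object header as the Git pack format defines it: the first byte carries the type in bits 4 to 6 and the low four size bits, and each following byte adds seven more size bits while the high bit is set. ObjectHeaderSketch and its Header holder are hypothetical; the method documented above fills a PackParser.ObjectTypeAndSize instead.

```java
// Hypothetical illustration only; not part of JGit.
class ObjectHeaderSketch {
    /** Decoded type code, inflated size, and number of header bytes consumed. */
    static final class Header {
        final int type;
        final long size;
        final int bytesUsed;

        Header(int type, long size, int bytesUsed) {
            this.type = type;
            this.size = size;
            this.bytesUsed = bytesUsed;
        }
    }

    static Header decode(byte[] raw, int pos) {
        int p = pos;
        int c = raw[p++] & 0xff;
        int type = (c >> 4) & 7; // 1=commit, 2=tree, 3=blob, 4=tag, 6=ofs-delta, 7=ref-delta
        long size = c & 0x0f;    // low four bits of the inflated size
        int shift = 4;
        while ((c & 0x80) != 0) { // high bit set: another size byte follows
            c = raw[p++] & 0xff;
            size |= ((long) (c & 0x7f)) << shift;
            shift += 7;
        }
        return new Header(type, size, p - pos);
    }

    public static void main(String[] args) {
        Header h = decode(new byte[] {(byte) 0x95, 0x0a}, 0);
        System.out.println(h.type + " " + h.size + " " + h.bytesUsed);
    }
}
```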
-
verifySafeObject
protected void verifySafeObject(AnyObjectId id, int type, byte[] data) throws CorruptObjectException
Verify the integrity of the object.
- Parameters:
id - identity of the object to be checked.
type - the type of the object.
data - raw content of the object.
- Throws:
CorruptObjectException
- Since:
- 4.9
-
buffer
protected byte[] buffer()
Get a temporary byte array for use by the caller.
- Returns:
- a temporary byte array for use by the caller.
-
newInfo
protected PackedObjectInfo newInfo(AnyObjectId id, PackParser.UnresolvedDelta delta, ObjectId deltaBase)
Construct a PackedObjectInfo instance for this parser.
- Parameters:
id - identity of the object to be tracked.
delta - if the object was previously an unresolved delta, this is the delta object that was tracking it. Otherwise null.
deltaBase - if the object was previously an unresolved delta, this is the ObjectId of the base of the delta. The base may be outside of the pack stream if the stream was a thin-pack.
- Returns:
- info object containing this object's data.
-
setExpectedObjectCount
protected void setExpectedObjectCount(long expectedObjectCount)
Set the expected number of objects in the pack stream.
The object count in the pack header is not always correct for some Dfs pack files. For example, an INSERT pack always assumes 1 object in the header, since the actual object count is unknown when the pack is written.
If an external implementation wants to override the expectedObjectCount, it should call this method during onPackHeader(long).
- Parameters:
expectedObjectCount - the expected number of objects in the pack stream.
- Since:
- 4.9
-
onStoreStream
protected abstract void onStoreStream(byte[] raw, int pos, int len) throws IOException
Store bytes received from the raw stream.
This method is invoked during parse(ProgressMonitor) as data is consumed from the incoming stream. Implementors may use this event to archive the raw incoming stream to the destination repository in large chunks, without paying attention to object boundaries.
The only component of the pack not supplied to this method is the last 20 bytes of the pack that comprise the trailing SHA-1 checksum. Those are passed to onPackFooter(byte[]).
- Parameters:
raw - buffer to copy data out of.
pos - first offset within the buffer that is valid.
len - number of bytes in the buffer that are valid.
- Throws:
IOException - the stream cannot be archived.
-
onObjectHeader
protected abstract void onObjectHeader(PackParser.Source src, byte[] raw, int pos, int len) throws IOException
Store (and/or checksum) an object header.
Invoked after any of the onBegin*() events. The entire header is supplied in a single invocation, before any object data is supplied.
- Parameters:
src - where the data came from.
raw - buffer to read data from.
pos - first offset within buffer that is valid.
len - number of bytes in buffer that are valid.
- Throws:
IOException - the stream cannot be archived.
-
onObjectData
protected abstract void onObjectData(PackParser.Source src, byte[] raw, int pos, int len) throws IOException
Store (and/or checksum) a portion of an object's data.
This method may be invoked multiple times per object, depending on the size of the object, the size of the parser's internal read buffer, and the alignment of the object relative to the read buffer.
Invoked after onObjectHeader(Source, byte[], int, int).
- Parameters:
src - where the data came from.
raw - buffer to read data from.
pos - first offset within buffer that is valid.
len - number of bytes in buffer that are valid.
- Throws:
IOException - the stream cannot be archived.
-
onInflatedObjectData
protected abstract void onInflatedObjectData(PackedObjectInfo obj, int typeCode, byte[] data) throws IOException
Invoked for commits, trees, tags, and small blobs.
- Parameters:
obj - the object info, populated.
typeCode - the type of the object.
data - inflated data for the object.
- Throws:
IOException - the object cannot be archived.
-
onPackHeader
protected abstract void onPackHeader(long objCnt) throws IOException
Provide the implementation with the original stream's pack header.
- Parameters:
objCnt - number of objects expected in the stream.
- Throws:
IOException - the implementation refuses to work with this many objects.
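To show where the object count originates, this sketch decodes the 12-byte header that starts a Git pack stream: the signature "PACK", a 4-byte big-endian version, and a 4-byte big-endian object count. PackHeaderSketch and readObjectCount are hypothetical names. PackParser does its own header parsing before raising this event, so the code is only an illustration of the format.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of JGit.
class PackHeaderSketch {
    /** Returns the object count stored in a 12-byte pack header. */
    static long readObjectCount(byte[] header) {
        if (header.length < 12 || header[0] != 'P' || header[1] != 'A'
                || header[2] != 'C' || header[3] != 'K') {
            throw new IllegalArgumentException("not a pack header");
        }
        ByteBuffer buf = ByteBuffer.wrap(header, 4, 8); // big-endian by default
        int version = buf.getInt();
        if (version != 2 && version != 3) {
            throw new IllegalArgumentException("unsupported pack version " + version);
        }
        // The count is an unsigned 32-bit value, so widen it to long.
        return buf.getInt() & 0xffffffffL;
    }

    public static void main(String[] args) {
        byte[] header = {'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 3};
        System.out.println(readObjectCount(header));
    }
}
```

The unsigned 32-bit range of the count is consistent with objCnt being declared as a long.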
-
onPackFooter
protected abstract void onPackFooter(byte[] hash) throws IOException
Provide the implementation with the original stream's pack footer.
- Parameters:
hash - the trailing 20 bytes of the pack; this is a SHA-1 checksum of all of the pack data.
- Throws:
IOException - the stream cannot be archived.
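The relationship between the pack data and its 20-byte footer can be checked with the JDK's MessageDigest. The sketch below is a hypothetical, in-memory illustration: PackFooterSketch, appendFooter, and footerMatches are invented names, and a real implementation would hash the stream incrementally as data arrives instead of holding the whole pack in one array.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration only; not part of JGit.
class PackFooterSketch {
    private static byte[] sha1(byte[] data, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(data, 0, len);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns body followed by the SHA-1 of body, mimicking a pack with its footer. */
    static byte[] appendFooter(byte[] body) {
        byte[] hash = sha1(body, body.length);
        byte[] pack = Arrays.copyOf(body, body.length + hash.length);
        System.arraycopy(hash, 0, pack, body.length, hash.length);
        return pack;
    }

    /** True when the trailing 20 bytes equal the SHA-1 of everything before them. */
    static boolean footerMatches(byte[] pack) {
        if (pack.length < 20) {
            return false;
        }
        int bodyLen = pack.length - 20;
        byte[] footer = Arrays.copyOfRange(pack, bodyLen, pack.length);
        return MessageDigest.isEqual(sha1(pack, bodyLen), footer);
    }

    public static void main(String[] args) {
        byte[] pack = appendFooter(new byte[] {'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 0});
        System.out.println(footerMatches(pack));
    }
}
```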
-
onAppendBase
protected abstract boolean onAppendBase(int typeCode, byte[] data, PackedObjectInfo info) throws IOException
Provide the implementation with a base that was outside of the pack.
This event only occurs on a thin pack for base objects that were outside of the pack and came from the local repository. Usually an implementation uses this event to compress the base and append it onto the end of the pack, so the pack stays self-contained.
- Parameters:
typeCode - type of the base object.
data - complete content of the base object.
info - packed object information for this base. Implementors must populate the CRC and offset members if returning true.
- Returns:
- true if the info should be included in the object list returned by getSortedObjectList(Comparator), false if it should not be included.
- Throws:
IOException - the base could not be included into the pack.
-
onEndThinPack
protected abstract void onEndThinPack() throws IOException
Event indicating a thin pack has been completely processed.
This event is invoked only if a thin pack has delta references to objects external to the pack. The event is called after all of those deltas have been resolved.
- Throws:
IOException
- the pack cannot be archived.
-
seekDatabase
protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackedObjectInfo obj, PackParser.ObjectTypeAndSize info) throws IOException
Reposition the database to re-read a previously stored object.
If the database is computing CRC-32 checksums for object data, it should reset its internal CRC instance during this method call.
- Parameters:
obj - the object position to begin reading from. This is from newInfo(AnyObjectId, UnresolvedDelta, ObjectId).
info - object to populate with type and size.
- Returns:
- the info object.
- Throws:
IOException - the database cannot reposition to this location.
-
seekDatabase
protected abstract PackParser.ObjectTypeAndSize seekDatabase(PackParser.UnresolvedDelta delta, PackParser.ObjectTypeAndSize info) throws IOException
Reposition the database to re-read a previously stored object.
If the database is computing CRC-32 checksums for object data, it should reset its internal CRC instance during this method call.
- Parameters:
delta - the object position to begin reading from. This is an instance previously returned by onEndDelta().
info - object to populate with type and size.
- Returns:
- the info object.
- Throws:
IOException - the database cannot reposition to this location.
-
readDatabase
protected abstract int readDatabase(byte[] dst, int pos, int cnt) throws IOException
Read from the database's current position into the buffer.
- Parameters:
dst - the buffer to copy read data into.
pos - position within dst to start copying data into.
cnt - ideal target number of bytes to read. Actual read length may be shorter.
- Returns:
- number of bytes stored.
- Throws:
IOException - the database cannot be accessed.
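Because the actual read length may be shorter than cnt, a caller that needs a fixed number of bytes has to loop. The sketch below shows such a loop against a hypothetical Db interface that mirrors only the shape of readDatabase. The interface, the readFully helper, and the treatment of a non-positive return value as end of data are all assumptions of this example, not JGit behavior.

```java
// Hypothetical illustration only; not part of JGit.
class ReadFullySketch {
    /** Minimal stand-in with the same parameter shape as readDatabase. */
    interface Db {
        int readDatabase(byte[] dst, int pos, int cnt);
    }

    /** Keeps reading until exactly cnt bytes have been stored into dst. */
    static void readFully(Db db, byte[] dst, int pos, int cnt) {
        while (cnt > 0) {
            int n = db.readDatabase(dst, pos, cnt);
            if (n <= 0) {
                // Assumption: a non-positive result means no more data is available.
                throw new IllegalStateException("unexpected end of database");
            }
            pos += n;
            cnt -= n;
        }
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5};
        int[] offset = {0};
        // A source that hands out at most two bytes per call.
        Db shortReads = (dst, pos, cnt) -> {
            int n = Math.min(2, Math.min(cnt, src.length - offset[0]));
            System.arraycopy(src, offset[0], dst, pos, n);
            offset[0] += n;
            return n;
        };
        byte[] out = new byte[5];
        readFully(shortReads, out, 0, out.length);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```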
-
checkCRC
protected abstract boolean checkCRC(int oldCRC)
Check the current CRC matches the expected value.
This method is invoked when an object is read back in from the database and its data is used during delta resolution. The CRC is validated after the object has been fully read, allowing the parser to verify there was no silent data corruption.
Implementations are free to ignore this check by always returning true if they are performing other data integrity validations at a lower level.
- Parameters:
oldCRC - the prior CRC that was recorded during the first scan of the object from the pack stream.
- Returns:
- true if the CRC matches; false if it does not.
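One way an implementation might back this check is with java.util.zip.CRC32: reset the checksum when repositioning, feed it every chunk read back, and compare the result with the recorded value. CrcCheckSketch below is a hypothetical, JDK-only sketch of that pattern and is not taken from any JGit subclass.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of JGit.
class CrcCheckSketch {
    private final CRC32 crc = new CRC32();

    /** Called when repositioning to re-read an object. */
    void reset() {
        crc.reset();
    }

    /** Called for each chunk of object bytes, in order. */
    void update(byte[] raw, int pos, int len) {
        crc.update(raw, pos, len);
    }

    /** Current checksum as a 32-bit value. */
    int value() {
        return (int) crc.getValue();
    }

    /** True if the bytes seen since the last reset hash to oldCRC. */
    boolean checkCRC(int oldCRC) {
        return value() == oldCRC;
    }

    public static void main(String[] args) {
        byte[] data = "delta base".getBytes(StandardCharsets.UTF_8);
        CrcCheckSketch c = new CrcCheckSketch();
        c.update(data, 0, data.length); // first scan
        int recorded = c.value();
        c.reset();                      // re-read in two chunks
        c.update(data, 0, 4);
        c.update(data, 4, data.length - 4);
        System.out.println(c.checkCRC(recorded));
    }
}
```

Chunk boundaries do not affect the result, so the same checksum is obtained whether the data arrives in one call or several.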
-
onBeginWholeObject
protected abstract void onBeginWholeObject(long streamPosition, int type, long inflatedSize) throws IOException
Event notifying the start of an object stored whole (not as a delta).
- Parameters:
streamPosition - position of this object in the incoming stream.
type - type of the object; one of Constants.OBJ_COMMIT, Constants.OBJ_TREE, Constants.OBJ_BLOB, or Constants.OBJ_TAG.
inflatedSize - size of the object when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
- Throws:
IOException - the object cannot be recorded.
-
onEndWholeObject
protected abstract void onEndWholeObject(PackedObjectInfo info) throws IOException
Event notifying the end of the current whole object.
- Parameters:
info - object information.
- Throws:
IOException - the object cannot be recorded.
-
onBeginOfsDelta
protected abstract void onBeginOfsDelta(long deltaStreamPosition, long baseStreamPosition, long inflatedSize) throws IOException
Event notifying start of a delta referencing its base by offset.
- Parameters:
deltaStreamPosition - position of this object in the incoming stream.
baseStreamPosition - position of the base object in the incoming stream. The base must be before the delta, therefore baseStreamPosition < deltaStreamPosition. This is not the position returned by a prior end object event.
inflatedSize - size of the delta when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
- Throws:
IOException - the object cannot be recorded.
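In the Git pack format, an offset delta stores the distance back to its base as a variable-length number that follows the object header, which is why the base position is always smaller than the delta position. The sketch below decodes that number and derives the base position. OfsDeltaSketch and its method names are hypothetical; the parser performs this decoding itself before raising the event.

```java
// Hypothetical illustration only; not part of JGit.
class OfsDeltaSketch {
    /** Decodes the variable-length backward offset that starts at raw[pos]. */
    static long decodeOffset(byte[] raw, int pos) {
        int c = raw[pos++] & 0xff;
        long ofs = c & 0x7f;
        while ((c & 0x80) != 0) { // high bit set: another byte follows
            ofs += 1;             // bias that makes every multi-byte encoding unique
            c = raw[pos++] & 0xff;
            ofs <<= 7;
            ofs += c & 0x7f;
        }
        return ofs;
    }

    /** The base lies the decoded distance before the delta. */
    static long basePosition(long deltaStreamPosition, byte[] raw, int pos) {
        return deltaStreamPosition - decodeOffset(raw, pos);
    }

    public static void main(String[] args) {
        byte[] encoded = {(byte) 0x81, 0x00};
        System.out.println(basePosition(1000L, encoded, 0));
    }
}
```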
-
onBeginRefDelta
protected abstract void onBeginRefDelta(long deltaStreamPosition, AnyObjectId baseId, long inflatedSize) throws IOException
Event notifying start of a delta referencing its base by ObjectId.
- Parameters:
deltaStreamPosition - position of this object in the incoming stream.
baseId - name of the base object. This object may be later in the stream, or might not appear at all in the stream (in the case of a thin-pack).
inflatedSize - size of the delta when fully inflated. The size stored within the pack may be larger or smaller, and is not yet known.
- Throws:
IOException - the object cannot be recorded.
-
onEndDelta
protected PackParser.UnresolvedDelta onEndDelta() throws IOException
Event notifying the end of the current delta.
- Returns:
- object information that must be populated with at least the offset.
- Throws:
IOException - the object cannot be recorded.
-
-