SPLevo/Difference Analysis
New URL: https://github.com/kopl/SPLevo/wiki/Difference-Analysis
Reason: SPLevo has been published on GitHub.
The SPLevo Difference Analysis identifies the differences between the extracted software models.
It produces a difference model specific to source code changes, implemented based on the Eclipse EMF Compare infrastructure.
Diffing Strategy
The diffing makes use of the typical model differencing phases specified by Xing et al. They propose to use:
- MatchingPhase to find corresponding elements in the compared models
- DiffingPhase to find differences (attributes, references, etc.) in the corresponding or unmatched elements
- Postprocessor to clean up the resulting model, etc.
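The three phases can be illustrated with a minimal sketch. It assumes that each model is reduced to a map from an element identifier to a content string. The class and method names are hypothetical and are not part of the EMF Compare or SPLevo API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical, simplified pipeline; the names below are not EMF Compare API.
final class PhasedDiffSketch {

    // Matching phase: pair the elements of both models by a shared identifier.
    // A missing partner is represented by null.
    static Map<String, String[]> match(Map<String, String> left, Map<String, String> right) {
        TreeSet<String> ids = new TreeSet<>(left.keySet());
        ids.addAll(right.keySet());
        Map<String, String[]> matches = new LinkedHashMap<>();
        for (String id : ids) {
            matches.put(id, new String[] { left.get(id), right.get(id) });
        }
        return matches;
    }

    // Diffing phase: derive differences from matched and unmatched elements.
    static List<String> diff(Map<String, String[]> matches) {
        List<String> diffs = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : matches.entrySet()) {
            String l = entry.getValue()[0];
            String r = entry.getValue()[1];
            if (l == null) {
                diffs.add("INSERT " + entry.getKey());
            } else if (r == null) {
                diffs.add("DELETE " + entry.getKey());
            } else if (!l.equals(r)) {
                diffs.add("CHANGE " + entry.getKey());
            }
        }
        return diffs;
    }

    // Post-processing phase: clean up the raw result by dropping ignorable diffs.
    static List<String> postProcess(List<String> diffs, Predicate<String> ignorable) {
        List<String> cleaned = new ArrayList<>(diffs);
        cleaned.removeIf(ignorable);
        return cleaned;
    }
}
```

Keeping the phases as separate steps allows a technology cartridge to adapt one phase, for example the matching, without touching the others.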
False Positives vs. False Negatives
- False Positive
- A difference is detected that could have been ignored
- False Negative
- A difference has been filtered out although it is actually relevant
General rule: False positives are better than false negatives.
False negatives carry the risk of an invalid difference model and, as a consequence, invalid downstream consolidation processes. This must be prevented.
False positives potentially lead to more review effort for the product line engineer, which reduces the advantage of the approach.
However, this is more acceptable than invalid results.
Nested Diffs
The difference analysis will not return any nested diffs.
For example, the following code variants:
try {
    return "hello";
} catch (Exception iea) {
    return "iea";
}
and
try {
    return "hello";
} catch (RuntimeException e) {
    return "e";
}
will result in a single change for the catch block. The nested change of the return statement will not be registered separately.
This is done to work with plain variation points later on.
If nested changes exist, the place to handle them is the postDiff() hook method of the EMF Compare IPostProcessor interface. For example, in the JaMoPP Cartridge, the implementing class is named JaMoPPPostProcessor.
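The idea of dropping nested changes can be sketched independently of EMF Compare. The example assumes that each changed element is identified by a containment path such as "m()/try/catch". This path format and all names are hypothetical and only serve as illustration. They do not reflect the JaMoPPPostProcessor implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: changed elements are identified by containment paths.
// This is not the EMF Compare or SPLevo API.
final class NestedDiffFilterSketch {

    // Returns true if 'inner' lies strictly below 'outer' in the containment tree.
    static boolean isNestedIn(String inner, String outer) {
        return inner.startsWith(outer + "/");
    }

    // Keeps only the outermost changed elements; nested changes are dropped.
    static List<String> removeNested(List<String> changedPaths) {
        List<String> result = new ArrayList<>();
        for (String candidate : changedPaths) {
            boolean nested = false;
            for (String other : changedPaths) {
                if (isNestedIn(candidate, other)) {
                    nested = true;
                    break;
                }
            }
            if (!nested) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Applied to the catch block example, a change of the catch block and a change of its return statement would collapse into the single catch block change.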
Similarity Decisions (General)
More information about model-specific or technology-specific similarity decisions can be found in the corresponding technology cartridges.
if-Statements
If and else if statements are treated as similar if they are at the same code location (container) and have the same condition (expression).
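A minimal sketch of this rule is shown here. It assumes that the container and the condition are both available as plain strings. The types are hypothetical and much simpler than the actual model elements.

```java
// Hypothetical sketch; container and condition are simplified to strings.
final class IfSimilaritySketch {

    static final class IfStatement {
        final String container; // code location, e.g. the enclosing method
        final String condition; // condition expression as source text

        IfStatement(String container, String condition) {
            this.container = container;
            this.condition = condition;
        }
    }

    // Similar only if both the location and the condition are the same.
    static boolean isSimilar(IfStatement a, IfStatement b) {
        return a.container.equals(b.container) && a.condition.equals(b.condition);
    }
}
```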
Variable Declarations
Variables are treated as similar if their names and locations are similar. The variable type is not considered because it is treated as a degree of freedom. The following statements are matched but handled as modified:
String var1 = new String();
Object var1 = new String();
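The distinction between matching and modification can be sketched as follows. The sketch assumes that "similar" name and location means equal strings, which is a simplification. All names are hypothetical.

```java
// Hypothetical sketch; "similar" is simplified to string equality.
final class VariableMatchSketch {

    enum Result { UNMATCHED, MATCHED_UNCHANGED, MATCHED_MODIFIED }

    static Result compare(String containerA, String typeA, String nameA,
                          String containerB, String typeB, String nameB) {
        // Matching considers name and location only; the type is a degree of freedom.
        boolean matched = containerA.equals(containerB) && nameA.equals(nameB);
        if (!matched) {
            return Result.UNMATCHED;
        }
        // A differing type does not break the match but marks it as modified.
        return typeA.equals(typeB) ? Result.MATCHED_UNCHANGED : Result.MATCHED_MODIFIED;
    }
}
```

For the two declarations of var1 above, compare would return MATCHED_MODIFIED as long as both are located in the same container.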
Diffing JaMoPP Cartridge
Configuration
Resources
The JaMoPP extractor loads all Java files contained in the selected project directories. It also registers all contained jar archives in its class path for reference resolving.
During the difference analysis, the JaMoPP scope filter excludes classifiers that are registered in the classpath only (i.e. identified by a URI starting with pathmap:/javaclass/). These are typically external libraries or Java runtime libraries that are not available as source code. Hence, it is not reasonable to diff them as part of code consolidation.
In addition, a configuration option is available to exclude specific files based on their file name (e.g. package-info.java).
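The two exclusion rules can be sketched as a simple predicate over resource URIs. Only the pathmap:/javaclass/ prefix is taken from the description above. The class name, the method name, and the way file names are derived from the URI are assumptions for illustration.

```java
import java.util.Collection;

// Hypothetical sketch of the scope decision; not the actual JaMoPP scope filter.
final class ScopeFilterSketch {

    // Classifiers known from the classpath only are identified by this URI prefix.
    static final String CLASSPATH_PREFIX = "pathmap:/javaclass/";

    static boolean inScope(String uri, Collection<String> excludedFileNames) {
        if (uri.startsWith(CLASSPATH_PREFIX)) {
            return false; // external or runtime library, no source code to diff
        }
        // Assumption: the file name is the last path segment of the URI.
        String fileName = uri.substring(uri.lastIndexOf('/') + 1);
        return !excludedFileNames.contains(fileName);
    }
}
```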
Classifier Normalization
This option configures the classifier normalization. Its purpose is to normalize the classifier names of the integration variant. The normalization is applied to the Classifier Software Element as well as to the Java source file names.
Example:
If the original code contains a classifier named:
MyClass
and the customized code contains a classifier named:
MyClassCust
the normalization pattern can be used to remove the suffix from the classifier.
One rule is specified per line.
Each line specifies a prefix or suffix to replace.
The arbitrary part of the classifier's name to keep is identified with an asterisk '*'.
Examples:
To remove the suffix "Custom" from the name "MyClassCustom" the pattern must be specified as "*Custom".
To remove the prefix "My" from the name "MyBaseClass" the pattern must be specified as "My*".
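The pattern rules above can be sketched as a small helper. The sketch assumes that the first matching rule wins and that a name is never reduced to an empty string. Neither assumption is confirmed by the description, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the normalization rules; not the SPLevo implementation.
final class ClassifierNormalizationSketch {

    // 'patterns' holds one rule per line, e.g. "*Custom" or "My*".
    static String normalize(String name, String patterns) {
        for (String line : patterns.split("\\R")) {
            String rule = line.trim();
            if (rule.length() < 2) {
                continue; // empty line or lone '*': nothing to remove
            }
            if (rule.startsWith("*")) {
                // Suffix rule: "*Custom" removes a trailing "Custom".
                String suffix = rule.substring(1);
                if (name.endsWith(suffix) && name.length() > suffix.length()) {
                    return name.substring(0, name.length() - suffix.length());
                }
            } else if (rule.endsWith("*")) {
                // Prefix rule: "My*" removes a leading "My".
                String prefix = rule.substring(0, rule.length() - 1);
                if (name.startsWith(prefix) && name.length() > prefix.length()) {
                    return name.substring(prefix.length());
                }
            }
        }
        return name; // no rule matched
    }
}
```

With the rule "*Cust", the classifier MyClassCust from the example above would be normalized to MyClass and could then be compared with the original classifier.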
Diffing MoDisco Cartridge
DifferenceKind
The DifferenceKind for the difference model extension is implemented in Java2KDMDiffExtensionImpl.getKind().
The generated method is customized and marked as @generated not.
It checks the type of the current instance and returns the corresponding DifferenceKind.
Loop Statements
Loop statements are a special case of statements that need to be treated more specifically than other statements. They have no name to identify them and are affected by the ordering of statements. That means deciding whether two loop statements are the same is more challenging. For example:
- Is a while loop still the same when its condition has changed but not its body?
- Is a while loop still the same if its position has changed?
While the answers depend on the specific case, both should lead to a recognized change which must be handled later on.
else if
In MoDisco AST models, IfStatements with multiple "else if" statements are represented as trees and not as a plain sequence of statements. Having a chain of "else if"s and a change in the middle of such a chain results in a completely changed subtree, even if only one statement has been inserted or modified in the middle of the chain.
Consequently, if multiple changes have been made at different locations within the "else if" chain, a StatementDelete and a StatementInsert are created in the difference model only for the topmost one.
When one or more "else if"s have been added to the end of the chain, only one StatementInsert is created.
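The effect of the tree representation can be illustrated with a minimal sketch. It assumes that each "else if" is the else branch of its predecessor and that nodes are compared by their condition text only. The node type and method names are hypothetical and do not correspond to the MoDisco metamodel.

```java
// Hypothetical sketch of an "else if" chain as a nested tree; not the MoDisco model.
final class ElseIfChainSketch {

    static final class IfNode {
        final String condition;
        final IfNode elseBranch; // next "else if" in the chain, or null

        IfNode(String condition, IfNode elseBranch) {
            this.condition = condition;
            this.elseBranch = elseBranch;
        }
    }

    // Builds the nested tree for: if (c0) ... else if (c1) ... else if (c2) ...
    static IfNode chain(String... conditions) {
        IfNode node = null;
        for (int i = conditions.length - 1; i >= 0; i--) {
            node = new IfNode(conditions[i], node);
        }
        return node;
    }

    // Returns the nesting depth of the topmost differing node, or -1 if the
    // chains are identical. Everything below that depth is part of the changed subtree.
    static int firstChangedDepth(IfNode left, IfNode right) {
        int depth = 0;
        while (left != null && right != null) {
            if (!left.condition.equals(right.condition)) {
                return depth;
            }
            left = left.elseBranch;
            right = right.elseBranch;
            depth++;
        }
        return (left == null && right == null) ? -1 : depth;
    }
}
```

Inserting a condition in the middle of a chain shifts all following "else if"s one level deeper, which is why only the topmost change is reported.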