Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing case if extra directory is added #17

Open
chinyeungli opened this issue Nov 8, 2017 · 6 comments
Open

Failing case if extra directory is added #17

chinyeungli opened this issue Nov 8, 2017 · 6 comments
Assignees
Labels
Milestone

Comments

@chinyeungli
Copy link
Contributor

I did some simple tests and here is my finding:

I use "balloontip-1.1.1.jar" as a sample file.

  1. Created 2 directories d1/ and d2/ and put the test file in it and then compare these 2 directories. The output is unchanged which is correct.

  2. Same setup as (1) but create a new subdirectory named test/ under d1/ and put balloontip-1.1.1.jar in it.
    Both the
    d1/balloontip-1.1.1.jar
    d1/test/balloontip-1.1.1.jar
    are returned as added.

and the d2/balloontip-1.1.1.jar is returned as removed

which is not correct as the d1/balloontip-1.1.1.jar and d2/balloontip-1.1.1.jar should return unchanged while the d1/test/balloontip-1.1.1.jar is consider as added.

  1. Same setup as (1) but create a new root/ directory and put the d1/ in it and run the deltacode from root/ to d2/. The output is unchanged which is correct.
@steven-esser
Copy link
Contributor

The functionality you are looking for is relating to another ticket, #4

Since you’ve nested the jar file one level lower the second time, deltacode infers this as a removal followed by an addition, when in reality that file has just been moved to another location.

I’ll keep this ticket open for reference and close it when #4 gets taken care of.

@chinyeungli
Copy link
Contributor Author

@MaJuRG not exactly the same issue.

For instance,
In (1),
I have
/d1/balloontip-1.1.1.jar
/d2/balloontip-1.1.1.jar
Then, the deltacode return d1 and d2 is the same which is correct.

in case (2)
I have
/d1/balloontip-1.1.1.jar
/d1/test/balloontip-1.1.1.jar
/d2/balloontip-1.1.1.jar

which I expect the deltacode will tell me the /d1/balloontip-1.1.1.jar and /d2/balloontip-1.1.1.jar are the same and /d1/test/balloontip-1.1.1.jar is new.
However, the tool tells me BOTH
/d1/balloontip-1.1.1.jar
/d1/test/balloontip-1.1.1.jar
are added
and
/d2/balloontip-1.1.1.jar
is removed.

So, my question is why adding a new directory make the "suppose to be the same" originally become "added/removed"

@johnmhoran
Copy link
Member

@MaJuRG Here's what we have after scanning and running DeltaCode on these 3 pairs of test codebases. The results don't seem to be entirely consistent. Putting aside the fact that we currently treat files as moved only when there's a single identical added and removed file, I think the inconsistent treatment arises at least in part from the way we remove path segments during the fix_trees()/align_trees() process.

  1. Rename the directory from d1 to d2.
    image

DeltaCode treats this as unmodified.

{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 0,
    "modified": 0,
    "moved": 0,
    "removed": 0,
    "unmodified": 1
  },
  "deltas": [
    {
      "category": "unmodified",
      "path": "balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}
  1. Add a test subdirectory to d1, also containing balloontip-1.1.1.jar.
    image

DeltaCode treats this as 2 removed, 1 added.

{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 1,
    "modified": 0,
    "moved": 0,
    "removed": 2,
    "unmodified": 0
  },
  "deltas": [
    {
      "category": "added",
      "path": "balloontip_new_test_subdirectory_new/d2/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    },
    {
      "category": "removed",
      "path": "balloontip_new_test_subdirectory_old/d1/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    },
    {
      "category": "removed",
      "path": "balloontip_new_test_subdirectory_old/d1/test/balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}
  1. Add a root directory above d1.
    image

DeltaCode treats this as unmodified.

{
  "deltacode_version": "0.0.1.beta",
  "deltacode_stats": {
    "added": 0,
    "modified": 0,
    "moved": 0,
    "removed": 0,
    "unmodified": 1
  },
  "deltas": [
    {
      "category": "unmodified",
      "path": "balloontip-1.1.1.jar",
      "name": "balloontip-1.1.1.jar",
      "type": "file",
      "size": 53842
    }
  ]
}

@Pratikrocks
Copy link
Collaborator

Pratikrocks commented Apr 11, 2021

@chinyeungli @johnmhoran @MaJuRG how about we ignore the align scans for now, after the integration of VirtualCodebse?
Maybe we could follow up in a separate branch apart from the main branch.

Let's consider a directory structure as :
New Directory
New directory

Old Directory
Old Directory

Now if the a1.py is having the same sha1 we are treating them as unchanged files, we completely ignore that their main directories are different. Mainly owing to (alignscans / fix trees).

But instead of that what if we do not allow changing the main root directory?
Now what I propose is we should also treat them as per their main root directory (not just their sub dir).
If we do so we do not need the extra burden of aligning the scans .

The scans will be aligned as they are loaded from the Virtual Codebase as a resource objects.
We can safely ignore all aligning.

Now for a file to have the status as unmodified it must have the same path(full path) along with the same sha1.
for a file of status moved it must exist in some other subdirs in new_scan along with that it must have the same sha1.
And so on for other status ....

And also the codebase would be a lot cleaner than now

@chinyeungli @johnmhoran @MaJuRG need your views upon this

@steven-esser
Copy link
Contributor

@Pratikrocks I agree with removing/ignoring alignment for the first implementation of adding virtualcodebase.

@Pratikrocks
Copy link
Collaborator

Okay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants