XNAT-5951

Duplicate file names cause incomplete download when attempting to include subject/project in file path

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.7.4.1, 1.7.5.1, 1.7.5.2
    • Fix Version/s: 1.7.5.3
    • Component/s: ZIP Downloader

      Description

      Note: I have verified that this affects 1.7.4.1 and 1.7.5.1.

      To recreate this:

      1. Upload sample1 to a project.
      2. Create a new session resource under the sample1 session.
      3. Upload the attached file to the newly created resource. Confirm that you want to extract the zip.
      4. At this point, the resource should look like the attached screenshot.
      5. Note the existence of both A/A.txt and NESTED/D/A.txt within the resource.
      6. Download > Download Images.
      7. Keep everything checked for download in column 2 on the download page.
      8. Choose "Option 2: ZIP download", and check either/both of the "Include project/subject in file paths" boxes.
      9. Initiate the download.
      10. The zip downloads instantaneously. That's because:
        1. All of the study data is missing.
        2. There are only three files: A.txt, B.txt, C.txt. (One of the A.txt files has been lost.)

      Notes:

      1. I have no idea why this only seems to happen if you check "Include project in file paths" or "Include subject in file paths" (or both).
      2. Note that since the duplicate files are in separate directories, it shouldn't even be a problem in the downloaded zip. This is demonstrated by keeping both boxes unchecked for project/subject, and the download works just fine.
      3. When this happens, this is logged:

        2019-02-07 23:52:12,612 [ERROR threadPoolExecutorFactoryBean-26] org.nrg.xnat.web.http.AsyncLifecycleMonitor.logLifecycleEvent():64 - An exception occurred during asynchronous processing. Request for task of type org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBodyReturnValueHandler$StreamingResponseBodyTask: HTTP request to URI /xapi/archive/download/admin-20190207_235211/zip by user 'admin'
        java.util.zip.ZipException: duplicate entry: Sample_Patient/Sample_ID/TEST/A.txt
                at java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:232)
                at org.nrg.xnat.web.http.AbstractZipStreamingResponseBody.writeTo(AbstractZipStreamingResponseBody.java:191)
                at org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBodyReturnValueHandler$StreamingResponseBodyTask.call(StreamingResponseBodyReturnValueHandler.java:106)
                at org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBodyReturnValueHandler$StreamingResponseBodyTask.call(StreamingResponseBodyReturnValueHandler.java:93)
                at org.springframework.web.context.request.async.WebAsyncManager$5.run(WebAsyncManager.java:332)
                at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at java.lang.Thread.run(Thread.java:748)
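
The exception at the top of this trace can be reproduced in isolation. The following is a minimal sketch, not XNAT code: the class name DuplicateEntryDemo and the helper addTwice are hypothetical, and the entry names are shortened versions of the paths in this report. It shows that java.util.zip.ZipOutputStream accepts two files with the same name in different directories, but rejects a second entry whose full path is identical to an earlier one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

public class DuplicateEntryDemo {
    /**
     * Hypothetical helper: writes two entries to an in-memory zip and returns
     * the ZipException message raised by the second one, or null if both
     * entries were accepted.
     */
    static String addTwice(String first, String second) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
            zip.putNextEntry(new ZipEntry(first));
            zip.write("a".getBytes());
            zip.closeEntry();
            try {
                // putNextEntry tracks every entry name and rejects exact repeats.
                zip.putNextEntry(new ZipEntry(second));
                zip.closeEntry();
                return null;
            } catch (ZipException e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Same file name, different directories: accepted.
        System.out.println(addTwice("TEST/A/A.txt", "TEST/NESTED/D/A.txt"));
        // Identical full path (directories collapsed): rejected.
        System.out.println(addTwice("TEST/A.txt", "TEST/A.txt"));
    }
}
```

This matches note 2 above: the two A.txt files only collide if their entry paths are reduced to the same string before being written to the zip.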
        

      1. Screen Shot 2019-02-07 at 5.43.48 PM.png
        61 kB

        Activity

        jrherrick@wustl.edu Rick Herrick added a comment -

        The issue arose when the catalog checked the archive entry path against the back-end archive path in order to find the unique resource path. The problem occurs when the subject label is included, since it doesn't appear in the archive path. I added a check for the subject label and remove it when matching against the archive path. All permutations of options (project and subject labels in the path, simplified vs. not simplified) now give the same result as far as the actual files downloaded, with no issues when a file name is not unique.

        The fix for this is in the xnat-web repo in the branch fixes/XNAT-5951.
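
As a rough illustration of the idea described in this comment, here is a sketch only. It is not the code from the fixes/XNAT-5951 branch: the class EntryPathMatcher, both method names, and the archive path layout used in main are all invented for illustration. It assumes the zip entry path contains the subject label as one path segment and that the back-end archive path does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EntryPathMatcher {
    /**
     * Hypothetical helper: returns the entry path with the first segment
     * equal to the subject label removed.
     */
    static String withoutSubject(String entryPath, String subjectLabel) {
        List<String> segments = new ArrayList<>(Arrays.asList(entryPath.split("/")));
        segments.remove(subjectLabel); // removes the first matching segment only
        return String.join("/", segments);
    }

    /**
     * Hypothetical matcher: compares the entry path against the tail of the
     * archive path after dropping the subject label, which the archive path
     * is assumed not to contain.
     */
    static boolean matchesArchivePath(String entryPath, String subjectLabel, String archivePath) {
        return archivePath.endsWith("/" + withoutSubject(entryPath, subjectLabel));
    }

    public static void main(String[] args) {
        String entry = "Sample_Patient/Sample_ID/TEST/A/A.txt";
        String archive = "/archive/proj/Sample_ID/TEST/A/A.txt"; // invented layout
        // Matching on the raw entry path fails because of the subject segment.
        System.out.println(archive.endsWith("/" + entry));
        // Matching succeeds once the subject label is stripped.
        System.out.println(matchesArchivePath(entry, "Sample_Patient", archive));
    }
}
```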


          People

          • Assignee:
            moore.c@wustl.edu Charlie Moore
            Reporter:
            moore.c@wustl.edu Charlie Moore