DocumentViewer slow response due to large number of files in cache folder
Problem reported by Alessandro Rossetto - January 14 at 2:29 AM
Being Fixed
I'm experiencing slow response times when using DocumentViewer.

I found that if I manually delete the DocumentCache folder, the application performance returns to normal.
I have ~ 20.000 cached documents with CacheMaxDays set to 7.

I have reduced it to 1 now, but I cannot test it in production yet because I need to wait for the next maintenance window to install the update. Based on my statistics, I will still have ~ 4.000 cached documents per day.

The disk IO performance is not bad at all and doesn't get worse as the number of files increases, so this makes me think the problem is in the cache implementation.
Do you enumerate all the files in the DocumentCache folder every time a document preview is requested?

Thanks
Alessandro Rossetto

2 Replies

Cem Alacayir Replied
Employee Post
Hi,
Yes, it's probably caused by the DocumentCache implementation, because it needs to list the folders multiple times to find the cached file. DocumentCache uses a subfolder structure with unique IDs in the subfolder names, but for reference purposes each name also includes the original input file name, such as:

 ~DOC Sample File.doc~zt2ple 

This is so that you can find and delete a specific cache file visually/manually in Windows File Explorer when you need to, i.e. when you want to force re-conversion and re-caching of a specific input file. Because the subfolder name is not exact (the original input file name part is not constant or known in advance, unlike the unique ID; the same input file under different file names generates the same unique ID), DocumentCache needs to enumerate all subfolders and then find the one whose name ends with the unique ID, ignoring the original input file name part. This is normally fast on a physical file system (not so on Azure or S3), but I guess it may slow down with 20.000 subfolders even on a physical disk.
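
For illustration only, the kind of lookup described above can be sketched in Python. This is not the actual DocumentUltimate code: the function name find_cache_subfolder and the assumption that a subfolder name ends with "~" followed by the unique ID are hypothetical, inferred from the sample name above.

```python
import os


def find_cache_subfolder(cache_root, unique_id):
    """Return the path of the subfolder whose name ends with ~<unique_id>.

    Hypothetical sketch: because the original file name part of the
    subfolder name is unknown, every entry in cache_root has to be
    listed and checked until a match is found.
    """
    suffix = "~" + unique_id
    with os.scandir(cache_root) as entries:
        for entry in entries:
            # Only the unique ID at the end of the name is compared;
            # the original file name part is ignored.
            if entry.is_dir() and entry.name.endswith(suffix):
                return entry.path
    return None
```

In a sketch like this, the cost of each lookup grows with the number of subfolders in the cache root, which would be consistent with the slowdown reported at around 20.000 entries.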

I think we should give up using non-exact subfolder names. The subfolder name should be only the unique ID, so that the subfolder can be found quickly, as it was in early versions of DocumentUltimate. For reference purposes, we can then have a dummy file with the original input file name inside the subfolder, so that the user can search for that specific file and find the related cache subfolder.
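
As a rough Python sketch of this proposed layout (again hypothetical, not the actual or planned DocumentUltimate implementation; the function names and the empty marker file are assumptions), the subfolder is named by the unique ID alone and the original file name survives only as a dummy file inside it:

```python
import os


def create_cache_subfolder(cache_root, unique_id, original_file_name):
    """Create <cache_root>/<unique_id> containing an empty marker file.

    Hypothetical sketch: the marker carries the original input file name
    so that a user can still search for it in a file explorer.
    """
    path = os.path.join(cache_root, unique_id)
    os.makedirs(path, exist_ok=True)
    marker = os.path.join(path, os.path.basename(original_file_name))
    # Create the empty marker file if it does not exist yet.
    open(marker, "a").close()
    return path


def find_cache_subfolder_exact(cache_root, unique_id):
    """Look the subfolder up directly by its exact name, without listing."""
    path = os.path.join(cache_root, unique_id)
    return path if os.path.isdir(path) else None
```

With an exact name, the lookup is a single existence check on one path, so it no longer depends on listing the whole cache folder.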

We will optimize this and let you know soon.


Alessandro Rossetto Replied
Thank you very much, Cem.
I look forward to downloading the updated version.

Regards
Alessandro Rossetto
